hackernews_scraper

Python library for retrieving comments and stories from HackerNews

hackernews-scraper
==================

Scrape [Hacker News](https://news.ycombinator.com) comments and stories
using the [Algolia API](http://hn.algolia.com/api/).

Usage
=====

```python
from hackernews_scraper import CommentScraper

# Returns a generator that yields one comment at a time.
comments = CommentScraper.getComments(since=1394039447)
```

The call above returns a generator that yields one comment at a time.
It keeps going until there are no more comments to fetch or until it
reaches the 50-page limit set by Hacker News. In the latter case, a
`TooManyItemsException` is raised.

If the Hacker News API response is missing any required field, the scraper
raises a `KeyError`.
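
Because the generator can raise partway through iteration, it may help to wrap the loop rather than the call. The following is a minimal sketch of that pattern, not the library's own code: `fake_comments` and `collect` are hypothetical helpers, and `TooManyItemsException` is redefined locally as a stand-in so the example runs without the package installed (it is written for Python 3).

```python
class TooManyItemsException(Exception):
    """Local stand-in for the exception named above (assumption: same name)."""


def fake_comments(total, limit):
    """Hypothetical stand-in for the comment generator.

    Yields up to `total` comment dicts, but raises once `limit` items
    have been produced, mimicking the page limit described above.
    """
    for i in range(total):
        if i >= limit:
            raise TooManyItemsException("page limit reached")
        yield {"comment_id": str(i), "timestamp": 1396616258 + i}


def collect(comments):
    """Drain a comment generator, keeping whatever was fetched before the limit."""
    fetched = []
    hit_limit = False
    try:
        for comment in comments:
            fetched.append(comment)
    except TooManyItemsException:
        # The items yielded before the exception are still in `fetched`.
        hit_limit = True
    return fetched, hit_limit


fetched, hit_limit = collect(fake_comments(total=5, limit=3))
print(len(fetched), hit_limit)
```

The exception surfaces during iteration, not when the generator is created, so a `try` around the call alone would not catch it.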

Response format
===============

Comments:
```
{
    'author': u'dhmholley',
    'comment_id': u'7531026',
    'comment_text': u'Are people still blowing this whistle?...',
    'created_at': u'2014-04-04T12:57:38.000Z',
    'parent_id': 7530853,
    'points': 1,
    'story_id': None,
    'story_title': None,
    'story_url': None,
    'timestamp': 1396616258,
    'title': None,
    'url': None
}
```

Stories:
```
{
    'author': u'sethco',
    'created_at': u'2014-04-04T12:56:23.000Z',
    'objectID': None,
    'points': 1,
    'story_text': 1,
    'timestamp': 1396616183,
    'title': u'Opower IPO today',
    'url': u'http://www.businesswire.com/news/home/20140403006541/en#.Uz4cbq1dVih'
}
```
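
Both record types carry the creation time twice: as an ISO 8601 string in `created_at` and as an integer in `timestamp`. The sketch below, written for Python 3 with the standard library only, parses the string and checks it against the integer for the two samples above. The helper names `parse_created_at` and `to_epoch` are hypothetical. The last lines assume that `since` is also a Unix timestamp in seconds; the README does not say so, but the example value is consistent with that reading.

```python
from datetime import datetime, timezone


def parse_created_at(value):
    """Parse a string such as '2014-04-04T12:57:38.000Z' as a UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def to_epoch(moment):
    """Convert a timezone-aware datetime to whole seconds since the Unix epoch."""
    return int(moment.timestamp())


# Only the two time fields from the sample records above.
comment = {"created_at": "2014-04-04T12:57:38.000Z", "timestamp": 1396616258}
story = {"created_at": "2014-04-04T12:56:23.000Z", "timestamp": 1396616183}

for item in (comment, story):
    created = parse_created_at(item["created_at"])
    print(created.isoformat(), to_epoch(created) == item["timestamp"])

# Assumption: a cutoff for `since` can be built the same way.
cutoff = to_epoch(datetime(2014, 3, 5, 17, 10, 47, tzinfo=timezone.utc))
print(cutoff)
```

Under that assumption, the cutoff computed for 2014-03-05 17:10:47 UTC equals the `since=1394039447` value used in the usage example.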

Testing
=======

You need to have [httpretty](https://github.com/gabrielfalcao/HTTPretty)
and [factory-boy](https://github.com/rbarrois/factory_boy) installed.

Run `nosetests` in the root folder or the `tests` folder.

Release history
===============

- 1.0.2 (Jul 21, 2014), this version
- 1.0.1 (Jul 11, 2014)
- 1.0.0 (Jul 11, 2014)

Download files
==============

Source distribution: hackernews_scraper-1.0.2.tar.gz (5.4 kB), uploaded Jul 21, 2014. Not uploaded using Trusted Publishing.

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `83e78a533c0db1e4a5288c2d55efa302c5523072c6fbbbafefb00c8b0b51ef3d` |
| MD5 | `71cab268b526b0997e4e5fdceb744e5b` |
| BLAKE2b-256 | `e442248201768b9bceef4fb7e463d9377e45d9c434ef8ac255df857486729383` |
