Python package for scraping apartment data from Craigslist.
Project description
craigapts
Install
Stable version from PyPI:
pip install craigapts
Dev version from GitLab:
pip install git+https://gitlab.com/everetr/craigapts.git
Examples
from craigapts import CLSearch
GEO = "newjersey"
QUERY = "'no section 8'"
# get basic data available on search result pages
c1 = CLSearch(GEO, QUERY)
print(c1.data)
# get details by navigating to each individual ad
c2 = CLSearch(GEO, QUERY, deep=True)
print(c2.data)
Changelog
2020.3.6.1
- Scraper now gets ads' post IDs from ad URLs. Before, a deep scrape was required to get post IDs.
- Data columns are rearranged so post_id and datetime_scr appear first.
- datetime_scr now contains seconds, so it will differ across pages if deep=False or across ads if deep=True.
2020.2.23.1
- First release.
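The 2020.3.6.1 entry above says post IDs are now read from ad URLs. A minimal sketch of that idea follows. It is not taken from the craigapts source: the function name post_id_from_url, the regular expression, and the example URL are all hypothetical, and the sketch assumes an ad URL's path ends in a numeric ID followed by .html.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: the ad URL path ends in "/<digits>.html".
# This pattern is illustrative and is not taken from craigapts itself.
_POST_ID_RE = re.compile(r"/(\d+)\.html$")


def post_id_from_url(ad_url: str) -> Optional[str]:
    """Return the numeric ID at the end of an ad URL's path, or None."""
    # Use only the path so query strings and fragments are ignored.
    path = urlparse(ad_url).path
    match = _POST_ID_RE.search(path)
    return match.group(1) if match else None


# Hypothetical URL, used only to exercise the function.
example_url = "https://newjersey.craigslist.org/apa/d/example-listing/1234567890.html"
print(post_id_from_url(example_url))
```

Returning the ID as a string rather than an integer avoids losing leading zeros, should any appear. URLs that do not match the assumed shape yield None instead of raising.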
TODO
- Replace requests dependency with urllib3? Because minimalism.
- Let user specify which variables, how many pages, and how many ads to scrape
- CLI
Download files
Source Distribution
craigapts-2020.3.6.1.tar.gz (18.5 kB)
Built Distribution
Hashes for craigapts-2020.3.6.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | c9c790e35edb806f919ff71b29f596967a7c9a211f476b8ec6c86caef5914528
MD5 | 7cc9fc6f563baddd3e7ca8e6c71386c6
BLAKE2b-256 | 94041f9e39321fbdc27fdb3f6af2d6d060a37b37c25bf467f46f6b3065ea4564