Python real estate scraping library
PY-Realty
PY-Realty is a library for scraping publicly available real estate listing information.
Specifically, the library sends only the minimum (or near-minimum) requests to the public web servers of supported listing websites. This is effectively the same as visiting a website such as Zillow manually in a web browser, except that PY-Realty avoids loading unnecessary JavaScript and media (images, etc.).
That said, scraping a large number of listings with no delays may result in the source (Zillow, Realtor.com, etc.) blocking your IP address, although this has not been tested. Zillow can probably tell that you are not browsing listings manually if you are querying thousands per minute.
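One way to reduce the risk of being blocked is to pause between consecutive requests. The following is a minimal sketch of that idea and is not part of PY-Realty: `throttled_map`, its `delay_seconds` default, and the injectable `sleep` parameter are all hypothetical names chosen for illustration, and the 2-second default is an arbitrary assumption rather than a tested safe value.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(
    fetch: Callable[[T], R],
    items: Iterable[T],
    delay_seconds: float = 2.0,  # arbitrary assumption, not a tested value
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[R]:
    """Call fetch(item) for each item, pausing between consecutive calls."""
    for index, item in enumerate(items):
        if index > 0:
            # Pause before every call except the first one.
            sleep(delay_seconds)
        yield fetch(item)


if __name__ == "__main__":
    # Stand-in fetcher; a real one would request a listing page.
    for result in throttled_map(str.upper, ["listing-1", "listing-2"], delay_seconds=0.1):
        print(result)
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting; any function that fetches a single listing could be supplied as `fetch`.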
Status
VERSION: 0.1.0
The library is still in early development; it is missing key features and may not work entirely as expected.
It is now somewhat usable (mileage may vary) and can be built and installed via pip; see the installation instructions below.
Road Map
- Zillow
  - Zillow Search Query
    - Implementation
    - Manual Testing
    - Unit Testing
  - Zillow Listing Details Parsing
    - Property Sale Listing
      - Implementation
      - Manual Testing
      - Unit Testing
    - Property Rental Listing
      - Implementation
      - Manual Testing
      - Unit Testing
    - Apartment Rental Listing
      - Implementation
      - Manual Testing
      - Unit Testing
- Realtor.com
  - Realtor.com Sale Search Query
    - Implementation
    - Manual Testing
    - Unit Testing
  - Realtor.com Rental Search Query
    - Implementation
    - Manual Testing
    - Unit Testing
  - Property Sale Listing
    - Implementation
    - Manual Testing
    - Unit Testing
  - Property Rental Listing
    - Implementation
    - Manual Testing
    - Unit Testing
- LandWatch
  - LandWatch Search Query
    - Implementation
    - Manual Testing
    - Unit Testing
  - LandWatch Listing Details Parsing
    - Implementation
    - Manual Testing
    - Unit Testing
- Apartments.com
  - Apartments.com Search Query
    - Implementation
    - Manual Testing
    - Unit Testing
  - Apartments.com Listing Details Parsing
    - Implementation
    - Manual Testing
    - Unit Testing
Regarding Unit Tests
Due to the nature of the data, it is impractical for unit tests to check whether data is extracted correctly. When testing against a source of truth, such as a specific listing, that listing may change over time. Additionally, even though Zillow and Realtor.com may carry similar data, a mismatch between them does not necessarily mean the scraping is incorrect; the two websites may simply disagree.
Therefore, unit tests check for unexpected behavior rather than extraction correctness: they check that no exceptions are raised and that value types are correct. That said, the unit test criteria still need to ensure reasonable robustness.
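A test in this style might look roughly like the sketch below. It is illustrative only: the field names in `EXPECTED_TYPES`, the `type_errors` helper, and the sample listing are hypothetical and do not reflect PY-Realty's actual listing schema or test suite.

```python
import unittest

# Hypothetical field names and types; the real listing schema may differ.
EXPECTED_TYPES = {
    "address": str,
    "price": (int, float),
    "bedrooms": (int, float, type(None)),
}


def type_errors(listing: dict) -> list:
    """Describe every missing or wrongly typed field in a listing."""
    errors = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in listing:
            errors.append(f"missing field: {field}")
        elif not isinstance(listing[field], expected):
            errors.append(f"{field} has type {type(listing[field]).__name__}")
    return errors


class ListingShapeTest(unittest.TestCase):
    def test_types_not_values(self):
        # In a real test this dict would come from scraping a live listing.
        listing = {"address": "123 Example St", "price": 250000, "bedrooms": None}
        # Only the shape is asserted; the price itself may change over time.
        self.assertEqual(type_errors(listing), [])
```

The test asserts nothing about the actual values, so it keeps passing when a listing's price or description changes, while still failing if a field disappears or changes type.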
Usage
More detailed documentation will follow as development continues. In the meantime, attributes and functions are well documented within class and function doc-strings.
Zillow
Inside zillow, the Query class is used to construct and send query requests to Zillow's public API. Once the Query is configured as desired, the get_response method returns the search results.
The search results can then be scraped via the functions details.scrape_listing, details.scrape_listings, and details.lazy_scrape_listings.
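The function names suggest a distinction between scraping every listing up front and scraping on demand. The sketch below illustrates that general pattern with stand-in functions; it is an assumption based on the names alone and does not show PY-Realty's actual signatures or behavior, which are described in the doc-strings. `scrape_all`, `scrape_lazily`, and `fake_scrape_one` are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def scrape_all(scrape_one: Callable[[str], dict], urls: Iterable[str]) -> List[dict]:
    """Eager variant: scrape every URL immediately and return a list."""
    return [scrape_one(url) for url in urls]


def scrape_lazily(scrape_one: Callable[[str], dict], urls: Iterable[str]) -> Iterator[dict]:
    """Lazy variant: scrape a URL only when the caller asks for the next result."""
    for url in urls:
        yield scrape_one(url)


def fake_scrape_one(url: str) -> dict:
    # Stand-in for a function that fetches and parses one listing.
    return {"url": url}


if __name__ == "__main__":
    listings = scrape_lazily(fake_scrape_one, ["listing-a", "listing-b"])
    # Nothing has been scraped yet; each next() call triggers one scrape.
    print(next(listings))
```

A lazy variant of this kind sends no request until a result is consumed, which makes it easy to stop early or to add delays between listings.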
Installation
- Clone this repository
- Within the repository folder, run
pip install .
File details
Details for the file py-realty-0.1.1.tar.gz.
File metadata
- Download URL: py-realty-0.1.1.tar.gz
- Upload date:
- Size: 73.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 696ab58d51342202cfb955a3b875d8f3483a9bab32c2973d256c47a0af866c81 |
| MD5 | b7f16789c1d1b7678fe18b268524554a |
| BLAKE2b-256 | 915b04f0b39dbf95a36f1d66c4edfd28da3d02bd34fda8857f3cc4a453af56be |
File details
Details for the file py_realty-0.1.1-py3-none-any.whl.
File metadata
- Download URL: py_realty-0.1.1-py3-none-any.whl
- Upload date:
- Size: 67.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4c0af35576fb8a6f878cabd7c7735915d56d0f24c80dc3549a4d824415cbcf4c |
| MD5 | 7dae1ef10dba1b1cc1d545d12e2c78ab |
| BLAKE2b-256 | 4b8ea3520e9cd4594943ac1e8159f4f648da12ad5c3f2dffe9204d3fdea08744 |