Skip to main content

Zillow scraper in Python

Project description

Zillow scraper in Python

Overview

This project is an open-source tool developed in Python for extracting product information from Zillow. It's designed to be easy to use, making it an ideal solution for developers looking for Zillow product data.

Features

  • Full search support
  • Extracts detailed product information from Zillow
  • Implemented in Python just because it's popular
  • Easy to integrate with existing Python projects

Important

  • Use rotating residential proxies, zillow will block if you make multiple requests with the same IP,

Install

$ pip install pyzill

Examples

import pyzill
import json
ne_lat = 38.602951833355434
ne_long = -87.22283859375
sw_lat = 23.42674607019482
sw_long = -112.93084640625
pagination = 1
#pagination is for the list that you see at the right when searching
#you don't need to iterate over all the pages because zillow sends the whole data on mapresults at once on the first page
#however the maximum result zillow returns is 500, so if mapResults is 500
#try playing with the zoom or moving the coordinates, pagination won't help because you will always get at maximum 500 results
pagination = 1
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]","[proxy_port]","[proxy_username]","[proxy_password]")
results_sold = pyzill.sold(pagination, 
              search_value="miami",
              min_beds=1,max_beds=1,
              min_bathrooms=None,max_bathrooms=None,
              min_price=10000,max_price=None,
              ne_lat=ne_lat,ne_long=ne_long,sw_lat=sw_lat,sw_long=sw_long,
              zoom_value=5,
              proxy_url=proxy_url)
              
results_sale = pyzill.for_sale(pagination, 
              search_value="",
              min_beds=None,max_beds=None,
              min_bathrooms=3,max_bathrooms=None,
              min_price=None,max_price=None,
              ne_lat=ne_lat,ne_long=ne_long,sw_lat=sw_lat,sw_long=sw_long,
              zoom_value=10,
              proxy_url=proxy_url)

results_rent = pyzill.for_rent(pagination, 
              search_value="",is_entire_place=False,is_room=True,
              min_beds=1,max_beds=None,
              min_bathrooms=None,max_bathrooms=None,
              min_price=10000,max_price=None,
              ne_lat=ne_lat,ne_long=ne_long,sw_lat=sw_lat,sw_long=sw_long,
              zoom_value=15,
              proxy_url=proxy_url)
jsondata_sold = json.dumps(results_sold)
jsondata_sale = json.dumps(results_sale)
jsondata_rent = json.dumps(results_rent)
f = open("./jsondata_sold.json", "w")
f.write(jsondata_sold)
f.close()
f = open("./jsondata_sale.json", "w")
f.write(jsondata_sale)
f.close()
f = open("./jsondata_rent.json", "w")
f.write(jsondata_rent)
f.close()

For homes

import pyzill
import json
property_url="https://www.zillow.com/homedetails/858-Shady-Grove-Ln-Harrah-OK-73045/339897685_zpid/"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]","[proxy_port]","[proxy_username]","[proxy_password]")
data = pyzill.get_from_home_url(property_url,proxy_url)
jsondata = json.dumps(data)
f = open("details.json", "w")
f.write(jsondata)
f.close()
import pyzill
import json
property_id=2056016566
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]","[proxy_port]","[proxy_username]","[proxy_password]")
data = pyzill.get_from_home_id(property_id,proxy_url)
jsondata = json.dumps(data)
f = open("details.json", "w")
f.write(jsondata)
f.close()

For departments

import pyzill
import json
property_url="https://www.zillow.com/apartments/kissimmee-fl/the-nexus-at-overbrook/9DSWrh/"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]","[proxy_port]","[proxy_username]","[proxy_password]")
data = pyzill.get_from_deparment_url(property_url,proxy_url)
jsondata = json.dumps(data)
f = open("details.json", "w")
f.write(jsondata)
f.close()
import pyzill
import json
property_id="CgKZT4"
proxy_url = pyzill.parse_proxy("[proxy_ip or proxy_domain]","[proxy_port]","[proxy_username]","[proxy_password]")
data = pyzill.get_from_deparment_id(property_id,proxy_url)
jsondata = json.dumps(data)
f = open("details.json", "w")
f.write(jsondata)
f.close()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyzill-1.0.3.tar.gz (7.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyzill-1.0.3-py3-none-any.whl (8.0 kB view details)

Uploaded Python 3

File details

Details for the file pyzill-1.0.3.tar.gz.

File metadata

  • Download URL: pyzill-1.0.3.tar.gz
  • Upload date:
  • Size: 7.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for pyzill-1.0.3.tar.gz
Algorithm Hash digest
SHA256 0e3e947229f0efc9fcb973d48a81eb79cd0408ccf65b232b325e48a136dc9021
MD5 2ee0c72655ab1f087035db79f896c933
BLAKE2b-256 6ba54042b8c8f38379b889755a51e79b05f9743a4dbcc3face80e9a6ebf14af9

See more details on using hashes here.

File details

Details for the file pyzill-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: pyzill-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 8.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for pyzill-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 3e3bab66d23dedfbac3ed11c13f535ddce28434298ab981f6197dc48c025eae6
MD5 1bdcf2c02ff85c5b3ce3a809e50b5b75
BLAKE2b-256 c22dcd602205af00992803516d8feec85ae319625d23cf9954ce4892400f3a29

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page