Python client for scraping Google Flights using the ScrapingBee web scraping API
google-flights-scraper-api
A Python client for the Google Flights scraper API powered by ScrapingBee. It turns a public Google Flights page into clean data you can load into pandas, a database, or a price-monitoring job, without you running a single headless browser or proxy.
Google does not ship a public Google Flights API, and the page builds its fares with JavaScript behind anti-bot protection. This package sends the work to ScrapingBee, which renders the page, rotates residential proxies, and hands back rendered HTML or structured JSON.
Built on the ScrapingBee web scraping API
If you searched for any of these, you are in the right place:
- google flights api
- google flights scraper
- google flights scraper api
Why a Google Flights scraper API instead of plain requests
A direct requests.get() against Google Flights returns an empty shell. The fares, durations, and stop counts are injected by JavaScript after load, and Google quickly blocks datacenter IPs with consent walls and challenges.
A managed Google Flights API layer removes that whole class of problems:
- Executes the page JavaScript in a real headless browser
- Rotates residential proxies so requests are not blocked
- Skips the Google consent interstitial
- Returns structured JSON when you supply extraction rules
You write the query and read the data. The infrastructure is someone else's problem.
Installation
pip install google-flights-scraper-api
Requires Python 3.8+ and requests.
Quick start
from google_flights_scraper_api import GoogleFlightsScraper
scraper = GoogleFlightsScraper(api_key="YOUR_API_KEY")
html = scraper.search(query="Flights to London from New York")
print(html[:500])
Grab a free key first. ScrapingBee gives 1,000 credits with no card required at scrapingbee.com.
How it works
Every call hits the ScrapingBee HTML API:
https://app.scrapingbee.com/api/v1/
The client builds the request with documented parameters: the Google Flights url, render_js=true, premium_proxy=true, and the Google CONSENT cookie so the consent page is skipped. You never assemble the query string yourself.
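For orientation, the request the client assembles can be approximated by hand. The sketch below only builds the parameter dictionary and sends nothing. The `build_params` helper, the Google Flights base URL, and the `CONSENT=YES+` cookie value are assumptions made for illustration; they are not taken from the package source.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"
# Assumed base URL for Google Flights searches; the package may use another.
FLIGHTS_BASE_URL = "https://www.google.com/travel/flights"


def build_params(api_key: str, query: str) -> dict:
    """Build ScrapingBee query parameters for a natural-language search."""
    target_url = f"{FLIGHTS_BASE_URL}?{urlencode({'q': query})}"
    return {
        "api_key": api_key,
        "url": target_url,
        "render_js": "true",
        "premium_proxy": "true",
        # Assumed cookie value, passed as a name=value pair.
        "cookies": "CONSENT=YES+",
    }


params = build_params("YOUR_API_KEY", "Flights to London from New York")
print(params["url"])
```

Passing this dictionary as `params=` to `requests.get(API_ENDPOINT, ...)` would leave the encoding of the nested URL to requests.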
Structured data with AI extraction
Rather than parse Google's rotating markup, pass ai_extract_rules and get JSON back. The schema you define becomes the response shape.
from google_flights_scraper_api import GoogleFlightsScraper

scraper = GoogleFlightsScraper(api_key="YOUR_API_KEY")
data = scraper.search(
    query="Flights to Tokyo from San Francisco",
    ai_extract_rules={
        "flights": {
            "description": "every flight result on the page",
            "type": "list",
            "output": {
                "airline": "name of the airline",
                "price": "ticket price in dollars",
                "departure_time": "departure time",
                "arrival_time": "arrival time",
                "duration": "total trip duration",
                "stops": "number of stops",
            },
        },
    },
)
for flight in data.get("flights", []):
    print(flight["airline"], flight["price"], flight["stops"])
The description, type, and output keys follow ScrapingBee's documented extraction schema. type accepts string, list, number, boolean, and item.
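Extracted fields come back as whatever text was found on the page, so prices and stop counts usually need normalizing before analysis. The following is a minimal sketch, assuming a response shaped like the schema above. The sample values and the `parse_price` and `parse_stops` helpers are hypothetical and not part of the package.

```python
import re
from typing import Optional


def parse_price(value) -> Optional[float]:
    """Extract the first number from a price such as '$1,204'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", str(value))
    return float(match.group().replace(",", "")) if match else None


def parse_stops(value) -> Optional[int]:
    """Map 'Nonstop' to 0 and '1 stop' / '2 stops' to the integer count."""
    text = str(value)
    if "nonstop" in text.lower():
        return 0
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


# Hypothetical sample standing in for the output of scraper.search(...).
data = {
    "flights": [
        {"airline": "ExampleAir", "price": "$1,204", "stops": "1 stop"},
        {"airline": "SampleJet", "price": "$987", "stops": "Nonstop"},
    ]
}

rows = [
    {
        "airline": f["airline"],
        "price": parse_price(f["price"]),
        "stops": parse_stops(f["stops"]),
    }
    for f in data["flights"]
]
# Cheapest first; rows without a parsable price sort last.
rows.sort(key=lambda r: (r["price"] is None, r["price"] or 0.0))
print(rows)
```

Keeping unparsable values as `None` rather than raising lets a pipeline log and skip odd rows instead of failing on one malformed fare.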
Waiting for fares to load
Google Flights sometimes streams results in after first paint. Use a js_scenario to wait or scroll before the page is captured. A scenario runs up to 40 seconds.
html = scraper.search(
    query="Flights to Rome from Boston",
    js_scenario={
        "instructions": [
            {"wait": 3000},
            {"scroll_y": 1000},
            {"wait": 1000},
        ],
    },
)
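Because a scenario is capped at 40 seconds, it can help to add up the explicit waits before sending it. The sketch below does that with a hypothetical `total_wait_ms` helper. It counts only `wait` instructions and ignores time spent scrolling or loading, so the result is a lower bound on the real run time.

```python
MAX_SCENARIO_MS = 40_000  # the 40-second limit mentioned above


def total_wait_ms(scenario: dict) -> int:
    """Sum the fixed waits (in milliseconds) declared in a js_scenario."""
    return sum(step.get("wait", 0) for step in scenario.get("instructions", []))


scenario = {
    "instructions": [
        {"wait": 3000},
        {"scroll_y": 1000},
        {"wait": 1000},
    ],
}

waited = total_wait_ms(scenario)
if waited > MAX_SCENARIO_MS:
    raise ValueError(f"scenario waits {waited} ms, over the {MAX_SCENARIO_MS} ms cap")
print(waited)  # 4000
```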
Configuration options
| Argument | API parameter | Description |
|---|---|---|
| query | url (?q=) | Natural-language flight search appended to the Google Flights URL |
| url | url | A full Google Flights URL, used instead of query |
| render_js | render_js | Execute page JavaScript (default True) |
| premium_proxy | premium_proxy | Residential proxies (default True) |
| stealth_proxy | stealth_proxy | Stealth tier for the hardest blocks |
| country_code | country_code | ISO country code, needs premium_proxy=True |
| ai_extract_rules | ai_extract_rules | Natural-language extraction, returns JSON, adds 5 credits |
| extract_rules | extract_rules | CSS or XPath extraction rules |
| js_scenario | js_scenario | Script waits, scrolls, and clicks before capture |
| wait | wait | Fixed wait in milliseconds |
| screenshot_full_page | screenshot_full_page | Return a full-page screenshot as bytes |
| json_response | json_response | Wrap the response in a JSON envelope |
What you get back
- Default: the rendered HTML of the Google Flights page as a string.
- With ai_extract_rules or extract_rules: parsed JSON matching the schema you defined.
- With screenshot_full_page=True: raw PNG bytes.
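Since the return type depends on the arguments, downstream code may want to branch on it. The following is a minimal sketch, assuming the three return types listed above arrive as `str`, parsed JSON (`dict` or `list`), and `bytes`. The `save_result` helper and the file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Union

Result = Union[str, bytes, dict, list]


def save_result(result: Result, stem: Path) -> Path:
    """Write a result to disk with an extension that matches its type."""
    if isinstance(result, bytes):
        path = stem.with_suffix(".png")
        path.write_bytes(result)
    elif isinstance(result, (dict, list)):
        path = stem.with_suffix(".json")
        path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    else:
        path = stem.with_suffix(".html")
        path.write_text(result, encoding="utf-8")
    return path


# Stand-in values; real ones would come from scraper.search(...).
out_dir = Path(tempfile.mkdtemp())
saved = [
    save_result("<html></html>", out_dir / "flights"),
    save_result({"flights": []}, out_dir / "flights"),
    save_result(b"\x89PNG\r\n\x1a\n", out_dir / "flights"),
]
print([p.name for p in saved])
```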
Production use cases
This Google Flights scraper fits cleanly into:
- Fare-tracking jobs that alert when a route drops below a threshold
- Competitive pricing dashboards for travel agencies and OTAs
- Route and demand research across markets
- Data pipelines feeding a warehouse or a notebook for analysis
Pricing
ScrapingBee bills successful requests. A request that fails with HTTP 500 is not charged. Scraping a Google URL through the HTML API is a flat rate, and toggling JS does not change it:
- Classic or Premium proxy: 20 credits per request
- Stealth proxy: 75 credits per request
- ai_extract_rules: adds 5 credits
Current rate card: scrapingbee.com/pricing.
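As a worked example of the arithmetic, the sketch below estimates how many successful requests fit in a credit budget, using only the figures quoted in this README (20, 75, and the 5-credit surcharge). The helper names are hypothetical, and the current rate card should be checked before relying on the numbers.

```python
# Per-request credit costs as quoted in this README.
BASE_CREDITS = {"classic_or_premium": 20, "stealth": 75}
AI_EXTRACT_SURCHARGE = 5


def credits_per_request(proxy: str = "classic_or_premium", ai_extract: bool = False) -> int:
    """Credits consumed by one successful request."""
    return BASE_CREDITS[proxy] + (AI_EXTRACT_SURCHARGE if ai_extract else 0)


def requests_within(budget: int, proxy: str = "classic_or_premium", ai_extract: bool = False) -> int:
    """Whole number of successful requests a credit budget covers."""
    return budget // credits_per_request(proxy, ai_extract)


FREE_CREDITS = 1_000  # the free allowance mentioned in Quick start
for proxy in BASE_CREDITS:
    for ai_extract in (False, True):
        count = requests_within(FREE_CREDITS, proxy, ai_extract)
        print(f"{proxy:<20} ai_extract={ai_extract!s:<5} -> {count} requests")
```

With these rates, 1,000 credits cover 50 plain requests, 40 with AI extraction, 13 on the stealth proxy, and 12 on the stealth proxy with AI extraction.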
FAQ
Is there an official Google Flights API?
No. Google does not offer a public Google Flights API for fares, so a scraper API that renders the public page is the practical route. This package wraps that approach.
Why not parse the HTML myself?
You can, but Google Flights uses obfuscated, rotating class names. Defining ai_extract_rules is more durable than maintaining selectors that break every few weeks.
Can I target a specific country or currency view?
Yes. Set country_code together with premium_proxy=True. The country code has no effect without a premium proxy.
Does it handle the Google consent page?
Yes. The client sends the Google CONSENT cookie by default. Disable it with skip_consent=False.
Documentation
License
MIT
Disclaimer
This is an unofficial Python client built on top of the ScrapingBee web scraping API. It is not affiliated with ScrapingBee or Google. Scrape only public pages, and comply with Google's terms of service and applicable data-protection law.