A library to scrape and analyze Google Flight data
google-flight-scraper
Installing
Clone the repository to your local drive. This project uses Poetry as its dependency manager, so run poetry install in the cloned directory to install the required dependencies.
Using the library
Quickstart
First, define the travel plans:
from google_flight_scraper.query import TravelPlan, FlightQuery

seattle_vacation = TravelPlan(
    origin="SFO",
    destination="SEA",
    departure_dates=[
        f"2023-10-{day}" for day in range(2, 12)
    ],
    num_adults=2,
    num_children=1
)
queries = FlightQuery.from_travel_plans(seattle_vacation)
This sets up the library to search for all one-way flights from San Francisco (SFO) to Seattle/Tacoma (SEA) for two adults and one child, departing between October 2, 2023 and October 11, 2023 (range(2, 12) stops before 12). Pass the travel plans to FlightQuery.from_travel_plans so that it constructs all the pertinent queries. Any number of travel plans can be provided.
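The date list can also be built with the standard library's datetime module, which handles month boundaries and always produces zero-padded ISO dates. The sketch below is independent of the library. date_range is a hypothetical helper, not part of google-flight-scraper, and the source does not state whether TravelPlan requires zero-padded days.

```python
from datetime import date, timedelta


def date_range(start: date, end: date) -> list[str]:
    """Return ISO-formatted dates from start to end, both inclusive.

    Hypothetical helper for illustration; not part of google-flight-scraper.
    """
    num_days = (end - start).days + 1
    return [(start + timedelta(days=offset)).isoformat() for offset in range(num_days)]


# The same ten days as range(2, 12) in the quickstart, but zero-padded.
departure_dates = date_range(date(2023, 10, 2), date(2023, 10, 11))
print(departure_dates[0], departure_dates[-1], len(departure_dates))
# prints: 2023-10-02 2023-10-11 10
```

The resulting list could then be passed as departure_dates, assuming the library accepts dates in YYYY-MM-DD form as the quickstart suggests.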
Lastly, define the webdriver to scrape with. A Driver class is provided for easily switching between different browsers and their associated webdrivers.
from selenium import webdriver
from google_flight_scraper.driver import Driver
from google_flight_scraper.scraper import Scraper

with Driver(webdriver.Firefox, webdriver.FirefoxOptions) as driver:
    df = Scraper(timeout_seconds=5)(driver, queries)
The scraped data is returned in the df DataFrame for further analysis.
One-way vs return trip behaviour
The library queries a return trip (one return ticket) differently from a pair of one-way flights (two one-way tickets, one in each direction). The distinction matters because the pair of one-way tickets may be the cheaper combination. Both searches can be set up by providing multiple TravelPlan objects to FlightQuery:
# ...
FlightQuery.from_travel_plans(
    # Searches for return trip tickets on Google Flights
    TravelPlan(
        "SFO",
        "SEA",
        departure_dates=[f"2023-12-{d}" for d in range(1, 10)],
        return_dates=[f"2023-12-{d}" for d in range(11, 20)],
    ),
    # Searches to-and-fro one-way tickets on Google Flights
    TravelPlan(
        "SFO",
        "SEA",
        departure_dates=[f"2023-12-{d}" for d in range(1, 10)],
    ),
    TravelPlan(
        "SEA",
        "SFO",
        departure_dates=[f"2023-12-{d}" for d in range(11, 20)],
    ),
)
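To illustrate why the two kinds of search are worth comparing, here is a minimal sketch in plain Python that picks the cheapest option from a set of return-trip prices and two sets of one-way prices. It does not use the library and says nothing about the columns of the scraped DataFrame. The prices are illustrative placeholders rather than scraped data, and cheapest_itinerary is a hypothetical helper.

```python
from itertools import product

# Illustrative placeholder prices in USD; NOT real scraped data.
return_trip_prices = {
    ("2023-12-01", "2023-12-11"): 310,
    ("2023-12-02", "2023-12-12"): 280,
}
outbound_prices = {"2023-12-01": 120, "2023-12-02": 150}  # SFO -> SEA one-way
inbound_prices = {"2023-12-11": 140, "2023-12-12": 170}   # SEA -> SFO one-way


def cheapest_itinerary(return_trips, outbound, inbound):
    """Return (kind, departure_date, return_date, price) for the cheapest option.

    Hypothetical helper for illustration; not part of google-flight-scraper.
    """
    # Every return-trip ticket is a candidate as-is.
    candidates = [
        ("return", dep, ret, price) for (dep, ret), price in return_trips.items()
    ]
    # Every outbound/inbound pairing of one-way tickets is a candidate too.
    # Zero-padded ISO dates sort chronologically, so string comparison works.
    candidates += [
        ("two one-way", dep, ret, outbound[dep] + inbound[ret])
        for dep, ret in product(outbound, inbound)
        if dep < ret
    ]
    return min(candidates, key=lambda candidate: candidate[3])


best = cheapest_itinerary(return_trip_prices, outbound_prices, inbound_prices)
print(best)
# prints: ('two one-way', '2023-12-01', '2023-12-11', 260)
```

With these placeholder numbers, the cheapest pair of one-way tickets (120 + 140 = 260) undercuts the cheapest return ticket (280), which is the situation the separate one-way searches are meant to catch.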
Hashes for google_flight_scraper-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | d4b755afd1d60e83071cccca634c00b6b0965d19f8d93e53c704d466d855a230
MD5 | d4b318b42785dad40919544b9373e172
BLAKE2b-256 | 8fcf3a83165ef2beb143d5f6a2e339143b18cf163225b54c1e002338cc508f1d
Hashes for google_flight_scraper-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 124c3225cdf35d3928056bd3406c373f8ef2d14946f41adab48688cc20d92809
MD5 | 0e3d585dc7b7e693b5d895236f719924
BLAKE2b-256 | 104e043b5570a82b4d2b4f24fed76ca623cf0e4c8bc951f1c53f483f55da61e8