Scrape and parse Facebook Ad Library result pages.

Facebook Ad Library Scraper

A Python scraper for Facebook Ad Library search results. It builds Ad Library URLs, opens them with Selenium and undetected-chromedriver, scrolls through the results, optionally saves HTML snapshots, parses ad cards, removes duplicates, and exports the ads to JSON and CSV.

This project depends on Facebook's public page markup, so parser behavior may need updates if Facebook changes its HTML.

Features

  • Build Facebook Ad Library URLs from simple filters
  • Scrape with Chrome using undetected-chromedriver
  • Save HTML snapshots while scrolling
  • Parse ad metadata, body text, page info, images, CTAs, and destination URLs
  • Export results as ads.json and/or ads.csv
  • Re-parse saved HTML snapshots without opening a browser

Installation

pip install facebook-ad-library-scraper

For local development:

git clone https://github.com/samfastone/facebook-ad-library-scraper.git
cd facebook-ad-library-scraper
pip install -e .

You also need Google Chrome installed.

Quick Start

CLI

facebook-ad-library-scraper --query "mpesa" --country KE --output-dir output --headless

This writes:

  • output/ads.json
  • output/ads.csv
  • output/html_snapshots/*.html

Python

from pathlib import Path

from facebook_ad_library_scraper.core import ScraperConfig, build_url, scrape

url = build_url("mpesa", country="KE")

config = ScraperConfig(
    url=url,
    output_dir=Path("output"),
    max_scrolls=10,
    headless=True,
    save_json=True,
    save_csv=True,
)

ads = scrape(config)
print(f"{len(ads)} ads found")
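
Since `scrape` returns a list of ad dictionaries, the result can be summarised with plain Python. The sketch below counts ads per advertiser using the `page_name` field listed under Output. The helper name `top_advertisers` and the sample records are hypothetical; real records would come from `scrape(config)`.

```python
from collections import Counter


def top_advertisers(ads, n=5):
    """Return the n most frequent page names among the given ads."""
    names = (ad.get("page_name") for ad in ads)
    # Skip ads where the parser found no page name.
    return Counter(name for name in names if name).most_common(n)


# Hypothetical sample records standing in for the output of scrape(config).
sample_ads = [
    {"library_id": "1", "page_name": "Page A"},
    {"library_id": "2", "page_name": "Page B"},
    {"library_id": "3", "page_name": "Page A"},
    {"library_id": "4"},  # page_name missing
]

print(top_advertisers(sample_ads))
# [('Page A', 2), ('Page B', 1)]
```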

CLI Usage

Use either a full Facebook Ad Library URL:

facebook-ad-library-scraper --url "https://www.facebook.com/ads/library/?active_status=active&ad_type=all&country=KE&q=mpesa"

Or build the URL from filters:

facebook-ad-library-scraper \
  --query "loan" \
  --country KE \
  --active-status active \
  --media-type image \
  --languages en sw \
  --start-date-min 2026-05-01 \
  --output-dir output

Parse existing snapshots only:

facebook-ad-library-scraper --output-dir output --parse-only

CLI Options

| Option | Description |
| --- | --- |
| --url | Full Facebook Ad Library URL. When provided, filter flags are ignored. |
| --query, -q | Search keyword or phrase. |
| --country | Two-letter country code. Default: CD. |
| --active-status | active, inactive, or all. Default: active. |
| --ad-type | Ad type filter. Default: all. |
| --media-type | all, image, meme, image_and_meme, video, or none. |
| --search-type | keyword_unordered or keyword_exact_phrase. |
| --is-targeted-country | Sets is_targeted_country=true in the URL. |
| --sort-mode | total_impressions or relevancy_monthly_grouped. |
| --sort-direction | desc or asc. |
| --languages | One or more language codes, e.g. --languages en fr. |
| --page-ids | One or more Facebook page IDs. |
| --start-date-min | Earliest ad start date, YYYY-MM-DD. |
| --start-date-max | Latest ad start date, YYYY-MM-DD. |
| --output-dir | Directory for exports and snapshots. Default: ad_library_output. |
| --max-scrolls | Maximum scroll attempts. Default: 50. |
| --scroll-pause | Seconds to wait between scrolls. Default: 3.0. |
| --snapshot-every | Save HTML every N scrolls. Default: 5. |
| --headless | Run Chrome without a visible browser window. |
| --chrome-version | Chrome major version to pass to the driver. |
| --parse-only | Parse saved snapshots without scraping again. |
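
When several searches need to run in sequence, the command line can be assembled from Python. This is a minimal sketch that uses only flags from the table above; the helper name `build_command` and the output directory layout are assumptions made for illustration.

```python
def build_command(query, country, output_dir, languages=(), headless=True):
    """Assemble a facebook-ad-library-scraper command as an argument list."""
    cmd = [
        "facebook-ad-library-scraper",
        "--query", query,
        "--country", country,
        "--output-dir", output_dir,
    ]
    if languages:
        # --languages accepts one or more codes after a single flag.
        cmd += ["--languages", *languages]
    if headless:
        cmd.append("--headless")
    return cmd


cmd = build_command("loan", "KE", "output/loan", languages=["en", "sw"])
print(" ".join(cmd))

# To run it (requires the package and Google Chrome to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with multi-word queries.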

Python API

build_url(...)

Builds a Facebook Ad Library search URL.

from facebook_ad_library_scraper.core import build_url

url = build_url(
    query="mpesa",
    country="KE",
    active_status="active",
    media_type="all",
)
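
To illustrate what such a URL looks like, here is a rough standard-library sketch that reproduces the query string from the CLI Usage example. It is not the package's implementation: `sketch_build_url` is a hypothetical name, and only the four parameters visible in that example URL are covered.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.facebook.com/ads/library/"


def sketch_build_url(query, country="CD", active_status="active", ad_type="all"):
    """Build an Ad Library search URL from a few filters (illustrative only)."""
    params = {
        "active_status": active_status,
        "ad_type": ad_type,
        "country": country,
        "q": query,
    }
    # urlencode percent-encodes values and keeps the dict's insertion order.
    return f"{BASE_URL}?{urlencode(params)}"


print(sketch_build_url("mpesa", country="KE"))
# https://www.facebook.com/ads/library/?active_status=active&ad_type=all&country=KE&q=mpesa
```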

ScraperConfig

Runtime settings for a scrape.

from pathlib import Path
from facebook_ad_library_scraper.core import ScraperConfig

config = ScraperConfig(
    url="https://www.facebook.com/ads/library/?...",
    output_dir=Path("output"),
    max_scrolls=50,
    headless=False,
)

scrape(config)

Runs a full browser scrape, parses ads, removes duplicates, saves configured outputs, and returns a list of ad dictionaries.

from facebook_ad_library_scraper.core import scrape

ads = scrape(config)
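
Because the same ad card can appear in several scroll positions and snapshots, duplicates have to be dropped before export. The sketch below shows one way this could be done, assuming `library_id` is the deduplication key; the package's actual rule is not documented here, and `dedupe_ads` is a hypothetical name.

```python
def dedupe_ads(ads):
    """Keep the first occurrence of each library_id, preserving order."""
    seen = set()
    unique = []
    for ad in ads:
        key = ad.get("library_id")
        if key is None:
            # Without an ID there is no way to tell whether the ad repeats.
            unique.append(ad)
            continue
        if key in seen:
            continue
        seen.add(key)
        unique.append(ad)
    return unique


# Hypothetical records: the ad with library_id "1" was captured twice.
ads = [
    {"library_id": "1", "body": "first"},
    {"library_id": "2", "body": "second"},
    {"library_id": "1", "body": "first again"},
]
print(len(dedupe_ads(ads)))
# 2
```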

parse_ads(html)

Parses one HTML string and returns ads found in it.

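from pathlib import Path
from facebook_ad_library_scraper.core import parse_ads

# Hypothetical file name; use any HTML file saved by a previous scrape.
html = Path("output/html_snapshots/snapshot.html").read_text(encoding="utf-8")
ads = parse_ads(html)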

parse_from_dir(html_dir)

Parses all .html files in a snapshot directory.

from pathlib import Path
from facebook_ad_library_scraper.core import parse_from_dir

ads = parse_from_dir(Path("output/html_snapshots"))
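
Conceptually, parsing a snapshot directory amounts to reading each .html file and passing its contents to a parser. The following sketch shows that loop with the parser supplied as a callable, so it runs without the package installed; `parse_snapshots` is a hypothetical name and this is not the package's own code. Combining it with a deduplication step is left out here.

```python
from pathlib import Path


def parse_snapshots(html_dir, parser):
    """Run `parser` over every .html file in html_dir and collect the results."""
    ads = []
    # Sort for a stable, reproducible processing order.
    for path in sorted(Path(html_dir).glob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        ads.extend(parser(html))
    return ads


# With the package installed, the parser could be parse_ads:
# from facebook_ad_library_scraper.core import parse_ads
# ads = parse_snapshots("output/html_snapshots", parse_ads)
```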

save_json(ads, path) and save_csv(ads, path)

Save parsed ads to disk.
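
For readers who want to post-process ads and write them out themselves, a stdlib-only export might look like the sketch below. The function names `write_json` and `write_csv` are hypothetical and are not the package's `save_json` and `save_csv`; in particular, storing list fields such as `images` as JSON strings in the CSV is an assumption about a reasonable layout.

```python
import csv
import json
from pathlib import Path


def write_json(ads, path):
    """Write the ads as a pretty-printed JSON array."""
    text = json.dumps(ads, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def write_csv(ads, path):
    """Write the ads as CSV, one row per ad."""
    # Use the union of all keys, in first-seen order, as the header.
    fields = list(dict.fromkeys(key for ad in ads for key in ad))
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fields)
        writer.writeheader()
        for ad in ads:
            # Serialise nested values so each one fits in a single cell.
            row = {
                key: json.dumps(value, ensure_ascii=False)
                if isinstance(value, (list, dict))
                else value
                for key, value in ad.items()
            }
            writer.writerow(row)
```

Fields missing from an ad are written as empty cells, which is `csv.DictWriter`'s default behaviour.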

Parameters

build_url

| Parameter | Default | Description |
| --- | --- | --- |
| query | Required | Search keyword or phrase. |
| country | CD | Two-letter country code. |
| active_status | active | Ad status: active, inactive, or all. |
| ad_type | all | Facebook ad type filter. |
| media_type | all | Media filter such as image, video, or all. |
| search_type | keyword_unordered | Keyword matching mode. |
| is_targeted_country | False | Whether to target ads aimed at the selected country. |
| sort_mode | total_impressions | Sort field used by Facebook. |
| sort_direction | desc | Sort order: desc or asc. |
| languages | None | List of content language codes. |
| page_ids | None | List of Facebook page IDs. |
| start_date_min | None | Earliest start date, YYYY-MM-DD. |
| start_date_max | None | Latest start date, YYYY-MM-DD. |

ScraperConfig

| Parameter | Default | Description |
| --- | --- | --- |
| url | Built-in sample URL | Facebook Ad Library URL to scrape. |
| output_dir | ad_library_output | Directory for exports and snapshots. |
| max_scrolls | 50 | Maximum number of scroll attempts. |
| scroll_pause | 3.0 | Base pause between scrolls, in seconds. |
| snapshot_every | 5 | Save one HTML snapshot every N scrolls. |
| headless | False | Run Chrome in headless mode. |
| chrome_version | None | Chrome major version for undetected-chromedriver. |
| store_html | True | Save HTML snapshots to disk. |
| save_json | True | Write ads.json. |
| save_csv | False | Write ads.csv. |
| wait_timeout | 20 | Seconds to wait for ad cards on initial load. |

Output

Each parsed ad may include:

| Field | Description |
| --- | --- |
| library_id | Facebook Ad Library ID. |
| status | Current ad status text. |
| start_date | Start date shown by Facebook. |
| page_name | Advertiser page name. |
| page_url | Advertiser page URL. |
| body | Main ad text. |
| destination_url | Decoded outbound URL when available. |
| image_url | First image URL. |
| images | All image URLs found for the ad. |
| cta_domain | CTA domain text. |
| cta_headline | CTA headline text. |
| cta_button | CTA button text. |
| cta_texts | Raw CTA text blocks. |
| has_multiple_versions | Whether Facebook shows multiple versions. |
| has_whatsapp_cta | Whether WhatsApp appears in the CTA or body. |
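
The `destination_url` field is described as a decoded outbound URL. Facebook commonly wraps external links in a redirect of the form `https://l.facebook.com/l.php?u=<encoded target>`, so decoding may resemble the sketch below. This is an assumption about the wrapping format and not the package's actual logic; `decode_outbound_url` is a hypothetical name and the example link is made up.

```python
from urllib.parse import parse_qs, urlparse


def decode_outbound_url(href):
    """Return the target of a Facebook redirect link, or href unchanged."""
    parsed = urlparse(href)
    if parsed.netloc == "l.facebook.com" and parsed.path == "/l.php":
        # parse_qs percent-decodes the value of the `u` parameter.
        target = parse_qs(parsed.query).get("u")
        if target:
            return target[0]
    return href


# Made-up redirect link for illustration.
wrapped = "https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com%2Foffer%3Fid%3D7&h=abc"
print(decode_outbound_url(wrapped))
# https://example.com/offer?id=7
```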
