A module designed to automate the extraction of follower counts and post details from a public Facebook page.

MetaDataScraper

MetaDataScraper is a Python package that automates the extraction of information from a public Facebook page, such as follower counts and post details and interactions, returning the post data as lists. It uses Selenium WebDriver for web automation and scraping.
The module provides two classes: LoginlessScraper and LoggedInScraper. LoginlessScraper requires no authentication or API keys, but it cannot access some Facebook pages. LoggedInScraper overcomes this limitation by logging in with the user's Facebook account credentials before scraping.
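
One way to combine the two classes is to try the loginless scraper first and fall back to the logged-in one only when it fails. The following is a minimal sketch, not part of the package: `scrape_with_fallback` is a hypothetical helper, and it assumes that an inaccessible page surfaces as an exception from `scrape()`, which this description does not confirm. The two stand-in classes exist only so the sketch runs without a browser.

```python
def scrape_with_fallback(page_id, make_loginless, make_logged_in):
    """Try a loginless scrape first; fall back to a logged-in scrape on failure.

    Both factories are callables that take a page ID and return an object
    with a scrape() method. With the real package they could be, for example,
    LoginlessScraper and lambda pid: LoggedInScraper(pid, email, password).
    """
    try:
        return make_loginless(page_id).scrape()
    except Exception as exc:
        # Assumption: an inaccessible page is reported as an exception.
        print(f"Loginless scrape failed for {page_id!r}: {exc}")
        return make_logged_in(page_id).scrape()


# Stand-in scrapers (hypothetical) so the sketch runs without Selenium.
class BlockedScraper:
    def __init__(self, page_id):
        self.page_id = page_id

    def scrape(self):
        raise RuntimeError("page not accessible without login")


class WorkingScraper:
    def __init__(self, page_id):
        self.page_id = page_id

    def scrape(self):
        return {"followers": "0", "post_texts": []}


fallback_result = scrape_with_fallback("example_page", BlockedScraper, WorkingScraper)
print(fallback_result)
```

Passing factories rather than the classes themselves keeps the credentials out of the helper and makes the fallback logic easy to test in isolation.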

Installation

You can install MetaDataScraper using pip:

pip install MetaDataScraper

Make sure you have Python 3.x and pip installed.

Usage

To use MetaDataScraper, follow these steps:

  1. Import the LoginlessScraper or the LoggedInScraper class:

    from MetaDataScraper import LoginlessScraper, LoggedInScraper
    
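  2. Initialize one of the scrapers with the Facebook page ID. LoggedInScraper also takes the email and password of a Facebook account:

    page_id = "your_target_page_id"

    # Option A: no login required
    scraper = LoginlessScraper(page_id)

    # Option B: log in with a Facebook account
    email = "your_facebook_email"
    password = "your_facebook_password"
    scraper = LoggedInScraper(page_id, email, password)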
    
  3. Scrape the Facebook page to retrieve information:

    result = scraper.scrape()
    
  4. Access the scraped data from the result dictionary:

    print(f"Followers: {result['followers']}")
    print(f"Post Texts: {result['post_texts']}")
    print(f"Post Likes: {result['post_likes']}")
    print(f"Post Shares: {result['post_shares']}")
    print(f"Is Video: {result['is_video']}")
    print(f"Video Links: {result['video_links']}")
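
The per-post keys shown above can be regrouped into one record per post, which is often easier to filter or export. This is a minimal sketch under two assumptions that are not confirmed here: that the per-post values are parallel lists of equal length, and that the sample values below resemble real output (they are invented for illustration). `to_post_records` is a hypothetical helper, not part of the package.

```python
POST_KEYS = ["post_texts", "post_likes", "post_shares", "is_video", "video_links"]


def to_post_records(result):
    """Turn the parallel per-post lists of a result dict into one dict per post.

    Assumes every key in POST_KEYS maps to a list and that index i of each
    list describes the same post. zip() stops at the shortest list, so
    unequal lengths silently drop trailing entries.
    """
    columns = [result[key] for key in POST_KEYS]
    return [dict(zip(POST_KEYS, row)) for row in zip(*columns)]


# Invented sample data; the real value types (str or int) may differ.
sample_result = {
    "followers": "1.2K",
    "post_texts": ["Hello", "Launch video"],
    "post_likes": ["10", "25"],
    "post_shares": ["1", "4"],
    "is_video": [False, True],
    "video_links": [None, "https://example.com/video"],
}

records = to_post_records(sample_result)
for record in records:
    print(record)
```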
    

Features

  • Automated Extraction: Automatically fetches follower counts, post texts, likes, shares, and video links from Facebook pages.
  • Comprehensive Data Retrieval: Retrieves detailed information about each post, including text content, interaction metrics (likes, shares), and multimedia (e.g., video links).
  • Flexible Handling: Handles the different post structures and content types found on Facebook pages, such as text posts and reels.
  • Enhanced Access with Logged-In Scraper: Overcomes the limitations of anonymous (loginless) scraping by using Facebook account credentials for broader page access.
  • Headless Operation: Runs scraping tasks in headless mode, so data is collected without displaying a browser window.
  • Scalability: Supports large volumes of data extraction, making it suitable for monitoring multiple Facebook pages.
  • Dependency Management: Utilizes Selenium WebDriver for robust web automation and scraping capabilities, compatible with Python 3.x environments.
  • Ease of Use: Simplifies the process with straightforward initialization and method calls, facilitating quick integration into existing workflows.
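
For monitoring several pages, the scraper can be driven from a simple loop that records failures instead of stopping at the first one. The sketch below is sequential and makes no claim about running several scraper instances in parallel. `scrape_pages` and `FakeScraper` are hypothetical names used for illustration, and the sketch assumes a failed scrape raises an exception.

```python
def scrape_pages(page_ids, make_scraper):
    """Scrape each page in turn, collecting results and errors separately.

    make_scraper is a callable that takes a page ID and returns an object
    with a scrape() method, for example the LoginlessScraper class.
    """
    results = {}
    errors = {}
    for page_id in page_ids:
        try:
            results[page_id] = make_scraper(page_id).scrape()
        except Exception as exc:
            # Assumption: a failed scrape is reported as an exception.
            errors[page_id] = str(exc)
    return results, errors


# Stand-in scraper (hypothetical) so the sketch runs without a browser.
class FakeScraper:
    def __init__(self, page_id):
        self.page_id = page_id

    def scrape(self):
        if self.page_id == "missing_page":
            raise RuntimeError("page not accessible")
        return {"followers": "0", "post_texts": []}


results, errors = scrape_pages(["page_a", "missing_page"], FakeScraper)
print(results)
print(errors)
```

With the real package, the call would pass `LoginlessScraper` (or a small function that builds a `LoggedInScraper`) in place of `FakeScraper`.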

Dependencies

  • selenium
  • webdriver_manager

License

This project is licensed under the Apache Software License Version 2.0 - see the LICENSE file for details.
