Facebook Scrapper
Project description
Facebook Post Scraper App
This app leverages streamlit and selenium to scrape public Facebook posts from designated accounts. Multiprocessing enables scraping from multiple accounts concurrently.
Key Features
Concurrent Scraping: Supports concurrent scraping from multiple Facebook accounts using multiprocessing, significantly speeding up the data collection process.
Important Note: Ensure the number of concurrent workers (called threads in the app) does not exceed the number of CPU cores on your machine, to avoid performance issues.
Configurable Scraping: Allows users to define the number of posts to scrape and the number of concurrent scraping threads through the app’s interface.
Data Processing Pipeline: Processes collected data through a pipeline to prepare it for analysis.
Random Forest Model: Utilizes the processed data to create a Random Forest model for predictions.
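The concurrency feature and the CPU-core note above can be sketched as follows. This is a minimal illustration, not the app's actual code: `scrape_account`, `capped_worker_count`, and `scrape_all` are hypothetical names, and the scraping function is a placeholder for the real Selenium logic.

```python
import os
from multiprocessing import Pool


def scrape_account(account_name: str) -> dict:
    # Placeholder for the real Selenium-driven scraping of one account.
    return {"account": account_name, "posts": []}


def capped_worker_count(requested: int, n_tasks: int) -> int:
    # Never start more workers than CPU cores or than there are accounts,
    # and always start at least one.
    cores = os.cpu_count() or 1
    return max(1, min(requested, cores, n_tasks))


def scrape_all(accounts: list[str], requested_workers: int) -> list[dict]:
    workers = capped_worker_count(requested_workers, len(accounts))
    with Pool(processes=workers) as pool:
        # Pool.map returns results in the same order as the input accounts.
        return pool.map(scrape_account, accounts)


if __name__ == "__main__":
    results = scrape_all(["account_a", "account_b", "account_c"], requested_workers=8)
    print([r["account"] for r in results])
```

Capping the worker count in code, rather than relying on the user, keeps a large value entered in the interface from oversubscribing the machine.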
Setup and Configuration:
Prerequisites
Python 3.9 or higher
streamlit
selenium
multiprocessing (included in the Python standard library)
pandas
numpy
scikit-learn (imported as sklearn)
scipy
webdriver-manager
matplotlib
Installation
1- Clone the repository:
git clone https://github.com/yourusername/facebook-post-scraper.git
cd facebook-post-scraper
2- Install the required packages:
pip install -r requirements.txt
Fake Accounts Setup
To use the app, you'll need to create at least two fake Facebook accounts.
Configure these accounts in the config.py file as follows:
email_account1 = "facebook@email.com"
password_account1 = "password"
email_account2 = "facebook@email.com"
password_account2 = "password"
email_account3 = "facebook@email.com"
password_account3 = "password"
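One way the numbered variables above could be gathered into a list of credentials is sketched below. This is an assumption about usage, not the app's documented behavior: `collect_accounts` is a hypothetical helper, and a `SimpleNamespace` stands in for `import config` so the example runs on its own.

```python
from types import SimpleNamespace


def collect_accounts(config) -> list[tuple[str, str]]:
    """Pair up email_accountN / password_accountN attributes for N = 1, 2, ..."""
    accounts = []
    n = 1
    while hasattr(config, f"email_account{n}") and hasattr(config, f"password_account{n}"):
        accounts.append(
            (getattr(config, f"email_account{n}"), getattr(config, f"password_account{n}"))
        )
        n += 1
    return accounts


# Stand-in for `import config`; the values are placeholders.
config = SimpleNamespace(
    email_account1="first@example.com", password_account1="password",
    email_account2="second@example.com", password_account2="password",
)

accounts = collect_accounts(config)
if len(accounts) < 2:
    raise ValueError("config.py must define at least two accounts")
```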
Running the App
To run the app, execute the following command:
streamlit run .\Scrap_and_predict_accounts.py
User Interface
The app features an intuitive interface with two sidebars:
- Scrap and predict accounts: Select and configure accounts for prediction.
- Training Model: Select and configure accounts for training.
Visualization
- The app generates visualizations to compare data from fake and true accounts.
- This includes plots that help in understanding the distribution and characteristics of the scraped data.
Usage Instructions
1. Define Scraping Parameters:
- Set the number of posts to scrape.
- Set the number of concurrent threads in the app’s sidebar.
2. Run the Scraper:
- Initiate the scraping process by clicking the appropriate button.
Capabilities of the App for Training Classification Models and Generating Metrics
The app can train several classification models, including:
- Logistic Regression
- K-Nearest Neighbors
- Random Forest
- Decision Tree
- Gradient Boosting
It also creates metrics such as:
- Accuracy
- Precision
- Recall
- F1
- Roc_auc
- Mean_metrics
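For reference, the first four metrics can be computed from the counts of true and false positives and negatives. The sketch below uses plain Python and assumes binary labels where 1 marks a fake account and 0 a true account; that encoding and the function name are assumptions. scikit-learn offers equivalents in `sklearn.metrics` (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`).

```python
def classification_metrics(y_true, y_pred):
    # Count each cell of the binary confusion matrix (1 = positive class).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels only, not real results.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```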
3. View Results:
- Once the scraping is complete, view the processed data and generated plots.
4. Model Training:
- The app will process the data through a pipeline and create the following models for prediction purposes:
- Logistic Regression
- K-Nearest Neighbors
- Random Forest
- Decision Tree
- Gradient Boosting
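The five models listed above correspond to standard scikit-learn estimators. The following is a minimal sketch of training and scoring them, assuming synthetic data from `make_classification` in place of the processed DataFrame; the app's actual features, pipeline, and hyperparameters are not described here and may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the processed account data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    # Fit on the training split, then score on held-out data.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }
```

Evaluating on a held-out split, as here, gives a fairer comparison between models than scoring on the training data.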
Using FastAPI for GET and POST requests
Additionally, you can use FastAPI to retrieve the true or fake accounts that you want to scrape (GET), and to send the processed DataFrame or the metrics to FastAPI (POST).
[Screenshot: true accounts fetched from FastAPI.]
[Screenshot: the DataFrame being sent to FastAPI as JSON data.]
[Screenshot: the metrics being sent to FastAPI.]
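As a rough illustration of the POST side, the sketch below builds a JSON request with the standard library. The endpoint URL, the payload fields, and the helper names are all hypothetical; the routes exposed by the project's FastAPI service are not documented here.

```python
import json
from urllib.request import Request, urlopen

METRICS_URL = "http://localhost:8000/metrics"  # hypothetical endpoint


def build_json_post(url: str, payload: dict) -> Request:
    # Serialize the payload and mark it as JSON so the server can parse it.
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 10.0) -> dict:
    # Performs the HTTP call and decodes a JSON response body.
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder payload; the values are illustrative, not real results.
request = build_json_post(METRICS_URL, {"model": "Random Forest", "accuracy": 0.75})
# send(request) would perform the call; it is omitted so the sketch runs offline.
```

A processed DataFrame could be sent the same way after converting it to JSON-compatible records, for example with pandas `DataFrame.to_dict(orient="records")`.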
Install facebook-scrapper package
package-facebook-scrapper is a Python package for scraping data from Facebook.
Installation
You can install the package using pip:
pip install package-facebook-scrapper
Final Notes
- Ensure you follow all guidelines and ethical considerations while scraping data from Facebook.
- Use this tool responsibly and only for purposes that comply with Facebook's policies and legal requirements.
File details
Details for the file facebook_scrapper_ml-0.1.tar.gz.
File metadata
- Download URL: facebook_scrapper_ml-0.1.tar.gz
- Upload date:
- Size: 16.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 04648411d1fa2b4abde18bb948715b1436d8e60bfa2fdeba3d3142a0bbcc2185
MD5 | f29ace135badd0d6b3c5d274efbeabb5
BLAKE2b-256 | 594d857707ee5c73a91854f0f9787f4556f2b316a9512b2c107e0b6750088561
File details
Details for the file facebook_scrapper_ml-0.1-py3-none-any.whl.
File metadata
- Download URL: facebook_scrapper_ml-0.1-py3-none-any.whl
- Upload date:
- Size: 19.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1d98bcff65a9b64d6ceaacfd1dc5f0dc48aaf2a1940d29fd7c0cc33eabc92703
MD5 | 6376869233dfed7addda0d5c39f76b12
BLAKE2b-256 | d7a13cf7d5813b3fe142bf3a52d868f6f94d1fd56f80d454e5ce24ee37fa8674