
Python package to scrape the front end of Facebook pages with no limitations

Project description

Facebook Page Scraper


No registration, no API key, and no limit on the number of requests. Import the library and just do it!

Prerequisites

  • Internet Connection
  • Python 3.6+
  • Chrome or Firefox browser installed on your machine


Installation:

Installing from source:

git clone https://github.com/shaikhsajid1111/facebook_page_scraper 

Inside the project's directory:

python3 setup.py install

Installing from PyPI:

pip3 install facebook-page-scraper


How to use?

#import Facebook_scraper class from facebook_page_scraper
from facebook_page_scraper import Facebook_scraper

#instantiate the Facebook_scraper class

page_name = "facebookai"
posts_count = 10
browser = "firefox"

facebook_ai = Facebook_scraper(page_name,posts_count,browser)

Parameters for Facebook_scraper(page_name,posts_count,browser) class

Parameter Name | Parameter Type | Description
page_name      | string         | name of the Facebook page
posts_count    | integer        | number of posts to scrape; defaults to 10 if not passed
browser        | string         | browser to use, either chrome or firefox; defaults to chrome if not passed
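
The documented types and defaults can be checked before a browser is ever launched. The sketch below is not part of facebook_page_scraper: `check_scraper_args` and `SUPPORTED_BROWSERS` are hypothetical names, and lowercasing the browser name is this sketch's own choice. It only mirrors the table above.

```python
SUPPORTED_BROWSERS = ("chrome", "firefox")  # the two browsers named in the table


def check_scraper_args(page_name, posts_count=10, browser="chrome"):
    """Hypothetical helper: validate arguments against the documented table.

    Returns the arguments as a tuple, ready to unpack into Facebook_scraper(...).
    """
    if not isinstance(page_name, str) or not page_name.strip():
        raise ValueError("page_name must be a non-empty string")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(posts_count, bool) or not isinstance(posts_count, int) or posts_count < 1:
        raise ValueError("posts_count must be a positive integer")
    if not isinstance(browser, str) or browser.lower() not in SUPPORTED_BROWSERS:
        raise ValueError("browser must be 'chrome' or 'firefox'")
    return page_name.strip(), posts_count, browser.lower()


args = check_scraper_args("facebookai", 10, "Firefox")
print(args)
# With the package installed, the tuple could then be unpacked:
# facebook_ai = Facebook_scraper(*args)
```

Failing early with a ValueError is cheaper than discovering a typo in the browser name after the scraper has started.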



Done with instantiation? Let the scraping begin!


For posts' data in JSON format:

#call the scrap_to_json() method

json_data = facebook_ai.scrap_to_json()
print(json_data)

Output:

{
    "1739843239525955": {
        "name": "Facebook AI",
        "shares": 43,
        "reactions": {
            "likes": 129,
            "loves": 11,
            "wow": 8,
            "cares": 0,
            "sad": 0,
            "angry": 0,
            "haha": 0
        },
        "reaction_count": 148,
        "comments": 3,
        "content": "We’re transitioning the Visdom project to the team at FOSSASIA. Visdom is a flexible tool for creating, organizing, and sharing visualizations of live, rich data. It aims to facilitate visualization of remote data with an emphasis on supporting scientific experimentation. We’re excited to see where the team, in collaboration with the developer and user community, take the project.",
        "posted_on": "2021-01-05T17:22:54",
        "video": "https://www.facebook.com/facebookai/videos/1739843239525955",
        "image": [
            "https://scontent-bom1-2.xx.fbcdn.net/v/t1.0-0/s526x296/135871741_1739843246192621_8564947121610203331_o.png?_nc_cat=108&ccb=2&_nc_sid=da1649&_nc_ohc=Hk7peLe8e-cAX_xLejp&_nc_ht=scontent-bom1-2.xx&_nc_tp=30&oh=856a17109cbc4a6657dbb68564dfc568&oe=60291FC7"
        ],
        "post_url": "https://www.facebook.com/facebookai/posts/1739843239525955"
    }, ...

}
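
Once the data is available, it can be loaded into ordinary Python objects. The sketch below assumes that scrap_to_json() returns JSON text, as the printed sample suggests, and also accepts an already-parsed dictionary in case it does not. `load_posts` is a hypothetical helper, and the sample is a trimmed copy of the output above.

```python
import json

# Trimmed copy of the sample output above: the content text is shortened
# and the image URLs are omitted for brevity.
sample_output = """
{
    "1739843239525955": {
        "name": "Facebook AI",
        "shares": 43,
        "reactions": {"likes": 129, "loves": 11, "wow": 8, "cares": 0,
                      "sad": 0, "angry": 0, "haha": 0},
        "reaction_count": 148,
        "comments": 3,
        "content": "We're transitioning the Visdom project to the team at FOSSASIA.",
        "posted_on": "2021-01-05T17:22:54",
        "video": "https://www.facebook.com/facebookai/videos/1739843239525955",
        "image": [],
        "post_url": "https://www.facebook.com/facebookai/posts/1739843239525955"
    }
}
"""


def load_posts(json_data):
    """Return the posts as a dict keyed by post id.

    Assumption: json_data is either JSON text or an already-parsed dict.
    """
    return json.loads(json_data) if isinstance(json_data, str) else dict(json_data)


posts = load_posts(sample_output)
for post_id, post in posts.items():
    print(post_id, post["name"], post["shares"], post["comments"])
```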

Output Structure for JSON format:

{
    "id": {
        "name": string,    
        "shares": integer,
        "reactions": {
            "likes": integer,
            "loves": integer,
            "wow": integer,
            "cares": integer,
            "sad": integer,
            "angry": integer,
            "haha": integer
        },
        "reaction_count": integer,
        "comments": integer,
        "content": string,
        "video" : string,
        "image" : list,
        "posted_on": datetime,  //string containing datetime in ISO 8601
        "post_url": string
    }
}
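
The structure above lends itself to a quick sanity check. In the sample output, reaction_count equals the sum of the individual reactions (129 + 11 + 8 = 148); whether that holds for every post is an assumption, so the hypothetical `check_post` helper below only reports it. The timestamp format string assumes no fractional seconds or timezone offset, as in the sample.

```python
from datetime import datetime

REACTION_KEYS = ("likes", "loves", "wow", "cares", "sad", "angry", "haha")


def check_post(post):
    """Return (posted_on as datetime, True if reaction_count matches the per-type sum)."""
    # Assumed timestamp layout, taken from the sample: 2021-01-05T17:22:54
    posted_on = datetime.strptime(post["posted_on"], "%Y-%m-%dT%H:%M:%S")
    total = sum(post["reactions"][key] for key in REACTION_KEYS)
    return posted_on, total == post["reaction_count"]


# Values copied from the sample output; unrelated keys are left out.
post = {
    "reactions": {"likes": 129, "loves": 11, "wow": 8, "cares": 0,
                  "sad": 0, "angry": 0, "haha": 0},
    "reaction_count": 148,
    "posted_on": "2021-01-05T17:22:54",
}
posted_on, consistent = check_post(post)
print(posted_on.year, consistent)
```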



To save posts' data directly to a CSV file:

#call scrap_to_csv(filename,directory) method


filename = "data_file"  # file name without the .csv extension, where data will be saved
directory = r"E:\data"  # directory where the CSV file will be saved (raw string keeps the backslash literal)
facebook_ai.scrap_to_csv(filename,directory)

content of data_file.csv:

id,name,shares,likes,loves,wow,cares,sad,angry,haha,reactions_count,comments,content,posted_on,video,image,post_url
1791700921006853,Facebook AI,45,150,19,5,0,0,0,0,174,8,"Facebook AI has built TimeSformer, an entirely new architecture for video understanding. It’s the first that’s based exclusively on the self-attention mechanism used in Transformers.  TimeSformer outperforms the state of the art while being more efficient than 3D ConvNets for video.",2021-03-15T17:14:30,,https://scontent-bom1-2.xx.fbcdn.net/v/t39.2365-6/p540x282/156274680_471569777206221_706631440205169419_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=eaa83b&_nc_ohc=eyfETEUuHzQAX8DqwMU&_nc_ht=scontent-bom1-2.xx&tp=6&oh=2e9c6490fe3ad19a398905b3b615c88b&oe=6075FFE4,https://www.facebook.com/FacebookAI/posts/1791700921006853?__xts__%5B0%5D=68.ARCfsjOoZa0yc0TPws1koBr9ezS44Xf6Up04CqOhWnoDqrO35NdIdgjNSTWBrsUtm_y7MamZTjc_-p2rTobXe5WvxWd_eywuSzt98B7Vaj5hobF4OTZhe7VRgVJJY1wxEeAJf4nCZSs1tF1gWJJ0s5pPUGMmJsfD1UM5a3eERo-2t1JnTBHOSYs9Xsj5fV0iL-FiWAms_2-9KNRGqoojg9KfSAlffh_qxL8ztgznqC1sxfcU6MwAqdPN2va_T8cez29ZvJ1Er1j26VR7pnpWGyTMuW5wMrNxC-pz_8pVls8uk0iDramIOA&__tn__=-R
...
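
The saved file can be read back with Python's standard csv module. The sketch below uses a shortened stand-in for data_file.csv (same header, with the long content, image, and post_url values trimmed) so that it runs without scraping anything. `read_posts` is a hypothetical helper.

```python
import csv
import io

# Shortened stand-in for data_file.csv: same header as above, long values trimmed.
csv_text = (
    "id,name,shares,likes,loves,wow,cares,sad,angry,haha,reactions_count,"
    "comments,content,posted_on,video,image,post_url\n"
    "1791700921006853,Facebook AI,45,150,19,5,0,0,0,0,174,8,"
    '"Facebook AI has built TimeSformer, an entirely new architecture for video understanding.",'
    "2021-03-15T17:14:30,,,https://www.facebook.com/FacebookAI/posts/1791700921006853\n"
)

INT_COLUMNS = ("shares", "likes", "loves", "wow", "cares", "sad", "angry",
               "haha", "reactions_count", "comments")


def read_posts(handle):
    """Yield one dict per CSV row, with the count columns converted to int."""
    for row in csv.DictReader(handle):
        for column in INT_COLUMNS:
            row[column] = int(row[column])
        yield row


rows = list(read_posts(io.StringIO(csv_text)))
print(rows[0]["id"], rows[0]["reactions_count"], rows[0]["posted_on"])
```

For a real file, `open(path, newline="", encoding="utf-8")` can replace the `io.StringIO` object.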



Parameters for scrap_to_csv(filename,directory) method.

Parameter Name | Parameter Type | Description
filename       | string         | name of the CSV file (without extension) where the posts' data will be saved
directory      | string         | directory where the CSV file will be stored
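
Whether scrap_to_csv creates a missing directory is not stated here, so it may be safer to create it beforehand. The sketch below is a hypothetical helper, not part of the package: `prepare_csv_target` makes sure the directory exists and returns the path where the file is expected to appear, assuming the library appends `.csv` to the file name as the `data_file.csv` example suggests.

```python
import tempfile
from pathlib import Path


def prepare_csv_target(filename, directory):
    """Hypothetical helper: create `directory` if needed and return the expected CSV path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    return target_dir / f"{filename}.csv"


# Demonstrated in a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = prepare_csv_target("data_file", Path(tmp) / "data")
    print(csv_path.name, csv_path.parent.is_dir())
    # With the package installed:
    # facebook_ai.scrap_to_csv("data_file", str(csv_path.parent))
```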



Keys of the outputs:

Key            | Type       | Description
id             | string     | post identifier (an integer cast to a string)
name           | string     | name of the page
shares         | integer    | share count of the post
reactions      | dictionary | dictionary mapping each reaction to its count. Keys => ["likes","loves","wow","cares","sad","angry","haha"]
reaction_count | integer    | total reaction count of the post
comments       | integer    | comment count of the post
content        | string     | content of the post as text
video          | string     | URL of the video present in the post
image          | list       | Python list containing the URLs of all images present in the post
posted_on      | datetime   | time at which the post was published (string in ISO 8601 format)
post_url       | string     | URL of the post
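
The JSON keys and the CSV columns shown earlier carry the same information in two shapes: the CSV spreads the reactions dictionary over one column per reaction and names the total reactions_count. The sketch below illustrates that mapping with a hypothetical `flatten_post` helper. It is not the library's own conversion, and joining several image URLs with a space is purely this sketch's assumption.

```python
CSV_COLUMNS = ["id", "name", "shares", "likes", "loves", "wow", "cares", "sad",
               "angry", "haha", "reactions_count", "comments", "content",
               "posted_on", "video", "image", "post_url"]


def flatten_post(post_id, post):
    """Map one JSON entry onto the CSV column names (illustrative only)."""
    row = {"id": post_id}
    row.update(post["reactions"])  # likes, loves, wow, cares, sad, angry, haha
    row["reactions_count"] = post["reaction_count"]
    for key in ("name", "shares", "comments", "content", "posted_on", "video", "post_url"):
        row[key] = post[key]
    # Assumption of this sketch: several image URLs are joined with a space.
    row["image"] = " ".join(post["image"])
    return {column: row[column] for column in CSV_COLUMNS}


# Values taken from the sample JSON output; content shortened, images omitted.
example = {
    "name": "Facebook AI",
    "shares": 43,
    "reactions": {"likes": 129, "loves": 11, "wow": 8, "cares": 0,
                  "sad": 0, "angry": 0, "haha": 0},
    "reaction_count": 148,
    "comments": 3,
    "content": "We're transitioning the Visdom project to the team at FOSSASIA.",
    "posted_on": "2021-01-05T17:22:54",
    "video": "https://www.facebook.com/facebookai/videos/1739843239525955",
    "image": [],
    "post_url": "https://www.facebook.com/facebookai/posts/1739843239525955",
}
flat = flatten_post("1739843239525955", example)
print(list(flat))
```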


Privacy

This scraper only scrapes public data available to unauthenticated users and is not capable of scraping anything private.



Tech

This project uses different libraries to work properly.



If you encounter anything unusual, please feel free to create an issue on the project's GitHub repository.

LICENSE

MIT

