Super lightweight Instagram web scraper for data analysis
instascrape: super lightweight Instagram scraping toolkit
What is it?
instascrape is a lightweight set of tools for scraping Instagram data. It makes no assumptions about your project and is designed for flexibility and developer productivity. It suits the seasoned data scientist who wants a quick read on a page's engagement as well as beginners exploring web scraping and Python for the first time.
Example of Instagram likes per post data scraped using instascrape (this repository and its author(s) are not affiliated with Real Python)
Installation
pip
Install from PyPI using
$ pip3 install insta-scrape
Clone
Clone directly from GitHub to your local machine using
$ git clone https://github.com/chris-greening/instascrape.git
and install required dependencies using
$ pip3 install -r requirements.txt
Documentation
The official documentation can be found on Read the Docs.
Features
Profile
Representation of an Instagram profile. Calling static_load requests and scrapes the static HTML for the given URL or username. Profile.static_load scrapes 36 data points including
followers: int
following: int
posts: int
profile_pic_url: str
is_business_account: bool
is_verified: bool
#etc.
Sample code:
from instascrape import Profile
url = 'https://www.instagram.com/gvanrossum/'
profile = Profile(url)
# Request the page and scrape its static HTML
profile.static_load()
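Once static_load has run, the scraped data points can be gathered into a plain dictionary for analysis. The sketch below assumes each data point listed above is exposed as an attribute of the same name on the loaded object, which this description does not confirm. collect_fields is a hypothetical helper, not part of instascrape, and a SimpleNamespace with made-up values stands in for a loaded Profile so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable

# Field names taken from the Profile list above.
PROFILE_FIELDS = (
    "followers",
    "following",
    "posts",
    "profile_pic_url",
    "is_business_account",
    "is_verified",
)


def collect_fields(scraped: Any, fields: Iterable[str]) -> Dict[str, Any]:
    """Copy the named attributes from a scraped object into a dict.

    Missing attributes become None instead of raising, because the exact
    attribute names are an assumption.
    """
    return {name: getattr(scraped, name, None) for name in fields}


# Stand-in for a Profile after static_load(); the values are made up.
fake_profile = SimpleNamespace(
    followers=1200, following=150, posts=42, is_verified=False
)
summary = collect_fields(fake_profile, PROFILE_FIELDS)
print(summary)
```

If the real attribute names differ, only PROFILE_FIELDS needs to change; the helper itself makes no assumption beyond attribute access.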
Post
Representation of a single Instagram post. Calling static_load requests and scrapes the static HTML for the given URL or post shortcode. Post.static_load scrapes 29 data points including
likes: int
amount_of_comments: int
hashtags: List[str]
tagged_users: List[str]
caption: str
location: str
#etc.
Sample code:
from instascrape import Post
url = 'https://www.instagram.com/p/CFcSLyBgseW/'
post = Post(url)
post.static_load()
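The like and comment counts above are the usual inputs to an engagement estimate. A minimal sketch, assuming engagement is defined as (likes + comments) / followers, which is one common convention and not something instascrape prescribes. engagement_rate is a hypothetical helper and the numbers are made up.

```python
def engagement_rate(likes: int, comments: int, followers: int) -> float:
    """Return (likes + comments) / followers as a fraction.

    Returns 0.0 for an account with no followers to avoid dividing by zero.
    """
    if followers <= 0:
        return 0.0
    return (likes + comments) / followers


rate = engagement_rate(likes=450, comments=50, followers=10_000)
print(f"{rate:.2%}")  # prints 5.00%
```

The follower count would come from the post owner's profile, so a real pipeline would combine one Profile scrape with one or more Post scrapes.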
Hashtag
Representation of an Instagram hashtag page. Calling static_load requests and scrapes the static HTML for the given URL or hashtag name. Hashtag.static_load scrapes 10 data points including
amount_of_posts: int
name: str
is_following: bool
allow_following: bool
#etc.
Sample code:
from instascrape import Hashtag
url = 'https://www.instagram.com/explore/tags/python/'
hashtag = Hashtag(url)
hashtag.static_load()
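The sample URL follows the pattern https://www.instagram.com/explore/tags/<name>/. If you keep tags as plain strings, a small helper can build that URL for you. This is a sketch only: hashtag_url is a hypothetical helper, not part of instascrape, and the URL pattern is inferred solely from the sample above.

```python
from urllib.parse import quote


def hashtag_url(name: str) -> str:
    """Build a hashtag page URL from a tag name, with or without a leading '#'."""
    tag = name.strip().lstrip("#")
    if not tag:
        raise ValueError("hashtag name is empty")
    # Percent-encode the tag so non-ASCII names produce a valid URL path.
    return f"https://www.instagram.com/explore/tags/{quote(tag, safe='')}/"


print(hashtag_url("#python"))  # prints https://www.instagram.com/explore/tags/python/
```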
License
Support
Reach out to me if you have questions or ideas!