Super lightweight Instagram web scraper for data analysis
instascrape: powerful Instagram data scraping toolkit
What is it?
instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.
Key features
Here are a few of the things that instascrape does well:
- Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
- Scrapes from HTML, BeautifulSoup objects, and JSON
- Download content to your computer as png, jpg, mp4, and mp3
- Dynamically retrieve HTML embed code for posts
- Expressive and consistent API for concise and elegant code
- Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
- Lightweight; no boilerplate or configurations necessary
- The only hard dependencies are Requests and Beautiful Soup
- Proven to work as of January 2021
:computer: Installation
Minimum Python version
This library currently requires Python 3.7 or higher.
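If you want to confirm the interpreter meets this requirement before installing, a minimal sketch using only the standard library could look like the following. The helper name `python_is_supported` and the constant `MIN_PYTHON` are illustrative and are not part of instascrape.

```python
import sys

# Minimum version stated in this README; the names below are hypothetical.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair meets the minimum version."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version OK")
else:
    print("Python 3.7 or higher is required")
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings, where "3.10" would sort before "3.7".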
pip
Install from PyPI using
$ pip3 install insta-scrape
WARNING: make sure you install insta-scrape and not a package with a similar name!
:mag_right: Sample Usage
All top-level, ready-to-use features can be imported using:
from instascrape import *
instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.
# Instantiate the scraper objects
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')
# Scrape their respective data
google.scrape()
google_post.scrape()
google_hashtag.scrape()
print(google.followers)                # 12262794
print(google_post['hashtags'])         # ['growwithgoogle']
print(google_hashtag.amount_of_posts)  # 9053408
See the Scraped data points section of the Wiki for a complete list of the scraped attributes provided by each scraper.
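Since scraped values are exposed as plain attributes, one way to prepare them for analysis is to gather the attributes you care about into a dictionary per object. The sketch below is an assumption about how you might do this, not part of the library: `collect_attributes` is a hypothetical helper, and a `SimpleNamespace` stands in for a scraped `Profile` so the example runs offline. The follower count is the one printed in the sample above.

```python
from types import SimpleNamespace


def collect_attributes(scraper, names, default=None):
    """Return {name: value} for each requested attribute of an object.

    Hypothetical helper: attributes that are absent map to `default`.
    """
    return {name: getattr(scraper, name, default) for name in names}


# Offline stand-in for a scraped Profile (not a real instascrape object).
google = SimpleNamespace(followers=12262794)

# "missing_field" is a deliberately absent name to show the default.
row = collect_attributes(google, ["followers", "missing_field"])
print(row)
```

A list of such rows can be handed to whatever tabular tool you prefer, for example as the input to a pandas `DataFrame`.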
:books: Documentation
The official documentation can be found on Read the Docs.
:newspaper: Blog Posts
Check out blog posts on the official site or DEV for ideas and tutorials!
- Scrape data from Instagram with instascrape
- Visualizing Instagram engagement with instascrape
- Exploratory data analysis of Instagram using instascrape and Python
- Creating a scatter matrix of Instagram data using Python
- Downloading an Instagram profile's recent photos using Python
- Scraping 25,000 data points from Joe Biden's Instagram using instascrape
- Compare major tech Instagram page's with instascrape
- Tracking an Instagram posts engagement in real time with instascrape
- Dynamically generate embeddable Instagram HTML with instascrape
- Scraping an Instagram location tag with instascrape
- Scraping Instagram reels with instascrape
- Scraping IGTV data with instascrape
- Scraping 10,000 data points from Donald Trump's Instagram with Python
:pray: Contributing
All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!
Feel free to open an Issue, check out existing Issues, or start a discussion.
Beginners to open source are highly encouraged to participate, and to ask questions if unsure what to do or where to start :heart:
:spider_web: Dependencies
:credit_card: License
This library operates under the MIT license.
:grey_question: Support
Check out the FAQ
Reach out to me if you want to connect or have any questions!
- Email:
- Twitter:
DISCLAIMER: With great power comes great responsibility. This is a research project and I am not responsible for how you use it. The library itself is designed to be responsible and respectful, and it is up to you to decide what you do with it.
Download files
Source Distribution
File details
Details for the file insta-scrape-2.1.2.tar.gz.
File metadata
- Download URL: insta-scrape-2.1.2.tar.gz
- Upload date:
- Size: 21.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2de91ac950b4104fce9cf4752d78fac455aa036f5113de150e14db1cbf4af9ef
MD5 | f1de092c76dd1619d400f00e821a230a
BLAKE2b-256 | 3b97b940e4afa24f53388607b7c997b75e3013d785f6cc3740c5d6ec41c018f0
Built Distribution
File details
Details for the file insta_scrape-2.1.2-py3-none-any.whl.
File metadata
- Download URL: insta_scrape-2.1.2-py3-none-any.whl
- Upload date:
- Size: 26.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9fb31a2cb573cfe0ed25d93c4875c0a9be48498a1b031a49d130bb5c6a43b754
MD5 | 19774b532a9e480fa5a64afd0b843c71
BLAKE2b-256 | b28c73305c1d502612d76f57f67edd90dd65cc6120b15db47fcb5bf54d38a3cc