Super lightweight Instagram web scraper for data analysis

instascrape: powerful Instagram data scraping toolkit

What is it?

instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is designed as a high-level building block in the data scientist's toolchain and can be integrated and extended with industry-standard tools for web scraping, data science, and analysis.

Key features

Here are a few of the things that instascrape does well:

  • Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
  • Scrapes from HTML, BeautifulSoup objects, and JSON
  • Download content to your computer as png, jpg, mp4, and mp3
  • Dynamically retrieve HTML embed code for posts
  • Expressive and consistent API for concise and elegant code
  • Designed for seamless integration with Selenium, Pandas, and other industry-standard tools for data collection and analysis
  • Lightweight; no boilerplate or configurations necessary
  • The only hard dependencies are Requests and Beautiful Soup
  • Confirmed working as of January 2021

:computer: Installation

Minimum Python version

This library currently requires Python 3.7 or higher.

pip

Install from PyPI using

$ pip3 install insta-scrape

WARNING: make sure you install insta-scrape and not a package with a similar name!
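
To double-check the two requirements above (the Python version and the exact distribution name), something like the following sketch may help. It is not part of instascrape: the helper names python_is_supported and installed_version are made up for illustration, and importlib.metadata is only in the standard library from Python 3.8 onwards.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

MINIMUM_PYTHON = (3, 7)       # minimum version stated in this README
DISTRIBUTION = "insta-scrape"  # distribution name used in the pip command


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    current = tuple(sys.version_info if version_info is None else version_info)
    return current[:2] >= tuple(minimum)


def installed_version(distribution=DISTRIBUTION):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print(DISTRIBUTION, "version:", installed_version() or "not installed")
```

If the second line reports "not installed" even though pip succeeded, a similarly named package may have been installed instead.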


:mag_right: Sample Usage

All top-level, ready-to-use features can be imported using:

from instascrape import *

instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.

# Instantiate the scraper objects 
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')

# Scrape their respective data 
google.scrape()
google_post.scrape()
google_hashtag.scrape()

# Access the scraped data (example output shown in the comments)
print(google.followers)                # 12262794
print(google_post['hashtags'])         # ['growwithgoogle']
print(google_hashtag.amount_of_posts)  # 9053408

See the Scraped data points section of the Wiki for a complete list of the scraped attributes provided by each scraper.
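
Since scraped values are exposed as plain attributes (as with google.followers above), they are easy to collect into rows for later analysis. The sketch below is only an illustration and uses the standard library alone. FakeProfile is a stand-in so the example runs without network access, the username attribute is an assumption, and collect_rows and rows_to_csv are hypothetical helpers rather than instascrape functions.

```python
import csv
import io


def collect_rows(scrapers, fields):
    """Build one dict per object, using None for attributes that are missing."""
    return [{name: getattr(obj, name, None) for name in fields} for obj in scrapers]


def rows_to_csv(rows, fields):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


class FakeProfile:
    """Stand-in for an already scraped Profile; NOT part of instascrape."""

    def __init__(self, username, followers):
        self.username = username    # assumed attribute name
        self.followers = followers  # attribute shown in the sample above


fields = ["username", "followers"]
profiles = [FakeProfile("google", 12262794)]
rows = collect_rows(profiles, fields)
print(rows_to_csv(rows, fields))
```

With real scraper objects, the same list of dicts could be passed to a tool such as Pandas instead of being written out as CSV.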

:books: Documentation

The official documentation can be found on Read the Docs.


:newspaper: Blog Posts

Check out blog posts on the official site or DEV for ideas and tutorials!


:pray: Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!

Feel free to open an Issue, check out existing Issues, or start a discussion.

If you are new to open source, you are highly encouraged to participate and to ask questions if you are unsure what to do or where to start :heart:


:spider_web: Dependencies


:credit_card: License

This library operates under the MIT license.


:grey_question: Support

Check out the FAQ.

Reach out to me if you want to connect or have any questions!


DISCLAIMER: With great power comes great responsibility. This is a research project, and I am not responsible for how you use it. The library itself is designed to be responsible and respectful; what you do with it is up to you.
