Super lightweight Instagram web scraper for data analysis

Project description

instascrape: Instagram scraping for humans

What is it?

instascrape is a powerful, lightweight Python library for scraping Instagram data and content with no configuration necessary! It is designed with flexibility and developer productivity in mind, so you can stop wasting valuable time preparing Instagram data and just start analyzing it :muscle:

Official website

Key features

  • :muscle: Powerful, object-oriented scraping tools
  • :dancer: Flexible input: scrape from HTML, JSON, or a BeautifulSoup object, or let the scraper request and scrape the URL itself
  • :floppy_disk: Download content to your computer as png, jpg, mp4, and mp3
  • :art: Dynamically retrieve HTML embed code for posts
  • :musical_score: Expressive and consistent API for concise and elegant code
  • :bar_chart: Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
  • :hammer: Lightweight: you don't have to build a hammer factory when all you need is the hammer
  • :spider_web: The only hard dependencies are Requests and Beautiful Soup; no more worrying about configurations or webdrivers
  • :watch: Proven to work as of December 2020
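
To make the "flexible input" feature above more concrete, here is a minimal sketch of how a scraper might tell the different kinds of input apart. The function `classify_source` and its labels are hypothetical names invented for this illustration; they are not part of instascrape, and the library's real dispatch logic may work differently.

```python
import json


def classify_source(source):
    """Hypothetical helper (not instascrape's API): label the kind of input given."""
    if isinstance(source, dict):
        # Already-parsed JSON data
        return "json"
    if isinstance(source, str):
        text = source.strip()
        if text.startswith(("http://", "https://")):
            # Looks like a URL that would still need to be requested
            return "url"
        if text.startswith("<"):
            # Looks like raw HTML markup
            return "html"
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass
    raise TypeError(f"Unsupported source: {source!r}")


print(classify_source("https://www.instagram.com/google/"))
print(classify_source("<html><body></body></html>"))
print(classify_source('{"followers": 1}'))
```

A BeautifulSoup object would need one more `isinstance` branch; it is left out here so the sketch runs with the standard library alone.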


:computer: Installation

Minimum Python version

This library currently requires Python 3.7 or higher.

pip

Install from PyPI using

$ pip3 install insta-scrape

WARNING: make sure you install insta-scrape and not a package with a similar name!
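
If you want to confirm the setup before scraping, a small check along these lines may help. It is only a sketch: `environment_report` is a hypothetical helper written for this example, and it assumes the package installs under the import name `instascrape`, as used in the sample usage.

```python
import importlib.util
import sys

# Minimum version stated in this README
MIN_PYTHON = (3, 7)


def environment_report():
    """Hypothetical helper: report whether the interpreter and package look usable."""
    python_ok = sys.version_info >= MIN_PYTHON
    # find_spec returns None when no top-level module with that name can be found
    importable = importlib.util.find_spec("instascrape") is not None
    return {"python_ok": python_ok, "instascrape_importable": importable}


print(environment_report())
```

If `instascrape_importable` is `False` after installing, it may be worth checking that `pip3` belongs to the same interpreter you are running.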


:mag_right: Sample Usage

All top-level, ready-to-use features can be imported using:

from instascrape import *

instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.

# Instantiate the scraper objects 
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')

# Scrape their respective data 
google.scrape()
google_post.scrape()
google_hashtag.scrape()

After being scraped, relevant attributes can be accessed with dot or bracket notation:

# Example output is shown in the trailing comments
print(google.followers)                # 12262794
print(google_post['hashtags'])         # ['growwithgoogle']
print(google_hashtag.amount_of_posts)  # 9053408
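
For readers curious how one object can support both dot and bracket access, the pattern can be sketched in a few lines. `ScrapedRecord` below is a hypothetical stand-in written for this illustration; it is not an instascrape class and makes no claim about how the library implements this internally.

```python
class ScrapedRecord:
    """Hypothetical stand-in for a scraped object (not part of instascrape)."""

    def __init__(self, **data):
        # Store each scraped field as a regular attribute
        for key, value in data.items():
            setattr(self, key, value)

    def __getitem__(self, key):
        # Bracket access simply delegates to attribute lookup
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None


record = ScrapedRecord(followers=12262794, hashtags=["growwithgoogle"])
print(record.followers)
print(record["hashtags"])
```

Delegating `__getitem__` to `getattr` keeps a single source of truth for the data, so the two notations cannot drift apart.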

:books: Documentation

The official documentation can be found on Read The Docs :newspaper:


:newspaper: Blog Posts

Check out blog posts on the official site or DEV for ideas and tutorials!


:pray: Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!

Feel free to open an Issue, check out existing Issues, or start a discussion.

Beginners to open source are highly encouraged to participate and to ask questions if they are unsure what to do or where to start :heart:


:spider_web: Dependencies

instascrape primarily relies on two third-party libraries for requesting and scraping Instagram HTML content:

  1. Requests: HTTP requests
  2. Beautiful Soup: scraping and parsing HTML data

The rest of its functionality comes directly from Python 3's standard library, which keeps the code under the hood unobtrusive with little to no overhead.


:credit_card: License

MIT


:grey_question: Support

Check out the FAQ.

Reach out to me if you have questions or ideas!



Source Distribution

insta-scrape-1.3.4.tar.gz (18.9 kB view details)

Uploaded Source

File details

Details for the file insta-scrape-1.3.4.tar.gz.

File metadata

  • Download URL: insta-scrape-1.3.4.tar.gz
  • Size: 18.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.1 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.0

File hashes

Hashes for insta-scrape-1.3.4.tar.gz
Algorithm Hash digest
SHA256 f5eacb6c0ff0faa5a45d80e27e35f9b88ff68d9e597ac4ccbc6f3583ef26a019
MD5 00a00608909ef646d9fd6ebe21d0fc85
BLAKE2b-256 c03f3d95c2b74da049d31a03534204618dbc2be08d30e034294be64d5c754a0e

See more details on using hashes here.
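
As a rough illustration of how the SHA256 digest listed above could be checked against a downloaded archive, here is a sketch using only the standard library. The helper names `sha256_of` and `matches_expected` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for insta-scrape-1.3.4.tar.gz
EXPECTED_SHA256 = "f5eacb6c0ff0faa5a45d80e27e35f9b88ff68d9e597ac4ccbc6f3583ef26a019"


def sha256_of(path, chunk_size=65536):
    """Hypothetical helper: hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest equals the expected one."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected("insta-scrape-1.3.4.tar.gz")` on the downloaded file should return `True` if the archive is intact; a `False` result suggests a corrupted or different file.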
