Professional Instagram data collection toolkit with automation features

InstaHarvest 🌾

Professional Instagram Data Collection Toolkit - A powerful and efficient library for Instagram automation, data collection, and analytics.

📖 Documentation | 🐛 Report Bug | 💡 Request Feature | 🤝 Contributing | 📋 Changelog

✨ Features

📊 Profile Statistics - Collect followers, following, posts count
🔗 Post & Reel Links - Intelligent scrolling and link collection
🏷️ Tagged Accounts - Extract tags from posts and reels
👥 Followers/Following - Collect lists with real-time output
💬 Direct Messaging - Send DMs with smart rate limiting
🤝 Follow/Unfollow - Manage following with rate limiting
⚡ Parallel Processing - Scrape multiple posts simultaneously
📑 Excel Export - Real-time data export to Excel
🌐 Shared Browser - Single browser for all operations
🔍 HTML Detection - Automatic structure change detection
📝 Professional Logging - Comprehensive logging system

🚀 Installation

Method 1: Install from PyPI (Recommended)

# Install the package
pip install instaharvest

# Install Playwright browser
playwright install chrome

Method 2: Install from GitHub (Latest Development Version)

Step 1: Clone the Repository

git clone https://github.com/mpython77/insta-harvester.git
cd insta-harvester

Step 2: Install Dependencies

# Install Python dependencies
pip install -r requirements.txt

# Install Playwright browser
playwright install chrome

Step 3: Install Package in Development Mode (Optional)

# Install as editable package
pip install -e .

OR simply use it without installation:

# Just make sure you're in the project directory
cd /path/to/insta-harvester

# Then run examples
python examples/save_session.py

🔧 Complete Setup Guide

Step 1: Verify Python Installation

# Check Python version (requires 3.8+)
python --version

# Should show: Python 3.8.0 or higher
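
The same check can be made from inside Python, for example at the top of a script. This is a minimal sketch: the helper name `python_is_supported` is hypothetical and not part of InstaHarvest, and the 3.8 minimum is taken from the requirement stated above.

```python
import sys

MIN_VERSION = (3, 8)  # minimum version stated in this guide


def python_is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {sys.version.split()[0]}: {status}")
```

Comparing tuples avoids parsing the version string by hand.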

Step 2: Install InstaHarvest

From GitHub:

git clone https://github.com/mpython77/insta-harvester.git
cd insta-harvester
pip install -r requirements.txt
playwright install chrome

From PyPI:

pip install instaharvest
playwright install chrome

Step 3: Create Instagram Session (REQUIRED!)

# Navigate to examples directory
cd examples

# Run session setup script
python save_session.py

This will:

Open Chrome browser
Navigate to Instagram
Let you log in manually
Save your session to instagram_session.json
All future scripts will use this session (no re-login needed!)

Important: Without this session file, the library won't work!
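
Because every script depends on the session file, it can help to check for it before starting any work. The sketch below is stdlib-only and makes two assumptions: the file is in the current working directory, and it contains JSON (suggested by the `.json` extension). The helper name `session_file_ready` is hypothetical and not part of InstaHarvest.

```python
import json
from pathlib import Path


def session_file_ready(path="instagram_session.json"):
    """Return True if the session file exists and parses as JSON."""
    session_path = Path(path)
    if not session_path.is_file():
        return False
    try:
        json.loads(session_path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False  # present but unreadable: treat as missing
    return True


if __name__ == "__main__":
    if session_file_ready():
        print("Session file found")
    else:
        print("No usable session file; run examples/save_session.py first")
```

This only confirms that the file is present and well formed. It cannot tell whether Instagram still accepts the saved login.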

Step 4: Test Your Setup

# Run these from the project root. First, create your Instagram session if you skipped Step 3 (required!)
python examples/save_session.py

# Try the all-in-one interactive demo (recommended for learning)
python examples/all_in_one.py

# Or try production scraping
python examples/main_advanced.py

⚠️ IMPORTANT: Always Use ScraperConfig! All examples below use ScraperConfig() for proper timing and reliability. Even with default settings, creating the config explicitly is best practice, because it helps prevent timing issues with popups, buttons, and rate limits. See the Configuration Guide (CONFIGURATION_GUIDE.md) for customization options.

📖 Quick Start Examples

Example 1: Follow a User

from instaharvest import FollowManager
from instaharvest.config import ScraperConfig

# Create config (customize if needed)
config = ScraperConfig()

# Create manager with config
manager = FollowManager(config=config)

# Load session
session_data = manager.load_session()
manager.setup_browser(session_data)

# Follow someone
result = manager.follow("instagram")
print(result)  # {'success': True, 'status': 'followed', ...}

# Clean up
manager.close()

Example 2: Send Direct Message

from instaharvest import MessageManager
from instaharvest.config import ScraperConfig

# Create config
config = ScraperConfig()
manager = MessageManager(config=config)
session_data = manager.load_session()
manager.setup_browser(session_data)

# Send message
result = manager.send_message("username", "Hello from Python!")
print(result)

manager.close()

Example 3: Collect Followers

from instaharvest import FollowersCollector
from instaharvest.config import ScraperConfig

# Create config
config = ScraperConfig()
collector = FollowersCollector(config=config)
session_data = collector.load_session()
collector.setup_browser(session_data)

# Collect first 100 followers
followers = collector.get_followers("username", limit=100, print_realtime=True)
print(f"Collected {len(followers)} followers")

collector.close()

Example 4: All Operations in One Browser

from instaharvest import SharedBrowser
from instaharvest.config import ScraperConfig

# Create config for better reliability
config = ScraperConfig()

# One browser for everything!
with SharedBrowser(config=config) as browser:
    # Follow users
    browser.follow("user1")
    browser.follow("user2")

    # Send messages
    browser.send_message("user1", "Thanks for the follow!")

    # Collect followers
    followers = browser.get_followers("my_account", limit=50)
    print(f"Followers: {len(followers)}")

📁 Example Scripts

The examples/ directory contains ready-to-use scripts:

🔑 Session Setup (Required First)

python examples/save_session.py

Creates Instagram session (one-time setup, then reused automatically).

🎮 Interactive Demo

python examples/all_in_one.py

Interactive menu with ALL features:

Follow/Unfollow users
Send messages
Collect followers/following
Batch operations
Profile scraping

🚀 Production Scraping

python examples/main_advanced.py

Full automatic profile scraping:

Collects all post/reel links
Extracts data with parallel processing
Exports to Excel + JSON
Advanced diagnostics & error recovery

⚙️ Configuration Examples

python examples/example_custom_config.py

Shows how to customize configuration (delays, viewport, etc.).

📖 Documentation

1. Profile Scraping

from instaharvest import ProfileScraper
from instaharvest.config import ScraperConfig

config = ScraperConfig()
scraper = ProfileScraper(config=config)
session_data = scraper.load_session()
scraper.setup_browser(session_data)

profile = scraper.scrape('username')
print(f"Posts: {profile.posts}")
print(f"Followers: {profile.followers}")
print(f"Following: {profile.following}")

scraper.close()

2. Collect Followers/Following

from instaharvest import FollowersCollector
from instaharvest.config import ScraperConfig

# Create config
config = ScraperConfig()
collector = FollowersCollector(config=config)
session_data = collector.load_session()
collector.setup_browser(session_data)

# Collect first 100 followers
followers = collector.get_followers('username', limit=100, print_realtime=True)
print(f"Collected {len(followers)} followers")

# Collect following
following = collector.get_following('username', limit=50)

collector.close()
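
Collected usernames are often saved for later runs. The sketch below writes them to a one-column CSV using only the standard library. It assumes the collector returns a list of username strings, which matches how the workflow example in this README passes each follower to `follow()`. The helper name `save_usernames` is hypothetical.

```python
import csv
from pathlib import Path


def save_usernames(usernames, path):
    """Write unique usernames to a one-column CSV, keeping first-seen order."""
    unique = list(dict.fromkeys(usernames))  # de-duplicate, preserve order
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["username"])  # header row
        writer.writerows([name] for name in unique)
    return len(unique)
```

For example, `save_usernames(followers, "followers.csv")` returns the number of rows written, not counting the header.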

3. Follow/Unfollow Management

from instaharvest import FollowManager
from instaharvest.config import ScraperConfig

config = ScraperConfig()
manager = FollowManager(config=config)
session_data = manager.load_session()
manager.setup_browser(session_data)

# Follow a user
result = manager.follow('username')
print(result)  # {'status': 'success', 'action': 'followed', ...}

# Unfollow
result = manager.unfollow('username')

# Batch follow
usernames = ['user1', 'user2', 'user3']
results = manager.batch_follow(usernames)

manager.close()
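
After a batch run it is useful to see how many operations succeeded. This sketch assumes `batch_follow` returns a list of result dictionaries shaped like the comment above (`{'status': 'success', 'action': 'followed', ...}`). That return type is an assumption, so check it against your installed version. The helper `summarize_results` and the sample data are hypothetical.

```python
from collections import Counter


def summarize_results(results):
    """Count outcomes by 'status'; entries without that key count as 'unknown'."""
    return Counter(item.get("status", "unknown") for item in results)


# Hypothetical sample shaped like the result comment above
sample = [
    {"status": "success", "action": "followed"},
    {"status": "error", "action": "follow"},
    {"status": "success", "action": "followed"},
]
summary = summarize_results(sample)
print(dict(summary))
```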

4. Direct Messaging

from instaharvest import MessageManager
from instaharvest.config import ScraperConfig

config = ScraperConfig()
messenger = MessageManager(config=config)
session_data = messenger.load_session()
messenger.setup_browser(session_data)

# Send single message
result = messenger.send_message('username', 'Hello!')

# Batch send
usernames = ['user1', 'user2']
results = messenger.batch_send(usernames, 'Hi there!')

messenger.close()

5. Shared Browser (Recommended!)

Use one browser for all operations - Much faster!

from instaharvest import SharedBrowser
from instaharvest.config import ScraperConfig

# Create config
config = ScraperConfig()

with SharedBrowser(config=config) as browser:
    # All operations use the same browser instance
    browser.follow('user1')
    browser.send_message('user1', 'Hello!')
    followers = browser.get_followers('user2', limit=100)
    profile = browser.scrape_profile('user3')

    # No browser reopening! Fast and efficient!
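
The speed benefit comes from paying the start-up cost once instead of once per operation. The sketch below shows the idea with a stand-in class. `FakeBrowser` is purely illustrative, is not part of InstaHarvest, and says nothing about how `SharedBrowser` is implemented.

```python
class FakeBrowser:
    """Stand-in for an expensive resource; counts how often it is started."""

    launches = 0

    def __enter__(self):
        FakeBrowser.launches += 1  # the costly start-up happens once per `with`
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    def do(self, action):
        return f"done: {action}"


ACTIONS = ("follow", "message", "collect")

# Shared: one launch covers all three actions
with FakeBrowser() as browser:
    shared_results = [browser.do(action) for action in ACTIONS]
shared_launches = FakeBrowser.launches

# Separate: each action pays for its own launch
FakeBrowser.launches = 0
for action in ACTIONS:
    with FakeBrowser() as browser:
        browser.do(action)
separate_launches = FakeBrowser.launches

print(shared_launches, separate_launches)
```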

6. Advanced: Parallel Processing

from instaharvest import InstagramOrchestrator, ScraperConfig

config = ScraperConfig(headless=True)
orchestrator = InstagramOrchestrator(config)

# Scrape with 3 parallel workers + Excel export
results = orchestrator.scrape_complete_profile_advanced(
    'username',
    parallel=3,        # 3 parallel browser tabs
    save_excel=True,   # Real-time Excel export
    max_posts=100
)

print(f"Scraped {len(results['posts_data'])} posts")
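
The `parallel=3` setting caps the number of concurrent workers. As a general illustration of that pattern, here is a bounded worker pool from the standard library. This is not how the orchestrator is implemented (the comment above says it uses browser tabs). `fetch_post` is a hypothetical placeholder and the URLs are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_post(url):
    """Placeholder for per-post work; a real scraper would load the page here."""
    return {"url": url, "ok": True}


def scrape_in_parallel(urls, workers=3):
    """Run fetch_post over urls with at most `workers` threads.

    pool.map returns results in input order, whichever finishes first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_post, urls))


urls = [f"https://www.instagram.com/p/POST_{i}/" for i in range(5)]
results = scrape_in_parallel(urls, workers=3)
print(len(results))
```

A small worker count keeps the request rate modest, which matters given the rate-limit advice elsewhere in this README.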

7. Post Data Extraction

from instaharvest import PostDataScraper
from instaharvest.config import ScraperConfig

config = ScraperConfig()
scraper = PostDataScraper(config=config)
session_data = scraper.load_session()
scraper.setup_browser(session_data)

# Scrape single post
post = scraper.scrape('https://www.instagram.com/p/POST_ID/')
print(f"Tagged: {post.tagged_accounts}")
print(f"Likes: {post.likes}")
print(f"Date: {post.timestamp}")

scraper.close()

🎯 Complete Workflow Example

from instaharvest import SharedBrowser
from instaharvest.config import ScraperConfig

# Create config
config = ScraperConfig()

with SharedBrowser(config=config) as browser:
    # 1. Get profile stats
    profile = browser.scrape_profile('target_user')
    print(f"Target has {profile['followers']} followers")

    # 2. Collect their followers
    followers = browser.get_followers('target_user', limit=50)
    print(f"Collected {len(followers)} followers")

    # 3. Follow them
    for follower in followers[:10]:  # Follow first 10
        result = browser.follow(follower)
        if result['status'] == 'success':
            print(f"✓ Followed {follower}")

    # 4. Send welcome message
    for follower in followers[:5]:
        browser.send_message(follower, "Thanks for following!")

📋 Requirements

Python 3.8+
Playwright (with Chrome browser)
pandas
openpyxl
beautifulsoup4
lxml

🔧 Session Setup

First-time setup - Save your Instagram session:

python examples/save_session.py

This will:

Open Chrome browser
Let you log in to Instagram manually
Save session to instagram_session.json
All future scripts will use this session (no re-login needed!)

📁 Project Structure

instaharvest/
├── instaharvest/          # Main package
│   ├── __init__.py        # Package entry point
│   ├── base.py            # Base scraper class
│   ├── config.py          # Configuration
│   ├── profile.py         # Profile scraping
│   ├── followers.py       # Followers collection
│   ├── follow.py          # Follow/unfollow
│   ├── message.py         # Direct messaging
│   ├── post_data.py       # Post data extraction
│   ├── shared_browser.py  # Shared browser manager
│   └── ...                # More modules
├── examples/              # Example scripts
├── README.md              # This file
├── setup.py               # Package setup
└── LICENSE                # MIT License

⚙️ Configuration

from instaharvest import ScraperConfig

config = ScraperConfig(
    headless=True,              # Run in headless mode
    viewport_width=1920,
    viewport_height=1080,
    default_timeout=30000,      # 30 seconds
    max_scroll_attempts=50,
    log_level='INFO'
)
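
Settings like these are often kept in a file rather than in code. The sketch below parses a JSON string and separates recognised option names from unrecognised ones, so a typo is noticed instead of being passed along. The option names come from the call above. The helper `load_overrides` is hypothetical, and passing the result as `ScraperConfig(**known)` assumes the constructor accepts these as keyword arguments, as the example suggests.

```python
import json

# Option names taken from the ScraperConfig call above
KNOWN_OPTIONS = {
    "headless", "viewport_width", "viewport_height",
    "default_timeout", "max_scroll_attempts", "log_level",
}


def load_overrides(text):
    """Parse a JSON object and split it into known and unknown options."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("config overrides must be a JSON object")
    known = {key: value for key, value in data.items() if key in KNOWN_OPTIONS}
    unknown = sorted(set(data) - KNOWN_OPTIONS)
    return known, unknown


known, unknown = load_overrides(
    '{"headless": false, "log_level": "DEBUG", "typo_opt": 1}'
)
print(known, unknown)
# The known dict could then be passed on, e.g. ScraperConfig(**known)
```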

🛡️ Best Practices

Use SharedBrowser - Reuses browser instance, much faster
Rate Limiting - Built-in delays to avoid Instagram bans
Session Management - Auto-refreshes session to prevent expiration
Error Handling - Comprehensive exception handling
Logging - Professional logging for debugging

🔧 Troubleshooting

Installation Issues

Error: "playwright command not found"

# Solution: Install Playwright first
pip install playwright
playwright install chrome

Error: "No module named 'instaharvest'"

# Solution 1: If installed from PyPI
pip install instaharvest

# Solution 2: If using GitHub clone
cd /path/to/insta-harvester
pip install -e .

# Solution 3: Run from project directory
cd /path/to/insta-harvester
python examples/save_session.py  # Works without installation

Error: "Could not find Chrome browser"

# Solution: Install Playwright browsers
playwright install chrome
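
The installation checks above can also be run from Python. This stdlib-only sketch reports whether the `instaharvest` module is importable and whether a `playwright` command is on the PATH. It does not check that a browser is installed. The helper name `check_environment` is hypothetical.

```python
import importlib.util
import shutil


def check_environment(module="instaharvest", command="playwright"):
    """Report whether a module is importable and a CLI command is on PATH."""
    return {
        "module_found": importlib.util.find_spec(module) is not None,
        "command_found": shutil.which(command) is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'yes' if found else 'no'}")
```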

Session Issues

Error: "Session file not found"

# Solution: Create session first (REQUIRED!)
cd examples
python save_session.py

# Then run your script
python all_in_one.py  # or any other script

Error: "Login required" or "Session expired"

# Solution: Re-create session
cd examples
python save_session.py

# Log in again when browser opens

Operation Errors

Error: "Could not unfollow @username"

Cause: Unfollow popup appears too slowly for the program

Solution: Increase popup delays in configuration

from instaharvest import FollowManager
from instaharvest.config import ScraperConfig

config = ScraperConfig(
    popup_open_delay=4.0,       # Wait longer for popup
    action_delay_min=3.0,
    action_delay_max=4.5,
)

manager = FollowManager(config=config)

See CONFIGURATION_GUIDE.md for detailed configuration options.

Error: "Could not follow @username"

Solution:

from instaharvest.config import ScraperConfig

config = ScraperConfig(
    button_click_delay=3.0,
    action_delay_min=2.5,
    action_delay_max=4.0,
)

Error: "Instagram says 'Try again later'"

Cause: Instagram rate limiting - you're doing too much too quickly

Solution: Increase rate limiting delays

from instaharvest.config import ScraperConfig

config = ScraperConfig(
    follow_delay_min=10.0,      # Wait 10-15 seconds between follows
    follow_delay_max=15.0,
    message_delay_min=15.0,     # Wait 15-20 seconds between messages
    message_delay_max=20.0,
)
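
A min/max pair like the ones above usually means a randomised pause somewhere between the two bounds. The sketch below shows that idea in isolation. How InstaHarvest applies these values internally is not stated here, so treat this as an assumption. The names `pick_delay` and `paced` are hypothetical.

```python
import random
import time


def pick_delay(delay_min, delay_max, rng=random):
    """Choose a pause length uniformly between the two bounds."""
    if delay_min > delay_max:
        raise ValueError("delay_min must not exceed delay_max")
    return rng.uniform(delay_min, delay_max)


def paced(items, delay_min, delay_max, sleep=time.sleep):
    """Yield items one at a time, pausing between consecutive items."""
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(pick_delay(delay_min, delay_max))
        yield item
```

For example, `for user in paced(usernames, 10.0, 15.0): ...` spaces iterations 10 to 15 seconds apart. The `sleep` parameter lets the pacing be tested without waiting.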

Slow Internet Issues

Problem: On a slow connection, pages load slowly and operations fail with errors

Solution:

from instaharvest.config import ScraperConfig

config = ScraperConfig(
    page_load_delay=5.0,        # Wait longer for pages
    popup_open_delay=4.0,       # Wait longer for popups
    scroll_delay_min=3.0,       # Slower scrolling
    scroll_delay_max=5.0,
)

# Use with any manager
from instaharvest import FollowManager
manager = FollowManager(config=config)
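
When the connection speed varies, scaling every delay by one factor is simpler than tuning each value by hand. The sketch below starts from the example values above. The helper `scale_delays` is hypothetical, and passing the result as `ScraperConfig(**slower)` assumes these are accepted as keyword arguments, as the example suggests.

```python
# Example values copied from the slow-connection config above
BASE_DELAYS = {
    "page_load_delay": 5.0,
    "popup_open_delay": 4.0,
    "scroll_delay_min": 3.0,
    "scroll_delay_max": 5.0,
}


def scale_delays(delays, factor):
    """Return a copy of delays with every value multiplied by factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {name: round(value * factor, 2) for name, value in delays.items()}


slower = scale_delays(BASE_DELAYS, 1.5)
print(slower)
# The result could then be passed on, e.g. ScraperConfig(**slower)
```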

Getting Help

Check documentation:
- README.md - Main guide (this file)
- CONFIGURATION_GUIDE.md - Complete configuration reference
- examples/README.md - Example scripts guide
- CHANGELOG.md - Version history and changes
- CONTRIBUTING.md - How to contribute

Common issues:
- Unfollow errors → Increase popup_open_delay
- Slow internet → Increase all delays
- Rate limiting → Increase follow_delay_* and message_delay_*

Report bugs:
- GitHub Issues: https://github.com/mpython77/insta-harvester/issues
- See CONTRIBUTING.md for bug report guidelines

Email support:
- kelajak054@gmail.com

⚠️ Disclaimer

This tool is for educational purposes only. Make sure to:

Follow Instagram's Terms of Service
Respect rate limits
Don't spam or harass users
Use responsibly

The authors are not responsible for any misuse of this library.

📜 License

MIT License - see LICENSE file for details

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📞 Support

GitHub Issues: https://github.com/mpython77/insta-harvester/issues
Documentation: Read the docs
Email: kelajak054@gmail.com

🎉 Acknowledgments

Built with:

Playwright - Browser automation
Pandas - Data processing
BeautifulSoup - HTML parsing

Made with ❤️ by Doston

Happy Harvesting! 🌾

Release history

Version	Release date
2.16.0	Mar 16, 2026
2.15.1	Mar 13, 2026
2.14.8	Mar 11, 2026
2.14.0	Mar 11, 2026
2.13.0	Mar 11, 2026
2.12.1	Feb 28, 2026
2.12.0	Feb 27, 2026
2.11.0	Feb 27, 2026
2.10.1	Feb 27, 2026
2.10.0	Feb 26, 2026
2.9.2	Feb 26, 2026
2.9.1	Feb 26, 2026
2.9.0	Feb 26, 2026
2.8.0	Feb 26, 2026
2.7.3	Feb 26, 2026
2.7.2	Feb 26, 2026
2.7.1	Feb 9, 2026
2.7.0	Feb 9, 2026
2.6.0	Jan 16, 2026
2.5.5	Dec 17, 2025
2.5.4	Nov 23, 2025
2.5.3	Nov 23, 2025
2.5.2	Nov 23, 2025
2.5.1 (this version)	Nov 23, 2025
2.2.2	Mar 11, 2026

Download files

Source Distribution

instaharvest-2.5.1.tar.gz (71.0 kB)

Uploaded Nov 23, 2025 Source

Built Distribution

instaharvest-2.5.1-py3-none-any.whl (70.2 kB)

Uploaded Nov 23, 2025 Python 3

File details

Details for the file instaharvest-2.5.1.tar.gz.

File metadata

Download URL: instaharvest-2.5.1.tar.gz
Upload date: Nov 23, 2025
Size: 71.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for instaharvest-2.5.1.tar.gz
Algorithm	Hash digest
SHA256	`9f4a99e37e73caf3fb06725664913337da7ed21713a74ec7083ba742866920a3`
MD5	`353674fccca6393bf5e6fdf422a343e1`
BLAKE2b-256	`81329fa15388e5b2f05d5e0dcd3b0386e56bde18b2c4d812e98b54a0d1171855`

File details

Details for the file instaharvest-2.5.1-py3-none-any.whl.

File metadata

Download URL: instaharvest-2.5.1-py3-none-any.whl
Upload date: Nov 23, 2025
Size: 70.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for instaharvest-2.5.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7654052f3590a49ab4022ebeeceb1c381346f8283d90033fa791fb398bf16f54`
MD5	`da35805b2cef49c615fd936662917807`
BLAKE2b-256	`fa6764f0a8d422743e6460f20c4476a1a6be196e060ea473103d578f90d6d388`
