Fetch Instagram captions (with optional OCR) and convert them to a normalized table via Gemini.

insta2table

A tiny toolkit to:

  1. Crawl Instagram links from a text file, extract captions and (optionally) OCR text from images → output.csv
  2. Convert those rows into a clean single-row table per link using Gemini → result.csv
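
As a rough illustration of the first step's input handling, the sketch below pulls post shortcodes out of a links file. It assumes one URL per line and the public /p/, /reel/ and /tv/ URL shapes; the function names `extract_shortcodes` and `read_links` are hypothetical and not part of insta2table.

```python
import re
from pathlib import Path

# Assumption: links look like https://www.instagram.com/p/<shortcode>/
# (or /reel/ or /tv/). Anything else on a line is ignored.
_POST_URL = re.compile(
    r"https?://(?:www\.)?instagram\.com/(?:p|reel|tv)/([A-Za-z0-9_-]+)"
)


def extract_shortcodes(text: str) -> list[str]:
    """Return unique post shortcodes in first-seen order."""
    seen: dict[str, None] = {}
    for line in text.splitlines():
        match = _POST_URL.search(line.strip())
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def read_links(path: str) -> list[str]:
    """Read a links file (hypothetical layout: one URL per line)."""
    return extract_shortcodes(Path(path).read_text(encoding="utf-8"))
```

Deduplicating up front avoids fetching the same post twice, which matters when unauthenticated requests are rate limited.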

Credentials and keys are read from environment variables:

  • IG_USER, IG_PASS (optional): for Instaloader login (reduces 403s)
  • GOOGLE_API_KEY: required for Gemini
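
A minimal sketch of how these variables could be checked before a run. The helper `load_settings` and its return shape are assumptions for illustration, not the package's actual code; only the variable names come from the list above.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Validate the environment (hypothetical helper, not the package API)."""
    env = os.environ if env is None else env

    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is required for the Gemini step")

    user = env.get("IG_USER", "").strip()
    password = env.get("IG_PASS", "").strip()
    # Login is optional; enable it only when both values are present.
    login = (user, password) if user and password else None

    return {"google_api_key": api_key, "instagram_login": login}
```

Failing early on a missing key is cheaper than discovering it after the crawl has finished.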

Quickstart

# 1) (Recommended) Create and activate a virtualenv
python -m venv .venv
# Linux/macOS
source .venv/bin/activate
# Windows (PowerShell)
# .venv\Scripts\Activate.ps1

# 2) Install (with OCR & Gemini extras if you need them)
pip install -e ".[ocr,genai]"

# 3) Prepare links
cp examples/links.txt .

# 4) Crawl -> output.csv
export IG_USER="your_user"
export IG_PASS="your_pass"
insta2csv --links links.txt --out output.csv

# 5) Process with Gemini -> result.csv
export GOOGLE_API_KEY="your_key"
insta2table --in output.csv --out result.csv

CLI

insta2csv --links links.txt --out output.csv [--no-ocr]
insta2table --in output.csv --out result.csv
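
If you prefer to drive both commands from a script, something like the following could work. It uses only the flags shown above; `build_commands` and `run_pipeline` are hypothetical wrappers, and the sketch assumes both console scripts are on your PATH.

```python
import subprocess


def build_commands(links="links.txt", raw="output.csv",
                   result="result.csv", ocr=True):
    """Build the two CLI invocations as argument lists."""
    crawl = ["insta2csv", "--links", links, "--out", raw]
    if not ocr:
        crawl.append("--no-ocr")
    convert = ["insta2table", "--in", raw, "--out", result]
    return [crawl, convert]


def run_pipeline(**kwargs):
    """Run crawl then convert, stopping at the first failing step."""
    for command in build_commands(**kwargs):
        subprocess.run(command, check=True)
```

Passing argument lists rather than a shell string avoids quoting problems with file names that contain spaces.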

Notes

  • OCR is optional; if you use it, Tesseract must be installed on your system.
  • Scraping Instagram without logging in can trigger HTTP 403 responses. Supplying IG_USER and IG_PASS helps.
  • Gemini formatting expects a single-row Markdown table per input.

Publishing

We use trusted publishing from GitHub to PyPI (no API token needed).

  1. Create the project on PyPI (first time only) and enable 'Manage publishing' with GitHub OIDC for your repo.
  2. In GitHub: create a new Release with a tag like v0.1.0.
  3. The Publish to PyPI workflow will build and upload the release automatically.
  4. Alternatively (manual): make build then twine upload dist/* (requires a PyPI token).

Download files

Source Distribution

insta2table-0.1.0.tar.gz (9.6 kB)

Uploaded Source

Built Distribution

insta2table-0.1.0-py3-none-any.whl (9.6 kB)

Uploaded Python 3

File details

Details for the file insta2table-0.1.0.tar.gz.

File metadata

  • Download URL: insta2table-0.1.0.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for insta2table-0.1.0.tar.gz:

  • SHA256: 68a54f52f9592460585246bf6d4c218979df2cda9aa732f95b3ce030cdf333d0
  • MD5: 46cf62024d88393dd6bde68f5f246c76
  • BLAKE2b-256: ac7b9c2e072d8bcb32bc25e5c766acfeea173901410fbea4483eb0611ec85c2f
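
A downloaded archive can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do it; the helper names are illustrative, and the expected value is the sdist digest listed for this release.

```python
import hashlib

# SHA256 digest published for insta2table-0.1.0.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "68a54f52f9592460585246bf6d4c218979df2cda9aa732f95b3ce030cdf333d0"
)


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.strip().lower()
```

Calling `verify("insta2table-0.1.0.tar.gz", EXPECTED_SDIST_SHA256)` on the downloaded file should return True if the archive is intact.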

Provenance

The following attestation bundles were made for insta2table-0.1.0.tar.gz:

Publisher: publish.yml on Sk1499/insta2table

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file insta2table-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: insta2table-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for insta2table-0.1.0-py3-none-any.whl:

  • SHA256: 494f267a461023a5b10ed729ba51a3a0d7cd1592c9b6903488c9b0884206d2f1
  • MD5: 1b8d973bbea617532b3ac1b2adbb52b6
  • BLAKE2b-256: 2d13518ba870a4fbaaaf4edc7ba10e9c5205ab98848600d5a5c9f638e7c00d1a

Provenance

The following attestation bundles were made for insta2table-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Sk1499/insta2table

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
