asynchronously scrapes eksisozluk threads and exports to csv or json

eksi-scraper

fast, asynchronous eksisozluk thread scraper. exports entries to csv or json.

installation

uv pip install eksi-scraper

or with pip:

pip install eksi-scraper

usage

eksi-scraper -t [thread1] [thread2] ... -f [inputFile.txt] -o (csv or json)

you can pass full URLs or just the slug (the part of the URL after the last '/' and before any '?'). for example:

eksi-scraper -t https://eksisozluk.com/murat-kurum--2582131 https://eksisozluk.com/ekrem-imamoglu--2577439 -o json

or using slugs:

eksi-scraper -t murat-kurum--2582131 ekrem-imamoglu--2577439 -o json

or from a file:

eksi-scraper -f threads.txt -o csv

where threads.txt lists threads as URLs or slugs, one per line:

https://eksisozluk.com/murat-kurum--2582131
ekrem-imamoglu--2577439
...
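
the slug rule above can be sketched in python. this is a minimal illustration, not the package's actual parsing code: the helper names `to_slug` and `read_threads` are hypothetical, and the `?p=2` query string is only an example of something that might follow the slug.

```python
from urllib.parse import urlsplit


def to_slug(thread: str) -> str:
    """Return the slug for a thread given as a full URL or a bare slug."""
    # urlsplit separates the path from any query string ('?...').
    path = urlsplit(thread.strip()).path
    # The slug is the last path segment; a bare slug has only one segment.
    return path.rstrip("/").rsplit("/", 1)[-1]


def read_threads(text: str) -> list[str]:
    """Parse the contents of a threads file: one URL or slug per line."""
    return [to_slug(line) for line in text.splitlines() if line.strip()]


# Hypothetical query string, used only to show that it is dropped.
print(to_slug("https://eksisozluk.com/murat-kurum--2582131?p=2"))
print(read_threads(
    "https://eksisozluk.com/murat-kurum--2582131\n"
    "ekrem-imamoglu--2577439\n"
))
```

both forms reduce to the same slug, which is why URLs and slugs can be mixed freely in one file.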

options

flag            description
-t, --threads   threads to scrape (URLs or slugs)
-f, --file      file with threads, one per line
-o, --output    output format: csv (default) or json
-v, --verbose   show per-page progress
-q, --quiet     suppress all console output
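
if you drive eksi-scraper from a python script, the flags above can be assembled into an argument list. the sketch below uses only the flags documented here; `build_command` is a hypothetical helper, not part of the package, and it does not assume anything about how -v and -q interact.

```python
def build_command(threads=(), file=None, output="csv", verbose=False, quiet=False):
    """Assemble an eksi-scraper argument list from the documented flags."""
    if not threads and file is None:
        raise ValueError("pass at least one thread or a threads file")
    if output not in ("csv", "json"):
        raise ValueError("output must be 'csv' or 'json'")
    cmd = ["eksi-scraper"]
    if threads:
        cmd += ["-t", *threads]      # URLs or slugs
    if file is not None:
        cmd += ["-f", str(file)]     # file with one thread per line
    cmd += ["-o", output]
    if verbose:
        cmd.append("-v")
    if quiet:
        cmd.append("-q")
    return cmd


cmd = build_command(threads=["murat-kurum--2582131"], output="json")
print(cmd)
# To actually run it (requires eksi-scraper to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```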

console output

by default, eksi-scraper shows progress in the terminal:

[eksi-scraper] Scraping 2 threads
[murat-kurum--2582131] Found 47 pages
[ekrem-imamoglu--2577439] Found 112 pages
[murat-kurum--2582131] Done: 470 entries in 12.3s -> murat-kurum--2582131.json
[ekrem-imamoglu--2577439] Done: 1120 entries in 28.7s -> ekrem-imamoglu--2577439.json
[eksi-scraper] Finished: 2 threads, 1590 entries, 28.7s elapsed

use -v for per-page progress, -q for silent operation.
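
if you capture the default output, the per-thread "Done" lines can be turned into structured records. the pattern below is inferred only from the sample output above, so treat it as an assumption: the exact wording may differ between versions, and `parse_done` is a hypothetical helper.

```python
import re

# Pattern inferred from the sample "Done" lines shown above (an assumption).
DONE_LINE = re.compile(
    r"^\[(?P<thread>[^\]]+)\] Done: (?P<entries>\d+) entries in "
    r"(?P<seconds>\d+(?:\.\d+)?)s -> (?P<path>\S+)$"
)


def parse_done(line: str):
    """Return a dict for a per-thread 'Done' line, or None for any other line."""
    match = DONE_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "thread": match["thread"],
        "entries": int(match["entries"]),
        "seconds": float(match["seconds"]),
        "path": match["path"],
    }


print(parse_done(
    "[murat-kurum--2582131] Done: 470 entries in 12.3s -> murat-kurum--2582131.json"
))
print(parse_done("[eksi-scraper] Scraping 2 threads"))
```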

output

each entry has the following fields:

field          description
Content        the entry text, with full URLs restored
Author         username of the author
Date Created   original post date
Last Changed   last edit date, or null if never edited
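
a short sketch of working with these fields in python. it assumes the json export is a top-level array of objects keyed by the field names above; that layout is not confirmed here, and the sample values (including the date strings) are invented placeholders, not real scraped data.

```python
import json

# Hypothetical stand-in for a file such as murat-kurum--2582131.json.
# The array-of-objects layout and all values are assumptions.
sample = """
[
  {"Content": "first entry", "Author": "user_a",
   "Date Created": "placeholder-date", "Last Changed": null},
  {"Content": "second entry", "Author": "user_b",
   "Date Created": "placeholder-date", "Last Changed": "placeholder-date"}
]
"""

entries = json.loads(sample)

# JSON null becomes None, so edited entries are those with a non-None value.
edited = [e for e in entries if e["Last Changed"] is not None]
authors = sorted({e["Author"] for e in entries})

print(len(entries), "entries,", len(edited), "edited, authors:", authors)
```

for a real export, replace `json.loads(sample)` with `json.load` on the opened output file.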

contact

reach out to me at ceylaniberkay@gmail.com

Download files

Source Distribution

eksi_scraper-0.2.2.tar.gz (7.6 kB view details)

Uploaded Source

Built Distribution

eksi_scraper-0.2.2-py3-none-any.whl (8.8 kB view details)

Uploaded Python 3

File details

Details for the file eksi_scraper-0.2.2.tar.gz.

File metadata

  • Download URL: eksi_scraper-0.2.2.tar.gz
  • Size: 7.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for eksi_scraper-0.2.2.tar.gz
Algorithm Hash digest
SHA256 4c03461abb22912c21a03cb472c4d6156f31d106adcbacbe34a39892a9903a1a
MD5 37f045a599818d0447499252b5b9668e
BLAKE2b-256 1422eaac978aa4b8b85b0dd1855400ccbe916d5f71e0cc2ac96af0d1cc0423a6

File details

Details for the file eksi_scraper-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: eksi_scraper-0.2.2-py3-none-any.whl
  • Size: 8.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for eksi_scraper-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 aa097dfe2f46a976872a77ba476bdeb0f6c1e2ed19df9acd96976a36de0a69b0
MD5 949e9db1e91f0dc6ca86042b23a8e4c0
BLAKE2b-256 cc378155f0c1f5d14ce7cdf6db8bb12be5b91f43014ae7eca9059d67034b1850
