14 projects
apibackuper
apibackuper: a command-line tool and Python library for backing up data from APIs
metacrafter
Metacrafter metadata classification tool
iterabledata
Iterable data processing Python library
metawarc
metawarc: a command-line tool for data extraction from WARC files (web archives)
undatum
undatum: a command-line tool for data processing. Brings CSV simplicity to JSON lines and BSON
spcrawler
spcrawler: a command-line tool to back up data from public SharePoint installations via their open API endpoint
wparc
yspcrawler: a command-line tool to back up documents from Yandex.Disk public resources
qddate
Quick and dirty date parsing Python library to parse HTML dates really fast
docx2csv
Extracts tables from .docx files and saves them as csv or xlsx
ydiskarc
ydiskarc: a command-line tool to back up documents from Yandex.Disk public resources
lazyscraper
A lazy, simple command-line tool: a Swiss Army knife for scraper writers. Automates scraping as much as possible
russiannames
Russian names parser, gender identification and processing tools
newsworker
Advanced news feeds extractor and finder library. Helps to automatically extract news from websites without RSS/ATOM feeds
filerepack
Repacks existing (un)compressed files for higher compression