Username    lopuhin

29 projects

eli5

Debug machine learning classifiers and explain their predictions

html-text

Extract text from HTML

scrapy-rotating-proxies

Rotating proxies for Scrapy

json-log-plots

json-lines

Read JSON lines (jl) files and recover broken files

scurl

python-crfsuite

Python binding for CRFsuite

formasaurus

Formasaurus tells you the types of HTML forms and their fields using machine learning

tensorboard_logger

Log TensorBoard events without Tensorflow

MaybeDont

A component that tries to avoid downloading duplicate content

webstruct

A library for creating statistical NER systems that work on HTML data

vmprofit

vmprof helpers

scrapy-cdr

rl_wsd_labeled

Labeled contexts of Russian polysemous words

scrapy-kafka-export

Export Scrapy items to Kafka

PyPyDispatcher

Multi-producer-multi-consumer signal dispatching mechanism

sklearn-crfsuite

CRFsuite (python-crfsuite) wrapper which provides an interface similar to scikit-learn

proxy-middleware

Scrapy HTTP proxy middleware that gets proxy parameters from settings

autologin

A utility for finding login links and forms, and for automatically logging in to websites with a set of valid credentials.

soft404

A classifier for detecting soft 404 pages

url-summary

Display a summary of URLs in a notebook

autologin-middleware

A Scrapy middleware to use with autologin

scrapy-splash

JavaScript support for Scrapy using Splash

scrapy-crawl-once

Scrapy middleware that allows crawling only new content

extract-html-diff

Extract the difference between two HTML pages

adblockparser

Parser for Adblock Plus rules

rlwsd

Word sense disambiguation library

autopager

Detect and classify pagination links on web pages

arachnado

Scrapy-based Web Crawler with a UI
