
Introduction

This yet-another-mechanize implementation aims to give the developer the following features:

  • It can be proxified
  • It does proxy balancing
  • It fakes user agent by default
  • It does not handle robots by default
  • There is a "real" variant which uses an underlying mozrepl server to control a remote Firefox instance

It reads its settings from the [collective.anonymousbrowser] section of sys.prefix/etc/config.ini:

[collective.anonymousbrowser]
proxies=
; for a mozrepl server
host = localhost
port = 4242
firefox = /path/To/Firefox
ff-profile = /path/to/FFprofile

This file is generated on the first run, without any proxies. It is up to you to fill it with some open proxies.

Of course, it can take another configuration file; see the __init__ method.
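
As a rough illustration of the behaviour described above, the sketch below builds the default path from sys.prefix and writes a proxy-less file when none exists. The helper names (default_config_path, ensure_config) and the exact default contents are assumptions made for this example, not the package's actual code. It targets Python 3.

```python
import os
import sys

SECTION = "collective.anonymousbrowser"
# Assumed default contents: the section with an empty proxy list.
DEFAULT_CONFIG = "[%s]\nproxies=\n" % SECTION


def default_config_path(prefix=sys.prefix):
    """Return <prefix>/etc/config.ini, the location named in the text."""
    return os.path.join(prefix, "etc", "config.ini")


def ensure_config(path):
    """Create the file with no proxies if it is missing; return its path."""
    if not os.path.exists(path):
        directory = os.path.dirname(path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        with open(path, "w") as handle:
            handle.write(DEFAULT_CONFIG)
    return path
```

An existing file is left untouched, so proxies added by hand survive later runs.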

TODO

  • lxml integration, maybe steal z3c.etestbrowser

Tests and Handbook

First, we need to import the classes we are going to use:

>>> from collective.anonymousbrowser.browser import Browser, FF2_USERAGENT

User Agent

Oh, my god, we have a brand new user agent by default:

>>> br = Browser()
>>> br.open('http://localhost:45678')
>>> FF2_USERAGENT in br.contents
True
>>> br2 = Browser('http://localhost:45678')
>>> FF2_USERAGENT in br2.contents
True
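
To illustrate what faking the user agent means at the HTTP level, here is a minimal sketch using only the Python 3 standard library. The FAKE_USERAGENT string is a made-up placeholder, not the real value of FF2_USERAGENT, and build_request is a hypothetical helper that is not part of the package.

```python
import urllib.request

# Placeholder value for illustration; the package's FF2_USERAGENT may differ.
FAKE_USERAGENT = "Mozilla/5.0 (X11; U; Linux i686; en-US) Firefox/2.0"


def build_request(url, user_agent=FAKE_USERAGENT):
    """Build a request whose User-Agent header replaces urllib's default."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("http://localhost:45678")
# urllib normalizes header names, so the key is stored as "User-agent".
print(request.get_header("User-agent"))
```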

Proxy mode

But we want to be anonymous, so we'll set a proxy. To define proxies, just use a config.ini file like:

[collective.anonymousbrowser]
proxies =
    host1:port
    host2:port
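
For readers unfamiliar with multi-line INI values, the following sketch shows one way such a proxies option can be turned into a list. It uses Python 3's configparser; read_proxies is a hypothetical helper written for this example and is not the package's own parsing code.

```python
import configparser

SECTION = "collective.anonymousbrowser"


def read_proxies(text):
    """Return the host:port entries listed under 'proxies' in an INI string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Indented continuation lines are joined into one newline-separated value.
    raw = parser.get(SECTION, "proxies", fallback="")
    return [line.strip() for line in raw.splitlines() if line.strip()]


sample = """[collective.anonymousbrowser]
proxies =
    127.0.0.1:45675
    127.0.0.1:45676
host = localhost
port = 4242
"""
print(read_proxies(sample))
```

An empty proxies option simply yields an empty list, which matches the freshly generated configuration file.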

When the browser has several proxies defined, it cycles through them. It will not use the same host indefinitely: the proxy_max_use argument sets how many times in a row a proxy may be used:

>>> from StringIO import StringIO
>>> from tempfile import mkstemp
>>> __, config = mkstemp()
>>> open(config, 'w').write("""[collective.anonymousbrowser]
... proxies =
...     127.0.0.1:45675
...     127.0.0.1:45676
...     127.0.0.1:45677
...     127.0.0.1:45678
...     127.0.0.1:45679
...     """)
>>> b = Browser(config=config, proxy_max_use=3)
>>> b._config._sections
{'collective.anonymousbrowser': {'__name__': 'collective.anonymousbrowser', 'proxies': '\n127.0.0.1:45675\n127.0.0.1:45676\n127.0.0.1:45677\n127.0.0.1:45678\n127.0.0.1:45679'}}
>>> b.proxies
['127.0.0.1:45675', '127.0.0.1:45676', '127.0.0.1:45677', '127.0.0.1:45678', '127.0.0.1:45679']
>>> b.proxified
True
>>> b.open('http://localhost:45678')
>>> 'Host: localhost:45678' in b.contents
True
>>> b._lastproxy['count'] == 1 and b._lastproxy['proxy'] in [0,1,2,3,4]
True
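
The rotation idea can be sketched independently of the browser. The class below is only an illustration written for this handbook: ProxyChooser, its attributes, and the way it avoids repeating a proxy are assumptions, not the package's implementation. It keeps a counter similar to the _lastproxy dictionary shown above and picks a new proxy once max_use is reached.

```python
import random


class ProxyChooser:
    """Reuse the current proxy up to max_use times, then pick another one."""

    def __init__(self, proxies, max_use=3, randint=random.randint):
        self.proxies = list(proxies)
        self.max_use = max_use
        self._randint = randint  # injectable, so tests can control it
        self.last = {"proxy": None, "count": 0}

    def choose(self):
        if not self.proxies:
            raise RuntimeError("no proxies configured")
        current = self.last["proxy"]
        if current is None or self.last["count"] >= self.max_use:
            index = self._randint(0, len(self.proxies) - 1)
            # Avoid handing out the exhausted proxy again when others exist.
            if len(self.proxies) > 1 and index == current:
                index = (index + 1) % len(self.proxies)
            self.last = {"proxy": index, "count": 0}
        self.last["count"] += 1
        return self.proxies[self.last["proxy"]]


chooser = ProxyChooser(["a:1", "b:2", "c:3"], max_use=2)
for _ in range(5):
    print(chooser.choose(), chooser.last)
```

Passing randint as a parameter is a design choice for this sketch: it makes the pseudo-random behaviour easy to pin down in tests.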

We can have a normal, unproxified browser too:

>>> b1 = Browser(proxify=False)
>>> b1.proxified
False

Next, we verify that the pseudo-random loop works. We use mocker to replace random.randint so that the chosen proxy is predictable at each step: the same proxy is used until proxy_max_use is reached, then another one is chosen:

>>> import mocker
>>> import random
>>> mocked = mocker.Mocker()
>>> custom_random_int = mocked.replace('random.randint')
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(2)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(2)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(2)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(3)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(4)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(2)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> custom_random_int(0, 4)
<mocker.Mock ...
>>> mocked.result(1)
>>> custom_random_int(0,1)
<mocker.Mock ...
>>> mocked.result(0)
>>> mocked.replay()
>>> b = Browser('http://localhost:45678', config=config, proxy_max_use=3)
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 2}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 2, 'proxy': 2}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 3, 'proxy': 2}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 0}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 3}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 4}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 2}
>>> b.open('http://localhost:45678')
>>> b._lastproxy
{'count': 1, 'proxy': 1}
>>> mocked.restore()
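
On Python 3, the same kind of deterministic test can be written with the standard library instead of the mocker package. The sketch below is an alternative technique, not what this test suite uses; pick_index is a hypothetical stand-in for any code that calls random.randint.

```python
import random
from unittest import mock


def pick_index(count):
    """Stand-in for code that picks a proxy index at random."""
    return random.randint(0, count - 1)


# side_effect supplies one canned return value per call, in order.
with mock.patch("random.randint", side_effect=[2, 2, 3]) as fake_randint:
    picks = [pick_index(5) for _ in range(3)]

print(picks)  # [2, 2, 3]
```

The patch is undone automatically when the with block exits, which plays the role of mocked.restore() above.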

If the proxies are dead, we remove them from the list:

>>> __, config = mkstemp()
>>> open(config, 'w').write("""[collective.anonymousbrowser]
... proxies =
...     127.0.0.1:35675
...     127.0.0.1:35676
...     127.0.0.1:35677
...     127.0.0.1:45678
...     """)
>>> mybrowser = Browser(config=config, proxy_max_use=3)
>>> mybrowser.proxies
['127.0.0.1:35675', '127.0.0.1:35676', '127.0.0.1:35677', '127.0.0.1:45678']
>>> mybrowser.open('http://localhost:45678')
>>> mybrowser.proxies
['127.0.0.1:45678']
>>> mybrowser.proxies = ['127.0.0.1:34785']
>>> mybrowser.open('http://localhost:45678')
Traceback (most recent call last):
...
Exception: There are no valid proxies left
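
The pruning behaviour can be sketched as follows. This is an illustration under assumptions: first_live_proxy and tcp_is_alive are hypothetical helpers, and checking liveness with a plain TCP connection is a simplification that may differ from what the package does.

```python
import socket


def tcp_is_alive(proxy, timeout=2.0):
    """Return True if a TCP connection to 'host:port' can be opened."""
    host, _, port = proxy.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


def first_live_proxy(proxies, is_alive=tcp_is_alive):
    """Drop dead entries from the front of the list; return the first live one."""
    while proxies:
        if is_alive(proxies[0]):
            return proxies[0]
        proxies.pop(0)  # dead proxy: remove it from the list in place
    raise Exception("There are no valid proxies left")
```

Because the list is modified in place, the caller sees the shortened list afterwards, just as mybrowser.proxies shrinks in the session above.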

The loop is protected against infinite recursion. If the random generator always returns the same host, the chooser cannot pick anything else; it would loop until it crashes, so it has to handle that case itself:

>>> def randomint(a,b):
...     return 2
>>> import random; random.randint = randomint
>>> b2 = Browser(config=config, proxy_max_use=3)
>>> b2.proxy_max_use
3
>>> b2._lastproxy['count']
0
>>> b2.chooseProxy()
'...
>>> b2._lastproxy['count']
1
>>> b2.chooseProxy()
'...
>>> b2._lastproxy['count']
2
>>> b2.chooseProxy()
'...
>>> b2._lastproxy['count']
3
>>> b2.chooseProxy()
'...
>>> b2.chooseProxy()
Ho, seems we got the max wills to choose, something has gone wrong
'127.0.0.1:35675'
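
One common way to get this kind of protection is to bound the number of attempts and fall back to a deterministic choice. The sketch below shows that idea only; choose_other, max_tries, and the fallback rule are assumptions for this example and do not describe how chooseProxy is actually written.

```python
def choose_other(current, count, randint, max_tries=10):
    """Pick an index in [0, count) different from current, in bounded time."""
    for _ in range(max_tries):
        candidate = randint(0, count - 1)
        if candidate != current:
            return candidate
    # The generator keeps returning the same value: stop retrying and fall
    # back to the next index (this differs from current only if count > 1).
    return (current + 1) % count


def stuck(low, high):
    """A broken generator that always answers 2, like randomint above."""
    return 2


print(choose_other(2, 4, stuck))  # 3
```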

Real Browser implementation through mozrepl

TODO:

  • Handle configuration with mozrunner for:

    • user agent faking
    • proxies management

First, we need to import the classes we are going to use:

>>> from collective.anonymousbrowser.real import *

In the [collective.anonymousbrowser] section of your configuration file, you can add these parameters:

  • host : host of firefox mozrepl instance
  • port : port of firefox mozrepl instance
  • firefox : path to the firefox binary
  • firefox-profile : path to the firefox profile to use
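
A small sketch of reading these parameters with typed access and fallbacks, using Python 3's configparser. MozReplSettings and load_settings are hypothetical names, and the fallback values (localhost, 4242, empty paths) are assumptions taken from the sample configuration rather than documented defaults. Since this document shows the profile option both as ff-profile and as firefox-profile, the sketch accepts either spelling.

```python
import configparser
from dataclasses import dataclass

SECTION = "collective.anonymousbrowser"


@dataclass
class MozReplSettings:
    host: str
    port: int
    firefox: str
    profile: str


def load_settings(text):
    """Parse an INI string and return the mozrepl-related settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(SECTION):
        parser.add_section(SECTION)
    section = parser[SECTION]
    return MozReplSettings(
        host=section.get("host", fallback="localhost"),
        port=section.getint("port", fallback=4242),
        firefox=section.get("firefox", fallback=""),
        # Accept both spellings of the profile option.
        profile=section.get(
            "firefox-profile", fallback=section.get("ff-profile", fallback="")
        ),
    )


print(load_settings("[collective.anonymousbrowser]\nport = 4243\n"))
```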

Let's use it on our little HTTP server:

>>> b = Browser('http://localhost:45675')
>>> b.contents
'<html>...<pre>...localhost:45675...</pre>...</html>'

>>> b.open('http://localhost:45675')
>>> b.contents
'<html>...<pre>...localhost:45675...</pre>...</html>'

The browser instance can stop, start and restart the Firefox it launched, using its configuration settings:

>>> b.stop_ff()
>>> b.start_ff()
<mozrunner.runner.Firefox object at ...>
>>> b.restart_ff()
<mozrunner.runner.Firefox object at ...>

Cleanup:

>>> b.stop_ff()
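
The stop/start/restart life cycle can be pictured with a plain subprocess wrapper. This is a generic sketch and not how mozrunner or this package manage Firefox: ManagedProcess is a hypothetical class, and a sleeping Python interpreter stands in for the browser so that the example runs anywhere.

```python
import subprocess
import sys


class ManagedProcess:
    """Start, stop and restart one child process (a stand-in for Firefox)."""

    def __init__(self, command):
        self.command = list(command)
        self.process = None

    def start(self):
        # Only spawn a new child if none is currently running.
        if self.process is None or self.process.poll() is not None:
            self.process = subprocess.Popen(self.command)
        return self.process

    def stop(self):
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()
            self.process.wait(timeout=10)
        self.process = None

    def restart(self):
        self.stop()
        return self.start()


# A sleeping Python interpreter plays the role of the browser here.
child = ManagedProcess([sys.executable, "-c", "import time; time.sleep(30)"])
first = child.start()
second = child.restart()
child.stop()
```

Calling stop() at the end mirrors the cleanup step above, so no child process is left behind.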

HISTORY

0.10-<0.11

  • bugfix for 0.9

0.9

  • Fix binary distributions, now with a sample decorator, mozrunner executes its commands in firefox directories

0.8

  • bugfix for js execution

0.7

  • bugfix: firefox is started when you call open… it's better.

0.6

  • doc + bugfixes
  • use of testrunner to handle firefox instance
  • robustify the proxy code
  • add tests

0.4

  • doc + bugfixes

0.3

  • adding error message

0.2

  • Adding proxy fallback facility

0.1

  • Initial release

Release History

  • 0.11 (this version)
  • 0.11dev-r115880
  • 0.11dev-r115879
  • 0.11dev-r82630
  • 0.9dev-r82608
  • 0.8dev-r82578
  • 0.7dev-r82296
  • 0.6dev-r82291
  • 0.6dev-r82290
  • 0.5dev-r79038
  • 0.4dev-r73660
  • 0.3dev-r73429
  • 0.2dev-r73410
  • 0.2dev-r73407
  • 0.1dev-r73324
  • 0.1dev-r73299

Download Files

  • collective.anonymousbrowser-0.11.zip (21.4 kB), source distribution, uploaded Apr 21, 2010
