A module for collecting and providing popular user agent strings, with a requests session which rotates user agents.
Project description
ua_spoofer
A Python module which collects, lists, and returns up-to-date and commonly used User Agent strings. This can be helpful for avoiding fingerprinting and bypassing anti-bot/scraping measures. It also provides a Requests session wrapper which automatically uses a random user agent on every connection.
User Agents
A user agent string is sent as a header in HTTP requests to identify which browser and operating system the client is using. Websites can use it to tailor content to the device and software a visitor is using. They can also use it to block or restrict access by certain programs, such as bots, web crawlers and scrapers. A further consequence is that the unique combination of browser and operating system versions in the string can help build a profile of a user, a technique called fingerprinting.
User agent spoofing replaces the user agent string with a random one from a list of common strings, disguising the type of client from the server and making it harder to track the user between requests. This is one way to bypass restrictions and mitigate fingerprinting.
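As a rough illustration of the idea (not ua_spoofer's own code), the sketch below picks a random string from a small hard-coded pool and places it in a set of request headers. The names `COMMON_USER_AGENTS` and `spoofed_headers` are hypothetical, and the strings are only examples of the format, which will date quickly.

```python
import random

# Hypothetical, hand-picked pool; a real pool would be refreshed regularly.
COMMON_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) "
    "Gecko/20100101 Firefox/70.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/13.0.3 Safari/605.1.15",
]


def spoofed_headers(pool=COMMON_USER_AGENTS, rng=random):
    """Return request headers carrying a randomly chosen user agent."""
    return {"User-Agent": rng.choice(pool)}


if __name__ == "__main__":
    # Each call may yield a different string from the pool.
    for _ in range(3):
        print(spoofed_headers()["User-Agent"])
```

The resulting dictionary could be passed as the `headers` argument of any HTTP client that accepts one.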
Details
A problem with similar modules and programs is that they either use a static dataset or scrape user agents from sources which are badly outdated or completely broken. ua_spoofer attempts to solve this by fetching up-to-date data based on the latest browser versions and amalgamating it from several sources. This provides redundancy and a good mix of current user agents, without depending on an API or downloading a static dataset which quickly goes out of date. More sources can be added over time without breaking compatibility.
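The amalgamation idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual implementation: `collect_user_agents` and the three stand-in sources are hypothetical, and real sources would fetch and parse web pages rather than return fixed lists.

```python
def collect_user_agents(sources):
    """Merge user agents from several sources, skipping any that fail.

    `sources` is an iterable of zero-argument callables, each returning
    an iterable of user agent strings.
    """
    merged = []
    seen = set()
    for fetch in sources:
        try:
            candidates = list(fetch())
        except Exception:
            # A broken or unreachable source should not sink the others.
            continue
        for ua in candidates:
            ua = ua.strip()
            # Keep the first occurrence only, preserving source order.
            if ua and ua not in seen:
                seen.add(ua)
                merged.append(ua)
    return merged


# Stand-in sources for demonstration purposes only.
def source_a():
    return ["Mozilla/5.0 (A)", "Mozilla/5.0 (B)"]


def source_b():
    raise ConnectionError("source is down")


def source_c():
    return ["Mozilla/5.0 (B)", " Mozilla/5.0 (C) "]


if __name__ == "__main__":
    print(collect_user_agents([source_a, source_b, source_c]))
```

Because each source is just a callable, adding another one means appending to the list, and a failing source only reduces the pool instead of emptying it.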
Installing
ua_spoofer requires Python 3, plus Requests and BeautifulSoup, commonly used modules for scraping purposes.
pip install ua_spoofer
Using
Getting User Agents
from ua_spoofer import UserAgent
ua = UserAgent()
# Random user agents from a specified browser
ua.chrome
ua.firefox
ua.ie
# Any random user agent
ua.random
# Get a list of supported browsers
ua.BROWSERS
# Get the list of all user agent strings
ua.all
# Update the list
ua.update()
Using the Requests Session wrapper
from ua_spoofer import SpoofSession
s = SpoofSession()
# Each request will use a different user agent string
# A few other headers are randomised too
# To demonstrate:
s.get("https://icanhazheaders.com/").json()
s.get("https://icanhazheaders.com/").json()
s.get("https://icanhazheaders.com/").json()
# To get the UserAgent instance of the session
s.ua
# Updating the user agent list is done as you would expect
s.ua.update()
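For readers curious how per-request rotation can work in principle, here is a minimal sketch using only the standard library. It is not how `SpoofSession` is implemented; the class `RotatingRequestBuilder` and the placeholder agent strings are hypothetical. A Requests-based version could apply the same idea inside a session subclass.

```python
import random
import urllib.request


class RotatingRequestBuilder:
    """Build requests whose User-Agent is picked at random on every call."""

    def __init__(self, user_agents, rng=None):
        if not user_agents:
            raise ValueError("at least one user agent is required")
        self.user_agents = list(user_agents)
        self.rng = rng or random.Random()

    def build(self, url):
        # Choose a user agent independently for each request, so
        # consecutive requests may occasionally repeat a string.
        headers = {"User-Agent": self.rng.choice(self.user_agents)}
        return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    builder = RotatingRequestBuilder(["agent-one", "agent-two", "agent-three"])
    for _ in range(3):
        req = builder.build("https://example.com/")
        # urllib stores header names capitalised as "User-agent".
        print(req.get_header("User-agent"))
```

No network traffic is sent here; each request object would still need to be passed to `urllib.request.urlopen` to be sent.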
Other projects
As mentioned earlier, there are other Python modules which attempt to do similar things.
User agent spoofing isn't the only technique for bypassing restrictions. As more sites become JavaScript-based and use more aggressive techniques to protect against crawlers, bots and DDoS attacks, other methods are sometimes necessary, including headless browser automation:
- cloudflare-scrape is a module to bypass Cloudflare's anti-bot system
- PhantomJS is a scriptable headless browser
- Selenium is a full browser automation framework
- Scrapy is a Python framework for building crawlers
- Spynner is another scriptable Python browser module
In some cases, Tor or a VPN can be used to hide the client's IP address for proper anonymity.
License
ua_spoofer is released under the terms of the Apache 2.0 license.
File details
Details for the file ua_spoofer-1.0.tar.gz.
File metadata
- Download URL: ua_spoofer-1.0.tar.gz
- Upload date:
- Size: 5.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 916e87f35efabc7fb0df060cd1044c73fa4770275e1b898b5e325e6c600657c9
MD5 | 795cfb51172bbee458248c12015e2bb4
BLAKE2b-256 | bff549ee0dc21736d91229bf43b41fb88321ec8e0dce6ccd84470f2f04ee3875
File details
Details for the file ua_spoofer-1.0-py3-none-any.whl.
File metadata
- Download URL: ua_spoofer-1.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | d272f66fcc65d080a238599acdaf3e7b1ddd8bbb629faeb80a46842ad826671d
MD5 | da27ce3d8aec2c98535fe75732cdbea3
BLAKE2b-256 | 602d32c225f60e059bffb80c2ea25586a71e72d3260aa8819143d16d3b2891c9