get-free-proxy

A package to get free proxy

These details have not been verified by PyPI

Project links

Homepage

Project description

# get_free_proxy

get_free_proxy is a tool for collecting free proxies from websites.

install

pip install get-free-proxy

usage

get_free_proxy depends on gen_browser_header.

create gen_browser_header setting
import gen_browser_header.setting.Setting as gbh_setting
import gen_browser_header.self.SelfEnum as gbh_self_enum
cur_gbh_setting = gbh_setting.GbhSetting()
cur_gbh_setting.proxy_ip = ['10.11.12.13:8080']
cur_gbh_setting.browser_type = {gbh_self_enum.BrowserType.All}
cur_gbh_setting.firefox_ver = {'min': 74, 'max': 75}
cur_gbh_setting.chrome_type = {gbh_self_enum.ChromeType.Stable}
cur_gbh_setting.chrome_max_release_year = 1
cur_gbh_setting.os_type = {gbh_self_enum.OsType.Win64}

create get_free_proxy setting
import os
import tempfile
import get_free_proxy.self.SelfEnum as gfp_self_enum
import get_free_proxy.setting.Setting as gfp_setting
cur_gfp_setting = gfp_setting.GfpSetting()
cur_gfp_setting.proxy_type = {gfp_self_enum.ProxyType.HIGH_ANON}
cur_gfp_setting.protocol = {gfp_self_enum.ProtocolType.HTTP, gfp_self_enum.ProtocolType.HTTPS}
cur_gfp_setting.country = {gfp_self_enum.Country.All}
cur_gfp_setting.storage_type = {gfp_self_enum.StorageType.All}
cur_gfp_setting.mysql = {
    'host': '127.0.0.1',
    'port': 3306,
    'user': 'root',
    'pwd': '1234',
    'db_name': 'db_proxy',
    'tbl_name': 'tbl_proxy',
    'charset': 'utf8mb4'
}
cur_gfp_setting.redis = {
    'host': '127.0.0.1',
    'port': 6379,
    'db': 0,  # 0~15
    'pwd': None
}
cur_gfp_setting.result_file_path = os.path.join(tempfile.gettempdir(), 'result.json')
cur_gfp_setting.valid_time_in_db = 86400
cur_gfp_setting.site_max_page_no = 2
cur_gfp_setting.site = {gfp_self_enum.SupportedWeb.Xici}
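The `valid_time_in_db` value set above limits how long a stored proxy is trusted. The following is a minimal sketch of that idea, not the package's own implementation. It assumes the value is a number of seconds (the source does not state the unit), and the record shape and the `drop_expired` helper are hypothetical.

```python
import time

# Hypothetical record shape: (proxy address, Unix timestamp when it was stored).
# The real storage schema used by get_free_proxy is an assumption here.
def drop_expired(records, valid_time, now=None):
    """Return only the records stored within the last `valid_time` seconds."""
    if now is None:
        now = time.time()
    return [(addr, stored_at) for addr, stored_at in records
            if now - stored_at <= valid_time]

records = [
    ("10.0.0.1:8080", 1_000_000),  # stored 100 s before `now` below
    ("10.0.0.2:3128", 900_000),    # stored 100100 s before `now` below
]
fresh = drop_expired(records, valid_time=86400, now=1_000_100)
print(fresh)  # only the first record is still within the validity window
```

Passing `now` explicitly keeps the helper deterministic, which makes it easy to check by hand.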

gfp_setting

1. proxy_type
   type: set; elements are enum gfp_self_enum.ProxyType
   default: {gfp_self_enum.ProxyType.HIGH_ANON}
   description: there are 3 proxy types: transparent, anonymous, and high anonymous (TRANS, ANON, HIGH_ANON). There is an additional member, All; if set, it is replaced by TRANS+ANON+HIGH_ANON.
2. protocol
   type: set; elements are enum gfp_self_enum.ProtocolType
   default: {gfp_self_enum.ProtocolType.HTTP, gfp_self_enum.ProtocolType.HTTPS}
   description: there are 4 proxy protocols: HTTP, HTTPS, SOCK4, SOCK5. There is an additional member, All; if set, it is replaced by HTTP+HTTPS+SOCK4+SOCK5.
3. country
   type: set; elements are enum gfp_self_enum.Country
   default: {gfp_self_enum.Country.China}
   description: some websites provide proxies from all countries; this parameter filters them by country. There is an additional member, All; if set, the country is ignored.
4. storage_type
   type: set; elements are enum gfp_self_enum.StorageType
   default: {gfp_self_enum.StorageType.All}
   description: 3 storage types are currently supported: Mysql, Redis, File. There is an additional member, All; if set, it is replaced by Mysql+Redis+File.
5. mysql
   type: dict
   default:
   {
       'host': '127.0.0.1',
       'port': 3306,
       'user': 'root',
       'pwd': '1234',
       'db_name': 'db_proxy',
       'tbl_name': 'tbl_proxy',
       'charset': 'utf8mb4'
   }
   description: if storage_type includes Mysql, set this parameter to connect to MySQL.
6. redis
   type: dict
   default:
   {
       'host': '127.0.0.1',
       'port': 6379,
       'db': 0,  # 0~15
       'pwd': None
   }
   description: if storage_type includes Redis, set this parameter to connect to Redis.
7. result_file_path
   type: string
   default: os.path.join(tempfile.gettempdir(), 'result.json')
   description: if storage_type includes File, all collected proxies are saved to the file given by result_file_path.
8. valid_time_in_db
   type: int
   description: because all collected proxies are free, it is unknown when they will expire. This parameter sets how long a stored proxy is considered valid; a proxy that has exceeded this duration is deleted or no longer chosen.
9. site_max_page_no
   type: int
   default: 2
   description: min: 2, max: 9. Websites that provide free proxies paginate their content. This parameter determines how many pages are processed to extract proxies.
10. site
   type: set; elements are enum gfp_self_enum.SupportedWeb
   default: {gfp_self_enum.SupportedWeb.Xici}
   description: this parameter determines which sites are used to extract proxies. Currently only 4 sites are supported: https://www.xicidaili.com, https://www.kuaidaili.com/free, https://hidemy.name/en/proxy-list/#list, https://proxy-list.org/english. If All is set, it is replaced by these 4 sites.
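Two rules recur in the settings above: an `All` member stands for every concrete member of its enum, and `site_max_page_no` is bounded to 2..9. The sketch below illustrates both under stated assumptions. The `ProxyType` class is a hypothetical stand-in rather than `gfp_self_enum.ProxyType`, and `expand_all` and `clamp_page_no` are illustrative helpers, not part of the package. Whether the package clamps an out-of-range page number or rejects it is not stated in the source; clamping is assumed here.

```python
from enum import Enum

class ProxyType(Enum):
    # Hypothetical stand-in for gfp_self_enum.ProxyType; member values are made up.
    TRANS = "transparent"
    ANON = "anonymous"
    HIGH_ANON = "high_anonymous"
    All = "all"

def expand_all(selected, enum_cls):
    """Replace the `All` member with every concrete member of enum_cls."""
    if enum_cls.All in selected:
        return {member for member in enum_cls if member is not enum_cls.All}
    return set(selected)

def clamp_page_no(value, low=2, high=9):
    """Force a page count into the documented 2..9 range (assumed behaviour)."""
    return max(low, min(high, value))

expanded = expand_all({ProxyType.All}, ProxyType)
print(sorted(member.name for member in expanded))
print(clamp_page_no(1), clamp_page_no(5), clamp_page_no(20))
```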

start to use
import get_free_proxy.main.main as main
mainOp = main.MainOp(cur_gfp_setting, cur_gbh_setting)
# delete all existing proxies
mainOp.del_proxy()
mainOp.check_if_site_need_proxy()
# some websites that provide free proxies can be reached directly
proxies = mainOp.get_proxy_without_proxy()
# not all proxies are usable, so keep only the ones that work
validate_proxies = mainOp.async_validate_proxies(proxies, 'https://www.baidu.com')
# other websites that provide free proxies can only be reached through a proxy;
# use the proxies returned by mainOp.get_proxy_without_proxy()
tmp_proxies = mainOp.get_proxy_with_proxy(proxies=proxies)
# validate the newly collected proxies as well
validate_proxies += mainOp.async_validate_proxies(tmp_proxies, 'https://www.baidu.com')
# store proxies for later use
mainOp.save_proxy(proxies=tmp_proxies)
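The validation step above filters a list of proxies down to the ones that actually work. As a rough illustration of how such a concurrent check could be structured with `asyncio`, here is a minimal sketch. It is not how `async_validate_proxies` is implemented; `validate_proxies`, `fake_check`, and the plain `host:port` string format are assumptions made for the example, and the real network request is replaced by a stand-in so the code runs offline.

```python
import asyncio

async def validate_proxies(proxies, check, max_concurrency=10):
    """Run `check(proxy)` concurrently; keep proxies for which it returns True."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(proxy):
        async with semaphore:  # limit how many checks run at once
            try:
                return await check(proxy)
            except Exception:
                return False  # a failing check counts as an unusable proxy

    results = await asyncio.gather(*(guarded(p) for p in proxies))
    return [p for p, ok in zip(proxies, results) if ok]

async def fake_check(proxy):
    # Stand-in for a real request sent through the proxy to a test URL.
    await asyncio.sleep(0)
    if proxy.startswith("bad"):
        raise ConnectionError("simulated connection failure")
    return proxy.endswith(":8080")

candidates = ["10.0.0.1:8080", "10.0.0.2:3128", "bad-host:8080"]
usable = asyncio.run(validate_proxies(candidates, fake_check))
print(usable)
```

Injecting the `check` coroutine keeps the concurrency logic separate from the network code, so a real HTTP client can be swapped in later.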

Release history

0.1.2 (Apr 28, 2020)
0.0.1 (Apr 24, 2020, this version)

Download files

Source Distribution

get_free_proxy-0.0.1.tar.gz (23.2 kB)

Uploaded Apr 24, 2020 Source

Built Distribution

get_free_proxy-0.0.1-py3-none-any.whl (27.9 kB)

Uploaded Apr 24, 2020 Python 3

Hashes for get_free_proxy-0.0.1.tar.gz

Algorithm	Hash digest
SHA256	`655021f8a8f4a15eb4dfef928afadbaa127ecdd9141caf33231d209f9f9ec3e8`
MD5	`59d485ba1c711d6b84952d2d595bf05f`
BLAKE2b-256	`f55be597bf7f842adb326cf2d88af5e6316f7fdd2411ff4e843d37ea5b8d4f08`

Hashes for get_free_proxy-0.0.1-py3-none-any.whl

Algorithm	Hash digest
SHA256	`5b1369cfc4b89808a5ea0ca974fa5761b1dcd3b488a6d906efa53a4aa8cc9b20`
MD5	`dbda20a484dbf9cd16335fa59f8f347f`
BLAKE2b-256	`1d4f21b68e686521b849a4246235baf73f32c9990ae8e6a293547ae34f7d789d`