Skip to main content

A package to get free proxy

Project description

#get_free_proxy get_free_proxy is a tool to get free proxy from website

install

pip install get-free-proxy

usage

get_free_proxy depends on gen_browser_header.

create gen_browser_header setting
import gen_browser_header.setting.Setting as gbh_setting
import gen_browser_header.self.SelfEnum as gbh_self_enum
cur_gbh_setting = gbh_setting.GbhSetting()
cur_gbh_setting.proxy_ip = ['10.11.12.13:8080']
cur_gbh_setting.browser_type = {gbh_self_enum.BrowserType.All}
cur_gbh_setting.firefox_ver = {'min': 74, 'max': 75}
cur_gbh_setting.chrome_type = {gbh_self_enum.ChromeType.Stable}
cur_gbh_setting.chrome_max_release_year = 1
cur_gbh_setting.os_type = {gbh_self_enum.OsType.Win64}

create get_free_proxy setting
import get_free_proxy.self.SelfEnum as gfp_self_enum
import get_free_proxy.setting.Setting as gfp_setting
cur_gfp_setting = gfp_setting.GfpSetting()
cur_gfp_setting.proxy_type = {gfp_self_enum.ProxyType.HIGH_ANON}
cur_gfp_setting.protocol = {gfp_self_enum.ProtocolType.HTTP, gfp_self_enum.ProtocolType.HTTPS}
cur_gfp_setting.country = {gfp_self_enum.Country.All}
cur_gfp_setting.storage_type = {gfp_self_enum.StorageType.All}
cur_gfp_setting.mysql = { 'host': '127.0.0.1', 'port': 3306, 'user': 'root', 'pwd': '1234', 'db_name': 'db_proxy', 'tbl_name': 'tbl_proxy', 'charset': 'utf8mb4'}
cur_gfp_setting.redis = { 'host': '127.0.0.1', 'port': 6379, 'db': 0, # 0~15 'pwd': None }
cur_gfp_setting.result_file_path = os.path.join(tempfile.gettempdir(), 'result.json')
cur_gfp_setting.valid_time_in_db = 86400
cur_gfp_setting.site_max_page_no = 2
cur_gfp_setting.site = {gfp_self_enum.SupportedWeb.Xici}

gfp_setting

  1. proxy_type
    type: set, element is enum=>gfp_self_enum.ProxyType
    default: {gfp_self_enum.ProxyType.HIGH_ANON}
    description: proxy has 3 type: transparent/anonymous/high_anonymous, TRANS/ANON/HIGH_ANON. There is an addition one All, if set, will be replace by TRANS+ANON+HIGH_ANON
  2. protocol
    type: set, element is enum=>gfp_self_enum.ProtocolType default: {gfp_self_enum.ProtocolType.HTTP, gfp_self_enum.ProtocolType.HTTPS}
    description: proxy protocol has 4 type: HTTP, HTTPS, SOCK4, SOCK5. There is an addition one All, is set, will be replace by HTTP+HTTPS+SOCK4+SOCK5.
  3. country
    type: set, element is enum=>gfp_self_enum.Country
    default: {gfp_self_enum.Country.China}
    description: some web provide proxy form all countries, the parameter will filter the country. There is an addition one All, is set, will ignore country.
  4. storage_type type: set, element is enum=>gfp_self_enum.StorageType
    default: {gfp_self_enum.StorageType.All}
    description: current support 3 storage type: Mysql/Redis/File. There is an addition one All, is set, will be replace by Mysql+Redis+File
  5. mysql
    type: dict
    default: ***
    {
    'host': '127.0.0.1',
    'port': 3306,
    'user': 'root',
    'pwd': '1234',
    'db_name': 'db_proxy',
    'tbl_name': 'tbl_proxy',
    'charset': 'utf8mb4'
    }

description: if storage_type include Mysql, set this parameter to connect mysql.
5. redis
type: dict default: ***
{
'host': '127.0.0.1',
'port': 6379,
'db': 0, # 0~15
'pwd': None
}


description: if storage_type include Redis, set this parameter to connect redis. 6. result_file_path
type: string
default: os.path.join(tempfile.gettempdir(), 'result.json')
description: if storage_type include File, all get proxy will be save into the file defined by result_file_path
7. valid_time_in_db
type: int
description: since all got proxy are free, not sure when these proxy will expire. So set this parameter, it a proxy expire this duration, will not delete/not_choose
8. site_max_page_no
type: int
default: 2
description: min:2, max:9. The web site which provide free proxy, the content are pagationed. This parameter determine how many page will be handled to extract proxy.
9. site
type: set, enum=>gfp_self_enum.SupportedWeb default: {gfp_self_enum.SupportedWeb.Xici}
description: this parameter determine which site will be used to extract proxy. currently only support 4 site: https://www.xicidaili.com, https://www.kuaidaili.com/free, https://hidemy.name/en/proxy-list/#list, https://proxy-list.org/english. and if All is set, will be replace by above 4 site.

start to use
import get_free_proxy.main.main as main
mainOp = main.MainOp(cur_gfp_setting, cur_gbh_setting)
delete all exist proxy mainOp.del_proxy()
mainOp.check_if_site_need_proxy()
some website that provide free proxy can connect directly
proxies = mainOp.get_proxy_without_proxy()
not all proxy are usable, so should pick up useful proxy
validate_proxies = mainOp.async_validate_proxies(proxies, 'https://www.baidu.com')
some website that provide free proxy must use proxy, use proxy get in mainOp.get_proxy_without_proxy()
tmp_proxies = mainOp.get_proxy_with_proxy(proxies=proxies)
validate usable again
validate_proxies += mainOp.async_validate_proxies(tmp_proxies, 'https://www.baidu.com')
store valid proxy to use later
mainOp.save_proxy(proxies=tmp_proxies)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

get_free_proxy-0.0.1.tar.gz (23.2 kB view hashes)

Uploaded Source

Built Distribution

get_free_proxy-0.0.1-py3-none-any.whl (27.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page