
A package for removing tracking parameters from URLs. It supports automatically updating its filtering rules from Adguard.


URL Cleaner


Introduction

A package for removing tracking parameters from URLs. This package supports:

  • Automatic updates of filtering rules from Adguard.
  • Custom filtering rules.
  • Filtering rules specific to a host and pathname.
  • Hundreds of built-in filtering rules, ready to use.
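
To make the idea of host- and pathname-specific filtering concrete, here is a minimal sketch of how a rule lookup of this kind could work. It is not the package's implementation: the rule tables, the `params_to_remove` function, and the `utm_*` names are assumptions chosen purely for illustration.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hypothetical rule tables, for illustration only. The package's real rules
# come from AdguardFilters and url_bot and are not reproduced here.
GLOBAL_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}
SCOPED_PARAMS = {
    # (host, path prefix) -> parameter names to drop on matching URLs
    ("baijiahao.baidu.com", "/s"): {"wfr", "for"},
    ("www.bilibili.com", "/video/"): {"spm_id_from"},
}


def params_to_remove(url: str) -> set[str]:
    """Return the parameter names that the rules select for this URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    names = set(GLOBAL_PARAMS)  # global rules apply to every URL
    for (rule_host, prefix), scoped in SCOPED_PARAMS.items():
        # A scoped rule applies only when both host and path prefix match.
        if host == rule_host and parts.path.startswith(prefix):
            names |= scoped
    return names
```

With these tables, `spm_id_from` is selected for a `www.bilibili.com/video/...` URL but not for other paths on the same host, while the global names are selected everywhere.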

Inspired by ClearUrl and URL Bot. Thanks to their authors for their efforts!

Rules from:

  1. AdguardFilters
  2. url_bot

Examples

Original: https://baijiahao.baidu.com/s?id=1748839822649920321&wfr=spider&for=pc
Cleaned: https://baijiahao.baidu.com/s?id=1748839822649920321

Original: https://mp.weixin.qq.com/s?__biz=MjM5OTExMjYwMA==&mid=2670081058&idx=6&sn=1ad7112020c2a4104d67ca542ab14444&chksm=bc12eed58b6567c30c78123a9e8901241512642305dabae4fa1f52357f5ce0ac7a85554#rd
Cleaned: https://mp.weixin.qq.com/s?__biz=MjM5OTExMjYwMA%3D%3D&mid=2670081058&idx=6&sn=1ad7112020c2a4104d67ca542ab14444#rd

Original: https://www.bilibili.com/video/BV158411b7ki/?spm_id_from=333234107.tianma.1-2-2.click
Cleaned: https://www.bilibili.com/video/BV158411b7ki/
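
The transformations above can be approximated with the Python standard library alone. The sketch below is an illustration, not the package's code: `strip_params` and its `blocked` argument are hypothetical names, and the sets of blocked parameters are read off the examples rather than taken from the real rule files.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url: str, blocked: set) -> str:
    """Return url without the query parameters whose names are in blocked."""
    parts = urlsplit(url)
    # Decode the query into (name, value) pairs and keep the allowed ones.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    # urlencode percent-encodes the values again, so "==" becomes "%3D%3D".
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_params(
    "https://baijiahao.baidu.com/s?id=1748839822649920321&wfr=spider&for=pc",
    {"wfr", "for"},
))
# prints https://baijiahao.baidu.com/s?id=1748839822649920321
```

Rebuilding the query with `urlencode` gives the same kind of result as the second example, where the `==` padding in `__biz` appears as `%3D%3D` in the cleaned URL. The fragment (`#rd`) is untouched because it is not part of the query.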

Usage

Install

pip install url-cleaner

Clean URLs

from url_cleaner import UrlCleaner

c = UrlCleaner()
url = "https://baijiahao.baidu.com/s?id=1748839822649920321&wfr=spider&for=pc"
# clean() returns the URL with its tracking parameters removed.
cleaned = c.clean(url)
print(cleaned)

Output:

https://baijiahao.baidu.com/s?id=1748839822649920321

Update rules

from url_cleaner import UrlCleaner

c = UrlCleaner()
# Refresh the filtering rules from their upstream source.
c.ruler.update_rules()

Download files

Source distribution: url_cleaner-0.1.5.tar.gz (22.2 kB)

Built distribution: url_cleaner-0.1.5-py3-none-any.whl (24.0 kB, Python 3)
