
A toolkit assembled from assorted small scripts, mainly for web crawling and data governance.

Mugwort Tools

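This is a toolkit assembled from assorted small scripts, mainly for data governance and web crawling.

Getting Started

Because the toolkit uses type hints, it runs only on Python 3.6 or later.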
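
The 3.6 floor is consistent with variable annotations (PEP 526), which older interpreters reject at parse time. A minimal sketch of a start-up guard that a calling script could run before importing the toolkit; `require_python` is a hypothetical helper and not part of mugwort:

```python
import sys


def require_python(minimum=(3, 6), current=None):
    """Return True when `current` (default: the running interpreter) meets `minimum`."""
    current = tuple(current) if current is not None else tuple(sys.version_info[:2])
    minimum = tuple(minimum)
    if current < minimum:
        raise RuntimeError(
            'Python %d.%d or later is required, found %d.%d' % (minimum + current)
        )
    return True


# Fail early with a readable message instead of a SyntaxError deep inside an import
require_python()
```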

  • Quick install
pip install mugwort
  • Partial install
pip install mugwort[cryptor]
pip install mugwort[proxy-clash]
  • Full install (with all dependencies)
pip install mugwort[all]

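Tools

Tool                 Version  Description
Logger               1.1      Logging tool with console and file output
MultiTask            1.0      Multitasking tool built on threads and processes
Cryptor              1.0      Cryptography tools built on a range of algorithms
ElasticsearchHelper  1.0      Helper for working with Elasticsearch quickly
ClashProxy           1.0      Proxy tool with subscription updates, node switching, and node health checks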

Logger

A logging tool with console and file output and ANSI color support. The log style is modeled on Spring Boot projects.

  • Code example
from mugwort import Logger

logger = Logger('foo', Logger.DEBUG, verbose=True)

logger.debug('This is verbose debug log.')
logger.info('This is verbose info log.')
logger.warning('This is verbose warning log.')
logger.error('This is verbose error log.')
logger.critical('This is verbose critical log.')
logger.critical('This is verbose critical log with stack_info.', stack_info=True)

try:
    raise Exception('some exception')
except Exception as e:
    logger.exception(e)
  • Sample output

[Image: LoggerExample]
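
To make the console-plus-file idea concrete, here is a minimal sketch built only on the standard `logging` module. It is not mugwort's implementation: the color palette, the format pattern, and the names `ColorFormatter` and `build_logger` are assumptions chosen for illustration.

```python
import logging
import os
import sys
import tempfile

# ANSI color codes per level (an illustrative palette, not mugwort's)
COLORS = {
    logging.DEBUG: '\x1b[36m',
    logging.INFO: '\x1b[32m',
    logging.WARNING: '\x1b[33m',
    logging.ERROR: '\x1b[31m',
    logging.CRITICAL: '\x1b[35m',
}
RESET = '\x1b[0m'


class ColorFormatter(logging.Formatter):
    """Wrap the level name in an ANSI color sequence."""

    def format(self, record):
        # Work on a copy so the file handler still sees the plain level name
        record = logging.makeLogRecord(record.__dict__)
        color = COLORS.get(record.levelno, '')
        record.levelname = color + record.levelname + RESET
        return super().format(record)


def build_logger(name, logfile, level=logging.DEBUG):
    """Create a logger that writes colored lines to stdout and plain lines to a file."""
    pattern = '%(asctime)s %(levelname)s --- [%(threadName)s] %(name)s : %(message)s'
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False

    console = logging.StreamHandler(sys.stdout)
    console.setFormatter(ColorFormatter(pattern))
    logger.addHandler(console)

    file_handler = logging.FileHandler(logfile, encoding='utf-8')
    file_handler.setFormatter(logging.Formatter(pattern))
    logger.addHandler(file_handler)
    return logger


logfile = os.path.join(tempfile.mkdtemp(), 'demo.log')
demo_logger = build_logger('demo', logfile)
demo_logger.info('console output is colored, file output is plain')
```

Keeping the color codes out of the file handler leaves the log file readable by tools that do not interpret ANSI sequences.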

MultiTask

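A multitasking tool built on threads and processes, with ready-to-use shared variables for exchanging data between threads and between processes.

  • Code example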
from mugwort import MultiTask

def fn(*args, **kwargs):
    for arg in args:
        print(arg)
    for kw, arg in kwargs.items():
        print(kw, '->', arg)


def main():
    multitask = MultiTask(mode='process', max_workers=4)

    multitask.submit(
        fn,
        multitask.variable.get_lock(),
        multitask.variable.get_r_lock(),
        multitask.variable.get_condition(),
        multitask.variable.get_semaphore(2),
        multitask.variable.get_bounded_semaphore(2),
        multitask.variable.get_event(),
        multitask.variable.get_barrier(2),
        multitask.variable.get_queue(2),
        variable_dict=multitask.variable.get_dict({'a': 1}),
        variable_list=multitask.variable.get_list([1, 2, 3]),
        process_variable_namespace=multitask.variable.get_namespace(),
        process_variable_array=multitask.variable.get_array('i', [1, 2]),
        process_variable_value=multitask.variable.get_value('i', 123456),
    )


if __name__ == '__main__':
    main()
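
The example above only prints the shared variables. As a rough illustration of why a lock is passed alongside a shared container, the following sketch uses the standard library directly (`concurrent.futures` and `threading`) rather than MultiTask; `count_words` and `run` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_words(text, totals, lock):
    """Count the words in `text` and merge them into the shared dict."""
    for word in text.split():
        with lock:  # serialize the read-modify-write on the shared dict
            totals[word] = totals.get(word, 0) + 1


def run(texts, max_workers=4):
    """Submit one task per text and return the merged word counts."""
    totals = {}
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(count_words, t, totals, lock) for t in texts]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return totals


print(run(['a b a', 'b c', 'a']))
```

In process mode a plain `dict` and `threading.Lock` are not shared between workers; with the standard library the usual substitute is a `multiprocessing.Manager()`, whose `dict()` and `Lock()` proxies can be passed to worker processes.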

Cryptor

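Cryptography tools built on a range of algorithms, covering symmetric encryption and decryption, asymmetric key pair generation, asymmetric encryption and decryption, asymmetric signing and verification, key exchange, and two-factor token generation and verification.

  • Supported cipher modes
    • CBC, XTS, ECB, OFB, CFB, CFB8, CTR, GCM
  • Supported padding schemes
    • PKCS7, ANSIX923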
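
The two padding schemes differ only in the filler bytes. The sketch below shows both in plain Python so the difference is visible; the function names are hypothetical, and this is an illustration of the schemes themselves, not the code mugwort runs.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """PKCS7: append n bytes, each with value n, to reach a block boundary."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip PKCS7 padding, rejecting malformed input."""
    n = padded[-1]
    if n < 1 or n > len(padded) or padded[-n:] != bytes([n]) * n:
        raise ValueError('invalid PKCS7 padding')
    return padded[:-n]


def ansix923_pad(data: bytes, block_size: int = 16) -> bytes:
    """ANSI X9.23: append n - 1 zero bytes followed by one byte with value n."""
    n = block_size - len(data) % block_size
    return data + bytes(n - 1) + bytes([n])


print(pkcs7_pad(b'abc', 8))     # b'abc\x05\x05\x05\x05\x05'
print(ansix923_pad(b'abc', 8))  # b'abc\x00\x00\x00\x00\x05'
```

Input that already fills a whole block still receives a full block of padding, so the final byte always encodes the padding length.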

AESCryptor

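An AES encryption and decryption tool that supports the common cipher modes and padding schemes. Its methods can be called without instantiating the class.

  • Code example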
import os
from mugwort.tools.cryptor import AESCryptor

key = b'this_is_aes_key.'
iv = os.urandom(16)
ciphertext = AESCryptor.cbc_pkcs7_encryptor(
    data=b'this_is_aes_plaintext.', key=key, iv=iv,
)
plaintext = AESCryptor.cbc_pkcs7_decryptor(
    ciphertext, key, iv
)
print(key, iv, plaintext)

TripleDESCryptor

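A 3DES encryption and decryption tool that supports the common cipher modes and padding schemes and is compatible with DES. Its methods can be called without instantiating the class.

  • Code example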
import os
from mugwort.tools.cryptor import TripleDESCryptor

# With an 8-byte key, this is equivalent to DES
key = b'des_key.'
iv = os.urandom(8)
ciphertext = TripleDESCryptor.cbc_pkcs7_encryptor(
    b'this_is_des_plaintext.', key, iv
)
plaintext = TripleDESCryptor.cbc_pkcs7_decryptor(
    ciphertext, key, iv
)
print(key, iv, plaintext)

key = b'this_is_triple_3des_key.'
iv = os.urandom(8)
ciphertext = TripleDESCryptor.cbc_pkcs7_encryptor(
    b'this_is_triple_des_plaintext.', key, iv
)
plaintext = TripleDESCryptor.cbc_pkcs7_decryptor(
    ciphertext, key, iv
)
print(key, iv, plaintext)

RSACryptor

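An RSA tool for encryption and signing that supports key pair generation, message encryption, message decryption, message signing, and signature verification. Its methods can be called without instantiating the class.

  • Code example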
from mugwort.tools.cryptor import RSACryptor

public_key, private_key = RSACryptor.generate()

# Load from local files
# with open('public_key.pem', 'rb') as pub, open('private_key.pem', 'rb') as priv:
#     public_key = RSACryptor.load_public_key(pub.read())
#     private_key = RSACryptor.load_private_key(priv.read())

ciphertext = RSACryptor.encrypt(public_key, b'this_is_rsa_plaintext.')
print(ciphertext)
plaintext = RSACryptor.decrypt(private_key, ciphertext)
print(plaintext)

signature = RSACryptor.sign(private_key, b'this_is_rsa_plaintext.')
validity = RSACryptor.verify(public_key, b'this_is_rsa_plaintext.', signature)
print(validity, signature)

# Dump to local files
# with open('public_key.pem', 'wb') as pub, open('private_key.pem', 'wb') as priv:
#     pub.write(RSACryptor.dump_public_key(public_key))
#     priv.write(RSACryptor.dump_private_key(private_key, password=b'password'))
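
For readers who want to see how the four operations relate, here is a textbook RSA sketch with tiny primes. It is deliberately insecure (no padding, a 12-bit modulus) and is not how RSACryptor works internally; all names are hypothetical.

```python
import hashlib


def egcd(a, b):
    """Extended Euclid: return (g, x, y) with a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y


def modinv(a, m):
    """Modular inverse of a modulo m."""
    g, x, _ = egcd(a, m)
    if g != 1:
        raise ValueError('no modular inverse')
    return x % m


# Toy key pair: public key (N, E), private exponent D
P, Q, E = 61, 53, 17
N = P * Q
D = modinv(E, (P - 1) * (Q - 1))


def encrypt(m):
    return pow(m, E, N)  # uses the public key


def decrypt(c):
    return pow(c, D, N)  # uses the private key


def digest(message):
    """Reduce a SHA-256 digest into the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), 'big') % N


def sign(message):
    return pow(digest(message), D, N)  # uses the private key


def verify(message, signature):
    return pow(signature, E, N) == digest(message)  # uses the public key


ciphertext = encrypt(65)
print(decrypt(ciphertext) == 65)
signature = sign(b'this_is_rsa_plaintext.')
print(verify(b'this_is_rsa_plaintext.', signature))
```

Encryption and verification need only the public key, while decryption and signing need the private key, which matches the argument order in the RSACryptor example.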

Ed25519Cryptor

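An Ed25519 signing tool that supports key pair generation, message signing, and signature verification. Its methods can be called without instantiating the class.

  • Code example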
from mugwort.tools.cryptor import Ed25519Cryptor

public_key, private_key = Ed25519Cryptor.generate()

# Load from local files
# with open('public_key.pem', 'rb') as pub, open('private_key.pem', 'rb') as priv:
#     public_key = Ed25519Cryptor.load_public_key(pub.read())
#     private_key = Ed25519Cryptor.load_private_key(priv.read())

signature = Ed25519Cryptor.sign(private_key, b'this_is_ed25519_plaintext.')
print(signature)
validity = Ed25519Cryptor.verify(public_key, b'this_is_ed25519_plaintext.', signature)
print(validity)

# Dump to local files
# with open('public_key.pem', 'wb') as pub, open('private_key.pem', 'wb') as priv:
#     pub.write(Ed25519Cryptor.dump_public_key(public_key))
#     priv.write(Ed25519Cryptor.dump_private_key(private_key, password=b'password'))

X25519Cryptor

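An X25519 key exchange tool that supports key pair generation and key exchange. Its methods can be called without instantiating the class.

  • Code example
    • In a key exchange, the public key is normally shared openly while the private key stays local.
    • Combining my private key with the peer's public key, or the peer's private key with my public key, yields the same shared key.
    • This example does not simulate key transport; it uses two key pairs generated directly.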
from mugwort.tools.cryptor import X25519Cryptor

foo_public_key, foo_private_key = X25519Cryptor.generate()
bar_public_key, bar_private_key = X25519Cryptor.generate()

foo_bar_shared_key = X25519Cryptor.exchange(foo_private_key, bar_public_key)
bar_foo_shared_key = X25519Cryptor.exchange(bar_private_key, foo_public_key)
print(foo_bar_shared_key == bar_foo_shared_key, foo_bar_shared_key)

public_key_bytes = X25519Cryptor.dump_public_key(foo_public_key)
private_key_bytes = X25519Cryptor.dump_private_key(foo_private_key)
print(public_key_bytes)
print(private_key_bytes)

# public_key = X25519Cryptor.load_public_key(public_key_bytes)
# private_key = X25519Cryptor.load_private_key(private_key_bytes)
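
Why both sides arrive at the same key is easier to see with classic finite-field Diffie-Hellman, which follows the same pattern as X25519 but uses modular exponentiation instead of curve arithmetic. The sketch below is that swapped-in technique with toy parameters, not X25519 itself; `generate` and `exchange` mirror the README's names only for readability.

```python
import secrets

# Toy parameters for illustration only: a Mersenne prime modulus and a small base
P = 2 ** 127 - 1
G = 3


def generate():
    """Return (public_key, private_key), in the same order as the README example."""
    private_key = secrets.randbelow(P - 3) + 2  # random integer in [2, P - 2]
    public_key = pow(G, private_key, P)
    return public_key, private_key


def exchange(my_private_key, peer_public_key):
    """Both sides compute G ** (a * b) mod P, so the results match."""
    return pow(peer_public_key, my_private_key, P)


foo_public, foo_private = generate()
bar_public, bar_private = generate()
print(exchange(foo_private, bar_public) == exchange(bar_private, foo_public))
```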

TOTPCryptor

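A one-time password generation and verification tool built on the algorithms used for two-factor authentication. Its methods can be called without instantiating the class.

  • Code example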
import time
from mugwort.tools.cryptor import TOTPCryptor

timestamp = int(time.time())

value = TOTPCryptor.generate(b'this_is_totp_key.', timestamp)
validity = TOTPCryptor.verify(b'this_is_totp_key.', value, timestamp)
print(validity, value.decode())
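
For reference, the standard TOTP construction (RFC 6238, layered on HOTP from RFC 4226) can be sketched with the standard library alone. HMAC-SHA1, the 30-second step, and the 6-digit length are assumed defaults here, and TOTPCryptor's own parameters may differ; this sketch also returns `str`, whereas the example above suggests TOTPCryptor returns `bytes`.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC the counter, then dynamically truncate."""
    mac = hmac.new(key, struct.pack('>Q', counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte selects a 4-byte window
    code = struct.unpack('>I', mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(key: bytes, timestamp: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP over the number of elapsed time steps."""
    return hotp(key, timestamp // step, digits)


def verify(key: bytes, value: str, timestamp: int, step: int = 30, digits: int = 6) -> bool:
    """Constant-time comparison against the code for the current step."""
    return hmac.compare_digest(totp(key, timestamp, step, digits), value)


# RFC 6238 Appendix B test vector (SHA-1, 8 digits, T = 59)
print(totp(b'12345678901234567890', 59, digits=8))  # 94287082
```

A production verifier usually also accepts the adjacent time steps to tolerate clock drift; that is left out here to keep the sketch short.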

Database

Elasticsearch

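A helper for working with Elasticsearch quickly.

Function        Description
index_refresh   Refresh an index
index_get       Get index information
index_create    Create an index
index_delete    Delete an index
index_exists    Check whether an index exists
alias_get       Get index alias information
alias_create    Create an index alias
alias_delete    Delete an index alias
alias_exists    Check whether an index alias exists
doc_get         Get a document with its full content
doc_get_source  Get a document's source content
doc_create      Create a document
doc_delete      Delete a document
doc_update      Update a document
doc_replace     Create or update a document
doc_exists      Check whether a document exists
doc_count       Count documents
docs_bulk       Run bulk operations on documents
docs_mget       Get multiple documents
docs_reindex    Reindex
search          Search an index
scroll          Run a scroll query
scroll_clear    Clear a scroll query
bulk            Bulk operation helper
scan            Scrolling search helper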
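
The helper's call signatures are not documented here, so the following sketch stays at the protocol level: it builds the newline-delimited JSON body that the Elasticsearch Bulk API expects (an action line followed by a source line). `build_bulk_body` is a hypothetical function, the index name is made up, and no server is contacted.

```python
import json


def build_bulk_body(index, docs):
    """Build an NDJSON body that indexes each (id, source) pair in `docs`."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: which operation to run, and on which document
        lines.append(json.dumps({'index': {'_index': index, '_id': doc_id}}))
        # Source line: the document content itself
        lines.append(json.dumps(source, ensure_ascii=False))
    # The Bulk API requires the body to end with a newline
    return '\n'.join(lines) + '\n'


body = build_bulk_body('articles', {
    '1': {'title': 'first'},
    '2': {'title': 'second'},
})
print(body)
```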

Proxy

Clash

A Clash proxy tool that supports subscription updates, node switching, and node health checks.

  • Code example
from mugwort.tools.proxy.clash_proxy import ClashProxy, ClashConfig

ClashProxy(ClashConfig(
    subscribe_link='https://airplane.com/clash-subscribe-link',
    subscribe_include_keywords=['香港'],
    subscribe_exclude_keywords=['过期时间', '剩余流量', '官网'],
    watcher_blocking=True,
    # By default, the subscription is updated every day at 02:00
    # watcher_job_updater_enable=True,
    # watcher_job_updater_config={'trigger': 'cron', 'hour': 2},
    # By default, the node is switched every hour
    # watcher_job_changer_enable=True,
    # watcher_job_changer_config={'trigger': 'interval', 'hours': 1},
    # By default, nodes are checked every 30 seconds
    # watcher_job_checker_enable=True,
    # watcher_job_checker_config={'trigger': 'interval', 'seconds': 30},
)).startup()
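
How the include and exclude keywords are applied is not spelled out. One plausible reading, sketched below as an assumption, is that a node is kept when its name contains at least one include keyword and none of the exclude keywords. `filter_nodes` and the node names are hypothetical.

```python
def filter_nodes(names, include_keywords=(), exclude_keywords=()):
    """Keep names matching any include keyword and no exclude keyword."""
    selected = []
    for name in names:
        # An empty include list is treated as "accept everything"
        if include_keywords and not any(k in name for k in include_keywords):
            continue
        if any(k in name for k in exclude_keywords):
            continue
        selected.append(name)
    return selected


nodes = ['香港 01', '香港 剩余流量 10G', '日本 01', '官网 example']
print(filter_nodes(nodes, ['香港'], ['过期时间', '剩余流量', '官网']))  # ['香港 01']
```

Subscriptions often carry informational entries such as remaining traffic or expiry date as pseudo-nodes, which is presumably why the example configuration excludes those keywords.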

Changelog

  • 2022-12-30

    • Added the Elasticsearch helper
  • 2022-11-09

    • Added the multitasking tool
  • 2022-10-22

    • Added the proxy tool
  • 2022-09-18

    • Added the cryptography tools
  • 2022-09-14

    • Added the logging tool
