Project description

Intro:

A library to help pre-process web-scraped text files.

Functions:

The library currently provides one function:

def filter_list_of_strings(strings: list[str], min_size: int) -> list[str]:

Scraped webpages include a lot of useless text, such as menus, headers, and footers. These are usually large sections of text that are repeated exactly across multiple files. This function detects sections of text that are repeated between files and removes them.
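To make the idea concrete, here is a minimal pure-Python sketch of this kind of filtering. It is not the library's implementation: the function name `filter_repeated_lines_sketch` is hypothetical, it works line by line rather than on arbitrary sections, and it assumes `min_size` means the minimum length (in characters) a repeated line must have before it is removed. The real function may behave differently.

```python
from collections import Counter


def filter_repeated_lines_sketch(strings: list[str], min_size: int) -> list[str]:
    """Hypothetical sketch: drop lines of at least `min_size` characters
    that occur in more than one of the input strings."""
    # Count in how many documents each distinct line appears
    # (a line repeated inside one document is counted once for it).
    doc_counts: Counter[str] = Counter()
    for text in strings:
        doc_counts.update(set(text.splitlines()))

    filtered = []
    for text in strings:
        kept = [
            line
            for line in text.splitlines()
            # Assumption: only lines that are long enough AND shared
            # between documents are treated as boilerplate.
            if not (len(line) >= min_size and doc_counts[line] > 1)
        ]
        filtered.append("\n".join(kept))
    return filtered


# Two made-up "pages" sharing a menu line and a footer line.
pages = [
    "Home | About | Contact\nArticle one body text.\nCopyright Example Corp",
    "Home | About | Contact\nA different article.\nCopyright Example Corp",
]
cleaned = filter_repeated_lines_sketch(pages, min_size=10)
print(cleaned)
```

In this sketch the shared menu and footer lines are dropped and only the text unique to each page remains. Matching whole lines keeps the example short; detecting repeated sections that do not align with line breaks would need a different approach.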

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

string_processing-0.1.2-cp313-none-win_amd64.whl (110.1 kB)
Uploaded: CPython 3.13, Windows x86-64

string_processing-0.1.2-cp313-cp313-manylinux_2_34_x86_64.whl (241.9 kB)
Uploaded: CPython 3.13, manylinux (glibc 2.34+), x86-64

string_processing-0.1.2-cp313-cp313-macosx_11_0_arm64.whl (183.7 kB)
Uploaded: CPython 3.13, macOS 11.0+, ARM64

string_processing-0.1.2-cp312-none-win_amd64.whl (110.2 kB)
Uploaded: CPython 3.12, Windows x86-64

string_processing-0.1.2-cp312-cp312-manylinux_2_34_x86_64.whl (242.0 kB)
Uploaded: CPython 3.12, manylinux (glibc 2.34+), x86-64

string_processing-0.1.2-cp312-cp312-macosx_11_0_arm64.whl (183.7 kB)
Uploaded: CPython 3.12, macOS 11.0+, ARM64

string_processing-0.1.2-cp311-none-win_amd64.whl (110.5 kB)
Uploaded: CPython 3.11, Windows x86-64

string_processing-0.1.2-cp311-cp311-manylinux_2_34_x86_64.whl (243.2 kB)
Uploaded: CPython 3.11, manylinux (glibc 2.34+), x86-64

string_processing-0.1.2-cp311-cp311-macosx_11_0_arm64.whl (185.9 kB)
Uploaded: CPython 3.11, macOS 11.0+, ARM64

string_processing-0.1.2-cp310-none-win_amd64.whl (110.6 kB)
Uploaded: CPython 3.10, Windows x86-64

string_processing-0.1.2-cp310-cp310-manylinux_2_34_x86_64.whl (243.5 kB)
Uploaded: CPython 3.10, manylinux (glibc 2.34+), x86-64

string_processing-0.1.2-cp310-cp310-macosx_11_0_arm64.whl (186.0 kB)
Uploaded: CPython 3.10, macOS 11.0+, ARM64

File details

Details for the file string_processing-0.1.2-cp313-none-win_amd64.whl.

File hashes

Hashes for string_processing-0.1.2-cp313-none-win_amd64.whl
Algorithm Hash digest
SHA256 dbbc11af86301467014c2b4f0f4ad6115e52228262f3470e22355d86196d08bc
MD5 255331159e2fb8ebeb0ddf807af783b7
BLAKE2b-256 125fcd1d13b330e8f158bfb02f22851bec52c83919bc4f656d7c3e2357b66ca5

File details

Details for the file string_processing-0.1.2-cp313-cp313-manylinux_2_34_x86_64.whl.

File hashes

Hashes for string_processing-0.1.2-cp313-cp313-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 2d8ea442795f76c2c75d4a077d65c9105628e104630796df4118346594dd787f
MD5 34d0789ea0ea5cee324999e018a189f1
BLAKE2b-256 b9eba54a67e7379f2565392bbc5441335f17566ec5324da23ef657a0717752a0

File details

Details for the file string_processing-0.1.2-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for string_processing-0.1.2-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 be03e36899d1cd3ad0bbb928b5ee6ff6a00b5470b83a9fc2ae604e78a14e9d16
MD5 d8e9a5a2f103da1e7856dfaa076eb5b0
BLAKE2b-256 dd9ce35c4f51d34e5f83c5a5b0abd6348cd8fb8f0b544ca73e6a303696e600cc

File details

Details for the file string_processing-0.1.2-cp312-none-win_amd64.whl.

File hashes

Hashes for string_processing-0.1.2-cp312-none-win_amd64.whl
Algorithm Hash digest
SHA256 8f16c82f25f2635f3e46395dcf2ae1e5b1e6f3a5c8a39ea045d795c7bd149bc4
MD5 97ee315df9b2352caa6fa0fdf1f86144
BLAKE2b-256 2880d6482d2b32cebe6e08d21989aa2ebb3858b7b2a025583a191fd5a59ce51f

File details

Details for the file string_processing-0.1.2-cp312-cp312-manylinux_2_34_x86_64.whl.

File hashes

Hashes for string_processing-0.1.2-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 c04b326da690859883abd9569e6deb2c88526e48b92158fe4352d4403f0306f1
MD5 1cb89891909b32f5cd8a24d8c1da6f3b
BLAKE2b-256 5c63e2fee545aacf81e4e288ef60902226defa4d97b345785b7329a1adf4dc02

File details

Details for the file string_processing-0.1.2-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for string_processing-0.1.2-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a0164b001b3a6e92dfb1b2ffd110ccb4b1d19564fbbaba723ff6a66fa7908898
MD5 83ad8b239ef5ef86a5eec3b423984590
BLAKE2b-256 3663265f6ca98d96579e52cf007c21ef093742762c9cf2f003112ab4256c9d26

File details

Details for the file string_processing-0.1.2-cp311-none-win_amd64.whl.

File hashes

Hashes for string_processing-0.1.2-cp311-none-win_amd64.whl
Algorithm Hash digest
SHA256 3e8108b83c3474ac82790fefed77ba6e7eef27d68355b79e6658402efec67a01
MD5 76eec3f27e3a9fed3370c68b588f10d9
BLAKE2b-256 a3807783bf51b035fc4e2c732066a6fd221cab44c57f5f7c3b677f63e4d601d7

File details

Details for the file string_processing-0.1.2-cp311-cp311-manylinux_2_34_x86_64.whl.

File hashes

Hashes for string_processing-0.1.2-cp311-cp311-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 95bd34089713b9e4a3b80acd3f2f082d834bd5701c787631db2ba42b1461febc
MD5 05365eb6006462edab5c5005cdefedaa
BLAKE2b-256 18841c4fe77a946d65865278171b0e7e0e9b4517e637ce44d23b0593735a3270

File details

Details for the file string_processing-0.1.2-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for string_processing-0.1.2-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dfa41f80fa25b672336563684eca35ce23fde4cf96b72e623c58826c82591dd4
MD5 f4f20484f1e2f114d249d54ceb469585
BLAKE2b-256 d11d7041cac55138bb728746f3866862b3b645ef721cb7bb2b633d4d5d80a224

File details

Details for the file string_processing-0.1.2-cp310-none-win_amd64.whl.

File hashes

Hashes for string_processing-0.1.2-cp310-none-win_amd64.whl
Algorithm Hash digest
SHA256 234f8fde02b256ea0192fbba371429e593c849a557b6c5556be80cd83e242301
MD5 280f48e053be686c6fddf1e795c1b52e
BLAKE2b-256 223cc1862f14be253b12bbbbcce06068d659cb0cafd758f54099283d6331b99e

File details

Details for the file string_processing-0.1.2-cp310-cp310-manylinux_2_34_x86_64.whl.

File hashes

Hashes for string_processing-0.1.2-cp310-cp310-manylinux_2_34_x86_64.whl
Algorithm Hash digest
SHA256 117f95cca16b73ffb38ebe7d824e500946b0f2ec5ef35c70023cad9aece7d658
MD5 84a488ed2bbfb81e0dfaea8db9f932de
BLAKE2b-256 38b7fe237737d0b1c6ff6b2a170fc193fe14f9c7b2e62d5a453429a6954bc607

File details

Details for the file string_processing-0.1.2-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for string_processing-0.1.2-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 115f7ad45081dfff60a7d904d42c39bd4511ceeee1a6186304b3db8ae2693636
MD5 fd39708899104b128f293de1401c76d5
BLAKE2b-256 20f20af47a779a5207d97add33d38fbdf2f60535f6d3b377a1900e617736e307
