Word Freak is a Python library that extracts word frequencies from files.
Supported File Types
File Extension | Explanation | Supported |
---|---|---|
.doc | Microsoft Word document pre-2007 | :x: |
.docx | Microsoft Word document, 2007 and later | :heavy_check_mark: |
.pdf | Portable Document Format | :heavy_check_mark: |
.txt | Plain text file | :heavy_check_mark: |
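Before handing a file to the library, it can be useful to check its extension against the table above. The following is a minimal sketch of such a check; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names that are not part of Word Freak, and the case-insensitive comparison is an assumption about how extensions should be matched.

```python
from pathlib import Path

# Extensions marked as supported in the table above.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".txt"}


def is_supported(path):
    """Hypothetical helper: True if the file's extension is in the supported set."""
    # Lowercasing the suffix is an assumption, so "NOTES.TXT" is treated like "notes.txt".
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("report.docx"))
print(is_supported("legacy.doc"))
```

A check like this lets a script skip or report unsupported files, such as pre-2007 .doc documents, before attempting extraction.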
Installation
Use the package manager pip to install Word Freak.
pip install wordfreak
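To confirm that the install succeeded in the current environment, one option is to ask Python whether the module can be found. This is a small sketch using the standard library only; `is_installed` is a hypothetical helper name, and `wordfreak` is the module name used in the Usage section.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once the package has been installed into this environment.
print(is_installed("wordfreak"))
```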
Usage
import wordfreak
# Takes a text source and extracts the word frequencies from it in order from most -> least occurring.
# Extracts word frequencies from "inputFile.txt" and returns them as a Python dictionary.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt")
# If an output file path is given, it will also save the results there as JSON.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt", "C:\\outputFile.json")
# Takes a saved word frequencies JSON file and converts it to a Python dictionary.
# Loads word frequencies from "wordFrequencies.json" and returns them as a Python dictionary.
wordFrequencies = wordfreak.pythonizeWordFrequencies("C:\\wordFrequencies.json")
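The returned dictionary can be post-processed like any other Python dictionary. The sketch below assumes it maps each word to its count, which the description above suggests but does not spell out. `count_words` is a hypothetical stand-in that builds such a dictionary from a string so the example runs without any input file; it is not the library's implementation. `top_words` is likewise a hypothetical helper.

```python
import re
from collections import Counter


def count_words(text):
    """Hypothetical stand-in: count lowercase words, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    # most_common() sorts by count, descending; dict() preserves that order.
    return dict(Counter(words).most_common())


def top_words(frequencies, n=3):
    """Return the n highest-count (word, count) pairs from a frequency dict."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)[:n]


sample = count_words("the cat and the hat and the bat")
print(top_words(sample, 2))
```

With a real result, `top_words(wordFrequencies, 10)` would list the ten most frequent words, provided the assumed word-to-count shape holds.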
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
Download files
Source Distribution
wordfreak-2.0.0.tar.gz (5.7 kB)
File details
Details for the file wordfreak-2.0.0.tar.gz.
File metadata
- Download URL: wordfreak-2.0.0.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.0
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | e202f95c35fbb3a2c888730b7ed0e9e0ed4783974240129f350bd15bcd09e30d |
MD5 | 7a6421ea37cdd5c9816efbdc02aee287 |
BLAKE2b-256 | b5b9ede64fb4db10534fff83e019daf75013e7081d000a11587cbb893936cc51 |