Word Freak is a Python library that extracts word frequencies from files.

Project description

A Python library to extract word frequencies from files.

Supported File Types

| File Extension | Explanation | Supported |
|---|---|---|
| .doc | Microsoft Word document (pre-2007) | No |
| .docx | Microsoft Word document (2007 and later) | Yes |
| .pdf | Portable Document Format | Yes |
| .txt | Plain text file | Yes |
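
Since only some file types are supported, it can help to check a path before passing it to the library. The following is a minimal sketch based only on the table above; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names that are not part of wordfreak, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path

# Extensions the table above marks as supported.
# Hypothetical helper data, not part of the wordfreak API.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".txt"}


def is_supported(path):
    """Return True if the file extension is listed as supported.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A caller could use this to skip or convert legacy `.doc` files before attempting extraction.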

Installation

Use the package manager pip to install Word Freak.

pip install wordfreak

Usage

import wordfreak

# Extracts word frequencies from "inputFile.txt" and returns them as a Python dictionary,
# ordered from most to least frequently occurring.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt")

# If an output file path is given, it will also save the results there as JSON.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt", "C:\\outputFile.json")

# Loads saved word frequencies from "wordFrequencies.json" and returns them as a Python dictionary.
wordFrequencies = wordfreak.pythonizeWordFrequencies("C:\\wordFrequencies.json")
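
Once the frequencies are available as a dictionary, ordinary Python is enough to work with them. The sketch below assumes the returned dictionary maps each word to an integer count; that shape is an assumption, not something this README states. `top_words` and the `sample` data are hypothetical and only stand in for a real result.

```python
def top_words(frequencies, n=3):
    """Return the n most frequent (word, count) pairs, highest count first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical sample standing in for a dictionary returned by the library.
sample = {"the": 12, "word": 7, "freak": 7, "python": 3}
print(top_words(sample, 2))
```

Sorting explicitly means the helper does not depend on the order in which the dictionary was built.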

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Download files

Source Distribution

wordfreak-2.0.0.tar.gz (5.7 kB)
