Word Freak is a Python library that extracts word frequencies from files.

Project description

A Python library to extract word frequencies from files.

Supported File Types

| File Extension | Explanation | Supported |
| --- | --- | --- |
| .doc | Microsoft Word document pre-2007 | No |
| .docx | Microsoft Word document post-2007 | Yes |
| .pdf | Portable Document Format | Yes |
| .txt | Plain text file | Yes |
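
Because .doc files are not supported, it can help to check a file's extension before passing it to the library. The following is a minimal sketch using only the standard library; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names for illustration and are not part of wordfreak.

```python
from pathlib import Path

# Extensions marked as supported in the table above.
# This set is an illustrative assumption, not something wordfreak exports.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".txt"}


def is_supported(path: str) -> bool:
    """Return True if the file's extension is in the supported set (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("report.DOCX"))  # True
print(is_supported("legacy.doc"))   # False
```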

Installation

Use the package manager pip to install wordfreak.

pip install wordfreak

Usage

import wordfreak

# Extracts word frequencies from "inputFile.txt" and returns them as a Python dictionary,
# ordered from most to least frequent.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt")

# If an output file path is given, it will also save the results there as JSON.
wordFrequencies = wordfreak.extractWordFrequencies("C:\\inputFile.txt", "C:\\outputFile.json")

# Loads a saved word frequencies JSON file ("wordFrequencies.json") and returns its contents
# as a Python dictionary.
wordFrequencies = wordfreak.pythonizeWordFrequencies("C:\\wordFrequencies.json")
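
To illustrate the general idea behind the calls above, here is a rough, standard-library-only sketch of counting words, ordering them from most to least frequent, and round-tripping the result through JSON. It is not wordfreak's implementation: the tokenizer, the word-to-count dictionary shape, and the helper names `count_words`, `save_frequencies`, and `load_frequencies` are all assumptions made for this example.

```python
import json
import re
import tempfile
from collections import Counter
from pathlib import Path


def count_words(text: str) -> dict:
    """Hypothetical stand-in for the extraction step.

    Lowercases the text, splits on anything that is not a letter or
    apostrophe, and counts the words. wordfreak's real tokenizer may differ.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # most_common() sorts by count, descending; dict keeps that order.
    return dict(Counter(words).most_common())


def save_frequencies(frequencies: dict, path: Path) -> None:
    """Write the frequencies to a JSON file (assumed format: word -> count)."""
    path.write_text(json.dumps(frequencies, indent=2), encoding="utf-8")


def load_frequencies(path: Path) -> dict:
    """Read a frequencies JSON file back into a Python dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


sample = "the cat and the hat and the bat"
frequencies = count_words(sample)

# Round-trip through a temporary JSON file.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "wordFrequencies.json"
    save_frequencies(frequencies, out_path)
    reloaded = load_frequencies(out_path)

print(list(frequencies.items())[:2])  # [('the', 3), ('and', 2)]
print(reloaded == frequencies)        # True
```

Since Python dictionaries preserve insertion order, the most-to-least ordering survives the save and load steps in this sketch.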

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Download files

Source Distribution

wordfreak-2.0.0.tar.gz (5.7 kB)

Uploaded Source

File details

Details for the file wordfreak-2.0.0.tar.gz.

File metadata

  • Download URL: wordfreak-2.0.0.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.0

File hashes

Hashes for wordfreak-2.0.0.tar.gz
Algorithm Hash digest
SHA256 e202f95c35fbb3a2c888730b7ed0e9e0ed4783974240129f350bd15bcd09e30d
MD5 7a6421ea37cdd5c9816efbdc02aee287
BLAKE2b-256 b5b9ede64fb4db10534fff83e019daf75013e7081d000a11587cbb893936cc51
