A package for performing TF-IDF transformation on text data

TFIDF Transformer Afiniti

A package for performing TF-IDF transformation on text data. This project was developed for an assessment with the following brief:

Create a framework that does tf-idf transformation; you can use sklearn's tfidf function. Keep the following in mind:

  • Handle Edge Cases: what happens when new text data arrives?
  • Create Unit Tests: check failure scenarios
  • Add Docstrings: assume you will hand this code to some other SWE
  • Obey Engineering Best Practices
  • Use necessary inheritances

Create a (pypi) package out of this framework.
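
The "new text data" edge case in the brief can be illustrated without any library. The sketch below is not this package's code; it is a simplified, standard-library-only tf-idf that assumes whitespace tokenisation (scikit-learn tokenises differently) and uses the smoothed idf formula that scikit-learn documents as its default, idf(t) = ln((1 + n) / (1 + df(t))) + 1, followed by L2 normalisation. The function name `tfidf_rows` is made up for illustration. It shows that a new document can both enlarge the vocabulary and change the weights of documents that were already in the corpus.

```python
import math
from collections import Counter


def tfidf_rows(docs):
    """Return (vocabulary, rows) of L2-normalised tf-idf weights.

    Simplified illustration: tokens are lower-cased, whitespace-separated words.
    """
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed idf, as documented for scikit-learn's smooth_idf=True default.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenised:
        counts = Counter(toks)
        raw = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in raw)) or 1.0  # guard empty documents
        rows.append([w / norm for w in raw])
    return vocab, rows


corpus = ["the first document", "the second document"]
vocab_before, rows_before = tfidf_rows(corpus)

# New text arrives: the vocabulary grows and every idf value is recomputed.
corpus.append("a brand new text")
vocab_after, rows_after = tfidf_rows(corpus)

print("vocabulary size:", len(vocab_before), "->", len(vocab_after))
print("weight of 'first' in document 0:",
      rows_before[0][vocab_before.index("first")], "->",
      rows_after[0][vocab_after.index("first")])
```

Because idf depends on the number of documents, a framework that accepts new text has to decide whether to recompute the whole matrix or to keep the old vocabulary and idf values fixed; the sketch takes the first option.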

Installation

Use the package manager pip to install.

pip install tfidf-transformer-afiniti

Usage

from tfidf_transformer_afiniti.main import TfidfFramework

framework = TfidfFramework()

# Append some data to the data list
data = ["This is the first document.", "This document is the second document.", "And this is the third one."]
for d in data:
    framework.append_data(d)

# Print the tf-idf matrix
print(framework.tfidf_matrix.toarray())

# Add one new document
new_data = "this is a new test document"
framework.append_data(new_data)
# Print the tf-idf matrix
print(framework.tfidf_matrix.toarray())

# Add a list of new documents
new_list_data = ["And this is the really fifth one.", "And this is the finally sixth one."]
framework.append_list_data(new_list_data)
# Print the tf-idf matrix
print(framework.tfidf_matrix.toarray())
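
The source does not show how `append_data` and `append_list_data` update `tfidf_matrix`. One plausible approach, sketched below purely as an assumption, is to keep every document and refit scikit-learn's `TfidfVectorizer` on the whole corpus after each addition. The class `RefittingTfidf` is a hypothetical stand-in, not the package's `TfidfFramework`; only `TfidfVectorizer` and its `fit_transform` method are real scikit-learn API.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


class RefittingTfidf:
    """Hypothetical stand-in: refits the vectorizer whenever data is added."""

    def __init__(self):
        self.documents = []
        self.vectorizer = TfidfVectorizer()
        self.tfidf_matrix = None  # stays None until the first document arrives

    def append_data(self, text):
        """Add a single document and refresh the matrix."""
        self.append_list_data([text])

    def append_list_data(self, texts):
        """Add several documents and refresh the matrix once."""
        self.documents.extend(texts)
        # Refit on the full corpus so vocabulary and idf include the new text.
        self.tfidf_matrix = self.vectorizer.fit_transform(self.documents)


sketch = RefittingTfidf()
sketch.append_list_data(["This is the first document.", "And this is the third one."])
shape_before = sketch.tfidf_matrix.shape

sketch.append_data("this is a new test document")
shape_after = sketch.tfidf_matrix.shape

# One more row for the new document, more columns for its unseen words.
print(shape_before, "->", shape_after)
```

Refitting keeps the matrix consistent but costs a full pass over the corpus on every call, which is why adding a batch with one call is cheaper than appending documents one by one in this sketch.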

Tests

python -m unittest tfidf_transformer_afiniti/tfidf_test.py
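
The brief asks for unit tests that check failure scenarios, but the contents of `tfidf_test.py` are not shown here. As a rough idea of what such tests can look like, the sketch below uses the standard `unittest` module against a small stand-in class. `CorpusStub` and its validation rules are hypothetical and exist only so the example runs on its own; the source does not say which exceptions, if any, the real `TfidfFramework` raises.

```python
import unittest


class CorpusStub:
    """Hypothetical stand-in that validates documents before storing them."""

    def __init__(self):
        self.documents = []

    def append_data(self, text):
        if not isinstance(text, str):
            raise TypeError("document must be a string")
        if not text.strip():
            raise ValueError("document must not be empty")
        self.documents.append(text)


class AppendDataFailureTests(unittest.TestCase):
    def setUp(self):
        # A fresh corpus per test keeps the tests independent of each other.
        self.corpus = CorpusStub()

    def test_rejects_non_string(self):
        with self.assertRaises(TypeError):
            self.corpus.append_data(42)

    def test_rejects_blank_string(self):
        with self.assertRaises(ValueError):
            self.corpus.append_data("   ")

    def test_accepts_valid_text(self):
        self.corpus.append_data("a valid document")
        self.assertEqual(self.corpus.documents, ["a valid document"])


# Run the suite programmatically so the result object can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AppendDataFailureTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test module the last two lines would normally be dropped, since `python -m unittest` discovers and runs `TestCase` classes by itself.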


Built Distribution

tfidf_transformer_afiniti-0.4-py3-none-any.whl (4.0 kB, Python 3)

File hashes

Hashes for tfidf_transformer_afiniti-0.4-py3-none-any.whl
Algorithm    Hash digest
SHA256       5f7d0b6a31bb80dbde99473a236fc58144b1b80c35c2f9d977e943e464369319
MD5          b2ad2512deeaeb9220ac361d30415090
BLAKE2b-256  15eae7909c869da6ba6924528b554d62433d95a15aed11f24c6f7331af0a956b