A comprehensive NLP and Machine Learning package with example implementations

PKMB Package

A comprehensive Python package containing various NLP and Machine Learning implementations.

Installation

pip install pkmb

Usage

from pkmb import print_program

# Print any program (1-5, 7, 9)
print_program(1)  # Basic NLP operations
print_program(2)  # Named Entity Recognition
print_program(3)  # TF-IDF implementation
print_program(4)  # N-grams analysis
print_program(5)  # Word Embeddings analysis
print_program(7)  # Text Generation with LSTM
print_program(9)  # Variational Autoencoder for MNIST

Available Programs

  1. Program 1: Natural Language Processing (NLP) Text Analysis

    • Basic NLP operations using NLTK
    • Includes: tokenization, stopword removal, stemming, and lemmatization
    • Demonstrates both sentence and word-level processing
  2. Program 2: Named Entity Recognition (NER)

    • Uses NLTK for entity extraction
    • Identifies persons, organizations, locations
    • Includes BIO tagging and tree representation
  3. Program 3: TF-IDF Implementation

    • Manual implementation of TF-IDF calculation
    • Comparison with scikit-learn's TfidfVectorizer
    • Document similarity analysis
  4. Program 4: N-grams Analysis

    • Uses Pride and Prejudice as corpus
    • Generates unigrams, bigrams, and trigrams
    • Includes frequency analysis and visualization
  5. Program 5: Word Embeddings Analysis

    • Uses GloVe embeddings (50d)
    • Word similarity computation
    • Semantic relationship analysis
  6. Program 7: Text Generation with LSTM

    • Neural network-based text generation
    • Uses TensorFlow/Keras LSTM architecture
    • Includes training and text generation capabilities
  7. Program 9: Variational Autoencoder (VAE)

    • Deep learning model for MNIST dataset
    • Implements both encoder and decoder networks
    • Generates new digit images from latent space
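
As a rough illustration of what a manual TF-IDF calculation with document similarity (Program 3) can look like, here is a minimal sketch using only the standard library. It is not the package's own code: the function names `tf_idf` and `cosine`, the whitespace tokenizer, and the plain `log(N / df)` weighting are assumptions made for this example. scikit-learn's TfidfVectorizer applies a smoothed IDF and L2 normalization by default, so its numbers will differ from these.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy documents, invented for this example.
docs = ["the cat sat", "the dog sat", "the bird flew"]
vectors = tf_idf(docs)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

A term that appears in every document (here "the") gets a weight of zero under this weighting, so the first two documents are similar only through their shared word "sat".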

Note: Programs 6 and 8 are intentionally omitted from this collection.
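
Because the numbering has gaps, it may help to check a number before requesting it. The sketch below is a hypothetical wrapper, not part of pkmb: `AVAILABLE_PROGRAMS` and `print_if_available` are names invented here, and how `print_program` itself reacts to 6 or 8 is not documented in this description.

```python
# Program numbers listed in this description; 6 and 8 are omitted.
AVAILABLE_PROGRAMS = (1, 2, 3, 4, 5, 7, 9)


def print_if_available(number, printer):
    """Call printer(number) only for a listed program.

    Returns True if the printer was called, False otherwise.
    """
    if number not in AVAILABLE_PROGRAMS:
        print(f"Program {number} is not included; choose one of {AVAILABLE_PROGRAMS}.")
        return False
    printer(number)
    return True


# Intended use (requires pkmb to be installed):
#   from pkmb import print_program
#   print_if_available(7, print_program)
```

Passing the printer as an argument keeps the helper independent of the package, so it can be exercised without pkmb installed.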

Dependencies

The package requires the following Python packages:

pip install nltk pandas scikit-learn requests gensim scipy==1.11.4 tensorflow matplotlib numpy

Additional Setup

  1. NLTK Data: Required for Programs 1-4

    • Downloads automatically when running the programs
    • Includes: punkt, stopwords, wordnet, averaged_perceptron_tagger, maxent_ne_chunker
  2. GloVe Embeddings: Required for Program 5

    • Downloads automatically on first use (~66 MB)
    • Uses the glove-wiki-gigaword-50 model
  3. MNIST Dataset: Required for Program 9

    • Downloads automatically through TensorFlow
    • Used for training and testing the VAE
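
If you would rather fetch these resources ahead of time (for example before going offline), the following sketch shows one way to do it with the libraries' own download helpers. The function names are invented for this example, and whether the package's programs reuse the same cache locations is an assumption. The imports are placed inside the functions so the snippet loads even when a library is not installed.

```python
# NLTK resource names as listed under "NLTK Data".
NLTK_RESOURCES = (
    "punkt",
    "stopwords",
    "wordnet",
    "averaged_perceptron_tagger",
    "maxent_ne_chunker",
)


def prefetch_nltk(resources=NLTK_RESOURCES):
    """Download each NLTK resource; return the names that failed."""
    import nltk

    # nltk.download returns True on success and False on failure.
    return [name for name in resources if not nltk.download(name, quiet=True)]


def prefetch_glove(name="glove-wiki-gigaword-50"):
    """Download (or load from cache) the GloVe vectors via gensim."""
    import gensim.downloader as api

    return api.load(name)


def prefetch_mnist():
    """Download (or load from cache) MNIST via Keras."""
    from tensorflow.keras.datasets import mnist

    return mnist.load_data()
```

Calling `prefetch_nltk()` returns an empty list when every resource was fetched successfully.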

Note on GPU Support

Programs 7 (LSTM) and 9 (VAE) can benefit from GPU acceleration if TensorFlow is installed with CUDA support.
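
To check whether TensorFlow can actually see a GPU before running these programs, a small check such as the one below may help. The helper name `gpu_device_names` is invented for this example; it returns an empty list both when no GPU is visible and when TensorFlow is not installed.

```python
def gpu_device_names():
    """Return the names of GPUs visible to TensorFlow, or [] if there are none."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so no GPU can be used through it.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


names = gpu_device_names()
print("GPUs visible to TensorFlow:", names if names else "none (running on CPU)")
```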

Error Handling

Each program handles errors and displays an informative message if:

  • Required data is not available
  • Words are not found in vocabulary
  • Models fail to load or process
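
The out-of-vocabulary case can be sketched as follows. This is an illustration of the general pattern, not the package's implementation: the `similarity` function, its message text, and the two-dimensional toy vectors are all invented for this example. The same membership check works with any mapping from words to vectors.

```python
import math


def similarity(vectors, word_a, word_b):
    """Cosine similarity of two words, or None if either is out of vocabulary."""
    missing = [w for w in (word_a, word_b) if w not in vectors]
    if missing:
        print(f"Not in vocabulary: {', '.join(missing)}")
        return None
    a, b = vectors[word_a], vectors[word_b]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy vectors, invented for this example (real GloVe vectors have 50 dimensions).
toy_vectors = {"king": [1.0, 0.0], "queen": [1.0, 0.0], "apple": [0.0, 1.0]}

print(similarity(toy_vectors, "king", "queen"))
print(similarity(toy_vectors, "king", "unicorn"))
```

Returning `None` instead of raising lets a caller skip unknown words and continue with the rest of its input.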
