A comprehensive NLP and Machine Learning package with example implementations

PKMB Package

A comprehensive Python package containing various NLP and Machine Learning implementations.

Installation

pip install pkmb

Usage

from pkmb import print_program

# Print any program (1-5, 7, 9)
print_program(1)  # Basic NLP operations
print_program(2)  # Named Entity Recognition
print_program(3)  # TF-IDF implementation
print_program(4)  # N-grams analysis
print_program(5)  # Word Embeddings analysis
print_program(7)  # Text Generation with LSTM
print_program(9)  # Variational Autoencoder for MNIST
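Only the numbers listed in the comment above are documented. How `print_program` reacts to any other number is not described here, so a caller may want to check first. The following is a minimal sketch of such a check; `AVAILABLE_PROGRAMS`, `is_available`, and `print_if_available` are hypothetical helpers and not part of pkmb.

```python
# Program numbers copied from the usage comment above (1-5, 7, 9).
AVAILABLE_PROGRAMS = (1, 2, 3, 4, 5, 7, 9)


def is_available(number):
    """Return True if `number` is one of the documented program numbers."""
    return number in AVAILABLE_PROGRAMS


def print_if_available(number, printer):
    """Call `printer(number)` only for documented program numbers.

    `printer` is meant to be pkmb's print_program. It is passed in as an
    argument so this hypothetical wrapper does not depend on the package.
    """
    if not is_available(number):
        raise ValueError(
            f"Program {number} is not available; choose from {AVAILABLE_PROGRAMS}"
        )
    return printer(number)
```

With pkmb installed, `print_if_available(3, print_program)` forwards the call, while `print_if_available(6, print_program)` raises `ValueError` before the package is touched.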

Available Programs

  1. Program 1: Natural Language Processing (NLP) Text Analysis

    • Basic NLP operations using NLTK
    • Includes: tokenization, stopword removal, stemming, and lemmatization
    • Demonstrates both sentence and word-level processing
  2. Program 2: Named Entity Recognition (NER)

    • Uses NLTK for entity extraction
    • Identifies persons, organizations, and locations
    • Includes BIO tagging and tree representation
  3. Program 3: TF-IDF Implementation

    • Manual implementation of TF-IDF calculation
    • Comparison with scikit-learn's TfidfVectorizer
    • Document similarity analysis
  4. Program 4: N-grams Analysis

    • Uses Pride and Prejudice as corpus
    • Generates unigrams, bigrams, and trigrams
    • Includes frequency analysis and visualization
  5. Program 5: Word Embeddings Analysis

    • Uses GloVe embeddings (50d)
    • Word similarity computation
    • Semantic relationship analysis
  6. Program 7: Text Generation with LSTM

    • Neural network-based text generation
    • Uses TensorFlow/Keras LSTM architecture
    • Includes training and text generation capabilities
  7. Program 9: Variational Autoencoder (VAE)

    • Deep learning model for MNIST dataset
    • Implements both encoder and decoder networks
    • Generates new digit images from latent space
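To give a feel for what a manual TF-IDF calculation (Program 3) involves, here is a minimal standard-library sketch. It is not the code shipped in pkmb. It assumes one common variant, tf = term count / document length and idf = ln(N / df), and the three sample documents are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample corpus.
docs = ["the cat sat", "the dog sat", "the cat ran"]
scores = tf_idf(docs)
```

In this sketch a term that occurs in every document, such as "the", gets weight 0, while a term unique to one document gets the highest weight. scikit-learn's TfidfVectorizer smooths the idf term and L2-normalizes each row by default, so its numbers will differ from this unsmoothed version.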

Note: Programs 6 and 8 are intentionally omitted from this collection.
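As an illustration of the n-gram generation listed under Program 4, the sketch below builds n-grams with a sliding window and counts them. It is an assumption about the general technique, not the package's own code, and it uses a short phrase from the opening of Pride and Prejudice rather than the full corpus.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Short sample; whitespace splitting stands in for a real tokenizer.
text = "it is a truth universally acknowledged that a single man"
tokens = text.split()

unigram_counts = Counter(ngrams(tokens, 1))
bigram_counts = Counter(ngrams(tokens, 2))
trigram_counts = Counter(ngrams(tokens, 3))
```

`Counter.most_common(k)` then returns the k most frequent n-grams, which is the usual input for a frequency plot.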

Dependencies

The programs rely on the following Python packages, which can be installed with:

pip install nltk pandas scikit-learn requests gensim scipy==1.11.4 tensorflow matplotlib numpy

Additional Setup

  1. NLTK Data: Required for Programs 1-4

    • Downloads automatically when running the programs
    • Includes: punkt, stopwords, wordnet, averaged_perceptron_tagger, maxent_ne_chunker
  2. GloVe Embeddings: Required for Program 5

    • Downloads automatically on first use (~66MB)
    • Uses the glove-wiki-gigaword-50 model
  3. MNIST Dataset: Required for Program 9

    • Downloads automatically through TensorFlow
    • Used for training and testing the VAE
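Although the NLTK data downloads automatically, it can be useful to fetch it ahead of time, for example before working offline. The sketch below does this with `nltk.download` for the resource names listed above. `NLTK_RESOURCES` and `download_nltk_resources` are hypothetical names, and whether pkmb looks in NLTK's default data directory is an assumption.

```python
# Resource names copied from the NLTK Data item above.
NLTK_RESOURCES = [
    "punkt",
    "stopwords",
    "wordnet",
    "averaged_perceptron_tagger",
    "maxent_ne_chunker",
]


def download_nltk_resources(resources=NLTK_RESOURCES):
    """Download each resource and return {name: success flag}."""
    # Imported here so the module can be loaded even without NLTK installed.
    import nltk

    return {name: nltk.download(name, quiet=True) for name in resources}
```

Calling `download_nltk_resources()` once returns a dictionary that shows which downloads succeeded.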

Note on GPU Support

Programs 7 (LSTM) and 9 (VAE) can benefit from GPU acceleration if TensorFlow is installed with CUDA support.
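A quick way to see whether TensorFlow can use a GPU is to list the physical devices it detects. This is a minimal sketch; `count_visible_gpus` is a hypothetical helper, and it returns `None` when TensorFlow is not installed.

```python
def count_visible_gpus():
    """Return the number of GPUs TensorFlow can see, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return len(tf.config.list_physical_devices("GPU"))


gpu_count = count_visible_gpus()
if gpu_count is None:
    print("TensorFlow is not installed.")
elif gpu_count == 0:
    print("No GPU detected; Programs 7 and 9 will run on the CPU.")
else:
    print(f"{gpu_count} GPU(s) detected.")
```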

Error Handling

Each program handles common failures and displays an informative message if:

  • Required data is not available
  • Words are not found in vocabulary
  • Models fail to load or process
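The vocabulary case can be sketched as follows. This is an assumption about the general pattern, not pkmb's actual handling: a plain dict with invented two-dimensional vectors stands in for the GloVe embeddings, and `safe_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def safe_similarity(vectors, word_a, word_b):
    """Return the similarity, or None (with a message) if a word is unknown."""
    missing = [w for w in (word_a, word_b) if w not in vectors]
    if missing:
        print(f"Not in vocabulary: {', '.join(missing)}")
        return None
    return cosine_similarity(vectors[word_a], vectors[word_b])


# Toy vectors invented for illustration; real GloVe vectors have 50 dimensions.
toy_vectors = {"king": [1.0, 0.0], "queen": [1.0, 0.0], "apple": [0.0, 1.0]}
```

Checking membership before the lookup lets the caller report every missing word at once instead of stopping at the first `KeyError`.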
