A comprehensive NLP and Machine Learning package with example implementations

PKMB Package

A Python package that collects example programs for common NLP and machine learning tasks. Each program is selected by number and printed with print_program.

Installation

pip install pkmb

Usage

from pkmb import print_program

# Print any program (1-5, 7, 9)
print_program(1)  # Basic NLP operations
print_program(2)  # Named Entity Recognition
print_program(3)  # TF-IDF implementation
print_program(4)  # N-grams analysis
print_program(5)  # Word Embeddings analysis
print_program(7)  # Text Generation with LSTM
print_program(9)  # Variational Autoencoder for MNIST

Available Programs

  1. Program 1: Natural Language Processing (NLP) Text Analysis

    • Basic NLP operations using NLTK
    • Includes: tokenization, stopword removal, stemming, and lemmatization
    • Demonstrates both sentence and word-level processing
  2. Program 2: Named Entity Recognition (NER)

    • Uses NLTK for entity extraction
    • Identifies persons, organizations, locations
    • Includes BIO tagging and tree representation
  3. Program 3: TF-IDF Implementation

    • Manual implementation of TF-IDF calculation
    • Comparison with scikit-learn's TfidfVectorizer
    • Document similarity analysis
  4. Program 4: N-grams Analysis

    • Uses Pride and Prejudice as corpus
    • Generates unigrams, bigrams, and trigrams
    • Includes frequency analysis and visualization
  5. Program 5: Word Embeddings Analysis

    • Uses GloVe embeddings (50d)
    • Word similarity computation
    • Semantic relationship analysis
  6. Program 7: Text Generation with LSTM

    • Neural network-based text generation
    • Uses TensorFlow/Keras LSTM architecture
    • Includes training and text generation capabilities
  7. Program 9: Variational Autoencoder (VAE)

    • Deep learning model for MNIST dataset
    • Implements both encoder and decoder networks
    • Generates new digit images from latent space
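
For Program 3, the idea behind a manual TF-IDF calculation can be sketched in plain Python. This is not the package's own code. It is a minimal sketch that assumes whitespace tokenization and uses the smoothed IDF formula and L2 normalization that scikit-learn's TfidfVectorizer applies by default; the function names tfidf and cosine and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    # Assumption: lowercase + whitespace split. TfidfVectorizer's default
    # tokenizer differs (it also drops single-character tokens).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw term frequency
        weights = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


# Hypothetical sample documents.
docs = ["the cat sat on the mat", "the dog sat on the log", "cats and dogs"]
vecs = tfidf(docs)
print(round(cosine(vecs[0], vecs[1]), 3))  # documents sharing several terms
print(round(cosine(vecs[0], vecs[2]), 3))  # documents sharing no terms
```

Because the vectors are already normalized, document similarity reduces to a dot product over the shared terms.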

Note: Programs 6 and 8 are intentionally omitted from this collection.
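
The n-gram generation described for Program 4 can be illustrated without any dependencies. The sketch below is an assumption about the general technique, not the package's implementation: the helper name ngrams is hypothetical, and the input is only the opening sentence of Pride and Prejudice, lowercased with punctuation removed, rather than the full corpus.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams in a token sequence as a list of tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = (
    "it is a truth universally acknowledged that a single man in "
    "possession of a good fortune must be in want of a wife"
)
tokens = text.split()

# Frequency analysis for unigrams, bigrams, and trigrams.
for n in (1, 2, 3):
    counts = Counter(ngrams(tokens, n))
    print(n, counts.most_common(3))
```

Storing each n-gram as a tuple keeps it hashable, so a Counter can tally frequencies directly.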

Dependencies

The package requires the following Python packages:

pip install nltk pandas scikit-learn requests gensim scipy==1.11.4 tensorflow matplotlib numpy

Additional Setup

  1. NLTK Data: Required for Programs 1-4

    • Downloads automatically when running the programs
    • Includes: punkt, stopwords, wordnet, averaged_perceptron_tagger, maxent_ne_chunker
  2. GloVe Embeddings: Required for Program 5

    • Downloads automatically on first use (~66 MB)
    • Uses the glove-wiki-gigaword-50 model
  3. MNIST Dataset: Required for Program 9

    • Downloads automatically through TensorFlow
    • Used for training and testing the VAE

Note on GPU Support

Programs 7 (LSTM) and 9 (VAE) can benefit from GPU acceleration if TensorFlow is installed with CUDA support.
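
One way to check whether TensorFlow can see a GPU before running Programs 7 or 9 is shown below. This is a minimal sketch: the helper name visible_gpus is hypothetical, and returning None when TensorFlow is not installed is a choice made for this example, not behavior of the package.

```python
def visible_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices returns PhysicalDevice objects with a .name field.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed.")
elif gpus:
    print("GPUs available:", gpus)
else:
    print("No GPU detected; training will run on the CPU.")
```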

Error Handling

Each program handles common failure cases and prints an informative message when:

  • Required data is not available
  • Words are not found in vocabulary
  • Models fail to load or process
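
As an illustration of the "word not found in vocabulary" case, a similarity lookup might guard against missing words as follows. This is a hedged sketch, not the package's code: the function name similarity is hypothetical, and the 3-dimensional vectors are invented stand-ins for the 50-dimensional GloVe vectors.

```python
import math

# Toy vectors invented for illustration only.
TOY_VECTORS = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}


def similarity(word_a, word_b, vectors=TOY_VECTORS):
    """Return cosine similarity, or None with a message if a word is unknown."""
    missing = [w for w in (word_a, word_b) if w not in vectors]
    if missing:
        print(f"Not in vocabulary: {', '.join(missing)}")
        return None
    u, v = vectors[word_a], vectors[word_b]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


print(similarity("king", "queen"))
print(similarity("king", "unicorn"))  # prints a message and returns None
```

Checking membership up front keeps the failure message specific to the missing word instead of surfacing a bare KeyError.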


Source Distribution

pkmb-0.1.2.tar.gz (11.3 kB view details)

Uploaded Source

Built Distribution

pkmb-0.1.2-py3-none-any.whl (10.0 kB view details)

Uploaded Python 3

File details

Details for the file pkmb-0.1.2.tar.gz.

File metadata

  • Download URL: pkmb-0.1.2.tar.gz
  • Upload date:
  • Size: 11.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for pkmb-0.1.2.tar.gz:

  • SHA256: 8780b0319d9fb091194bafe2d9ddefdd177c9618dbf63315c6c8b5674ea3966b
  • MD5: 56f874d752dd14311ab14332f1d9035d
  • BLAKE2b-256: efe37f3bad0017690e92d30946621c0ee757c14229a400fc9d1aa57a32b7aebe

File details

Details for the file pkmb-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pkmb-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 10.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for pkmb-0.1.2-py3-none-any.whl:

  • SHA256: 17b6958567f667d86aa9e7ac4843702d2a58324305522efd23a7b3e9f5cd247b
  • MD5: 646b5ef44ff5811146f934fae5ad527f
  • BLAKE2b-256: 47d9c6801a084b2dfa0710045a0694487a0fde2cb8a7aa89c59dab01e8526bdb
