A tool for working with text data
LANCETNIC 2.0.2
LANCETNIC is a library with built-in neural network models for working with text and numeric data. Lancetnic provides convenient tools for:
- Data preparation and vectorization
- Training classification models
- Visualization of metrics
- Forecasting on new data
The library works with purely textual data as well as with a combination of textual and numerical features, including trend and price analysis. Usage examples: text classification, detection of spam and fraudulent messages, and work with numerical series and time-based features.
🚀 Installing:
Install with CUDA
To use a GPU, install PyTorch with CUDA support first (optional):
pip install torch==2.5.1+cu124 torchaudio==2.5.1+cu124 torchvision==0.20.1+cu124 --index-url https://download.pytorch.org/whl/cu124
Then install lancetnic:
pip install lancetnic
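After installing, it can help to confirm which PyTorch backend is actually available. The snippet below is a minimal sketch that uses only the standard library plus PyTorch's own `torch.cuda.is_available()`; the helper name `describe_torch_backend` is made up for this illustration and is not part of lancetnic.

```python
import importlib.util


def describe_torch_backend() -> str:
    """Return 'missing', 'cpu', or 'cuda' depending on the local PyTorch setup."""
    if importlib.util.find_spec("torch") is None:
        return "missing"  # PyTorch is not installed in this environment
    import torch  # imported lazily so the check works without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


backend = describe_torch_backend()
print(f"PyTorch backend: {backend}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the CPU-only PyTorch wheel was probably installed instead of the CUDA build.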
👥 Authors
📄 Documentation
Documentation in Russian
Documentation in English
Quick start
Text classification example
from lancetnic.models import LancetMC
from lancetnic import TextClass

text_model = TextClass(
    text_column='description',  # Column name containing text data
    label_column='category',  # Column name containing labels
    split_ratio=0.2,  # Train/validation split ratio (if no val_path)
    random_state=42  # Random seed for reproducibility
)

text_model.train(
    model_name=LancetMC,  # Model architecture for text classification
    train_path="train.csv",  # Path to training data (CSV format)
    val_path="val.csv",  # Path to validation data (None for auto-split)
    num_epochs=50,  # Total training epochs
    hidden_size=256,  # Size of hidden layers
    num_layers=1,  # Number of hidden layers
    batch_size=256,  # Batch size for training
    learning_rate=0.001,  # Learning rate for optimizer
    dropout=0,  # Dropout rate (0-1)
    optim_name='Adam',  # Optimizer ('Adam', 'SGD', 'RAdam', etc.)
    crit_name='CELoss'  # Loss function ('CELoss' or 'BCELoss')
)
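The training call above expects CSV files whose header contains the configured column names. The following sketch shows one way to produce such files and what a hold-out split controlled by `split_ratio` and `random_state` could look like. It is an illustration only: the rows are invented, the helpers `write_csv` and `holdout_split` are hypothetical, and treating `split_ratio=0.2` as the validation share is an assumption rather than a documented behaviour of lancetnic.

```python
import csv
import random
import tempfile
from pathlib import Path

FIELDNAMES = ["description", "category"]  # mirrors text_column / label_column

# Invented sample rows, for illustration only.
rows = [
    {"description": "Win a free prize now", "category": "spam"},
    {"description": "Meeting moved to 3 pm", "category": "ham"},
    {"description": "Claim your reward today", "category": "spam"},
    {"description": "Lunch tomorrow?", "category": "ham"},
    {"description": "Your invoice is attached", "category": "ham"},
]


def write_csv(path, records, fieldnames):
    """Write records to path as a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)


def holdout_split(records, split_ratio=0.2, random_state=42):
    """Shuffle a copy of records and reserve split_ratio of them for validation."""
    shuffled = list(records)
    random.Random(random_state).shuffle(shuffled)  # seeded, so reproducible
    n_val = max(1, round(len(shuffled) * split_ratio))
    return shuffled[n_val:], shuffled[:n_val]


train_rows, val_rows = holdout_split(rows, split_ratio=0.2, random_state=42)

out_dir = Path(tempfile.mkdtemp())
write_csv(out_dir / "train.csv", train_rows, FIELDNAMES)
write_csv(out_dir / "val.csv", val_rows, FIELDNAMES)
print(len(train_rows), "training rows,", len(val_rows), "validation rows")
```

Fixing the seed means the same rows end up in the validation set on every run, which keeps metrics comparable between experiments.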
Making predictions
from lancetnic import TextClass

text_model = TextClass()

text_pred = text_model.predict(
    model_path="model.pth",  # Path to saved model
    text="Sample text to classify"  # Text input for prediction
)
Combined text and numeric features example
from lancetnic.models import LancetMC
from lancetnic import TextScalarClass

mixed_model = TextScalarClass(
    text_column='description',  # Text column name (None if only numeric)
    data_column=['feat1', 'feat2'],  # List of numeric feature columns
    label_column='target',  # Target variable column
    split_ratio=0.2,  # Train/val split ratio
    random_state=42  # Random seed
)

mixed_model.train(
    model_name=LancetMC,  # Model architecture for text classification
    train_path="train.csv",  # Path to training data (CSV format)
    val_path="val.csv",  # Path to validation data (None for auto-split)
    num_epochs=50,  # Total training epochs
    hidden_size=256,  # Size of hidden layers
    num_layers=1,  # Number of hidden layers
    batch_size=256,  # Batch size for training
    learning_rate=0.001,  # Learning rate for optimizer
    dropout=0,  # Dropout rate (0-1)
    optim_name='Adam',  # Optimizer ('Adam', 'SGD', 'RAdam', etc.)
    crit_name='CELoss'  # Loss function ('CELoss' or 'BCELoss')
)
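For mixed data, the CSV needs the text column, every numeric feature column, and the target column. The sketch below checks a small in-memory CSV for those columns and converts the numeric cells to `float` before training. The sample rows and the helper `check_mixed_rows` are hypothetical and only illustrate the expected layout; lancetnic may perform its own validation.

```python
import csv
import io

TEXT_COLUMN = "description"
DATA_COLUMNS = ["feat1", "feat2"]
LABEL_COLUMN = "target"

# Invented sample data, for illustration only.
sample_csv = """description,feat1,feat2,target
Wireless mouse,0.5,1.2,electronics
Cotton t-shirt,0.1,3.4,clothing
"""


def check_mixed_rows(csv_text):
    """Parse CSV text and return rows with numeric columns converted to float."""
    reader = csv.DictReader(io.StringIO(csv_text))
    required = [TEXT_COLUMN, *DATA_COLUMNS, LABEL_COLUMN]
    missing = [name for name in required if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    parsed = []
    for row in reader:
        for name in DATA_COLUMNS:
            row[name] = float(row[name])  # raises ValueError on non-numeric cells
        parsed.append(row)
    return parsed


mixed_rows = check_mixed_rows(sample_csv)
print(mixed_rows[0])
```

Running a check like this before training surfaces misspelled column names and stray non-numeric values early, instead of partway through an epoch.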
Making predictions
from lancetnic import TextScalarClass

mixed_model = TextScalarClass()

mixed_pred = mixed_model.predict(
    model_path="mixed_model.pth",  # Path to saved model
    text="Product description",  # Text input (None if only numeric)
    numeric=[0.5, 1.2]  # Numeric features as list
)
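The `numeric` argument is a plain list, so the values presumably have to follow the same order as `data_column` at training time. That ordering requirement is an assumption here, not something the text above states. The hypothetical helper below builds the list from a dictionary so the order cannot drift by accident.

```python
DATA_COLUMNS = ["feat1", "feat2"]  # same order as data_column during training


def to_numeric_vector(record, columns=DATA_COLUMNS):
    """Return the record's values as floats, ordered like columns."""
    missing = [name for name in columns if name not in record]
    if missing:
        raise KeyError(f"missing numeric features: {missing}")
    return [float(record[name]) for name in columns]


# The dictionary order does not matter; the column list decides the order.
numeric = to_numeric_vector({"feat2": 1.2, "feat1": 0.5})
print(numeric)  # [0.5, 1.2]
```

The resulting list can then be passed as `numeric=numeric` in the prediction call.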
There are two basic model classes in LANCETNIC:
- LancetMC
from lancetnic.models import LancetMC
- LancetMCA
from lancetnic.models import LancetMCA
Key Differences Between Models
| Feature | LancetMC | LancetMCA |
|---|---|---|
| Core Architecture | Basic LSTM | LSTM + Attention |
| Complexity | Lower | Higher |
| Computational Cost | Less resource-intensive | More resource-intensive |
| Best For | Pure text classification | Mixed data or complex patterns |
| Interpretability | Standard | Provides attention weights |
| Sequence Handling | Good | Excellent for long sequences |
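To make the "LSTM + Attention" row more concrete, here is a dependency-free sketch of attention pooling over a sequence of hidden states. It is not LancetMCA's actual implementation: the dot-product scoring, the fixed `query` vector (standing in for learned parameters), and the toy hidden states are all assumptions chosen for illustration. A plain LSTM classifier commonly uses only the last hidden state, while attention pooling mixes all time steps and exposes the weights it used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Return (context, weights) for dot-product attention over hidden_states."""
    # One score per time step: dot product between the state and the query.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    size = len(hidden_states[0])
    # Context vector: weighted sum of all hidden states.
    context = [
        sum(w * state[i] for w, state in zip(weights, hidden_states))
        for i in range(size)
    ]
    return context, weights


# Toy hidden states for a three-step sequence (invented values).
hidden_states = [[0.1, 0.0], [0.9, 0.4], [0.2, 0.1]]
query = [1.0, 1.0]

context, weights = attention_pool(hidden_states, query)
last_state = hidden_states[-1]  # what a last-step-only classifier would use

print("attention weights:", [round(w, 3) for w in weights])
print("context vector:", [round(c, 3) for c in context])
print("last hidden state:", last_state)
```

The weights sum to 1 and show which time steps contributed most to the pooled vector, which is the kind of signal the "Interpretability" row refers to.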
File details
Details for the file lancetnic-2.0.2.tar.gz.
File metadata
- Download URL: lancetnic-2.0.2.tar.gz
- Upload date:
- Size: 13.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c095c1a2d505ed888db9961cc70df3f70ea441b7ff2e524eda37249bf6f8b7df |
| MD5 | a680a020bcbb063c639da1c3dd813111 |
| BLAKE2b-256 | 5280b490c44632ffaf5d5458e6059981c6b0ab9b1070b9e8a4cc96d62211f8b1 |
File details
Details for the file lancetnic-2.0.2-py3-none-any.whl.
File metadata
- Download URL: lancetnic-2.0.2-py3-none-any.whl
- Upload date:
- Size: 14.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4ee50f62bcd76e133f79c0d1abf06dbde0ff8969322802b948d84480014d4335 |
| MD5 | c043eefaa2b1fa9e924ec743302b8946 |
| BLAKE2b-256 | 065eeae6286351b253435b4e34019dc4e1ae5821d7010cb8da8eb9750370a07c |