NeuralSmith CLI - ML Engineering Made Accessible to All
NeuralSmith CLI
A standalone command-line tool for automated neural architecture search and machine learning model training.
Features
- Image Classification - Train models using Neural Architecture Search (14-model experiments)
- Tabular Classification & Regression - Train models on CSV data with EDA
- Time Series Classification & Regression - Train models on time series data
- Auto Labeler - Automatically label unlabeled data using weighted KNN
- CoPilot - Interactive AI assistant for guidance and troubleshooting
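The Auto Labeler bullet mentions weighted KNN. As a rough illustration of that idea (not NeuralSmith's actual implementation), the sketch below labels one point by letting its nearest labeled neighbors vote with inverse-distance weights. The function name, the weighting scheme, and the definition of confidence as the winning share of the vote are all assumptions made for this example; the `k=5` and `min_confidence=0.5` defaults simply mirror the `--k` and `--min-confidence` defaults documented for the `auto-labeler` command.

```python
import math
from collections import defaultdict


def weighted_knn_label(labeled, query, k=5, min_confidence=0.5):
    """Return (label, confidence); label is None when confidence is too low.

    labeled: list of (feature_vector, label) pairs.
    Hypothetical helper for illustration; not part of the NeuralSmith API.
    """
    # Keep the k labeled points closest to the query (Euclidean distance).
    neighbors = sorted(
        ((math.dist(features, query), label) for features, label in labeled),
        key=lambda pair: pair[0],
    )[:k]

    # Each neighbor votes with weight 1/distance (epsilon avoids division by zero).
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + 1e-9)

    best_label = max(votes, key=votes.get)
    confidence = votes[best_label] / sum(votes.values())
    return (best_label if confidence >= min_confidence else None), confidence


labeled_points = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.9), "b"),
]
label, confidence = weighted_knn_label(labeled_points, (0.1, 0.1), k=3)
print(label, round(confidence, 2))
```

Points whose winning vote share falls below the threshold come back with a `None` label, so they can be left unlabeled rather than guessed.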
Functionality Docs
Each major functionality now has a dedicated in-package README:
- `neuralsmith/image_classification/README.md`
- `neuralsmith/tabular/README.md`
- `neuralsmith/timeseries/README.md`
- `neuralsmith/auto_labeler/README.md`
- `neuralsmith/copilot/README.md`
- `neuralsmith/README.md` (index + core package internals)
Installation
Prerequisites
- Python 3.8 or higher
- pip or pipx
### From PyPI (Future)
```bash
pipx install neuralsmith
```
Quick Start
1. Configure API Key (for CoPilot)
neuralsmith --config-key
Or set environment variable:
export NEURALSMITH_GEMINI_API_KEY="your-api-key-here"
2. Try CoPilot
neuralsmith copilot
Ask questions like:
- "How do I train an image classification model?"
- "What options does tabular-classification support?"
- "Help me validate my dataset"
3. Train Your First Model
Image Classification
neuralsmith image-classification \
--data-path ./my_images \
--output-dir ./results \
--epochs 10
Tabular Classification
neuralsmith tabular-classification \
--data-path ./data.csv \
--target-column species \
--mode fast
Commands
Image Classification
Train image classification models using NAS:
neuralsmith image-classification \
--data-path <folder> \
--output-dir <dir> \
[--target-size H,W] \
[--epochs N] \
[--val-split 0.1] \
[--batch-size N] \
[--learning-rate 0.001] \
[--device cpu|cuda] \
[--seed 42]
Example:
neuralsmith image-classification \
--data-path ./my_images \
--output-dir ./results \
--target-size 128,128 \
--epochs 100 \
--val-split 0.2
Tabular Classification
Train classification models on CSV data:
neuralsmith tabular-classification \
--data-path <csv> \
--target-column <name> \
[--output-dir <dir>] \
[--mode fast++|fast|exhaustive] \
[--train-percent 80.0] \
[--val-percent 0.0] \
[--test-percent 20.0] \
[--no-eda]
Example:
neuralsmith tabular-classification \
--data-path ./data.csv \
--target-column species \
--mode exhaustive \
--output-dir ./results
Tabular Regression
Same as classification, but for regression tasks:
neuralsmith tabular-regression \
--data-path <csv> \
--target-column <name> \
[OPTIONS]
Time Series Classification
Train classification models on time series data:
neuralsmith timeseries-classification \
--data-path <csv> \
[--time-column <name>] \
--target-column <name> \
[--window-size <n>] \
[--mode fast|fast++|exhaustive] \
[--split-method temporal|random] \
[--train-percent 70.0] \
[--val-percent 15.0] \
[--test-percent 15.0] \
[--random-state 42] \
[--no-normalize] \
[--epochs 10] \
[--batch-size 32]
Time Series Regression
Same as classification, but for regression:
neuralsmith timeseries-regression \
--data-path <csv> \
[--time-column <name>] \
--target-column <name> \
[--window-size <n>] \
[OPTIONS]
Notes:
- `--data-path` can point to either a CSV file or a directory containing pre-windowed NumPy splits: `X_train.npy`, `y_train.npy`, `X_val.npy`, `y_val.npy`, `X_test.npy`, `y_test.npy`
- For CSV input, `--time-column` and `--window-size` are required.
- For NumPy input, `--time-column` and `--window-size` are ignored.
- Default split is `--split-method temporal` to avoid overlap leakage between train/val/test windows.
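To make the leakage point concrete, here is a minimal sketch of one way a temporal split can keep windows from overlapping across splits: cut the time-ordered rows into train/val/test segments first, then build windows inside each segment. This is an illustration under assumptions, not NeuralSmith's documented algorithm; the helper names are hypothetical, and the 70/15/15 percentages mirror the defaults shown for `timeseries-classification`.

```python
def make_windows(rows, window_size):
    """Return every contiguous run of `window_size` rows (stride 1)."""
    return [rows[i:i + window_size] for i in range(len(rows) - window_size + 1)]


def temporal_split_windows(rows, window_size, train_pct=70.0, val_pct=15.0):
    """Split rows chronologically first, then window each segment on its own."""
    n = len(rows)
    train_end = int(n * train_pct / 100)
    val_end = train_end + int(n * val_pct / 100)
    segments = (rows[:train_end], rows[train_end:val_end], rows[val_end:])
    # Windowing after the split means no window spans two segments.
    return tuple(make_windows(segment, window_size) for segment in segments)


series = list(range(200))  # stand-in for 200 time-ordered rows
train, val, test = temporal_split_windows(series, window_size=20)
print(len(train), len(val), len(test))
```

With a random split of stride-1 windows, neighboring windows that share most of their rows can land in different splits, which is the overlap leakage the note refers to.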
Auto Labeler
Automatically label unlabeled data:
neuralsmith auto-labeler \
--data-path <path> \
--data-type image|tabular|timeseries \
--labeled-column <name> \
--label-column <name> \
--output-path <path> \
[--k 5] \
[--min-confidence 0.5]
CoPilot
Start interactive AI assistant:
neuralsmith copilot [--gemini-key <key>]
Modes (default is Ask: plain chat, no autonomous tools):
| Mode | Flag | Behavior |
|---|---|---|
| Ask | (default) | Answers questions; you run `!validate`, `!status`, `!watch` yourself. |
| Agent | `--agent` | The model can call read-only tools (inspect paths, CSV, validation, run status) and propose full neuralsmith training commands. Each training run is shown as an exact argv and runs only if you type yes. |
| Agent-plus (experimental) | `--agent-plus` | Same tools as Agent, but proposed training commands run without confirmation. Use only in trusted environments. |
Examples:
neuralsmith copilot --agent
neuralsmith copilot --agent-plus # experimental
In CoPilot, you can:
- Ask questions about NeuralSmith commands
- Get help with workflows
- Validate datasets: `!validate <path>`
- Live status for running wizards (from another terminal):
  - `!status [path]` - read the newest `neuralsmith_run_status.json` under `--output-dir` (or the parent folder of `--output-path` for `auto-labeler`)
  - `!watch [path]` - poll status every ~2s until the run is `completed` or `failed`
- Type `help` for commands, `exit` to quit
Agent modes use the same ! commands as Ask mode. Training wizards themselves are unchanged; the agent only invokes the existing CLI in a subprocess with an allowlisted set of flags.
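The `!status` / `!watch` behavior described above amounts to finding the newest status file and polling it until a terminal state appears. The sketch below shows that pattern with the standard library only. It is not CoPilot's code: the top-level `"status"` field is an assumed schema (the real JSON layout is not documented here), and the function names are hypothetical.

```python
import json
import time
from pathlib import Path

STATUS_FILENAME = "neuralsmith_run_status.json"  # file name taken from the text above
TERMINAL_STATES = {"completed", "failed"}


def newest_status_file(root):
    """Return the most recently modified status file under root, or None."""
    candidates = list(Path(root).rglob(STATUS_FILENAME))
    return max(candidates, key=lambda p: p.stat().st_mtime) if candidates else None


def watch(root, interval=2.0, max_polls=None):
    """Poll until the run reaches a terminal state; return the last state seen.

    Assumption: the JSON has a top-level "status" field.
    """
    polls = 0
    state = None
    while max_polls is None or polls < max_polls:
        path = newest_status_file(root)
        if path is not None:
            try:
                state = json.loads(path.read_text()).get("status")
            except (OSError, json.JSONDecodeError):
                state = None  # file vanished or is mid-write; retry on the next poll
            print(f"{path}: {state}")
            if state in TERMINAL_STATES:
                break
        polls += 1
        time.sleep(interval)
    return state
```

Calling `watch("./run_live_test")` from a second terminal would then print one line per poll until the run finishes.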
Agent / Agent-plus Quick Guide
Use this when you want CoPilot to help prepare and run training commands end-to-end.
Start modes
neuralsmith copilot --agent
neuralsmith copilot --agent-plus # experimental
How --agent works (recommended default)
- You ask for a task (for example: "train a quick tabular classifier on this CSV").
- CoPilot may inspect files / validate data with read-only tools.
- CoPilot prints a proposed exact command, for example:
python -m neuralsmith tabular-classification ...
- Nothing runs until you confirm by typing `yes`.
- Training output streams in the same terminal.
How --agent-plus works
- Same planning/tool behavior as `--agent`
- Difference: proposed training commands run immediately without the `yes` confirmation step
- Use only in trusted, local environments
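The difference between the two modes can be pictured as a small gate in front of a subprocess call. The following is a sketch under assumptions, not the agent's real code: the allowlist contents, the function names, and the `auto_confirm` switch (standing in for `--agent-plus`) are all hypothetical.

```python
import shlex
import subprocess

# Hypothetical allowlist for illustration; the real set lives inside NeuralSmith.
ALLOWED_FLAGS = {"--data-path", "--target-column", "--output-dir", "--mode"}


def flags_allowed(argv):
    """True if every --flag in argv is on the allowlist."""
    return all(arg in ALLOWED_FLAGS for arg in argv if arg.startswith("--"))


def propose_and_run(argv, confirm=input, run=subprocess.run, auto_confirm=False):
    """Show the exact argv and run it only after confirmation.

    auto_confirm=True models the no-prompt behavior of --agent-plus.
    Returns True if the command was executed.
    """
    if not flags_allowed(argv):
        print("Rejected: command uses a flag outside the allowlist")
        return False
    print("Proposed command:", shlex.join(argv))
    if not auto_confirm and confirm("Run this? Type yes to continue: ").strip() != "yes":
        print("Skipped")
        return False
    run(argv, check=False)  # output streams to the same terminal
    return True
```

Passing the command as an argv list rather than a shell string keeps the proposed command and the executed command identical, which is what makes the confirmation meaningful.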
What agent modes can do
- Validate inputs with `!validate <path>`
- Check run snapshot with `!status [path]`
- Follow live progress with `!watch [path]`
- Propose and execute existing NeuralSmith training wizards (`image-classification`, `tabular-*`, `timeseries-*`, `auto-labeler`)
Safe usage tips
- Prefer `--agent` for normal use
- Provide explicit paths and target columns in your prompt to reduce retries
- Use `--agent-plus` only if you are comfortable with automatic execution
Quick test with bundled sample data
!validate tests/data/tabular_classification/iris_like_100.csv
Train a quick tabular classification model on tests/data/tabular_classification/iris_like_100.csv using target column species and mode fast.
Example (two terminals):
- Start a wizard with `--output-dir` (for example `./run_live_test`)
- In CoPilot, run `!watch ./run_live_test` to stay updated while training runs
Common Workflows
Image Classification Workflow
- Prepare your images in a folder structure:
  my_images/
  ├── class1/
  │   ├── img1.jpg
  │   └── img2.jpg
  └── class2/
      ├── img3.jpg
      └── img4.jpg
- Run training:
  neuralsmith image-classification \
  --data-path ./my_images \
  --output-dir ./results \
  --epochs 50 \
  --val-split 0.2
- Check results in the `./results/` directory
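Before starting a long run, it can help to confirm that the folder really follows the one-sub-folder-per-class layout from the first step. The sketch below counts image files per class folder. It is an optional sanity check, not something NeuralSmith requires; the function name and the set of accepted file extensions are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of accepted extensions


def count_images_per_class(root):
    """Map each class sub-folder name to the number of image files inside it."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

For the layout above, `count_images_per_class("./my_images")` would return a count of 2 for each of `class1` and `class2`; an empty or badly unbalanced class shows up immediately.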
Tabular Classification Workflow
- Prepare your CSV with a target column
- Run EDA and training:
  neuralsmith tabular-classification \
  --data-path ./data.csv \
  --target-column target \
  --mode exhaustive \
  --output-dir ./results
- Review models in `./results/models/`
Time Series Workflow
- Prepare CSV with time column and features
- Run training:
  neuralsmith timeseries-classification \
  --data-path ./timeseries.csv \
  --time-column timestamp \
  --target-column label \
  --window-size 20 \
  --mode fast \
  --split-method temporal
## Using Your Trained Models
After training completes, NeuralSmith writes a training summary report and provides utilities for loading the trained models.
### Training Summary Report
After each training run, NeuralSmith generates a comprehensive report at:
- **Markdown Report:** `{output_dir}/training_summary_report.md`
- **JSON Report:** `{output_dir}/training_summary_report.json`
The report includes:
- **Executive Summary:** Total models trained, best model identification
- **Model Performance Comparison:** Ranked table of all models with metrics
- **Best Model Details:** Complete information about the best performing model
- **Model Usage Instructions:** Ready-to-use code examples
### Loading Models
#### Image Classification Models
```python
from neuralsmith.model_loader import load_model
import torch
import numpy as np
from PIL import Image

# Load the trained model. If load_model does not expand wildcards, pass the
# exact file name listed in the training summary report.
model = load_model('results/model_*.pth')

# Preprocess an image
image = Image.open('your_image.jpg').convert('RGB')  # force 3 channels; assumes an RGB-trained model
image = image.resize((64, 64))  # Match your training size
img_array = np.array(image).astype(np.float32) / 255.0
img_array = np.transpose(img_array, (2, 0, 1))  # HWC -> CHW
img_tensor = torch.FloatTensor(img_array).unsqueeze(0)  # add batch dimension

# Make prediction
model.eval()
with torch.no_grad():
    prediction = model(img_tensor)
    predicted_class = torch.argmax(prediction, dim=1).item()
    probabilities = torch.softmax(prediction, dim=1)[0]

print(f'Predicted class: {predicted_class}')
print(f'Probabilities: {probabilities.numpy()}')
```
#### Tabular Classification/Regression Models
```python
from neuralsmith.model_loader import load_model
import pandas as pd

# Load model and preprocessing pipeline
model, preprocessor = load_model('results/models/best_model_*/')

# Load your new data
new_data = pd.read_csv('new_data.csv')

# Preprocess using the same pipeline (handles imputation, scaling, feature selection)
X_processed = preprocessor.transform(new_data)

# Make predictions
predictions = model.predict(X_processed)

# For classification, get probabilities
if hasattr(model, 'predict_proba'):
    probabilities = model.predict_proba(X_processed)
    print(f'Predictions: {predictions}')
    print(f'Probabilities:\n{probabilities}')
else:
    print(f'Predictions: {predictions}')
```
Using CoPilot for Model Usage
After training, you can ask CoPilot for help using your models:
neuralsmith copilot
Example Questions:
- "How do I use the model I just trained?"
- "Generate code to load my model from results/"
- "Show me how to make predictions on new images"
- "How do I use my tabular model for batch predictions?"
Best Practices
- Always check the training summary report first for model details
- Use the same preprocessing that was used during training
- Match input shapes - especially for image models (size, channels)
- Handle device placement - ensure data and model are on the same device
- Use CoPilot for customized code generation based on your specific needs
Configuration
Configuration is stored in ~/.neuralsmith/config.json:
{
"gemini_api_key": "your-api-key",
"default_output_dir": "./models",
"log_level": "INFO"
}
Environment variables (override config):
- `NEURALSMITH_GEMINI_API_KEY` - Gemini API key
- `NEURALSMITH_TEST_MODE` - Enable test mode (limits epochs/models)
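The precedence rule (environment variable first, config file second) can be sketched in a few lines. This is an illustration of the lookup order described above, not the code in `neuralsmith/config.py`; the function name and its parameters are hypothetical.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".neuralsmith" / "config.json"


def resolve_api_key(config_path=CONFIG_PATH, environ=os.environ):
    """Return the Gemini API key, preferring the environment variable."""
    env_key = environ.get("NEURALSMITH_GEMINI_API_KEY")
    if env_key:
        return env_key  # environment overrides the config file
    try:
        config = json.loads(Path(config_path).read_text())
    except FileNotFoundError:
        return None  # no config file yet
    return config.get("gemini_api_key")
```

Calling `resolve_api_key()` with no arguments reads the real environment and `~/.neuralsmith/config.json`; passing explicit arguments makes the lookup easy to test.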
Getting Help
Command Help
neuralsmith --help
neuralsmith image-classification --help
neuralsmith tabular-classification --help
CoPilot Assistant
neuralsmith copilot
Then type:
- `help` - Show available commands
- `!validate <path>` - Validate a dataset
- Ask any question about NeuralSmith
Troubleshooting
API Key Issues
# Check if API key is set
neuralsmith --config-key
# Or use environment variable
export NEURALSMITH_GEMINI_API_KEY="your-key"
Python Environment
Make sure you have Python 3.8+:
python --version
Missing Dependencies
Install all dependencies:
pip install -e ".[copilot,auto-labeler]"
Memory Issues
For large datasets, use smaller batch sizes or reduce image sizes:
neuralsmith image-classification \
--data-path ./large_dataset \
--target-size 64,64 \
--batch-size 16
Model Loading Issues
Model Not Found:
- Check the training summary report for exact model paths
- Verify the output directory path is correct
Shape Mismatch Errors:
- For images: Ensure image size matches training size
- For tabular: Ensure feature names match training features
Preprocessing Errors:
- Load the preprocessor from the same model directory
- Use the same preprocessing pipeline that was used during training
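For the tabular feature-name mismatch mentioned above, a quick header comparison often finds the problem before any model code runs. The sketch below compares a CSV header against a list of expected feature names. Where that list comes from (for example, the training summary report) is an assumption, and the function name is hypothetical.

```python
import csv


def compare_columns(csv_path, expected_features):
    """Return (missing, extra) column names relative to the expected features."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle))  # first row holds the column names
    missing = [name for name in expected_features if name not in header]
    extra = [name for name in header if name not in expected_features]
    return missing, extra
```

An empty `missing` list means every training feature is present in the new file; anything in `extra` may need to be dropped or may simply be ignored by the preprocessing pipeline.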
Ask CoPilot: If you encounter issues, ask CoPilot:
- "Help me debug my model loading code"
- "Why am I getting a shape mismatch error?"
- "How do I preprocess my data correctly?"
Testing
The repository includes a comprehensive test suite in the tests/ directory.
Running Tests
Install test dependencies:
pip install -e ".[dev]"
Run all tests:
pytest tests/ -v
Run fast tests only (skip slow training tests):
pytest tests/ -v -m "not slow"
Run a specific test file:
pytest tests/test_image_classification.py -v
Test Coverage
The test suite includes:
- CLI entry point and argument parsing tests
- Image classification wizard tests
- Tabular classification/regression wizard tests
- Time series classification/regression wizard tests
- Auto-labeler wizard tests
- CoPilot functionality tests
- Configuration management tests
- Integration tests for full workflows
Test datasets are stored in tests/data/ and include small synthetic datasets for all wizard types.
Development
Setup
- Clone the repository (if not already done)
- Navigate to the CLI directory:
  cd CLI
- Install in development mode:
  pip install -e ".[dev]"
Project Structure
CLI/
├── neuralsmith/               # Main package
│   ├── cli.py                 # CLI entry point
│   ├── config.py              # Configuration management
│   ├── model_loader.py        # Model loading utilities
│   ├── reporting.py           # Report generation
│   ├── image_classification/
│   ├── tabular/
│   ├── timeseries/
│   ├── auto_labeler/
│   └── copilot/
├── Legacy_utils/              # Shared Python scripts
├── scripts/                   # Utility scripts
├── pyproject.toml             # Package configuration
└── README.md                  # This file
Development Workflow
- Make changes to code in `neuralsmith/`
- Add tests for new functionality (if applicable)
- Test manually to ensure everything works
- Update documentation in `README.md` if needed
Building the Package
cd CLI
python -m build
This creates dist/ with source distribution and wheel.
Code Style
- Follow PEP 8
- Use type hints where possible
- Add docstrings to public functions
- Keep functions focused and testable
Adding New Features
- Implement the feature in the appropriate module
- Add CLI command in `neuralsmith/cli.py`
- Update documentation in `README.md`
- Test end-to-end with real data
Requirements
- Python 3.8+
- See `pyproject.toml` for full dependency list
License
NeuralSmith
Support
For issues and questions:
- Use CoPilot: `neuralsmith copilot`
- Check the training summary reports for model-specific guidance
- Review this README for common workflows and troubleshooting
File details
Details for the file neuralsmith_cli-1.0.0.tar.gz.
File metadata
- Download URL: neuralsmith_cli-1.0.0.tar.gz
- Upload date:
- Size: 145.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80b570dfaa0844af3c784725edab9c60e1f775f1723a219e77c32bec8b39d0c6 |
| MD5 | 0dde8be2479e32e8dc558b85fc43866f |
| BLAKE2b-256 | e97aed31df5bf7831e9bb1423da2015a6136a9c1f369bb6174b206102675c252 |
File details
Details for the file neuralsmith_cli-1.0.0-py3-none-any.whl.
File metadata
- Download URL: neuralsmith_cli-1.0.0-py3-none-any.whl
- Upload date:
- Size: 97.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 60200534239b54faafc3aee742a4658486503c5f9be81824a3a6a5b1fb0e39be |
| MD5 | 9bf779c0a347ad61b59fd6fe60e4ab41 |
| BLAKE2b-256 | 0d4ca37d885680073997ae70126e05c5a872af36088d5103eb73f63e4ff107bd |
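The SHA256 digests listed for each file can be checked locally after download. The sketch below computes a file's SHA256 with the standard library so it can be compared against the listed value. The function name is hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `sha256_of_file("neuralsmith_cli-1.0.0.tar.gz")` should equal the SHA256 value shown for the source distribution if the download is intact.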