Data Experimentation and Tinkering Kit - A comprehensive Python toolkit for data science, machine learning, optimization, simulation, and visualization experiments
Dexter Toolkit
Data Experimentation and Tinkering Kit
A comprehensive Python toolkit for data science, machine learning, optimization, simulation, and visualization experiments.
Overview
Dexter is a modular toolkit designed for rapid prototyping and experimentation in data science and related fields. It provides a collection of specialized modules for different aspects of data analysis, machine learning, optimization, and visualization.
Quick Start
Installation
# Clone the repository
git clone https://github.com/DenizK00/Dexter.git
cd Dexter
# Install in development mode
make install-dev
# Or install manually
python install_dev.py
Basic Usage
import dexter
# Machine Learning
from dexter import pick_classifier
import pandas as pd
df = pd.read_csv('your_data.csv')
best_model = pick_classifier(df, target='target_column')
# Optimization
from dexter import Problem
objective = "min 2*x + 3*y"
constraints = ["x + y >= 10", "x >= 0", "y >= 0"]
problem = Problem(objective, constraints)
solution = problem.solve()
# Statistics
from dexter import Normal, Uniform
normal_dist = Normal(mean=0, var=1)
rv = normal_dist.draw()
# Simulation
from dexter import SimManager
import simpy
env = simpy.Environment()
sim = SimManager(env)
sim.run(until=100)
Package Structure
dexter/
├── src/dexter/              # Main package
│   ├── __init__.py          # Package initialization
│   ├── board/               # Interactive data dashboard
│   ├── core/                # Core pipeline utilities
│   ├── data_wrangling/      # Data transformation tools
│   ├── environment/         # Environment simulation
│   ├── language/            # Language processing
│   ├── ml/                  # Machine learning
│   ├── optimization/        # Mathematical optimization
│   ├── simulation/          # Discrete event simulation
│   ├── stats/               # Statistical analysis
│   └── visualization/       # Visualization tools
├── tests/                   # Test suite
├── docs/                    # Documentation
├── examples/                # Usage examples
└── scripts/                 # Utility scripts
Modules
ML - Machine Learning
- Auto Model Selection: Automated classifier selection with hyperparameter optimization
- Model Comparison: Cross-validation and performance metrics comparison
- Hyperopt Integration: Bayesian optimization for hyperparameter tuning
- Binary/Multiclass Support: Handles both binary and multiclass classification tasks
from dexter.ml import pick_classifier
# Automatically find the best classifier
best_model = pick_classifier(df, target='target_column', mode='extensive')
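The idea behind cross-validated model comparison can be sketched with the standard library alone. This is not Dexter's implementation: `MajorityClassifier`, `ThresholdClassifier`, `cross_val_accuracy`, and `pick_best` are hypothetical names used only for illustration, and the data is synthetic.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


class MajorityClassifier:
    """Toy baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdClassifier:
    """Toy model: predicts 1 when the first feature exceeds the training mean."""

    def fit(self, X, y):
        self.threshold_ = mean(row[0] for row in X)
        return self

    def predict(self, X):
        return [1 if row[0] > self.threshold_ else 0 for row in X]


def cross_val_accuracy(model_cls, X, y, k=5):
    """Average held-out accuracy of a freshly fitted model over k folds."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = model_cls().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in fold])
        scores.append(mean(int(p == y[i]) for p, i in zip(preds, fold)))
    return mean(scores)


def pick_best(candidates, X, y, k=5):
    """Score every candidate class and return the best name plus all scores."""
    scores = {cls.__name__: cross_val_accuracy(cls, X, y, k) for cls in candidates}
    return max(scores, key=scores.get), scores


# Synthetic data: the label is 1 when the single feature is above 50.
X = [[float(v)] for v in range(100)]
y = [1 if row[0] > 50 else 0 for row in X]
best_name, all_scores = pick_best([MajorityClassifier, ThresholdClassifier], X, y)
print(best_name, all_scores)
```

Every candidate is scored on the same folds, so the comparison reflects the models rather than a lucky split.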
Optimization - Mathematical Optimization
- Mathematical Optimization: Linear and nonlinear optimization problems
- Pyomo Integration: Mathematical modeling with Pyomo framework
- Equation Parsing: Natural language equation parsing and conversion
- Solution Management: Optimal solution extraction and evaluation
from dexter.optimization import Problem
# Define and solve optimization problem
problem = Problem("min 2*x + 3*y", ["x + y >= 10", "x >= 0", "y >= 0"])
solution = problem.solve()
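To make the string-based problem above concrete, here is a minimal standard-library sketch that parses the same objective and constraints and solves the two-variable case by checking the corner points of the feasible region. It is not how Dexter or Pyomo solve problems: `parse_linear`, `parse_constraint`, and `solve_2d` are hypothetical helpers, and the approach assumes a linear problem with exactly two variables and a bounded optimum.

```python
import re
from itertools import combinations

# One term of a linear expression: optional sign, optional number, variable name.
TERM = re.compile(r"([+-]?)\s*(\d*\.?\d*)\s*\*?\s*([A-Za-z_]\w*)")


def parse_linear(expr):
    """Parse '2*x + 3*y' into {'x': 2.0, 'y': 3.0}."""
    coeffs = {}
    for sign, number, name in TERM.findall(expr):
        value = float(number) if number else 1.0
        coeffs[name] = coeffs.get(name, 0.0) + (-value if sign == "-" else value)
    return coeffs


def parse_constraint(text):
    """Parse 'x + y >= 10' into (coefficients, operator, right-hand side)."""
    lhs, op, rhs = re.split(r"(<=|>=|=)", text, maxsplit=1)
    return parse_linear(lhs), op, float(rhs)


def is_feasible(point, constraints, tol=1e-9):
    """Check a candidate point against every parsed constraint."""
    for coeffs, op, rhs in constraints:
        lhs = sum(c * point.get(v, 0.0) for v, c in coeffs.items())
        if op == ">=" and lhs < rhs - tol:
            return False
        if op == "<=" and lhs > rhs + tol:
            return False
        if op == "=" and abs(lhs - rhs) > tol:
            return False
    return True


def solve_2d(objective, constraint_texts):
    """Enumerate intersections of constraint boundaries and keep the best feasible one."""
    sense, expr = objective.split(maxsplit=1)
    cost = parse_linear(expr)
    constraints = [parse_constraint(t) for t in constraint_texts]
    names = sorted(set(cost) | {v for coeffs, _, _ in constraints for v in coeffs})
    if len(names) != 2:
        raise ValueError("this sketch only handles exactly two variables")
    a, b = names
    best = None
    for (c1, _, r1), (c2, _, r2) in combinations(constraints, 2):
        a1, b1 = c1.get(a, 0.0), c1.get(b, 0.0)
        a2, b2 = c2.get(a, 0.0), c2.get(b, 0.0)
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines.
        point = {a: (r1 * b2 - r2 * b1) / det, b: (a1 * r2 - a2 * r1) / det}
        if not is_feasible(point, constraints):
            continue
        value = sum(c * point[v] for v, c in cost.items())
        key = value if sense == "min" else -value
        if best is None or key < best[0]:
            best = (key, point, value)
    if best is None:
        raise ValueError("no feasible corner point found")
    return best[1], best[2]


point, value = solve_2d("min 2*x + 3*y", ["x + y >= 10", "x >= 0", "y >= 0"])
print(point, value)
```

For this problem the cheapest way to satisfy `x + y >= 10` is to use only the variable with the smaller cost coefficient, so the optimum lies at the corner where `y = 0`.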
Simulation - Discrete Event Simulation
- Discrete Event Simulation: Built on SimPy for event-driven simulations
- Resource Management: Dynamic resource allocation and management
- Process Control: Start, stop, and manage simulation processes
- Step Mode: Step-by-step simulation execution for debugging
from dexter.simulation import SimManager
import simpy
env = simpy.Environment()
sim = SimManager(env)
sim.add_resource("service", simpy.Resource(env, capacity=2))
sim.run(until=100)
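What a capacity-limited resource does during a run can be illustrated without SimPy. The sketch below is a simplified, standard-library model of a first-come, first-served queue with a fixed number of servers; `simulate_queue` is a hypothetical function and is not part of Dexter or SimPy.

```python
import heapq


def simulate_queue(arrival_times, service_time, capacity):
    """Return (arrival, start, finish) for each customer of a FIFO queue.

    `capacity` servers work in parallel; a customer starts as soon as it has
    arrived and the earliest-free server is available.
    """
    free_at = [0.0] * capacity  # min-heap of times at which each server is free
    heapq.heapify(free_at)
    log = []
    for arrival in sorted(arrival_times):
        server_free = heapq.heappop(free_at)  # earliest available server
        start = max(arrival, server_free)     # wait if every server is busy
        finish = start + service_time
        heapq.heappush(free_at, finish)
        log.append((arrival, start, finish))
    return log


log = simulate_queue([0, 1, 2, 3], service_time=5, capacity=2)
for arrival, start, finish in log:
    print(f"arrived={arrival} waited={start - arrival} finished={finish}")
```

With two servers the first two customers start immediately, while the third and fourth must wait until a server finishes its current job.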
Stats - Statistical Analysis
- Probability Distributions: Comprehensive distribution library
  - Normal, Uniform, Binomial, Geometric, Negative Binomial
  - Poisson, Exponential, Gamma, Chi-Square distributions
- Random Variable Management: RV and Sample classes for statistical operations
- Distribution Operations: Addition, multiplication, and transformation of distributions
from dexter.stats import Normal, Uniform, Binomial
# Create and work with distributions
normal_dist = Normal(mean=0, var=1)
uniform_dist = Uniform(a=0, b=1)
binomial_dist = Binomial(n=10, p=0.5)
# Generate random variables
rv = normal_dist.draw()
sample = uniform_dist.draw(n=100)
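As an illustration of what arithmetic on distributions means, the sketch below implements addition and scaling for independent normal distributions. `ToyNormal` is a hypothetical class written for this example; it only mirrors the `mean`/`var`/`draw` naming used above and makes no claim about how Dexter's `Normal` is implemented.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNormal:
    """Hypothetical normal distribution parameterised by mean and variance."""

    mean: float
    var: float

    def __add__(self, other):
        # For independent normals, both the means and the variances add.
        return ToyNormal(self.mean + other.mean, self.var + other.var)

    def scale(self, factor):
        # Scaling by c multiplies the mean by c and the variance by c squared.
        return ToyNormal(factor * self.mean, factor ** 2 * self.var)

    def draw(self, n=None, rng=random):
        """Draw one value, or a list of n values when n is given."""
        sigma = self.var ** 0.5
        if n is None:
            return rng.gauss(self.mean, sigma)
        return [rng.gauss(self.mean, sigma) for _ in range(n)]


total = ToyNormal(mean=0, var=1) + ToyNormal(mean=2, var=3)
scaled = ToyNormal(mean=1, var=2).scale(3)
sample = total.draw(n=5, rng=random.Random(42))
print(total, scaled, len(sample))
```

The addition rule holds only when the two variables are independent; correlated variables would need a covariance term.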
Visualization - Interactive Visualization
- 3D Space Visualization: Interactive 3D plotting with Plotly
- Vector Visualization: 3D vector representation and manipulation
- Surface Plotting: 3D surface and mesh grid visualization
- Interactive Plots: Web-based interactive visualizations
from dexter.visualization import Space
# Create 3D visualization space
space = Space(x_size=10, y_size=10, z_size=10)
space.add_vector([1, 2, 3], color='red')
space.show()
Environment - Environment Simulation
- Grid-based Environment: 2D grid system for agent-based simulations
- Tkinter GUI: Interactive grid display with agent positioning
- Agent Management: Place and track agents within the grid environment
from dexter.environment import Grid, GridApp
# Create grid environment
grid = Grid(nrows=10, ncolumns=10)
grid.set_agent(5, 5)
grid.set_cell(3, 3, '#')
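A grid environment of this kind boils down to a 2D array of cell values plus an agent position. The sketch below shows one way to model that with plain Python lists. `ToyGrid` is a hypothetical class that borrows the method names from the example above; the `render` method and the `.`/`A` symbols are assumptions made for illustration.

```python
class ToyGrid:
    """Hypothetical 2D grid that stores cell values and a single agent position."""

    def __init__(self, nrows, ncolumns, empty="."):
        self.nrows = nrows
        self.ncolumns = ncolumns
        self.cells = [[empty] * ncolumns for _ in range(nrows)]
        self.agent = None  # (row, column) once an agent has been placed

    def _check(self, row, col):
        # Reject coordinates outside the grid instead of wrapping around.
        if not (0 <= row < self.nrows and 0 <= col < self.ncolumns):
            raise IndexError(f"cell ({row}, {col}) is outside the grid")

    def set_cell(self, row, col, value):
        self._check(row, col)
        self.cells[row][col] = value

    def set_agent(self, row, col):
        self._check(row, col)
        self.agent = (row, col)

    def render(self):
        """Return the grid as text, drawing the agent as 'A'."""
        lines = []
        for r, row in enumerate(self.cells):
            lines.append("".join(
                "A" if self.agent == (r, c) else value
                for c, value in enumerate(row)
            ))
        return "\n".join(lines)


grid = ToyGrid(nrows=3, ncolumns=4)
grid.set_cell(0, 1, "#")
grid.set_agent(2, 3)
print(grid.render())
```

Keeping the agent position separate from the cell values means moving the agent never overwrites what is stored in a cell.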
Board - Interactive Data Dashboard
- Interactive Web Dashboard: Built with Dash and Bootstrap for data visualization
- IPython Integration: Custom kernel management with Jupyter console integration
- Real-time Data Viewing: Live data table updates and interactive components
Data Wrangling - Data Transformation
- Data Modification: Tools for data transformation and manipulation
- Diffusion Functions: Data diffusion and spreading utilities
- Deviation Functions: Statistical deviation and error introduction
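One common reading of "error introduction" is adding random noise to clean values, for example to test how robust a model is. The sketch below does this with Gaussian noise from the standard library. `add_gaussian_noise` is a hypothetical helper and may differ from what Dexter's deviation functions actually do.

```python
import random


def add_gaussian_noise(values, sigma, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added to each item.

    `sigma` is the standard deviation of the noise; passing a `seed` makes the
    perturbation reproducible.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


clean = [10.0, 20.0, 30.0]
noisy = add_gaussian_noise(clean, sigma=0.5, seed=1)
print(noisy)
```

Fixing the seed makes it possible to rerun an experiment against exactly the same perturbed data.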
Core - Pipeline Management
- Modular Pipeline System: Extensible pipeline architecture
- Process Chaining: Sequential process execution with result management
- Step-by-step Execution: Individual step execution and monitoring
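The pipeline ideas listed above (chaining, result management, step-by-step execution) can be sketched in a few lines. `ToyPipeline` and its methods are hypothetical and written only for this example; Dexter's own pipeline classes may be organised differently.

```python
class ToyPipeline:
    """Hypothetical pipeline: runs named steps in order and keeps every result."""

    def __init__(self, steps):
        self.steps = list(steps)  # list of (name, function) pairs
        self.results = {}         # step name -> output of that step
        self._value = None
        self._next = 0

    def start(self, value):
        """Reset the pipeline and load the initial input."""
        self.results = {}
        self._value = value
        self._next = 0

    @property
    def done(self):
        return self._next >= len(self.steps)

    def step(self):
        """Run only the next step and return its output."""
        if self.done:
            raise RuntimeError("all steps have already run")
        name, func = self.steps[self._next]
        self._value = func(self._value)  # each step consumes the previous output
        self.results[name] = self._value
        self._next += 1
        return self._value

    def run(self, value):
        """Run every step from the beginning and return the final output."""
        self.start(value)
        while not self.done:
            self.step()
        return self._value


pipeline = ToyPipeline([
    ("double", lambda v: v * 2),
    ("increment", lambda v: v + 1),
])
final = pipeline.run(5)
print(final, pipeline.results)
```

Because `run` is built on top of `step`, a full run and a manual step-through record the same intermediate results.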
Language - Language Processing
- Fine-tuning Framework: Tools for model fine-tuning and training
- RAG Pipeline: Retrieval-Augmented Generation pipeline components
- Chain Management: Modular chain-based processing architecture
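The retrieval half of a RAG pipeline can be illustrated with a bag-of-words similarity search. This is a deliberately simple sketch with no embeddings or language model involved. `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers, and the three documents are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Prepend the retrieved context to the question, as a generator would expect."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "SimPy models discrete event simulations.",
    "Pyomo formulates optimization problems.",
    "Plotly renders interactive 3D plots.",
]
question = "How do I formulate an optimization problem?"
top = retrieve(question, docs)
print(build_prompt(question, docs))
```

A real pipeline would typically swap the word counts for dense embeddings, but the retrieve-then-prompt structure stays the same.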
Development
Setup Development Environment
# Install in development mode
make install-dev
# Run tests
make test
# Run linting
make lint
# Format code
make format
# Run all checks
make check
Project Structure
dexter/
├── src/dexter/       # Source code
├── tests/            # Test suite
├── docs/             # Documentation
├── examples/         # Usage examples
├── scripts/          # Utility scripts
├── pyproject.toml    # Project configuration
├── setup.py          # Setup script
├── Makefile          # Development tasks
├── install_dev.py    # Development installation
└── README.md         # This file
Dependencies
Core Dependencies
- Data Science: pandas, numpy, scipy, scikit-learn
- Visualization: matplotlib, seaborn, plotly
- Web Dashboard: dash, dash-bootstrap-components
- Optimization: pyomo
- Simulation: simpy
- Machine Learning: hyperopt
- GUI: PyQt5
- Jupyter: ipykernel, ipython
Development Dependencies
- Testing: pytest, pytest-cov
- Linting: flake8, mypy
- Formatting: black, isort
- Documentation: sphinx, sphinx-rtd-theme
Documentation
For detailed documentation, examples, and API reference, see the documentation.
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Workflow
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Run tests: make test
- Format code: make format
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Deniz - denizkurtaran00@gmail.com
Acknowledgments
- Built with ❤️ for the data science community
- Inspired by the need for rapid experimentation tools
- Powered by the amazing Python ecosystem
Dexter Toolkit - Making data experimentation and tinkering easier and more efficient.
File details
Details for the file dexter_toolkit-0.1.0.tar.gz.
File metadata
- Download URL: dexter_toolkit-0.1.0.tar.gz
- Upload date:
- Size: 31.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f02cc2e92cbc81129cee00ab820615f82019e70d8728e4d11e538dd4710af131 |
| MD5 | 588d37994831d3a3a1365da9861fadd0 |
| BLAKE2b-256 | 774bf7c99b44df0cf3a0f3a3038dce2ab16f305be675c18c8ea430cf8da0dae1 |
File details
Details for the file dexter_toolkit-0.1.0-py3-none-any.whl.
File metadata
- Download URL: dexter_toolkit-0.1.0-py3-none-any.whl
- Upload date:
- Size: 37.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cfb1e87fe9bfc15fe7fc9d23255984815e575a1042a9f6e6d1b3eca673242e88 |
| MD5 | 0ae329ecdb727fdf8b87004cba51c898 |
| BLAKE2b-256 | 0684cb1a3753b99addd5b3da8cd9067a5446fa62f1ab1d591039e0be59cd3a58 |