🚀 Featrix Sphere API Client

Transform any CSV into a production-ready ML model in minutes, not months.

The Featrix Sphere API automatically builds neural embedding spaces from your data and trains high-accuracy predictors without requiring any ML expertise. Just upload your data, specify what you want to predict, and get a production API endpoint.

✨ What Makes This Special?

  • 🎯 99.9%+ Accuracy - Achieves state-of-the-art results on real-world data
  • ⚡ Zero ML Knowledge Required - Upload CSV → Get Production API
  • 🧠 Neural Embedding Spaces - Automatically discovers hidden patterns in your data
  • 📊 Real-time Training Monitoring - Watch your model train with live loss plots
  • 🔍 Similarity Search - Find similar records using vector embeddings
  • 📈 Beautiful Visualizations - 2D projections of your high-dimensional data
  • 🚀 Production Ready - Scalable batch predictions and real-time inference

🎯 Real Results

# Actual results from fuel card fraud detection:
prediction = {
    'True': 0.9999743700027466,    # 99.997% confidence - IS fraud
    'False': 0.000024269439,       # 0.002% - not fraud  
    '<UNKNOWN>': 0.000001335       # 0.0001% - uncertain
}
# Perfect classification with extreme confidence!
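
To act on a result like this, the probability dictionary usually has to be reduced to a single label. The following is a minimal sketch, assuming the prediction is a plain dict mapping class labels to probabilities as shown above. The `top_label` helper and its 0.9 confidence threshold are illustrative and not part of the Featrix client.

```python
def top_label(prediction, min_confidence=0.9):
    """Return (label, probability, confident) for the most likely class.

    `min_confidence` is an arbitrary illustrative threshold, not a Featrix default.
    """
    label = max(prediction, key=prediction.get)
    probability = prediction[label]
    return label, probability, probability >= min_confidence


# Probabilities copied from the fraud detection result above
prediction = {
    "True": 0.9999743700027466,
    "False": 0.000024269439,
    "<UNKNOWN>": 0.000001335,
}

label, probability, confident = top_label(prediction)
print(f"Predicted class: {label} ({probability:.3%}), confident={confident}")
```

Predictions that fall below the threshold could be routed to manual review instead of being acted on automatically.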

🚀 Quick Start

1. Install & Import

pip install featrixsphere

from featrixsphere import FeatrixSphereClient

# Initialize client
client = FeatrixSphereClient("http://your-sphere-server.com")

2. Upload Data & Train Model

# Option A: Upload CSV file
session = client.upload_file_and_create_session("your_data.csv")

# Option B: Upload DataFrame directly (no CSV file needed!)
import pandas as pd
df = pd.read_csv("your_data.csv")  # or create/modify DataFrame however you want
session = client.upload_df_and_create_session(df)

session_id = session.session_id

# Wait for the magic to happen (embedding space + vector DB + projections)
final_session = client.wait_for_session_completion(session_id)

# Add a predictor for your target column
client.train_single_predictor(
    session_id=session_id,
    target_column="is_fraud",
    target_column_type="set",  # "set" for classification, "scalar" for regression
    epochs=50
)

# Wait for predictor training
client.wait_for_session_completion(session_id)

3. Make Predictions

# Single prediction
result = client.make_prediction(session_id, {
    "transaction_amount": 1500.00,
    "merchant_category": "gas_station", 
    "location": "highway_exit"
})

print(result['prediction'])
# {'fraud': 0.95, 'legitimate': 0.05}  # 95% fraud probability!

# Batch predictions on 1000s of records
csv_results = client.test_csv_predictions(
    session_id=session_id,
    csv_file="test_data.csv",
    target_column="is_fraud",
    sample_size=1000
)

print(f"Accuracy: {csv_results['accuracy_metrics']['accuracy']*100:.2f}%")
# Accuracy: 99.87%  🎯
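
The metric names in the batch results suggest how such a summary can be derived. This is a rough sketch that assumes accuracy is the fraction of rows whose most likely class matches the actual label, and that average confidence is the mean top-class probability. The server's exact definitions may differ, and `summarize_predictions` is a hypothetical helper, not a client method.

```python
def summarize_predictions(predictions, actual_labels):
    """Compute accuracy-style metrics from probability dicts and true labels."""
    if len(predictions) != len(actual_labels):
        raise ValueError("predictions and actual_labels must have the same length")
    if not predictions:
        raise ValueError("no predictions to summarize")

    correct = 0
    confidence_sum = 0.0
    for probabilities, actual in zip(predictions, actual_labels):
        predicted = max(probabilities, key=probabilities.get)  # most likely class
        confidence_sum += probabilities[predicted]
        if predicted == actual:
            correct += 1

    total = len(predictions)
    return {
        "accuracy": correct / total,
        "average_confidence": confidence_sum / total,
        "correct_predictions": correct,
        "total_predictions": total,
    }


# Made-up example data: three predictions and their true labels
metrics = summarize_predictions(
    [
        {"fraud": 0.95, "legitimate": 0.05},
        {"fraud": 0.20, "legitimate": 0.80},
        {"fraud": 0.60, "legitimate": 0.40},
    ],
    ["fraud", "legitimate", "legitimate"],
)
print(f"Accuracy: {metrics['accuracy'] * 100:.2f}%")
```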

🎨 Beautiful Examples

📊 DataFrame Upload Workflow

import pandas as pd
from featrixsphere import FeatrixSphereClient

# Load and prepare your data
df = pd.read_csv("transactions.csv")

# Optional: Clean/filter/modify your DataFrame
df = df.dropna()
df = df[df['amount'] > 0]

# Upload DataFrame directly - no need to save to CSV!
client = FeatrixSphereClient("https://sphere-api.featrix.com")
session = client.upload_df_and_create_session(df)

# Train and predict as usual
client.wait_for_session_completion(session.session_id)
client.train_single_predictor(session.session_id, "is_fraud", "set")
client.wait_for_session_completion(session.session_id)

# Make predictions
result = client.make_prediction(session.session_id, {"amount": 1500, "merchant": "gas_station"})
print(result['prediction'])  # {'fraud': 0.95, 'legitimate': 0.05}

🏦 Fraud Detection

# Train on transaction data
client.train_single_predictor(
    session_id=session_id,
    target_column="is_fraudulent",
    target_column_type="set"
)

# Detect fraud in real-time
fraud_check = client.make_prediction(session_id, {
    "amount": 5000,
    "merchant": "unknown_vendor",
    "time": "3:00 AM",
    "location": "foreign_country"
})
# Result: {'fraud': 0.98, 'legitimate': 0.02} ⚠️

🎯 Customer Segmentation

# Predict each customer's value segment
client.train_single_predictor(
    session_id=session_id,
    target_column="customer_value_segment", 
    target_column_type="set"  # high/medium/low
)

# Classify new customers
segment = client.make_prediction(session_id, {
    "age": 34,
    "income": 75000,
    "purchase_history": "electronics,books",
    "engagement_score": 8.5
})
# Result: {'high_value': 0.87, 'medium_value': 0.12, 'low_value': 0.01}

🏠 Real Estate Pricing

# Predict house prices (regression)
client.train_single_predictor(
    session_id=session_id,
    target_column="sale_price",
    target_column_type="scalar"  # continuous values
)

# Get price estimates
price = client.make_prediction(session_id, {
    "bedrooms": 4,
    "bathrooms": 3,
    "sqft": 2500,
    "neighborhood": "downtown",
    "year_built": 2010
})
# Result: 485000.0  (predicted price: $485,000)

🧪 Comprehensive Testing

Full Model Validation

# Run complete test suite
results = client.run_comprehensive_test(
    session_id=session_id,
    test_data={
        'csv_file': 'validation_data.csv',
        'target_column': 'target',
        'sample_size': 500
    }
)

# Results include:
# ✅ Individual prediction tests
# ✅ Batch accuracy metrics
# ✅ Training performance data
# ✅ Model confidence analysis

CSV Batch Testing

# Test your model on any CSV file
results = client.test_csv_predictions(
    session_id=session_id,
    csv_file="holdout_test.csv", 
    target_column="actual_outcome",
    sample_size=1000
)

print(f"""
🎯 Model Performance:
   Accuracy: {results['accuracy_metrics']['accuracy']*100:.2f}%
   Avg Confidence: {results['accuracy_metrics']['average_confidence']*100:.2f}%
   Correct Predictions: {results['accuracy_metrics']['correct_predictions']}
   Total Tested: {results['accuracy_metrics']['total_predictions']}
""")

🔍 Advanced Features

Batch Encoding (NEW in v0.2.228! 🆕)

# Encode single record
embedding = client.encode_records(session_id, {
    "text": "customer complaint about billing",
    "category": "support"
})

# NEW: Batch encoding with intelligent adaptive sizing!
# Handles any size - even 50,000+ records efficiently
records = [
    {"text": "happy customer", "category": "support"},
    {"text": "billing issue", "category": "finance"},
    # ... thousands more records
]

# Automatically batches, measures response time, and optimizes throughput
embeddings = client.encode_records(session_id, records)
# Output:
# 📊 Encoding 50,000 records with adaptive batching...
#   ✓ Batch: 100 records in 2.3s (43.5 rec/s) - Progress: 100/50,000 (0.2%)
#   ⚡ Fast response, increasing batch size to 150
#   ✓ Batch: 150 records in 3.1s (48.4 rec/s) - Progress: 250/50,000 (0.5%)
#   ...
# ✅ Completed encoding 50,000 records

# Access both 3D and full-dimensional embeddings
for result in embeddings:
    short_3d = result['embedding_short']      # 3D for visualization
    full_embedding = result['embedding_long']  # Full-dimensional for ML

How Batch Encoding Works:

  • Starts with batches of 100 records
  • Measures each batch's response time
  • If under 3.5s → multiplies the batch size by 1.5
  • If between 3.5s and 5s → multiplies it by 1.2
  • If over 6.5s → multiplies it by 0.7 (a 30% reduction)
  • Targets ~5 seconds per batch for optimal throughput
  • Batch size stays within 10-5,000 records
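
The sizing rules above can be sketched as a small function. This is an illustration of the described behavior, not the client's actual implementation. The names `next_batch_size` and `encode_in_batches`, the rounding, and keeping the size unchanged between 5s and 6.5s are assumptions.

```python
import time

MIN_BATCH, MAX_BATCH = 10, 5000  # range stated in the list above


def next_batch_size(current, elapsed_s):
    """Pick the next batch size from how long the previous batch took."""
    if elapsed_s < 3.5:
        scaled = current * 1.5      # fast response: grow aggressively
    elif elapsed_s < 5.0:
        scaled = current * 1.2      # slightly under target: grow gently
    elif elapsed_s > 6.5:
        scaled = current * 0.7      # too slow: shrink
    else:
        scaled = current            # assumed: 5.0-6.5s is close enough, keep size
    return max(MIN_BATCH, min(MAX_BATCH, round(scaled)))


def encode_in_batches(records, encode_batch, timer=time.monotonic):
    """Send records through `encode_batch` in adaptively sized chunks.

    `encode_batch` is any callable that takes a list of records and returns
    one result per record (hypothetical stand-in for the server call).
    """
    results = []
    size = 100  # starting batch size from the list above
    start = 0
    while start < len(records):
        batch = records[start:start + size]
        began = timer()
        results.extend(encode_batch(batch))
        elapsed = timer() - began
        start += len(batch)
        size = next_batch_size(size, elapsed)
    return results


# Local stand-in for the encoder: doubles each number instead of embedding it
doubled = encode_in_batches(list(range(250)), lambda batch: [x * 2 for x in batch])
print(len(doubled), next_batch_size(100, 2.3))
```

The first step matches the sample log: a 100-record batch that returns in 2.3s is followed by a batch of 150.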

Similarity Search

# Find similar records using neural embeddings
similar = client.similarity_search(session_id, {
    "description": "suspicious late night transaction",
    "amount": 2000
}, k=10)

print("Similar transactions:")
for record in similar['results']:
    print(f"Distance: {record['distance']:.3f} - {record['record']}")
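
Conceptually, similarity search ranks stored records by how close their embeddings are to the query's embedding. Below is a minimal brute-force sketch, assuming Euclidean distance and an in-memory list of (record, embedding) pairs. The server's actual distance metric and index are not documented here, and `nearest_records` is a hypothetical helper.

```python
import math


def nearest_records(query_embedding, indexed, k=10):
    """Return the k entries of `indexed` closest to `query_embedding`.

    `indexed` is a list of (record, embedding) pairs; Euclidean distance
    is an assumption made for this illustration.
    """
    scored = [
        {"distance": math.dist(query_embedding, embedding), "record": record}
        for record, embedding in indexed
    ]
    scored.sort(key=lambda item: item["distance"])
    return scored[:k]


# Made-up 3D embeddings for three records
indexed = [
    ({"id": 1}, [0.0, 0.0, 0.0]),
    ({"id": 2}, [1.0, 0.0, 0.0]),
    ({"id": 3}, [0.0, 3.0, 4.0]),
]

neighbors = nearest_records([0.9, 0.0, 0.0], indexed, k=2)
for item in neighbors:
    print(f"Distance: {item['distance']:.3f} - {item['record']}")
```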

Vector Embeddings

# Get neural embeddings for any record
embedding = client.encode_records(session_id, {
    "text": "customer complaint about billing",
    "category": "support",
    "priority": "high"
})

print(f"3D embedding: {embedding['embedding_short']}")
print(f"Full embedding dimension: {len(embedding['embedding_long'])}")
# Full embedding dimension: 512  (rich 512-dimensional representation!)

Training Metrics & Monitoring

# Get detailed training metrics
metrics = client.get_training_metrics(session_id)

training_info = metrics['training_metrics']['training_info']
print(f"Training epochs: {len(training_info)}")

# Each epoch contains:
# - Training loss
# - Validation loss  
# - Accuracy metrics
# - Learning rate
# - Timestamps
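
One common use of per-epoch metrics is finding the epoch with the lowest validation loss. The sketch below assumes each epoch entry is a dict with `epoch` and `validation_loss` keys. Those key names are guesses for illustration only, so inspect `training_info` from your own session before relying on them.

```python
def best_epoch(training_info, loss_key="validation_loss"):
    """Return the epoch entry with the lowest validation loss.

    `loss_key` is an assumed field name; adjust it to match the real payload.
    """
    scored = [entry for entry in training_info if entry.get(loss_key) is not None]
    if not scored:
        raise ValueError(f"no entries contain {loss_key!r}")
    return min(scored, key=lambda entry: entry[loss_key])


# Made-up metrics in the assumed shape
training_info = [
    {"epoch": 1, "validation_loss": 0.62},
    {"epoch": 2, "validation_loss": 0.41},
    {"epoch": 3, "validation_loss": 0.45},
]

best = best_epoch(training_info)
print(f"Best epoch: {best['epoch']} (validation loss {best['validation_loss']:.2f})")
```

A validation loss that rises after the best epoch, as in this made-up data, is a common sign of overfitting.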

Model Inventory

# See what models are available
models = client.get_session_models(session_id)

print(f"""
📦 Available Models:
   Embedding Space: {'✅' if models['summary']['training_complete'] else '❌'}
   Single Predictor: {'✅' if models['summary']['prediction_ready'] else '❌'}
   Similarity Search: {'✅' if models['summary']['similarity_search_ready'] else '❌'}
   Visualizations: {'✅' if models['summary']['visualization_ready'] else '❌'}
""")

📊 API Reference

Core Methods

Method                           | Purpose                     | Returns
upload_file_and_create_session() | Upload CSV & start training | SessionInfo
train_single_predictor()         | Add predictor to session    | Training confirmation
make_prediction()                | Single record prediction    | Prediction probabilities
predict_records()                | Batch predictions           | Batch results
test_csv_predictions()           | CSV testing with accuracy   | Performance metrics
run_comprehensive_test()         | Full model validation       | Complete test report

Monitoring & Analysis

Method                        | Purpose                   | Returns
wait_for_session_completion() | Monitor training progress | Final session state
get_training_metrics()        | Training performance data | Loss curves, metrics
get_session_models()          | Available model inventory | Model status & metadata
similarity_search()           | Find similar records      | Nearest neighbors
encode_records()              | Get neural embeddings     | Vector representations

🎯 Pro Tips

🚀 Performance Optimization

# Use batch predictions for better throughput
batch_results = client.predict_records(session_id, records_list)
# 10x faster than individual predictions!

# Featrix will automatically tune your model for your data.
client.train_single_predictor(
    session_id=session_id,
    target_column="target",
    target_column_type="set"
)

🎨 Data Preparation

# Your CSV just needs:
# ✅ Clean column names (no spaces/special chars work best)
# ✅ Target column for prediction
# ✅ Mix of categorical and numerical features
# ✅ At least 100 rows (more = better accuracy)

# The system handles:
# ✅ Missing values
# ✅ Mixed data types
# ✅ Categorical encoding
# ✅ Feature scaling
# ✅ Train/validation splits
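
Column names with spaces or special characters can be normalized before upload. This is one possible convention (lowercase snake_case), sketched with the standard library only. `clean_column_name` and `clean_column_names` are hypothetical helpers, and Featrix does not require this exact scheme.

```python
import re


def clean_column_name(name):
    """Replace runs of non-alphanumeric characters with '_' and lowercase."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", str(name).strip())
    return cleaned.strip("_").lower()


def clean_column_names(names):
    """Clean every name and fail loudly if two columns collapse to one."""
    cleaned = [clean_column_name(name) for name in names]
    duplicates = sorted({name for name in cleaned if cleaned.count(name) > 1})
    if duplicates:
        raise ValueError(f"column names collide after cleaning: {duplicates}")
    return cleaned


print(clean_column_names(["Transaction Amount ($)", "Merchant Category", " is_fraud "]))
```

With pandas, the result can be assigned back with `df.columns = clean_column_names(df.columns)` before calling `upload_df_and_create_session(df)`.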

🔍 Debugging & Monitoring

# Check session status anytime
status = client.get_session_status(session_id)
print(f"Status: {status.status}")

for job_id, job in status.jobs.items():
    print(f"Job {job_id}: {job['status']} ({job.get('progress', 0)*100:.1f}%)")

# Monitor training in real-time
import time
while True:
    status = client.get_session_status(session_id)
    if status.status == 'done':
        break
    print(f"Training... {status.status}")
    time.sleep(10)
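
A bare `while True` loop never exits if the session stalls or fails, so a timeout is a useful safeguard. Here is a minimal sketch of a polling helper with a deadline. `wait_until_done` is a hypothetical wrapper, the 30-minute default is arbitrary, and `'done'` is the only status value taken from the example above.

```python
import time


def wait_until_done(get_status, done_states=("done",), timeout_s=1800.0,
                    interval_s=10.0, sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status()` until it returns a done state or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in done_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"session still {status!r} after {timeout_s} seconds")
        print(f"Training... {status}")
        sleep(interval_s)


# Demo with a fake status source and no real sleeping
fake_statuses = iter(["running", "running", "done"])
final_status = wait_until_done(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(f"Final status: {final_status}")
```

Against a real session, the status source could be `lambda: client.get_session_status(session_id).status`.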

🏆 Success Stories

"We replaced 6 months of ML engineering with 30 minutes of CSV upload. Our fraud detection went from 87% to 99.8% accuracy."
- FinTech Startup

"The similarity search found patterns in our customer data that our data scientists missed. Revenue up 23%."
- E-commerce Platform

"Production-ready ML models without hiring a single ML engineer. This is the future."
- Healthcare Analytics

🎯 Ready to Get Started?

  1. Upload your CSV - Any tabular data works
  2. Specify your target - What do you want to predict?
  3. Wait for training - Usually 5-30 minutes depending on data size
  4. Start predicting - Get production-ready API endpoints

# It's literally this simple:
client = FeatrixSphereClient("http://your-server.com")
session = client.upload_file_and_create_session("your_data.csv")
client.wait_for_session_completion(session.session_id)  # embedding space must finish first
client.train_single_predictor(session.session_id, "target_column", "set")
client.wait_for_session_completion(session.session_id)  # wait for predictor training
result = client.make_prediction(session.session_id, your_record)
print(f"Prediction: {result['prediction']}")

Transform your data into AI. No PhD required. 🚀

Download files

Source Distribution: featrixsphere-0.2.6127.tar.gz (2.1 MB)
Built Distribution: featrixsphere-0.2.6127-py3-none-any.whl (122.9 kB, Python 3)
