Natural language hypothesis testing and comprehensive statistics library

Project description

HypoTestX

Natural Language Hypothesis Testing — Powered by LLMs or Pure Regex

Python 3.8+ · License: MIT

Ask a statistical question in plain English. HypoTestX routes it to the right test — with or without an LLM.

HypoTestX gives you two ways to run hypothesis tests:

  • Direct API — call any of 13 statistical tests explicitly with full parameter control.
  • Natural language interface — pass a plain-English question and a DataFrame to analyze(). HypoTestX parses the intent (via a regex fallback or a real LLM), picks the right test, extracts the right columns, and returns a full HypoResult.

The mathematical core is pure Python — no NumPy, no SciPy, no compiled extensions required.


Key Features

Natural Language Interface — analyze()

import hypotestx as hx
import pandas as pd

df = pd.read_csv('survey.csv')

# Zero config — built-in regex router, no API key needed
result = hx.analyze(df, "Do males earn more than females?")
print(result.summary())

Plug-in LLM Backends

Swap in any LLM with a single keyword argument — no code changes:

# Google Gemini (free tier, 1500 req/day) — pick any gemini-2.x model
result = hx.analyze(df, "Is age correlated with salary?",
                    backend="gemini", api_key="AIza...",
                    model="gemini-2.0-flash")   # or "gemini-2.0-flash-lite"

# Groq (free tier, OpenAI-compatible) — pick any supported model
result = hx.analyze(df, "Is there an association between gender and dept?",
                    backend="groq", api_key="gsk_...",
                    model="llama-3.3-70b-versatile")

# OpenAI — specify model and sampling temperature
result = hx.analyze(df, "Do groups differ?",
                    backend="openai", api_key="sk-...",
                    model="gpt-4o-mini", temperature=0.0)

# Local Ollama (completely offline) — choose any pulled model
result = hx.analyze(df, "Compare satisfaction across regions?",
                    backend="ollama", model="mistral")

# Bring your own callable
result = hx.analyze(df, "Any question",
                    backend=lambda msgs: my_llm(msgs[-1]["content"]))

Pure Python Mathematics

  • Zero dependencies for all statistical computations
  • All test functions and distributions implemented from scratch
  • Complete transparency — read the source to see exactly how statistics work
  • All LLM HTTP calls use only urllib.request from the standard library

Dual Mode Design

# Natural language — let HypoTestX choose the test
hx.analyze(df, "Is there a difference between group A and B?")

# Direct API — explicit control over every parameter
hx.ttest_2samp(group1, group2, equal_var=False, alpha=0.01)

Comprehensive Statistical Toolkit

  • Parametric tests: one-sample, two-sample, paired t-tests, one-way ANOVA
  • Non-parametric tests: Mann-Whitney U, Wilcoxon signed-rank, Kruskal-Wallis
  • Categorical tests: chi-square (independence + GoF), Fisher's exact
  • Correlation: Pearson, Spearman, point-biserial
  • Effect sizes: Cohen's d, eta-squared, Cramer's V, rank-biserial r
  • Power analysis: sample size calculations, post-hoc power
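
To make the effect-size column concrete: Cohen's d is the difference in group means divided by the pooled standard deviation. The sketch below is an illustration in the library's pure-Python spirit, not its actual implementation:

```python
import math

def cohens_d(a, b):
    """Cohen's d: mean difference over the pooled standard deviation."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Bessel-corrected sample variances of each group
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

print(round(cohens_d([5, 6, 7, 8], [3, 4, 5, 6]), 3))  # 1.549 — a large effect
```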

Quick Start

Installation

pip install hypotestx

No mandatory external dependencies — all statistical math and LLM HTTP calls use only the Python standard library.
Optional extras:

# For local Ollama backend (free, offline)
# 1. Install Ollama from https://ollama.com
# 2. Pull a model:  ollama pull llama3.2

# For HuggingFace local inference (optional)
pip install transformers torch

# For visualization helpers (optional)
pip install matplotlib

Basic Usage

import hypotestx as hx
import pandas as pd

# Load your data
df = pd.read_csv('your_data.csv')

# Ask questions naturally — no API key required (regex fallback)
result = hx.analyze(df, "Do customers in region A spend more than region B?")

# Get comprehensive results
print(result.summary())
# [ Welch's t-test (unequal variances) ]
# =========================================
# Statistic (t):   3.2456
# p-value:         0.0012
# Significant:     Yes (alpha = 0.05)
# Effect size (d): 0.6834   (medium)
# 95% CI:          [1.23, 4.56]

# Access individual values
print(result.p_value)          # 0.0012
print(result.effect_size)      # 0.6834
print(result.is_significant)   # True

Examples

One-Sample t-test

# Natural language
result = hx.analyze(df, "Is the average score different from 75?")

# Direct API
result = hx.ttest_1samp(df['scores'].tolist(), mu=75, alternative='two-sided')

Two-Sample t-test

# Natural language — columns detected from schema
result = hx.analyze(df, "Do males have higher income than females?")

# Direct API with full control
males   = df[df['gender'] == 'M']['income'].tolist()
females = df[df['gender'] == 'F']['income'].tolist()
result  = hx.ttest_2samp(males, females, alternative='greater', equal_var=False)

Paired t-test

# Natural language
result = hx.analyze(df, "Did scores improve from pre_score to post_score?")

# Direct API
result = hx.ttest_paired(df['pre_score'].tolist(), df['post_score'].tolist(),
                         alternative='less')

Correlation

# Natural language
result = hx.analyze(df, "Is age correlated with salary?")

# Direct API
result = hx.pearson(df['age'].tolist(), df['salary'].tolist())

Categorical Association

# Natural language
result = hx.analyze(df, "Is there an association between gender and department?")

# Direct API
import hypotestx as hx
table = [[30, 10], [20, 40]]   # 2x2 contingency table
result = hx.chi2_test(table)

Using a Real LLM Backend

import hypotestx as hx
import pandas as pd

df = pd.read_csv('employees.csv')

# Gemini free tier — best out-of-the-box accuracy
result = hx.analyze(
    df,
    "Is there a salary difference between engineering and sales departments?",
    backend="gemini",
    api_key="AIza...",
    model="gemini-2.0-flash",   # or "gemini-2.0-flash-lite" for faster/cheaper
    temperature=0.0,
)
print(result.summary())

# Groq free tier (OpenAI-compatible, very fast)
result = hx.analyze(
    df,
    "Is employee satisfaction correlated with tenure?",
    backend="groq",
    api_key="gsk_...",
    model="llama-3.3-70b-versatile",  # or "mixtral-8x7b-32768"
)

# Local Ollama (fully offline, no API key)
result = hx.analyze(
    df,
    "Are there differences in performance scores across teams?",
    backend="ollama",           # uses llama3.2 by default
    model="phi4",               # override model
)

Natural Language Examples

analyze() understands plain English. The built-in regex fallback handles the patterns below with no API key. A real LLM backend handles arbitrarily complex phrasings.
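
As a flavor of how pattern-based routing can work, here is a deliberately simplified, hypothetical router — not the library's FallbackBackend, just the idea of mapping question patterns to test names:

```python
import re

# Toy pattern table (illustrative only); order matters: more specific first.
PATTERNS = [
    (r"correlat|related to|predict", "correlation"),
    (r"association|independent", "chi_square"),
    (r"across (all )?|three or more", "anova"),
    (r"(pre|before).*(post|after)|improve", "paired_ttest"),
    (r"different from \d|mean equals", "one_sample_ttest"),
    (r"more than|higher|differ", "two_sample_ttest"),
]

def route(question: str) -> str:
    q = question.lower()
    for pattern, test in PATTERNS:
        if re.search(pattern, q):
            return test
    return "unknown"

print(route("Is age correlated with salary?"))    # correlation
print(route("Do males earn more than females?"))  # two_sample_ttest
```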

Two-group comparisons

hx.analyze(df, "Do males spend more than females?")
hx.analyze(df, "Is there a difference between group A and group B?")
hx.analyze(df, "Are premium customers different from basic customers?")
hx.analyze(df, "Test whether method 1 is better than method 2")

One-sample tests

hx.analyze(df, "Is the average score different from 100?")
hx.analyze(df, "Test if the mean equals 50")
hx.analyze(df, "Is the average significantly greater than 75?")

Correlation & relationships

hx.analyze(df, "Is there a correlation between age and income?")
hx.analyze(df, "Is salary related to years of experience?")
hx.analyze(df, "Does age predict salary?")

Categorical associations

hx.analyze(df, "Are gender and department independent?")
hx.analyze(df, "Is there an association between treatment and outcome?")
hx.analyze(df, "Are product preference and region related?")

Multi-group comparisons

hx.analyze(df, "Compare satisfaction scores across all regions")
hx.analyze(df, "Are there differences in performance across three teams?")

Paired / before-after

hx.analyze(df, "Did scores improve from pre_score to post_score?")
hx.analyze(df, "Compare before and after treatment")

Supported Tests

Parametric

Test              | NL phrase examples                | Direct function
------------------|-----------------------------------|----------------
One-sample t-test | "Is the mean different from 100?" | ttest_1samp()
Two-sample t-test | "Do groups differ?"               | ttest_2samp()
Welch's t-test    | "Compare (unequal variances)"     | welch_ttest()
Paired t-test     | "Did scores change?"              | ttest_paired()
One-way ANOVA     | "Compare three or more groups"    | anova_1way()

Non-Parametric

Test                 | NL phrase examples             | Direct function
---------------------|--------------------------------|----------------
Mann-Whitney U       | "Compare (non-normal data)"    | mannwhitney()
Wilcoxon signed-rank | "Paired (non-normal)"          | wilcoxon()
Kruskal-Wallis       | "Multiple groups (non-normal)" | kruskal()

Categorical

Test           | NL phrase examples               | Direct function
---------------|----------------------------------|----------------
Chi-square     | "Are the variables independent?" | chi2_test()
Fisher's exact | "2x2 table, small sample"        | fisher_exact()
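
For intuition, the chi-square statistic compares each observed cell count against the count expected under independence (row total × column total / grand total). A from-scratch sketch in the same pure-Python spirit (illustrative, not the library's code):

```python
def chi2_statistic(table):
    """Pearson chi-square statistic for a 2-D contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

print(round(chi2_statistic([[30, 10], [20, 40]]), 3))  # 16.667
```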

Correlation

Test           | NL phrase examples                     | Direct function
---------------|----------------------------------------|----------------
Pearson        | "Linear relationship between X and Y?" | pearson()
Spearman       | "Monotonic / rank correlation?"        | spearman()
Point-biserial | "Continuous vs binary?"                | pointbiserial()
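
Pearson's r itself is the covariance of the two variables scaled by the product of their standard deviations. A standalone, stdlib-only sketch (illustrative, not the library's implementation):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 6))  # 1.0 — perfectly linear
```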

Advanced Features

LLM Backends

All backends require zero extra dependencies except where noted.

Backend string    | Provider                  | Cost                | Default model                     | Dependencies
------------------|---------------------------|---------------------|-----------------------------------|-------------
None / "fallback" | Built-in regex router     | Free, offline       | —                                 | None
"ollama"          | Local Ollama              | Free, offline       | llama3.2                          | Ollama app
"gemini"          | Google Gemini             | Free (1500 req/day) | gemini-2.0-flash                  | None
"groq"            | Groq Cloud                | Free tier           | llama-3.3-70b-versatile           | None
"openai"          | OpenAI                    | Paid                | gpt-4o-mini                       | None
"together"        | Together AI               | Free tier           | meta-llama/Llama-3-70b-chat-hf    | None
"mistral"         | Mistral AI                | Free tier           | mistral-small-latest              | None
"perplexity"      | Perplexity AI             | Free tier           | llama-3.1-sonar-small-128k-online | None
"huggingface"     | HF Inference API or local | Free tier / Local   | zephyr-7b-beta                    | transformers (local only)

All extra kwargs are passed directly to the backend constructor via hx.analyze():

kwarg         | backends                           | notes
--------------|------------------------------------|------
model         | all                                | override the default model name
temperature   | gemini, openai-compat, huggingface | sampling temperature (0 = deterministic)
max_tokens    | gemini, openai-compat, huggingface | max tokens in the LLM response
timeout       | all                                | HTTP timeout in seconds (default: 60)
host          | ollama                             | server URL (default http://localhost:11434)
options       | ollama                             | dict forwarded to Ollama model options
token         | huggingface                        | HF access token for Inference API
use_local     | huggingface                        | load model locally via transformers
device        | huggingface (local)                | "cpu" or "cuda"
base_url      | openai-compat                      | override API base URL (e.g. Azure endpoint)
extra_headers | openai-compat                      | additional HTTP headers dict

import hypotestx as hx

# --- Gemini (free tier) ---
result = hx.analyze(df, "Is age correlated with salary?",
                    backend="gemini", api_key="AIza...",
                    model="gemini-2.0-flash",       # or "gemini-2.0-flash-lite"
                    temperature=0.0, max_tokens=512)

# --- Groq (free tier, very fast) ---
result = hx.analyze(df, "Compare departments",
                    backend="groq", api_key="gsk_...",
                    model="llama-3.3-70b-versatile") # or "mixtral-8x7b-32768"

# --- OpenAI ---
result = hx.analyze(df, "Is salary correlated with tenure?",
                    backend="openai", api_key="sk-...",
                    model="gpt-4o-mini",             # or "gpt-4o"
                    temperature=0.0, max_tokens=256)

# --- Together AI / Mistral / Perplexity ---
result = hx.analyze(df, "Do groups differ?",
                    backend="together", api_key="...",
                    model="meta-llama/Llama-3-70b-chat-hf")

# --- Custom OpenAI-compatible endpoint (Azure, vLLM, LiteLLM, …) ---
result = hx.analyze(df, "Compare groups",
                    backend="openai", api_key="...",
                    base_url="https://my-az.openai.azure.com/v1",
                    model="gpt-4o")

# --- Ollama (local, offline) ---
result = hx.analyze(df, "Do males earn more?",
                    backend="ollama",
                    model="mistral",                 # default: llama3.2
                    host="http://localhost:11434", timeout=120)

# --- HuggingFace Inference API ---
result = hx.analyze(df, "Are departments different?",
                    backend="huggingface", token="hf_...",
                    model="HuggingFaceH4/zephyr-7b-beta")

# --- HuggingFace local (requires: pip install transformers torch) ---
result = hx.analyze(df, "Is income different across regions?",
                    backend="huggingface",
                    model="microsoft/Phi-3.5-mini-instruct",
                    use_local=True, device="cuda")   # or device="cpu"

# --- Custom / plug-in backend ---
class MyCompanyLLM(hx.LLMBackend):
    name = "my_llm"
    def chat(self, messages):
        return my_internal_api.complete(messages[-1]["content"])

result = hx.analyze(df, "Is satisfaction higher in Q4?",
                    backend=MyCompanyLLM())

# --- Wrap any callable ---
result = hx.analyze(df, "...",
                    backend=lambda msgs: my_fn(msgs[-1]["content"]))

Assumption Checking

from hypotestx import check_normality, check_equal_variances

norm = check_normality(data)
if not norm.is_significant:          # Shapiro-Wilk p > 0.05 -> normal
    print("Normality assumption met")
else:
    print("Non-normal — consider Mann-Whitney U")
    result = hx.mannwhitney(group1, group2)

Effect Size Interpretation

result = hx.ttest_2samp(group1, group2)

print(f"Effect size: {result.effect_size:.3f}")
print(f"Magnitude:   {result.effect_magnitude}")  # 'small', 'medium', 'large'

if result.is_significant and result.effect_magnitude in ('medium', 'large'):
    print("Both statistically and practically significant")

Power Analysis

# How many participants do I need?
n = hx.n_ttest_two_sample(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required n per group: {n}")

# Post-hoc power
pow_result = hx.power_ttest_two_sample(
    effect_size=0.4, n1=30, n2=30, alpha=0.05
)
print(f"Achieved power: {pow_result.power:.2f}")
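
Under the normal approximation, the required n per group is n ≈ 2·((z₁₋α⁄₂ + z_power) / d)². A stdlib-only sketch of that formula (the exact t-based answer from a power routine comes out slightly larger):

```python
import math
from statistics import NormalDist

def approx_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sample t-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(approx_n_per_group(0.5))  # 63 — the t-based calculation gives ~64
```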

Bootstrap & Permutation Tests

# Bootstrap confidence interval for the difference in means
result = hx.bootstrap_ci(group1, statistic='mean', n_bootstrap=5000)
print(f"95% CI: {result}")

# Permutation test (non-parametric, exact)
result = hx.permutation_test(group1, group2, n_permutations=10000)
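
The percentile bootstrap behind such intervals is simple to sketch with the stdlib alone (illustrative, not the library's implementation): resample with replacement, recompute the statistic, and read off the empirical quantiles:

```python
import random

def bootstrap_mean_ci(data, n_bootstrap=5000, ci=0.95, seed=42):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    means = sorted(
        sum(rng.choices(data, k=len(data))) / len(data)
        for _ in range(n_bootstrap)
    )
    lo = means[int((1 - ci) / 2 * n_bootstrap)]
    hi = means[int((1 + ci) / 2 * n_bootstrap)]
    return lo, hi

lo, hi = bootstrap_mean_ci([4, 8, 6, 5, 3, 7, 9, 5])
print(f"95% CI for the mean: [{lo:.2f}, {hi:.2f}]")
```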

Verbose Mode

# See which test was selected and why
result = hx.analyze(
    df, "Is salary different between genders?",
    backend="gemini", api_key="AIza...",
    model="gemini-2.0-flash",
    verbose=True,
)
# [HypoTestX] Schema: 500 rows, columns: ['gender', 'salary', 'age']
# [HypoTestX] Backend: GeminiBackend
# [HypoTestX] Routing -> test='two_sample_ttest', confidence=0.95
# [HypoTestX] Reasoning: Two groups (M/F) compared on a numeric column

API Reference

analyze() — Natural Language Entry Point

hx.analyze(df, question, backend=None, alpha=0.05, verbose=False, **kwargs)

Parameter | Type                               | Description
----------|------------------------------------|------------
df        | DataFrame                          | pandas or polars DataFrame
question  | str                                | Plain-English hypothesis question
backend   | str, LLMBackend, callable, or None | LLM to use (default: regex fallback)
alpha     | float                              | Significance level (default 0.05)
verbose   | bool                               | Print routing info to stdout
api_key   | str                                | API key forwarded to backend constructor
model     | str                                | Model name forwarded to backend constructor

Returns a HypoResult object.

get_backend() — Backend Factory

b = hx.get_backend("groq", api_key="gsk_...")   # by string
b = hx.get_backend(hx.OllamaBackend(model="phi4"))  # pass instance
b = hx.get_backend(my_callable)                 # wrap a callable

routing = b.route("Do males earn more?", hx.build_schema(df))
print(routing.test, routing.group_column, routing.value_column)

HypoResult Object

result.test_name            # "Welch's t-test (unequal variances)"
result.statistic            # test statistic value
result.p_value              # p-value
result.effect_size          # Cohen's d / r / eta^2 / Cramer's V
result.effect_size_name     # "Cohen's d", "Pearson r", ...
result.confidence_interval  # (lower, upper)
result.degrees_of_freedom   # df
result.sample_sizes         # list of group sizes
result.is_significant       # bool — p_value < alpha
result.effect_magnitude     # 'small' | 'medium' | 'large'
result.interpretation       # plain-English interpretation string
result.alpha                # significance level used
result.alternative          # 'two-sided' | 'greater' | 'less'
result.summary()            # formatted multi-line summary string
result.to_dict()            # dict representation

Direct Test Functions

t-tests

hx.ttest_1samp(data, mu=0, alpha=0.05, alternative='two-sided')
hx.ttest_2samp(group1, group2, alpha=0.05, alternative='two-sided', equal_var=True)
hx.ttest_paired(before, after, alpha=0.05, alternative='two-sided')
hx.welch_ttest(group1, group2, alpha=0.05, alternative='two-sided')
hx.anova_1way(*groups, alpha=0.05)

Non-parametric

hx.mannwhitney(group1, group2, alpha=0.05, alternative='two-sided')
hx.wilcoxon(x, y=None, mu=0, alpha=0.05, alternative='two-sided')
hx.kruskal(*groups, alpha=0.05)

Categorical

hx.chi2_test(observed, alpha=0.05)                          # 2-D table or 1-D GoF
hx.fisher_exact(table, alpha=0.05, alternative='two-sided') # 2x2 only

Correlation

hx.pearson(x, y, alpha=0.05, alternative='two-sided')
hx.spearman(x, y, alpha=0.05, alternative='two-sided')
hx.pointbiserial(continuous, binary, alpha=0.05)

Backend Classes (for plug-in use)

from hypotestx import (
    LLMBackend,          # Abstract base — subclass to create your own
    CallableBackend,     # Wraps any callable(messages) -> str
    FallbackBackend,     # Built-in regex router (default)
    OllamaBackend,       # Local Ollama
    OpenAICompatBackend, # OpenAI / Groq / Together / Mistral / Azure
    GeminiBackend,       # Google Gemini
    HuggingFaceBackend,  # HuggingFace Inference API or local transformers
)

Creating a custom backend:

class MyBackend(hx.LLMBackend):
    name = "my_backend"

    def chat(self, messages: list[dict]) -> str:
        """
        messages: [{"role": "system", "content": ...},
                   {"role": "user",   "content": ...}]
        Return a JSON string matching the RoutingResult schema.
        """
        prompt = messages[-1]["content"]
        return call_my_llm_api(prompt)   # must return JSON string

result = hx.analyze(df, "Is salary different by gender?",
                    backend=MyBackend())

🎨 Visualization

Basic Plots

# Automatic visualization based on test type
result = hx.analyze(df, "Compare groups A and B")
result.plot()  # generates an appropriate plot (box plot, histogram, etc.)

Custom Visualizations

# Distribution comparison
hx.plot_distributions(group1, group2,
                      labels=['Group A', 'Group B'],
                      title='Distribution Comparison')

# Effect size visualization
hx.plot_effect_size(result,
                    context='psychological research')

# Assumption diagnostics
hx.plot_assumptions(data, test_type='ttest')

Publication-Ready Output

# APA-style statistical reporting
hx.generate_apa_report(results,
                       filename='statistical_analysis.pdf')

# Custom report generation
hx.generate_report(results,
                   template='academic',
                   format='html',
                   include_plots=True)

Architecture

Design Philosophy

  • Zero mandatory dependencies — pure Python stdlib for math and HTTP
  • Plug-in LLMs — swap backends without changing test logic
  • Modular — each component works independently
  • Transparent — read the source to see exactly how every test works

Package Layout

hypotestx/
├── core/
│   ├── engine.py          # analyze() dispatcher
│   ├── result.py          # HypoResult dataclass
│   ├── parser.py          # Legacy regex NL parser
│   ├── assumptions.py     # Shapiro-Wilk, Levene, Bartlett, ...
│   └── llm/               # LLM sub-package
│       ├── base.py        # LLMBackend ABC, RoutingResult, SchemaInfo
│       ├── prompts.py     # System prompt, schema builder, user prompt
│       └── backends/
│           ├── fallback.py       # Regex router (default, zero deps)
│           ├── ollama.py         # Local Ollama
│           ├── openai_compat.py  # OpenAI / Groq / Together / Mistral
│           ├── gemini.py         # Google Gemini
│           └── huggingface.py    # HF Inference API + local transformers
├── math/           # Pure Python: distributions, statistics, linear algebra
├── tests/          # Statistical test implementations
├── stats/          # Descriptive stats, bootstrap, inference
├── power/          # Power analysis and sample size
├── reporting/      # APA reports, formatters
└── utils/          # Data utilities and validation

How analyze() Works

analyze(df, question, backend)  ←  user calls this
    │
    ├─ build_schema(df)         → SchemaInfo(columns, dtypes, numerics, categoricals)
    │
    ├─ backend.route(question, schema)
    │       │
    │       ├─ FallbackBackend  → regex pattern matching (instant, offline)
    │       ├─ GeminiBackend    → Gemini REST API (JSON response)
    │       ├─ OllamaBackend    → local HTTP to Ollama server
    │       └─ (any LLMBackend) → JSON parsed into RoutingResult
    │
    └─ _dispatch(routing, df)   → extracts columns, calls test function
            │
            └─ HypoResult       ← returned to caller

Mathematical Implementation

All statistical computations are implemented from scratch using:

  • Newton's method for square roots and optimization
  • Taylor series for transcendental functions
  • Lanczos approximation for gamma function
  • Continued fractions for special functions
  • Numerical integration for distribution functions
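
For example, a square root via Newton's method — the same iteration g ← (g + x/g)/2 the first bullet refers to, shown as a standalone sketch:

```python
def newton_sqrt(x, tol=1e-12):
    """Square root of x >= 0 by Newton's method."""
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0   # any positive starting guess converges
    while abs(guess * guess - x) > tol * x:
        guess = (guess + x / guess) / 2
    return guess

print(newton_sqrt(2))  # ≈ 1.41421356...
```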

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

git clone https://github.com/Ankit-Anand123/HypoTestX.git
cd HypoTestX

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate     # Windows

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

Contribution Areas

  • 🧪 New statistical tests: Implement additional tests
  • 🗣️ NLP improvements: Enhance natural language understanding
  • 📊 Visualizations: Add new plotting capabilities
  • 🎓 Educational content: Improve explanations and tutorials
  • 🏥 Domain packages: Specialized tests for specific fields
  • 🌍 Internationalization: Support for other languages

Code Style

  • Follow PEP 8
  • Type hints required for all public functions
  • Comprehensive docstrings with examples
  • 95%+ test coverage for new code

📖 Documentation

Full Documentation

Documentation site is not yet available. In the meantime, refer to this README, the inline docstrings, and the example notebooks below.


📊 Performance

Benchmarks

# Performance comparison with other libraries
import hypotestx as hx
import scipy.stats as stats
import time

# HypoTestX (pure Python)
start = time.time()
result_hx = hx.ttest_2samp(group1, group2)
time_hx = time.time() - start

# SciPy (compiled)
start = time.time()
result_scipy = stats.ttest_ind(group1, group2)
time_scipy = time.time() - start

print(f"HypoTestX: {time_hx:.4f}s")
print(f"SciPy:     {time_scipy:.4f}s")
print(f"Results match: {abs(result_hx.p_value - result_scipy.pvalue) < 1e-10}")

Typical performance:

  • Small datasets (n < 1000): Comparable to SciPy
  • Large datasets (n > 10000): 2-3x slower than compiled libraries
  • Trade-off: Transparency and educational value vs. raw speed

Roadmap

Version 0.1.0 (Released)

  • Complete parametric test suite (t-tests, ANOVA)
  • Non-parametric tests (Mann-Whitney, Wilcoxon, Kruskal-Wallis)
  • Categorical tests (Chi-square, Fisher's exact)
  • Correlation tests (Pearson, Spearman, point-biserial)
  • Pure Python math core (distributions, special functions)
  • Assumption checking (Shapiro-Wilk, Levene, Bartlett, Jarque-Bera)
  • Power analysis and sample size calculation
  • Bootstrap and permutation tests
  • APA-style reporting
  • LLM-powered analyze() interface with plug-in backend system
    • Built-in regex fallback (zero deps, offline)
    • Ollama backend (local, free)
    • Gemini backend (free tier)
    • Groq / OpenAI / Together / Mistral / Azure backends
    • HuggingFace Inference API + local transformers
    • Custom backend API (LLMBackend subclass or callable)

Version 0.2.0 (Planned)

  • Two-way ANOVA and repeated-measures ANOVA
  • Regression-based tests (linear, logistic)
  • Automatic assumption-driven test selection
  • Streaming LLM responses for verbose mode
  • analyze() result explains why a test was chosen

Version 0.3.0 (Planned)

  • Bayesian alternatives (Bayesian t-test, Bayes factor)
  • Time series stationarity and change-point tests
  • Meta-analysis tools
  • Interactive Jupyter widgets for results

Version 1.0.0 (Released)

  • Domain-specific packages (clinical, A/B testing, finance)
  • Publication-ready PDF/HTML reporting
  • LLM-powered analyze() interface with plug-in backend system
  • Full test suite (483 tests passing)

📄 License

HypoTestX is released under the MIT License.

MIT License

Copyright (c) 2024 HypoTestX Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

🙏 Acknowledgments

Inspiration

  • R's elegant statistical interface
  • spaCy's intuitive NLP design
  • pandas' data manipulation philosophy
  • scikit-learn's consistent API design

Dependencies

The mathematical core and all LLM HTTP calls are pure Python stdlib. Optional extras that unlock additional functionality:

  • Ollama desktop app — for the local OllamaBackend
  • transformers + torch — for HuggingFaceBackend local inference mode
  • matplotlib — for visualization helpers

📈 Citation

If you use HypoTestX in your research, please cite:

@software{hypotestx2025,
  author = {Ankit},
  title = {HypoTestX: Natural Language Hypothesis Testing for Python},
  url = {https://github.com/Ankit-Anand123/HypoTestX},
  version = {1.0.0},
  year = {2025}
}

Made with ❤️ for the data science community


