pandas drop-in replacement: JSON serialization, DataFrame pipeline, diff tracking, column validation, streaming export, caching, FastAPI/Flask/Django integration

These details have not been verified by PyPI

Project links

Project description

pandasv2

Full pandas replacement with built-in JSON serialization and web framework integration.

pip install pandasv2 — and never worry about pandas web pain points again.

import pandasv2 as pd          # drop-in replacement for pandas

Why pandasv2?

Problem	pandas	pandasv2
JSON serialize a DataFrame	`TypeError: Object of type int64 is not JSON serializable`	`df.to_json_safe()` — just works
Return a DataFrame from FastAPI/Flask	Multiple conversion steps	`return df.to_web()`
Reconstruct DataFrame from JSON	Manual dtype restoration	`DataFrame.from_json_safe(json_str)`
Web framework setup	Configure encoder per framework	`setup_json_encoder(app)`
Same API as pandas	—	100% compatible — all methods identical

Installation

pip install pandasv2

Requirements: Python 3.8+, NumPy >= 1.20, pandas >= 2.0

Quick Start

import pandasv2 as pd   # same as: import pandas as pd

# Create
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                   'score': [95.5, 87.0, 92.3],
                   'grade': ['A', 'B', 'A']})

# All standard pandas operations work
print(df.groupby('grade')['score'].mean())
print(df.sort_values('score', ascending=False))
print(df.describe())

# Plus web extras (new in pandasv2)
json_str = df.to_json_safe()                       # lossless JSON, no TypeError ever
web_data = df.to_web()                             # plain list of dicts for HTTP responses
df2      = pd.DataFrame.from_json_safe(json_str)   # round-trip

Feature Coverage

DataFrame

All standard pandas.DataFrame methods are available unchanged.

df = pd.DataFrame(data)
df = pd.DataFrame.from_records(records)
df = pd.DataFrame.from_dict(d)

# Selection & filtering
df['col']
df[['col1', 'col2']]
df.loc[rows, cols]
df.iloc[0:5]
df.at[idx, col]
df.iat[0, 1]
df.query("score > 90")
df.filter(like='score')
df.select_dtypes(include='number')
df.isin({'grade': ['A', 'B']})

# Sorting
df.sort_values('score', ascending=False)
df.sort_index()

# Shape / info
df.shape, df.dtypes, df.columns, df.index
df.info(), df.describe()
df.head(10), df.tail(10), df.sample(5)

# Modification
df.assign(pass_=df['score'] >= 60)
df.rename(columns={'score': 'points'})
df.drop(columns=['grade'])
df.reset_index(), df.set_index('name')
df.astype({'score': 'float32'})
df.fillna(0), df.dropna()
df.replace({'grade': {'A': 'Excellent'}})

# Math
df.sum(), df.mean(), df.median(), df.std(), df.var()
df.min(), df.max(), df.count(), df.nunique()
df.cumsum(), df.cumprod(), df.cummax(), df.cummin()
df.diff(), df.pct_change(), df.shift(1)
df.abs(), df.round(2)
df.corr(), df.cov()

# Apply / transform
df.apply(func, axis=1)
df.applymap(func)
df.pipe(func)
df.eval("total = score * 1.1")

# Merge / join
df.merge(other, on='key', how='left')
df.join(other, how='outer')

# Reshape
df.pivot(index='date', columns='category', values='value')
df.pivot_table(values='score', index='grade', aggfunc='mean')
df.melt(id_vars=['name'], value_vars=['score'])
df.stack(), df.unstack()
df.transpose()
df.explode('list_col')

# Window
df.rolling(7).mean()
df.expanding().sum()
df.ewm(span=5).mean()

# GroupBy
df.groupby('grade').sum()
df.groupby('grade').agg({'score': ['mean', 'std']})
df.groupby('grade').transform('mean')
df.groupby('grade').apply(custom_func)
df.groupby('grade').filter(lambda g: len(g) > 1)

# I/O
df.to_csv('out.csv')
df.to_excel('out.xlsx')
df.to_json('out.json')
df.to_parquet('out.parquet')
df.to_sql('table', con)
df.to_html(), df.to_latex(), df.to_markdown()
df.to_clipboard(), df.to_pickle('out.pkl')
df.to_string()

# pandasv2 extras
df.to_json_safe()               # strict JSON for web APIs (NumPy scalars, Timestamp/Timedelta, NaN/NaT/NA, +/-inf)
df.to_web(orient='records')     # plain dict/list for HTTP responses
pd.DataFrame.from_json_safe(s)  # round-trip reconstruction

Series

s = pd.Series([1, 2, 3], name='score')

# Math
s.sum(), s.mean(), s.median(), s.std(), s.var()
s.min(), s.max(), s.count(), s.nunique()
s.cumsum(), s.cummax(), s.cummin(), s.cumprod()
s.abs(), s.round(2), s.clip(0, 100)
s.diff(), s.pct_change(), s.shift(1)

# Selection
s.isin([1, 3])
s.between(1, 2)
s.where(s > 1)

# Transformation
s.apply(func)
s.map({'a': 1, 'b': 2})
s.replace(1, 99)
s.fillna(0), s.dropna()
s.astype('float32')
s.sort_values(), s.sort_index()

# Info
s.value_counts()
s.unique()
s.describe()

# Accessors (same as pandas)
s.str.upper()                   # .str accessor
s.dt.year                       # .dt accessor
s.cat.codes                     # .cat accessor

# pandasv2 extras
s.to_json_safe()
s.to_web()

I/O Functions

# CSV
df = pd.read_csv('data.csv')
df = pd.read_csv('data.csv', sep=';', usecols=['a','b'], dtype={'a': 'int32'})
df.to_csv('out.csv', index=False)

# Excel
df = pd.read_excel('data.xlsx')
df = pd.read_excel('data.xlsx', sheet_name='Sheet2', skiprows=2)
df.to_excel('out.xlsx', sheet_name='Results', index=False)

# JSON
df = pd.read_json('data.json')
df = pd.read_json('data.json', orient='records', lines=True)
df.to_json('out.json', orient='records', lines=True)

# Parquet
df = pd.read_parquet('data.parquet')
df.to_parquet('out.parquet', compression='snappy')

# Feather
df = pd.read_feather('data.feather')
df.to_feather('out.feather')

# SQL
df = pd.read_sql('SELECT * FROM users WHERE active=1', con)
df = pd.read_sql_table('users', con, schema='public')
df = pd.read_sql_query('SELECT id, name FROM users', con)
df.to_sql('results', con, if_exists='replace', index=False)

# HTML
dfs = pd.read_html('https://example.com/table.html')

# XML
df = pd.read_xml('data.xml', xpath='.//row')

# Clipboard
df = pd.read_clipboard()
df.to_clipboard()

# Other
df = pd.read_pickle('data.pkl')
df = pd.read_fwf('data.fwf', widths=[10, 20, 15])
df = pd.read_table('data.tsv')
df = pd.read_hdf('data.h5', key='df')

Merge / Reshape

# Merge (like SQL JOIN)
pd.merge(left, right, on='key')
pd.merge(left, right, on='key', how='left')
pd.merge(left, right, left_on='id', right_on='user_id', suffixes=('_l','_r'))
pd.merge_asof(trades, quotes, on='time', by='ticker')
pd.merge_ordered(df1, df2, on='date', fill_method='ffill')

# Concat
pd.concat([df1, df2], ignore_index=True)          # vertical
pd.concat([df1, df2], axis=1)                      # horizontal

# Pivot / Unpivot
pd.pivot(df, index='date', columns='variable', values='value')
pd.pivot_table(df, values='sales', index='region',
               columns='product', aggfunc='sum', margins=True)
pd.melt(df, id_vars=['id'], value_vars=['jan','feb','mar'],
        var_name='month', value_name='sales')
pd.wide_to_long(df, stubnames=['A','B'], i='id', j='year')

# Cross-tabulation
pd.crosstab(df['gender'], df['age_group'], margins=True, normalize='index')

# Encoding
pd.get_dummies(df, columns=['color', 'size'], drop_first=True)
pd.cut(df['age'], bins=[0,18,35,60,100], labels=['child','young','adult','senior'])
pd.qcut(df['score'], q=4, labels=['Q1','Q2','Q3','Q4'])

# Misc
pd.factorize(df['category'])
pd.unique(df['col'])
pd.value_counts(df['col'])
pd.isna(df), pd.notna(df)

GroupBy

g = df.groupby('category')
g = df.groupby(['region', 'product'])       # multi-key
g = df.groupby(pd.Grouper(freq='ME'))       # time-based

# Aggregations
g.sum(), g.mean(), g.median()
g.min(), g.max(), g.count(), g.size()
g.std(), g.var(), g.sem()
g.prod(), g.first(), g.last()
g.nunique(), g.describe()

# Multiple aggregations
g.agg('sum')
g.agg({'score': ['mean', 'std', 'min', 'max'], 'sales': 'sum'})
g.agg(total=('sales', 'sum'), avg=('score', 'mean'))

# Transform (same shape as input)
g['score'].transform('mean')
g['score'].transform(lambda x: (x - x.mean()) / x.std())

# Apply (custom function per group)
g.apply(lambda x: x.nlargest(3, 'score'))

# Filter (drop groups not matching condition)
g.filter(lambda x: x['score'].mean() > 80)

# Cumulative
g.cumsum(), g.cummax(), g.cummin(), g.cumcount()
g.rank(method='dense')
g.shift(1), g.diff(), g.pct_change()

# Window on groups
g.rolling(3).mean()
g.expanding().sum()
g.ewm(span=5).mean()

Date / Time

# Create date ranges
pd.date_range('2024-01-01', periods=12, freq='ME')
pd.date_range('2024-01-01', '2024-12-31', freq='W')
pd.bdate_range('2024-01-01', periods=10)
pd.period_range('2024Q1', periods=4, freq='Q')
pd.timedelta_range('0 days', periods=5, freq='1D')

# Parse / convert
pd.to_datetime(df['date_str'], format='%Y-%m-%d')
pd.to_timedelta(df['duration_str'])
pd.to_numeric(df['col'], errors='coerce')

# .dt accessor on Series
s.dt.year, s.dt.month, s.dt.day
s.dt.hour, s.dt.minute, s.dt.second
s.dt.dayofweek, s.dt.day_of_year, s.dt.quarter
s.dt.is_month_start, s.dt.is_year_end
s.dt.strftime('%Y-%m-%d')
s.dt.tz_localize('UTC'), s.dt.tz_convert('US/Eastern')
s.dt.floor('h'), s.dt.ceil('D'), s.dt.round('min')
s.dt.month_name(), s.dt.day_name()
s.dt.isocalendar()
s.dt.total_seconds()              # timedelta

String Accessor (.str)

s = pd.Series(['Hello World', 'foo bar', None])

# Case
s.str.upper(), s.str.lower(), s.str.title(), s.str.capitalize()

# Search
s.str.contains('World', case=False, na=False)
s.str.startswith('foo'), s.str.endswith('bar')
s.str.match(r'^\w+')
s.str.findall(r'\w+')
s.str.count(r'\w+')

# Extract
s.str.extract(r'(\w+)\s(\w+)')      # returns DataFrame
s.str.extractall(r'(\d+)')

# Replace / strip
s.str.replace('World', 'Python', regex=False)
s.str.strip(), s.str.lstrip(), s.str.rstrip()

# Split / join
s.str.split(' ', expand=True)        # returns DataFrame
s.str.join('-')

# Padding / width
s.str.pad(20, side='right')
s.str.center(20, fillchar='*')
s.str.zfill(10)

# Type checks
s.str.isalpha(), s.str.isnumeric(), s.str.isdigit()

# Length / slice
s.str.len()
s.str.slice(0, 5)

Categorical (.cat)

s = pd.Series(pd.Categorical(['A','B','A','C'], categories=['A','B','C'], ordered=True))

s.cat.categories
s.cat.codes
s.cat.ordered
s.cat.rename_categories({'A': 'Alpha'})
s.cat.add_categories(['D'])
s.cat.remove_categories(['C'])
s.cat.remove_unused_categories()
s.cat.set_categories(['A','B'], ordered=True)
s.cat.as_ordered(), s.cat.as_unordered()

Index Types

pd.Index([1, 2, 3])
pd.RangeIndex(start=0, stop=100, step=2)
pd.MultiIndex.from_tuples([('a', 1), ('b', 2)])
pd.MultiIndex.from_product([['A','B'], [1,2,3]])
pd.MultiIndex.from_frame(df[['region','product']])
pd.DatetimeIndex(pd.date_range('2024-01-01', periods=5))
pd.PeriodIndex(pd.period_range('2024Q1', periods=4, freq='Q'))
pd.CategoricalIndex(['a','b','a','c'])
pd.IntervalIndex.from_breaks([0, 1, 2, 3])
pd.TimedeltaIndex(pd.timedelta_range('0 days', periods=5))

Data Types

# Nullable integer types (handle NaN without float conversion)
pd.Int8Dtype(), pd.Int16Dtype(), pd.Int32Dtype(), pd.Int64Dtype()
pd.UInt8Dtype(), pd.UInt16Dtype(), pd.UInt32Dtype(), pd.UInt64Dtype()

# Nullable float
pd.Float32Dtype(), pd.Float64Dtype()

# Nullable boolean and string
pd.BooleanDtype(), pd.StringDtype()

# Categorical
pd.CategoricalDtype(categories=['A','B','C'], ordered=True)

# Datetime with timezone
pd.DatetimeTZDtype(tz='UTC')

# Period, Interval, Sparse
pd.PeriodDtype(freq='M')
pd.IntervalDtype(subtype='float64')
pd.SparseDtype('float64', fill_value=0)

Scalar Types

pd.Timestamp('2024-01-15 10:30:00', tz='UTC')
pd.Timestamp.now()
pd.Timedelta('3 days 4 hours')
pd.Period('2024Q1', freq='Q')
pd.Interval(left=0, right=5, closed='right')
pd.NaT          # Not a Time (missing datetime)
pd.NA           # Nullable NA

Options

pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 50)
pd.set_option('display.float_format', '{:.2f}'.format)
pd.get_option('display.max_rows')
pd.reset_option('display.max_rows')
pd.describe_option('display')

with pd.option_context('display.max_rows', 5, 'display.max_columns', 3):
    print(df)

JSON Utilities

# json_normalize: flatten nested JSON/dicts
data = [{'id': 1, 'address': {'city': 'NY', 'zip': '10001'}},
        {'id': 2, 'address': {'city': 'LA', 'zip': '90001'}}]
df = pd.json_normalize(data)
# Result: id | address.city | address.zip

# Nested record_path
orders = [{'id':1, 'items':[{'sku':'A','qty':2},{'sku':'B','qty':1}]}]
df = pd.json_normalize(orders, record_path='items', meta=['id'])

pandas vs pandasv2 — Side-by-Side

JSON Serialization

# ── pandas ─────────────────────────────────────────────────────────────────
import pandas as pd
import json

df = pd.read_csv('data.csv')
result = df.groupby('category')['sales'].agg(['sum','mean'])

# Fails!
json.dumps(result.to_dict())   # TypeError: Object of type int64 is not JSON serializable

# Manual workaround (messy, fragile, easy to forget edge cases)
import numpy as np
def convert(o):
    if isinstance(o, (np.integer, np.floating)):
        return o.item()
    raise TypeError
json.dumps(result.to_dict(), default=convert)

# ── pandasv2 ────────────────────────────────────────────────────────────────
import pandasv2 as pd

df = pd.read_csv('data.csv')
result = df.groupby('category')['sales'].agg(['sum','mean'])
json_str = result.to_json_safe()   # done — zero boilerplate

FastAPI Integration

# ── pandas + FastAPI ────────────────────────────────────────────────────────
from fastapi import FastAPI
from fastapi.responses import JSONResponse
import pandas as pd
import numpy as np, json

app = FastAPI()

@app.get('/data')
def get():
    df = pd.read_csv('data.csv')
    # Have to manually fix every numpy type
    data = json.loads(df.to_json(orient='records'))
    return JSONResponse(data)

# ── pandasv2 + FastAPI ──────────────────────────────────────────────────────
from fastapi import FastAPI
import pandasv2 as pd

app = FastAPI()

@app.get('/data')
def get():
    df = pd.read_csv('data.csv')
    return df.to_web()             # works out of the box

Type Round-Trip

# ── pandas ─────────────────────────────────────────────────────────────────
import pandas as pd
import json

df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=3),
    'value': pd.array([1, 2, 3], dtype='Int64'),  # nullable int
})

# to_json loses type info
s = df.to_json()
df2 = pd.read_json(s)    # dates become int64 timestamps, nullable int becomes float

# ── pandasv2 ────────────────────────────────────────────────────────────────
import pandasv2 as pd

df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=3),
    'value': pd.array([1, 2, 3], dtype='Int64'),
})

# Perfect round-trip
payload = pd.serialize(df)       # includes dtype metadata
df2     = pd.deserialize(payload)
assert df2['date'].dtype == df['date'].dtype   # passes

Web Framework Integration (pandasv2 extras)

FastAPI

from fastapi import FastAPI
import pandasv2 as pd

app = FastAPI()
pd.setup_json_encoder(app)     # one-line setup

@app.get('/users')
def get_users():
    df = pd.read_csv('users.csv')
    return df.to_web()         # returns plain list of dicts

@app.get('/users/json')
def get_users_json():
    df = pd.read_csv('users.csv')
    return pd.FastAPIResponse(df)

Flask

from flask import Flask
import pandasv2 as pd

app = Flask(__name__)
pd.setup_json_encoder(app)

@app.route('/data')
def data():
    df = pd.read_csv('data.csv')
    return pd.FlaskResponse(df).get_response()

Django

# views.py
import pandasv2 as pd

def data_view(request):
    df = pd.read_csv('data.csv')
    return pd.DjangoResponse(df, safe=False).get_response()

Custom encoder (any framework)

import json
import pandasv2 as pd

df = pd.DataFrame({'a': [1, 2], 'ts': pd.date_range('2024-01-01', periods=2)})

# Works with any JSON serializer
json_str = json.dumps(df.to_dict('records'), cls=pd.JSONEncoder)

# Or use the helper directly
json_str = pd.to_json(df)
df_back  = pd.from_json(json_str)

# Lossless round-trip with metadata
payload  = pd.serialize(df)    # dict with type info + dtypes
df_back  = pd.deserialize(payload)

Performance

Operation	pandas (manual)	pandasv2
Serialize 10k-row DataFrame to JSON	~120 ms (with workarounds)	~30 ms
Round-trip (serialize + deserialize)	~250 ms	~70 ms
Batch convert 100 DataFrames	~12 s	~3 s

pandasv2 is 3-5x faster than manual conversion because it avoids redundant object traversals.

Complete API Reference

Category	Functions / Classes
Core types	`DataFrame`, `Series`
I/O (read)	`read_csv`, `read_excel`, `read_json`, `read_parquet`, `read_feather`, `read_orc`, `read_sql`, `read_sql_query`, `read_sql_table`, `read_html`, `read_xml`, `read_hdf`, `read_pickle`, `read_clipboard`, `read_fwf`, `read_table`, `read_spss`, `read_sas`
Merge/Concat	`merge`, `merge_asof`, `merge_ordered`, `concat`
Reshape	`pivot`, `pivot_table`, `melt`, `wide_to_long`, `crosstab`
Encoding	`get_dummies`, `from_dummies`, `cut`, `qcut`
Utilities	`factorize`, `unique`, `value_counts`, `isna`, `notna`, `eval`, `json_normalize`
Datetime	`date_range`, `bdate_range`, `period_range`, `timedelta_range`, `interval_range`, `to_datetime`, `to_timedelta`, `to_numeric`
Index types	`Index`, `RangeIndex`, `MultiIndex`, `DatetimeIndex`, `TimedeltaIndex`, `PeriodIndex`, `CategoricalIndex`, `IntervalIndex`
Scalars	`Timestamp`, `Timedelta`, `Period`, `Interval`, `NaT`, `NA`
Dtypes	`CategoricalDtype`, `DatetimeTZDtype`, `IntervalDtype`, `PeriodDtype`, `SparseDtype`, `Int8–Int64Dtype`, `UInt8–UInt64Dtype`, `Float32–Float64Dtype`, `BooleanDtype`, `StringDtype`
Grouper	`Grouper`, `NamedAgg`
Options	`set_option`, `get_option`, `reset_option`, `describe_option`, `option_context`
Web extras	`to_json_safe()`, `to_web()`, `from_json_safe()`, `JSONEncoder`, `JSONDecoder`, `to_json`, `from_json`, `serialize`, `deserialize`
Framework helpers	`FastAPIResponse`, `FlaskResponse`, `DjangoResponse`, `setup_json_encoder`, `create_response_handler`
Testing	`testing.assert_frame_equal`, `testing.assert_series_equal`

Contributing

Contributions are welcome! Here's how to get started:

Fork the repository on GitHub
Create a feature branch: git checkout -b feature/your-feature
Make your changes and add tests
Run the test suite: pytest tests/ -v
Submit a pull request

Please open an issue first for major changes to discuss the approach.

Author

Mahesh Makvana — GitHub · PyPI

MIT License

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.4.4

May 17, 2026

2.4.3

May 17, 2026

2.4.2

May 16, 2026

2.4.1

May 16, 2026

2.4.0

May 16, 2026

This version

2.3.0

May 13, 2026

2.2.0

Apr 10, 2026

2.1.0

Apr 8, 2026

2.0.1

Apr 8, 2026

2.0.0

Apr 8, 2026

1.0.0

Apr 8, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandasv2-2.3.0.tar.gz (77.5 kB view details)

Uploaded May 13, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

pandasv2-2.3.0-py3-none-any.whl (78.0 kB view details)

Uploaded May 13, 2026 Python 3

File details

Details for the file pandasv2-2.3.0.tar.gz.

File metadata

Download URL: pandasv2-2.3.0.tar.gz
Upload date: May 13, 2026
Size: 77.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for pandasv2-2.3.0.tar.gz
Algorithm	Hash digest
SHA256	`86bcd4fb49886e3b380cde44a5e09cd891e58e5e0584e36d7ff245e4d123153c`
MD5	`3fdadedffa6358c65f4b5d10b82fe43e`
BLAKE2b-256	`4db4bd975b63af9b172878ec6e74687294ebf18b8630482bb3efffa7066f17d9`

See more details on using hashes here.

File details

Details for the file pandasv2-2.3.0-py3-none-any.whl.

File metadata

Download URL: pandasv2-2.3.0-py3-none-any.whl
Upload date: May 13, 2026
Size: 78.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for pandasv2-2.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a9918143192ca23b08b46436175f97c353d8bda537797985a4e399ade6b507d8`
MD5	`03419cc2da47947970e6acae187188a4`
BLAKE2b-256	`f7fb2e58a916c278f8e164e8f87add78fe3726135b8dd32161db2671b6154f2f`

See more details on using hashes here.

pandasv2 2.3.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

pandasv2

Why pandasv2?

Installation

Quick Start

Feature Coverage

DataFrame

Series

I/O Functions

Merge / Reshape

GroupBy

Date / Time

String Accessor (.str)

Categorical (.cat)

Index Types

Data Types

Scalar Types

Options

JSON Utilities

pandas vs pandasv2 — Side-by-Side

JSON Serialization

FastAPI Integration

Type Round-Trip

Web Framework Integration (pandasv2 extras)

FastAPI

Flask

Django

Custom encoder (any framework)

Performance

Complete API Reference

Contributing

Author

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes