🐻 DataFrame Library

Orso

Orso is a shared DataFrame library for Opteryx and Mabel.

Overview

Orso is not intended to compete with Polars or Pandas (or your favorite bear DataFrame technology); instead, it is developed as a common layer for Mabel and Opteryx.

Key Use Cases:

  • In Opteryx, Orso provides most of the database Cursor functionality
  • In Mabel, Orso provides the data schema and validation functionality

Orso DataFrames are row-based, a design driven by their initial target use cases: the write-ahead log (WAL) for Mabel and the Cursor for Opteryx. Each row in an Orso DataFrame can be quickly converted to a tuple of values, a dictionary, or a byte representation.
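
To make the row-based idea concrete, here is a minimal standard-library sketch of a row that converts to a tuple, a dictionary, or bytes. It is not Orso's implementation: the `SketchRow` class, its `with_columns` helper, and the JSON byte encoding are all assumptions chosen for illustration.

```python
import json
from typing import Any, Dict, Tuple


class SketchRow(tuple):
    """Hypothetical row type: a tuple of values with shared column names."""

    _columns: Tuple[str, ...] = ()

    @classmethod
    def with_columns(cls, columns):
        # Bind the column names to a subclass once, so each row stores only
        # its values rather than a copy of the keys.
        return type("BoundRow", (cls,), {"_columns": tuple(columns)})

    def as_tuple(self) -> Tuple[Any, ...]:
        return tuple(self)

    def as_dict(self) -> Dict[str, Any]:
        return dict(zip(self._columns, self))

    def to_bytes(self) -> bytes:
        # JSON is an assumption made for readability; it is not Orso's
        # byte format.
        return json.dumps(list(self)).encode("utf-8")


Row = SketchRow.with_columns(["name", "age", "city"])
row = Row(("Alice", 30, "New York"))

print(row.as_tuple())
print(row.as_dict())
print(row.to_bytes())
```

Keeping the column names on the class rather than on each row is one way a row-oriented structure can stay cheap to convert; whether Orso does the same is not stated here.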

Installation

Install Orso from PyPI:

pip install orso

Quick Start

Creating a DataFrame

import orso

# Create from list of dictionaries
df = orso.DataFrame([
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'San Francisco'},
    {'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
])

print(f"Created DataFrame with {df.rowcount} rows and {df.columncount} columns")

Displaying Data

# Display the DataFrame
print(df.display())

# Convert to different formats
arrow_table = df.arrow()  # PyArrow Table
pandas_df = df.pandas()   # Pandas DataFrame

Working with Schema

# Access column names
print("Columns:", df.column_names)

# Access schema information
print("Schema:", df.schema)

Converting Between Formats

# From PyArrow
import pyarrow as pa
arrow_table = pa.table({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
orso_df = orso.DataFrame.from_arrow(arrow_table)

# To Pandas
pandas_df = orso_df.pandas()

Features

  • Lightweight: Minimal overhead for tabular data operations
  • Row-based: Optimized for row-oriented operations
  • Interoperable: Easy conversion to/from PyArrow, Pandas
  • Schema-aware: Built-in data validation and type checking
  • Fast serialization: Efficient conversion to bytes, tuples, and dictionaries
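
As a rough illustration of what schema-aware validation involves, the sketch below checks a record against a list of column definitions using only the standard library. `ColumnSpec` and `validate` are hypothetical names invented for this example and do not describe Orso's schema classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ColumnSpec:
    """Hypothetical column definition: name, Python type, nullability."""
    name: str
    dtype: type
    nullable: bool = True


def validate(record: Dict[str, Any], schema: List[ColumnSpec]) -> List[str]:
    """Return a list of problems found; an empty list means the record is valid."""
    problems = []
    for column in schema:
        if column.name not in record:
            problems.append(f"missing column: {column.name}")
            continue
        value = record[column.name]
        if value is None:
            if not column.nullable:
                problems.append(f"null not allowed: {column.name}")
        elif not isinstance(value, column.dtype):
            problems.append(
                f"{column.name}: expected {column.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


schema = [
    ColumnSpec("name", str, nullable=False),
    ColumnSpec("age", int),
]

good = validate({"name": "Alice", "age": 30}, schema)
bad = validate({"name": None, "age": "thirty"}, schema)
print(good)
print(bad)
```

Returning a list of problems instead of raising on the first one is a design choice for this sketch; it lets a caller report every failing column in a single pass.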

API Reference

DataFrame Class

The main DataFrame class provides the following key methods:

  • DataFrame(dictionaries=None, *, rows=None, schema=None) - Constructor
  • display(limit=5, colorize=True, show_types=True) - Pretty print the DataFrame
  • arrow(size=None) - Convert to PyArrow Table
  • pandas(size=None) - Convert to Pandas DataFrame
  • from_arrow(tables) - Create DataFrame from PyArrow Table(s)
  • fetchall() - Get all rows as list of Row objects
  • collect() - Materialize the DataFrame
  • append(other) - Append another DataFrame
  • distinct() - Get unique rows
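
The name fetchall() follows the Python DB-API (PEP 249) cursor convention. The sketch below shows how that family of fetch methods typically consumes rows from a forward-only position. `SketchCursor` is a hypothetical stand-in written with the standard library only; it is not Orso's DataFrame, and the presence of `fetchone` and `fetchmany` on Orso itself is an assumption not confirmed by the list above.

```python
from typing import Iterable, List, Optional, Tuple


class SketchCursor:
    """Hypothetical forward-only cursor over a list of row tuples."""

    def __init__(self, rows: Iterable[Tuple], arraysize: int = 1):
        self._rows = list(rows)
        self._position = 0
        # Default batch size for fetchmany, as in PEP 249.
        self.arraysize = arraysize

    def fetchone(self) -> Optional[Tuple]:
        # Return the next row, or None once the rows are exhausted.
        if self._position >= len(self._rows):
            return None
        row = self._rows[self._position]
        self._position += 1
        return row

    def fetchmany(self, size: Optional[int] = None) -> List[Tuple]:
        # Return up to `size` rows; an empty list signals exhaustion.
        size = self.arraysize if size is None else size
        batch = self._rows[self._position:self._position + size]
        self._position += len(batch)
        return batch

    def fetchall(self) -> List[Tuple]:
        # Return every remaining row and move to the end.
        remaining = self._rows[self._position:]
        self._position = len(self._rows)
        return remaining


cursor = SketchCursor([("Alice", 30), ("Bob", 25), ("Charlie", 35)])
first = cursor.fetchone()
rest = cursor.fetchall()
print(first)
print(rest)
```

Because the cursor only moves forward, calling fetchall() after fetchone() returns the rows that have not been consumed yet, not the full set.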

Properties

  • rowcount - Number of rows
  • columncount - Number of columns
  • column_names - List of column names
  • schema - Schema information

Development

Building from Source

# Clone the repository
git clone https://github.com/mabel-dev/orso.git
cd orso

# Install dependencies
pip install -r requirements.txt
pip install -r tests/requirements.txt

# Build Cython extensions
make compile

# Run tests
make test

Contributing

Orso is part of the Mabel ecosystem. Contributions are welcome! Please ensure:

  1. All tests pass: make test
  2. Code follows the project style: make lint
  3. New features include appropriate tests
  4. Documentation is updated for API changes

Performance Benchmarking

Orso includes a comprehensive performance benchmark suite to compare different versions:

# Run full benchmark suite
python tests/test_benchmark_suite.py

# Compare two versions
python tests/test_benchmark_suite.py -o baseline.json
# <switch version>
python tests/test_benchmark_suite.py -o current.json -c baseline.json

See BENCHMARK_SUITE.md for detailed documentation.

License

Orso is licensed under Apache 2.0 unless explicitly indicated otherwise.

Status

Orso is in beta. Beta means different things to different people; to us, being beta means:

  • Interfaces are generally stable but may still have breaking changes
  • Unit tests are not reliable enough to capture breaks to functionality
  • Bugs are likely to exist in edge cases
  • Code may not be tuned for performance

As such, we really don't recommend using Orso in critical applications.

Related Projects

  • Opteryx - SQL query engine for data files
  • Mabel - Data processing framework
