🐻 DataFrame Library

Orso

Orso is a shared DataFrame library for Opteryx and Mabel.

Overview

Orso is not intended to compete with Polars or Pandas (or your favorite bear DataFrame technology). Instead, it is developed as a common layer for Mabel and Opteryx.

Key Use Cases:

  • In Opteryx, Orso provides most of the database Cursor functionality
  • In Mabel, Orso provides the data schema and validation functionality

Orso DataFrames are row-based, a design driven by their initial target use cases: the write-ahead log (WAL) for Mabel and the Cursor for Opteryx. Each row in an Orso DataFrame can be quickly converted to a tuple of values, a dictionary, or a byte representation.
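
To make the row-based model concrete, here is a minimal sketch in plain Python of a row that can be viewed as a tuple, a dictionary, or bytes. It is not Orso's implementation: the names `SketchRow`, `for_columns`, `as_dict`, and `as_bytes` are hypothetical, and JSON is used as the byte encoding purely for illustration.

```python
import json
from typing import Any, Dict, Iterable, Tuple


class SketchRow(tuple):
    """Hypothetical row type: a tuple of values plus shared column names."""

    _columns: Tuple[str, ...] = ()

    @classmethod
    def for_columns(cls, columns: Iterable[str]) -> type:
        # Bind the column names once, so every row of the same
        # relation shares them instead of carrying its own copy.
        return type("BoundSketchRow", (cls,), {"_columns": tuple(columns)})

    def as_dict(self) -> Dict[str, Any]:
        # Pair each value with its column name.
        return dict(zip(self._columns, self))

    def as_bytes(self) -> bytes:
        # JSON is an assumption made for this sketch only.
        return json.dumps(list(self)).encode("utf-8")


Row = SketchRow.for_columns(["name", "age", "city"])
row = Row(("Alice", 30, "New York"))

print(tuple(row))
print(row.as_dict())
print(row.as_bytes())
```

Because the row is itself a tuple, the tuple view costs nothing, and the dictionary and byte views are derived only when asked for. That is the general appeal of a row-oriented layout for cursor-style and log-style workloads.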

Installation

Install Orso from PyPI:

pip install orso

Quick Start

Creating a DataFrame

import orso

# Create from list of dictionaries
df = orso.DataFrame([
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'San Francisco'},
    {'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
])

print(f"Created DataFrame with {df.rowcount} rows and {df.columncount} columns")

Displaying Data

# Display the DataFrame
print(df.display())

# Convert to different formats
arrow_table = df.arrow()  # PyArrow Table
pandas_df = df.pandas()   # Pandas DataFrame

Working with Schema

# Access column names
print("Columns:", df.column_names)

# Access schema information  
print("Schema:", df.schema)

Converting Between Formats

# From PyArrow
import pyarrow as pa
arrow_table = pa.table({'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
orso_df = orso.DataFrame.from_arrow(arrow_table)

# To Pandas
pandas_df = orso_df.pandas()

Features

  • Lightweight: Minimal overhead for tabular data operations
  • Row-based: Optimized for row-oriented operations
  • Interoperable: Easy conversion to/from PyArrow, Pandas
  • Schema-aware: Built-in data validation and type checking
  • Fast serialization: Efficient conversion to bytes, tuples, and dictionaries
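
As a rough illustration of what schema-aware validation means, the sketch below checks dictionary records against a mapping of column names to expected Python types. This is plain Python and does not use Orso's schema classes. The `SCHEMA` mapping and the `validate` function are hypothetical names introduced for this example.

```python
from typing import Any, Dict, List

# Hypothetical schema: column name -> expected Python type.
SCHEMA: Dict[str, type] = {"name": str, "age": int, "city": str}


def validate(record: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for column, expected in schema.items():
        if column not in record:
            problems.append(f"missing column: {column}")
        elif not isinstance(record[column], expected):
            actual = type(record[column]).__name__
            problems.append(f"{column}: expected {expected.__name__}, got {actual}")
    # Columns that the schema does not name are reported as well.
    for column in record.keys() - schema.keys():
        problems.append(f"unexpected column: {column}")
    return problems


good = validate({"name": "Alice", "age": 30, "city": "New York"}, SCHEMA)
bad = validate({"name": "Bob", "age": "25"}, SCHEMA)

print(good)
print(bad)
```

Returning a list of problems rather than raising on the first one is a design choice for this sketch: it lets a caller report every issue in a record at once.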

API Reference

DataFrame Class

The main DataFrame class provides the following key methods:

  • DataFrame(dictionaries=None, *, rows=None, schema=None) - Constructor
  • display(limit=5, colorize=True, show_types=True) - Pretty print the DataFrame
  • arrow(size=None) - Convert to PyArrow Table
  • pandas(size=None) - Convert to Pandas DataFrame
  • from_arrow(tables) - Create DataFrame from PyArrow Table(s)
  • fetchall() - Get all rows as list of Row objects
  • collect() - Materialize the DataFrame
  • append(other) - Append another DataFrame
  • distinct() - Get unique rows

Properties

  • rowcount - Number of rows
  • columncount - Number of columns
  • column_names - List of column names
  • schema - Schema information

Development

Building from Source

# Clone the repository
git clone https://github.com/mabel-dev/orso.git
cd orso

# Install dependencies
pip install -r requirements.txt
pip install -r tests/requirements.txt

# Build Cython extensions
make compile

# Run tests
make test

Contributing

Orso is part of the Mabel ecosystem. Contributions are welcome! Please ensure:

  1. All tests pass: make test
  2. Code follows the project style: make lint
  3. New features include appropriate tests
  4. Documentation is updated for API changes

Performance Benchmarking

Orso includes a comprehensive performance benchmark suite to compare different versions:

# Run full benchmark suite
python tests/test_benchmark_suite.py

# Compare two versions
python tests/test_benchmark_suite.py -o baseline.json
# <switch version>
python tests/test_benchmark_suite.py -o current.json -c baseline.json

See BENCHMARK_SUITE.md for detailed documentation.

License

Orso is licensed under Apache 2.0 unless explicitly indicated otherwise.

Status

Orso is in beta. Beta means different things to different people; to us, being beta means:

  • Interfaces are generally stable but may still have breaking changes
  • Unit tests are not reliable enough to capture breaks to functionality
  • Bugs are likely to exist in edge cases
  • Code may not be tuned for performance

As such, we really don't recommend using Orso in critical applications.

Related Projects

  • Opteryx - SQL query engine for data files
  • Mabel - Data processing framework



