A high-performance DataFrame implementation built on top of NumPy

FastDF: High-Performance DataFrame for Python

FastDF is a lightning-fast, memory-efficient DataFrame implementation built on top of NumPy, designed to overcome the performance limitations of pandas for basic data operations.

🚀 Key Features

  • Blazing Fast: Up to 126x faster data access compared to pandas
  • Memory Efficient: Optimized memory usage with NumPy 2D arrays
  • Pandas-Compatible: Seamless integration with existing pandas-based projects
  • Minimalist: Focuses on core functionality for maximum performance

🎯 Motivation

FastDF was born out of frustration with the sluggish performance of pandas, especially when dealing with large datasets. After exploring various alternatives that either didn't work as expected or introduced complex syntax changes, we realized that for many data analysis tasks, we only need a handful of core features:

  • Named columns
  • Efficient slicing
  • Basic operations like shift and any

By leveraging the power of NumPy's 2D arrays and implementing only the essential features, FastDF achieves remarkable performance improvements without sacrificing ease of use.
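To make the idea concrete, here is a minimal sketch of named columns, slicing, shift, and any on top of a single 2D NumPy array. The `TinyFrame` class and its method names are hypothetical and exist only for illustration; this is not FastDF's actual implementation. It assumes every column is numeric, because one 2D array holds a single dtype.

```python
import numpy as np


class TinyFrame:
    """Hypothetical named-column wrapper over one 2D NumPy array (not FastDF's code)."""

    def __init__(self, data, columns):
        # One float dtype for the whole table: a 2D array cannot mix dtypes.
        self.data = np.asarray(data, dtype=float)
        if self.data.ndim != 2 or self.data.shape[1] != len(columns):
            raise ValueError("data must be 2D with one column per name")
        self.columns = list(columns)
        # Map each column name to its position in the array.
        self._pos = {name: i for i, name in enumerate(self.columns)}

    def __getitem__(self, name):
        # Basic indexing returns a view, so no data is copied.
        return self.data[:, self._pos[name]]

    def rows(self, start, stop, name):
        # Positional, end-exclusive slice of one column (NumPy semantics).
        return self.data[start:stop, self._pos[name]]

    def shift(self, name, periods=1):
        # Move values down by `periods` rows and pad the top with NaN.
        # Only non-negative periods are handled in this sketch.
        if periods < 0:
            raise ValueError("periods must be non-negative")
        col = self[name]
        out = np.full(col.shape, np.nan)
        if periods == 0:
            out[:] = col
        elif periods < len(col):
            out[periods:] = col[:-periods]
        return out

    def any(self):
        # True for each column that holds at least one non-zero value.
        return dict(zip(self.columns, self.data.any(axis=0).tolist()))


tf = TinyFrame([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]], ["A", "B"])
shifted = tf.shift("A", 1)  # first element is NaN, then 1.0 and 2.0
print(shifted)
print(tf.any())
```

Column lookups here are a dictionary access followed by a NumPy view, which is where a design like this can avoid per-access overhead.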

⚡ Performance

In our benchmarks, FastDF has shown:

  • Up to 126x faster data access than pandas
  • Significantly faster slicing operations
  • Reduced memory footprint
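The benchmark setup behind these numbers is not described here, so the following is only a sketch of how you might measure scalar data access on your own machine. It compares pandas `.loc` against indexing into a plain NumPy array as a stand-in for a NumPy-backed frame. It does not call FastDF, and the row count, repeat count, and function names are arbitrary choices for illustration. Timings vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

n_rows = 100_000
rng = np.random.default_rng(0)
pdf = pd.DataFrame({"A": rng.random(n_rows), "B": rng.random(n_rows)})

arr = pdf.to_numpy()               # 2D float array holding the same values
col_a = pdf.columns.get_loc("A")   # integer position of column "A"


def pandas_access():
    # Label-based scalar lookup through pandas.
    return pdf.loc[500, "A"]


def numpy_access():
    # Positional scalar lookup on the underlying array.
    return arr[500, col_a]


repeats = 10_000
t_pandas = timeit.timeit(pandas_access, number=repeats)
t_numpy = timeit.timeit(numpy_access, number=repeats)

print(f"pandas .loc: {t_pandas / repeats * 1e6:.2f} us per access")
print(f"NumPy index: {t_numpy / repeats * 1e6:.2f} us per access")
print(f"ratio: {t_pandas / t_numpy:.1f}x")
```

To benchmark FastDF itself, replace `numpy_access` with the equivalent lookup on a frame created by `fdf.from_pandas(pdf)`.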

🛠 Installation

pip install git+https://github.com/stwrn/fastdf.git

🚦 Quick Start

from fastdf import fdf
import pandas as pd
import numpy as np

# Create a pandas DataFrame
pdf = pd.DataFrame({'A': np.random.rand(1000000), 'B': np.random.rand(1000000)})

# Convert to FastDF
fast_df = fdf.from_pandas(pdf)

# Use FastDF with familiar pandas-like syntax
print(fast_df.loc[0:5, 'A'])
print(fast_df['B'].shift(1))
print(fast_df.any())

🔄 Compatibility

FastDF is designed to be a drop-in replacement for basic pandas operations. You can easily convert your pandas DataFrame to FastDF and continue using the familiar syntax:

from fastdf import fdf

# Your existing pandas code (df is a pandas DataFrame you already have)
result = df.loc[1000:2000, 'column_name']

# With FastDF
fast_df = fdf.from_pandas(df)
result = fast_df.loc[1000:2000, 'column_name']
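One detail worth checking when porting slicing code: pandas `.loc` slices by label and includes the end label, while NumPy slices by position and excludes the end. This README does not say which convention FastDF follows, so the sketch below shows only the pandas and NumPy behaviors. The small DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column_name": np.arange(10.0)})

# pandas: label-based, includes the row labelled 5.
by_label = df.loc[2:5, "column_name"]

# NumPy: positional, stops before position 5.
by_position = df["column_name"].to_numpy()[2:5]

print(len(by_label))     # 4 rows: labels 2, 3, 4, 5
print(len(by_position))  # 3 rows: positions 2, 3, 4
```

Comparing the length of `fast_df.loc[...]` against the pandas result on a small frame is a quick way to confirm which convention applies before switching over.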

🤝 Contributing

We welcome contributions to FastDF! Whether it's bug reports, feature requests, or code contributions, please feel free to make a pull request or open an issue.

📜 License

FastDF is released under the MIT License. See the LICENSE file for more details.

🙏 Acknowledgements

Special thanks to the NumPy and pandas teams for their incredible work, which laid the foundation for this project.


FastDF is still in active development. We're excited to see how it can help accelerate your data analysis workflows!
