
A high-performance DataFrame implementation built on top of NumPy


FastDF: High-Performance DataFrame for Python

FastDF is a lightning-fast, memory-efficient DataFrame implementation built on top of NumPy, designed to overcome the performance limitations of pandas for basic data operations.

🚀 Key Features

  • Blazing Fast: Up to 126x faster data access compared to pandas
  • Memory Efficient: Optimized memory usage with NumPy 2D arrays
  • Pandas-Compatible: Seamless integration with existing pandas-based projects
  • Minimalist: Focuses on core functionality for maximum performance

🎯 Motivation

FastDF was born out of frustration with the sluggish performance of pandas, especially when dealing with large datasets. After exploring various alternatives that either didn't work as expected or introduced complex syntax changes, we realized that for many data analysis tasks, we only need a handful of core features:

  • Named columns
  • Efficient slicing
  • Basic operations like shift and any

By leveraging the power of NumPy's 2D arrays and implementing only the essential features, FastDF achieves remarkable performance improvements without sacrificing ease of use.
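To make the idea concrete, here is a minimal sketch of named columns layered over one NumPy 2D array, with slicing, shift, and any. It is an illustration only, not FastDF's actual implementation: the class name `MiniFrame` and all of its methods are hypothetical, and it assumes every column holds floats.

```python
import numpy as np


class MiniFrame:
    """Hypothetical illustration: named columns over one 2D NumPy array."""

    def __init__(self, data, columns):
        self.data = np.asarray(data, dtype=float)  # shape: (rows, columns)
        self.columns = list(columns)
        # Map each column name to its position in the array.
        self._pos = {name: i for i, name in enumerate(self.columns)}

    def column(self, name):
        # Basic indexing on one column returns a view, not a copy.
        return self.data[:, self._pos[name]]

    def rows(self, start, stop):
        # Positional, end-exclusive row slice (NumPy semantics); also a view.
        return MiniFrame(self.data[start:stop], self.columns)

    def shift(self, periods=1):
        # Move rows by `periods` and pad the vacated rows with NaN.
        out = np.full_like(self.data, np.nan)
        if periods > 0:
            out[periods:] = self.data[:-periods]
        elif periods < 0:
            out[:periods] = self.data[-periods:]
        else:
            out[:] = self.data
        return MiniFrame(out, self.columns)

    def any(self):
        # True for each column that holds at least one non-zero value.
        return dict(zip(self.columns, self.data.any(axis=0).tolist()))


mf = MiniFrame([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]], ["A", "B"])
print(mf.column("A"))            # column lookup by name
print(mf.rows(0, 2).data.shape)  # row slice
print(mf.shift(1).column("A"))   # first entry becomes NaN
print(mf.any())                  # {'A': True, 'B': False}
```

Because column lookup is a dictionary hit followed by basic NumPy indexing, no per-access index alignment or dtype dispatch is involved, which is the kind of overhead this design tries to avoid.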

⚡ Performance

In our benchmarks, FastDF has shown:

  • Up to 126x faster data access compared to pandas
  • Significantly faster slicing operations
  • Reduced memory footprint
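The figures above come without a description of the benchmark setup, so it is worth measuring on your own data. The following is a rough timing sketch using the standard library's `timeit`; the helper name `speedup` is hypothetical, and the two workloads are stand-ins (a list of rows versus a NumPy 2D array) that you would replace with the equivalent pandas and FastDF calls.

```python
import timeit

import numpy as np


def speedup(baseline, candidate, number=10_000, repeat=5):
    """Return how many times faster `candidate` runs than `baseline`."""
    # Take the best of several repeats to reduce noise from other processes.
    t_base = min(timeit.repeat(baseline, number=number, repeat=repeat))
    t_cand = min(timeit.repeat(candidate, number=number, repeat=repeat))
    return t_base / t_cand


# Stand-in workloads: reading one cell from a list of rows vs. a 2D array.
rows = [[float(i), float(i) * 2] for i in range(1000)]
arr = np.array(rows)

ratio = speedup(lambda: rows[500][1], lambda: arr[500, 1], number=1000)
print(f"candidate runs at {ratio:.2f}x the speed of the baseline")
```

A ratio above 1 means the candidate is faster. Results depend heavily on the access pattern, data size, and library versions, so a single number rarely transfers between machines.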

🛠 Installation

You can install FastDF using pip:

pip install fastdf

For the latest development version, you can install directly from GitHub:

pip install git+https://github.com/stwrn/fastdf.git

🚦 Quick Start

from fastdf import fdf
import pandas as pd
import numpy as np

# Create a pandas DataFrame
pdf = pd.DataFrame({'A': np.random.rand(1000000), 'B': np.random.rand(1000000)})

# Convert to FastDF
fast_df = fdf.from_pandas(pdf)

# Use FastDF with familiar pandas-like syntax
print(fast_df.loc[0:5, 'A'])
print(fast_df['B'].shift(1))
print(fast_df.any())

🔄 Compatibility

FastDF is designed to be a drop-in replacement for basic pandas operations. You can easily convert your pandas DataFrame to FastDF and continue using the familiar syntax:

# Your existing pandas code
result = df.loc[1000:2000, 'column_name']

# With FastDF
fast_df = fdf.from_pandas(df)
result = fast_df.loc[1000:2000, 'column_name']
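One detail to check when porting code like this: pandas `.loc` slices by label and includes both endpoints, whereas NumPy slices are positional and exclude the stop value. The README does not say how FastDF handles this, so compare the result lengths before relying on it. The sketch below shows the translation a NumPy-backed wrapper would need, assuming the default 0..n-1 integer index; the helper name is hypothetical.

```python
import numpy as np


def label_slice_to_positions(start, stop):
    """Translate an inclusive label slice into an end-exclusive positional slice.

    Assumes the default 0..n-1 integer index, where labels equal positions.
    """
    return slice(start, stop + 1)


values = np.arange(10)
picked = values[label_slice_to_positions(2, 5)]
print(picked)  # [2 3 4 5]
```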

🤝 Contributing

We welcome contributions to FastDF! For bug reports and feature requests, please open an issue; for code contributions, submit a pull request.

📜 License

FastDF is released under the MIT License. See the LICENSE file for more details.

🙏 Acknowledgements

Special thanks to the NumPy and pandas teams for their incredible work, which laid the foundation for this project.


FastDF is still in active development. We're excited to see how it can help accelerate your data analysis workflows!
