A high-performance DataFrame implementation built on top of NumPy

FastDF: High-Performance DataFrame for Python

FastDF is a lightning-fast, memory-efficient DataFrame implementation built on top of NumPy, designed to overcome the performance limitations of pandas for basic data operations.

🚀 Key Features

  • Blazing Fast: Up to 126x faster data access compared to pandas
  • Memory Efficient: Optimized memory usage with NumPy 2D arrays
  • Pandas-Compatible: Seamless integration with existing pandas-based projects
  • Minimalist: Focuses on core functionality for maximum performance

🎯 Motivation

FastDF was born out of frustration with the sluggish performance of pandas, especially when dealing with large datasets. After exploring various alternatives that either didn't work as expected or introduced complex syntax changes, we realized that for many data analysis tasks, we only need a handful of core features:

  • Named columns
  • Efficient slicing
  • Basic operations like shift and any

By leveraging the power of NumPy's 2D arrays and implementing only the essential features, FastDF achieves remarkable performance improvements without sacrificing ease of use.
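To make the idea concrete, here is a minimal sketch of how named columns, slicing, shift and any can sit on top of a single NumPy 2D array. The class name `TinyFrame` and its methods are hypothetical and written only for illustration; this is not FastDF's actual implementation, and it assumes all columns share one float dtype.

```python
import numpy as np


class TinyFrame:
    """Hypothetical illustration only; not FastDF's actual implementation."""

    def __init__(self, data, columns):
        self.data = np.asarray(data, dtype=float)  # one homogeneous 2D block
        if self.data.ndim != 2 or self.data.shape[1] != len(columns):
            raise ValueError("data must be 2D with one column per name")
        self.columns = list(columns)
        # Map each column name to its position for O(1) lookup.
        self._pos = {name: i for i, name in enumerate(self.columns)}

    def __getitem__(self, name):
        # Named column access: returns a NumPy view, not a copy.
        return self.data[:, self._pos[name]]

    def rows(self, start, stop):
        # Row slicing with NumPy semantics (stop is exclusive); also a view.
        return TinyFrame(self.data[start:stop], self.columns)

    def shift(self, periods=1):
        # Shift rows down (periods > 0) or up (periods < 0), padding with NaN.
        out = np.full_like(self.data, np.nan)
        n = len(self.data)
        if periods == 0:
            out[:] = self.data
        elif 0 < periods < n:
            out[periods:] = self.data[:-periods]
        elif -n < periods < 0:
            out[:periods] = self.data[-periods:]
        return TinyFrame(out, self.columns)

    def any(self):
        # Per-column truthiness: True where a column has any nonzero value.
        return dict(zip(self.columns, self.data.any(axis=0).tolist()))


frame = TinyFrame([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]], ["A", "B"])
print(frame["A"])            # column A as a 1D array
print(frame.shift(1)["A"])   # first value becomes NaN
print(frame.any())
```

Because column access and row slicing return views of the same array, no data is copied. That is a plausible source of speed for this kind of design, at the cost of supporting a single dtype per frame.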

⚡ Performance

In our benchmarks, FastDF has shown:

  • 40x faster data access compared to pandas
  • Significantly faster slicing operations
  • Reduced memory footprint
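The benchmark script behind these figures is not shown here. As a rough sketch of how a data-access comparison could be set up, the snippet below times a scalar lookup through pandas `.loc` against direct indexing on a plain NumPy 2D array using the standard `timeit` module. It measures raw NumPy rather than FastDF itself, so it does not reproduce the numbers above, and the ratio will vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

n_rows = 100_000
values = np.random.rand(n_rows, 2)
pandas_df = pd.DataFrame(values, columns=["A", "B"])


def pandas_access():
    # Label-based scalar lookup through the pandas indexing layer.
    return pandas_df.loc[500, "A"]


def numpy_access():
    # Direct positional lookup on the underlying 2D array (column 0 is "A").
    return values[500, 0]


n_calls = 10_000
pandas_time = timeit.timeit(pandas_access, number=n_calls)
numpy_time = timeit.timeit(numpy_access, number=n_calls)

print(f"pandas .loc : {pandas_time / n_calls * 1e6:.2f} microseconds per call")
print(f"NumPy index : {numpy_time / n_calls * 1e6:.2f} microseconds per call")
print(f"ratio       : {pandas_time / numpy_time:.1f}x")
```

To benchmark FastDF itself, the same harness could wrap a lookup on a frame built with `fdf.from_pandas`.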

🛠 Installation

You can install FastDF using pip:

pip install fastdf

For the latest development version, you can install directly from GitHub:

pip install git+https://github.com/stwrn/fastdf.git

🚦 Quick Start

from fastdf import fdf
import pandas as pd
import numpy as np

# Create a pandas DataFrame
pdf = pd.DataFrame({'A': np.random.rand(1000000), 'B': np.random.rand(1000000)})

# Convert to FastDF
fast_df = fdf.from_pandas(pdf)

# Use FastDF with familiar pandas-like syntax
print(fast_df.loc[0:5, 'A'])
print(fast_df['B'].shift(1))
print(fast_df.any())

🔄 Compatibility

FastDF is designed to be a drop-in replacement for basic pandas operations. You can easily convert your pandas DataFrame to FastDF and continue using the familiar syntax:

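import numpy as np
import pandas as pd
from fastdf import fdf

# Your existing pandas code
pandas_df = pd.DataFrame({'A': np.random.rand(5000), 'B': np.random.rand(5000)})
result = pandas_df.loc[1000:2000]['B']
print(f"Pandas result {result}")

# The same selection with FastDF, converted from pandas
fast_df = fdf.from_pandas(pandas_df)
result_fdf = fast_df.loc[1000:2000]['B']
print(f"FastDF result {result_fdf}")

# Building a FastDF directly from a NumPy 2D array and column names
data = np.random.rand(1000, 5)
columns = ['A', 'B', 'C', 'D', 'E']
fast_df = fdf(data, columns)
print(f"FastDF {fast_df}")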
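When moving slicing code from pandas to a NumPy-backed structure, one detail worth checking is the slice endpoint. pandas `.loc` slices by label and includes both ends, while plain NumPy slices by position and excludes the stop index. The sketch below uses only pandas and NumPy to show the difference. Which convention FastDF's `.loc` follows is not stated here, so compare result lengths before relying on it as a drop-in replacement.

```python
import numpy as np
import pandas as pd

data = np.random.rand(3000, 2)
pandas_df = pd.DataFrame(data, columns=["A", "B"])

# pandas .loc slices by label and includes both endpoints.
label_slice = pandas_df.loc[1000:2000, "B"]

# NumPy slices by position and excludes the stop index (column 1 is "B").
positional_slice = data[1000:2000, 1]

print(len(label_slice))       # 1001
print(len(positional_slice))  # 1000
```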

🤝 Contributing

We welcome contributions to FastDF! For bug reports and feature requests, please open an issue; for code contributions, please submit a pull request.

📜 License

FastDF is released under the MIT License. See the LICENSE file for more details.

🙏 Acknowledgements

Special thanks to the NumPy and pandas teams for their incredible work, which laid the foundation for this project.


FastDF is still in active development. We're excited to see how it can help accelerate your data analysis workflows!
