flaco

The easiest and perhaps most memory-efficient way to get PostgreSQL data (more flavors to come?) into Arrow (IPC/Feather) or Parquet files.

If you're trying to load data directly into Pandas, you may find that even a 'real' 100MB of data can bloat to upwards of 1GB in memory. That expansion can cause significant bottlenecks in processing data efficiently.
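As a rough illustration of where that kind of expansion can come from, the sketch below compares a packed 8-byte integer buffer with the same values held as individual Python objects. It uses only the standard library and is an assumption about one contributing factor (per-object overhead, as with object-typed columns), not a measurement of Pandas or of flaco itself.

```python
import sys
from array import array

n = 100_000
values = list(range(1000, 1000 + n))  # one Python int object per value

# Packed representation: 8 bytes per value in one contiguous buffer.
packed = array("q", values)
packed_bytes = packed.itemsize * len(packed)

# Object representation: the list's pointer array plus every int object.
object_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

ratio = object_bytes / packed_bytes
print(f"packed: {packed_bytes} bytes, objects: {object_bytes} bytes, ratio: {ratio:.1f}x")
```

The exact ratio depends on the interpreter and platform, and strings or nested values typically cost more per element than small integers.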

Arrow supports efficient and even larger-than-memory processing, as with dask, duckdb, or others, so just getting data onto disk is sometimes the hardest part; this aims to make that easier.

NOTE: This is still a WIP, and is purpose-built for my needs. I intend to generalize it to be useful to a wider audience. Issues and pull requests welcome!


Example

from flaco import read_sql_to_file, FileFormat


uri = "postgresql://postgres:postgres@localhost:5432/postgres"
stmt = "select * from my_big_table"

read_sql_to_file(uri, stmt, 'output.data', FileFormat.Parquet)

# Then with pandas...
import pandas as pd
df = pd.read_parquet('output.data')

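# pyarrow... (memory-mapped file, potentially larger than memory)
# Note: pa.ipc.open_file reads Arrow IPC/Feather files, so this applies when
# 'output.data' was written in the Arrow IPC/Feather format rather than Parquet.
import pyarrow as pa
with pa.memory_map('output.data', 'rb') as source:
    table = pa.ipc.open_file(source).read_all()  # mmap pyarrow.Table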

# DuckDB...
import duckdb
cur = duckdb.connect()
cur.execute("select * from read_parquet('output.data')")

# Or anything else which works with Arrow and/or Parquet files

License

Why did you choose such lax licensing? Could you change to a copyleft license, please?

...just kidding, no one would ask that. This is dual-licensed under Unlicense or MIT, at your discretion.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

flaco-0.6.0rc1.tar.gz (23.2 kB view hashes)

Uploaded Source

Built Distributions

flaco-0.6.0rc1-cp310-none-win_amd64.whl (1.1 MB view hashes)

Uploaded CPython 3.10 Windows x86-64

flaco-0.6.0rc1-cp310-none-win32.whl (1.1 MB view hashes)

Uploaded CPython 3.10 Windows x86

flaco-0.6.0rc1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.12+ x86-64

flaco-0.6.0rc1-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.10 manylinux: glibc 2.12+ i686

flaco-0.6.0rc1-cp310-cp310-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (2.6 MB view hashes)

Uploaded CPython 3.10 macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

flaco-0.6.0rc1-cp310-cp310-macosx_10_7_x86_64.whl (1.3 MB view hashes)

Uploaded CPython 3.10 macOS 10.7+ x86-64

flaco-0.6.0rc1-cp39-none-win_amd64.whl (1.1 MB view hashes)

Uploaded CPython 3.9 Windows x86-64

flaco-0.6.0rc1-cp39-none-win32.whl (1.1 MB view hashes)

Uploaded CPython 3.9 Windows x86

flaco-0.6.0rc1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

flaco-0.6.0rc1-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.12+ i686

flaco-0.6.0rc1-cp39-cp39-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (2.6 MB view hashes)

Uploaded CPython 3.9 macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

flaco-0.6.0rc1-cp39-cp39-macosx_10_7_x86_64.whl (1.3 MB view hashes)

Uploaded CPython 3.9 macOS 10.7+ x86-64

flaco-0.6.0rc1-cp38-none-win_amd64.whl (1.1 MB view hashes)

Uploaded CPython 3.8 Windows x86-64

flaco-0.6.0rc1-cp38-none-win32.whl (1.1 MB view hashes)

Uploaded CPython 3.8 Windows x86

flaco-0.6.0rc1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

flaco-0.6.0rc1-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.12+ i686

flaco-0.6.0rc1-cp38-cp38-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (2.6 MB view hashes)

Uploaded CPython 3.8 macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

flaco-0.6.0rc1-cp38-cp38-macosx_10_7_x86_64.whl (1.3 MB view hashes)

Uploaded CPython 3.8 macOS 10.7+ x86-64

flaco-0.6.0rc1-cp37-none-win_amd64.whl (1.1 MB view hashes)

Uploaded CPython 3.7 Windows x86-64

flaco-0.6.0rc1-cp37-none-win32.whl (1.1 MB view hashes)

Uploaded CPython 3.7 Windows x86

flaco-0.6.0rc1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.7 MB view hashes)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

flaco-0.6.0rc1-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.whl (1.7 MB view hashes)

Uploaded CPython 3.7m manylinux: glibc 2.12+ i686

flaco-0.6.0rc1-cp37-cp37m-macosx_10_7_x86_64.whl (1.3 MB view hashes)

Uploaded CPython 3.7m macOS 10.7+ x86-64
