Sling

Slings data from a data source to a data target.

Installation

pip install sling, or pip install sling[arrow] for Arrow-based streaming support.

Then you should be able to run sling --help from the command line.
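For example (quoting the extra keeps shells like zsh from expanding the brackets):

pip install 'sling[arrow]'
sling --help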

Running an Extract-Load Task

CLI

sling run --src-conn MY_PG --src-stream myschema.mytable \
  --tgt-conn YOUR_SNOWFLAKE --tgt-object yourschema.yourtable \
  --mode full-refresh
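The connection names used above must already be defined. One way is environment variables holding connection URLs (see https://docs.slingdata.io/connections/database-connections; the URLs here are placeholders):

export MY_PG='postgres://...'
export YOUR_SNOWFLAKE='snowflake://...'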

Or pass a YAML/JSON string or file:

cat << 'EOF' > /path/to/replication.yaml
source: MY_POSTGRES
target: MY_SNOWFLAKE

# default config options which apply to all streams
defaults:
  mode: full-refresh
  object: new_schema.{stream_schema}_{stream_table}

streams:
  my_schema.*:
EOF

sling run -r /path/to/replication.yaml

Using the Replication class

Run a replication from a file:

import yaml
from sling import Replication

# From a YAML file
replication = Replication(file_path="path/to/replication.yaml")
replication.run()

# Or load the YAML into a dict and construct the Replication from it
with open('path/to/replication.yaml') as file:
  config = yaml.safe_load(file)

replication = Replication(**config)

replication.run()

Build a replication dynamically:

from sling import Replication, ReplicationStream, Mode

# build sling replication streams
# (folders is a hypothetical listing of (folder, table_name) pairs you supply)
folders = [('folder_a/', 'table_a'), ('folder_b/', 'table_b')]

streams = {}
for folder, table_name in folders:
  streams[folder] = ReplicationStream(
    mode=Mode.FULL_REFRESH, object=table_name, primary_key='_hash_id')

replication = Replication(
  source='aws_s3',
  target='snowflake',
  streams=streams,
  env=dict(SLING_STREAM_URL_COLUMN='true', SLING_LOADED_AT_COLUMN='true'),
  debug=True,
)

replication.run()

Using the Sling class

For more direct control and streaming capabilities, you can use the Sling class, which mirrors the CLI interface.

Basic Usage with run() method

import os
from sling import Sling, Mode

# Set postgres & snowflake connection
# see https://docs.slingdata.io/connections/database-connections
os.environ["POSTGRES"] = 'postgres://...'
os.environ["SNOWFLAKE"] = 'snowflake://...'

# Database to database transfer
Sling(
    src_conn="postgres",
    src_stream="public.users",
    tgt_conn="snowflake",
    tgt_object="public.users_copy",
    mode=Mode.FULL_REFRESH
).run()

# Database to file
Sling(
    src_conn="postgres", 
    src_stream="select * from users where active = true",
    tgt_object="file:///tmp/active_users.csv"
).run()

# File to database
Sling(
    src_stream="file:///path/to/data.csv",
    tgt_conn="snowflake",
    tgt_object="public.imported_data"
).run()

Input Streaming - Python Data to Target

💡 Tip: Install the Arrow extra (pip install sling[arrow]) for better streaming performance and improved data type handling.

📊 DataFrame Support: The input parameter accepts lists of dictionaries, pandas DataFrames, or polars DataFrames. DataFrame support preserves data types when using Arrow format.

⚠️ Note: Be careful with making many separate Sling invocations via input or stream() against external systems (databases, file systems): each call spawns the underlying sling binary and opens a fresh connection. For better performance and connection reuse, consider the Replication class instead, which processes multiple streams in a single run.
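For instance, rather than looping over tables with one Sling call per table, a single Replication covers them in one run (a sketch; the connection, schema, and table names are placeholders):

from sling import Replication, ReplicationStream, Mode

# one run, one binary invocation, connections shared across all streams
replication = Replication(
    source='POSTGRES',
    target='SNOWFLAKE',
    streams={
        f'public.{table}': ReplicationStream(
            mode=Mode.FULL_REFRESH, object=f'staging.{table}')
        for table in ['users', 'orders', 'payments']
    },
)
replication.run()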

import os
from sling import Sling, Format

# Set postgres connection
# see https://docs.slingdata.io/connections/database-connections
os.environ["POSTGRES"] = 'postgres://...'

# Stream Python data to CSV file
data = [
    {"id": 1, "name": "John", "age": 30},
    {"id": 2, "name": "Jane", "age": 25},
    {"id": 3, "name": "Bob", "age": 35}
]

Sling(
    input=data,
    tgt_object="file:///tmp/output.csv"
).run()

# Stream Python data to database
Sling(
    input=data,
    tgt_conn="postgres",
    tgt_object="public.users"
).run()

# Stream Python data to JSON Lines file
Sling(
    input=data,
    tgt_object="file:///tmp/output.jsonl",
    tgt_options={"format": Format.JSONLINES}
).run()

# Stream from generator (memory efficient for large datasets)
def data_generator():
    for i in range(10000):
        yield {"id": i, "value": f"item_{i}", "timestamp": "2023-01-01"}

Sling(input=data_generator(), tgt_object="file:///tmp/large_dataset.csv").run()

# Stream pandas DataFrame to database
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "age": [25, 30, 35, 28],
    "salary": [50000, 60000, 70000, 55000]
})

Sling(
    input=df,
    tgt_conn="postgres",
    tgt_object="public.employees"
).run()

# Stream polars DataFrame to CSV file
import polars as pl

df = pl.DataFrame({
    "product_id": [101, 102, 103],
    "product_name": ["Laptop", "Mouse", "Keyboard"],
    "price": [999.99, 25.50, 75.00],
    "in_stock": [True, False, True]
})

Sling(
    input=df,
    tgt_object="file:///tmp/products.csv"
).run()

# DataFrame with column selection
Sling(
    input=df,
    select=["product_name", "price"],  # Only export specific columns
    tgt_object="file:///tmp/product_prices.csv"
).run()

Output Streaming with stream()

import os
from sling import Sling

# Set postgres connection
# see https://docs.slingdata.io/connections/database-connections
os.environ["POSTGRES"] = 'postgres://...'

# Stream data from database
sling = Sling(
    src_conn="postgres",
    src_stream="public.users",
    limit=1000
)

for record in sling.stream():
    print(f"User: {record['name']}, Age: {record['age']}")

# Stream data from file
sling = Sling(
    src_stream="file:///path/to/data.csv"
)

# Process records one by one (memory efficient)
for record in sling.stream():
    # transform_record is a placeholder for your own processing function
    processed_data = transform_record(record)
    # Could save to another system, send to API, etc.

# Stream with parameters
sling = Sling(
    src_conn="postgres",
    src_stream="public.orders",
    select=["order_id", "customer_name", "total"],
    where="total > 100",
    limit=500
)

records = list(sling.stream())
print(f"Found {len(records)} high-value orders")

High-Performance Streaming with stream_arrow()

🚀 Performance: The stream_arrow() method provides the highest performance streaming with full data type preservation by using Apache Arrow's columnar format. Requires pip install sling[arrow].

📊 Type Safety: Unlike stream(), which may convert data types during CSV serialization, stream_arrow() preserves exact data types, including integers, floats, timestamps, and more.

import os
from sling import Sling

# Set postgres connection  
# see https://docs.slingdata.io/connections/database-connections
os.environ["POSTGRES"] = 'postgres://...'

# Basic Arrow streaming from database
sling = Sling(src_conn="postgres", src_stream="public.users", limit=1000)

# Get Arrow RecordBatchStreamReader for maximum performance
reader = sling.stream_arrow()

# Convert to Arrow Table for analysis
table = reader.read_all()
print(f"Received {table.num_rows} rows with {table.num_columns} columns")
print(f"Column names: {table.column_names}")
print(f"Schema: {table.schema}")

# Convert to pandas DataFrame with preserved types
if table.num_rows > 0:
    df = table.to_pandas()
    print(df.dtypes)  # Shows preserved data types

# Stream Arrow file with type preservation
sling = Sling(
    src_stream="file:///path/to/data.arrow",
    src_options={"format": "arrow"}
)

reader = sling.stream_arrow()
table = reader.read_all()

# Access columnar data directly (very efficient)
for column_name in table.column_names:
    column = table.column(column_name)
    print(f"{column_name}: {column.type}")

# Process Arrow batches for large datasets (memory efficient)
sling = Sling(
    src_conn="postgres", 
    src_stream="select * from large_table"
)

reader = sling.stream_arrow()
for batch in reader:
    # Process each batch separately to manage memory
    print(f"Processing batch with {batch.num_rows} rows")
    # Convert batch to pandas if needed
    batch_df = batch.to_pandas()
    # Process batch_df...

# Round-trip with Arrow format preservation
import pandas as pd

# Write DataFrame to Arrow file with type preservation
df = pd.DataFrame({
    "id": [1, 2, 3],
    "amount": [100.50, 250.75, 75.25],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
    "active": [True, False, True]
})

Sling(
    input=df,
    tgt_object="file:///tmp/data.arrow",
    tgt_options={"format": "arrow"}
).run()

# Read back with full type preservation
sling = Sling(
    src_stream="file:///tmp/data.arrow",
    src_options={"format": "arrow"}
)

reader = sling.stream_arrow()
restored_table = reader.read_all()
restored_df = restored_table.to_pandas()

# Types are exactly preserved (no string conversion)
print(restored_df.dtypes)
assert restored_df['active'].dtype == 'bool'
assert 'datetime64' in str(restored_df['timestamp'].dtype)

Notes:

  • stream_arrow() requires PyArrow: pip install sling[arrow]
  • Cannot be used with a target object (use run() instead)
  • Provides the best performance for large datasets
  • Preserves exact data types including timestamps, decimals, and booleans
  • Ideal for analytics workloads and data science applications (see the sketch below)
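Since pandas and polars both ingest Arrow directly, the reader output drops straight into either library. A minimal sketch, assuming PyArrow and polars are installed (pl.from_arrow is polars' standard Arrow ingestion function):

import polars as pl
from sling import Sling

sling = Sling(src_conn="postgres", src_stream="public.users", limit=1000)
reader = sling.stream_arrow()

# convert the Arrow table into a polars DataFrame, preserving types
df = pl.from_arrow(reader.read_all())
print(df.schema)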

Round-trip Examples

import os
from sling import Sling

# Set postgres connection
# see https://docs.slingdata.io/connections/database-connections
os.environ["POSTGRES"] = 'postgres://...'

# Python → File → Python
original_data = [
    {"id": 1, "name": "Alice", "score": 95.5},
    {"id": 2, "name": "Bob", "score": 87.2}
]

# Step 1: Python data to file
sling_write = Sling(
    input=original_data,
    tgt_object="file:///tmp/scores.csv"
)
sling_write.run()

# Step 2: File back to Python
sling_read = Sling(
    src_stream="file:///tmp/scores.csv"
)
loaded_data = list(sling_read.stream())

# Python → Database → Python (with transformations)
sling_to_db = Sling(
    input=original_data,
    tgt_conn="postgres",
    tgt_object="public.temp_scores"
)
sling_to_db.run()

sling_from_db = Sling(
    src_conn="postgres", 
    src_stream="select *, score * 1.1 as boosted_score from public.temp_scores",
)
transformed_data = list(sling_from_db.stream())

# DataFrame → Database → DataFrame (with pandas/polars)
import pandas as pd

# Start with pandas DataFrame
df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "purchase_amount": [100.50, 250.75, 75.25],
    "category": ["electronics", "clothing", "books"]
})

# Write DataFrame to database
Sling(
    input=df,
    tgt_conn="postgres",
    tgt_object="public.purchases"
).run()

# Read back with SQL transformations as pandas DataFrame
sling_query = Sling(
    src_conn="postgres",
    src_stream="""
        SELECT category, 
               COUNT(*) as purchase_count,
               AVG(purchase_amount) as avg_amount
        FROM public.purchases 
        GROUP BY category
    """
)
summary_data = list(sling_query.stream())
summary_df = pd.DataFrame(summary_data)
print(summary_df)

Using the Pipeline class

Run a Pipeline:

from sling import Pipeline
from sling.hooks import StepLog, StepCopy, StepReplication, StepHTTP, StepCommand

# From a YAML file
pipeline = Pipeline(file_path="path/to/pipeline.yaml")
pipeline.run()

# Or using Hook objects for type safety
pipeline = Pipeline(
    steps=[
        StepLog(message="Hello world"),
        StepCopy(from_="sftp//path/to/file", to="aws_s3/path/to/file"),
        StepReplication(path="path/to/replication.yaml"),
        StepHTTP(url="https://trigger.webhook.com"),
        StepCommand(command=["ls", "-l"], print_output=True)
    ],
    env={"MY_VAR": "value"}
)
pipeline.run()

# Or programmatically using dictionaries
pipeline = Pipeline(
    steps=[
        {"type": "log", "message": "Hello world"},
        {"type": "copy", "from": "sftp//path/to/file", "to": "aws_s3/path/to/file"},
        {"type": "replication", "path": "path/to/replication.yaml"},
        {"type": "http", "url": "https://trigger.webhook.com"},
        {"type": "command", "command": ["ls", "-l"], "print": True}
    ],
    env={"MY_VAR": "value"}
)
pipeline.run()
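For reference, a pipeline.yaml equivalent to the dictionary form above might look like the following (a sketch inferred from the step fields shown here; consult the sling docs for the authoritative schema):

steps:
  - type: log
    message: Hello world

  - type: replication
    path: path/to/replication.yaml

  - type: http
    url: https://trigger.webhook.com

  - type: command
    command: [ls, -l]
    print: true

env:
  MY_VAR: value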

Testing

pytest sling/tests/tests.py -v
pytest sling/tests/test_sling_class.py -v
