A package for mapping over lists

functional_list

Functional programming for Python lists with Spark RDD-style transformations

🎯 Overview

`functional_list` brings functional programming paradigms to Python lists, inspired by Apache Spark RDD operations. It provides both eager (`ListMapper`) and lazy (`LazyListMapper`) execution modes, making data transformations more expressive and chainable.

✨ Key Features

🔗 Functional-style transformations: `map`, `filter`, `reduce`, `flat_map`, `reduce_by_key`, and more
⚡ Multiple execution backends:
- `Serial` - Simple sequential execution
- `Local` - Multi-threaded or multi-process parallelization
- `Async` - Asynchronous I/O operations
- `Ray` - Distributed computing with Ray
- `Dask` - Distributed computing with Dask
💤 Lazy evaluation: Build transformation pipelines that execute only when needed
📁 File I/O support: Read from CSV, JSON, JSONL, Parquet, and text files
🚀 Cython-accelerated operations: Optional compiled extensions for performance-critical operations
🐍 Fully typed: Complete type hints for better IDE support and type checking
📦 Zero required dependencies: Install only what you need with optional extras

📋 Requirements

Python 3.10+ (Python 3.6-3.9 are not supported in recent versions)

📦 Installation

Basic Installation

pip install functional-list

Or using `uv`:

uv add functional-list

Installation with Optional Features

Install with specific backends or I/O support:

# For Ray distributed computing
pip install functional-list[ray]
# For Dask distributed computing
pip install functional-list[dask]
# For Parquet/CSV file I/O support
pip install functional-list[io]
# Install everything
pip install functional-list[all]

With `uv`:

uv add "functional-list[all]"

🚀 Quick Start

Basic Usage

from functional_list import ListMapper
# Create a ListMapper
numbers = ListMapper[int](1, 2, 3, 4, 5)
# Chain transformations
result = (
    numbers
    .map(lambda x: x * x)           # [1, 4, 9, 16, 25]
    .filter(lambda x: x % 2 == 0)   # [4, 16]
    .reduce(lambda x, y: x + y)     # 20
)
print(result)  # 20
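
For readers less familiar with the functional vocabulary, the chain above corresponds roughly to the following standard-library code. It is only a sketch of the semantics of each step and says nothing about how `ListMapper` is implemented.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

squares = [x * x for x in numbers]           # map    -> [1, 4, 9, 16, 25]
evens = [x for x in squares if x % 2 == 0]   # filter -> [4, 16]
total = reduce(lambda x, y: x + y, evens)    # reduce -> 20

print(total)  # 20
```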

Word Count Example

The classic MapReduce word count example:

from functional_list import ListMapper
# Given: a list of text documents
document = ListMapper[str](
    "python is good",
    "python is better than x",
    "python is the best",
)
# When: perform word count using functional transformations
word_counts = (
    document
    .flat_map(lambda line: line.split())      # Split into words
    .map(lambda word: (word, 1))              # Create (word, count) pairs
    .reduce_by_key(lambda x, y: x + y)        # Sum counts by word
)
# Then: result is a list of (word, count) tuples
print(word_counts)
# Output: [('than', 1), ('the', 1), ('best', 1), ('better', 1), 
#          ('good', 1), ('is', 3), ('python', 3), ('x', 1)]
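
To make the `reduce_by_key` step concrete, here is a minimal plain-Python sketch of the same word count. The helper `reduce_by_key` below is written for illustration only and is an assumption about the semantics (values sharing a key are folded together with the given function); it is not the library's implementation, and the library's output order may differ.

```python
from itertools import chain

documents = [
    "python is good",
    "python is better than x",
    "python is the best",
]

def reduce_by_key(pairs, fn):
    """Fold together the values of all pairs that share a key."""
    combined = {}
    for key, value in pairs:
        combined[key] = fn(combined[key], value) if key in combined else value
    return list(combined.items())

words = chain.from_iterable(line.split() for line in documents)  # flat_map
pairs = ((word, 1) for word in words)                            # map
word_counts = reduce_by_key(pairs, lambda x, y: x + y)           # reduce_by_key

print(sorted(word_counts))
```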

Working with Standard List Operations

`ListMapper` maintains compatibility with Python's built-in list operations:

from functional_list import ListMapper
my_list = ListMapper[int](2, 4, 9, 13, 15, 20)
# Standard list operations work as expected
my_list.append(55)
print(my_list)  # [2, 4, 9, 13, 15, 20, 55]
# Indexing and slicing
print(my_list[0])     # 2
print(my_list[1:4])   # [4, 9, 13]
# Length
print(len(my_list))   # 7
# Chain functional operations
result = (
    my_list
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
    .reduce(lambda x, y: x + y)
)
print(result)  # 420 (4 + 16 + 400)

💤 Lazy Evaluation

Use `LazyListMapper` for deferred execution. Transformations are computed only when their results are requested:

from functional_list import ListMapper
# Convert to lazy mode
lazy_pipeline = (
    ListMapper[int](1, 2, 3, 4, 5)
    .lazy()                              # Switch to lazy evaluation
    .map(lambda x: x * 2)
    .filter(lambda x: x > 5)
    .map(lambda x: x ** 2)
)
# No computation happens yet!
# Materialize the results
result = lazy_pipeline.collect()         # Now computation happens
print(result)  # [36, 64, 100]
# Or iterate (also materializes)
for item in lazy_pipeline:
    print(item)
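
The idea of deferred execution can be illustrated with ordinary Python generators. The sketch below is an analogy under that assumption, not a description of how `LazyListMapper` works internally; the `calls` list exists only to show when the work actually happens.

```python
calls = []  # records every element the first stage actually touches

def double(x):
    calls.append(x)
    return x * 2

source = [1, 2, 3, 4, 5]

# Building the pipeline only creates generator objects; nothing runs yet.
doubled = (double(x) for x in source)
large = (x for x in doubled if x > 5)
squared = (x ** 2 for x in large)

calls_before = list(calls)  # snapshot taken before any element is consumed

result = list(squared)      # consuming the generator triggers the work

print(calls_before)  # []
print(result)        # [36, 64, 100]
print(calls)         # [1, 2, 3, 4, 5]
```

A plain generator is exhausted after one pass, so this analogy covers only the "nothing runs until consumed" part of the behaviour.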

⚡ Execution Backends

Choose the right backend for your workload:

Serial Backend (Default)

from functional_list import ListMapper
data = ListMapper[int](1, 2, 3, 4, 5)
result = data.map(lambda x: x * 2).collect()

Local Backend (Multi-threading/Multi-processing)

from functional_list import ListMapper, LocalBackend
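# expensive_io_operation and expensive_cpu_operation stand in for your own functions
data = ListMapper[int](*range(1000))  # unpack so each number is its own element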
# Use threading for I/O-bound tasks
result = data.map(
    lambda x: expensive_io_operation(x),
    backend=LocalBackend(use_threads=True, max_workers=10)
).collect()
# Use multiprocessing for CPU-bound tasks
result = data.map(
    lambda x: expensive_cpu_operation(x),
    backend=LocalBackend(use_processes=True, max_workers=4)
).collect()
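
For comparison, this is roughly what a thread-based parallel map looks like with the standard library alone. It is a sketch that assumes nothing about how `LocalBackend` is built; `slow_io` is a hypothetical stand-in for an I/O-bound call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io(x):
    """Hypothetical stand-in for an I/O-bound call (network, disk, ...)."""
    time.sleep(0.05)
    return x * 2

items = list(range(20))

# executor.map returns results in input order, like the built-in map.
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(slow_io, items))

print(results[:5])  # [0, 2, 4, 6, 8]
```

With the standard library, switching to `ProcessPoolExecutor` for CPU-bound work requires the function to be picklable, which generally means a module-level `def` rather than a lambda. Whether the same restriction applies to `LocalBackend(use_processes=True)` is not stated here, so it is worth checking before passing lambdas to a process-based backend.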

Async Backend

from functional_list import ListMapper, AsyncBackend
import asyncio
async def async_fetch(url):
    # Your async code here
    pass
data = ListMapper[str]("url1", "url2", "url3")
result = data.map(async_fetch, backend=AsyncBackend()).collect()
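
As a point of reference, the standard-library way to run one coroutine per element concurrently is `asyncio.gather`. The sketch below assumes nothing about `AsyncBackend` internals; `fake_fetch` is a hypothetical coroutine that only simulates a network call.

```python
import asyncio

async def fake_fetch(url):
    """Hypothetical coroutine; a real one would await an HTTP client."""
    await asyncio.sleep(0.01)
    return f"body of {url}"

async def fetch_all(urls):
    # gather runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(fake_fetch(u) for u in urls))

pages = asyncio.run(fetch_all(["url1", "url2", "url3"]))
print(pages)  # ['body of url1', 'body of url2', 'body of url3']
```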

Ray Backend (Distributed Computing)

from functional_list import ListMapper, RayBackend
# Requires: pip install functional-list[ray]
# complex_computation stands in for your own function
data = ListMapper[int](*range(10000))  # unpack so each number is its own element
result = data.map(
    lambda x: complex_computation(x),
    backend=RayBackend(num_cpus=8)
).collect()

Dask Backend (Distributed Computing)

from functional_list import ListMapper, DaskBackend
# Requires: pip install functional-list[dask]
# complex_computation stands in for your own function
data = ListMapper[int](*range(10000))  # unpack so each number is its own element
result = data.map(
    lambda x: complex_computation(x),
    backend=DaskBackend(n_workers=4)
).collect()

📁 File I/O Operations

`functional_list` provides built-in support for reading data from various file formats:

Supported Formats

| Format | Description | Requires |
| --- | --- | --- |
| CSV | Comma-separated values | Built-in |
| JSON | JSON arrays or objects | Built-in |
| JSONL | JSON Lines (one object per line) | Built-in |
| Parquet | Columnar storage format | `pyarrow` |
| Text | Plain text files | Built-in |

Reading CSV Files

from functional_list import ListMapper
from functional_list.io import CSVReadOptions
# Read CSV with custom options
users = ListMapper.from_csv(
    "users.csv",
    options=CSVReadOptions(
        skip_header=True,
        delimiter=",",
        encoding="utf-8"
    ),
    transform=lambda row: {
        "name": row[0],
        "age": int(row[1]),
        "email": row[2]
    }
)
# Process the data
adults = users.filter(lambda user: user["age"] >= 18)
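
The steps in this example (skip the header, split on a delimiter, transform each row, filter) can be sketched with the standard `csv` module. This is an illustration of the data flow only, not of `from_csv` itself; the in-memory file contents are invented for the example.

```python
import csv
import io

# In-memory stand-in for users.csv (hypothetical contents).
raw = io.StringIO(
    "name,age,email\n"
    "Ada,36,ada@example.com\n"
    "Tim,12,tim@example.com\n"
)

reader = csv.reader(raw, delimiter=",")
next(reader)  # skip the header row

# Same row-to-dict transform as in the example above.
users = [
    {"name": row[0], "age": int(row[1]), "email": row[2]}
    for row in reader
]
adults = [user for user in users if user["age"] >= 18]

print(adults)  # [{'name': 'Ada', 'age': 36, 'email': 'ada@example.com'}]
```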

Reading JSON Files

from functional_list import ListMapper
# Read JSON array
data = ListMapper.from_json("data.json")
# Read and transform
names = (
    ListMapper.from_json("users.json")
    .map(lambda user: user["name"])
    .filter(lambda name: len(name) > 3)
)

Reading JSONL Files

from functional_list import ListMapper
# Each line is a separate JSON object
events = ListMapper.from_jsonl("events.jsonl")
# Process streaming logs
errors = (
    events
    .filter(lambda e: e.get("level") == "ERROR")
    .map(lambda e: e["message"])
)
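
JSON Lines is simply one JSON document per line, so the format can be parsed with the standard `json` module. The sketch below shows that idea on invented in-memory data; skipping blank lines is a choice made for this sketch, and whether `from_jsonl` does the same is not stated here.

```python
import io
import json

# In-memory stand-in for events.jsonl (hypothetical contents).
raw = io.StringIO(
    '{"level": "INFO", "message": "started"}\n'
    '{"level": "ERROR", "message": "disk full"}\n'
    '\n'
    '{"message": "no level field"}\n'
)

# Parse one JSON document per non-empty line.
events = [json.loads(line) for line in raw if line.strip()]

# e.get("level") returns None for records without a "level" key.
errors = [e["message"] for e in events if e.get("level") == "ERROR"]

print(errors)  # ['disk full']
```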

Reading Parquet Files

from functional_list import ListMapper
# Read entire Parquet file
data = ListMapper.from_parquet("data.parquet")
# Read specific columns only
users = ListMapper.from_parquet(
    "users.parquet",
    columns=["name", "age", "country"]
)
# Process efficiently
summary = (
    users
    .filter(lambda u: u["country"] == "USA")
    .map(lambda u: u["age"])
    .reduce(lambda x, y: x + y)
)

Reading Text Files

from functional_list import ListMapper
from functional_list.io import TextReadOptions
# Read with options
lines = ListMapper.from_text(
    "log.txt",
    options=TextReadOptions(
        strip_lines=True,      # Remove whitespace
        skip_empty=True,       # Skip empty lines
        encoding="utf-8"
    )
)
# Process log file
error_lines = (
    lines
    .filter(lambda line: "ERROR" in line)
    .map(lambda line: line.split("|"))
)
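
The effect of stripping lines and skipping empty ones, followed by the filter and split, can be sketched in plain Python. The sample text is invented, and the sketch does not claim to match `TextReadOptions` exactly.

```python
import io

# In-memory stand-in for log.txt (hypothetical contents).
raw = io.StringIO("  INFO | boot  \n\nERROR | disk | full\n   \n")

# Strip surrounding whitespace, then drop lines that end up empty.
lines = [line.strip() for line in raw]
lines = [line for line in lines if line]

error_fields = [line.split("|") for line in lines if "ERROR" in line]

print(lines)         # ['INFO | boot', 'ERROR | disk | full']
print(error_fields)  # [['ERROR ', ' disk ', ' full']]
```

Note that `split("|")` keeps the whitespace around each field, so a further `strip()` per field may be needed depending on the log layout.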

📚 Core API Reference

Transformation Methods

| Method | Description | Example |
| --- | --- | --- |
| `map(fn)` | Apply function to each element | `data.map(lambda x: x * 2)` |
| `filter(fn)` | Keep elements where fn returns True | `data.filter(lambda x: x > 0)` |
| `flat_map(fn)` | Map and flatten results | `data.flat_map(lambda x: [x, x*2])` |
| `reduce(fn)` | Reduce to single value | `data.reduce(lambda x, y: x + y)` |
| `reduce_by_key(fn)` | Reduce grouped by key | `pairs.reduce_by_key(lambda x, y: x + y)` |
| `group_by(fn)` | Group elements by key function | `data.group_by(lambda x: x % 2)` |
| `sort(key, reverse)` | Sort elements | `data.sort(key=lambda x: x)` |
| `distinct()` | Remove duplicates | `data.distinct()` |
| `take(n)` | Take first n elements | `data.take(10)` |
| `sample(n)` | Random sample of n elements | `data.sample(5)` |
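
For three of the less familiar transformations, the following plain-Python sketch shows one reasonable reading of the descriptions above. The return shapes and ordering used here (a dict of lists for grouping, first-seen order for duplicates) are assumptions made for illustration; the library's actual return types may differ.

```python
from itertools import chain

data = [3, 1, 3, 2, 1]

# flat_map: map each element to an iterable, then flatten one level.
flat = list(chain.from_iterable([x, x * 2] for x in data))

# group_by: collect elements under the key returned by a function.
groups = {}
for x in data:
    groups.setdefault(x % 2, []).append(x)

# distinct: drop duplicates; dict.fromkeys keeps first-seen order.
unique = list(dict.fromkeys(data))

print(flat)    # [3, 6, 1, 2, 3, 6, 2, 4, 1, 2]
print(groups)  # {1: [3, 1, 3, 1], 0: [2]}
print(unique)  # [3, 1, 2]
```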

Aggregation Methods

| Method | Description | Example |
| --- | --- | --- |
| `count()` | Count elements | `data.count()` |
| `sum()` | Sum numeric elements | `data.sum()` |
| `mean()` | Calculate mean | `data.mean()` |
| `min()` | Find minimum | `data.min()` |
| `max()` | Find maximum | `data.max()` |
| `collect()` | Materialize to list | `lazy_data.collect()` |

🎓 Advanced Examples

Processing Log Files

from functional_list import ListMapper
from datetime import datetime
# Parse and analyze log files
errors_by_hour = (
    ListMapper.from_text("app.log")
    .filter(lambda line: "ERROR" in line)
    .map(lambda line: line.split("|"))
    .map(lambda parts: {
        "timestamp": datetime.fromisoformat(parts[0]),
        "message": parts[2]
    })
    .map(lambda e: (e["timestamp"].hour, 1))
    .reduce_by_key(lambda x, y: x + y)
    .sort(key=lambda x: x[1], reverse=True)
)
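
The pipeline above expects each line to carry an ISO timestamp in its first `|`-separated field. The sketch below works through the same counting logic with the standard library on a few invented log lines; the `timestamp|level|message` layout is an assumption inferred from the field indexes used above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines in a "timestamp|level|message" layout.
lines = [
    "2026-01-10T09:15:00|ERROR|db timeout",
    "2026-01-10T09:40:00|INFO|retrying",
    "2026-01-10T09:55:00|ERROR|db timeout",
    "2026-01-10T14:05:00|ERROR|disk full",
]

# Keep error lines and extract the hour from the first field.
hours = (
    datetime.fromisoformat(line.split("|")[0]).hour
    for line in lines
    if "ERROR" in line
)

# most_common() sorts by count, highest first.
errors_by_hour = Counter(hours).most_common()

print(errors_by_hour)  # [(9, 2), (14, 1)]
```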

ETL Pipeline

from functional_list import ListMapper
# parse_user and enrich_user stand in for your own functions
# Load from multiple sources
csv_users = ListMapper.from_csv("users.csv", transform=parse_user)
json_users = ListMapper.from_json("new_users.json")
# Combine and process
all_users = (
    csv_users
    .union(json_users)
    .distinct()
    .filter(lambda u: u["active"])
    .map(lambda u: enrich_user(u))
)
# Save results
all_users.to_json("processed_users.json")

Parallel Web Scraping

from functional_list import ListMapper, LocalBackend
import requests
def fetch_page(url):
    # A timeout keeps one slow server from stalling a worker thread
    return requests.get(url, timeout=10).text
urls = ListMapper[str](
    "https://example.com/page1",
    "https://example.com/page2",
)
# Fetch pages in parallel
pages = urls.map(
    fetch_page,
    backend=LocalBackend(use_threads=True, max_workers=10)
)
# Extract data (parse_html and extract_links stand in for your own functions)
results = pages.map(parse_html).flat_map(extract_links).distinct()

🔧 Performance Tips

- Choose the right backend: use `LocalBackend` with threads for I/O-bound tasks and with processes for CPU-bound tasks
- Use lazy evaluation: build pipelines with `.lazy()` so that work is deferred until results are materialized
- Cache intermediate results: use `.cache()` on expensive computations
- Batch operations: combine multiple transformations before materializing
- Use Cython accelerators: ensure the extensions are compiled for numerical operations

🤝 Contributing

Contributions are welcome! Please check out our GitLab repository.

Development Setup

# Clone the repository
git clone https://gitlab.com/Tantelitiana22/list-function-python-project.git
cd list-function-python-project
# Install with development dependencies
uv sync --group dev --extra all
# Run tests
uv run pytest
# Run type checking
uv run mypy ./src/functional_list/
# Run linters
uv run flake8 ./src/functional_list/
uv run pylint ./src/functional_list/

📖 Documentation

Full Documentation (MkDocs)

Complete documentation is available at https://sensational-cobbler-2b96f1.netlify.app/

To run the documentation locally:

uv sync --group dev
mkdocs serve -f documentation/mkdocs.yml

Quick API Reference

from functional_list import ListMapper
# List all methods
print(dir(ListMapper))
# Get documentation for a specific method
print(ListMapper.map.__doc__)
# Get help
help(ListMapper.reduce_by_key)

❓ FAQ & Troubleshooting

Why am I getting "module not found" errors for Ray/Dask?

You need to install the optional dependencies:

pip install functional-list[ray]  # For Ray
pip install functional-list[dask]  # For Dask
pip install functional-list[all]   # For everything

Can I use this with Python 3.9 or earlier?

No, `functional_list` requires Python 3.10+. Earlier versions are not supported.

How do I improve performance for large datasets?

- Use lazy evaluation: call `.lazy()` to defer execution
- Choose appropriate backends (Ray/Dask for distributed computing)
- Use `.cache()` for intermediate results you'll reuse
- Ensure Cython extensions are compiled

Does this work with async functions?

Yes! Use the `AsyncBackend`:

from functional_list import ListMapper, AsyncBackend
async def async_operation(x):
    # Your async code
    pass
data = ListMapper[int](1, 2, 3)
result = data.map(async_operation, backend=AsyncBackend())

📄 License

This project is licensed under the terms specified in the LICENSE file.

👤 Author

Andrianarivo Tantelitiana RAKOTOARIJAONA

Email: tantelitiana22@gmail.com
GitLab: Tantelitiana22

🔗 Links

⭐ If you find this library useful, please consider giving it a star on GitLab!

Release history

- 0.2.2: Jan 11, 2026
- 0.2.1: Jan 10, 2026 (this version)
- 0.2.0: Jan 10, 2026
- 0.1.6: Jan 2, 2024
- 0.1.5: Dec 9, 2023
- 0.1.4: Jul 18, 2023
- 0.1.3: Mar 16, 2023
- 0.1.2: Dec 14, 2020
- 0.1.1: Dec 13, 2020
- 0.1.0: Dec 12, 2020

Download files

Source Distribution

- functional_list-0.2.1.tar.gz (28.9 kB), uploaded Jan 10, 2026

Built Distributions

- functional_list-0.2.1-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (275.6 kB), uploaded Jan 10, 2026. Tags: CPython 3.12, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64
- functional_list-0.2.1-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (261.8 kB), uploaded Jan 10, 2026. Tags: CPython 3.11, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64
- functional_list-0.2.1-cp310-cp310-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (247.1 kB), uploaded Jan 10, 2026. Tags: CPython 3.10, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64
