Randomized fast readline for large text files.
random-readline
Install
pip install random_readline
Usage
from random_readline import readline
# lines are shuffled by default.
n_lines, read = readline("text.txt")
for line in read():
    print(line)
Sequential read
from random_readline import readline
# lines are read in their original order, without shuffling.
n_lines, read = readline("text.txt", shuffle=False)
for line in read():
    print(line)
Gzipped file
import gzip
from random_readline import readline
n_lines, read = readline("text.txt.gz", opener=gzip.open)
for line in read():
    print(line)
Control the frequency of seeking
Since random seeking can be very slow with gzipped files, the readline function has an option, chunk_size, to control the frequency of seeking.
This value is set to 1 by default, which means that a seek is performed for every single line, so the entire file is read completely at random.
Increasing the value of chunk_size reduces how often seeks are performed, improving performance at the cost of randomness.
import gzip
from random_readline import readline
# lines are randomized in units of 100 lines.
n_lines, read = readline("text.txt.gz", opener=gzip.open, chunk_size=100)
for line in read():
    print(line)
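The trade-off between seek frequency and randomness can be illustrated with a small standalone sketch. This is not the code of random_readline itself. It is one plausible way to get the described behavior, assuming that lines are indexed by byte offset, grouped into consecutive chunks of chunk_size lines, and that only the order of the chunks is shuffled. The function name chunked_random_lines and the seed parameter are hypothetical and exist only for this illustration.

```python
import random
import tempfile
from pathlib import Path


def chunked_random_lines(path, chunk_size=1, opener=open, seed=None):
    """Hypothetical sketch: yield each line once, seeking once per chunk."""
    # First pass: record the byte offset at which every line starts.
    offsets = []
    with opener(path, "rb") as f:
        while True:
            pos = f.tell()
            if not f.readline():
                break
            offsets.append(pos)

    # Group consecutive lines into chunks and shuffle only the chunk order.
    starts = list(range(0, len(offsets), chunk_size))
    random.Random(seed).shuffle(starts)

    # Second pass: one seek per chunk, then sequential reads inside the chunk.
    with opener(path, "rb") as f:
        for start in starts:
            f.seek(offsets[start])
            for _ in range(min(chunk_size, len(offsets) - start)):
                yield f.readline().decode("utf-8")


# Small demonstration on a temporary 10-line file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"
    demo_path.write_bytes("".join(f"line {i}\n" for i in range(10)).encode("utf-8"))
    demo_lines = list(chunked_random_lines(demo_path, chunk_size=5, seed=0))
```

With chunk_size=1 this sketch seeks before every line and the order is fully random. With a larger chunk_size the number of seeks drops to roughly the line count divided by chunk_size, but lines inside a chunk keep their original relative order. Passing opener=gzip.open works the same way in this sketch, since gzip file objects also support tell, seek, and readline in binary mode.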