Script for indexing text files using line offsets.

FileIndexer

A tool for creating line-offset indexes as TSV files.

Install

You can download this repository or install the package directly with:

pip install fileindexer

Index format

In the basic case, the index is a one-column TSV file, with an optional headline, in which each row contains a file line offset.

If a JSONL file is provided, the indexer can also map a key to the corresponding file line offset. In that case the index is a two-column TSV file: the first column contains the key and the second the corresponding file offset.
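To illustrate why a line-offset index is useful, here is a minimal sketch in plain Python. It does not use the fileindexer API; `build_offset_index` and `read_line_at` are hypothetical helpers, and the sketch assumes that each offset is the byte position at which a line starts and that the index has no headline.

```python
import os
import tempfile


def build_offset_index(data_path, index_path):
    """Write the starting byte offset of every line of data_path, one per row."""
    with open(data_path, "rb") as data, open(index_path, "w") as index:
        offset = data.tell()
        line = data.readline()
        while line:
            index.write(f"{offset}\n")
            offset = data.tell()
            line = data.readline()


def read_line_at(data_path, index_path, line_number):
    """Return line `line_number` (0-based) by seeking straight to its offset."""
    with open(index_path) as index:
        offsets = [int(row) for row in index if row.strip()]
    with open(data_path, "rb") as data:
        data.seek(offsets[line_number])
        return data.readline().decode("utf-8").rstrip("\n")


def demo():
    """Build an index for a small made-up JSONL file and fetch its last line."""
    with tempfile.TemporaryDirectory() as tmp:
        data_path = os.path.join(tmp, "toy.jsonl")
        index_path = os.path.join(tmp, "toy.jsonl.index")
        with open(data_path, "w", encoding="utf-8", newline="\n") as f:
            f.write('{"k": "a", "v": 1}\n{"k": "b", "v": 2}\n{"k": "c", "v": 3}\n')
        build_offset_index(data_path, index_path)
        return read_line_at(data_path, index_path, 2)


if __name__ == "__main__":
    print(demo())
```

With such an index, fetching a single line costs one seek instead of a scan over all preceding lines.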

See the examples folder, which contains a toy example, to get more familiar with the format.

Examples

This repository contains an examples folder with the variants of indexes that can be created. Below is the list of commands that were used to create those indexes:

toy_basic.jsonl.index

fileindexer examples/toy.jsonl examples/toy_basic.jsonl.index

toy_basic_with_headline.jsonl.index

fileindexer examples/toy.jsonl examples/toy_basic_with_headline.jsonl.index --headline --name_offset "file_line_offset"

toy_key_mapping.jsonl.index

fileindexer examples/toy.jsonl examples/toy_key_mapping.jsonl.index --key k

toy_key_mapping_with_headline.jsonl.index

fileindexer examples/toy.jsonl examples/toy_key_mapping_with_headline.jsonl.index --key k --headline --name_key key --name_offset "file_line_offset"
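A key-mapping index with a headline could be consumed roughly as sketched below. This is plain Python, not part of fileindexer: `load_key_index` and `lookup` are hypothetical helpers, the sample records are made up, and the sketch assumes that the first column holds the key, the second holds the byte offset of the line start, and the first row is the headline.

```python
import csv
import io
import json


def load_key_index(index_file, has_headline=True):
    """Read a two-column TSV (key, offset) into a dict of key -> int offset."""
    reader = csv.reader(index_file, delimiter="\t")
    if has_headline:
        next(reader, None)  # skip the row with the column names
    return {key: int(offset) for key, offset in reader}


def lookup(data_file, key_index, key):
    """Seek to the line stored for `key` and parse it as JSON."""
    data_file.seek(key_index[key])
    return json.loads(data_file.readline())


def demo():
    """Look up the record with key "b" in an in-memory toy JSONL file."""
    lines = [b'{"k": "a", "v": 1}\n', b'{"k": "b", "v": 2}\n']
    data = io.BytesIO(b"".join(lines))
    # The second record starts right after the first one ends.
    index_text = "key\tfile_line_offset\na\t0\nb\t%d\n" % len(lines[0])
    key_index = load_key_index(io.StringIO(index_text))
    return lookup(data, key_index, "b")


if __name__ == "__main__":
    print(demo())
```

Loading the index into a dictionary keeps lookups constant-time, at the cost of holding all keys in memory.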

