A pure-Python I/O interface for accessing Kaldi data

Kaldi Python IO

A Python (3.6+) wrapper for reading and writing Kaldi data.

Supported Types

  • Kaldi binary archives (*.ark)
  • Kaldi script files (alignments & features, *.scp)
  • Kaldi nnet3 examples in binary format (*.egs)
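For orientation, the sketch below round-trips one record in what is, to the best of my understanding, the layout Kaldi uses for an uncompressed binary float matrix in an archive: the key and a space, the binary marker `\0B`, the token `FM `, two size-prefixed 32-bit integers (rows, columns), then row-major float32 data. The helpers `write_float_matrix` and `read_float_matrix` are hypothetical names used only for illustration. They are not this package's API and they ignore double-precision, compressed, and vector records.

```python
import io
import struct


def write_float_matrix(stream, key, rows):
    """Write one binary float32 matrix record: key, header, row-major data."""
    n_rows, n_cols = len(rows), len(rows[0])
    stream.write(key.encode("ascii") + b" ")           # key, then a space
    stream.write(b"\0B")                                # binary-mode marker
    stream.write(b"FM ")                                # float32 matrix token
    stream.write(b"\x04" + struct.pack("<i", n_rows))   # size byte + int32 rows
    stream.write(b"\x04" + struct.pack("<i", n_cols))   # size byte + int32 cols
    for row in rows:
        stream.write(struct.pack(f"<{n_cols}f", *row))


def read_float_matrix(stream):
    """Read back one record written by write_float_matrix."""
    key = bytearray()
    while True:
        ch = stream.read(1)
        if ch in (b"", b" "):                           # end of stream or end of key
            break
        key += ch
    if stream.read(2) != b"\0B" or stream.read(3) != b"FM ":
        raise ValueError("not a binary float matrix record")
    dims = []
    for _ in range(2):
        if stream.read(1) != b"\x04":
            raise ValueError("expected a 4-byte integer")
        dims.append(struct.unpack("<i", stream.read(4))[0])
    n_rows, n_cols = dims
    rows = [
        list(struct.unpack(f"<{n_cols}f", stream.read(4 * n_cols)))
        for _ in range(n_rows)
    ]
    return key.decode("ascii"), rows


buf = io.BytesIO()
write_float_matrix(buf, "mat-0", [[1.0, 2.0], [3.0, 4.0]])
buf.seek(0)
key, mat = read_float_matrix(buf)
print(key, mat)
```

In practice the readers and writers listed under Usage take care of these details; the sketch only shows roughly what sits inside an *.ark file.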

Install

Run python setup.py install from a source checkout, or pip install kaldi-python-io to install from PyPI.

Usage

  • ArchiveReader && AlignArchiveReader

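    # assumes the package is importable as kaldi_python_io
    from kaldi_python_io import ArchiveReader, AlignArchiveReader

    # archive readers support sequential access only
    ark_reader = ArchiveReader("copy-feats ark:foo.ark ark:- |")
    for key, _ in ark_reader:
        print(key)
    ali_reader = AlignArchiveReader("gunzip -c foo.ali.gz |")
    for key, _ in ali_reader:
        print(key)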
    
  • Nnet3EgsReader

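    # assumes the package is importable as kaldi_python_io
    from kaldi_python_io import Nnet3EgsReader

    # the egs reader supports sequential access only
    egs_reader = Nnet3EgsReader("foo.egs")
    for key, _ in egs_reader:
        print(key)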
    
  • ArchiveWriter

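    import numpy as np
    # assumes the package is importable as kaldi_python_io
    from kaldi_python_io import ArchiveWriter

    # write ten random 100 x 20 matrices to foo.ark and index them in foo.scp
    with ArchiveWriter("foo.ark", "foo.scp") as writer:
        for i in range(10):
            mat = np.random.rand(100, 20)
            writer.write(f"mat-{i}", mat)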
    
  • ScriptReader && AlignScriptReader

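    # assumes the package is importable as kaldi_python_io
    from kaldi_python_io import ScriptReader, AlignScriptReader

    # script readers support both sequential and random (by key) access
    scp_reader = ScriptReader("shuf foo.scp | head -n 2")
    for key, mat in scp_reader:
        print(f"{key}: {mat.shape}")
    ali_reader = AlignScriptReader("foo.ali.scp")
    for key, ali in ali_reader:
        print(f"{key}: {ali.shape}")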
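The split between sequential-only archive readers and sequential/random script readers follows from the file formats: a script file maps each key to a location (typically path:offset into an archive, or a piped command), so a reader can seek straight to one entry, whereas a bare archive or a pipe has to be read front to back. The sketch below illustrates that idea with the standard library only. `parse_scp_line` and `build_index` are hypothetical helpers, the in-memory "archive" uses a toy newline-separated payload rather than real Kaldi matrices, and none of this is the package's actual implementation. Range specifiers such as [0:10] after the offset are not handled.

```python
import io


def parse_scp_line(line):
    """Split one .scp line into (key, location, offset); offset may be None."""
    key, value = line.strip().split(maxsplit=1)
    if value.endswith("|"):
        # Piped command: nothing to seek into, so no offset.
        return key, value, None
    path, sep, offset = value.rpartition(":")
    if sep and offset.isdigit():
        return key, path, int(offset)
    # Plain path without an offset.
    return key, value, None


def build_index(scp_lines):
    """Map each key to its (location, offset) pair."""
    index = {}
    for line in scp_lines:
        key, location, offset = parse_scp_line(line)
        index[key] = (location, offset)
    return index


# Toy archive: each record is "<key> <4-byte payload>\n".
archive = io.BytesIO(b"utt1 AAAA\nutt2 BBBB\n")
# Offsets point just past "<key> ", i.e. at the start of each payload.
index = build_index(["utt1 foo.ark:5", "utt2 foo.ark:15"])

# Random access: jump directly to utt2 without reading utt1.
_, offset = index["utt2"]
archive.seek(offset)
payload = archive.read(4)
print(payload)
```

This is also why the ScriptReader example above can shuffle foo.scp before reading: the order of the lines in the script file does not have to match the order of the records in the archive.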
    
