A utility for storing and reading files for LM training.
LM_Dataformat
![Coverage Status](https://pypi-camo.freetls.fastly.net/4fed0e15f5f7c84df38d8dc3a0f297e288a808f3/68747470733a2f2f636f766572616c6c732e696f2f7265706f732f6769746875622f6c656f67616f322f6c6d5f64617461666f726d61742f62616467652e7376673f6272616e63683d6d6173746572)
Utilities for storing data for LM training.
Basic Usage
To write:
from lm_dataformat import Archive

ar = Archive('output_dir')

for x in something():
    # do other stuff
    ar.add_data(somedocument, meta={
        'example': stuff,
        'someothermetadata': [othermetadata, otherrandomstuff],
        'otherotherstuff': True
    })

# remember to commit at the end!
ar.commit()
To read:
from lm_dataformat import Reader

rdr = Reader('input_dir_or_file')
for doc in rdr.stream_data():
    # do something with the document
    pass
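To make the add, commit, and stream pattern concrete, here is a minimal standard-library sketch of the same workflow. It is not lm_dataformat's implementation: `MiniArchive`, the module-level `stream_data` function, `round_trip`, the gzip-compressed JSON-lines layout, and the in-memory buffering are all assumptions made for illustration. It only shows why a final commit matters in a design of this shape: documents that were added but never committed cannot be read back.

```python
import gzip
import json
import os
import tempfile


class MiniArchive:
    """Hypothetical stand-in for illustration; not lm_dataformat's Archive."""

    def __init__(self, out_dir):
        self.out_dir = out_dir
        os.makedirs(out_dir, exist_ok=True)
        self._pending = []  # records added since the last commit
        self._commits = 0   # gives each committed file a unique name

    def add_data(self, text, meta=None):
        # Nothing touches the disk here; the record is only buffered.
        self._pending.append({"text": text, "meta": meta or {}})

    def commit(self):
        # Write all buffered records as one gzip-compressed JSON-lines file.
        if not self._pending:
            return None
        path = os.path.join(self.out_dir, f"data_{self._commits}.jsonl.gz")
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            for record in self._pending:
                fh.write(json.dumps(record) + "\n")
        self._pending = []
        self._commits += 1
        return path


def stream_data(in_dir):
    """Yield document texts from every committed file, in file-name order."""
    for name in sorted(os.listdir(in_dir)):
        if not name.endswith(".jsonl.gz"):
            continue
        with gzip.open(os.path.join(in_dir, name), "rt", encoding="utf-8") as fh:
            for line in fh:
                yield json.loads(line)["text"]


def round_trip(docs):
    """Write docs to a temporary directory, commit, and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        ar = MiniArchive(tmp)
        for i, doc in enumerate(docs):
            ar.add_data(doc, meta={"index": i})
        before_commit = list(stream_data(tmp))  # nothing is readable yet
        ar.commit()
        after_commit = list(stream_data(tmp))
    return before_commit, after_commit
```

Calling `round_trip(["first document", "second document"])` returns an empty list for the read attempted before the commit and both documents, in insertion order, for the read after it.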
Download files
Source Distribution
lm_dataformat-0.0.12.tar.gz (3.7 kB)
Built Distribution
Hashes for lm_dataformat-0.0.12-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 82271461938fef7ce5bbeb16c8faaa0b382baa399781aa8a189e062c8c654a39
MD5 | b0f4f5f9098a11d18cf30bcd99d5e77f
BLAKE2b-256 | 4738434a28bde5bbd233962f06829593abdccde91bb5b8fb5fd0d211f99389e5