
Contains MLTable loading and authoring APIs for the mltable package.


# mltable: machine learning table data toolkit

MLTable is a Python package that provides fast, flexible data loading functions designed to make accessing “tabular” data easy and intuitive. MLTable abstracts the schema definition for tabular data, which makes it easier to materialize a table into a Pandas DataFrame. MLTable can be used with delimited text files, Parquet files, Delta Lake tables, and JSON Lines files, read from a cloud object store or local disk.

## Main Features

Here are a few things that mltable does well:

  • Flexible sampling and filtering functionality on large data

  • Robust IO tools for loading data from flat files (CSV and delimited), Parquet files, Delta Lake tables, and JSON Lines files

  • Capturing and defining schema contained in flat files

  • Fast materialization of data into Pandas DataFrame

## Getting started

You can install the MLTable package via pip: `pip install mltable`

Note that the MLTable package is pre-installed on AzureML compute instances.
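
After installation, a first session might look like the sketch below. It loads a delimited file, keeps a few columns, takes a random sample, and materializes the result into a Pandas DataFrame. The factory and authoring calls used (`from_delimited_files`, `keep_columns`, `take_random_sample`, `to_pandas_dataframe`) are the package's own; the helper names `path_spec` and `load_sample`, the default sampling probability, and the seed are assumptions made for illustration only.

```python
def path_spec(csv_path):
    """Build the `paths` argument: a list of {"file": ...} dictionaries."""
    return [{"file": str(csv_path)}]


def load_sample(csv_path, columns, probability=0.1, seed=42):
    """Load a CSV, keep `columns`, sample rows, and return a Pandas DataFrame.

    Hypothetical helper: the defaults here are illustrative, not recommendations.
    """
    # Imported here so that path_spec() stays usable without mltable installed.
    import mltable

    tbl = mltable.from_delimited_files(paths=path_spec(csv_path))
    tbl = tbl.keep_columns(columns)
    tbl = tbl.take_random_sample(probability=probability, seed=seed)
    # Nothing is read until the table is materialized.
    return tbl.to_pandas_dataframe()
```

A call such as `load_sample("./titanic.csv", ["Name", "Age"])` would return roughly a tenth of the rows, assuming a local file with those column names exists.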

## Documentation

The official documentation is hosted at [Create a mltable data asset](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-data-assets?tabs=cli#create-a-mltable-data-asset).

The metadata file of an MLTable artifact is named `MLTable`, and it adheres to the [AzureML MLTable schema](https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-mltable).
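
As a rough illustration of what such a metadata file can contain, the sketch below renders a small YAML document and writes it to a file named `MLTable` (no extension) inside a folder. The keys `paths`, `transformations`, and `read_delimited` follow the schema reference linked above; the helper names `render_mltable_yaml` and `write_mltable_file` and the `./data.csv` path are hypothetical, and the option set shown is only a small subset of the schema.

```python
from pathlib import Path


def render_mltable_yaml(data_file, delimiter=",", encoding="utf8"):
    """Return the text of a minimal MLTable file for one delimited file."""
    return (
        "paths:\n"
        f"  - file: {data_file}\n"
        "transformations:\n"
        "  - read_delimited:\n"
        f"      delimiter: '{delimiter}'\n"
        f"      encoding: {encoding}\n"
        "      header: all_files_same_headers\n"
    )


def write_mltable_file(folder, data_file="./data.csv"):
    """Write the metadata file into `folder` and return its path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "MLTable"  # the file name has no extension
    target.write_text(render_mltable_yaml(data_file), encoding="utf-8")
    return target
```

A folder prepared this way is the kind of artifact that `mltable.load` is meant to read, provided the data file referenced under `paths` actually exists relative to that folder.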

# Release History

## 1.3.1 (2023-04-26)

### Features Added
  • bugfix (support more encoding variants, mltable save/load roundtrip)

## 1.3.0 (2023-04-07)

### Features Added
  • bugfix (user error mapping, mltable save/load roundtrip)

## 1.2.0 (2023-02-22)

### Features Added
  • bugfix (mltable save/load, validation schema)

## 1.1.0 (2023-01-26)

### Features Added
  • bugfix (fix schema, flake8 errors)

  • improve logging and exception message

## 1.0.0 (2022-12-05)

### Features Added
  • Factory APIs (from_delta_lake)

  • Authoring APIs (convert_column_types, save, skip, etc.)

## 0.1.0b4 (2022-10-05)

### Features Added
  • Factory APIs (from_paths, from_delimited_files, from_parquet_files, from_json_lines_files)

  • Authoring APIs (keep_columns, drop_columns, take_random_sample, take, etc.)

  • Support for loading an mltable from a data asset URI

## 0.1.0b3 (2022-06-30)

## 0.1.0b2 (2022-05-23)

## 0.1.0b1 (2022-05-17)

### Features Added
  • Initial public preview release to load data into a Pandas DataFrame


