mldatalib

Library for data analysis - extracting, storing and retrieving features

Project description

mldatalib (Machine Learning Data Library) is a Python library that simplifies extracting features for machine learning from files. It stores features in a SQLite database, offers label transformation options, and provides functions that convert features to NumPy arrays, among other utilities. Original idea and feature list by Viktor Evstratov (viktor.evst@gmail.com). The library was originally designed for use with the Galaxy Zoo challenge on Kaggle. It requires numpy and SQLAlchemy.

Why?

This is an attempt to minimize the effort needed to extract, save and retrieve new features, so that a user can spend more time on more ‘scientific’ work. If several users are working on the same project, each can extract an independent set of features, then share the database files and copy over the features they are missing.
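The sharing workflow described above can be sketched with the standard-library sqlite3 module. This is only an illustration under stated assumptions: the `features` table layout and the `copy_missing_features` helper are hypothetical and are not mldatalib's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one row per (item, feature name) pair.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS features ("
    "item_id TEXT, name TEXT, value TEXT, PRIMARY KEY (item_id, name))"
)


def count_rows(conn):
    """Return the number of stored feature rows."""
    return conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]


def copy_missing_features(source, target):
    """Copy rows from source into target, skipping keys target already has."""
    target.execute(SCHEMA)
    before = count_rows(target)
    rows = source.execute("SELECT item_id, name, value FROM features").fetchall()
    # INSERT OR IGNORE leaves existing (item_id, name) rows untouched.
    target.executemany("INSERT OR IGNORE INTO features VALUES (?, ?, ?)", rows)
    target.commit()
    return count_rows(target) - before


# Two users extracted different features for the same item.
alice = sqlite3.connect(":memory:")
bob = sqlite3.connect(":memory:")
for conn in (alice, bob):
    conn.execute(SCHEMA)
alice.execute("INSERT INTO features VALUES ('img1', 'color_hist', '[0.2, 0.8]')")
alice.execute("INSERT INTO features VALUES ('img1', 'edge_count', '[99]')")
bob.execute("INSERT INTO features VALUES ('img1', 'edge_count', '[17]')")

copied = copy_missing_features(alice, bob)
print(copied)
```

In this sketch, features that already exist in the target database are kept as they are, so only the rows the second user lacks are added.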

Functionality

Basic functionality includes: extracting features and storing them in a database (the user provides the extractor function); retrieving features by name; extracting and transforming labels from a file and storing them in a database; copying features from one database to another; and returning features as a NumPy array.
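To make the extract, store, retrieve cycle concrete, here is a minimal sketch that uses sqlite3 and numpy directly. It does not use mldatalib's real API: the function names `extract_and_store` and `features_as_array`, the `features` table, and the choice of pickle for serialization are all assumptions made for illustration.

```python
import pickle
import sqlite3

import numpy as np


def extract_and_store(conn, feature_name, items, extractor):
    """Run a user-provided extractor on each item and store the results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        "item_id TEXT, name TEXT, value BLOB, PRIMARY KEY (item_id, name))"
    )
    rows = [
        (item_id, feature_name, pickle.dumps(extractor(data)))
        for item_id, data in items
    ]
    conn.executemany("INSERT OR REPLACE INTO features VALUES (?, ?, ?)", rows)
    conn.commit()


def features_as_array(conn, feature_name):
    """Return one named feature for all items as an array, ordered by item_id."""
    cursor = conn.execute(
        "SELECT value FROM features WHERE name = ? ORDER BY item_id",
        (feature_name,),
    )
    return np.array([pickle.loads(blob) for (blob,) in cursor])


# Toy data standing in for files; the extractor computes [mean, max].
conn = sqlite3.connect(":memory:")
items = [("img1", [1.0, 2.0, 3.0]), ("img2", [4.0, 5.0, 6.0])]
extract_and_store(conn, "mean_and_max", items, lambda d: [sum(d) / len(d), max(d)])

X = features_as_array(conn, "mean_and_max")
print(X.shape)
```

Each row of `X` corresponds to one item, which is the layout most machine learning libraries expect for a feature matrix.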

Roadmap

Add a pure SQL way of storing and retrieving features (by converting them to JSON and creating columns via ALTER TABLE statements), which would allow easy use from other languages. At a minimum, add this for NumPy arrays and lists.
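One possible shape for this roadmap item, sketched with the standard library only. Since the feature is not implemented, everything here is an assumption: the `items` table, the one-column-per-feature layout, and the helper names `add_feature_column`, `store_feature` and `load_feature` are hypothetical.

```python
import json
import sqlite3


def add_feature_column(conn, feature_name):
    """Add a TEXT column for the feature via ALTER TABLE if it is missing."""
    # Column names cannot be bound as parameters, so validate before quoting.
    if not feature_name.isidentifier():
        raise ValueError(f"invalid feature name: {feature_name!r}")
    existing = {row[1] for row in conn.execute("PRAGMA table_info(items)")}
    if feature_name not in existing:
        conn.execute(f'ALTER TABLE items ADD COLUMN "{feature_name}" TEXT')


def store_feature(conn, item_id, feature_name, values):
    """Store a list (or NumPy array) as JSON text in the feature's column."""
    add_feature_column(conn, feature_name)
    # NumPy arrays expose tolist(); plain sequences are copied with list().
    plain = values.tolist() if hasattr(values, "tolist") else list(values)
    conn.execute("INSERT OR IGNORE INTO items (item_id) VALUES (?)", (item_id,))
    conn.execute(
        f'UPDATE items SET "{feature_name}" = ? WHERE item_id = ?',
        (json.dumps(plain), item_id),
    )


def load_feature(conn, item_id, feature_name):
    """Read the JSON text back and decode it into a Python list."""
    if not feature_name.isidentifier():
        raise ValueError(f"invalid feature name: {feature_name!r}")
    row = conn.execute(
        f'SELECT "{feature_name}" FROM items WHERE item_id = ?', (item_id,)
    ).fetchone()
    return json.loads(row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_id TEXT PRIMARY KEY)")
store_feature(conn, "img1", "color_hist", [0.2, 0.8])
loaded = load_feature(conn, "img1", "color_hist")
print(loaded)
```

Because the stored values are plain JSON text in ordinary columns, any language with a SQLite driver and a JSON parser could read them.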

Add a dataset class for CSV files (where all data is stored in a single CSV file).

Project details

Release history

Version	Release date
0.2.1 (this version)	Mar 27, 2014
0.2	Mar 25, 2014
0.1.4	Mar 21, 2014
0.1.3	Mar 20, 2014
0.1.2	Mar 19, 2014
0.1.1	Mar 19, 2014
0.1	Mar 16, 2014

Download files

Source Distribution

mldatalib-0.2.1.tar.gz (12.2 kB)

Uploaded Mar 27, 2014 Source

File details

Details for the file mldatalib-0.2.1.tar.gz.

File metadata

Download URL: mldatalib-0.2.1.tar.gz
Upload date: Mar 27, 2014
Size: 12.2 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for mldatalib-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`526e6d45cee3feebdd80afab459518fb5f5b286305124b0f746a291271490a88`
MD5	`20b0669d93640d2ad420a6dc4e5abcb7`
BLAKE2b-256	`8dbc17d0f7a42c96b3b5ad6592afde3d447dc64f91179d88af3532319894861e`
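A downloaded archive can be checked against the SHA256 digest listed above with Python's hashlib. This is a minimal sketch: the helper names `sha256_of_file` and `verify_download` are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published for mldatalib-0.2.1.tar.gz.
EXPECTED_SHA256 = "526e6d45cee3feebdd80afab459518fb5f5b286305124b0f746a291271490a88"


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected
```

Calling `verify_download("mldatalib-0.2.1.tar.gz")` on the saved file should return True only if the archive is byte-for-byte identical to the published one.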