Utility library and scripts for simpler data-processing tasks

Project description

Author: Craig Kelly

Introduction

Source Note: The authoritative version of this file is the Markdown version. The RST version is automatically created from the Markdown by pandoc.

This is an Apache-licensed library and set of command-line tools for simple data-processing tasks and pipelines. It assumes that you will use it alongside tools like dmk and that serious work will be done with serious tools (like jupyterlab and scipy).

If it feels like a mishmash of functionality, that’s because it is. This is mainly a collection of odds and ends that keeps getting used in projects in a very specific analytics and data science team.

Installing

The normal way:

$ pip install datasimple

However, we use Python 3 and prefer user installs, so on a system like Ubuntu you probably want:

$ python3 -m pip install --user --upgrade datasimple

HOWEVER, the CORRECT usage is a Pipfile controlled by pipenv.

See below (in Hacking) for installing in development mode if you need to make source code changes.
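Whichever install route you pick, it can be handy to confirm which version actually landed in the active environment. The following is a minimal sketch using the standard-library importlib.metadata module (Python 3.8 or later). The helper name installed_version is made up for illustration and is not part of datasimple.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if absent.

    NOTE: this helper is illustrative only; it is not a datasimple API.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("datasimple")
    if version is None:
        print("datasimple is not installed in this environment")
    else:
        print("datasimple", version)
```

Run it with the same interpreter you installed into (for example, inside the pipenv shell), since a user install and a virtualenv can hold different versions.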

What you get

The datasimple library and some handy scripts (see ./bin). Of note is a class designed to help you write scripts to convert anything to Excel spreadsheets. (Once again, this is functionality we need for a particular business environment. It is expressly NOT an endorsement of Excel for data science.)

Requirements

This is Python 3. Don’t submit requests for Python 2 compatibility.

See setup.py for dependencies (which will get installed automatically when you install this package with pip).

Hacking

You should be developing in a virtualenv. Since you are probably forced to work in a Vagrant Ubuntu VM on a Windows machine, and you’ll want to use the shared /vagrant folder, you might want to consider using pipenv and pyenv with the virtualenv plugin.

Use make test for testing (which will also handle linting). In fact, see the Makefile for what we automate with this project.

Contributing

The following guidelines are used when accepting external contributions:

  • ./lint should not find any issues

  • There should be appropriate tests added to the appropriate module in ./tests

  • There should be an existing and compelling use case.

The ./lint script in the root of this repo uses pylama which you must install. Currently it also expects a pylama linter plugin called “quotes”. See Craig (the maintainer) for this plugin. NOTE: if even ONE PERSON contacts me I’ll make that plugin public :)

If you don’t currently have pylama installed, you can get the latest installed for your user with python3 -m pip install --user --upgrade pylama.

You should also test using the ./test script in the root of this repo. It runs tests using nosetests. Our setup also requires the package nose-exclude. However, the test script delegates via setup.py so you shouldn’t need to worry about this.
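As a rough sketch of what a contributed test might look like: nosetests should collect ordinary unittest.TestCase classes from test modules, so an addition under ./tests can be plain unittest code. The function normalize_header below is hypothetical and merely stands in for whatever code you are testing. It is not part of datasimple, and in a real test module you would import the code under test rather than define it inline.

```python
import unittest


def normalize_header(text):
    """Hypothetical stand-in for the code under test (not a datasimple API)."""
    # Trim, lowercase, and join whitespace-separated words with underscores.
    return "_".join(text.strip().lower().split())


class NormalizeHeaderTests(unittest.TestCase):
    def test_collapses_whitespace(self):
        self.assertEqual(normalize_header("  Unit   Price "), "unit_price")

    def test_empty_string(self):
        self.assertEqual(normalize_header(""), "")
```

Saved under a file name that begins with test_, a module like this can also be run on its own with python3 -m unittest while you iterate, before running the full ./test script.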

Note that both pylama and nosetests have configuration specified in setup.cfg.

Download files


Source Distribution

datasimple-1.0.7.tar.gz (29.1 kB)

Uploaded Source

Built Distribution

datasimple-1.0.7-py2.py3-none-any.whl (35.9 kB)

Uploaded Python 2 Python 3

File details

Details for the file datasimple-1.0.7.tar.gz.

File metadata

  • Download URL: datasimple-1.0.7.tar.gz
  • Size: 29.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.22.0 CPython/3.5.2

File hashes

Hashes for datasimple-1.0.7.tar.gz
Algorithm Hash digest
SHA256 1f4c266a7f9c49cd5c56a6b829286af5c8602e4b75bbcc8784568b12edb5744b
MD5 b59d6a02038ef8e7c585d7318352df30
BLAKE2b-256 818e538de40a56faed09e5dfd9fe20357ed3d04755157ea85805b082fd164db5
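If you download a file by hand, you can check it against the published SHA256 digest before installing. The sketch below uses only the standard-library hashlib module. The function names are illustrative and are not part of datasimple, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("datasimple-1.0.7.tar.gz", "1f4c266a7f9c49cd5c56a6b829286af5c8602e4b75bbcc8784568b12edb5744b") should return True for an intact copy of the source distribution listed above.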


File details

Details for the file datasimple-1.0.7-py2.py3-none-any.whl.

File metadata

  • Download URL: datasimple-1.0.7-py2.py3-none-any.whl
  • Size: 35.9 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.22.0 CPython/3.5.2

File hashes

Hashes for datasimple-1.0.7-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 476b38881e94f7f50fa67d527db91dc60c4b20d5fbcf402948d7a76e4f468cd0
MD5 809a8b6d9cd8dfe718a8698eed3ebed9
BLAKE2b-256 ec3a8b1e509fe04608394b29c61b25438363175fac99a84f2c3a5f0233d0460f

