Utility library and scripts for simpler data-processing tasks
Introduction
Source Note: The authoritative version of this file is the Markdown version. The RST version is automatically created from the Markdown by pandoc.
This is an Apache-licensed library and set of command-line tools for simple data-processing tasks and pipelines. It is meant to be used alongside tools like dmk, with serious work left to serious tools (like jupyterlab and scipy).
If it feels like a mishmash of functionality, that’s because it is. This is mainly a collection of odds and ends that keeps getting used in projects in a very specific analytics and data science team.
Installing
The normal way:
$ pip install datasimple
However, we use Python 3 and prefer user installs, so on a system like Ubuntu you probably want:
$ python3 -m pip install --user --upgrade datasimple
HOWEVER, the CORRECT usage is a Pipfile controlled by pipenv.
See below (in Hacking) for installing in development mode if you need to make source code changes.
What you get
The datasimple library and some handy scripts (see ./bin). Of note is a class designed to help you write scripts to convert anything to Excel spreadsheets. (Once again, this is functionality we need for a particular business environment. It is expressly NOT an endorsement of Excel for data science.)
Requirements
This is Python 3. Don’t submit requests for Python 2 compatibility.
See setup.py for dependencies (which will get installed automatically when you install this package with pip).
Hacking
You should be developing in a virtualenv. Since you are probably forced to work in a Vagrant Ubuntu VM on a Windows machine, and you’ll want to use the shared /vagrant folder, you might want to consider using pipenv and pyenv with the virtualenv plugin.
Use make test for testing (which will also handle linting). In fact, see the Makefile for what we automate with this project.
Contributing
The following guidelines are used when accepting external contributions:
- ./lint should not find any issues.
- There should be appropriate tests added to the appropriate module in ./tests.
- There should be an existing and compelling use case.
The ./lint script in the root of this repo uses pylama, which you must install. Currently it also expects a pylama linter plugin called “quotes”. See Craig (the maintainer) for this plugin. NOTE: if even ONE PERSON contacts me I’ll make that plugin public :)
If you don’t currently have pylama installed, you can get the latest installed for your user with python3 -m pip install --user --upgrade pylama.
You should also test using the ./test script in the root of this repo. It runs the tests with nosetests. Our setup also requires the package nose-exclude. However, the test script delegates to setup.py, so you shouldn’t need to worry about this.
Note that both pylama and nosetests have configuration specified in setup.cfg.
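As a rough illustration of how both tools can share one setup.cfg, here is a minimal sketch that parses a made-up config with Python's standard configparser. The section contents below are assumptions for illustration only and are not copied from this repo's setup.cfg; the helper name read_tool_sections is hypothetical as well.

```python
import configparser

# HYPOTHETICAL setup.cfg contents; the real file in this repo may differ.
EXAMPLE_SETUP_CFG = """
[pylama]
linters = pyflakes,pycodestyle
skip = build/*,dist/*

[nosetests]
verbosity = 2
exclude-dir = tests/fixtures
"""


def read_tool_sections(text, tools=("pylama", "nosetests")):
    """Return {section: {option: value}} for each requested section that exists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        name: dict(parser[name])
        for name in tools
        if parser.has_section(name)
    }


settings = read_tool_sections(EXAMPLE_SETUP_CFG)
for section, options in settings.items():
    print(section, options)
```

Each tool only reads its own section, which is why one file can configure both. The exclude-dir option shown here is the one provided by nose-exclude.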
File details
Details for the file datasimple-1.0.7.tar.gz.
File metadata
- Download URL: datasimple-1.0.7.tar.gz
- Upload date:
- Size: 29.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.22.0 CPython/3.5.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1f4c266a7f9c49cd5c56a6b829286af5c8602e4b75bbcc8784568b12edb5744b
MD5 | b59d6a02038ef8e7c585d7318352df30
BLAKE2b-256 | 818e538de40a56faed09e5dfd9fe20357ed3d04755157ea85805b082fd164db5
File details
Details for the file datasimple-1.0.7-py2.py3-none-any.whl.
File metadata
- Download URL: datasimple-1.0.7-py2.py3-none-any.whl
- Upload date:
- Size: 35.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.22.0 CPython/3.5.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 476b38881e94f7f50fa67d527db91dc60c4b20d5fbcf402948d7a76e4f468cd0
MD5 | 809a8b6d9cd8dfe718a8698eed3ebed9
BLAKE2b-256 | ec3a8b1e509fe04608394b29c61b25438363175fac99a84f2c3a5f0233d0460f
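If you download one of these files by hand, the published digests can be checked locally. The following is a minimal sketch using only the standard library. The helper names sha256_of and matches_published_digest are hypothetical, and the expected value is the SHA256 listed for datasimple-1.0.7.tar.gz in this document.

```python
import hashlib

# SHA256 digest listed in this document for datasimple-1.0.7.tar.gz.
EXPECTED_SHA256 = (
    "1f4c266a7f9c49cd5c56a6b829286af5c8602e4b75bbcc8784568b12edb5744b"
)


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.strip().lower()
```

For example, matches_published_digest("datasimple-1.0.7.tar.gz") should return True for an intact copy of the source distribution, assuming the file sits in the current directory. Pass the wheel's digest as the expected argument to check the wheel instead.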