A python library to read and write structured data in csv and zipped csv format, and to/from databases
Support the project
If your company has embedded pyexcel and its components into a revenue generating product, please support me on patreon or bounty source to maintain the project and develop it further.
If you are an individual, you are welcome to support me too, for however long you feel like. As my backer, you will receive early access to pyexcel related content.
Your issues will be prioritized if you become my patron as a pyexcel pro user.
With your financial support, I will be able to invest a little bit more time in coding, documentation and writing interesting posts.
Known constraints
Fonts, colors and charts are not supported.
Introduction
pyexcel-io provides one application programming interface (API) to read and write data in excel format, and to import data into and export data from databases. It provides support for csv(z) format, django databases and sqlalchemy supported databases. Its supported file formats are extended to cover “xls”, “xlsx”, “ods” by the following extensions:
| Package name | Supported file formats | Dependencies | Python versions |
|---|---|---|---|
| | | | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6, pypy |
| | xls, xlsx(read only), xlsm(read only) | | same as above |
| | xlsx | | same as above |
| | ods | pyexcel-ezodf, lxml | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6 |
| | ods | | same as above |

| Package name | Supported file formats | Dependencies | Python versions |
|---|---|---|---|
| | xlsx(write only) | | Python 2 and 3 |
| | xlsx(read only) | lxml | same as above |
| | read only for ods, fods | lxml | same as above |
| | html(read only) | lxml, html5lib | same as above |
To manage the list of installed plugins, use pip to add or remove a plugin. With virtualenv, you can have different plugins per virtual environment. When several plugins in your environment do the same thing, you need to tell pyexcel which plugin to use per function call. For example, if both pyexcel-ods and pyexcel-odsr are installed and you want get_array to use pyexcel-odsr, call get_array(..., library='pyexcel-odsr').
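A minimal sketch of the plugin choice described above, assuming pyexcel, pyexcel-ods and pyexcel-odsr are all installed in the active environment. The wrapper name `read_ods_with_odsr` and the file name in the usage note are made up for illustration; `get_array` and the `library` keyword come from the paragraph above. The import sits inside the function so the snippet can be defined even where pyexcel is absent.

```python
def read_ods_with_odsr(path):
    """Return the sheet stored at `path` as a list of rows, read via pyexcel-odsr."""
    # Imported lazily: pyexcel and pyexcel-odsr are assumed to be installed
    # with pip in the active (virtual) environment.
    import pyexcel

    # `library` names the plugin to use when several plugins handle ".ods".
    return pyexcel.get_array(file_name=path, library="pyexcel-odsr")
```

A call such as `rows = read_ods_with_odsr("example.ods")` would then go through pyexcel-odsr even though pyexcel-ods is also present.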
Footnotes
If you need to manipulate the data, you might do it yourself or use its sister library pyexcel.
If you would like to extend it, you may use it to write your own extension to handle a specific file format.
Installation
You can install pyexcel-io via pip:
$ pip install pyexcel-io
or clone it and install it:
$ git clone https://github.com/pyexcel/pyexcel-io.git
$ cd pyexcel-io
$ python setup.py install
Development guide
Development steps for code changes
cd pyexcel-io
Upgrade your setup tools and pip. They are needed for development and testing only:
pip install --upgrade setuptools pip
Then install relevant development requirements:
pip install -r rnd_requirements.txt # if such a file exists
pip install -r requirements.txt
pip install -r tests/requirements.txt
Once you have finished your changes, please provide test case(s), relevant documentation and update CHANGELOG.rst.
Note
rnd_requirements.txt is usually created when a dependent library has not been released yet. Once the dependency is released, the future version of the dependency listed in requirements.txt will be valid.
How to test your contribution
Although nose and doctest are both used in code testing, it is advisable that unit tests are put in tests. doctest is incorporated only to make sure the code examples in documentation remain valid across different development releases.
On Linux/Unix systems, please launch your tests like this:
$ make
On Windows systems, please issue this command:
> test.bat
How to update test environment and update documentation
Additional steps are required:
pip install moban
git clone https://github.com/moremoban/setupmobans.git # generic setup
git clone https://github.com/pyexcel/pyexcel-commons.git commons
Make your changes in the .moban.d directory, then issue the command moban.
What is pyexcel-commons
Much of the information shared across pyexcel projects, such as this developer guide and license info, is stored in the pyexcel-commons project.
What is .moban.d
.moban.d stores the specific meta data for the library.
Acceptance criteria
Has test cases written
Has all code lines tested
Passes all Travis CI builds
Has a fair amount of documentation if your change is complex
Run ‘make format’ so as to conform to the pyexcel organisation’s coding style
Please update CHANGELOG.rst
Please add yourself to CONTRIBUTORS.rst
Agree on NEW BSD License for your contribution
License
New BSD License
Change log
0.5.14 - 21.02.2019
updated
#65: add tests/__init__.py because python2.7 setup.py test needs it
0.5.13 - 12.02.2019
updated
#63: Version 0.5.12 prevents xlsx and ods plugin from being loaded
0.5.12 - 9.02.2019
updated
0.5.11 - 3.12.2018
updated
#59: use scan_plugins_regex, because lml 0.7 complains otherwise
0.5.10 - 27.11.2018
added
#57, long type will not be written in ods; please use string type. An integer equal to or greater than 10 to the power of 16 will not be written in ods either. In both situations, IntegerPrecisionLossError will be raised. This version enables pyexcel-ods and pyexcel-ods3 to do so.
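The snippet below is a plain-Python illustration of the kind of precision loss this entry guards against. It is not pyexcel-io code, and it assumes the stored number behaves like an IEEE 754 double, which is what Python's `float` is on common platforms.

```python
# Illustration only: how a large integer changes when stored as a double.
big = 10 ** 16 + 1

as_float = float(big)        # what a float-based cell would hold
round_trip = int(as_float)   # what a reader would get back

print(big)                # 10000000000000001
print(round_trip)         # 10000000000000000
print(big == round_trip)  # False

# The workaround suggested in the entry: keep the digits as a string.
as_text = str(big)
print(int(as_text) == big)  # True
```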
0.5.9.1 - 30.08.2018
updated
#53, upgrade lml dependency to at least 0.0.2
0.5.9 - 23.08.2018
added
pyexcel#148, support force_file_type
0.5.8 - 16.08.2018
added
#49, support additional options when detecting float values in csv format. default_float_nan, ignore_nan_text
0.5.7 - 02.05.2018
fixed
0.5.6 - 11.01.2018
fixed
#46, expose bulk_save to developer
0.5.5 - 23.12.2017
fixed
Issue #45, csv reader throws exception because google app engine does not support mmap. People who are not working with google app engine need not take this update. Enjoy your Christmas break.
0.5.4 - 10.11.2017
updated
PR #44, use unicodewriter for csvz writers.
0.5.3 - 23.10.2017
updated
pyexcel#105, remove gease from setup_requires, introduced by 0.5.2.
remove python2.6 test support
0.5.2 - 20.10.2017
added
pyexcel#103, include LICENSE file in MANIFEST.in, meaning LICENSE file will appear in the released tar ball.
0.5.1 - 02.09.2017
Fixed
pyexcel-ods#25, Unwanted dependency on pyexcel.
0.5.0 - 30.08.2017
Added
Collect all data type conversion codes as service.py.
Updated
#19, use cString by default. For python, it will be a performance boost
0.4.4 - 08.08.2017
Updated
#42, raise exception if database table name does not match the sheet name
0.4.3 - 29.07.2017
Updated
#41, walk away gracefully when mmap is not available.
0.4.2 - 05.07.2017
Updated
#37, permanently fix the residue folder pyexcel by releasing all future releases from a clean clone.
0.4.1 - 29.06.2017
Updated
#39, raise exception when bulk save in django fails. Please pass bulk_save=False if you, as the developer, choose to save the records one by one when bulk_save cannot be used. However, an exception in the one-by-one save case will be raised as well. This change is made to raise the exception in the first place, so that you as the developer will not be surprised when it is deployed in production.
0.4.0 - 19.06.2017
Updated
‘built-in’ no longer works as the value of the parameter ‘library’ to invoke pyexcel-io’s built-in csv, tsv, csvz, tsvz, django and sql. It is renamed to ‘pyexcel-io’.
built-in csv, tsv, csvz, tsvz, django and sql are lazy loaded.
pyexcel-io plugin interface has been updated. v0.3.x plugins won’t work.
#32, csv and csvz file handles are made sure to be closed. File close mechanism is enforced.
iget_data function is introduced to cope with dangling file handle issue.
Removed
Removed plugin loading code and lml is used instead
0.3.4 - 18.05.2017
Updated
#33, handle mmap object differently when given as file content. This fix gives priority to single sheet csv over multiple sheets in a single memory stream. The latter format is pyexcel’s own creation but is rarely used. In the latter case, multiple_sheet=True should be passed along to get_data.
#34, treat mmap object as a file content.
#35, encoding parameter takes no effect when given along with file content
use ZIP_DEFLATED to really do the compression
0.3.3 - 30.03.2017
Updated
#31, support pyinstaller
0.3.2 - 26.01.2017
Updated
#29, change skip_empty_rows to False by default
0.3.1 - 21.01.2017
Added
updated versions of extra packages
Updated
#23, provide helpful message when old pyexcel plugin exists
restored previously available diagnosis message for missing libraries
0.3.0 - 22.12.2016
Added
lazy loading of plugins. For example, pyexcel-xls is not entirely loaded until xls format is used at its first attempted reading or writing. Once loaded, it will not be loaded again in the second io action.
pyexcel-xls#11, make case-insensitive for file type
0.2.6 - 21.12.2016
Updated
#24, pass on batch_size
0.2.5 - 20.12.2016
Updated
#26, performance issue with getting the number of columns.
0.2.4 - 24.11.2016
Updated
#23, Failed to convert long integer string in python 2 to its actual value
0.2.3 - 16.09.2016
Added
0.2.2 - 31.08.2016
Added
support pagination. Two pairs of parameters, start_row with row_limit and start_column with column_limit, help you deal with large files.
skip_empty_rows=True was introduced. To include empty rows, set it to False.
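To make the meaning of the two pagination pairs concrete, here is a plain-Python sketch of the window they select. It is not how pyexcel-io implements pagination: the helper name `paginate` is hypothetical, and the zero-based start indices and the use of a negative limit for "no limit" are assumptions made for this illustration.

```python
from itertools import islice


def paginate(rows, start_row=0, row_limit=-1, start_column=0, column_limit=-1):
    """Yield the window of `rows` selected by the two (start, limit) pairs."""
    # A negative limit is treated here as "read to the end" (an assumption).
    row_stop = None if row_limit < 0 else start_row + row_limit
    col_stop = None if column_limit < 0 else start_column + column_limit
    for row in islice(rows, start_row, row_stop):
        yield row[start_column:col_stop]


# A 6 x 5 table whose cells encode their own position: row * 10 + column.
table = [[r * 10 + c for c in range(5)] for r in range(6)]
page = list(paginate(table, start_row=1, row_limit=2,
                     start_column=2, column_limit=2))
print(page)  # [[12, 13], [22, 23]]
```

Because the rows are consumed lazily, only the requested window has to be held in memory, which is the point of pagination for large files.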
Updated
#20, pyexcel-io attempts to parse cell contents of ‘infinity’ as a float/int, crashes
0.2.1 - 11.07.2016
Added
csv format: handle utf-16 encoded csv files. Potentially being able to decode other formats if correct “encoding” is provided
csv format: write utf-16 encoded files. Potentially other encoding is also supported
support stdin as input stream and stdout as output stream
Updated
Attention, users of pyexcel-io! io stream validation is no longer performed in python 3. The guideline is: io.StringIO for csv, tsv only, otherwise BytesIO for xlsx, xls, ods. You can use RWManager.get_io to produce a correct stream type for you.
#15, support django/sql foreign key
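The stream guideline in this release (text streams for csv and tsv, byte streams for everything else) can be sketched with the standard library alone. The helper `make_stream` below is hypothetical and only mirrors the guideline; it is not `RWManager.get_io` and makes no claim about that function's signature.

```python
import io

# File types that the guideline treats as text; everything else is binary.
TEXT_FILE_TYPES = {"csv", "tsv"}


def make_stream(file_type, content=None):
    """Return io.StringIO for csv/tsv and io.BytesIO for any other file type."""
    if file_type.lower() in TEXT_FILE_TYPES:
        return io.StringIO(content if content is not None else "")
    return io.BytesIO(content if content is not None else b"")


csv_stream = make_stream("csv", "a,b\n1,2\n")
xlsx_stream = make_stream("xlsx")
print(type(csv_stream).__name__)   # StringIO
print(type(xlsx_stream).__name__)  # BytesIO
```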
0.2.0 - 01.06.2016
Added
autoload of pyexcel-io plugins
auto detect datetime, float and int. Detection can be switched off by auto_detect_datetime, auto_detect_float, auto_detect_int
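As a rough picture of what type detection means for text cells, the sketch below converts a csv cell to int or float when it can and leaves it as a string otherwise. The function `detect_cell` is hypothetical and is not the library's implementation; only the flag names `auto_detect_int` and `auto_detect_float` are taken from the entry above, and datetime detection is left out to keep the example short.

```python
def detect_cell(text, auto_detect_int=True, auto_detect_float=True):
    """Convert a text cell to int or float when possible, else return it as is."""
    if auto_detect_int:
        try:
            return int(text)
        except ValueError:
            pass  # not an integer; fall through to the float check
    if auto_detect_float:
        try:
            return float(text)
        except ValueError:
            pass  # not a number at all
    return text


row = ["1", "2.5", "abc"]
print([detect_cell(cell) for cell in row])          # [1, 2.5, 'abc']
print(detect_cell("1", auto_detect_int=False))      # 1.0
print(detect_cell("2.5", auto_detect_float=False))  # 2.5 stays the string '2.5'
```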
0.1.0 - 17.01.2016
Added
yield key word to return generator as content