Extract data from Flora of North America
Project description
Flora Data Extraction Project
This script extracts data from PDF files of genera from the book Flora of North America. It creates CSV files whose names match the PDF files given to the script as arguments. The CSV files have the formats
"Species name", "Location where the species appears"
and
"Species name", "Classifiers (if any)"
The easiest way to run the script is to change to a folder where the only PDF files are genera files from Flora of North America and enter:
python -m florana.extract -A -o data.csv
The script then runs on every PDF file in the directory and creates a file called 'data.csv' containing all the locations it can find, as well as a file called 'data-classifiers.csv'. If the script cannot find locations for some species, detailed information is written to 'error.log'.
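As a rough illustration of how the location file might be consumed afterwards, the sketch below groups locations by species using Python's csv module. It assumes each row holds exactly two fields in the order "Species name", "Location where the species appears" and that there is no header row; neither detail is confirmed by this description. The helper name locations_by_species is hypothetical, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def locations_by_species(lines):
    """Group locations by species from rows of ("species", "location").

    Assumption: two fields per row and no header row. Rows with a
    different number of fields are skipped rather than guessed at.
    """
    grouped = defaultdict(list)
    # skipinitialspace lets the reader accept a space after each comma.
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 2:
            continue
        species, location = (field.strip() for field in row)
        grouped[species].append(location)
    return dict(grouped)


# Made-up sample rows, for illustration only (not real output).
sample = io.StringIO(
    '"Species a", "Ont."\n'
    '"Species a", "Que."\n'
    '"Species b", "Calif."\n'
)
result = locations_by_species(sample)
```

To read a real file instead of the sample, open it with open("data.csv", newline="") and pass the file object to locations_by_species.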
Note: Python 2
If you also have Python 2 installed on your system, you will probably need to run python3 instead of python.
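The command can also be launched from Python, which sidesteps the python versus python3 question because sys.executable always points at the interpreter that is currently running. This is a minimal sketch, not part of florana: the helper names are hypothetical, the rule for naming the classifier file is generalized from the single 'data.csv' example, and the location of 'error.log' in the working directory is an assumption.

```python
import subprocess
import sys
from pathlib import Path


def build_command(output="data.csv"):
    """Return the argument list for: python -m florana.extract -A -o <output>."""
    return [sys.executable, "-m", "florana.extract", "-A", "-o", output]


def classifiers_path(output="data.csv"):
    """Assumed companion file name: 'data.csv' -> 'data-classifiers.csv'."""
    path = Path(output)
    return path.with_name(f"{path.stem}-classifiers{path.suffix}")


def run_extraction(directory=".", output="data.csv"):
    """Run the script inside `directory`.

    Returns True if a non-empty error.log is present afterwards.
    Assumption: error.log is written to the working directory.
    """
    subprocess.run(build_command(output), cwd=directory, check=True)
    log = Path(directory) / "error.log"
    return log.exists() and log.stat().st_size > 0
```

Calling run_extraction("path/to/pdfs") requires florana to be installed for the running interpreter; build_command and classifiers_path have no such requirement.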
Installing
python -m pip install florana
Note: Windows Users
If you're running Windows, you'll likely need to install poppler. Extract the latest binary from the link provided and add its bin folder to your PATH environment variable. For example, if C:\path\to\poppler is the directory where you extracted poppler, add C:\path\to\poppler\bin to your PATH environment variable.
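One way to confirm that the PATH change took effect is to look for a poppler executable from Python. This sketch assumes that pdftotext, one of the command-line tools shipped with poppler, is the relevant one; the text above does not say which tool the script relies on, and the helper names are hypothetical.

```python
import shutil


def find_tool(name="pdftotext"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def poppler_status(name="pdftotext"):
    """Build a short message saying whether `name` was found on PATH."""
    location = find_tool(name)
    if location is None:
        return f"{name} not found on PATH; check that poppler's bin folder was added."
    return f"{name} found at {location}"
```

Run print(poppler_status()) in a new terminal session, since changes to PATH usually do not reach terminals that were already open.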
Dependencies
- Python 3 or later
- textract
Download files
Source Distribution
Built Distribution
File details
Details for the file florana-1.1.7.tar.gz.
File metadata
- Download URL: florana-1.1.7.tar.gz
- Upload date:
- Size: 10.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | ce3925ed79d6aa604d0912f36b9f9862b330fa16e10a64b46fb68d053b681ce9
MD5 | 690f7d9b6ecbc8282ca0fb8480e2edad
BLAKE2b-256 | d8d4f7972b5b53dd76c818015bd4319381e75d0322caedcee61565c39714fa92
File details
Details for the file florana-1.1.7-py3-none-any.whl.
File metadata
- Download URL: florana-1.1.7-py3-none-any.whl
- Upload date:
- Size: 11.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.38.0 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | ef4764ecd1b23ceedf510e9f7b7903c55cbf2681dd4e9034408fc77f14849e39
MD5 | 9318f728734010f3261f021e53fb3c03
BLAKE2b-256 | cb30d6a1f2820af6c7fd8579e1fdcf4cfe2efaf43d8974a819695c4f8c84b0b0
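The SHA256 digests listed above can be compared against a downloaded file. The following is a minimal sketch using hashlib from the standard library; the function names are illustrative and not part of florana.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, assuming the source archive was saved to the current directory, matches("florana-1.1.7.tar.gz", "ce3925ed79d6aa604d0912f36b9f9862b330fa16e10a64b46fb68d053b681ce9") should return True for an intact download.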