
Extract data from Flora of North America


Flora Data Extraction Project

This script extracts data from PDF files of genera from the book Flora of North America. It creates CSV files whose names match the PDF files given to the script as arguments. The CSV files have one of two formats:

"Species name", "Location where the species appears"

and

"Species name", "Classifiers (if any)"

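As a rough illustration of working with the two-column format above, the following sketch groups the second column by species name. It assumes one row per species/value pair and no header row, neither of which is confirmed here; the function name `group_by_species` and the sample rows are made up for the example.

```python
import csv
import io
from collections import defaultdict


def group_by_species(csv_text):
    """Map each species name to the list of second-column values found for it."""
    grouped = defaultdict(list)
    # skipinitialspace lets the reader accept the space after each comma,
    # as in: "Species name", "Location where the species appears"
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        species, value = row[0].strip(), row[1].strip()
        grouped[species].append(value)
    return dict(grouped)


# Made-up sample rows, for illustration only (not real output of the script).
sample = (
    '"Species a", "Region 1"\n'
    '"Species a", "Region 2"\n'
    '"Species b", "Region 1"\n'
)
locations = group_by_species(sample)
print(locations)
```

To use this on a real file, read its contents with `open(path, newline="").read()` and pass the text to `group_by_species`.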
The easiest way to run the script is to change to a directory where the only PDF files are genera files from Flora of North America and enter:

python -m florana.extract -A -o data.csv

The script then runs on every PDF file in the directory and creates a file called 'data.csv' containing all the locations it can find, as well as a file called 'data-classifiers.csv'. If the script cannot find locations for some species, detailed information is written to 'error.log'.

Note: Python 2

If you also have Python 2 installed on your system, you will probably need to run python3 instead of python.
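The command above can also be launched from a Python 3 script, which sidesteps the python versus python3 question because `sys.executable` always points at the interpreter currently running. This is only a sketch: the helper names are hypothetical, and the output file names in `expected_outputs` are guessed by generalizing from the single 'data.csv' example, not taken from the package.

```python
import subprocess
import sys
from pathlib import Path


def build_command(output_name="data.csv"):
    """Build the documented command line, using the current interpreter."""
    return [sys.executable, "-m", "florana.extract", "-A", "-o", output_name]


def expected_outputs(output_name="data.csv"):
    """Guess the output file names (assumption: '<stem>-classifiers<suffix>')."""
    out = Path(output_name)
    return [out.name, f"{out.stem}-classifiers{out.suffix}", "error.log"]


def run_extraction(output_name="data.csv"):
    """Run the extraction in the current directory and report which files exist."""
    result = subprocess.run(build_command(output_name))
    present = [name for name in expected_outputs(output_name) if Path(name).exists()]
    return result.returncode, present
```

Calling `run_extraction()` from the directory that holds the PDF files returns the exit code of the command and the subset of expected files that were actually found afterwards.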

Installing

python -m pip install florana

Note: Windows Users

If you're running Windows, you'll likely need to install poppler. Extract the latest binary from the link provided and add its bin folder to your PATH environment variable. For example, if C:\path\to\poppler is the directory where you extracted poppler, add C:\path\to\poppler\bin to your PATH environment variable.
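One quick way to confirm the PATH change took effect is to look for a poppler executable from Python. The sketch below checks for `pdftotext`, which is one of the command-line tools shipped with poppler; whether florana calls that particular tool is an assumption, and the function name `poppler_on_path` is made up for the example.

```python
import shutil


def poppler_on_path(tool="pdftotext"):
    """Return the full path of the given poppler tool if it is on PATH, else None."""
    return shutil.which(tool)


location = poppler_on_path()
if location is None:
    print("pdftotext not found; check that poppler's bin folder is on PATH")
else:
    print(f"found pdftotext at {location}")
```

Open a new terminal before running the check, since an already-open terminal usually keeps the PATH value it started with.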

Dependencies

