Python library and webapp for searching standard industry and product classifiers
Perdu
Python library and webapp for matching against standard industry and product classifiers. Comes with NAICS, GS1, and USEEIO built-in.
Installation
Install using pip or conda:
conda install -c conda-forge -c cmutel perdu
-or-
pip install perdu
Depends on:
- appdirs
- docopt
- flask
- peewee
- rdflib
- rdflib-jsonld
- whoosh
Usage
As a webapp:
conda_webapp
As a library:
import perdu
perdu.search_useeio("plastic toy")
Search basics
Perdu uses Whoosh as its search engine. The first time you import it, Perdu imports the three built-in catalogues, which takes around one minute.
Built-in catalogues
Uploading data
Currently, the only way to upload data to the web interface is as a CSV file, with the item name or title in the first column and an optional item description in the second column. See perdu.test.fixtures for examples.
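As a rough illustration of the two-column layout described above, the following sketch writes such a file with Python's standard csv module. The item names, the write_upload_csv helper, and the choice to omit a header row are assumptions made for this example; check the fixtures for the exact layout Perdu expects.

```python
import csv
import io

# Hypothetical items to classify: (name, optional description).
items = [
    ("plastic toy", "injection-moulded toy for children"),
    ("steel pipe", ""),  # the description column may be left empty
]


def write_upload_csv(rows, handle):
    """Write one item per line: name first, description second."""
    writer = csv.writer(handle)
    for name, description in rows:
        writer.writerow([name, description])


# Write to an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
write_upload_csv(items, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Using csv.writer rather than manual string joining keeps names or descriptions that contain commas or quotes correctly escaped.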
Adding other catalogues
See the files in perdu.extraction for examples of how to extract data from PDFs (NAICS), XML (GS1), and JSON (USEEIO). Each search catalogue will have its own schema, but Perdu expects these schemas to have at least the columns name, description, and code (see examples in perdu.searching). New catalogues will need to have suitable functions provided in perdu.webapp.search_mapping.
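When extracting a new catalogue, it can help to check every record for the three required columns before indexing it. The sketch below is one possible way to do that; to_catalogue_row, REQUIRED_COLUMNS, and the sample record are hypothetical names and data for this example and are not part of Perdu.

```python
REQUIRED_COLUMNS = ("name", "description", "code")


def to_catalogue_row(record, extra_columns=()):
    """Return a dict holding the required columns plus any requested extras.

    Raises ValueError if a required column is absent from the record.
    """
    missing = [column for column in REQUIRED_COLUMNS if column not in record]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    # Store required values as stripped strings so they index uniformly.
    row = {column: str(record[column]).strip() for column in REQUIRED_COLUMNS}
    # Catalogue-specific columns are copied through unchanged.
    for column in extra_columns:
        row[column] = record.get(column)
    return row


# Hypothetical record, as it might come out of an extraction script.
raw = {"name": " Example product ", "description": "Placeholder text", "code": 1234}
print(to_catalogue_row(raw))
```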
Advanced searching
In addition to the default search method used in the web interface, Perdu also offers search corrections (search_corrector_gs1, search_corrector_naics, and search_corrector_useeio) and disjunction maximization (search_gs1_disjoint, search_useeio_disjoint, and search_naics_disjoint).
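For readers unfamiliar with disjunction maximization, the sketch below shows the general scoring idea in plain Python: a result is ranked by its best-matching field, with the other fields contributing only a small tie-breaking share. This illustrates the technique in general and is not Perdu's implementation; dismax_score, the tiebreak value, and the per-field scores are all assumptions for this example.

```python
def dismax_score(field_scores, tiebreak=0.0):
    """Combine per-field scores: best field plus tiebreak times the rest."""
    if not field_scores:
        return 0.0
    best = max(field_scores.values())
    rest = sum(field_scores.values()) - best
    return best + tiebreak * rest


# Hypothetical per-field relevance scores for two catalogue entries.
entry_a = {"name": 4.0, "description": 0.5}  # strong match on one field
entry_b = {"name": 2.5, "description": 2.5}  # moderate match on both fields

# Plain summation favours the entry with moderate matches everywhere.
sum_a, sum_b = sum(entry_a.values()), sum(entry_b.values())

# Disjunction maximization favours the entry with one strong match.
dismax_a = dismax_score(entry_a, tiebreak=0.1)
dismax_b = dismax_score(entry_b, tiebreak=0.1)

print("sum:   ", "a" if sum_a > sum_b else "b", "ranks first")
print("dismax:", "a" if dismax_a > dismax_b else "b", "ranks first")
```

With these numbers the two strategies disagree on the ranking, which is why the disjoint variants can return a different ordering from the default search for the same query.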