Automatically Assigning Industry Classifications to Company Descriptions – An Entailment Based Approach
Description
naicskit is a Python package that assigns industry classification codes to company descriptions. The model is built on the Hugging Face library and uses an entailment-based approach to assign taxonomy codes, with the aim of staying robust on taxonomies it has not seen during training. In initial results, the model reaches 87% accuracy on taxonomies it was trained on and 81% accuracy on unseen taxonomies.
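To illustrate what an entailment-based formulation can look like, the sketch below pairs a company description (the premise) with one hypothesis per candidate industry label and picks the label whose hypothesis scores highest. This is not naicskit's internal code: the hypothesis template, the `toy_entailment_score` function, and the example labels are assumptions made for illustration, and the word-overlap scorer is only a stand-in for a real natural language inference model.

```python
import re

# Hypothetical hypothesis template; the wording naicskit uses is not documented here.
HYPOTHESIS_TEMPLATE = "This company operates in {label}."


def tokenize(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def toy_entailment_score(premise, hypothesis):
    """Stand-in for an NLI model: fraction of hypothesis words found in the premise."""
    hypothesis_words = tokenize(hypothesis)
    if not hypothesis_words:
        return 0.0
    return len(hypothesis_words & tokenize(premise)) / len(hypothesis_words)


def classify(description, labels, score=toy_entailment_score):
    """Score one premise/hypothesis pair per label and return the best label."""
    scored = {
        label: score(description, HYPOTHESIS_TEMPLATE.format(label=label))
        for label in labels
    }
    return max(scored, key=scored.get), scored


# Example labels chosen for illustration only.
labels = ["Construction", "Retail Trade", "Finance and Insurance"]
description = "We sell insurance policies and provide consumer finance products."
best_label, scores = classify(description, labels)
print(best_label)  # Finance and Insurance
```

Because the candidate labels only enter through the hypothesis text, a formulation like this can in principle be pointed at a different taxonomy by swapping the label list, which is the usual motivation for entailment-style classification.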
How To Use naicskit
from naicskit.coder import IndustryCoder
description = '...'
coder = IndustryCoder('naics.2022','2')
results = coder.code_records(description)
Supported Taxonomies
- isic.2024 - International Standard of Industrial Classification Rev 5.0 (ISIC 2024)
- isic.2006 - International Standard of Industrial Classification Rev 4.0 (ISIC 2006)
- isic.2002 - International Standard of Industrial Classification Rev 3.1 (ISIC 2002)
- isic.1989 - International Standard of Industrial Classification Rev 3.0 (ISIC 1989)
- isic.1968 - International Standard of Industrial Classification Rev 2.0 (ISIC 1968)
- naics.2022 - North American Industry Classification System 2022 (NAICS 2022)
- naics.2017 - North American Industry Classification System 2017 (NAICS 2017)
- naics.2012 - North American Industry Classification System 2012 (NAICS 2012)
- naics.2007 - North American Industry Classification System 2007 (NAICS 2007)
- sic.1987 - Standard Industrial Classification (SIC 1987)
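A small helper can catch a mistyped taxonomy identifier before it is passed to `IndustryCoder`. The sketch below hard-codes the identifiers from the list above; the `SUPPORTED_TAXONOMIES` mapping and the `parse_taxonomy_id` function are hypothetical helpers written for this example and are not part of naicskit.

```python
# Identifiers copied from the Supported Taxonomies list; this mapping is
# illustrative and is not exported by naicskit.
SUPPORTED_TAXONOMIES = {
    "isic.2024": "ISIC Rev 5.0",
    "isic.2006": "ISIC Rev 4.0",
    "isic.2002": "ISIC Rev 3.1",
    "isic.1989": "ISIC Rev 3.0",
    "isic.1968": "ISIC Rev 2.0",
    "naics.2022": "NAICS 2022",
    "naics.2017": "NAICS 2017",
    "naics.2012": "NAICS 2012",
    "naics.2007": "NAICS 2007",
    "sic.1987": "SIC 1987",
}


def parse_taxonomy_id(taxonomy_id):
    """Validate an identifier and split it into (family, year)."""
    if taxonomy_id not in SUPPORTED_TAXONOMIES:
        raise ValueError(f"Unsupported taxonomy: {taxonomy_id!r}")
    family, year = taxonomy_id.split(".")
    return family, int(year)


print(parse_taxonomy_id("naics.2022"))  # ('naics', 2022)
```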
Download files
File details
Details for the file naicskit-0.1.0.tar.gz.
File metadata
- Download URL: naicskit-0.1.0.tar.gz
- Upload date:
- Size: 203.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.11.4 Darwin/22.4.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d46e2cdbf5c0edb08a45361a5b337cf9efcfc6e8102e7fc44979cb484c555bf |
| MD5 | a91ccaa7ebc92bf16d014edcb820a96d |
| BLAKE2b-256 | d038c39dc8cc03848fb9fcd04f8183ce3c323d23680108d2d9cbc6dab1596609 |
File details
Details for the file naicskit-0.1.0-py3-none-any.whl.
File metadata
- Download URL: naicskit-0.1.0-py3-none-any.whl
- Upload date:
- Size: 213.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.11.4 Darwin/22.4.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a11477dcf100f86758d33c4465923a2b34632eea656369f99bea7c8362d5c47c |
| MD5 | 7af9fe525d2cd16b483adf568a6dde88 |
| BLAKE2b-256 | 1b6194d04662266280523ca49b097cf65f7da5500c1fb28a201b1f9fa35f6752 |
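The published digests can be used to check that a downloaded file is intact. The sketch below uses only the standard library `hashlib` module; the `sha256_of_file` and `verify_download` helpers are hypothetical names introduced for this example, and the local path passed to them is assumed to point at a file you have already downloaded.

```python
import hashlib

# SHA256 digests copied from the file hash tables above.
EXPECTED_SHA256 = {
    "naicskit-0.1.0.tar.gz": (
        "5d46e2cdbf5c0edb08a45361a5b337cf9efcfc6e8102e7fc44979cb484c555bf"
    ),
    "naicskit-0.1.0-py3-none-any.whl": (
        "a11477dcf100f86758d33c4465923a2b34632eea656369f99bea7c8362d5c47c"
    ),
}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, filename):
    """Compare a local file's digest with the published digest for `filename`."""
    return sha256_of_file(path) == EXPECTED_SHA256[filename]
```

Calling `verify_download("naicskit-0.1.0.tar.gz", "naicskit-0.1.0.tar.gz")` from the download directory would return `True` only if the local file matches the published SHA256 digest.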