Automatically Assigning Industry Classifications to Company Descriptions – An Entailment Based Approach

Description

naicskit is a Python package that assigns industry classification codes to company descriptions. It builds on the Hugging Face library and takes an entailment-based approach to assigning taxonomy codes, with the aim of staying robust on taxonomies not seen during training. In initial results, the model reaches 87% accuracy on taxonomies it was trained on and 81% accuracy on unseen taxonomies.
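
To make the entailment framing concrete, here is a minimal sketch of how classification can be posed as premise/hypothesis pairs: the company description is the premise, and each candidate code contributes one hypothesis built from its title. This is not naicskit's implementation. The template wording, the function names, and the token-overlap scorer are all assumptions made for illustration; the scorer is a stand-in so the example runs without downloading a model, and a real system would plug in an NLI model's entailment probability instead.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical hypothesis template; the wording naicskit uses is not documented here.
TEMPLATE = "This company operates in the {label} industry."


def build_pairs(description: str, labels: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return one (code, premise, hypothesis) triple per candidate code."""
    return [
        (code, description, TEMPLATE.format(label=title))
        for code, title in labels.items()
    ]


def token_overlap(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: fraction of hypothesis tokens that appear in the premise.

    This is NOT an entailment model; it only keeps the sketch self-contained.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().rstrip(".").split()
    return sum(t in premise_tokens for t in hypothesis_tokens) / len(hypothesis_tokens)


def assign_code(
    description: str,
    labels: Dict[str, str],
    scorer: Callable[[str, str], float] = token_overlap,
) -> str:
    """Pick the code whose hypothesis scores highest against the description."""
    pairs = build_pairs(description, labels)
    return max(pairs, key=lambda triple: scorer(triple[1], triple[2]))[0]


# Two NAICS 2-digit sectors used as candidate labels.
labels = {
    "11": "Agriculture, Forestry, Fishing and Hunting",
    "52": "Finance and Insurance",
}
best = assign_code("A regional bank offering loans, insurance and finance products", labels)
print(best)
```

Because the labels are supplied as plain text at prediction time, the same procedure can in principle be pointed at a taxonomy the model never saw in training, which is the motivation the description gives for the entailment approach.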

How To Use naicskit

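from naicskit.coder import IndustryCoder

# Text describing the company to classify
description = '...'
# First argument: a taxonomy identifier from Supported Taxonomies
coder = IndustryCoder('naics.2022','2')
results = coder.code_records(description)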

Supported Taxonomies

  • International Standard Industrial Classification Rev 5.0 (ISIC 2024): isic.2024
  • International Standard Industrial Classification Rev 4.0 (ISIC 2006): isic.2006
  • International Standard Industrial Classification Rev 3.1 (ISIC 2002): isic.2002
  • International Standard Industrial Classification Rev 3.0 (ISIC 1989): isic.1989
  • International Standard Industrial Classification Rev 2.0 (ISIC 1968): isic.1968
  • North American Industry Classification System 2022 (NAICS 2022): naics.2022
  • North American Industry Classification System 2017 (NAICS 2017): naics.2017
  • North American Industry Classification System 2012 (NAICS 2012): naics.2012
  • North American Industry Classification System 2007 (NAICS 2007): naics.2007
  • Standard Industrial Classification (SIC 1987): sic.1987
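
The identifiers above follow a family.year pattern. As a small sketch, the helpers below validate an identifier against the listed set and find the most recent edition of a family before a coder is constructed. The names SUPPORTED, parse_taxonomy_id, and latest are hypothetical and are not part of naicskit.

```python
from typing import Tuple

# Identifiers copied from the Supported Taxonomies list.
SUPPORTED = {
    "isic.2024", "isic.2006", "isic.2002", "isic.1989", "isic.1968",
    "naics.2022", "naics.2017", "naics.2012", "naics.2007",
    "sic.1987",
}


def parse_taxonomy_id(identifier: str) -> Tuple[str, int]:
    """Split a listed identifier such as 'naics.2022' into (family, year)."""
    if identifier not in SUPPORTED:
        raise ValueError(f"unsupported taxonomy identifier: {identifier!r}")
    family, year = identifier.split(".")
    return family, int(year)


def latest(family: str) -> str:
    """Return the most recent listed identifier for a taxonomy family."""
    years = [
        parse_taxonomy_id(identifier)[1]
        for identifier in SUPPORTED
        if identifier.startswith(family + ".")
    ]
    if not years:
        raise ValueError(f"unknown taxonomy family: {family!r}")
    return f"{family}.{max(years)}"


print(parse_taxonomy_id("naics.2022"))
print(latest("isic"))
```

Checking the identifier up front gives a clear error for a typo such as naics.2020 instead of relying on whatever the package does with an unknown taxonomy name.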
