A library for taxonomy management.

Taxonomy Library README

This module provides functions for taxonomy creation, text classification, and header translation using AI models and vector similarity search.

Usage

Import the module with:

from bee_taxonomy import taxonomy

Installation

Install the library using pip:

pip install bee-taxonomy

Main Functions

1. taxonomy.propose_taxonomy(field: str, description: str, discrete_fields: list[str] = None)

Purpose: Generate taxonomy suggestions using an OpenAI-compatible API

Parameters:

  • field: Name of the field to categorize
  • description: Description of the field's purpose
  • discrete_fields: Optional list of specific values to consider

Example:

taxonomy.propose_taxonomy(
    field="Color",
    description="Vehicle paint color classification",
    discrete_fields=["Red", "Blue", "Green", "Custom"]
)
# Returns: ["Red", "Blue", "Green", "Other"]

2. taxonomy.apply_taxonomy_similarity(discrete_fields: list[str], taxonomy: list[str], category_type: str = None)

Purpose: Classify values by semantic similarity, using a vector database

Parameters:

  • discrete_fields: Values to classify
  • taxonomy: List of allowed classification terms
  • category_type: Optional; enables special processing for categories such as 'streets'

Example:

taxonomy.apply_taxonomy_similarity(
    discrete_fields=["Rd", "Street", "Ave"],
    taxonomy=["Road", "Street", "Avenue"],
    category_type="streets"
)
# Returns: {'Rd': {'match': 'Road', 'score': 0.92}, ...}
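
To make the idea of similarity-based matching concrete, here is a minimal, self-contained sketch. It is not the library's implementation: `toy_embed`, `cosine`, and `nearest_term` are hypothetical names, and a letter-count vector stands in for the real embedding model and vector database. It only illustrates how each value can be paired with its closest taxonomy term and a score, in the same shape as the return value shown above.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    # ASSUMPTION: stand-in for a real embedding model.
    # Represents a string as counts of its letters (case-insensitive).
    return Counter(ch for ch in text.lower() if ch.isalpha())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[key] * b[key] for key in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_term(values: list[str], terms: list[str]) -> dict:
    # Pair every value with the taxonomy term whose vector is closest.
    term_vectors = {term: toy_embed(term) for term in terms}
    result = {}
    for value in values:
        vec = toy_embed(value)
        best = max(terms, key=lambda term: cosine(vec, term_vectors[term]))
        score = round(cosine(vec, term_vectors[best]), 2)
        result[value] = {"match": best, "score": score}
    return result


matches = nearest_term(["Rd", "Street", "Ave"], ["Road", "Street", "Avenue"])
for value, info in matches.items():
    print(value, "->", info["match"], info["score"])
```

A real embedding model captures meaning rather than spelling, so its scores will differ from this toy version; the matching step (pick the term with the highest similarity) is the part being illustrated.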

3. taxonomy.apply_taxonomy_reasoning(discrete_fields: list[str], taxonomy: list[str], classification_description: str, hash_file: str = None)

Purpose: Use AI reasoning to classify values into a taxonomy

Parameters:

  • discrete_fields: List of values to classify
  • taxonomy: List of allowed categories
  • classification_description: Context for classification
  • hash_file: Optional file hash for progress tracking

Example:

taxonomy.apply_taxonomy_reasoning(
    discrete_fields=["Quick Brown Fox", "Lazy Dog"],
    taxonomy=["Animal", "Object", "Action"],
    classification_description="Classify animal-related phrases"
)
# Returns: {'Quick Brown Fox': 'Animal', 'Lazy Dog': 'Animal'}

4. taxonomy.translate_headers_reasoning(src_lang, dest_lang, headers)

Purpose: Translate headers between languages using AI reasoning

Parameters:

  • src_lang: Source language code
  • dest_lang: Target language code
  • headers: List of headers to translate

Example:

taxonomy.translate_headers_reasoning(
    src_lang="en",
    dest_lang="es",
    headers=["Street Name", "Zip Code"]
)
# Returns: {'Street Name': 'Nombre de la Calle', 'Zip Code': 'Código Postal'}

5. taxonomy.analyze_text_field(field_name: str, field_value: str, task: Literal["label", "summarize"] = "label")

Purpose: Analyze text fields for classification or summarization

Parameters:

  • field_name: Name of the text field
  • field_value: Text to analyze
  • task: "label" for classification or "summarize" for text summary

Example:

taxonomy.analyze_text_field(
    field_name="Product Description",
    field_value="This ergonomic chair provides lumbar support and adjustable height",
    task="label"
)
# Returns: "Office Furniture"

Environment Variables

Rename .env.example to .env and fill in every required field with your own values:

  • MODEL_NAME: Hugging Face model identifier
  • SERVER_URL: Base URL for OpenAI-compatible API
  • API_KEY: Authentication token for the API
  • EMBEDDER_MODEL: Embedding model for semantic similarity
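
Before calling the library, it can help to confirm that all four settings are present. The sketch below uses only the standard library and assumes the values are visible in the process environment; how the library itself loads .env is not documented here, and `missing_settings` is a hypothetical helper, not part of bee-taxonomy.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("MODEL_NAME", "SERVER_URL", "API_KEY", "EMBEDDER_MODEL")


def missing_settings(env=os.environ) -> list[str]:
    # Return the names of required variables that are unset or blank.
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.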

Features

  • Validation workflow with Pydantic models
  • Progress checkpointing for large datasets
  • Google search integration for ambiguous classifications
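
As an illustration of what progress checkpointing means in general, the following sketch saves each result to a JSON file as soon as it is produced, so an interrupted run can resume without repeating work. It is a generic pattern under stated assumptions, not the library's mechanism: `classify_with_checkpoint`, the `classify` callback, and the JSON file layout are all hypothetical, and the relationship to the `hash_file` parameter is not specified in this README.

```python
import json
import os


def classify_with_checkpoint(values, classify, checkpoint_path):
    # Load results saved by an earlier run, if a checkpoint file exists.
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as fh:
            done = json.load(fh)

    for value in values:
        if value in done:
            continue  # already classified in a previous run
        done[value] = classify(value)
        # Persist after every item so a crash loses at most one result.
        with open(checkpoint_path, "w", encoding="utf-8") as fh:
            json.dump(done, fh)
    return done
```

Writing the file after every item is the simplest choice; for very large datasets, saving every N items trades a little safety for fewer disk writes.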
