A library for taxonomy management.

Taxonomy Library README

This module provides functions for taxonomy creation, text classification, and header translation using AI models and vector similarity search.

Usage

Import the module with:

from bee_taxonomy import taxonomy

Installation

Install the library using pip:

pip install bee-taxonomy

Main Functions

1. taxonomy.propose_taxonomy(field: str, description: str, discrete_fields: list[str] = None)

Purpose: Generate taxonomy suggestions using an OpenAI-compatible language model

Parameters:

  • field: Name of the field to categorize
  • description: Description of the field's purpose
  • discrete_fields: Optional list of specific values to consider

Example:

taxonomy.propose_taxonomy(
    field="Color",
    description="Vehicle paint color classification",
    discrete_fields=["Red", "Blue", "Green", "Custom"]
)
# Returns: ["Red", "Blue", "Green", "Other"]

2. taxonomy.apply_taxonomy_similarity(discrete_fields: list[str], taxonomy: list[str], category_type: str = None)

Purpose: Classify values by semantic similarity search against a vector database

Parameters:

  • discrete_fields: Values to classify
  • taxonomy: List of allowed classification terms
  • category_type: Optional hint that enables special processing for certain categories, such as 'streets'

Example:

taxonomy.apply_taxonomy_similarity(
    discrete_fields=["Rd", "Street", "Ave"],
    taxonomy=["Road", "Street", "Avenue"],
    category_type="streets"
)
# Returns: {'Rd': {'match': 'Road', 'score': 0.92}, ...}
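
To make the idea of similarity-based matching concrete, the following is a minimal sketch of nearest-term matching by cosine similarity. It is not the library's implementation: the helper names (cosine_similarity, match_by_similarity) and the hand-written three-dimensional vectors are assumptions for illustration only, whereas a real setup would obtain vectors from an embedding model and query a vector database. The output shape mirrors the 'match' and 'score' keys shown in the example above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_by_similarity(value_vectors, taxonomy_vectors):
    """Hypothetical helper: map each value to its most similar taxonomy term."""
    results = {}
    for value, vector in value_vectors.items():
        best_term, best_score = None, -1.0
        for term, term_vector in taxonomy_vectors.items():
            score = cosine_similarity(vector, term_vector)
            if score > best_score:
                best_term, best_score = term, score
        results[value] = {"match": best_term, "score": round(best_score, 2)}
    return results


# Toy vectors written by hand; real embeddings would come from an embedding model.
TOY_TAXONOMY = {
    "Road": [1.0, 0.0, 0.0],
    "Street": [0.0, 1.0, 0.0],
    "Avenue": [0.0, 0.0, 1.0],
}
TOY_VALUES = {
    "Rd": [0.9, 0.1, 0.0],
    "Ave": [0.0, 0.2, 0.8],
}

matches = match_by_similarity(TOY_VALUES, TOY_TAXONOMY)
print(matches)
```

A caller could then discard matches whose score falls below a chosen threshold and route those values to a slower, reasoning-based classifier.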

3. taxonomy.apply_taxonomy_reasoning(discrete_fields: list[str], taxonomy: list[str], classification_description: str, hash_file: str = None)

Purpose: Use AI reasoning to classify values into taxonomy

Parameters:

  • discrete_fields: List of values to classify
  • taxonomy: List of allowed categories
  • classification_description: Context for classification
  • hash_file: Optional file hash for progress tracking

Example:

taxonomy.apply_taxonomy_reasoning(
    discrete_fields=["Quick Brown Fox", "Lazy Dog"],
    taxonomy=["Animal", "Object", "Action"],
    classification_description="Classify animal-related phrases"
)
# Returns: {'Quick Brown Fox': 'Animal', 'Lazy Dog': 'Animal'}
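
The hash_file parameter is described as supporting progress tracking. How the library stores progress is not documented here, so the following is only a sketch of one common checkpointing pattern, assuming a JSON file on disk. The function name classify_with_checkpoint, the classify_one callback, and the checkpoint_path argument are all hypothetical.

```python
import json
import os


def classify_with_checkpoint(values, classify_one, checkpoint_path):
    """Classify values one at a time, saving progress after each item.

    Values already present in the checkpoint file are skipped, so an
    interrupted run can resume without repeating earlier work.
    """
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as handle:
            done = json.load(handle)

    for value in values:
        if value in done:
            continue  # classified in a previous run
        done[value] = classify_one(value)
        # Write to a temporary file first, then replace the checkpoint,
        # so a crash mid-write does not corrupt earlier progress.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(done, handle, ensure_ascii=False)
        os.replace(tmp_path, checkpoint_path)
    return done
```

Saving after every item trades some disk I/O for the ability to resume at any point, which matters most when each classification is a slow model call.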

4. taxonomy.translate_headers_reasoning(src_lang, dest_lang, headers)

Purpose: Translate headers between languages using AI reasoning

Parameters:

  • src_lang: Source language code
  • dest_lang: Target language code
  • headers: List of headers to translate

Example:

taxonomy.translate_headers_reasoning(
    src_lang="en",
    dest_lang="es",
    headers=["Street Name", "Zip Code"]
)
# Returns: {'Street Name': 'Nombre de la Calle', 'Zip Code': 'Código Postal'}
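
Once a header mapping like the one above is available, it can be applied to tabular data. The sketch below assumes rows are plain dictionaries keyed by header. The rename_headers helper is hypothetical and not part of the library, and the mapping is hard-coded here in place of an actual call.

```python
def rename_headers(rows, header_map):
    """Return new row dicts with keys renamed; headers missing from the map are kept as-is."""
    return [
        {header_map.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


# Mapping in the shape shown in the example above (hard-coded for illustration).
translated = {"Street Name": "Nombre de la Calle", "Zip Code": "Código Postal"}

rows = [{"Street Name": "Gran Via", "Zip Code": "28013", "Notes": "n/a"}]
renamed = rename_headers(rows, translated)
print(renamed)
```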

5. taxonomy.analyze_text_field(field_name: str, field_value: str, task: Literal["label", "summarize"] = "label")

Purpose: Analyze text fields for classification or summarization

Parameters:

  • field_name: Name of the text field
  • field_value: Text to analyze
  • task: "label" for classification or "summarize" for text summary

Example:

taxonomy.analyze_text_field(
    field_name="Product Description",
    field_value="This ergonomic chair provides lumbar support and adjustable height",
    task="label"
)
# Returns: "Office Furniture"
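
Because task is typed as a Literal, static type checkers can flag unsupported values, but Python does not enforce the annotation at runtime. As a sketch, a caller could validate the value before making a model call. The Task alias and the validate_task helper are assumptions for illustration and are not part of the library.

```python
from typing import Literal, get_args

# Mirrors the allowed values in the signature above.
Task = Literal["label", "summarize"]


def validate_task(task: str) -> str:
    """Return the task unchanged if it is allowed, otherwise raise ValueError."""
    allowed = get_args(Task)
    if task not in allowed:
        raise ValueError(f"task must be one of {allowed}, got {task!r}")
    return task
```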

Environment Variables

Rename .env.example to .env and fill in all of the required fields:

  • MODEL_NAME: Hugging Face model identifier
  • SERVER_URL: Base URL for OpenAI-compatible API
  • API_KEY: Authentication token for the API
  • EMBEDDER_MODEL: Embedding model for semantic similarity
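
A quick way to catch a missing variable before the first API call is to check the environment at startup. The sketch below assumes the variables are already present in the process environment. It does not read the .env file itself, and the load_settings helper is hypothetical rather than something the library provides.

```python
import os

# The four variables listed above.
REQUIRED_VARS = ("MODEL_NAME", "SERVER_URL", "API_KEY", "EMBEDDER_MODEL")


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dictionary as environ makes the check easy to exercise in tests without touching the real environment.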

Features

  • Validation workflow with Pydantic models
  • Progress checkpointing for large datasets
  • Google search integration for ambiguous classifications
