Skip to main content

A library for taxonomy management.

Project description

Taxonomy Library README

This module provides functions for taxonomy creation, text classification, and header translation using AI models and vector similarity search.

Usage

Import the module with:

from bee_taxonomy import taxonomy

Installation

Install the library using pip:

pip install bee-taxonomy

Main Functions

1. taxonomy.propose_taxonomy(field: str, description: str, discrete_fields: list[str] = None)

Purpose: Generate taxonomy suggestions using OpenAI

Parameters:

  • field: Name of the field to categorize
  • description: Description of the field's purpose
  • discrete_fields: Optional specific values to consider

Example:

taxonomy.propose_taxonomy(
    field="Color",
    description="Vehicle paint color classification",
    discrete_fields=["Red", "Blue", "Green", "Custom"]
)
# Returns: ["Red", "Blue", "Green", "Other"]

2. taxonomy.apply_taxonomy_similarity(discrete_fields: list[str], taxonomy: list[str], category_type: str = None)

Purpose: Classify values using semantic similarity with vector database

Parameters:

  • discrete_fields: Values to classify
  • taxonomy: List of allowed classification terms
  • category_type: Special processing for categories like 'streets'

Example:

taxonomy.apply_taxonomy_similarity(
    discrete_fields=["Rd", "Street", "Ave"],
    taxonomy=["Road", "Street", "Avenue"],
    category_type="streets"
)
# Returns: {'Rd': {'match': 'Road', 'score': 0.92}, ...}

3. taxonomy.apply_taxonomy_reasoning(discrete_fields: list[str], taxonomy: list[str], classification_description: str, hash_file: str = None)

Purpose: Use AI reasoning to classify values into taxonomy

Parameters:

  • discrete_fields: List of values to classify
  • taxonomy: List of allowed categories
  • classification_description: Context for classification
  • hash_file: Optional file hash for progress tracking

Example:

taxonomy.apply_taxonomy_reasoning(
    discrete_fields=["Quick Brown Fox", "Lazy Dog"],
    taxonomy=["Animal", "Object", "Action"],
    classification_description="Classify animal-related phrases"
)
# Returns: {'Quick Brown Fox': 'Animal', 'Lazy Dog': 'Animal'

4. taxonomy.translate_headers_reasoning(src_lang, dest_lang, headers)

Purpose: Translate headers between languages using AI reasoning

Parameters:

  • src_lang: Source language code
  • dest_lang: Target language code
  • headers: List of headers to translate

Example:

taxonomy.translate_headers_reasoning(
    src_lang="en",
    dest_lang="es",
    headers=["Street Name", "Zip Code"]
)
# Returns: {'Street Name': 'Nombre de la Calle', 'Zip Code': 'Código Postal'

5. taxonomy.analyze_text_field(field_name: str, field_value: str, task: Literal["label", "summarize"] = "label")

Purpose: Analyze text fields for classification or summarization

Parameters:

  • field_name: Name of the text field
  • field_value: Text to analyze
  • task: "label" for classification or "summarize" for text summary

Example:

taxonomy.analyze_text_field(
    field_name="Product Description",
    field_value="This ergonomic chair provides lumbar support and adjustable height",
    task="label"
)
# Returns: "Office Furniture"

Environment Variables

Users must rename .env.example to .env and fill in all the required fields with their specific values:

  • MODEL_NAME: Hugging Face model identifier
  • SERVER_URL: Base URL for OpenAI-compatible API
  • API_KEY: Authentication token for the API
  • EMBEDDER_MODEL: Embedding model for semantic similarity

Features

  • Validation workflow with Pydantic models
  • Progress checkpointing for large datasets
  • Google search integration for ambiguous classifications

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bee_taxonomy-0.0.12.tar.gz (26.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

bee_taxonomy-0.0.12-py3-none-any.whl (14.4 kB view details)

Uploaded Python 3

File details

Details for the file bee_taxonomy-0.0.12.tar.gz.

File metadata

  • Download URL: bee_taxonomy-0.0.12.tar.gz
  • Upload date:
  • Size: 26.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for bee_taxonomy-0.0.12.tar.gz
Algorithm Hash digest
SHA256 6ae5dc253efc995badbeec5d75b2d72b6644f693c496043b6be061c0488e1e26
MD5 6ffb31742303f2a2c0f4d75f2eff74ed
BLAKE2b-256 88062b8e9393ed6a49de53cdd970772a84665f2adf9b6804648c56266b6da35a

See more details on using hashes here.

File details

Details for the file bee_taxonomy-0.0.12-py3-none-any.whl.

File metadata

  • Download URL: bee_taxonomy-0.0.12-py3-none-any.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for bee_taxonomy-0.0.12-py3-none-any.whl
Algorithm Hash digest
SHA256 2830164e2f611d61474b29528140c954e2e7a2aa72286fe3d95ec5cba583bd0c
MD5 6054b35b0a9657f98ce0a2fffafcbb84
BLAKE2b-256 5d75fc1bbd24b9111ad3710d6e4dd57e101e5221e5ad3ad8a64d6e3973e732c0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page