Skip to main content

A library for taxonomy management.

Project description

Taxonomy Library README

This module provides functions for taxonomy creation, text classification, and header translation using AI models and vector similarity search.

Usage

Import the module with:

from bee_taxonomy import taxonomy

Main Functions

1. taxonomy.propose_taxonomy(field: str, description: str, discrete_fields: list[str] = None)

Purpose: Generate taxonomy suggestions using OpenAI Parameters:

  • field: Name of the field to categorize
  • description: Description of the field's purpose
  • discrete_fields: Optional specific values to consider

Example:

taxonomy.propose_taxonomy(
    field="Color",
    description="Vehicle paint color classification",
    discrete_fields=["Red", "Blue", "Green", "Custom"]
)
# Returns: ["Red", "Blue", "Green", "Other"]

2. taxonomy.apply_taxonomy_similarity(discrete_fields: list[str], taxonomy: list[str], category_type: str = None)

Purpose: Classify values using semantic similarity with vector database Parameters:

  • discrete_fields: Values to classify
  • taxonomy: List of allowed classification terms
  • category_type: Special processing for categories like 'streets'

Example:

taxonomy.apply_taxonomy_similarity(
    discrete_fields=["Rd", "Street", "Ave"],
    taxonomy=["Road", "Street", "Avenue"],
    category_type="streets"
)
# Returns: {'Rd': {'match': 'Road', 'score': 0.92}, ...}

3. taxonomy.apply_taxonomy_reasoning(discrete_fields: list[str], taxonomy: list[str], classification_description: str, hash_file: str = None)

Parameters:

  • discrete_fields: List of values to classify
  • taxonomy: List of allowed categories
  • classification_description: Context for classification
  • hash_file: Optional file hash for progress tracking

Example:

taxonomy.apply_taxonomy_reasoning(
    discrete_fields=["Quick Brown Fox", "Lazy Dog"],
    taxonomy=["Animal", "Object", "Action"],
    classification_description="Classify animal-related phrases"
)
# Returns: {'Quick Brown Fox': 'Animal', 'Lazy Dog': 'Animal'

4. taxonomy.translate_headers_reasoning(src_lang, dest_lang, headers)

Parameters:

  • src_lang: Source language code
  • dest_lang: Target language code
  • headers: List of headers to translate

Example:

taxonomy.translate_headers_reasoning(
    src_lang="en",
    dest_lang="es",
    headers=["Street Name", "Zip Code"]
)
# Returns: {'Street Name': 'Nombre de la Calle', 'Zip Code': 'Código Postal'

5. taxonomy.analyze_text_field(field_name: str, field_value: str, task: Literal["label", "summarize"] = "label")

Parameters:

  • field_name: Name of the text field
  • field_value: Text to analyze
  • task: "label" for classification or "summarize" for text summary

Example:

taxonomy.analyze_text_field(
    field_name="Product Description",
    field_value="This ergonomic chair provides lumbar support and adjustable height",
    task="label"
)
# Returns: "Office Furniture"

Environment Variables

  • MODEL_NAME: Hugging Face model identifier
  • SERVER_URL: Base URL for OpenAI-compatible API
  • API_KEY: Authentication token for the API
  • EMBEDDER_MODEL: Embedding model for semantic similarity

Features

  • Validation workflow with Pydantic models
  • Progress checkpointing for large datasets
  • Google search integration for ambiguous classifications

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bee_taxonomy-0.0.7.tar.gz (12.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

bee_taxonomy-0.0.7-py3-none-any.whl (14.1 kB view details)

Uploaded Python 3

File details

Details for the file bee_taxonomy-0.0.7.tar.gz.

File metadata

  • Download URL: bee_taxonomy-0.0.7.tar.gz
  • Upload date:
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for bee_taxonomy-0.0.7.tar.gz
Algorithm Hash digest
SHA256 fa8890f4bfdfc8e21681873f3647dda1f367cdb7cd83f0cdd6ee8f68eb35ef4f
MD5 c292e31711a691ae52cc1c48ef73473d
BLAKE2b-256 9a3af332b99e1faee638ed91f55a819f43ac56a1bec3b181511427ec7d7dc9c6

See more details on using hashes here.

File details

Details for the file bee_taxonomy-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: bee_taxonomy-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 14.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for bee_taxonomy-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 f8f8d946a1444e8761ba08093f9152593441a1022e26d2e94858684501220a15
MD5 889cd80fdb83b370a9888a97c55d6293
BLAKE2b-256 5bade1c521f58ce54dce7be8a56d37cc9ffe73c138b2234562fe954fc820e951

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page