A library for taxonomy management.

Taxonomy Library README

This module provides functions for taxonomy creation, text classification, and header translation using AI models and vector similarity search.

Usage

Import the module with:

from bee_taxonomy import taxonomy

Installation

Install the library using pip:

pip install bee-taxonomy

Main Functions

1. taxonomy.propose_taxonomy(field: str, description: str, discrete_fields: list[str] = None)

Purpose: Generate taxonomy suggestions using a model served through an OpenAI-compatible API

Parameters:

  • field: Name of the field to categorize
  • description: Description of the field's purpose
  • discrete_fields: Optional specific values to consider

Example:

taxonomy.propose_taxonomy(
    field="Color",
    description="Vehicle paint color classification",
    discrete_fields=["Red", "Blue", "Green", "Custom"]
)
# Returns: ["Red", "Blue", "Green", "Other"]

2. taxonomy.apply_taxonomy_similarity(discrete_fields: list[str], taxonomy: list[str], category_type: str = None)

Purpose: Classify values by semantic similarity, using a vector database

Parameters:

  • discrete_fields: Values to classify
  • taxonomy: List of allowed classification terms
  • category_type: Optional; enables special processing for categories such as 'streets'

Example:

taxonomy.apply_taxonomy_similarity(
    discrete_fields=["Rd", "Street", "Ave"],
    taxonomy=["Road", "Street", "Avenue"],
    category_type="streets"
)
# Returns: {'Rd': {'match': 'Road', 'score': 0.92}, ...}
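To make the idea of nearest-term matching concrete, here is a minimal, self-contained sketch. It is not the library's implementation: it swaps the embedding model and vector database for a toy character-bigram vector and a plain cosine similarity, and the helper names (char_ngrams, cosine, nearest_terms) are hypothetical. Only the shape of the result follows the example above.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Toy stand-in for an embedding: counts of character n-grams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_terms(values: list[str], terms: list[str]) -> dict[str, dict]:
    """Map each value to its most similar taxonomy term and the score."""
    term_vectors = {term: char_ngrams(term) for term in terms}
    result = {}
    for value in values:
        vector = char_ngrams(value)
        best = max(terms, key=lambda term: cosine(vector, term_vectors[term]))
        score = cosine(vector, term_vectors[best])
        result[value] = {"match": best, "score": round(score, 2)}
    return result


matches = nearest_terms(["Rd", "Street", "Ave"], ["Road", "Street", "Avenue"])
for value, match in matches.items():
    print(value, "->", match)
```

A real embedding model captures meaning rather than spelling, so its scores will differ from those of this toy measure.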

3. taxonomy.apply_taxonomy_reasoning(discrete_fields: list[str], taxonomy: list[str], classification_description: str, hash_file: str = None)

Purpose: Use AI reasoning to classify values into a taxonomy

Parameters:

  • discrete_fields: List of values to classify
  • taxonomy: List of allowed categories
  • classification_description: Context for classification
  • hash_file: Optional file hash for progress tracking

Example:

taxonomy.apply_taxonomy_reasoning(
    discrete_fields=["Quick Brown Fox", "Lazy Dog"],
    taxonomy=["Animal", "Object", "Action"],
    classification_description="Classify animal-related phrases"
)
# Returns: {'Quick Brown Fox': 'Animal', 'Lazy Dog': 'Animal'}

4. taxonomy.translate_headers_reasoning(src_lang, dest_lang, headers)

Purpose: Translate headers between languages using AI reasoning

Parameters:

  • src_lang: Source language code
  • dest_lang: Target language code
  • headers: List of headers to translate

Example:

taxonomy.translate_headers_reasoning(
    src_lang="en",
    dest_lang="es",
    headers=["Street Name", "Zip Code"]
)
# Returns: {'Street Name': 'Nombre de la Calle', 'Zip Code': 'Código Postal'}
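Once a translation mapping is available, applying it to a header row is a one-line lookup. The sketch below assumes the mapping has the dictionary shape shown in the example above; rename_headers is a hypothetical helper, not part of the library, and the translations are hard-coded rather than fetched from a model.

```python
def rename_headers(headers: list[str], translations: dict[str, str]) -> list[str]:
    """Replace each header with its translation, keeping untranslated ones as-is."""
    return [translations.get(header, header) for header in headers]


# Mapping in the shape shown in the example above (hard-coded here).
translations = {"Street Name": "Nombre de la Calle", "Zip Code": "Código Postal"}
renamed = rename_headers(["Street Name", "Zip Code", "Notes"], translations)
print(renamed)
```

Falling back to the original header keeps the column count unchanged when a header has no translation.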

5. taxonomy.analyze_text_field(field_name: str, field_value: str, task: Literal["label", "summarize"] = "label")

Purpose: Analyze text fields for classification or summarization

Parameters:

  • field_name: Name of the text field
  • field_value: Text to analyze
  • task: "label" for classification or "summarize" for text summary

Example:

taxonomy.analyze_text_field(
    field_name="Product Description",
    field_value="This ergonomic chair provides lumbar support and adjustable height",
    task="label"
)
# Returns: "Office Furniture"

Environment Variables

Rename .env.example to .env and fill in every required field with your own values:

  • MODEL_NAME: Hugging Face model identifier
  • SERVER_URL: Base URL for OpenAI-compatible API
  • API_KEY: Authentication token for the API
  • EMBEDDER_MODEL: Embedding model for semantic similarity
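A quick way to catch a missing setting before calling any model is to check the four variables up front. This is a sketch under assumptions: load_settings is a hypothetical helper, and it only inspects the process environment (or a mapping passed in). How the library itself reads the .env file is not described here.

```python
import os

REQUIRED_VARS = ("MODEL_NAME", "SERVER_URL", "API_KEY", "EMBEDDER_MODEL")


def load_settings(environ=None) -> dict[str, str]:
    """Return the four settings, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in REQUIRED_VARS}


# Placeholder values for illustration only.
example_env = {
    "MODEL_NAME": "example-org/example-model",
    "SERVER_URL": "http://localhost:8000/v1",
    "API_KEY": "not-a-real-key",
    "EMBEDDER_MODEL": "example-org/example-embedder",
}
settings = load_settings(example_env)
```

Calling load_settings() with no argument checks os.environ, which is where values from .env end up once they have been loaded into the process.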

Features

  • Validation workflow with Pydantic models
  • Progress checkpointing for large datasets
  • Google search integration for ambiguous classifications
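Progress checkpointing can be pictured as saving results after every item so that an interrupted run resumes where it stopped. The sketch below illustrates that general pattern with a JSON file. It is not the library's mechanism: classify_with_checkpoint, toy_classifier, and the file layout are all hypothetical.

```python
import json
import os
import tempfile


def classify_with_checkpoint(values, classify, checkpoint_path):
    """Classify values one by one, saving after each so a rerun can resume."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as handle:
            done = json.load(handle)
    for value in values:
        if value in done:
            continue  # already classified in an earlier run
        done[value] = classify(value)
        with open(checkpoint_path, "w", encoding="utf-8") as handle:
            json.dump(done, handle, ensure_ascii=False)
    return done


calls = []


def toy_classifier(value):
    """Hypothetical stand-in for a slow model call; records each invocation."""
    calls.append(value)
    return "Animal" if value in {"Fox", "Dog"} else "Other"


path = os.path.join(tempfile.mkdtemp(), "progress.json")
first = classify_with_checkpoint(["Fox", "Chair"], toy_classifier, path)
second = classify_with_checkpoint(["Fox", "Chair", "Dog"], toy_classifier, path)
```

On the second call only "Dog" reaches the classifier, because "Fox" and "Chair" are read back from the checkpoint file.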
