Customizable Case-Based Reasoning (CBR) toolkit for Python with a built-in API and CLI.

CBRkit

PyPI | Docs | Example

Installation

The library is available on PyPI, so you can install it with pip:

pip install cbrkit

It comes with several optional dependencies for tasks like NLP, which can be installed with:

pip install cbrkit[EXTRA_NAME,...]

where EXTRA_NAME is one of the following:

  • nlp: Standalone NLP tools (levenshtein, nltk, openai, and spacy)
  • transformers: NLP tools based on pytorch and transformers
  • cli: Command Line Interface (CLI)
  • api: REST API Server
  • all: All of the above
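
After installing an extra, it can be useful to confirm that its packages are importable. The following is a minimal sketch using only the standard library. The import names in EXTRA_MODULES are assumptions derived from the list above (for example, that the levenshtein package is imported as Levenshtein and pytorch as torch); they are not taken from CBRkit's package metadata, and missing_modules is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed import names for the packages each extra pulls in.
# These are guesses based on the list above, not CBRkit metadata.
EXTRA_MODULES = {
    "nlp": ["Levenshtein", "nltk", "openai", "spacy"],
    "transformers": ["torch", "transformers"],
}


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


for extra, modules in EXTRA_MODULES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{extra}: {status}")
```

If an extra is reported as missing, reinstalling with the corresponding EXTRA_NAME should resolve it, provided the assumed import names are right.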

Usage

CBRkit lets you define similarity metrics through composition: you can build even complex similarity measures by mixing built-in and custom ones. CBRkit also includes predefined aggregation functions. To get started, we provide a demo project that shows how to use the library in a real-world scenario. The following modules are part of CBRkit:

  • loaders: Functions for loading cases and queries.
  • sim: Similarity generator functions for various data types (e.g., strings, numbers).
  • global_sim: Similarity generator functions for aggregating the measures from sim.
  • retrieval: Functions for retrieving cases based on a query.
  • typing: Generic type definitions for defining custom functions.

CBRkit is fully typed, so IDEs like VSCode and PyCharm can provide autocompletion and type checking. We will explain all modules and their basic usage in the following sections.
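
To make the idea of composition concrete, here is a plain-Python sketch that does not use CBRkit at all. It assumes a "similarity generator" is a function that returns a similarity function, and that a global measure combines per-attribute measures with a weighted mean. All names (numeric_sim, equality_sim, weighted_mean) are hypothetical and chosen for illustration; CBRkit's own functions and signatures may differ.

```python
from typing import Any, Callable, Mapping

# A similarity function maps two values to a score in [0, 1].
SimFunc = Callable[[Any, Any], float]


def numeric_sim(max_distance: float) -> SimFunc:
    """Generate a linear measure: 1.0 for equal values, 0.0 at max_distance or beyond."""

    def sim(x: float, y: float) -> float:
        return max(0.0, 1.0 - abs(x - y) / max_distance)

    return sim


def equality_sim() -> SimFunc:
    """Generate a measure that is 1.0 for equal values and 0.0 otherwise."""

    def sim(x: Any, y: Any) -> float:
        return 1.0 if x == y else 0.0

    return sim


def weighted_mean(
    local_sims: Mapping[str, SimFunc], weights: Mapping[str, float]
) -> SimFunc:
    """Generate a global measure from per-attribute measures and their weights."""
    total = sum(weights[attr] for attr in local_sims)

    def sim(x: Mapping[str, Any], y: Mapping[str, Any]) -> float:
        weighted = sum(
            weights[attr] * local(x[attr], y[attr])
            for attr, local in local_sims.items()
        )
        return weighted / total

    return sim


# Compose a case-level measure from two attribute-level measures.
case_sim = weighted_mean(
    {"name": equality_sim(), "age": numeric_sim(max_distance=50)},
    weights={"name": 1.0, "age": 3.0},
)

case = {"name": "Jane", "age": 35}
query = {"name": "John", "age": 25}
score = case_sim(case, query)
print(round(score, 3))
```

Because every measure has the same two-argument shape, a composed measure can itself be used as a building block in a larger one.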

Loading Cases and Queries

The first step is to load cases and queries. We provide predefined functions for the most common formats like CSV, JSON, and XML. CBRkit also integrates with pandas for loading data frames. The following example shows how to load cases from a CSV file using pandas:

import pandas as pd
import cbrkit

df = pd.read_csv("path/to/cases.csv")
cases = cbrkit.loaders.dataframe(df)

When dealing with formats like JSON, the files can be loaded directly:

cases = cbrkit.loaders.json("path/to/cases.json")
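
As an illustration only, the standard-library sketch below shows one plausible shape for a loaded collection of cases: a mapping from a key to a case. The helpers casebase_from_csv and casebase_from_json are hypothetical and not part of CBRkit; the actual loaders may key or type the cases differently.

```python
import csv
import io
import json


def casebase_from_csv(text: str) -> dict:
    """Key each CSV row by its position; values stay strings, as csv yields them."""
    return dict(enumerate(csv.DictReader(io.StringIO(text))))


def casebase_from_json(text: str) -> dict:
    """Key list items by position; pass JSON objects through unchanged."""
    data = json.loads(text)
    if isinstance(data, list):
        return dict(enumerate(data))
    if isinstance(data, dict):
        return data
    raise TypeError(f"Expected a JSON list or object, got {type(data).__name__}")


csv_cases = casebase_from_csv("name,age\nJohn,25\nJane,35\n")
json_cases = casebase_from_json('[{"name": "John", "age": 25}]')

for key, case in csv_cases.items():
    print(key, case)
```

Note that the CSV variant keeps every value as a string, whereas JSON preserves numbers. A similarity measure for numeric attributes would need the values converted first.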

Queries can be loaded using the same loader functions. CBRkit expects the type of the queries to match the type of the cases.

# for pandas
queries = cbrkit.loaders.dataframe(pd.read_csv("path/to/queries.csv"))
# for json
queries = cbrkit.loaders.json("path/to/queries.json")

If your query collection contains only a single entry, you can use the singleton function to extract it.

query = cbrkit.helpers.singleton(queries)
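
For intuition, a helper of this kind could be sketched as follows. This is not CBRkit's implementation: only_entry is a hypothetical stand-in, and raising a ValueError when the collection does not hold exactly one entry is an assumption.

```python
from collections.abc import Mapping


def only_entry(collection):
    """Return the single value of a one-element mapping or collection."""
    if len(collection) != 1:
        raise ValueError(f"Expected exactly one entry, found {len(collection)}")
    if isinstance(collection, Mapping):
        # For mappings, return the value rather than the key.
        return next(iter(collection.values()))
    return next(iter(collection))


queries = {"q1": {"name": "John", "age": 25}}
query = only_entry(queries)
print(query)
```

Failing loudly on zero or multiple entries avoids silently picking an arbitrary query.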

Alternatively, you can also create a query directly in Python:

# for pandas
query = pd.Series({"name": "John", "age": 25})
# for json
query = {"name": "John", "age": 25}
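
Since queries are expected to match the type of the cases, a quick sanity check before retrieval can catch mismatches early. The sketch below works on plain dictionaries only and uses a hypothetical helper, attribute_mismatches. It is not part of CBRkit, and treating "matching" as "same attribute names with the same Python types" is an assumption made for illustration.

```python
def attribute_mismatches(query, casebase):
    """Map each case key to the query attributes it lacks or stores with another type."""
    problems = {}
    for key, case in casebase.items():
        bad = [
            attr
            for attr, value in query.items()
            if attr not in case or type(case[attr]) is not type(value)
        ]
        if bad:
            problems[key] = bad
    return problems


casebase = {
    0: {"name": "Jane", "age": 35},
    1: {"name": "Bob", "age": "40"},  # age stored as a string
}
query = {"name": "John", "age": 25}

problems = attribute_mismatches(query, casebase)
print(problems)
```

An empty result means every case offers all query attributes with matching types.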

Similarity Measures and Aggregation
