ADALA: Automated Data Labeling Agent

Adala is an Autonomous DAta (Labeling) Agent framework.

Adala offers a robust framework for implementing agents specialized in data processing, with a particular emphasis on diverse data labeling tasks. These agents are autonomous: they can independently acquire one or more skills through iterative learning. The learning process is shaped by the agent's operating environment, its observations, and its reflections. Users define the environment by providing a ground truth dataset. Every agent learns and applies its skills in what we call a "runtime", which is synonymous with an LLM.
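The learning cycle described above (apply a skill, observe the errors against ground truth, reflect, and revise) can be illustrated with a toy sketch. This is not Adala's implementation: `keyword_runtime`, `evaluate`, and `learn` are hypothetical names, and the "runtime" is a simple keyword matcher standing in for an LLM so that the example runs offline.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for an LLM runtime: (instructions, text) -> label.
Runtime = Callable[[str, str], str]
Example = Tuple[str, str]  # (text, ground truth label)


def keyword_runtime(instructions: str, text: str) -> str:
    """Toy runtime: instructions are 'keyword=Label' rules separated by ';'."""
    for rule in filter(None, instructions.split(";")):
        keyword, label = rule.split("=", 1)
        if keyword in text.lower():
            return label
    return "Neutral"  # fallback when no rule matches


def evaluate(instructions: str, ground_truth: List[Example], runtime: Runtime):
    """Observation step: return accuracy and the misclassified examples."""
    errors = [(text, label) for text, label in ground_truth
              if runtime(instructions, text) != label]
    return 1 - len(errors) / len(ground_truth), errors


def learn(instructions: str, ground_truth: List[Example], runtime: Runtime,
          iterations: int = 3, threshold: float = 0.95):
    """Iterate observe -> reflect -> revise until the threshold is met."""
    for _ in range(iterations):
        accuracy, errors = evaluate(instructions, ground_truth, runtime)
        if accuracy >= threshold:
            break
        # Reflection step (toy): turn the first error into a new keyword rule.
        text, label = errors[0]
        instructions = f"{instructions};{text.lower().split()[0]}={label}"
    return instructions, evaluate(instructions, ground_truth, runtime)[0]


toy_ground_truth = [
    ("broke after a week", "Negative"),
    ("love it", "Positive"),
    ("it is a phone", "Neutral"),
]
learned, final_accuracy = learn("", toy_ground_truth, keyword_runtime)
print(learned, final_accuracy)
```

In a real agent the reflection step would be performed by an LLM rewriting the skill's instructions; the keyword rule here only makes the loop structure visible.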

Diagram of components

Why Choose Adala?

  • Reliable Agents: Built upon a foundation of ground truth data, our agents ensure consistent and trustworthy results, making Adala a reliable choice for data processing needs.

  • Controllable Output: For every skill, you can configure the desired output, setting specific constraints with varying degrees of flexibility. Whether you want strict adherence to particular guidelines or more adaptive outputs based on the agent's learning, Adala allows you to tailor results to your exact needs.

  • Specialized in Data Processing: While our agents excel in diverse data labeling tasks, they can be tailored to a wide range of data processing needs.

  • Autonomous Learning: Adala agents aren't just automated; they're intelligent. They iteratively and independently develop skills based on environment, observations, and reflections.

  • Flexible and Extensible Runtime: Adala's runtime environment is adaptable. A single skill can be deployed across multiple runtimes, facilitating dynamic scenarios like the student/teacher architecture. Moreover, the openness of our framework invites the community to extend and tailor runtimes, ensuring continuous evolution and adaptability to diverse needs.

  • Extend Skills: Quickly tailor and develop agents to address the specific challenges and nuances of your domain, without facing a steep learning curve.

Installation

Install ADALA:

pip install adala

If you plan to use human-in-the-loop labeling, or need a labeling tool to produce ground truth datasets, we suggest installing Label Studio. Adala supports the Label Studio format out of the box.

pip install label-studio

Prerequisites

Set OPENAI_API_KEY (see instructions here)

export OPENAI_API_KEY='your-openai-api-key'
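If you work in a notebook, it can help to confirm that the variable is visible to the Python process before running the quickstart. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper, not part of Adala.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the quickstart.")
    return value


# Usage: call this once at the top of the notebook.
# require_env("OPENAI_API_KEY")
```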

Quickstart

In this example we will use ADALA as a standalone library directly inside a Python notebook. You can open it in Colab right here.

import pandas as pd

from adala.agents import Agent
from adala.datasets import DataFrameDataset
from adala.environments import BasicEnvironment
from adala.skills import ClassificationSkill
from rich import print

print("=> Initialize datasets ...")

# Train dataset
train_df = pd.DataFrame([
    ["It was the negative first impressions, and then it started working.", "Positive"],
    ["Not loud enough and doesn't turn on like it should.", "Negative"],
    ["I don't know what to say.", "Neutral"],
    ["Manager was rude, but the most important that mic shows very flat frequency response.", "Positive"],
    ["The phone doesn't seem to accept anything except CBR mp3s.", "Negative"],
    ["I tried it before, I bought this device for my son.", "Neutral"],
], columns=["text", "ground_truth"])

# Test dataset
test_df = pd.DataFrame([
    "All three broke within two months of use.",
    "The device worked for a long time, can't say anything bad.",
    "Just a random line of text.",
    "Will order from them again!",
], columns=["text"])

train_dataset = DataFrameDataset(df=train_df)
test_dataset = DataFrameDataset(df=test_df)

print("=> Initialize and train ADALA agent ...")
agent = Agent(
    # connect to a dataset
    environment=BasicEnvironment(
        ground_truth_dataset=train_dataset,
        ground_truth_column="ground_truth"
    ),
    # define a skill
    skills=ClassificationSkill(
        name='sentiment_classification',
        instructions="Label text as positive, negative or neutral.",
        labels=["Positive", "Negative", "Neutral"],
        input_data_field='text'
    ),
    
    # uncomment this for higher quality if you have access to the OpenAI GPT-4 model
    # default_teacher_runtime='openai-gpt4',
)
print(agent)

agent.learn(learning_iterations=3, accuracy_threshold=0.95)
print(agent.skills)

print('\n=> Run tests ...')
run = agent.apply_skills(test_dataset)
print('\n => Test results:')
print(run)
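The `accuracy_threshold=0.95` argument above suggests that learning stops once predictions agree with the ground truth often enough. As a rough illustration of that kind of check, here is a plain Python sketch that scores a list of predicted labels against the training labels and counts the mismatches. The `accuracy` and `confusions` helpers and the `predictions` list are hypothetical; how to extract predictions from `run` is not shown in this README, so the sketch assumes you already have them as a list of strings.

```python
from collections import Counter
from typing import List


def accuracy(predictions: List[str], ground_truth: List[str]) -> float:
    """Fraction of predictions that exactly match the ground truth labels."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    matches = sum(p == g for p, g in zip(predictions, ground_truth))
    return matches / len(ground_truth)


def confusions(predictions: List[str], ground_truth: List[str]) -> Counter:
    """Count (expected, predicted) pairs for the mismatches only."""
    return Counter((g, p) for p, g in zip(predictions, ground_truth) if p != g)


# Labels from the training DataFrame above, in row order.
ground_truth = ["Positive", "Negative", "Neutral", "Positive", "Negative", "Neutral"]
# Hypothetical predictions, invented for illustration only.
predictions = ["Positive", "Negative", "Neutral", "Negative", "Negative", "Neutral"]

score = accuracy(predictions, ground_truth)
print(f"accuracy={score:.2f}, meets threshold: {score >= 0.95}")
print(confusions(predictions, ground_truth))
```

Looking at the mismatched pairs, rather than the score alone, shows which labels the instructions fail to separate.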

More Notebooks

  • Quickstart – An extended example of the above with comments and outputs.
  • Creating New Skill – An example that walks you through creating a new skill.
  • Label Studio Tutorial – An example of connecting Adala to an external labeling tool for enhanced supervision.

Who is Adala for?

Adala is a versatile framework designed for individuals and professionals in the field of AI and machine learning. Here's who can benefit:

  • AI Engineers: Architect and design AI Agent systems with modular, interconnected skills. Build production-level agent systems, abstracting low-level ML to Adala and LLMs.
  • Machine Learning Researchers: Experiment with complex problem decomposition and causal reasoning.
  • Data Scientists: Apply agents to preprocess and postprocess your data. Interact with Adala natively through Python notebooks when working with large Dataframes.
  • Educators and Students: Use Adala as a teaching tool or as a base for advanced projects and research.

These roles are the primary audience, but Adala is designed to streamline AI development for anyone working in the field, whatever their specialty.

Roadmap

  • Create Named Entity Recognition Skill
  • Extend Environment with one more example
  • Command Line Utility (see the source for this readme for example)
  • REST API to interact with Adala

Contributing to Adala

Dive into the heart of Adala by enhancing Skills, optimizing Runtimes, or pioneering new Agent Types. Whether you're crafting nuanced tasks, refining computational environments, or sculpting specialized agents for unique domains, your contributions will power Adala's evolution. Join us in shaping the future of intelligent systems and making Adala more versatile and impactful for users across the globe.

Read more here.

Support

Are you in need of assistance or looking to engage with our community? Our Discord channel is the perfect place for real-time support and interaction. Whether you have questions, need clarifications, or simply want to discuss topics related to our project, the Discord community is welcoming!
