A multi-agent data processing system built on AgentScope and Data-Juicer

Data-Juicer Agents: Towards Agentic Data Processing

A suite of agents for agentic data processing, built on Data-Juicer (DJ) and AgentScope.

Roadmap

The long-term vision of DJ-Agents is to enable a development-free data processing lifecycle, allowing developers to focus on what to do rather than how to do it.

To achieve this vision, we are tackling two fundamental challenges:

  • Agents: How to design and build powerful agents specialized in data processing
  • Services & Tools: How to package these agents into ready-to-use, out-of-the-box products

We iterate continuously in both directions, and the roadmap may change as our understanding and capabilities improve.


Agents

  • Data-Juicer Data Processing Agent (DJ Process Agent) & Data-Juicer Code Development Agent (DJ Dev Agent)
  • We have stopped building scenario-specific data processing agents and are instead building data processing tools for general-purpose agents. On top of these tools, we:
    • Hard-orchestrate these tools into capabilities, exposed as the djx CLI
    • Soft-orchestrate them through prompts, packaged as skills
    • Rely on agent self-orchestration to support conversational data processing
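
The difference between hard and soft orchestration can be illustrated with a minimal sketch. Everything in it is hypothetical: the tool names (`drop_short_text`, `dedup_text`), the `TOOLS` registry, and the prompt wording are assumptions made for illustration. They are not the Data-Juicer Agents API, the djx CLI, or an actual skill definition.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Tool = Callable[[List[Record]], List[Record]]


def drop_short_text(records: List[Record]) -> List[Record]:
    """Keep records whose text has at least 5 characters (hypothetical tool)."""
    return [r for r in records if len(r["text"]) >= 5]


def dedup_text(records: List[Record]) -> List[Record]:
    """Drop records whose text was already seen (hypothetical tool)."""
    seen = set()
    kept = []
    for record in records:
        if record["text"] not in seen:
            seen.add(record["text"])
            kept.append(record)
    return kept


# Hypothetical registry of tools that an agent or a CLI could draw on.
TOOLS: Dict[str, Tool] = {
    "drop_short_text": drop_short_text,
    "dedup_text": dedup_text,
}


def hard_orchestrated_clean(records: List[Record]) -> List[Record]:
    """Hard orchestration: the sequence of tool calls is fixed in code."""
    return dedup_text(drop_short_text(records))


def soft_orchestration_prompt() -> str:
    """Soft orchestration: describe the tools in a prompt and let a
    general-purpose agent decide which ones to call and in what order."""
    lines = ["You can call these tools to process a dataset:"]
    for name, tool in TOOLS.items():
        lines.append(f"- {name}: {tool.__doc__}")
    return "\n".join(lines)
```

In this sketch, the hard-orchestrated function always runs the same two steps, while the prompt only advertises the tools and leaves the plan to the agent. Self-orchestration would go one step further and give the agent the registry without any guiding prompt.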

Services & Tools

  • Q&A Copilot: a Q&A assistant for the Data-Juicer ecosystem
  • InteRecipe: interactive data recipe construction through natural language
    • [2026-03-11]: the current ./interactive_recipe only shows workflow-based examples. The dj-agents CLI entry is already built and supports interactive data-recipe construction through natural language in the TUI. We are developing a frontend tool (studio) on top of this foundation as the next upgrade.

Priority Items

  • DJ Skills: use prompt-based soft orchestration to package tools into skills for general-purpose agents.
  • InteRecipe Studio: support interactive data recipe construction through natural language, with multi-dimensional data and result views.
  • Plan Tool: extend support for fuller Data-Juicer capability coverage, DJ Hub recipe matching, and more.
  • Dev Tool: stability testing and optimization.

Long-term Directions

  • Continue building tools and skills for broader data-processing scenarios, enabling wider and more flexible applications.
    • RAG
    • Embodied Intelligence
    • Data Lakehouse architectures

Common Issues

Q: How do I get a DashScope API key?
A: Visit the official DashScope website, register an account, and apply for an API key.

Related Resources

  • Data-Juicer has been used by a large number of Tongyi and Alibaba Cloud internal and external users, and has facilitated many research works. All code is continuously maintained and enhanced.

You are welcome to visit the GitHub repository, star or fork it, submit issues, and join the community!

Contributing: Issues and pull requests that improve Data-Juicer Agents, Data-Juicer, and AgentScope are welcome. If you run into problems or have feature suggestions, please contact us.

Download files

Source Distribution

data_juicer_agents-0.0.2.tar.gz (96.3 kB)

Uploaded Source

Built Distribution

data_juicer_agents-0.0.2-py3-none-any.whl (150.5 kB)

Uploaded Python 3

File details

Details for the file data_juicer_agents-0.0.2.tar.gz.

File metadata

  • Download URL: data_juicer_agents-0.0.2.tar.gz
  • Upload date:
  • Size: 96.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for data_juicer_agents-0.0.2.tar.gz:

  • SHA256: 0495d78091f654ed42b0f2e7954d46ee91d8ab6de75a8b96cbc105715c8cbd39
  • MD5: 2e4d7b2d03e862b1d5757fb2feab7a04
  • BLAKE2b-256: a6694f4a36ff25db5ed6fdfa2b6a17876ad60e9317c2783680116b31d1773efd
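
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library. The helper names (`sha256_of`, `matches_expected`) and the assumption that the archive sits in the current directory are illustrative and not part of the package.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for data_juicer_agents-0.0.2.tar.gz.
EXPECTED_SHA256 = "0495d78091f654ed42b0f2e7954d46ee91d8ab6de75a8b96cbc105715c8cbd39"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location: the archive was downloaded to the current directory.
    archive = Path("data_juicer_agents-0.0.2.tar.gz")
    if archive.exists():
        print("digest matches" if matches_expected(archive, EXPECTED_SHA256)
              else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading the file in chunks keeps memory use flat regardless of archive size. The same approach works for the wheel by substituting its file name and SHA256 digest.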

File details

Details for the file data_juicer_agents-0.0.2-py3-none-any.whl.

File hashes

Hashes for data_juicer_agents-0.0.2-py3-none-any.whl:

  • SHA256: 83d61396b6c0eeb39177336ff4f8d6800369d40ea1f90a76c69ad757e45ea2dc
  • MD5: 83785a5998a61fc93329519bfcef7d77
  • BLAKE2b-256: 8fbfb2b8067a3623b851a61fd4da1201c0e6a1ba6369a765f6fdc9af9d6a3905
