A multi-agent data processing system built on AgentScope and Data-Juicer

Data-Juicer Agents: Towards Agentic Data Processing

A Suite of Agents for Agentic Data Processing. Built on Data-Juicer (DJ) and AgentScope.

🏗️ Overview Doc • ⚡️ Quick Start Doc • >_ CLI Doc • 🔧 Tools Doc • 🎯 Roadmap

Roadmap

The long-term vision of DJ-Agents is to enable a development-free data processing lifecycle, allowing developers to focus on what to do rather than how to do it.

To achieve this vision, we are tackling two fundamental challenges:

  • Agents: How to design and build powerful agents specialized in data processing
  • Services & Tools: How to package these agents into ready-to-use, out-of-the-box products

We iterate on both directions continuously, so the roadmap may evolve as our understanding and capabilities improve.


Agents

  • Data-Juicer Data Processing Agent (DJ Process Agent) & Data-Juicer Code Development Agent (DJ Dev Agent)
  • We have stopped building scenario-specific data processing agents, and instead are building data processing tools for general-purpose agents. From there:
    • Hard-orchestrate these tools into capabilities, exposed as the djx CLI
    • Soft-orchestrate them through prompts, packaged as skills
    • Rely on agent self-orchestration to support conversational data processing
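The difference between hard and soft orchestration can be illustrated with a minimal sketch. The tool names (`drop_empty`, `dedup`) and both helper functions below are hypothetical and are not part of the Data-Juicer Agents API; they only show that hard orchestration fixes the tool order in code, while soft orchestration describes the tools in a prompt and leaves the ordering to the agent.

```python
from typing import Callable, Dict, List


# Hypothetical tools: plain functions a general-purpose agent could call.
# These names are illustrative only, not real Data-Juicer Agents tools.
def drop_empty(rows: List[str]) -> List[str]:
    """Remove rows that are empty or whitespace-only."""
    return [r for r in rows if r.strip()]


def dedup(rows: List[str]) -> List[str]:
    """Remove exact duplicates while preserving first-seen order."""
    return list(dict.fromkeys(rows))


TOOLS: Dict[str, Callable[[List[str]], List[str]]] = {
    "drop_empty": drop_empty,
    "dedup": dedup,
}


def hard_orchestrate(rows: List[str]) -> List[str]:
    """Hard orchestration: the tool order is fixed in code (CLI-style)."""
    for name in ("drop_empty", "dedup"):
        rows = TOOLS[name](rows)
    return rows


def soft_orchestration_prompt() -> str:
    """Soft orchestration: the tools are described in a prompt (skill-style)
    and the agent decides when and in which order to call them."""
    tool_list = "\n".join(f"- {name}: {fn.__doc__}" for name, fn in TOOLS.items())
    return (
        "You can clean text data with these tools:\n"
        f"{tool_list}\n"
        "Prefer drop_empty before dedup."
    )


if __name__ == "__main__":
    print(hard_orchestrate(["a", "", "b", "a", "  "]))
    print(soft_orchestration_prompt())
```

In the third mode, agent self-orchestration, neither the fixed pipeline nor the guiding prompt would be supplied; the agent would see only the tool descriptions.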

Services & Tools

  • Q&A Copilot: a Q&A assistant for the Data-Juicer ecosystem
  • InteRecipe: interactive data recipe construction through natural language
    • [2026-03-11]: the current ./interactive_recipe only shows workflow-based examples. The dj-agents CLI entry is already built and supports interactive data-recipe construction through natural language in the TUI. We are developing a frontend tool (studio) on top of this foundation as the next upgrade.

Priority Items

  • DJ Skills: use prompt-based soft orchestration to package tools into skills for general-purpose agents.
  • InteRecipe Studio: support interactive data recipe construction through natural language, with multi-dimensional data and result views.
  • Plan Tool: extend support for fuller Data-Juicer capability coverage, DJ Hub recipe matching, and more.
  • Dev Tool: stability testing and optimization.

Long-term Directions

  • Continue building tools and skills for broader data-processing scenarios, enabling wider and more flexible applications.
    • RAG
    • Embodied Intelligence
    • Data Lakehouse architectures

Common Issues

Q: How do I get a DashScope API key?
A: Register an account on the official DashScope website and apply for an API key.
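Once you have a key, a common pattern is to keep it out of source code and read it from the environment. The sketch below assumes the key is stored in an environment variable named `DASHSCOPE_API_KEY`; that name and the helper `load_dashscope_api_key` are assumptions for illustration, so check the Quick Start Doc for the variable this project actually expects.

```python
import os


def load_dashscope_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    The default variable name is an assumption, not confirmed by this README.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it before starting the agent, "
            f"for example: export {env_var}=<your key>"
        )
    return key


if __name__ == "__main__":
    try:
        load_dashscope_api_key()
        print("API key found")
    except RuntimeError as err:
        print(err)
```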

Related Resources

  • Data-Juicer is used by a large number of users inside and outside Tongyi and Alibaba Cloud, and has supported many research projects. All code is continuously maintained and enhanced.

Visit us on GitHub to star or fork the project, submit issues, and join the community!

Contributing: Issues and pull requests that improve Data-Juicer Agents, Data-Juicer, and AgentScope are welcome. If you run into problems or have feature suggestions, please contact us.

Download files

Source Distribution

data_juicer_agents-0.0.3.tar.gz (122.5 kB)

Uploaded Source

Built Distribution

data_juicer_agents-0.0.3-py3-none-any.whl (193.3 kB)

Uploaded Python 3

File details

Details for the file data_juicer_agents-0.0.3.tar.gz.

File metadata

  • Download URL: data_juicer_agents-0.0.3.tar.gz
  • Size: 122.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for data_juicer_agents-0.0.3.tar.gz
Algorithm Hash digest
SHA256 7e147ce87ca80dc94b9b0854cdb68e2a07e95d68747799524a1f145c582e4a8a
MD5 fd72ab435d0a6ccf1110891cff986ae5
BLAKE2b-256 f882c0a80048b5141e4363392b8045646530785392522ce915f558fcaa0e0f4c
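A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_expected` are illustrative, and the script assumes the archive was saved under its original file name in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the listing above for the source distribution.
EXPECTED_SHA256 = "7e147ce87ca80dc94b9b0854cdb68e2a07e95d68747799524a1f145c582e4a8a"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("data_juicer_agents-0.0.3.tar.gz")  # assumed local file name
    if archive.exists():
        print("OK" if matches_expected(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same check applies to the wheel by substituting its file name and its SHA256 digest.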

File details

Details for the file data_juicer_agents-0.0.3-py3-none-any.whl.

File hashes

Hashes for data_juicer_agents-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 6eaec861bd2b43644dfa5f64ec5fbebbda607239902340e8739656d8b4d30f17
MD5 f980a4f74d4981a540464f46b95f4c04
BLAKE2b-256 a8fd66219a50955e8e310e1da24558f947899d2fd5b81801c295d001d0ed2fae
