A multi-agent data processing system built on AgentScope and Data-Juicer

Data-Juicer Agents: Towards Agentic Data Processing

A suite of agents for agentic data processing, built on Data-Juicer (DJ) and AgentScope.

Roadmap

The long-term vision of DJ-Agents is to enable a development-free data processing lifecycle, allowing developers to focus on what to do rather than how to do it.

To achieve this vision, we are tackling two fundamental challenges:

  • Agents: How to design and build powerful agents specialized in data processing
  • Services & Tools: How to package these agents into ready-to-use, out-of-the-box products

We iterate on both directions continuously, and the roadmap may change as our understanding and capabilities improve.


Agents

  • Data-Juicer Data Processing Agent (DJ Process Agent) & Data-Juicer Code Development Agent (DJ Dev Agent)
  • We have stopped building scenario-specific data processing agents and are instead building data processing tools for general-purpose agents. These tools are then used in three ways:
    • Hard-orchestrate these tools into capabilities, exposed as the djx CLI
    • Soft-orchestrate them through prompts, packaged as skills
    • Rely on agent self-orchestration to support conversational data processing
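
The difference between hard and soft orchestration can be illustrated with a minimal sketch. Everything in it is hypothetical and exists only for illustration: the tool names (drop_empty, dedupe), the record shape, and the prompt wording are assumptions, not the actual DJ-Agents tools, the djx CLI, or the skill format. Hard orchestration fixes the tool order in code, while soft orchestration only describes the tools and a suggested order in prompt text that a general-purpose agent reads.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Tool = Callable[[List[Record]], List[Record]]


def drop_empty(records: List[Record]) -> List[Record]:
    """Hypothetical tool: remove records whose text is empty or whitespace."""
    return [r for r in records if r["text"].strip()]


def dedupe(records: List[Record]) -> List[Record]:
    """Hypothetical tool: keep only the first occurrence of each text."""
    seen = set()
    kept = []
    for r in records:
        if r["text"] not in seen:
            seen.add(r["text"])
            kept.append(r)
    return kept


# Hypothetical tool registry shared by both orchestration styles.
TOOLS: Dict[str, Tool] = {"drop_empty": drop_empty, "dedupe": dedupe}


def hard_orchestrate(records: List[Record]) -> List[Record]:
    """Hard orchestration: the order of tool calls is fixed in code."""
    for name in ("drop_empty", "dedupe"):
        records = TOOLS[name](records)
    return records


def soft_orchestrate_prompt() -> str:
    """Soft orchestration: the order is only suggested in prompt text."""
    lines = ["You can call these tools:"]
    for name, fn in TOOLS.items():
        lines.append(f"- {name}: {fn.__doc__}")
    lines.append("Suggested order: drop_empty, then dedupe.")
    return "\n".join(lines)


sample = [{"text": "a"}, {"text": " "}, {"text": "a"}, {"text": "b"}]
cleaned = hard_orchestrate(sample)
prompt = soft_orchestrate_prompt()
```

In the third style, agent self-orchestration, neither the fixed pipeline nor the suggested order would be supplied: the agent would see only the tool descriptions and decide on its own which tools to call during the conversation.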

Services & Tools

  • Q&A Copilot: a Q&A assistant for the Data-Juicer ecosystem
  • InteRecipe: interactive data recipe construction through natural language
    • [2026-03-11]: the current ./interactive_recipe only shows workflow-based examples. The dj-agents CLI entry is already built and supports interactive data-recipe construction through natural language in the TUI. We are developing a frontend tool (studio) on top of this foundation as the next upgrade.

Priority Items

  • DJ Skills: use prompt-based soft orchestration to package tools into skills for general-purpose agents.
  • InteRecipe Studio: support interactive data recipe construction through natural language, with multi-dimensional data and result views.
  • Plan Tool: extend support for fuller Data-Juicer capability coverage, DJ Hub recipe matching, and more.
  • Dev Tool: stability testing and optimization.

Long-term Directions

  • Continue building tools and skills for broader data-processing scenarios, enabling wider and more flexible applications.
    • RAG
    • Embodied Intelligence
    • Data Lakehouse architectures

Common Issues

Q: How do I get a DashScope API key?
A: Visit the official DashScope website, register an account, and apply for an API key.
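
Once a key has been issued, a common pattern is to keep it in an environment variable rather than in code. The sketch below assumes the variable is named DASHSCOPE_API_KEY, which is the name the DashScope SDK conventionally reads; this document does not state which variable DJ-Agents expects, so treat the name and the helper function get_dashscope_api_key as illustrative assumptions.

```python
import os


def get_dashscope_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    The default variable name is an assumption; adjust it if your setup differs.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export your DashScope API key before "
            "starting the agents."
        )
    return key
```

Calling get_dashscope_api_key() at startup surfaces a missing key immediately, instead of at the first model call.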

Related Resources

  • Data-Juicer is used by many internal and external users of Tongyi and Alibaba Cloud and has supported many research works. All code is continuously maintained and enhanced.

You are welcome to visit the project on GitHub, star or fork it, submit issues, and join the community!

Contributing: Issues and pull requests that improve Data-Juicer Agents, Data-Juicer, and AgentScope are welcome. If you run into problems or have feature suggestions, please contact us.
