ACE-Step 1.5
Pushing the Boundaries of Open-Source Music Generation
Project | Hugging Face | ModelScope | Space Demo | Discord | Technical Report
Table of Contents
- Features
- Installation
- Model Download
- Usage
- Tutorial
- Train
- Architecture
- Model Zoo
Abstract
We present ACE-Step v1.5, a highly efficient open-source music foundation model that brings commercial-grade generation to consumer hardware. On commonly used evaluation metrics, ACE-Step v1.5 achieves quality beyond most commercial music models while remaining extremely fast: under 2 seconds per full song on an A100 and under 10 seconds on an RTX 3090. The model runs locally with less than 4GB of VRAM, and supports lightweight personalization: users can train a LoRA from just a few songs to capture their own style.
At its core lies a novel hybrid architecture where the Language Model (LM) functions as an omni-capable planner: it transforms simple user queries into comprehensive song blueprints, scaling from short loops to 10-minute compositions, while synthesizing metadata, lyrics, and captions via Chain-of-Thought to guide the Diffusion Transformer (DiT). Uniquely, this alignment is achieved through intrinsic reinforcement learning relying solely on the model's internal mechanisms, thereby eliminating the biases inherent in external reward models or human preferences.
Beyond standard synthesis, ACE-Step v1.5 unifies precise stylistic control with versatile editing capabilities (such as cover generation, repainting, and vocal-to-BGM conversion) while maintaining strict adherence to prompts across 50+ languages. This paves the way for powerful tools that seamlessly integrate into the creative workflows of music artists, producers, and content creators.
Features
Performance
- Ultra-Fast Generation: Under 2s per full song on A100, under 10s on RTX 3090 (0.5s to 10s on A100 depending on think mode & diffusion steps)
- Flexible Duration: Supports 10 seconds to 10 minutes (600s) audio generation
- Batch Generation: Generate up to 8 songs simultaneously
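The duration and batch limits listed above can be checked before a request is submitted. The following is a minimal sketch, not part of ACE-Step itself: the function name `check_generation_request` and the constants are hypothetical, and only the numeric limits (10 to 600 seconds, up to 8 songs per batch) come from the feature list.

```python
# Limits taken from the feature list; the names below are illustrative only.
MIN_DURATION_S = 10    # shortest supported clip
MAX_DURATION_S = 600   # 10 minutes
MAX_BATCH_SIZE = 8     # songs generated simultaneously


def check_generation_request(duration_s: float, batch_size: int) -> list[str]:
    """Return a list of problems; an empty list means the request is within limits."""
    problems = []
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        problems.append(
            f"duration {duration_s}s is outside {MIN_DURATION_S}-{MAX_DURATION_S}s"
        )
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        problems.append(f"batch size {batch_size} is outside 1-{MAX_BATCH_SIZE}")
    return problems


if __name__ == "__main__":
    print(check_generation_request(duration_s=700, batch_size=9))
```

Actual limits on a given machine may be lower than these, since the GPU compatibility guide mentions duration and batch-size limits that depend on VRAM.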
Generation Quality
- Commercial-Grade Output: Quality beyond most commercial music models (between Suno v4.5 and Suno v5)
- Rich Style Support: 1000+ instruments and styles with fine-grained timbre description
- Multi-Language Lyrics: Supports 50+ languages with lyrics prompt for structure & style control
Versatility & Control
| Feature | Description |
|---|---|
| Reference Audio Input | Use reference audio to guide generation style |
| Cover Generation | Create covers from existing audio |
| Repaint & Edit | Selective local audio editing and regeneration |
| Track Separation | Separate audio into individual stems |
| Multi-Track Generation | Add layers like Suno Studio's "Add Layer" feature |
| Vocal2BGM | Auto-generate accompaniment for vocal tracks |
| Metadata Control | Control duration, BPM, key/scale, time signature |
| Simple Mode | Generate full songs from simple descriptions |
| Query Rewriting | Auto LM expansion of tags and lyrics |
| Audio Understanding | Extract BPM, key/scale, time signature & caption from audio |
| LRC Generation | Auto-generate lyric timestamps for generated music |
| LoRA Training | One-click annotation & training in Gradio. 8 songs, 1 hour on 3090 (12GB VRAM) |
| Quality Scoring | Automatic quality assessment for generated audio |
Staying ahead
Star ACE-Step on GitHub and be instantly notified of new releases
Installation
Requirements: Python 3.11, CUDA GPU recommended (works on CPU/MPS but slower)
1. Install uv (Package Manager)
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
2. Clone & Install
git clone https://github.com/ACE-Step/ACE-Step-1.5.git
cd ACE-Step-1.5
uv sync
3. Launch
Gradio Web UI (Recommended)
uv run acestep
Open http://localhost:7860 in your browser. Models will be downloaded automatically on first run.
REST API Server
uv run acestep-api
API runs at http://localhost:8001. See API Documentation for endpoints.
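After launching, it can be useful to confirm that the servers are actually accepting connections. The sketch below only checks whether the default ports (7860 for the Gradio UI, 8001 for the REST API) are open on localhost; it uses the standard library and does not call any ACE-Step endpoint. The helper name `is_listening` is hypothetical.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Default ports from the launch instructions above.
    for name, port in (("Gradio UI", 7860), ("REST API", 8001)):
        state = "up" if is_listening("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

An open port only shows that a process is listening there. Models may still be downloading or initializing on first run.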
Command Line Options
Gradio UI (acestep):
| Option | Default | Description |
|---|---|---|
| --port | 7860 | Server port |
| --server-name | 127.0.0.1 | Server address (use 0.0.0.0 for network access) |
| --share | false | Create public Gradio link |
| --language | en | UI language: en, zh, ja |
| --init_service | false | Auto-initialize models on startup |
| --config_path | auto | DiT model (e.g., acestep-v15-turbo, acestep-v15-turbo-shift3) |
| --lm_model_path | auto | LM model (e.g., acestep-5Hz-lm-0.6B, acestep-5Hz-lm-1.7B) |
| --offload_to_cpu | auto | CPU offload (auto-enabled if VRAM < 16GB) |
| --enable-api | false | Enable REST API endpoints alongside Gradio UI |
| --api-key | none | API key for API endpoints authentication |
| --auth-username | none | Username for Gradio authentication |
| --auth-password | none | Password for Gradio authentication |
Examples:
# Public access with Chinese UI
uv run acestep --server-name 0.0.0.0 --share --language zh
# Pre-initialize models on startup
uv run acestep --init_service true --config_path acestep-v15-turbo
# Enable API endpoints with authentication
uv run acestep --enable-api --api-key sk-your-secret-key --port 8001
# Enable both Gradio auth and API auth
uv run acestep --enable-api --api-key sk-123456 --auth-username admin --auth-password password
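When the UI is launched from a script, the same commands can be assembled as an argument list rather than a shell string. This is a minimal sketch under stated assumptions: `build_acestep_command` is a hypothetical helper, the flag names are copied from the options table, and passing `None` as a value emits a bare flag (as `--share` and `--enable-api` appear in the examples above).

```python
import shlex

# Flag names copied from the options table; nothing else is assumed about them.
KNOWN_FLAGS = {
    "--port", "--server-name", "--share", "--language", "--init_service",
    "--config_path", "--lm_model_path", "--offload_to_cpu", "--enable-api",
    "--api-key", "--auth-username", "--auth-password",
}


def build_acestep_command(options: dict) -> list[str]:
    """Build an argv list for `uv run acestep`; a value of None emits a bare flag."""
    cmd = ["uv", "run", "acestep"]
    for flag, value in options.items():
        if flag not in KNOWN_FLAGS:
            raise ValueError(f"unknown option: {flag}")
        cmd.append(flag)
        if value is not None:
            cmd.append(str(value))
    return cmd


if __name__ == "__main__":
    cmd = build_acestep_command(
        {"--server-name": "0.0.0.0", "--share": None, "--language": "zh"}
    )
    # Print a shell-safe rendering; pass `cmd` to subprocess.run() to launch.
    print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems with values such as API keys or passwords.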
Development
# Add dependencies
uv add package-name
uv add --dev package-name
# Update all dependencies
uv sync --upgrade
Model Download
Models are automatically downloaded from HuggingFace (https://huggingface.co/ACE-Step/Ace-Step1.5) or ModelScope on first run. You can also download them manually with the acestep-download CLI or huggingface-cli.
Automatic Download
When you run acestep or acestep-api, the system will:
- Check if the required models exist in ./checkpoints
- If not found, automatically download them from HuggingFace
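The existence check described above can be approximated by hand, for example to see what is still missing before going offline. This sketch assumes that each component of the main model (vae, Qwen3-Embedding-0.6B, acestep-v15-turbo, acestep-5Hz-lm-1.7B, as listed for the main repository in this README) sits in a subdirectory of that name under ./checkpoints. That layout is an assumption, and the project's own check may work differently.

```python
from pathlib import Path

# Component names as listed for the main model; the on-disk layout is assumed.
MAIN_COMPONENTS = (
    "vae",
    "Qwen3-Embedding-0.6B",
    "acestep-v15-turbo",
    "acestep-5Hz-lm-1.7B",
)


def missing_components(checkpoint_dir="./checkpoints", components=MAIN_COMPONENTS):
    """Return the component names with no matching subdirectory in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in components if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_components()
    print("All components present" if not missing else f"Missing: {missing}")
```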
Manual Download with CLI
# Download main model (includes everything needed to run)
uv run acestep-download
# Download all available models (including optional variants)
uv run acestep-download --all
# Download a specific model
uv run acestep-download --model acestep-v15-sft
# List all available models
uv run acestep-download --list
# Download to a custom directory
uv run acestep-download --dir /path/to/checkpoints
Manual Download with huggingface-cli
You can also use huggingface-cli directly:
# Download main model (includes vae, Qwen3-Embedding-0.6B, acestep-v15-turbo, acestep-5Hz-lm-1.7B)
huggingface-cli download ACE-Step/Ace-Step1.5 --local-dir ./checkpoints
# Download optional LM models
huggingface-cli download ACE-Step/acestep-5Hz-lm-0.6B --local-dir ./checkpoints/acestep-5Hz-lm-0.6B
huggingface-cli download ACE-Step/acestep-5Hz-lm-4B --local-dir ./checkpoints/acestep-5Hz-lm-4B
# Download optional DiT models
huggingface-cli download ACE-Step/acestep-v15-base --local-dir ./checkpoints/acestep-v15-base
huggingface-cli download ACE-Step/acestep-v15-sft --local-dir ./checkpoints/acestep-v15-sft
huggingface-cli download ACE-Step/acestep-v15-turbo-shift1 --local-dir ./checkpoints/acestep-v15-turbo-shift1
huggingface-cli download ACE-Step/acestep-v15-turbo-shift3 --local-dir ./checkpoints/acestep-v15-turbo-shift3
huggingface-cli download ACE-Step/acestep-v15-turbo-continuous --local-dir ./checkpoints/acestep-v15-turbo-continuous
Available Models
| Model | HuggingFace Repo | Description |
|---|---|---|
| Main | ACE-Step/Ace-Step1.5 | Core components: vae, Qwen3-Embedding-0.6B, acestep-v15-turbo, acestep-5Hz-lm-1.7B |
| acestep-5Hz-lm-0.6B | ACE-Step/acestep-5Hz-lm-0.6B | Lightweight LM model (0.6B params) |
| acestep-5Hz-lm-4B | ACE-Step/acestep-5Hz-lm-4B | Large LM model (4B params) |
| acestep-v15-base | ACE-Step/acestep-v15-base | Base DiT model |
| acestep-v15-sft | ACE-Step/acestep-v15-sft | SFT DiT model |
| acestep-v15-turbo-shift1 | ACE-Step/acestep-v15-turbo-shift1 | Turbo DiT with shift1 |
| acestep-v15-turbo-shift3 | ACE-Step/acestep-v15-turbo-shift3 | Turbo DiT with shift3 |
| acestep-v15-turbo-continuous | ACE-Step/acestep-v15-turbo-continuous | Turbo DiT with continuous shift (1-5) |
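The optional repositories in the table can also be fetched from Python with the `huggingface_hub` library, which provides `snapshot_download`. The sketch below mirrors the huggingface-cli commands above by placing each repository in a subdirectory of ./checkpoints named after the model. The helper names `local_dir_for` and `download_optional` are hypothetical.

```python
from pathlib import Path

# Optional repositories, copied from the table above.
OPTIONAL_REPOS = (
    "ACE-Step/acestep-5Hz-lm-0.6B",
    "ACE-Step/acestep-5Hz-lm-4B",
    "ACE-Step/acestep-v15-base",
    "ACE-Step/acestep-v15-sft",
    "ACE-Step/acestep-v15-turbo-shift1",
    "ACE-Step/acestep-v15-turbo-shift3",
    "ACE-Step/acestep-v15-turbo-continuous",
)


def local_dir_for(repo_id: str, checkpoint_dir: str = "./checkpoints") -> Path:
    """Map 'ACE-Step/<model>' to '<checkpoint_dir>/<model>', as in the CLI examples."""
    return Path(checkpoint_dir) / repo_id.split("/", 1)[1]


def download_optional(repo_id: str, checkpoint_dir: str = "./checkpoints") -> Path:
    """Download one optional repository; requires network access."""
    # Imported lazily so the path helper works without the third-party package.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, checkpoint_dir)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


if __name__ == "__main__":
    for repo in OPTIONAL_REPOS:
        print(repo, "->", local_dir_for(repo))
```

Calling `download_optional("ACE-Step/acestep-v15-sft")` would correspond to the matching huggingface-cli line. Whether the 4B LM repository is available yet is not something this sketch checks.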
Which Model Should I Choose?
ACE-Step automatically adapts to your GPU's VRAM. Here's a quick guide:
| Your GPU VRAM | Recommended LM Model | Notes |
|---|---|---|
| ≤6GB | None (DiT only) | LM disabled by default to save memory |
| 6-12GB | acestep-5Hz-lm-0.6B | Lightweight, good balance |
| 12-16GB | acestep-5Hz-lm-1.7B | Better quality |
| ≥16GB | acestep-5Hz-lm-4B | Best quality and audio understanding |
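The table above can be read as a simple lookup on available VRAM. The sketch below encodes it as a function. The table's ranges share their endpoints (12GB appears in two rows), so this version assumes each upper bound is exclusive, apart from the 6GB row. That choice and the name `recommend_lm` are assumptions made for illustration, not ACE-Step's own selection logic.

```python
from typing import Optional


def recommend_lm(vram_gb: float) -> Optional[str]:
    """Return the LM model suggested by the table, or None for DiT-only operation."""
    if vram_gb <= 6:
        return None  # LM disabled to save memory
    if vram_gb < 12:
        return "acestep-5Hz-lm-0.6B"
    if vram_gb < 16:
        return "acestep-5Hz-lm-1.7B"
    return "acestep-5Hz-lm-4B"


if __name__ == "__main__":
    for vram in (4, 8, 12, 24):
        print(f"{vram}GB -> {recommend_lm(vram)}")
```

The returned name matches the form of the values shown for the --lm_model_path option.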
For detailed GPU compatibility information (duration limits, batch sizes, memory optimization), see GPU Compatibility Guide: English | 中文 (Chinese) | 日本語 (Japanese)
Usage
We provide multiple ways to use ACE-Step:
| Method | Description | Documentation |
|---|---|---|
| Gradio Web UI | Interactive web interface for music generation | Gradio Guide |
| Python API | Programmatic access for integration | Inference API |
| REST API | HTTP-based async API for services | REST API |
Documentation available in: English | 中文 (Chinese) | 日本語 (Japanese)
Tutorial
Must Read: Comprehensive guide to ACE-Step 1.5's design philosophy and usage methods.
| Language | Link |
|---|---|
| English | English Tutorial |
| 中文 (Chinese) | Chinese Tutorial |
| 日本語 (Japanese) | Japanese Tutorial |
This tutorial covers:
- Mental models and design philosophy
- Model architecture and selection
- Input control (text and audio)
- Inference hyperparameters
- Random factors and optimization strategies
Train
See the LoRA Training tab in Gradio UI for one-click training, or check Gradio Guide - LoRA Training for details.
Architecture
Model Zoo
DiT Models
| DiT Model | Pre-Training | SFT | RL | CFG | Step | Refer audio | Text2Music | Cover | Repaint | Extract | Lego | Complete | Quality | Diversity | Fine-Tunability | Hugging Face |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| acestep-v15-base | โ | โ | โ | โ | 50 | โ | โ | โ | โ | โ | โ | โ | Medium | High | Easy | Link |
| acestep-v15-sft | โ | โ | โ | โ | 50 | โ | โ | โ | โ | โ | โ | โ | High | Medium | Easy | Link |
| acestep-v15-turbo | โ | โ | โ | โ | 8 | โ | โ | โ | โ | โ | โ | โ | Very High | Medium | Medium | Link |
| acestep-v15-turbo-rl | โ | โ | โ | โ | 8 | โ | โ | โ | โ | โ | โ | โ | Very High | Medium | Medium | To be released |
LM Models
| LM Model | Pretrain from | Pre-Training | SFT | RL | CoT metas | Query rewrite | Audio Understanding | Composition Capability | Copy Melody | Hugging Face |
|---|---|---|---|---|---|---|---|---|---|---|
| acestep-5Hz-lm-0.6B | Qwen3-0.6B | โ | โ | โ | โ | โ | Medium | Medium | Weak | โ |
| acestep-5Hz-lm-1.7B | Qwen3-1.7B | โ | โ | โ | โ | โ | Medium | Medium | Medium | โ |
| acestep-5Hz-lm-4B | Qwen3-4B | โ | โ | โ | โ | โ | Strong | Strong | Strong | To be released |
License & Disclaimer
This project is licensed under the MIT License.
ACE-Step enables original music generation across diverse genres, with applications in creative production, education, and entertainment. While designed to support positive and artistic use cases, we acknowledge potential risks such as unintentional copyright infringement due to stylistic similarity, inappropriate blending of cultural elements, and misuse for generating harmful content. To ensure responsible use, we encourage users to verify the originality of generated works, clearly disclose AI involvement, and obtain appropriate permissions when adapting protected styles or materials. By using ACE-Step, you agree to uphold these principles and respect artistic integrity, cultural diversity, and legal compliance. The authors are not responsible for any misuse of the model, including but not limited to copyright violations, cultural insensitivity, or the generation of harmful content.
Important Notice
The only official website for the ACE-Step project is our GitHub Pages site.
We do not operate any other websites.
Fake domains include but are not limited to:
ac**p.com, a**p.org, a***c.org
Please be cautious. Do not visit, trust, or make payments on any of those sites.
Acknowledgements
This project is co-led by ACE Studio and StepFun.
Citation
If you find this project useful for your research, please consider citing:
@misc{gong2026acestep,
title={ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation},
author={Junmin Gong and Yulin Song and Wenxiao Zhao and Sen Wang and Shengyuan Xu and Jing Guo},
howpublished={\url{https://github.com/ace-step/ACE-Step-1.5}},
year={2026},
note={GitHub repository}
}
File details
Details for the file ace_step15_fork-20260212-py3-none-any.whl.
File metadata
- Download URL: ace_step15_fork-20260212-py3-none-any.whl
- Size: 1.4 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 543abae0b84b271bcf9c64bfbf2db8c9f19b2c71fb44896b58dcd8974685a4ce |
| MD5 | 10eb60a6792c14b3584145b568508f25 |
| BLAKE2b-256 | 8d5401ddb6983b7305c369d5583f434f9d464b2588690720ffb146439b4aa5bb |
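A downloaded wheel can be compared against the published SHA256 digest before installation. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the file path in the usage example assumes the wheel was saved to the current directory under its published name.

```python
import hashlib

# SHA256 digest published for ace_step15_fork-20260212-py3-none-any.whl.
EXPECTED_SHA256 = "543abae0b84b271bcf9c64bfbf2db8c9f19b2c71fb44896b58dcd8974685a4ce"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    import os

    wheel = "ace_step15_fork-20260212-py3-none-any.whl"  # assumed local path
    if os.path.exists(wheel):
        print("hash OK" if matches_published_hash(wheel) else "hash MISMATCH")
    else:
        print(f"{wheel} not found in the current directory")
```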