XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library.

Full Documentation | Chinese Documentation (中文文档) | README_CN.md

XuanCe is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations.

We call it Xuan-Ce (玄策) in Chinese. "Xuan (玄)" means incredible, like a magic box, and "Ce (策)" means policy.

DRL algorithms are sensitive to hyperparameter tuning, vary in performance depending on the implementation tricks used, and suffer from unstable training. As a result, they can sometimes seem elusive and "Xuan". This project provides thorough, high-quality, and easy-to-understand implementations of DRL algorithms, and we hope these implementations shed some light on the magic of reinforcement learning.

We expect it to be compatible with multiple deep learning backends (PyTorch, TensorFlow, and MindSpore), and hope it can become a zoo full of DRL algorithms.

Paper link: https://arxiv.org/pdf/2312.16248.pdf

Features

:school_satchel: Highly modularized.
:thumbsup: Easy to learn, easy to install, and easy to use.
:twisted_rightwards_arrows: Flexible model combination.
:tada: Abundant algorithms for various tasks.
:couple: Supports both DRL and MARL tasks.
:key: Broad compatibility (PyTorch, TensorFlow2, MindSpore, CPU, GPU, Linux, Windows, macOS, etc.).
:zap: Fast running speed with parallel environments.
:computer: Distributed training with multiple GPUs.
🎛️ Supports automatic hyperparameter tuning.
:chart_with_upwards_trend: Good visualization with TensorBoard or wandb.

Algorithms

:point_right: DRL

DQN: Deep Q Network [Paper]
Double DQN: DQN with Double Q-learning [Paper]
Dueling DQN: DQN with Dueling Network [Paper]
PER: DQN with Prioritized Experience Replay [Paper]
NoisyDQN: DQN with Parameter Space Noise for Exploration [Paper]
DRQN: Deep Recurrent Q-Network [Paper]
QRDQN: DQN with Quantile Regression [Paper]
C51: Distributional Reinforcement Learning [Paper]
PG: Vanilla Policy Gradient [Paper]
NPG: Natural Policy Gradient [Paper]
PPG: Phasic Policy Gradient [Paper] [Code]
A2C: Advantage Actor Critic [Paper] [Code]
SAC: Soft Actor-Critic [Paper] [Code]
SAC-Discrete: Soft Actor-Critic for Discrete Actions [Paper] [Code]
PPO-Clip: Proximal Policy Optimization with Clipped Objective [Paper] [Code]
PPO-KL: Proximal Policy Optimization with KL Divergence [Paper] [Code]
DDPG: Deep Deterministic Policy Gradient [Paper] [Code]
TD3: Twin Delayed Deep Deterministic Policy Gradient [Paper] [Code]
P-DQN: Parameterised Deep Q-Network [Paper]
MP-DQN: Multi-pass Parameterised Deep Q-network [Paper] [Code]
SP-DQN: Split Parameterised Deep Q-Network [Paper]

:point_right: Model-Based Reinforcement Learning (MBRL)

DreamerV2 [Paper] [Code]
DreamerV3 [Paper] [Code]
HarmonyDream [Paper] [Code]

:point_right: Multi-Agent Reinforcement Learning (MARL)

IQL: Independent Q-learning [Paper] [Code]
VDN: Value Decomposition Networks [Paper] [Code]
QMIX: Q-mixing networks [Paper] [Code]
WQMIX: Weighted Q-mixing networks [Paper] [Code]
QTRAN: Q-transformation [Paper] [Code]
DCG: Deep Coordination Graphs [Paper] [Code]
IDDPG: Independent Deep Deterministic Policy Gradient [Paper]
MADDPG: Multi-agent Deep Deterministic Policy Gradient [Paper] [Code]
IAC: Independent Actor-Critic [Paper] [Code]
COMA: Counterfactual Multi-agent Policy Gradient [Paper] [Code]
VDAC: Value-Decomposition Actor-Critic [Paper] [Code]
IPPO: Independent Proximal Policy Optimization [Paper] [Code]
MAPPO: Multi-agent Proximal Policy Optimization [Paper] [Code]
MFQ: Mean-Field Q-learning [Paper] [Code]
MFAC: Mean-Field Actor-Critic [Paper] [Code]
ISAC: Independent Soft Actor-Critic
MASAC: Multi-agent Soft Actor-Critic [Paper]
MATD3: Multi-agent Twin Delayed Deep Deterministic Policy Gradient [Paper]
IC3Net: Individualized Controlled Continuous Communication Model [Paper] [Code]
CommNet: Communication Neural Net [Paper] [Code]

:point_right: Contrastive Reinforcement Learning (CRL)

CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning [Paper] [Code]
SPR: Data-Efficient Reinforcement Learning with Self-Predictive Representations [Paper] [Code]
DrQ: Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels [Paper] [Code]

Environments

Classic Control: Cart Pole, Pendulum, Acrobot, MountainCar
Box2D: Bipedal Walker, Car Racing, Lunar Lander
MuJoCo Environments: Ant, HalfCheetah, Hopper, HumanoidStandup, Humanoid, InvertedPendulum, ...
Atari Environments: Adventure, Air Raid, Alien, Amidar, Assault, Asterix, Asteroids, ...
Minigrid Environments: GoToDoorEnv, LockedRoomEnv, MemoryEnv, PlaygroundEnv, ...
Drones Environments: Helix, Single-Agent Hover, Multi-Agent Hover, ...
MetaDrive
MPE Environments: Simple Push, Simple Reference, Simple Spread, Simple Adversary, ...
Robotic Warehouse
SMAC
Google Research Football

:point_right: Installation

:computer: XuanCe runs on Linux, Windows, macOS, EulerOS, etc.

Step 1: Set up a Python environment

We recommend installing Anaconda to manage your Python environment. (You can also download a specific Anaconda installer from here.)

Then open a terminal and create/activate a new conda environment (Python >= 3.8 is recommended):

conda create -n xuance_env python=3.8 && conda activate xuance_env

Step 2: Install XuanCe

pip install xuance

This command does not install any deep learning backend. To install XuanCe together with a backend, use pip install xuance[torch] for PyTorch, pip install xuance[tensorflow] for TensorFlow2, pip install xuance[mindspore] for MindSpore, or pip install xuance[all] for all dependencies.
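To confirm which backend actually ended up in the environment, a quick check along these lines may help. This is a minimal sketch and not part of XuanCe: the helper name `installed_backends` is hypothetical, and it assumes the three backends are importable as `torch`, `tensorflow`, and `mindspore`.

```python
import importlib.util
import sys

# Import names assumed for the three backends mentioned above.
BACKENDS = ("torch", "tensorflow", "mindspore")


def installed_backends(names=BACKENDS):
    """Return {module_name: bool} without importing the (heavy) modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Python >= 3.8 is the version recommended in Step 1.
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    for name, found in installed_backends().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Using `find_spec` only looks the module up, so the check stays fast even when a large backend is installed.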

Note: Some extra packages should be installed manually for further usage. Click here to see more details for installation.

:point_right: Quick Start

Train a Model

import xuance

runner = xuance.get_runner(algo='ppo',
                           env='classic_control',
                           env_id='CartPole-v1')
runner.run(mode='train')

Test the Model

import xuance

runner = xuance.get_runner(algo='ppo',
                           env='classic_control',
                           env_id='CartPole-v1')
runner.run(mode='test')
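The two snippets above differ only in the `mode` argument, so they can be wrapped in one small helper. The sketch below uses only the calls shown above (`xuance.get_runner` and `runner.run`); the helper names `runner_kwargs` and `train_then_test` are hypothetical, and creating a fresh runner for the test phase simply mirrors the two separate snippets rather than any documented requirement.

```python
def runner_kwargs(algo="ppo", env="classic_control", env_id="CartPole-v1"):
    """Collect the three arguments passed to xuance.get_runner above."""
    return {"algo": algo, "env": env, "env_id": env_id}


def train_then_test(**overrides):
    """Train a model, then evaluate it with the same settings."""
    import xuance  # imported lazily so runner_kwargs works without XuanCe

    kwargs = runner_kwargs(**overrides)
    xuance.get_runner(**kwargs).run(mode="train")
    xuance.get_runner(**kwargs).run(mode="test")
```

Calling `train_then_test()` with no arguments would reproduce the PPO on CartPole-v1 example; keyword overrides such as `env_id=...` are forwarded unchanged to `get_runner`.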

Visualize the results

Tensorboard

You can use TensorBoard to visualize the training process. After training, the log file is generated automatically in the directory ".results/". Point --logdir at the directory that holds your run's logs (the example path below corresponds to a DQN run on CartPole-v0 with the PyTorch backend), and you should see the training data after running the command.

$ tensorboard --logdir ./logs/dqn/torch/CartPole-v0
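If it is unclear which directory to pass to `--logdir`, the log tree can be searched for event files first. This is a minimal sketch, assuming the logs are standard TensorBoard event files (named `events.out.tfevents.*`) and that they live under `./logs`; the helper name `find_run_dirs` is hypothetical.

```python
from pathlib import Path


def find_run_dirs(root):
    """Return sorted directories under `root` that contain TensorBoard event files."""
    root = Path(root)
    if not root.is_dir():
        return []
    # TensorBoard writers name their files "events.out.tfevents.<...>".
    return sorted({p.parent for p in root.rglob("events.out.tfevents.*")})


if __name__ == "__main__":
    # Print one ready-to-run command per discovered run directory.
    for run_dir in find_run_dirs("./logs"):
        print(f"tensorboard --logdir {run_dir}")
```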

Weights & Biases (wandb)

XuanCe also supports Weights & Biases (wandb) for visualizing the results of a run.

How to use wandb online? :arrow_right: https://github.com/wandb/wandb.git/

How to use wandb offline? :arrow_right: https://github.com/wandb/server.git/
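Separately from the self-hosted server linked above, the wandb client itself has an offline mode that is selected through the `WANDB_MODE` environment variable. The sketch below sets it before any training code runs; whether XuanCe's own configuration overrides this setting is not stated here, so treat it as an assumption to verify.

```python
import os

# Assumption: the wandb client reads WANDB_MODE at start-up; "offline" keeps
# run data on local disk instead of uploading it.
os.environ.setdefault("WANDB_MODE", "offline")
print("WANDB_MODE =", os.environ["WANDB_MODE"])
```

Runs recorded this way can usually be uploaded later with the `wandb sync` command.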

Benchmarks

XuanCe provides an official benchmark pipeline for evaluating DRL and MARL algorithms.

To avoid increasing the size of the main repository, official benchmark results (including evaluation curves, summary tables, and pretrained models) are maintained in a separate repository:

👉 https://github.com/agi-brain/xuance-benchmarks

Users can either:

Run benchmarks locally using the provided pipeline, or
Directly inspect and reuse the official benchmark results without rerunning experiments.

Community

GitHub issues: https://github.com/agi-brain/xuance/issues
GitHub discussions: https://github.com/orgs/agi-brain/discussions
Discord invite link: https://discord.gg/HJn2TBQS7y
Slack invite link: https://join.slack.com/t/xuancerllib/
QQ group numbers: 552432695, 153966755
WeChat account: "玄策 RLlib"

(Note: You can also post your questions on Stack Overflow.)

(QR code for QQ group and WeChat official account)

Citations

If you use XuanCe in your research or development, please cite the paper:

@article{liu2023xuance,
  title={XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library},
  author={Liu, Wenzhang and Cai, Wenzhe and Jiang, Kun and Cheng, Guangran and Wang, Yuanda and Wang, Jiawei and Cao, Jingyu and Xu, Lele and Mu, Chaoxu and Sun, Changyin},
  journal={arXiv preprint arXiv:2312.16248},
  year={2023}
}

Release history

1.4.2: Apr 17, 2026 (this version)
1.4.1: Feb 25, 2026
1.4.0: Jan 12, 2026
1.3.3: Dec 31, 2025
1.3.2: Oct 2, 2025
1.3.1: Jul 2, 2025
1.3.0: Jun 16, 2025
1.2.6: Feb 8, 2025
1.2.5: Jan 5, 2025
1.2.4: Dec 11, 2024
1.2.3: Sep 11, 2024
1.2.2: Aug 3, 2024
1.2.1: Jul 1, 2024
1.2.0: Jun 11, 2024
1.1.1: May 12, 2024
1.1.0: May 1, 2024
1.0.11: Apr 11, 2024
1.0.10: Mar 5, 2024
1.0.9: Jan 5, 2024
1.0.8: Jan 3, 2024
1.0.7: Dec 28, 2023
1.0.5: Dec 15, 2023
1.0.4: Dec 12, 2023
1.0.3: Dec 12, 2023
1.0.2: Dec 5, 2023
1.0.1: Nov 24, 2023
1.0.0: Oct 21, 2023

Download files

Source Distribution

xuance-1.4.2.tar.gz (689.8 kB view details)

Uploaded Apr 17, 2026 Source

File details

Details for the file xuance-1.4.2.tar.gz.

File metadata

Download URL: xuance-1.4.2.tar.gz
Upload date: Apr 17, 2026
Size: 689.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for xuance-1.4.2.tar.gz
Algorithm	Hash digest
SHA256	`ac924f25d9e7ca24b0ce40da779297d26d9e983a152e35391dd1bcd6e1763c6d`
MD5	`c50f93e9de05c5c4a7b2991ceded7bf2`
BLAKE2b-256	`be56f8e407d2c2f24654ff383fe06ecbd843f93663cedca09554e28c7d5d65ae`
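A downloaded archive can be checked against the SHA256 digest listed above before installing it. This is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `matches_expected` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for xuance-1.4.2.tar.gz.
EXPECTED_SHA256 = "ac924f25d9e7ca24b0ce40da779297d26d9e983a152e35391dd1bcd6e1763c6d"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals `expected`."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("xuance-1.4.2.tar.gz")` should be true for an intact download of that release.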
