Chinese Chess Environment for Reinforcement Learning

Xiangqi

How to use

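from xiangqi import Env

env = Env()  # pass an opening position to start from it; with no argument the game starts from the initial position

ob = env.reset()  # initial observation
while True:
    env.render()  # display the board
    player = agents[ob['cur_player']]  # agents: your own mapping from player to agent (not part of xiangqi)
    action = player.make_decision(**ob)  # the agent's decision
    ob, reward, done, info = env.step(action)
    if done:
        env.render()
        break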

Design decisions

observation

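Observations are made from the current player's perspective. They cover the coordinate system, rows, and columns, plus the current legal moves and the piece the opponent captured on the previous move (useful for reward shaping).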

action

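  • Option 1

    Act from the current player's perspective; action space: MultiDiscrete(9, 10, 9, 10)

    The first two dimensions give the source position (col, row); the last two give the target position

    Space size: 8100

  • Option 2: the row and column of any position, plus a position within the 2x2 range in the four directions; space size: 2550

  • Option 3

    Number the pieces of each type from right to left and top to bottom, then encode the maximum number of possible moves for each piece type

    Chariot/cannon: 17, horse: 8, elephant: 4, advisor: 4, soldier: 3, general: 4

    Space size: 119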
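
The sizes quoted for Options 1 and 3 can be checked with a little arithmetic. The sketch below assumes the standard Xiangqi set per side (two chariots, two cannons, two horses, two elephants, two advisors, five soldiers, one general). The helpers `flatten_option1` and `unflatten_option1` are hypothetical, not part of the package; they show one possible way to map an Option 1 action to a single index and back.

```python
from math import prod

N_COLS, N_ROWS = 9, 10

# Option 1: MultiDiscrete(9, 10, 9, 10) -> (src_col, src_row, dst_col, dst_row)
OPTION1_DIMS = (N_COLS, N_ROWS, N_COLS, N_ROWS)
option1_size = prod(OPTION1_DIMS)

# Option 3: (pieces per side, maximum moves per piece) for each piece type
OPTION3_TABLE = {
    "chariot": (2, 17),
    "cannon": (2, 17),
    "horse": (2, 8),
    "elephant": (2, 4),
    "advisor": (2, 4),
    "soldier": (5, 3),
    "general": (1, 4),
}
option3_size = sum(count * moves for count, moves in OPTION3_TABLE.values())


def flatten_option1(src_col, src_row, dst_col, dst_row):
    """Hypothetical helper: map an Option 1 action to one index in [0, 8100)."""
    index = 0
    for value, dim in zip((src_col, src_row, dst_col, dst_row), OPTION1_DIMS):
        if not 0 <= value < dim:
            raise ValueError(f"{value} is out of range for a dimension of size {dim}")
        index = index * dim + value
    return index


def unflatten_option1(index):
    """Inverse of flatten_option1."""
    values = []
    for dim in reversed(OPTION1_DIMS):
        index, value = divmod(index, dim)
        values.append(value)
    return tuple(reversed(values))


print(option1_size)  # 8100
print(option3_size)  # 119
```

The per-piece maximum of 17 for chariots and cannons is the 8 other points on a file-crossing row plus the 9 other points on a column.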

Notes

  • The policy should handle draw requests from the opponent by answering whether it accepts the draw
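
One way to meet this requirement is to wrap the move policy so that a pending draw request is answered first. In the sketch below, `choose_move` and `accept_draw` are hypothetical callbacks supplied by the agent author. The `sue_draw` observation key and the "yes"/"no" replies follow the description under Details; that the reply is passed back through env.step is an assumption.

```python
def make_decision(ob, choose_move, accept_draw):
    """Answer a pending draw request first; otherwise choose a regular action.

    ob          -- observation dict; only the 'sue_draw' key is inspected here
    choose_move -- hypothetical callback: ob -> action string (e.g. a UCCI move)
    accept_draw -- hypothetical callback: ob -> bool
    """
    if ob.get("sue_draw"):
        # The opponent asked for a draw: reply instead of moving.
        return "yes" if accept_draw(ob) else "no"
    return choose_move(ob)
```

Keeping the draw answer separate from move selection lets the same move policy be reused with different draw strategies.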

Details

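By default, Red moves first.

The observation includes:

  • board_state: the board state as a FEN string, which contains all the information about the position
  • sue_draw: whether the opponent is requesting a draw; if so, the player must respond by accepting or declining the draw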
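
For agents that need a grid rather than a string, the piece-placement field of a FEN string can be expanded into rows. The sketch below assumes `board_state` follows the common Xiangqi FEN layout (ten ranks separated by `/`, digits for runs of empty points, uppercase letters for Red and lowercase for Black). The package's exact dialect is not confirmed here, and `fen_to_grid` is a hypothetical helper rather than part of the package.

```python
N_ROWS, N_COLS = 10, 9


def fen_to_grid(fen):
    """Expand the piece-placement field of a FEN string into a 10x9 grid.

    Empty points are represented by None; pieces keep their FEN letter.
    """
    placement = fen.split()[0]  # ignore side-to-move and counters, if present
    ranks = placement.split("/")
    if len(ranks) != N_ROWS:
        raise ValueError(f"expected {N_ROWS} ranks, got {len(ranks)}")
    grid = []
    for rank in ranks:
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend([None] * int(ch))  # a digit encodes a run of empty points
            else:
                row.append(ch)
        if len(row) != N_COLS:
            raise ValueError(f"rank {rank!r} does not describe {N_COLS} points")
        grid.append(row)
    return grid


# Opening position in the common Xiangqi FEN notation (an assumption about
# the format, not output captured from the package).
START_FEN = "rnbakabnr/9/1c5c1/p1p1p1p1p/9/9/P1P1P1P1P/1C5C1/9/RNBAKABNR w - - 0 1"
grid = fen_to_grid(START_FEN)
```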

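The input to the environment (the action in env.step(action)) is a string:

  • "RESIGN": resign; resigning ends the current game
  • "SUE_DRAW": request a draw; to request a draw, send this string, and the opponent must reply "yes" or "no" to accept or decline
  • UCCI move string: for example "b7b0", the encoding of the source and target positions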
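
A move string such as "b7b0" is four characters: a file letter and a rank digit for the source, then the same for the target. The sketch below splits such a string into zero-based (col, rank) pairs, assuming files run a to i and ranks 0 to 9. `parse_move` and `is_special_action` are hypothetical helpers, and how rank digits map onto board rows is left to the package's own convention.

```python
FILES = "abcdefghi"  # nine files
RANKS = "0123456789"  # ten ranks


def parse_move(move):
    """Split a four-character move string into ((src_col, src_rank), (dst_col, dst_rank))."""
    if len(move) != 4:
        raise ValueError(f"expected 4 characters, got {move!r}")
    src_file, src_rank, dst_file, dst_rank = move
    if src_file not in FILES or dst_file not in FILES:
        raise ValueError(f"file letters must be in a..i: {move!r}")
    if src_rank not in RANKS or dst_rank not in RANKS:
        raise ValueError(f"rank digits must be in 0..9: {move!r}")
    source = (FILES.index(src_file), int(src_rank))
    target = (FILES.index(dst_file), int(dst_rank))
    return source, target


def is_special_action(action):
    """True for the two non-move action strings, "RESIGN" and "SUE_DRAW"."""
    return action in ("RESIGN", "SUE_DRAW")


print(parse_move("b7b0"))  # ((1, 7), (1, 0))
```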

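The output of the environment (the return value of env.step(action)) has four parts:

  • ob (dict): the observation for the next player, taken after the current player has been switched; see the description above. It is None when done is True
  • reward (int): the reward for the current player (the one who made this action). It is None when done is False
  • done (bool): whether the game has ended
  • info (str): additional information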
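
The None conventions above are easy to trip over when logging or storing transitions. A minimal sketch of a check that a step result matches the contract as described, taken literally, is shown below; `check_step_result` is a hypothetical helper, not part of the package.

```python
def check_step_result(ob, reward, done, info):
    """Raise ValueError if a step result does not match the documented contract."""
    if not isinstance(done, bool):
        raise ValueError("done should be a bool")
    if done:
        # Game over: no further observation, and the acting player gets a reward.
        if ob is not None:
            raise ValueError("ob should be None once the game is over")
        if not isinstance(reward, int):
            raise ValueError("a finished game should carry an int reward")
    else:
        # Game continues: the next player gets an observation, no reward yet.
        if not isinstance(ob, dict):
            raise ValueError("ob should be a dict while the game continues")
        if reward is not None:
            raise ValueError("reward should be None while the game continues")
    return ob, reward, done, info
```

It can wrap a step call, for example `ob, reward, done, info = check_step_result(*env.step(action))`.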

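Displaying the board: env.render()

The board is rendered in a text interface. It does not change with the player's perspective: Red is always at the bottom and Black at the top.

Pieces are distinguished by color and also by glyph: the Red glyphs feel a little more human, while the Black ones are somewhat reminiscent of a primitive civilization.

Information such as checks and captures is also displayed.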

Constants

from enum import IntEnum

N_ROWS = 10  # term: rank
N_COLS = 9  # term: file

REWARD_WIN = 1
REWARD_LOSE = -1
REWARD_DRAW = 0
REWARD_ILLEGAL = -5  # illegal action

class Action(IntEnum):
    ADVANCE = 1   # advance
    RETREAT = 2   # retreat
    TRAVERSE = 3  # traverse (move sideways)
    SUE_DRAW = 4  # request a draw
    RESIGN = 5    # resign

class Camp(IntEnum):
    BLACK = 1
    RED = 2

class Force(IntEnum):
    SHUAI = 1  # general
    SHI = 2    # advisor
    XIANG = 3  # elephant
    MA = 4     # horse
    JU = 5     # chariot
    PAO = 6    # cannon
    BING = 7   # soldier

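Helper functions

def decode(board_state)  # decode the board state: integers -> list of (Camp, Force), ordered left to right, top to bottom
def chinese_to_ucci(action, camp, board)  # Chinese file notation -> UCCI format, e.g. 炮二平五 -> h7e7
def ucci_to_chinese(action)  # inverse of the function above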
