# DexHoldem Robot Skill (`dexholdem-v2`)
This skill runs a physical two-player Texas Hold'em setup with a dexterous
robot hand. The coding agent owns perception orchestration, state maintenance,
poker reasoning, and recovery decisions. Python helpers do deterministic work:
preflight, image capture, state-file updates, action translation, robot
command dispatch, and next-move routing.
The main agent owns final state interpretation. Helpers may mutate caches,
action metadata, and state files only when the main agent invokes them.
Visual subagents never write state files; they only return evidence for the
main agent to merge.
The workflow is state-folder based. Every decision is grounded in the current
state image, parsed state markdown, local caches, and the current action
sequence.
## Session Start
First, from the user's working directory, expose the helper scripts at the
workspace root:

```bash
ln -s .agents/skills/dexholdem-v2/scripts/*.py ./
```

For Claude installations, use the Claude skill path instead:

```bash
ln -s .claude/skills/dexholdem-v2/scripts/*.py ./
```

Then run preflight from the user's working directory:

```bash
python3 preflight.py
python3 preflight.py --exp-name my_run
```

For a hardware-free smoke check:

```bash
python3 preflight.py --skip-camera --skip-remote --skip-audio
```

Pause after preflight. Inspect the printed result, confirm the experiment
directory exists, confirm `s0/00_capture.jpg` exists when the camera was not
skipped, and report any preflight error or suspicious setup instead of
continuing the workflow automatically.

Preflight creates `experiments/<exp-name>/`, points `experiments/current` to
that folder, initializes `s0/` and `s_current`, copies the executable helper
scripts plus `pyproject.toml` and `config.yaml` into the experiment root, and
validates remote click coordinates before capturing `s0/00_capture.jpg` unless
camera checks are skipped.

After preflight, work from the experiment root:

```bash
cd experiments/current
python3 state.py current
```

Perform one visual pass for blind/dealer assignment using
`visual_guidelines/BLIND_BUTTON_RECOGNITION.md`, then cache the result:

```bash
python3 state.py set-blinds --dealer robot --small-blind robot --big-blind opponent --source-state s0
```

Blind amounts are fixed for this setup: the small blind is an initial bet of 5
chips, and the big blind is an initial bet of 10 chips. Use the cached
small-blind/big-blind assignment with visible bet recognition when reasoning
about preflop current bets.
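The fixed blind amounts combine with the cached assignment into expected preflop opening bets. A minimal sketch; the cache dict layout below is an assumption for illustration, not the actual format written by `state.py set-blinds`:

```python
# Derive expected preflop opening bets from a cached blind assignment.
# The blind amounts are fixed for this setup (see above); the cache dict
# shape here is hypothetical.
SMALL_BLIND = 5
BIG_BLIND = 10

def expected_preflop_bets(blind_cache: dict) -> dict:
    """Map each seat to the bet it should show before any voluntary action."""
    bets = {"robot": 0, "opponent": 0}
    bets[blind_cache["small_blind"]] += SMALL_BLIND
    bets[blind_cache["big_blind"]] += BIG_BLIND
    return bets

cache = {"dealer": "robot", "small_blind": "robot", "big_blind": "opponent"}
print(expected_preflop_bets(cache))  # {'robot': 5, 'opponent': 10}
```

Comparing these expectations against visible bet recognition is a cheap cross-check that the cached assignment still matches the table.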
## State Contract
The experiment root contains the timeline and the durable caches:

```text
experiments/current/
  s0/
    00_capture.jpg
    01_parsed_state.md
    02_action.md
  s1/
  s_current -> s1
  hole_card_cache.json
  action_sequence.json
```

Each state folder is filled in this order:

- `00_capture.jpg` - exact image used for visual parsing.
- `01_parsed_state.md` - agent-authored parsed state markdown with one JSON block.
- `02_action.md` - committed decision, execution result, and translated commands.

Create the next state only after `02_action.md` exists for the current state:

```bash
python3 state.py begin-next --after s0
```

After `02_action.md` is written, create the next state and capture a fresh
observation. This applies to ordinary poker actions, waits, continued `acting`
or `atom_idle` sequences, `to_recover` states, `show_hand`, `win`, and `down`
states that need recovery or collection. The fresh state is how the agent
verifies what physically happened.

The normal exceptions are `stop`, which ends the session instead of continuing
the timeline, and `request_human`, which blocks automatic state advance until a
human confirms how to proceed.
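The fill order above can be enforced with a small guard before advancing the timeline. A sketch under the state contract's filenames; the helper function itself is hypothetical, not part of `state.py`:

```python
# Guard sketch: only advance the timeline when every contract file exists
# in the current state folder. Filenames come from the state contract above.
from pathlib import Path

STATE_FILES = ["00_capture.jpg", "01_parsed_state.md", "02_action.md"]

def ready_to_begin_next(state_dir: Path) -> bool:
    """True only when the state folder is complete, in contract order."""
    return all((state_dir / name).exists() for name in STATE_FILES)
```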
## Loop Stage
`loop_stage` values:

- `acting` - a robot atom action was dispatched recently or the hand is still moving. The next agent action should normally be `wait`, followed by a fresh capture.
- `atom_idle` - the hand has settled after an atom action, but the full `action_sequence.json` still has pending steps. Continue or verify that sequence; do not start a new poker action.
- `idle` - the full action sequence is complete, the hand is near rest pose, and the agent may make the next poker decision.
- `show_hand` - the opponent has shown hole cards or showdown has been reached; reveal the robot hole cards as needed and resolve the outcome.
- `win` - the robot has won because the opponent folded or the known showdown cards give the robot the stronger hand. Pull back the recognized bet chips.
- `lose` - the robot has lost because it folded or the known showdown cards give the opponent the stronger hand. Do not pull chips back.
- `to_recover` - the previous atom action appears to have failed harmlessly or had no effect after the hand settled, and the table layout is still safe enough to retry or repair using the cached action sequence. Examples: a hole card was not picked up and remains near its original position, or a chip push did not move the intended chip and did not disturb cards/chip layout.
- `down` - execution is failed, interrupted, blocked, or unsafe to continue blindly.

A completed parsed state should use one of these values.
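The stage definitions above imply a default next move per stage. A compact routing-table sketch; the right-hand strings are summaries of this document, and the router and parsed state still have the final say:

```python
# Normal next agent move per loop_stage, summarized from the list above.
NEXT_MOVE = {
    "acting": "wait, then capture a fresh image",
    "atom_idle": "continue or verify the cached action sequence",
    "idle": "make the next poker decision",
    "show_hand": "reveal robot hole cards and resolve the outcome",
    "win": "pull back the recognized bet chips",
    "lose": "do not pull chips back; close out the hand",
    "to_recover": "retry or repair using the cached action sequence",
    "down": "stop blind continuation; wait or request human help",
}

def next_move(loop_stage: str) -> str:
    # An unrecognized stage is treated like `down`: never continue blindly.
    return NEXT_MOVE.get(loop_stage, NEXT_MOVE["down"])
```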
## Caches
The durable caches are `hole_card_cache.json` and `action_sequence.json`, whose
`plan` lists the atom steps of the current action. Step status in the `plan` is
deliberately physical:

- `pending` - means the atom has not been dispatched.
- `dispatched` - means `executor.py` sent the robot policy, but the next capture has not yet verified the physical result.
- `completed` - means the atom was visually verified in an `atom_idle` state.

Useful cache helpers:

```bash
python3 state.py cache-card --slot left --card Ah --source-state s3 --confidence 0.9
python3 action_translator.py --action '{"action":"view_card","position":"left"}' --as-sequence-cache
python3 state.py start-action --sequence-json '<translator sequence-cache JSON>'
python3 state.py dispatch-step --step pick_card
python3 state.py complete-step --step read_card
python3 state.py prepare-retry --step push_chip_10_1 --reason to_recover
python3 state.py next-hand
python3 state.py next-hand --refresh-blinds
python3 state.py set-loop-stage --stage to_recover
python3 state.py set-loop-stage --stage show_hand
python3 state.py set-loop-stage --stage win
python3 state.py set-loop-stage --stage lose
python3 state.py set-loop-stage --stage atom_idle
python3 state.py set-loop-stage --stage acting
```
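The pending/dispatched/completed lifecycle only ever moves forward, one hop at a time. A sketch of that invariant; the `plan` entry field names (`step`, `status`) are assumptions for illustration, not the actual `action_sequence.json` schema:

```python
# Physical step lifecycle sketch: pending -> dispatched -> completed.
# A step may never skip `dispatched`, because `completed` requires the next
# capture to have verified the physical result.
ORDER = ["pending", "dispatched", "completed"]

def advance_step(plan: list, step: str, new_status: str) -> None:
    """Move one named step a single hop forward in the lifecycle."""
    for entry in plan:
        if entry["step"] == step:
            if ORDER.index(new_status) != ORDER.index(entry["status"]) + 1:
                raise ValueError(f"illegal {entry['status']} -> {new_status}")
            entry["status"] = new_status
            return
    raise KeyError(step)

plan = [{"step": "pick_card", "status": "pending"},
        {"step": "read_card", "status": "pending"}]
advance_step(plan, "pick_card", "dispatched")  # executor sent the command
advance_step(plan, "pick_card", "completed")   # verified in an atom_idle state
```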
## Router Reference
After the current state has a capture and parsed state, the local router gives
the initial gate:

```bash
python3 router.py
```

The router returns `route`, `reason`, `agent_required`, `judged_results`, and
optional commands. It does not parse images, decide poker strategy, or declare
unsafe physical recovery by itself; those remain main-agent responsibilities.
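Consuming that gate can be as simple as pulling out the route and the agent flag. A sketch; the field names come from this document, but the rest of the router.py payload shape is assumed:

```python
# Minimal consumer sketch for the router's JSON gate.
import json

def gate(router_stdout: str) -> tuple:
    """Return (route, needs_agent); any returned commands run as-is."""
    result = json.loads(router_stdout)
    # Missing agent_required defaults to True: when in doubt, involve the agent.
    return result["route"], bool(result.get("agent_required", True))

route, needs_agent = gate(
    '{"route": "continue_cached_command", "reason": "pending step", '
    '"agent_required": false}'
)
```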
## Visual Parsing
Use the files in `visual_guidelines/` as needed to write truthful
`loop_stage`, `robot`, and table fields. Multiple visual checks may be used for
the same captured state when they add useful information. The visual model may
answer in plain language; the coding agent converts those answers into
`01_parsed_state.md`.

When visual information is needed, the main agent MUST delegate image reading
to visual subagents. Assign each subagent one guideline or one visual question,
such as scene stability, robot behavior, turn button, community cards, bets,
chip inventory, held card reading, or showdown outcome. Give each subagent the
current image, relevant recent images, cache summaries, action-sequence context,
and the appropriate visual guideline as its prompt. Subagents are read-only
evidence providers: they must inspect images and context, return findings,
evidence, uncertainty, and suggested parsed fields, but must not edit state
files. The main agent merges the subagent outputs, resolves conflicts
conservatively, and writes the single authoritative
`s_current/01_parsed_state.md`.

Guideline purposes:
- `SCENE_STABILITY.md` - action completion, waiting decisions, and movement checks, usually paired with recent images.
- `ROBOT_BEHAVIOR.md` - dexterous-hand pose, motion, held objects, physical safety, atom progress, and recovery context. A robot-behavior subagent should receive at least the current image and the previous captured image so it can judge motion, progress, and whether the hand has actually settled.
- `TABLE_GEOMETRY.md` - robot/opponent orientation, betting zones, inventory zones, and camera/table layout.
- `BLIND_BUTTON_RECOGNITION.md` - dealer, small blind, and big blind buttons.
- `HELD_CARD_RECOGNITION.md` - readable hole card held by the robot hand.
- `TURN_DETECTION.md` - physical white turn button and `is_my_turn`.
- `COMMUNITY_CARDS.md` - shared board cards.
- `SHOWDOWN_OUTCOME.md` - showdown state, revealed cards, fold/win/lose outcome.
- `CHIP_RECOGNITION.md` - remaining chip inventories.
- `BET_RECOGNITION.md` - current bet chips in each betting area.

It is acceptable to refresh `is_my_turn`, board, chips, bets, and robot state
on every captured image if that helps keep the parsed state current. The router
will decide which fields matter for the current `loop_stage`.

Keep parsed state compact:
```json
{
  "loop_stage": "idle",
  "robot": "dexterous hand is near its initial pose and not holding a card or chips",
  "table": {
    "scene_stable": true,
    "uncertain_fields": [],
    "is_my_turn": true,
    "community_cards": [],
    "my_chips": {"5": 4, "10": 3, "50": 3, "100": 3},
    "opponent_chips": {"5": 4, "10": 4, "50": 3, "100": 3},
    "my_current_bet": {"5": 0, "10": 0, "50": 0, "100": 0},
    "opponent_bet": {"5": 0, "10": 0, "50": 0, "100": 0}
  }
}
```

Derived concepts such as poker street, total call amount, and turn confidence
can be inferred later from the stored cards, chip counts, and turn button
state; they do not belong in `01_parsed_state.md`.
The router uses stage-specific required fields. An `idle` state needs the full
table block shown above. Non-idle states must still include a `table` object,
but it may be sparse when fields were not visually parsed and are irrelevant to
the current gate. Include `uncertain_fields` when an omitted or unclear value
matters to the next action.

For showdown, use `loop_stage` as the main compact signal. Add only small table
notes that help routing or verification, such as visible opponent hole cards;
do not store bulky hand-ranking explanations.
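The stage-specific requirement can be spelled out as a small completeness check. A sketch; the full-table field set comes from the example above, while the checker itself is hypothetical and not part of `router.py`:

```python
# Completeness sketch for the compact parsed-state JSON: `idle` needs the
# full table block, other stages may carry a sparse table.
FULL_TABLE_FIELDS = {"scene_stable", "uncertain_fields", "is_my_turn",
                     "community_cards", "my_chips", "opponent_chips",
                     "my_current_bet", "opponent_bet"}

def missing_fields(parsed: dict) -> set:
    """Fields the router would still need for the given loop_stage."""
    table = parsed.get("table")
    if parsed.get("loop_stage") == "idle":
        return FULL_TABLE_FIELDS - set(table or {})
    return set() if table is not None else {"table"}
```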
## Poker Reasoning
When the router returns `choose_poker_action`, the main agent MUST delegate the
Texas Hold'em reasoning to a reasoning subagent. Give the subagent the current
parsed table, hole-card cache, blind/dealer assignment, action history if
available, supported action space, and the blind amounts: small blind = 5,
big blind = 10.

The reasoning subagent should infer the current betting situation from
`my_current_bet`, `opponent_bet`, `my_chips`, `opponent_chips`, community
cards, hole cards, turn state, and blind assignment. It should return a concise
rationale plus one recommended supported action JSON, such as `check`, `fold`,
`call`, `raise`, or `all_in`. The main agent validates that recommendation
against the current parsed state, supported action schema, and physical chip
constraints, then commits and executes the final action through `executor.py`.
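The validation pass can be sketched as cheap arithmetic checks over the bet dicts. The action names are from the supported action space; the check logic itself is an illustrative assumption, not `executor.py`'s actual schema validation:

```python
# Validation sketch for a subagent-recommended action against the parsed
# table. Chip dicts map denomination strings to counts, as in the parsed
# state example.
def _total(chips: dict) -> int:
    return sum(int(denom) * count for denom, count in chips.items())

def validate(action: dict, my_chips: dict, my_bet: dict, opp_bet: dict) -> bool:
    kind = action.get("action")
    if kind == "check":
        # checking is only plausible when there is nothing to call
        return _total(opp_bet) == _total(my_bet)
    if kind == "raise":
        # raise.amount is the target total bet: it must exceed the current
        # call level, and the extra chips must exist in inventory
        extra = action["amount"] - _total(my_bet)
        return action["amount"] > _total(opp_bet) and extra <= _total(my_chips)
    return kind in {"fold", "call", "all_in"}
```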
## Actions
Supported action JSON:

```json
{"action": "wait", "reason": "scene_unstable", "sleep_seconds": 30}
{"action": "view_card", "position": "left"}
{"action": "show_card", "position": "left"}
{"action": "put_down_card", "position": "left", "face_up": false}
{"action": "check"}
{"action": "fold"}
{"action": "call"}
{"action": "raise", "amount": 80}
{"action": "all_in"}
{"action": "collect_winnings"}
{"action": "collect_winnings", "chip_counts": {"5": 2, "10": 1, "50": 0, "100": 1}}
{"action": "request_human", "reason": "dexterous hand is holding an unreadable card"}
{"action": "stop", "reason": "session ended"}
```

Run actions through `executor.py`; use `--dry-run` to write the action and
action-sequence cache without sending robot commands.

For betting actions, the executor reads `my_chips`, `my_current_bet`, and
`opponent_bet` from the current `01_parsed_state.md` table. `call` pushes
`sum(opponent_bet) - sum(my_current_bet)`. `raise.amount` is the target total
bet after the raise, so the physical chips pushed are
`amount - sum(my_current_bet)`.

For `call` and `raise`, chip selection must be exact. If available `my_chips`
cannot form the required amount exactly, the translator fails before robot
dispatch. Do not silently overpay with a larger chip; choose a different poker
action, repair chip recognition, or request human help.

Chip actions are translated into one atom step per moved chip, such as
`push_chip_10_1` and `push_chip_5_1`, followed by `verify_idle`.
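The "exact or fail" rule can be sketched as chip selection that refuses to overpay. Greedy from the largest denomination suffices for the 5/10/50/100 set used here because each denomination divides the next; the translator's real algorithm is not specified in this document:

```python
# Exact chip-selection sketch matching the no-silent-overpay rule.
def exact_chips(amount: int, my_chips: dict):
    """Return chips summing exactly to `amount`, or None to force a
    different poker action / chip-recognition repair."""
    push = {}
    remaining = amount
    for denom in sorted(my_chips, key=int, reverse=True):
        use = min(remaining // int(denom), my_chips[denom])
        if use:
            push[denom] = use
            remaining -= use * int(denom)
    return push if remaining == 0 else None

exact_chips(25, {"5": 4, "10": 3, "50": 3, "100": 3})  # {'10': 2, '5': 1}
exact_chips(12, {"5": 4, "10": 3, "50": 3, "100": 3})  # None: translator fails
```

Returning `None` mirrors the translator failing before robot dispatch instead of pushing a larger chip.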
## Recovery
Use `to_recover` when a recent robot atom failed harmlessly after the hand
settled and the current table layout is still safe to retry:

- during `view_card`, the target card was not picked up and remains face-down near its original position,
- during chip movement, the intended chip did not move or did not follow the hand, and the card/chip layout remains countable and undisturbed,
- after an atom attempt, no intended physical progress happened but no non-target object moved.

Use `down` when direct continuation is unsafe or unclear:

- a card was dropped during viewing,
- a returned card covers chips or hides game state,
- chip movement displaced cards, buttons, or unrelated chips,
- chip movement destroyed the table layout,
- the dexterous hand appears stuck,
- command progress is unknown,
- repeated captures remain unstable.

Request human help when a person must fix or confirm the table:

```bash
python3 executor.py --action '{"action":"request_human","reason":"Dexterous hand is holding an unreadable card","resume_options":["mark_card","confirm_card_returned","abort_hand"]}'
```

Retry only when the cached sequence plan and recent images show that repeating
the current step is physically safe. In normal routing, that means the parsed
state should be `to_recover`; otherwise keep the `down` state and request human
help or wait for clearer evidence. For retryable atom failures, use
`state.py prepare-retry --step <current_step>` followed by
`executor.py --continue-current`; the router emits these commands when the
current step has a cached atom command. Safety counters in
`action_sequence.json` cap repeated waits and recoveries; when a cap is reached,
the router escalates to `request_human` instead of continuing automatically.
If a human inspects the table and explicitly approves continuing, run
`state.py reset-safety --scope consecutive` before creating the next captured
state. Use `--scope all` only when the human intentionally clears total wait or
total recovery caps for the session.

After a hand ends, either stop the session or reset local caches before the
next hand. Use `state.py next-hand` to clear hole cards and reset
`action_sequence.json` while preserving the blind/dealer cache. Use
`state.py next-hand --refresh-blinds` when the dealer/small-blind button may
have moved and blind recognition must run again during the next preflight-like
visual pass.
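The cap-then-escalate behavior can be sketched as bounded counters. The counter names and cap values here are illustrative assumptions; the real fields live in `action_sequence.json` and are not fully specified in this document:

```python
# Safety-cap sketch: bounded waits/recoveries escalate to request_human.
CAPS = {"consecutive_waits": 5, "consecutive_recoveries": 3}  # assumed values

def route_with_caps(kind: str, counters: dict) -> str:
    """kind is 'wait' or 'recover'; returns the route to take."""
    key = "consecutive_waits" if kind == "wait" else "consecutive_recoveries"
    counters[key] = counters.get(key, 0) + 1
    if counters[key] > CAPS[key]:
        return "request_human"  # cap reached: never loop forever
    return kind

def reset_safety(counters: dict) -> None:
    # Mirrors `--scope consecutive`: clears only the consecutive counters;
    # `--scope all` would also clear session totals (not modeled here).
    for key in list(counters):
        if key.startswith("consecutive"):
            counters[key] = 0
```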
## Core Workflow
After preflight, repeat this loop from the experiment root until the action is
`stop`:

- Capture or reuse the current state's image. If `s_current/00_capture.jpg` is missing, run `python3 capture.py --output s_current/00_capture.jpg`.
- Select only the visual guidelines needed for this state, then use visual agents or vision models to parse the current image. Provide recent state images, `action_sequence.json`, and `hole_card_cache.json` when they help the visual agent judge motion, robot behavior, held cards, chips, bets, showdown, or recovery state.
- The main coding agent summarizes the visual outputs into `s_current/01_parsed_state.md`. This file is the authoritative parsed state for the router. It must include the compact JSON block with `loop_stage`, `robot`, and `table`.
- Run `python3 router.py`. Treat its JSON as the initial gating result for the current state.
- Follow the gated route:
  - If the router returns a command and `agent_required: false`, run the command.
  - If it asks for visual parsing, repair the parsed state and rerun the router.
  - If it asks to verify a dispatched step, inspect the current image and cached sequence. If the intended atom succeeded, run the provided `state.py complete-step ...` command and rerun the router. If it failed harmlessly, mark `to_recover`; if unsafe, mark `down` or request human help.
  - If it asks for held-card reading, use visual parsing to read the held card, update `hole_card_cache.json`, and continue the cached action sequence.
  - If it returns `continue_cached_command`, run `executor.py --continue-current`; this sends the next pending robot atom from `action_sequence.json`.
  - If it returns `recover_retryable` with commands, run them in order to reset and retry the exact cached atom. If it requires the agent, inspect the cached sequence and recent images before retrying or requesting help.
  - If it returns `recover_down`, inspect recent states and choose wait or `request_human`; only retry after the state is safely classified as `to_recover`.
  - If it returns `human_pause`, wait for human confirmation before running the supplied `commands_after_human`.
  - If it returns `show_hand`, reveal robot cards as needed with `show_card` actions, then use `SHOWDOWN_OUTCOME.md` to decide `win`, `lose`, or keep resolving showdown ambiguity.
  - If it returns `collect_winnings`, execute the suggested `collect_winnings` action with `executor.py`.
  - If it returns `hand_lost`, do not move chips toward the robot; decide whether to wait for reset, request human help, run `state.py next-hand`, or stop.
  - If it returns `choose_poker_action`, delegate Texas Hold'em reasoning to a reasoning subagent with the parsed table state, hole-card cache, blind/dealer assignment, action history, supported action space, and blind amounts. Validate the subagent's recommended action, use `action_translator.py` if you need to inspect the new action sequence, and execute the final action with `executor.py`.
- Use `action_translator.py` when you need to inspect or create the action sequence for a new poker or embodied action. The executor also calls the translator internally before dispatch.
- Use `executor.py` every time you want to send robot commands or commit an executable action. Do not send robot policy commands directly through `remote_exec.py` during normal operation. Examples:

```bash
python3 executor.py --action '{"action":"wait","reason":"not_my_turn","sleep_seconds":3}'
python3 executor.py --action '{"action":"view_card","position":"left"}'
python3 executor.py --action '{"action":"show_card","position":"left"}'
python3 executor.py --action '{"action":"put_down_card","position":"left","face_up":false}'
python3 executor.py --continue-current
python3 executor.py --action '{"action":"call"}'
python3 executor.py --action '{"action":"collect_winnings"}'
python3 executor.py --action '{"action":"request_human","reason":"card was dropped"}'
```

After `executor.py` writes `02_action.md`, create the next state and capture
the next observation unless the route is `human_pause` or the action is `stop`:

```bash
python3 state.py current
python3 state.py begin-next --after sN
python3 capture.py --output s_current/00_capture.jpg
```

Then start the loop again from visual parsing. The next image verifies what
actually happened after the last wait, retry, robot action, or human-help
request.
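One pass through the loop reduces to a short chain of helper-script calls plus agent work in between. A sketch with an injectable runner so the shape is testable without hardware; the visual-parsing step is stubbed because it is agent work, not helper-script work:

```python
# One-iteration sketch of the core loop as plain subprocess calls.
import json
import subprocess

def run_cmd(cmd: list) -> str:
    """Run a helper script from the experiment root and return its stdout."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def one_iteration(run=run_cmd) -> str:
    run(["python3", "capture.py", "--output", "s_current/00_capture.jpg"])
    # ... visual subagents parse the capture; the main agent writes
    # s_current/01_parsed_state.md before routing ...
    gate = json.loads(run(["python3", "router.py"]))
    return gate["route"]  # the agent then follows the gated route
```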