research-engineer-scientist-tokens

Research Engineer / Research Scientist, Tokens

When to Use

Frame research questions on tokens, context length, or inference cost
Design experiments with baselines, ablations, and statistical rigor
Build benchmarks for tokens-per-successful-task, effective context, cache leverage
Measure tokenizer and formatting effects on length and model behavior
Evaluate compression, summarization, routing, or distillation for token savings
Analyze long-context phenomena (needle, lost-in-middle, attention budget)
Write research memos with reproducible methods and honest limitations
Translate findings into actionable thresholds for engineering and product
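
The "tokens-per-successful-task" benchmark named above is not defined in this document. One plausible reading, sketched here as an assumption, divides all tokens spent (including those burned on failed attempts) by the number of successful tasks. The `TaskRun` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One attempt at a task (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    success: bool


def tokens_per_successful_task(runs):
    """Total tokens across all runs divided by the number of successes.

    Failed runs still count toward the numerator, so a cheap method that
    fails often is not rewarded. Returns infinity when nothing succeeded.
    """
    total = sum(r.input_tokens + r.output_tokens for r in runs)
    successes = sum(1 for r in runs if r.success)
    return total / successes if successes else float("inf")


# Illustrative placeholder values, not measurements.
runs = [
    TaskRun(1200, 300, True),
    TaskRun(900, 100, False),
    TaskRun(1500, 500, True),
]
print(tokens_per_successful_task(runs))  # 4500 tokens / 2 successes = 2250.0
```

Counting failed attempts in the numerator is a design choice; a study that only wants the cost of successful paths would filter the runs first and should say so in the memo.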

When NOT to Use

Executive token reduction program with phased rollout → `ai-token-improvement-plan-engineer`
Implement context packing, compaction code paths → `ai-context-engineer`
Rewrite one production prompt → `prompt-engineer`
General literature survey unrelated to tokens → `ai-researcher`
Production RAG/agent deployment → `ai-engineer`
Classical ML without LLM token focus → `data-scientist`

Related skills

Need	Skill
General research methodology	`ai-researcher`
Cost improvement program / roadmap	`ai-token-improvement-plan-engineer`
Production context assembly	`ai-context-engineer`
Prompt wording and eval harness	`prompt-engineer`
RAG and agent runtime build	`ai-engineer`
Statistical testing and cohort analysis	`data-scientist`
Adversarial robustness of compressed context	`ai-redteam`
Commercial AI architecture	`applied-ai-architect-commercial-enterprise`

Core Workflows

1. Research framing (tokens)

Hypothesis, metrics, baselines, budget.

See references/research_framing_tokens.md.
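
The "budget" item in the framing step can be made concrete with simple arithmetic before any run starts. The sketch below is an assumption about how one might size an experiment; the function names are hypothetical, and the per-million-token prices are placeholders to be replaced with the provider's actual rates.

```python
def experiment_token_budget(arms, examples, seeds, avg_input_tokens, avg_output_tokens):
    """Estimate call and token volume for a full factorial run.

    Every arm is evaluated on every example once per seed.
    """
    calls = arms * examples * seeds
    return {
        "calls": calls,
        "input_tokens": calls * avg_input_tokens,
        "output_tokens": calls * avg_output_tokens,
    }


def estimated_cost_usd(budget, usd_per_million_input, usd_per_million_output):
    """Convert a token budget into dollars using placeholder prices."""
    return (
        budget["input_tokens"] * usd_per_million_input
        + budget["output_tokens"] * usd_per_million_output
    ) / 1_000_000


# Placeholder sizes and prices, chosen only to show the arithmetic.
budget = experiment_token_budget(
    arms=3, examples=200, seeds=5, avg_input_tokens=4000, avg_output_tokens=500
)
print(budget["calls"])  # 3000
print(estimated_cost_usd(budget, usd_per_million_input=1.0, usd_per_million_output=4.0))  # 18.0
```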

2. Measurement and instrumentation

Token accounting, logging, fair comparison.

See references/measurement_instrumentation.md.
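
As an illustration of token accounting with a fairness guard, the sketch below aggregates per-call usage records by experiment arm and refuses to compare arms whose counts came from different tokenizers. The JSONL record shape (`arm`, `tokenizer`, `input_tokens`, `output_tokens`) is an assumption made for this example, not a format defined by the referenced file.

```python
import json
from collections import defaultdict


def summarize_usage(jsonl_lines):
    """Sum calls and tokens per arm from JSON-lines usage records.

    Raises ValueError if the records mix tokenizers, because token counts
    from different tokenizers are not directly comparable.
    """
    totals = defaultdict(lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0})
    tokenizers = set()
    for line in jsonl_lines:
        rec = json.loads(line)
        tokenizers.add(rec["tokenizer"])
        arm = totals[rec["arm"]]
        arm["calls"] += 1
        arm["input_tokens"] += rec["input_tokens"]
        arm["output_tokens"] += rec["output_tokens"]
    if len(tokenizers) > 1:
        raise ValueError(f"mixed tokenizers in one comparison: {sorted(tokenizers)}")
    return dict(totals)


# Hypothetical log lines; "tok-a" stands in for the deployment tokenizer id.
log = [
    json.dumps({"arm": "baseline", "tokenizer": "tok-a", "input_tokens": 4000, "output_tokens": 300}),
    json.dumps({"arm": "baseline", "tokenizer": "tok-a", "input_tokens": 3800, "output_tokens": 250}),
    json.dumps({"arm": "compressed", "tokenizer": "tok-a", "input_tokens": 1500, "output_tokens": 280}),
]
print(summarize_usage(log)["baseline"])
```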

3. Experiment design and ablations

Controls, sweeps, power, stopping rules.

See references/experiment_design_ablations.md.
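
For the "power" item, a rough sample-size estimate helps decide how many eval examples each arm needs before the sweep is launched. The sketch below uses the standard normal-approximation formula for a two-sided test of two proportions with equal arm sizes. It assumes the primary metric is a task success rate and that examples are independent; a paired design over the same examples would need a different calculation.

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_per_arm(p_baseline, p_treatment, alpha=0.05, power=0.8):
    """Examples needed per arm to detect p_baseline vs p_treatment.

    Normal approximation, two-sided test, equal allocation.
    """
    if p_baseline == p_treatment:
        raise ValueError("success rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_baseline + p_treatment) / 2
    pooled_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    unpooled_term = z_beta * sqrt(
        p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    )
    return ceil((pooled_term + unpooled_term) ** 2 / (p_baseline - p_treatment) ** 2)


# Smaller expected differences need many more examples per arm.
print(samples_per_arm(0.80, 0.70))
print(samples_per_arm(0.80, 0.75))
```

Fixing this number in the experiment plan also gives a natural stopping rule: stop when each arm reaches the planned count, not when the result first looks significant.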

4. Context, tokenization, and long-context

Tokenizer, placement, window effects.

See references/context_tokenization_longcontext.md.
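
Placement effects are usually studied by moving a known fact (the "needle") through otherwise fixed filler and checking retrieval at each position. The sketch below builds such a sweep. It is a simplified assumption: depth is measured in filler segments here, whereas a real study would measure depth in tokens from the deployment tokenizer.

```python
def place_needle(filler, needle, depth):
    """Insert `needle` into a list of filler segments at a relative depth.

    depth=0.0 puts the needle first, depth=1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    return filler[:index] + [needle] + filler[index:]


def depth_sweep(filler, needle, steps=5):
    """Return (depth, context) pairs at evenly spaced depths; steps must be >= 2."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    depths = [i / (steps - 1) for i in range(steps)]
    return [(d, " ".join(place_needle(filler, needle, d))) for d in depths]


filler = ["a", "b", "c", "d"]
for depth, context in depth_sweep(filler, "NEEDLE", steps=3):
    print(depth, context)
```

Holding the filler and the question fixed while only the depth changes keeps position as the single varied factor.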

5. Compression and efficiency methods

Summarization, routing, distillation research.

See references/compression_efficiency_methods.md.
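
When several compression methods are compared, one simple way to see which are worth keeping is to drop every method that another method beats on both tokens and quality. The sketch below computes that non-dominated set. The method names and numbers are illustrative placeholders, not results.

```python
def pareto_frontier(points):
    """Return the non-dominated (name, tokens, quality) points.

    A point is dominated if another point uses no more tokens and reaches
    at least the same quality, with one of the two strictly better.
    The result is sorted by ascending tokens.
    """
    frontier = []
    best_quality = float("-inf")
    # Sort by tokens ascending; on ties, higher quality first.
    for name, tokens, quality in sorted(points, key=lambda p: (p[1], -p[2])):
        if quality > best_quality:
            frontier.append((name, tokens, quality))
            best_quality = quality
    return frontier


# Placeholder operating points for illustration only.
points = [
    ("full_context", 8000, 0.90),
    ("summary", 3000, 0.86),
    ("truncate", 3000, 0.80),
    ("router", 5000, 0.85),
]
print(pareto_frontier(points))
# [('summary', 3000, 0.86), ('full_context', 8000, 0.9)]
```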

6. Reproducibility and research reporting

Memos, artifacts, handoff to engineering.

See references/reproducibility_reporting.md.
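
For the artifact handoff, a small manifest that pins the seed, the config, and a content hash of each dataset file lets engineering confirm they are rerunning the same experiment. The sketch below is one possible shape; the manifest layout and function names are assumptions, not a format defined by the referenced file.

```python
import hashlib
import json


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths, seed, config):
    """Collect seed, config, and per-file SHA-256 hashes into one dict."""
    return {
        "seed": seed,
        "config": config,
        "files": {p: sha256_file(p) for p in sorted(str(p) for p in paths)},
    }


def write_manifest(manifest, out_path):
    """Write the manifest as sorted, indented JSON for stable diffs."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
```

A typical call would be `write_manifest(build_manifest(["eval.jsonl"], seed=0, config={"arm": "baseline"}), "manifest.json")`, where the file name and config keys are hypothetical.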

Outputs

Pre-registration / experiment plan — hypothesis, metrics, stop criteria
Results table — mean ± CI; tokens and quality side by side
Pareto chart narrative — quality vs tokens at operating points
Ablation appendix — what mattered, what did not
Research memo — conclusion, limits, recommended next build
Artifact bundle — configs, seeds, eval scripts, hashed datasets
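
The "mean ± CI" entries in the results table can be produced in several ways. The sketch below uses a percentile bootstrap with a fixed seed, which is one common choice and not necessarily the method this skill prescribes. The per-example scores are placeholders.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean.

    A fixed seed makes the interval reproducible across reruns.
    """
    rng = random.Random(seed)
    n = len(values)
    resampled_means = sorted(
        mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((1 - confidence) / 2 * n_resamples)
    upper_index = int((1 + confidence) / 2 * n_resamples) - 1
    return mean(values), resampled_means[lower_index], resampled_means[upper_index]


# Placeholder per-example token counts for one arm.
tokens = [1200, 1350, 1100, 1500, 1280, 1420, 1190, 1310]
m, lo, hi = bootstrap_ci(tokens)
print(f"tokens: {m:.0f} [{lo:.0f}, {hi:.0f}]")
```

Running the same function over the quality scores of the same arm yields the second column, so tokens and quality sit side by side in one row.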

Principles

Report tokens and quality together — never optimize one without the other
Match tokenizer and model — counts from the deployment tokenizer/API
Control confounds — temperature, system prompt, tool schemas held fixed across arms
Pre-register primary metric — avoid p-hacking across slice metrics
Separate science from rollout — research recommends; `ai-token-improvement-plan-engineer` owns the program
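
The "control confounds" principle can be enforced mechanically by checking that two arm configurations differ only in the factor under test. The sketch below is a minimal guard under that assumption; the config keys shown are hypothetical examples.

```python
_MISSING = object()


def changed_keys(baseline, treatment):
    """Return the sorted config keys whose values differ between two arms."""
    keys = baseline.keys() | treatment.keys()
    return sorted(
        k for k in keys
        if baseline.get(k, _MISSING) != treatment.get(k, _MISSING)
    )


def assert_single_factor(baseline, treatment, factor):
    """Raise ValueError unless `factor` is the only setting that changed."""
    diff = changed_keys(baseline, treatment)
    if diff != [factor]:
        raise ValueError(f"expected only {factor!r} to change, got {diff}")


baseline = {
    "temperature": 0.0,
    "system_prompt": "v1",
    "tool_schemas": "v3",
    "context_strategy": "full",
}
treatment = {**baseline, "context_strategy": "summary"}
assert_single_factor(baseline, treatment, "context_strategy")  # passes silently
```

Running this check before each comparison catches accidental changes, such as a different temperature, that would otherwise be blamed on the method being tested.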
