research-engineer-scientist-tokens
Research Engineer / Research Scientist, Tokens
When to Use
- Frame research questions on tokens, context length, or inference cost
- Design experiments with baselines, ablations, and statistical rigor
- Build benchmarks for tokens-per-successful-task, effective context, cache leverage
- Measure tokenizer and formatting effects on length and model behavior
- Evaluate compression, summarization, routing, or distillation for token savings
- Analyze long-context phenomena (needle, lost-in-middle, attention budget)
- Write research memos with reproducible methods and honest limitations
- Translate findings into actionable thresholds for engineering and product
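A metric such as tokens-per-successful-task can be pinned down in a few lines. The sketch below assumes one plausible definition (total tokens spent across all attempts, divided by the number of successful attempts); the `TaskRun` record and its fields are hypothetical, not part of this skill's references.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One attempt at a task (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    success: bool


def tokens_per_successful_task(runs: list[TaskRun]) -> float:
    """Total tokens across all runs divided by the number of successes.

    Failed runs still count toward cost, so an arm is not rewarded
    for failing cheaply.
    """
    total = sum(r.input_tokens + r.output_tokens for r in runs)
    successes = sum(1 for r in runs if r.success)
    if successes == 0:
        return float("inf")  # no successes: cost per success is unbounded
    return total / successes
```

Counting failed attempts in the numerator is a design choice: it keeps the metric honest when a cheaper arm also succeeds less often.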
When NOT to Use
- Executive token reduction program with phased rollout → ai-token-improvement-plan-engineer
- Implement context packing, compaction code paths → ai-context-engineer
- Rewrite one production prompt → prompt-engineer
- General literature survey unrelated to tokens → ai-researcher
- Production RAG/agent deployment → ai-engineer
- Classical ML without LLM token focus → data-scientist
Related skills
| Need | Skill |
|---|---|
| General research methodology | |
| Cost improvement program / roadmap | |
| Production context assembly | |
| Prompt wording and eval harness | |
| RAG and agent runtime build | |
| Statistical testing and cohort analysis | |
| Adversarial robustness of compressed context | |
| Commercial AI architecture | |
Core Workflows
1. Research framing (tokens)
Hypothesis, metrics, baselines, budget.
See references/research_framing_tokens.md.
2. Measurement and instrumentation
Token accounting, logging, fair comparison.
See references/measurement_instrumentation.md.
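As an illustration of token accounting, the sketch below aggregates per-call usage into per-arm totals and a cache fraction. The `CallUsage` record and its field names are assumptions made for this example; real usage fields depend on the deployment API and should be read from its responses rather than estimated.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallUsage:
    """Token usage for one model call (hypothetical log record)."""
    arm: str
    input_tokens: int         # prompt tokens not served from cache
    cached_input_tokens: int  # prompt tokens served from a cache
    output_tokens: int


def summarize_usage(calls: list[CallUsage]) -> dict[str, dict]:
    """Aggregate per-arm token totals and the cached share of prompt tokens."""
    totals = defaultdict(lambda: {"input": 0, "cached": 0, "output": 0})
    for call in calls:
        t = totals[call.arm]
        t["input"] += call.input_tokens
        t["cached"] += call.cached_input_tokens
        t["output"] += call.output_tokens

    summary = {}
    for arm, t in totals.items():
        prompt = t["input"] + t["cached"]
        summary[arm] = {
            **t,
            "total": prompt + t["output"],
            "cache_fraction": t["cached"] / prompt if prompt else 0.0,
        }
    return summary
```

Keeping cached and uncached prompt tokens in separate columns lets two arms be compared fairly even when one benefits from a warm cache.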
3. Experiment design and ablations
Controls, sweeps, power, stopping rules.
See references/experiment_design_ablations.md.
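For the power part of experiment design, a rough sample-size estimate helps decide how many tasks each arm needs before any tokens are spent. The sketch below uses the standard normal-approximation formula for comparing two success rates; the function name and default values are illustrative assumptions, and a paired or sequential design would need a different calculation.

```python
import math
from statistics import NormalDist


def tasks_per_arm(p_baseline: float, p_treatment: float,
                  alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate tasks needed per arm for a two-sided two-proportion z-test."""
    if p_baseline == p_treatment:
        raise ValueError("success rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    p_bar = (p_baseline + p_treatment) / 2
    pooled_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    unpooled_term = z_beta * math.sqrt(
        p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    )
    n = (pooled_term + unpooled_term) ** 2 / (p_baseline - p_treatment) ** 2
    return math.ceil(n)
```

Smaller expected differences drive the required task count up quickly, which is worth checking against the experiment's token budget before committing to a sweep.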
4. Context, tokenization, and long-context
Tokenizer, placement, window effects.
See references/context_tokenization_longcontext.md.
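Placement effects are usually probed by moving a known fact through the context and checking retrieval at each position. A minimal sketch of such a sweep follows, assuming the filler is already split into sentences; both function names are hypothetical, and depth here is measured in sentences, so token-level depth should be recomputed with the deployment tokenizer.

```python
from typing import Iterator


def place_needle(filler: list[str], needle: str, depth: float) -> list[str]:
    """Insert `needle` at a fractional depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    return filler[:index] + [needle] + filler[index:]


def depth_sweep(filler: list[str], needle: str,
                steps: int = 5) -> Iterator[tuple[float, str]]:
    """Yield (depth, document) pairs at evenly spaced depths."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    for i in range(steps):
        depth = i / (steps - 1)
        yield depth, " ".join(place_needle(filler, needle, depth))
```

Each yielded document keeps the filler identical and varies only the needle position, so differences in retrieval can be attributed to placement rather than content.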
5. Compression and efficiency methods
Summarization, routing, distillation research.
See references/compression_efficiency_methods.md.
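When several summarization, routing, or distillation variants are compared, one way to decide which are worth keeping is a quality-versus-tokens Pareto frontier. The sketch below assumes each variant has already been reduced to a mean token count and a quality score; the tuple layout and function name are illustrative.

```python
def pareto_frontier(points: list[tuple[str, float, float]]) -> list[str]:
    """Return names of non-dominated variants, ordered by token cost.

    Each point is (name, mean_tokens, quality). A variant is dropped if
    another one is at least as cheap and at least as good; exact ties
    keep only the first variant encountered.
    """
    frontier = []
    best_quality = float("-inf")
    # Cheapest first; among equal costs, highest quality first.
    for name, tokens, quality in sorted(points, key=lambda p: (p[1], -p[2])):
        if quality > best_quality:
            frontier.append(name)
            best_quality = quality
    return frontier
```

Variants off the frontier cost at least as much as some alternative without scoring higher, so they are candidates to drop before further ablations.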
6. Reproducibility and research reporting
Memos, artifacts, handoff to engineering.
See references/reproducibility_reporting.md.
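For the artifact side of a handoff, hashing the evaluation datasets and recording them next to the config and seeds makes a run checkable later. A minimal sketch follows, assuming datasets are local files; the manifest layout and function names are hypothetical, not a format defined by this skill.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(config: dict, seeds: list[int], dataset_paths: list) -> str:
    """Serialize config, seeds, and dataset hashes as stable, sorted JSON."""
    manifest = {
        "config": config,
        "seeds": seeds,
        "datasets": {Path(p).name: sha256_file(p) for p in dataset_paths},
    }
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Sorting keys keeps the manifest byte-stable across runs, so two manifests can be compared with a plain diff.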
Outputs
- Pre-registration / experiment plan — hypothesis, metrics, stop criteria
- Results table — mean ± CI; tokens and quality side by side
- Pareto chart narrative — quality vs tokens at operating points
- Ablation appendix — what mattered, what did not
- Research memo — conclusion, limits, recommended next build
- Artifact bundle — configs, seeds, eval scripts, hashed datasets
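The mean ± CI entries in a results table can be produced with a percentile bootstrap when no distributional assumption is convenient. The sketch below is one simple option, with a hypothetical function name and default resample count; it applies equally to per-task token counts and per-task quality scores.

```python
import random
from statistics import mean


def bootstrap_mean_ci(values: list[float], n_resamples: int = 2000,
                      confidence: float = 0.95, seed: int = 0):
    """Return (mean, ci_low, ci_high) using a percentile bootstrap."""
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so the table is reproducible
    n = len(values)
    resampled_means = sorted(
        mean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    low_index = int((1 - confidence) / 2 * n_resamples)
    high_index = int((1 + confidence) / 2 * n_resamples) - 1
    return mean(values), resampled_means[low_index], resampled_means[high_index]
```

Running it once on tokens and once on quality for each arm gives the two columns that the results table reports side by side.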
Principles
- Report tokens and quality together: never optimize one without the other
- Match tokenizer and model: take counts from the deployment tokenizer/API
- Control confounds: hold temperature, system prompt, and tool schemas fixed across arms
- Pre-register the primary metric: avoid p-hacking across slice metrics
- Separate science from rollout: research recommends; ai-token-improvement-plan-engineer owns the program
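The confound-control principle can be checked mechanically before a run starts. The sketch below assumes each arm is described by a plain config dict; the key names mirror the settings listed above but are otherwise hypothetical.

```python
import json

# Settings that should be identical in every arm (assumed key names).
CONTROLLED_KEYS = ("temperature", "system_prompt", "tool_schemas")


def confound_mismatches(arm_configs: dict[str, dict],
                        controlled=CONTROLLED_KEYS) -> dict[str, dict]:
    """Return {key: {arm: value}} for controlled settings that differ across arms."""
    mismatches = {}
    for key in controlled:
        values = {arm: cfg.get(key) for arm, cfg in arm_configs.items()}
        # Serialize so nested lists and dicts can be compared as a set.
        distinct = {json.dumps(v, sort_keys=True) for v in values.values()}
        if len(distinct) > 1:
            mismatches[key] = values
    return mismatches
```

An empty result means the arms differ only in settings outside the controlled list, which is where the intended treatment should live.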