research-engineer-scientist-tokens


Research Engineer / Research Scientist, Tokens

When to Use

  • Frame research questions on tokens, context length, or inference cost
  • Design experiments with baselines, ablations, and statistical rigor
  • Build benchmarks for tokens-per-successful-task, effective context, cache leverage
  • Measure tokenizer and formatting effects on length and model behavior
  • Evaluate compression, summarization, routing, or distillation for token savings
  • Analyze long-context phenomena (needle, lost-in-middle, attention budget)
  • Write research memos with reproducible methods and honest limitations
  • Translate findings into actionable thresholds for engineering and product
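The tokens-per-successful-task benchmark named above can be sketched as follows. The source does not define the metric, so the definition here (all tokens spent, including on failed runs, divided by the number of successes) is an assumption, and `TaskRun` with its fields is a hypothetical record type.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One task attempt (hypothetical record; field names are illustrative)."""
    input_tokens: int
    output_tokens: int
    success: bool


def tokens_per_successful_task(runs):
    """Total tokens across ALL runs divided by the number of successes.

    Failed runs still cost tokens, so they stay in the numerator
    (assumed definition, not taken from the source).
    """
    total = sum(r.input_tokens + r.output_tokens for r in runs)
    successes = sum(1 for r in runs if r.success)
    if successes == 0:
        return float("inf")  # no successes: cost per success is unbounded
    return total / successes


runs = [
    TaskRun(100, 50, True),
    TaskRun(200, 100, False),
    TaskRun(80, 20, True),
]
print(tokens_per_successful_task(runs))  # 550 tokens / 2 successes = 275.0
```

Keeping failed runs in the numerator means a cheaper prompt that fails more often is not automatically rewarded.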

When NOT to Use

  • Executive token reduction program with phased rollout → ai-token-improvement-plan-engineer
  • Implement context packing, compaction code paths → ai-context-engineer
  • Rewrite one production prompt → prompt-engineer
  • General literature survey unrelated to tokens → ai-researcher
  • Production RAG/agent deployment → ai-engineer
  • Classical ML without LLM token focus → data-scientist

Related skills

Need → Skill
  • General research methodology → ai-researcher
  • Cost improvement program / roadmap → ai-token-improvement-plan-engineer
  • Production context assembly → ai-context-engineer
  • Prompt wording and eval harness → prompt-engineer
  • RAG and agent runtime build → ai-engineer
  • Statistical testing and cohort analysis → data-scientist
  • Adversarial robustness of compressed context → ai-redteam
  • Commercial AI architecture → applied-ai-architect-commercial-enterprise

Core Workflows

1. Research framing (tokens)

Hypothesis, metrics, baselines, budget. See references/research_framing_tokens.md.

2. Measurement and instrumentation

Token accounting, logging, fair comparison. See references/measurement_instrumentation.md.
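As a minimal sketch of token accounting, the function below aggregates logged usage records per experiment arm. It assumes the counts were already produced by the deployment tokenizer or API; the record keys (`arm`, `input_tokens`, `cached_tokens`, `output_tokens`) are hypothetical, and defining cache leverage as cached input tokens over total input tokens is an assumption.

```python
from collections import defaultdict

TOKEN_FIELDS = ("input_tokens", "cached_tokens", "output_tokens")


def summarize_usage(records):
    """Aggregate per-call usage records into per-arm totals.

    Each record is a dict with hypothetical keys: "arm" plus the
    fields in TOKEN_FIELDS. Missing token fields count as zero.
    """
    totals = defaultdict(lambda: {"calls": 0, **{f: 0 for f in TOKEN_FIELDS}})
    for rec in records:
        arm_totals = totals[rec["arm"]]
        arm_totals["calls"] += 1
        for field in TOKEN_FIELDS:
            arm_totals[field] += rec.get(field, 0)

    summary = {}
    for arm, t in totals.items():
        # Assumed definition: share of input tokens served from cache.
        ratio = t["cached_tokens"] / t["input_tokens"] if t["input_tokens"] else 0.0
        summary[arm] = {**t, "cache_hit_ratio": ratio}
    return summary


records = [
    {"arm": "baseline", "input_tokens": 1000, "cached_tokens": 0, "output_tokens": 200},
    {"arm": "baseline", "input_tokens": 1000, "cached_tokens": 500, "output_tokens": 100},
    {"arm": "compressed", "input_tokens": 400, "cached_tokens": 100, "output_tokens": 150},
]
for arm, stats in summarize_usage(records).items():
    print(arm, stats)
```

Logging every arm through the same record shape is what makes the later comparison fair: both arms are counted by the same rules.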

3. Experiment design and ablations

Controls, sweeps, power, stopping rules. See references/experiment_design_ablations.md.
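To illustrate the power consideration, here is a rough sample-size sketch for comparing task success rates between two arms. It uses the standard normal-approximation formula for a two-sided two-proportion test; the function name and the default `alpha` and `power` values are illustrative choices, not taken from the source.

```python
import math
from statistics import NormalDist


def samples_per_arm(p_baseline, p_treatment, alpha=0.05, power=0.80):
    """Approximate tasks needed per arm to detect a success-rate difference.

    Normal approximation for a two-sided two-proportion z-test.
    Treat the result as a planning estimate, not an exact requirement.
    """
    if p_baseline == p_treatment:
        raise ValueError("success rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_baseline + p_treatment) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_power * math.sqrt(
        p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    )
    effect = p_baseline - p_treatment
    return math.ceil((term_null + term_alt) ** 2 / effect ** 2)


# Detecting a drop from 80% to 75% success needs on the order of a
# thousand tasks per arm; a 10-point drop needs far fewer.
print(samples_per_arm(0.80, 0.75))
print(samples_per_arm(0.80, 0.70))
```

Running this before the experiment makes the stopping rule concrete: the eval set size is fixed up front instead of growing until a result looks significant.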

4. Context, tokenization, and long-context

Tokenizer, placement, window effects. See references/context_tokenization_longcontext.md.
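A placement experiment can be set up with a simple depth sweep. The sketch below only builds the prompts: it inserts a "needle" sentence at a fractional depth in filler text. The function names and the default depth grid are hypothetical, and sending the prompts to a model and scoring retrieval are left out.

```python
def place_needle(filler_sentences, needle, depth):
    """Insert `needle` at a fractional depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = int(depth * len(filler_sentences))
    parts = filler_sentences[:index] + [needle] + filler_sentences[index:]
    return " ".join(parts)


def build_sweep(filler_sentences, needle, depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return one context per depth so placement is the only variable."""
    return {d: place_needle(filler_sentences, needle, d) for d in depths}


filler = ["Alpha.", "Bravo.", "Charlie.", "Delta."]
for depth, context in build_sweep(filler, "The code is 4417.").items():
    print(depth, "->", context)
```

Because every context contains the same filler and the same needle, total length stays constant across depths and accuracy differences can be attributed to position.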

5. Compression and efficiency methods

Summarization, routing, distillation research. See references/compression_efficiency_methods.md.

6. Reproducibility and research reporting

Memos, artifacts, handoff to engineering. See references/reproducibility_reporting.md.
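One way to make an artifact handoff verifiable is a manifest that records the seed, the config, and a content hash per dataset file. This is a minimal sketch using only the standard library; the manifest layout and the function names are assumptions, not a format prescribed by the source.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_paths, config, seed):
    """Collect what a rerun needs: seed, config, and dataset hashes.

    Datasets are keyed by file name, so names are assumed to be unique.
    """
    return {
        "seed": seed,
        "config": config,
        "datasets": {Path(p).name: sha256_file(p) for p in dataset_paths},
    }


def manifest_json(manifest):
    """Serialize with sorted keys so the manifest itself diffs cleanly."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

A reviewer can recompute the hashes on their copy of the data and compare them with the manifest before trusting a reproduced number.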

Outputs

  • Pre-registration / experiment plan — hypothesis, metrics, stop criteria
  • Results table — mean ± CI; tokens and quality side by side
  • Pareto chart narrative — quality vs tokens at operating points
  • Ablation appendix — what mattered, what did not
  • Research memo — conclusion, limits, recommended next build
  • Artifact bundle — configs, seeds, eval scripts, hashed datasets
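The results table and Pareto narrative above rest on two small computations, sketched here under stated assumptions: the interval uses a normal approximation (a t-interval or bootstrap may suit small samples better), and operating points are hypothetical `(tokens, quality)` pairs where fewer tokens and higher quality are both preferred.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Return (mean, low, high) using a normal-approximation interval."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(values) / math.sqrt(len(values))
    centre = mean(values)
    return centre, centre - half_width, centre + half_width


def pareto_front(points):
    """Keep (tokens, quality) points that no other point dominates.

    A point is dropped if another point uses no more tokens and
    achieves at least the same quality.
    """
    front = []
    best_quality = float("-inf")
    # Cheapest first; among equal token counts, highest quality first.
    for tokens, quality in sorted(points, key=lambda p: (p[0], -p[1])):
        if quality > best_quality:
            front.append((tokens, quality))
            best_quality = quality
    return front


points = [(1000, 0.90), (600, 0.85), (700, 0.80), (400, 0.70)]
print(pareto_front(points))  # (700, 0.80) is dominated by (600, 0.85)
print(mean_ci([612, 598, 640, 605, 620]))
```

Only points on the front are worth discussing as operating points; each one represents a distinct trade between token spend and quality.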

Principles

  • Report tokens and quality together: never optimize one without the other
  • Match tokenizer and model: take counts from the deployment tokenizer/API
  • Control confounds: hold temperature, system prompt, and tool schemas fixed across arms
  • Pre-register the primary metric: avoid p-hacking across slice metrics
  • Separate science from rollout: research recommends; ai-token-improvement-plan-engineer owns the program
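The confound-control principle can be enforced mechanically before a run starts. The check below is a sketch: it assumes each arm is described by a plain config dict, and the key names (`temperature`, `system_prompt`, `tool_schemas`) are hypothetical stand-ins for whatever the experiment harness actually records.

```python
def check_controls(arm_configs, held_fixed=("temperature", "system_prompt", "tool_schemas")):
    """Return settings that differ across arms but were meant to be fixed.

    `arm_configs` maps arm name -> config dict. The result maps each
    violated key to the per-arm values, and is empty when all controls hold.
    """
    violations = {}
    for key in held_fixed:
        values = {arm: cfg.get(key) for arm, cfg in arm_configs.items()}
        distinct = []
        for value in values.values():
            if value not in distinct:  # works for unhashable values too
                distinct.append(value)
        if len(distinct) > 1:
            violations[key] = values
    return violations


arms = {
    "baseline": {"temperature": 0.0, "system_prompt": "v1", "tool_schemas": ["search"]},
    "compressed": {"temperature": 0.2, "system_prompt": "v1", "tool_schemas": ["search"]},
}
print(check_controls(arms))  # flags temperature only
```

Failing fast on a non-empty result keeps an accidental temperature or schema change from being reported as a token-savings effect.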