Mistral

Mistral AI focuses on efficiency and coding capabilities. Their "Mixture of Experts" (MoE) architecture (Mixtral) changed the game.

When to Use


  • Coding: Codestral is purpose-built for code generation; Mistral Large 2 is also strong at coding.
  • Efficiency: Mixtral 8x7B offers GPT-3.5+ performance at a fraction of the inference cost.
  • Open Weights: Apache 2.0 licenses (for smaller models).
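For the hosted models, a minimal sketch against la Plateforme's chat completions endpoint. The endpoint path, the `mistral-large-latest` alias, and the OpenAI-style response shape (`choices[0].message.content`) are assumptions to verify against the current API reference:

```python
import json
import os
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"  # assumed endpoint path

def build_chat_request(prompt: str, model: str = "mistral-large-latest") -> dict:
    """Build the JSON body for Mistral's OpenAI-style chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # low temperature suits deterministic tasks like coding
    }

def chat(prompt: str) -> str:
    """Send the request; expects MISTRAL_API_KEY in the environment."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Mistral also ships an official Python client; the raw-HTTP version above just makes the request shape explicit.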

Core Concepts


MoE (Mixture of Experts)


Only a subset of the parameters (the experts) is active for each token, giving high quality at low inference cost.
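A toy NumPy sketch of top-k routing (illustrative only, not Mixtral's actual implementation): a gate scores every expert, but only the best two run per token, which is where the compute saving comes from. The shapes mirror Mixtral's 2-of-8 routing.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts_w, gate_w, top_k=2):
    """Toy Mixture-of-Experts layer: route each token to its top-k experts.

    x:         (tokens, d_in)            token activations
    experts_w: (n_experts, d_in, d_out)  one weight matrix per expert
    gate_w:    (d_in, n_experts)         router weights
    """
    logits = x @ gate_w                           # (tokens, n_experts) gate scores
    top = np.argsort(logits, axis=1)[:, -top_k:]  # indices of the k best experts
    out = np.zeros((x.shape[0], experts_w.shape[2]))
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()                              # softmax over the selected experts only
        for weight, e in zip(w, top[t]):
            out[t] += weight * (x[t] @ experts_w[e])  # only k experts ever compute
    return out

x = rng.standard_normal((4, 8))            # 4 tokens, hidden size 8
experts = rng.standard_normal((8, 8, 16))  # 8 experts (cf. Mixtral's 8)
gate = rng.standard_normal((8, 8))
y = moe_layer(x, experts, gate)            # each token used only 2 of 8 experts
```

With 2 of 8 experts active, each token touches roughly a quarter of the expert parameters, yet the router can still draw on the full pool across tokens.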

Codestral


A model trained specifically on 80+ programming languages.
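Codestral's fill-in-the-middle (FIM) mode is what editor integrations typically use: the model completes code between a prefix and a suffix. The sketch below builds such a request; the `/v1/fim/completions` path, the `prompt`/`suffix` field names, and the `codestral-latest` alias are assumptions based on Mistral's docs, so check the current reference.

```python
FIM_URL = "https://api.mistral.ai/v1/fim/completions"  # assumed endpoint path

def build_fim_request(prefix: str, suffix: str,
                      model: str = "codestral-latest") -> dict:
    """Body for a fill-in-the-middle request: the model generates the
    code that belongs between `prompt` (prefix) and `suffix`."""
    return {
        "model": model,
        "prompt": prefix,     # code before the cursor
        "suffix": suffix,     # code after the cursor
        "max_tokens": 64,
    }

# Ask the model to fill in the body of a function:
body = build_fim_request(
    prefix="def fib(n):\n    ",
    suffix="\n    return a",
)
```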

Le Chat


Mistral's chat interface (chat.mistral.ai).

Best Practices (2025)


Do:
  • Use codestral-mamba for coding tasks that need a very long (theoretically unbounded) context window; its Mamba architecture runs in linear time.
  • Deploy via vLLM: Mistral models run exceptionally well on vLLM.
Don't:
  • Don't ignore small models: Mistral NeMo (12B) is surprisingly capable for RAG.
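A deployment fragment for the vLLM route; the model ID and flags are illustrative (the exact Hugging Face repo name and supported options depend on your vLLM version), so verify against the vLLM docs:

```shell
# Serve a Mistral model behind an OpenAI-compatible HTTP API.
pip install vllm

vllm serve mistralai/Mistral-Nemo-Instruct-2407 \
  --max-model-len 16384 \
  --port 8000

# Then query it like any OpenAI-style endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistralai/Mistral-Nemo-Instruct-2407",
       "messages": [{"role": "user", "content": "Hello"}]}'
```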
