technology-selection


.NET AI and Machine Learning


Inputs


| Input | Required | Description |
| --- | --- | --- |
| Task description | Yes | What the AI/ML feature should accomplish (e.g., "classify support tickets", "summarize documents") |
| Data description | Yes | Type and shape of input data (structured/tabular, unstructured text, images, mixed) |
| Deployment constraints | No | Cloud vs. local, latency SLO, cost budget, offline requirements |
| Existing project context | No | Current .csproj, existing packages, target framework |

Workflow


Step 1: Classify the task using the decision tree


Evaluate the developer's task against this decision tree and select the appropriate technology. State which branch applies and why.

| Task type | Technology | Rationale |
| --- | --- | --- |
| Structured/tabular data: classification, regression, clustering, anomaly detection, recommendation | ML.NET (`Microsoft.ML`) | Reproducible (given a fixed seed and dataset), no cloud dependency, purpose-built models for these tasks |
| Natural language understanding, generation, summarization, reasoning over unstructured text (single prompt → response, no tool calling) | LLM via Microsoft.Extensions.AI (`IChatClient`) | Requires language model capabilities beyond pattern matching; no orchestration needed |
| Agentic workflows: tool/function calling, multi-step reasoning, agent loops, multi-agent collaboration | Microsoft Agent Framework (`Microsoft.Agents.AI`) built on top of Microsoft.Extensions.AI | Requires orchestration, tool dispatch, iteration control, and guardrails that `IChatClient` alone does not provide |
| Building GitHub Copilot extensions, custom agents, or developer workflow tools | GitHub Copilot SDK (`GitHub.Copilot.SDK`) | Integrates with the Copilot agent runtime for IDE and CLI extensibility |
| Running a pre-trained or fine-tuned custom model in production | ONNX Runtime (`Microsoft.ML.OnnxRuntime`) | Hardware-accelerated inference, model-format agnostic |
| Local/offline LLM inference with no cloud dependency | OllamaSharp with local AI models supported by Ollama | Privacy-sensitive, air-gapped, or cost-constrained scenarios |
| Semantic search, RAG, or embedding storage | Microsoft.Extensions.VectorData.Abstractions + a vector database provider (e.g., Azure AI Search, Milvus, MongoDB, pgvector, Pinecone, Qdrant, Redis, SQL) | Provider-agnostic abstractions for vector similarity search; pair with a database-specific connector package (many are moving to community toolkits) |
| Ingesting, chunking, and loading documents into a vector store | Microsoft.Extensions.AI.DataIngestion (preview) + Microsoft.Extensions.VectorData.Abstractions (MEVD) | Handles document parsing, text chunking, embedding generation, and upserting into a vector database |
| Both structured ML predictions AND natural language reasoning | Hybrid: ML.NET for predictions + LLM for reasoning layer | Keep loosely coupled; ML.NET handles reproducible scoring, LLM adds explanation |

Critical rule: Do NOT use an LLM for tasks that ML.NET handles well (classification on tabular data, regression, clustering). LLMs are slower, more expensive, and non-deterministic for these tasks.
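The branch logic of the decision tree can be sketched as a simple selector. This is purely illustrative: `TaskKind` and `TechnologySelector` are made-up names for this sketch, not part of any real API.

```csharp
using System;

// Illustrative only: encodes the decision tree above as a switch expression.
public enum TaskKind
{
    TabularPrediction,          // classification/regression/clustering on structured data
    UnstructuredTextReasoning,  // single prompt -> response, no tools
    AgenticWorkflow,            // tool calling, agent loops, multi-agent
    CopilotExtension,
    CustomModelInference,
    LocalLlm,
    SemanticSearch,
    DocumentIngestion,
}

public static class TechnologySelector
{
    public static string SelectTechnology(TaskKind task) => task switch
    {
        TaskKind.TabularPrediction         => "ML.NET (Microsoft.ML)",
        TaskKind.UnstructuredTextReasoning => "Microsoft.Extensions.AI (IChatClient)",
        TaskKind.AgenticWorkflow           => "Microsoft Agent Framework (Microsoft.Agents.AI)",
        TaskKind.CopilotExtension          => "GitHub Copilot SDK (GitHub.Copilot.SDK)",
        TaskKind.CustomModelInference      => "ONNX Runtime (Microsoft.ML.OnnxRuntime)",
        TaskKind.LocalLlm                  => "OllamaSharp",
        TaskKind.SemanticSearch            => "Microsoft.Extensions.VectorData.Abstractions + provider",
        TaskKind.DocumentIngestion         => "Microsoft.Extensions.AI.DataIngestion + MEVD",
        _ => throw new ArgumentOutOfRangeException(nameof(task)),
    };
}
```

Encoding the branch as an exhaustive switch makes the "state which branch applies and why" step mechanical rather than ad hoc.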

Step 1b: Select the correct library layer


After identifying the task type, select the right library layer. These libraries form a stack — each builds on the one below it. Using the wrong layer is a major source of non-deterministic agent behavior.
| Layer | Library | NuGet package | Use when |
| --- | --- | --- | --- |
| Abstraction | Microsoft.Extensions.AI (MEAI) | `Microsoft.Extensions.AI` | You need a provider-agnostic interface for chat, embeddings, or tool calling. This is the foundation — always include it. Use `IChatClient` directly only for simple prompt-in/response-out scenarios with no tool calling or agentic loops. If the task involves tools, agents, or multi-step reasoning, you must add the Orchestration layer above. |
| Provider SDK | OpenAI, Azure.AI.OpenAI, Azure.AI.Inference, OllamaSharp | `OpenAI`, `Azure.AI.OpenAI`, `Azure.AI.Inference`, `OllamaSharp` | You need a concrete LLM provider implementation. These wire into MEAI via `AddChatClient`. Use `OpenAI` for direct OpenAI access, `Azure.AI.OpenAI` for Azure OpenAI, `Azure.AI.Inference` for Azure AI Foundry / GitHub Models, or `OllamaSharp` for local Ollama. Use directly only if you need provider-specific features not exposed through MEAI. |
| Orchestration | Microsoft Agent Framework | `Microsoft.Agents.AI` (prerelease) | The task involves tool/function calling, agentic loops, multi-step reasoning, multi-agent coordination, durable context, or graph-based workflows. This is required whenever the scenario involves agents or tools — do not hand-roll tool dispatch loops with `IChatClient`. Builds on top of MEAI. Note: this package is currently prerelease — use `dotnet add package Microsoft.Agents.AI --prerelease` to install it. |
| Copilot integration | GitHub Copilot SDK | `GitHub.Copilot.SDK` | You are building extensions or tools that integrate with the GitHub Copilot runtime — custom agents, IDE extensions, or developer workflow automation that leverages the Copilot agent platform. |

Decision rules for library selection


  1. Start with MEAI. Every AI integration begins with `Microsoft.Extensions.AI` for the `IChatClient` / `IEmbeddingGenerator` abstractions. This ensures provider-swappability and testability.
  2. Add a provider SDK (`OpenAI`, `Azure.AI.OpenAI`) as the concrete implementation behind MEAI. Do not call the provider SDK directly in business logic — always go through the MEAI abstraction.
  3. Use Agent Framework (`Microsoft.Agents.AI`) for any task that involves tools or agents. If the task is a single prompt → response with no tool calling, MEAI is sufficient. You MUST use `Microsoft.Agents.AI` when any of these apply:
    • Tool/function calling (agent decides which tools to invoke)
    • Multi-step reasoning with state carried across turns
    • Agentic loops that iterate until a goal is met
    • Multi-agent collaboration with handoff protocols
    • Graph-based or durable workflows
    Do not implement these patterns by hand with `IChatClient` — the Agent Framework provides iteration limits, observability, and tool dispatch that are error-prone to reimplement.
  4. Add Copilot SDK only when building Copilot extensions. Use `GitHub.Copilot.SDK` when the goal is to build a custom agent or tool that runs inside the GitHub Copilot platform (CLI, IDE, or Copilot Chat). This is not a general-purpose LLM orchestration library — it is specifically for Copilot extensibility.
  5. Never skip layers. Do not use Agent Framework without MEAI underneath. Do not call `HttpClient` to OpenAI alongside MEAI in the same workflow. Each layer depends on the one below it.

Step 2: Select packages and set up the project


Install only the packages needed for the selected technology branch. Do not mix competing abstractions.

Classic ML packages


```xml
<PackageReference Include="Microsoft.ML" Version="4.*" />
<PackageReference Include="Microsoft.ML.AutoML" Version="0.*" />
<!-- Only if custom numerical work is needed: -->
<PackageReference Include="System.Numerics.Tensors" Version="10.*" />
<PackageReference Include="MathNet.Numerics" Version="5.*" />
<!-- Only for data exploration: -->
<PackageReference Include="Microsoft.Data.Analysis" Version="0.*" />
```

Do NOT use Accord.NET — it is archived and unmaintained.

Modern AI packages


```xml
<!-- Always start with the abstraction layer -->
<PackageReference Include="Microsoft.Extensions.AI" Version="9.*" />

<!-- Orchestration (agents, workflows, tools, memory) — prerelease; use dotnet add package Microsoft.Agents.AI --prerelease -->
<PackageReference Include="Microsoft.Agents.AI" Version="1.*-*" />

<!-- Cloud LLM provider (pick one) -->
<PackageReference Include="Azure.AI.OpenAI" Version="2.*" />
<!-- OR -->
<PackageReference Include="OpenAI" Version="2.*" />

<!-- Client-side token counting for cost management -->
<PackageReference Include="Microsoft.ML.Tokenizers" Version="2.*" />

<!-- Local LLM inference -->
<PackageReference Include="OllamaSharp" Version="5.*" />

<!-- Custom model inference -->
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.*" />

<!-- Vector store abstraction -->
<PackageReference Include="Microsoft.Extensions.VectorData.Abstractions" Version="9.*" />

<!-- Document ingestion, chunking, and vector store loading (preview) -->
<PackageReference Include="Microsoft.Extensions.AI.DataIngestion" Version="9.*-*" />

<!-- Copilot platform extensibility -->
<PackageReference Include="GitHub.Copilot.SDK" Version="1.*" />
```

Stack coherence rule: Never mix raw SDK calls (`HttpClient` to OpenAI) with `Microsoft.Extensions.AI`, Microsoft Agent Framework, or Copilot SDK in the same workflow. Pick one abstraction layer per workflow boundary and commit to it. See Step 1b for the layering rules.

Register services with dependency injection


All AI/ML services must be registered via DI. Never instantiate clients directly in business logic.
```csharp
// Configuration via IOptions<T>
services.Configure<AiOptions>(configuration.GetSection("AI"));

// Register the AI client through the abstraction
services.AddChatClient(builder => builder
    .UseOpenAIChatClient("gpt-4o-mini-2024-07-18"));
```

Step 3: Implement with guardrails


Apply the guardrails for the selected technology branch. Every generated implementation must follow these rules.

Classic ML guardrails


  1. Reproducibility: Always set a random seed in the ML context:
    ```csharp
    var mlContext = new MLContext(seed: 42);
    ```
  2. Data splitting: Always split into train/test (and optionally validation). Never evaluate on training data:
    ```csharp
    var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
    ```
  3. Metrics logging: Always compute and log evaluation metrics appropriate to the task:
    ```csharp
    var metrics = mlContext.BinaryClassification.Evaluate(predictions);
    logger.LogInformation("AUC: {Auc:F4}, F1: {F1:F4}", metrics.AreaUnderRocCurve, metrics.F1Score);
    ```
  4. AutoML first: Prefer `mlContext.Auto()` for initial model selection, then refine manually.
  5. PredictionEngine pooling: In ASP.NET Core, always use the pooled prediction engine — never a singleton:
    ```csharp
    services.AddPredictionEnginePool<ModelInput, ModelOutput>()
        .FromFile(modelPath);
    ```
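The reproducibility and splitting guardrails can be illustrated without ML.NET at all: a seeded shuffle makes the train/test split a pure function of the seed. The helper below is a sketch for illustration only, not an ML.NET API; ML.NET's `TrainTestSplit` behaves equivalently when the `MLContext` is seeded.

```csharp
using System;
using System.Linq;

// Illustrative only: shows why a fixed seed makes a split reproducible.
public static class SplitSketch
{
    public static (int[] Train, int[] Test) Split(int count, double testFraction, int seed)
    {
        var rng = new Random(seed);
        // Fisher-Yates shuffle of row indices, driven entirely by the seed.
        var indices = Enumerable.Range(0, count).ToArray();
        for (int i = count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (indices[i], indices[j]) = (indices[j], indices[i]);
        }
        int testCount = (int)(count * testFraction);
        return (indices.Skip(testCount).ToArray(), indices.Take(testCount).ToArray());
    }
}
```

Running `Split` twice with the same seed yields byte-identical partitions, which is exactly the property an auditable ML pipeline needs.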

LLM integration guardrails


  1. Temperature: Always set explicitly. Use `0` for factual/deterministic tasks:
    ```csharp
    var options = new ChatOptions
    {
        Temperature = 0f,
        MaxOutputTokens = 1024,
    };
    ```
  2. Structured output: Always parse LLM output into strongly-typed objects with fallback handling:
    ```csharp
    var result = await chatClient.GetResponseAsync<MySchema>(prompt, options, cancellationToken);
    ```
  3. Retry logic: Always implement retry with exponential backoff:
    ```csharp
    services.AddChatClient(builder => builder
        .UseOpenAIChatClient(modelId)
        .Use(new RetryingChatClient(maxRetries: 3)));
    ```
  4. Cost control: Always estimate and log token usage. Use Microsoft.ML.Tokenizers to count tokens client-side before sending requests so you can enforce budgets proactively. Choose the smallest model tier that meets quality requirements (e.g., gpt-4o-mini before gpt-4o).
  5. Secret management: Never hardcode API keys. Use Azure Key Vault, user-secrets, or environment variables:
    ```csharp
    var apiKey = configuration["AI:ApiKey"]
        ?? throw new InvalidOperationException("AI:ApiKey not configured");
    ```
  6. Model version pinning: Specify exact model versions to reduce behavioral drift:
    ```csharp
    // Pin to a specific dated version, not just "gpt-4o"
    var modelId = "gpt-4o-2024-08-06";
    ```
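The backoff schedule behind the retry guardrail can be sketched in plain C#. This is a hand-rolled helper for illustration; in production, prefer a pipeline-level retrying client or Polly as described above.

```csharp
using System;
using System.Threading.Tasks;

// Illustrative retry helper: exponential backoff with jitter.
public static class Backoff
{
    // Attempt 1 -> 1s, attempt 2 -> 2s, attempt 3 -> 4s (before jitter).
    public static TimeSpan Delay(int attempt, int baseMs = 1000) =>
        TimeSpan.FromMilliseconds(baseMs * Math.Pow(2, attempt - 1));

    public static async Task<T> RetryAsync<T>(Func<Task<T>> action, int maxRetries = 3)
    {
        var jitter = new Random();
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                // Jitter spreads retries out so concurrent callers do not
                // hammer the provider in lockstep after an outage.
                await Task.Delay(Delay(attempt) + TimeSpan.FromMilliseconds(jitter.Next(250)));
            }
        }
    }
}
```

On the final attempt the exception propagates to the caller, so transient failures are retried but persistent ones still surface.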

Agentic workflow guardrails


  1. Use `Microsoft.Agents.AI` for all agentic workflows. Do not implement tool dispatch loops or multi-step agent reasoning by hand with `IChatClient`. The Agent Framework provides `ChatClientAgent` (or `AgentWorker`) which handles the tool call → result → re-prompt cycle with built-in guardrails. All rules below assume you are using `Microsoft.Agents.AI`.
  2. Iteration limits: Always cap agentic loops to prevent runaway execution:
    ```csharp
    var settings = new AgentInvokeOptions
    {
        MaximumIterations = 10,
    };
    ```
  3. Cost ceiling: Implement a token budget per execution and terminate when reached. Use Microsoft.ML.Tokenizers to count prompt and completion tokens locally and compare against the budget before each iteration.
  4. Observability: Log non-sensitive metadata for every agent step. Never log raw `message.Content` — it may contain user prompts, tool outputs, secrets, or PII that persist in plaintext in central logging systems:
    ```csharp
    await foreach (var message in agent.InvokeStreamingAsync(history, settings))
    {
        logger.LogDebug("Agent step: Role={Role}, ContentLength={Length}",
            message.Role, message.Content?.Length ?? 0);
    }
    ```
  5. Tool schemas: Define explicit tool/function schemas with descriptions. Never rely on implicit tool discovery.
  6. Simplicity preference: Prefer single-agent with tools over multi-agent unless the task genuinely requires agent collaboration.
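The cost-ceiling check in rule 3 can be sketched as a small accumulator. The `TokenBudget` type is illustrative; wire the actual counts from Microsoft.ML.Tokenizers or the provider's usage metadata.

```csharp
using System;

// Illustrative token budget for an agent run: charge tokens before each
// iteration and stop the loop once the ceiling would be exceeded.
public sealed class TokenBudget
{
    private readonly int _limit;
    public int Used { get; private set; }

    public TokenBudget(int limit) => _limit = limit;

    // Returns false when the requested step would exceed the budget,
    // signalling the agent loop to terminate gracefully.
    public bool TryCharge(int tokens)
    {
        if (Used + tokens > _limit) return false;
        Used += tokens;
        return true;
    }
}
```

A typical use: `if (!budget.TryCharge(promptTokens + expectedCompletionTokens)) break;` at the top of each iteration, alongside the `MaximumIterations` cap.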

RAG guardrails


  1. Embedding caching: Never re-embed the same content on every query. Cache embeddings in the vector store.
  2. Chunking strategy: Use semantic chunking (split on paragraph/section boundaries) over fixed-size chunking. Ensure chunks have enough context to be useful on their own.
  3. Relevance thresholds: Do not inject low-relevance chunks into context. Set a minimum similarity score:
    ```csharp
    var results = await vectorStore.SearchAsync(query, new VectorSearchOptions
    {
        Top = 5,
        MinimumScore = 0.75f,
    });
    ```
  4. Source attribution: Track which chunks contributed to the final response. Include source references in the output.
  5. Batch embeddings: Batch embedding API calls where possible to reduce latency and cost.
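A minimal paragraph-boundary chunker illustrating rule 2. This is a sketch, not production code: libraries such as Microsoft.Extensions.AI.DataIngestion ship real chunkers, and this version does not split a single paragraph that exceeds the budget.

```csharp
using System;
using System.Collections.Generic;

// Illustrative semantic chunker: split on blank-line paragraph boundaries,
// then pack adjacent paragraphs into chunks up to a character budget.
public static class Chunker
{
    public static List<string> Chunk(string text, int maxChars = 1000)
    {
        var paragraphs = text.Split(new[] { "\n\n" }, StringSplitOptions.RemoveEmptyEntries);
        var chunks = new List<string>();
        var current = "";
        foreach (var p in paragraphs)
        {
            var paragraph = p.Trim();
            if (paragraph.Length == 0) continue;
            // Start a new chunk when appending would exceed the budget.
            if (current.Length > 0 && current.Length + paragraph.Length + 2 > maxChars)
            {
                chunks.Add(current);
                current = "";
            }
            current = current.Length == 0 ? paragraph : current + "\n\n" + paragraph;
        }
        if (current.Length > 0) chunks.Add(current);
        return chunks;
    }
}
```

Splitting on paragraph boundaries keeps each chunk self-contained, which is the point of rule 2: a chunk that starts mid-sentence retrieves poorly and reads worse when injected into context.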

Step 4: Handle non-determinism


When the solution involves LLM calls or agentic workflows, explicitly address non-determinism:
  1. Acknowledge it: Inform the developer that LLM outputs are non-deterministic even at temperature 0 (due to batching, quantization, and model updates).
  2. Validate outputs: Implement schema validation and content assertion checks on every LLM response.
  3. Graceful degradation: Design a fallback path for when the LLM returns unexpected, malformed, or empty output:
    ```csharp
    var response = await chatClient.GetResponseAsync<ClassificationResult>(prompt, options);
    if (response is null || !response.IsValid())
    {
        logger.LogWarning("LLM returned invalid response, falling back to rule-based classifier");
        return ruleBasedClassifier.Classify(input);
    }
    ```
  4. Evaluation harness: For any prompt that will be iterated on, recommend creating a golden dataset and evaluation scaffold to measure prompt quality over time.
  5. Model version pinning: Pin to specific dated model versions (e.g., `gpt-4o-2024-08-06`) to reduce drift between deployments.
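The validation step in rule 2 can be sketched with System.Text.Json: deserialize into a strong type, then assert its invariants before trusting the output. The type and field names below are illustrative.

```csharp
using System;
using System.Text.Json;

// Illustrative schema validation for an LLM classification response.
public sealed record ClassificationResult(string Label, double Confidence)
{
    public bool IsValid() =>
        !string.IsNullOrWhiteSpace(Label) && Confidence is >= 0.0 and <= 1.0;
}

public static class LlmOutputValidator
{
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Returns null for malformed JSON or outputs that violate the schema,
    // signalling the caller to take the fallback path.
    public static ClassificationResult? TryParse(string llmOutput)
    {
        try
        {
            var result = JsonSerializer.Deserialize<ClassificationResult>(llmOutput, Options);
            return result is { } r && r.IsValid() ? r : null;
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

Returning `null` instead of throwing keeps the degradation path (rule 3) a plain branch rather than exception-driven control flow.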

Step 5: Apply performance and cost controls


  1. Connection pooling: Use `IHttpClientFactory` and DI-managed clients for all external services.
  2. Response caching: Cache repeated or similar queries. Consider semantic caching for LLM responses where appropriate.
  3. Streaming: Use `IAsyncEnumerable` for LLM responses in user-facing scenarios to reduce time-to-first-token:
    ```csharp
    await foreach (var update in chatClient.GetStreamingResponseAsync(prompt, options))
    {
        yield return update.Text;
    }
    ```
  4. Health checks: Implement health checks for external AI service dependencies:
    ```csharp
    services.AddHealthChecks()
        .AddCheck<OpenAIHealthCheck>("openai");
    ```
  5. ML.NET prediction pooling: In web applications, always use `PredictionEnginePool<TIn, TOut>`, never a single `PredictionEngine` instance (it is not thread-safe).
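The streaming pattern in rule 3 can be exercised end to end with a stubbed token source. `FakeStreamAsync` is a stand-in for a real streaming chat client, used here only to show the producer/consumer shape.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Illustrative streaming pipeline: a stub producer yields tokens as they
// "arrive"; the consumer renders them incrementally instead of waiting
// for the full response.
public static class StreamingSketch
{
    // Stand-in for a streaming LLM call.
    public static async IAsyncEnumerable<string> FakeStreamAsync(params string[] tokens)
    {
        foreach (var token in tokens)
        {
            await Task.Yield(); // simulate latency between updates
            yield return token;
        }
    }

    public static async Task<string> ConsumeAsync(IAsyncEnumerable<string> updates)
    {
        var sb = new StringBuilder();
        await foreach (var update in updates)
        {
            sb.Append(update); // in a UI, flush each update to the client here
        }
        return sb.ToString();
    }
}
```

The consumer sees the first token as soon as it is yielded, which is the time-to-first-token win the guardrail is after.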

Step 6: Validate the implementation


  1. Build the project and verify no warnings:
    ```bash
    dotnet build -c Release -warnaserror
    ```
  2. Run tests, including integration tests that validate AI/ML behavior:
    ```bash
    dotnet test -c Release
    ```
  3. For ML.NET pipelines, verify that evaluation metrics meet the project's quality bar and that the model can be serialized and loaded correctly.
  4. For LLM integrations, verify that structured output parsing handles both valid and malformed responses.
  5. For RAG pipelines, verify that retrieval returns relevant results and that irrelevant chunks are filtered out.

Validation


  • Technology selection follows the decision tree — LLMs are not used for tasks ML.NET handles
  • All AI/ML services are registered via dependency injection
  • Configuration uses `IOptions<T>` pattern — no hardcoded values
  • API keys are loaded from secure sources — not in source code or committed config files
  • ML.NET pipelines set a random seed and split data for evaluation
  • LLM calls set temperature, max tokens, and retry logic explicitly
  • Agentic workflows have iteration limits and cost ceilings
  • RAG pipelines implement chunking, relevance thresholds, and source attribution
  • Non-deterministic outputs have validation and fallback paths
  • `dotnet build -c Release -warnaserror` completes cleanly

Anti-Patterns to Reject


When reviewing or generating code, flag and redirect the developer if any of these patterns are detected:
| Anti-pattern | Redirect |
| --- | --- |
| Using an LLM for classification on structured/tabular data | Use ML.NET instead — it is faster, cheaper, and deterministic |
| Calling LLM APIs without retry or timeout logic | Add `RetryingChatClient` or Polly-based retry with exponential backoff |
| Storing API keys in `appsettings.json` committed to source control | Use user-secrets (dev), environment variables, or Azure Key Vault (prod) |
| Using Accord.NET for new projects | Migrate to ML.NET — Accord.NET is archived and unmaintained |
| Building custom neural networks in .NET from scratch | Use a pre-trained model via ONNX Runtime or call an LLM API |
| RAG without chunking strategy or relevance filtering | Implement semantic chunking and set a minimum similarity score threshold |
| Agentic loops without iteration limits or cost ceilings | Add `MaximumIterations` and a token budget ceiling |
| Using MEAI `IChatClient` with raw `HttpClient` calls to the same provider | Pick one abstraction layer and commit to it |
| Implementing tool calling or agentic loops manually with `IChatClient` instead of using `Microsoft.Agents.AI` | Use `Microsoft.Agents.AI` — it provides iteration limits (`MaximumIterations`), built-in tool dispatch, observability hooks, and cost controls. Hand-rolled loops lack these guardrails. |
| Using Agent Framework for a single prompt→response call | Use MEAI `IChatClient` directly — Agent Framework is for multi-step orchestration |
| Using Copilot SDK for general-purpose LLM apps | Copilot SDK is for Copilot platform extensions only — use MEAI + Agent Framework for standalone apps |
| Calling OpenAI SDK directly in business logic instead of through MEAI | Register the provider via `AddChatClient` and depend on `IChatClient` in business code |
| Using `PredictionEngine` as a singleton in ASP.NET Core | Use `PredictionEnginePool<TIn, TOut>` — `PredictionEngine` is not thread-safe |
| Using `Func<ReadOnlySpan<T>>` for delegates with ref struct parameters | Define a custom delegate type — ref structs cannot be generic type arguments |
| Using `Microsoft.SemanticKernel` for new projects | Use `Microsoft.Extensions.AI` + `Microsoft.Agents.AI` — Semantic Kernel is superseded by these newer abstractions for LLM orchestration and tool calling |

Common Pitfalls


| Pitfall | Solution |
| --- | --- |
| Over-engineering with LLMs | Start with the simplest approach (rules, ML.NET) and add LLM capability only when simpler methods fall short |
| Evaluating ML models on training data | Always use `TrainTestSplit` and report metrics on the held-out test set |
| LLM output drift between deployments | Pin to specific dated model versions (e.g., `gpt-4o-2024-08-06`) |
| Token cost surprises | Set `MaxOutputTokens`, use Microsoft.ML.Tokenizers for accurate client-side token counting, log token counts per request, and alert on budget thresholds |
| Non-reproducible ML training | Set `MLContext(seed: N)` and version your training data alongside the code |
| RAG returning irrelevant context | Set a minimum similarity score and limit the number of injected chunks |
| Cold start latency on ML.NET models | Pre-warm the `PredictionEnginePool` during application startup |
| Microsoft Agent Framework + raw OpenAI SDK in same class | Choose one orchestration layer per workflow boundary |