technology-selection


.NET AI and Machine Learning


Inputs


| Input | Required | Description |
| --- | --- | --- |
| Task description | Yes | What the AI/ML feature should accomplish (e.g., "classify support tickets", "summarize documents") |
| Data description | Yes | Type and shape of input data (structured/tabular, unstructured text, images, mixed) |
| Deployment constraints | No | Cloud vs. local, latency SLO, cost budget, offline requirements |
| Existing project context | No | Current .csproj, existing packages, target framework |

Workflow


Step 1: Classify the task using the decision tree


Evaluate the developer's task against this decision tree and select the appropriate technology. State which branch applies and why.

| Task type | Technology | Rationale |
| --- | --- | --- |
| Structured/tabular data: classification, regression, clustering, anomaly detection, recommendation | ML.NET (`Microsoft.ML`) | Reproducible (given a fixed seed and dataset), no cloud dependency, purpose-built models for these tasks |
| Natural language understanding, generation, summarization, reasoning over unstructured text (single prompt → response, no tool calling) | LLM via Microsoft.Extensions.AI (`IChatClient`) | Requires language model capabilities beyond pattern matching; no orchestration needed |
| Agentic workflows: tool/function calling, multi-step reasoning, agent loops, multi-agent collaboration | Microsoft Agent Framework (`Microsoft.Agents.AI`) built on top of Microsoft.Extensions.AI | Requires orchestration, tool dispatch, iteration control, and guardrails that `IChatClient` alone does not provide |
| Building GitHub Copilot extensions, custom agents, or developer workflow tools | GitHub Copilot SDK (`GitHub.Copilot.SDK`) | Integrates with the Copilot agent runtime for IDE and CLI extensibility |
| Running a pre-trained or fine-tuned custom model in production | ONNX Runtime (`Microsoft.ML.OnnxRuntime`) | Hardware-accelerated inference, model-format agnostic |
| Local/offline LLM inference with no cloud dependency | OllamaSharp with local AI models supported by Ollama | Privacy-sensitive, air-gapped, or cost-constrained scenarios |
| Semantic search, RAG, or embedding storage | Microsoft.Extensions.VectorData.Abstractions + a vector database provider (e.g., Azure AI Search, Milvus, MongoDB, pgvector, Pinecone, Qdrant, Redis, SQL) | Provider-agnostic abstractions for vector similarity search; pair with a database-specific connector package (many are moving to community toolkits) |
| Ingesting, chunking, and loading documents into a vector store | Microsoft.Extensions.AI.DataIngestion (preview) + Microsoft.Extensions.VectorData.Abstractions (MEVD) | Handles document parsing, text chunking, embedding generation, and upserting into a vector database |
| Both structured ML predictions AND natural language reasoning | Hybrid: ML.NET for predictions + LLM for reasoning layer | Keep loosely coupled; ML.NET handles reproducible scoring, LLM adds explanation |

Critical rule: Do NOT use an LLM for tasks that ML.NET handles well (classification on tabular data, regression, clustering). LLMs are slower, more expensive, and non-deterministic for these tasks.
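The branch logic of the decision tree can be sketched as a simple selector. This is purely illustrative: `TaskKind` and `TechnologySelector` are made-up names for this sketch, not part of any real API.

```csharp
using System;

// Illustrative only: encodes the decision tree above as a switch expression.
public enum TaskKind
{
    TabularPrediction,          // classification/regression/clustering on structured data
    UnstructuredTextReasoning,  // single prompt -> response, no tools
    AgenticWorkflow,            // tool calling, agent loops, multi-agent
    CopilotExtension,
    CustomModelInference,
    LocalLlm,
    SemanticSearch,
    DocumentIngestion,
}

public static class TechnologySelector
{
    public static string SelectTechnology(TaskKind task) => task switch
    {
        TaskKind.TabularPrediction         => "ML.NET (Microsoft.ML)",
        TaskKind.UnstructuredTextReasoning => "Microsoft.Extensions.AI (IChatClient)",
        TaskKind.AgenticWorkflow           => "Microsoft Agent Framework (Microsoft.Agents.AI)",
        TaskKind.CopilotExtension          => "GitHub Copilot SDK (GitHub.Copilot.SDK)",
        TaskKind.CustomModelInference      => "ONNX Runtime (Microsoft.ML.OnnxRuntime)",
        TaskKind.LocalLlm                  => "OllamaSharp",
        TaskKind.SemanticSearch            => "Microsoft.Extensions.VectorData.Abstractions + provider",
        TaskKind.DocumentIngestion         => "Microsoft.Extensions.AI.DataIngestion + MEVD",
        _ => throw new ArgumentOutOfRangeException(nameof(task)),
    };
}
```

Encoding the branch as an exhaustive switch makes the "state which branch applies and why" step mechanical rather than ad hoc.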

Step 1b: Select the correct library layer


After identifying the task type, select the right library layer. These libraries form a stack — each builds on the one below it. Using the wrong layer is a major source of non-deterministic agent behavior.
| Layer | Library | NuGet package | Use when |
| --- | --- | --- | --- |
| Abstraction | Microsoft.Extensions.AI (MEAI) | `Microsoft.Extensions.AI` | You need a provider-agnostic interface for chat, embeddings, or tool calling. This is the foundation — always include it. Use `IChatClient` directly only for simple prompt-in/response-out scenarios with no tool calling or agentic loops. If the task involves tools, agents, or multi-step reasoning, you must add the Orchestration layer above. |
| Provider SDK | OpenAI, Azure.AI.OpenAI, Azure.AI.Inference, OllamaSharp | `OpenAI`, `Azure.AI.OpenAI`, `Azure.AI.Inference`, `OllamaSharp` | You need a concrete LLM provider implementation. These wire into MEAI via `AddChatClient`. Use `OpenAI` for direct OpenAI access, `Azure.AI.OpenAI` for Azure OpenAI, `Azure.AI.Inference` for Azure AI Foundry / GitHub Models, or `OllamaSharp` for local Ollama. Use directly only if you need provider-specific features not exposed through MEAI. |
| Orchestration | Microsoft Agent Framework | `Microsoft.Agents.AI` (prerelease) | The task involves tool/function calling, agentic loops, multi-step reasoning, multi-agent coordination, durable context, or graph-based workflows. This is required whenever the scenario involves agents or tools — do not hand-roll tool dispatch loops with `IChatClient`. Builds on top of MEAI. Note: this package is currently prerelease — use `dotnet add package Microsoft.Agents.AI --prerelease` to install it. |
| Copilot integration | GitHub Copilot SDK | `GitHub.Copilot.SDK` | You are building extensions or tools that integrate with the GitHub Copilot runtime — custom agents, IDE extensions, or developer workflow automation that leverages the Copilot agent platform. |

Decision rules for library selection


  1. Start with MEAI. Every AI integration begins with `Microsoft.Extensions.AI` for the `IChatClient` / `IEmbeddingGenerator` abstractions. This ensures provider-swappability and testability.
  2. Add a provider SDK (`OpenAI`, `Azure.AI.OpenAI`) as the concrete implementation behind MEAI. Do not call the provider SDK directly in business logic — always go through the MEAI abstraction.
  3. Use Agent Framework (`Microsoft.Agents.AI`) for any task that involves tools or agents. If the task is a single prompt → response with no tool calling, MEAI is sufficient. You MUST use `Microsoft.Agents.AI` when any of these apply:
    • Tool/function calling (agent decides which tools to invoke)
    • Multi-step reasoning with state carried across turns
    • Agentic loops that iterate until a goal is met
    • Multi-agent collaboration with handoff protocols
    • Graph-based or durable workflows
    Do not implement these patterns by hand with `IChatClient` — the Agent Framework provides iteration limits, observability, and tool dispatch that are error-prone to reimplement.
  4. Add Copilot SDK only when building Copilot extensions. Use `GitHub.Copilot.SDK` when the goal is to build a custom agent or tool that runs inside the GitHub Copilot platform (CLI, IDE, or Copilot Chat). This is not a general-purpose LLM orchestration library — it is specifically for Copilot extensibility.
  5. Never skip layers. Do not use Agent Framework without MEAI underneath. Do not call `HttpClient` to OpenAI alongside MEAI in the same workflow. Each layer depends on the one below it.

Step 2: Select packages and set up the project


Install only the packages needed for the selected technology branch. Do not mix competing abstractions.

Classic ML packages


```xml
<PackageReference Include="Microsoft.ML" Version="4.*" />
<PackageReference Include="Microsoft.ML.AutoML" Version="0.*" />
<!-- Only if custom numerical work is needed: -->
<PackageReference Include="System.Numerics.Tensors" Version="10.*" />
<PackageReference Include="MathNet.Numerics" Version="5.*" />
<!-- Only for data exploration: -->
<PackageReference Include="Microsoft.Data.Analysis" Version="0.*" />
```

Do NOT use Accord.NET — it is archived and unmaintained.

Modern AI packages


```xml
<!-- Always start with the abstraction layer -->
<PackageReference Include="Microsoft.Extensions.AI" Version="9.*" />

<!-- Orchestration (agents, workflows, tools, memory) — prerelease; use dotnet add package Microsoft.Agents.AI --prerelease -->
<PackageReference Include="Microsoft.Agents.AI" Version="1.*-*" />

<!-- Cloud LLM provider (pick one) -->
<PackageReference Include="Azure.AI.OpenAI" Version="2.*" />
<!-- OR -->
<PackageReference Include="OpenAI" Version="2.*" />

<!-- Client-side token counting for cost management -->
<PackageReference Include="Microsoft.ML.Tokenizers" Version="2.*" />

<!-- Local LLM inference -->
<PackageReference Include="OllamaSharp" Version="5.*" />

<!-- Custom model inference -->
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.*" />

<!-- Vector store abstraction -->
<PackageReference Include="Microsoft.Extensions.VectorData.Abstractions" Version="9.*" />

<!-- Document ingestion, chunking, and vector store loading (preview) -->
<PackageReference Include="Microsoft.Extensions.AI.DataIngestion" Version="9.*-*" />

<!-- Copilot platform extensibility -->
<PackageReference Include="GitHub.Copilot.SDK" Version="1.*" />
```

Stack coherence rule: Never mix raw SDK calls (`HttpClient` to OpenAI) with `Microsoft.Extensions.AI`, Microsoft Agent Framework, or Copilot SDK in the same workflow. Pick one abstraction layer per workflow boundary and commit to it. See Step 1b for the layering rules.

Register services with dependency injection


All AI/ML services must be registered via DI. Never instantiate clients directly in business logic.
```csharp
// Configuration via IOptions<T>
services.Configure<AiOptions>(configuration.GetSection("AI"));

// Register the AI client through the abstraction
services.AddChatClient(builder => builder
    .UseOpenAIChatClient("gpt-4o-mini-2024-07-18"));
```

Step 3: Implement with guardrails


Apply the guardrails for the selected technology branch. Every generated implementation must follow these rules.

Classic ML guardrails


  1. Reproducibility: Always set a random seed in the ML context:
    ```csharp
    var mlContext = new MLContext(seed: 42);
    ```
  2. Data splitting: Always split into train/test (and optionally validation). Never evaluate on training data:
    ```csharp
    var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
    ```
  3. Metrics logging: Always compute and log evaluation metrics appropriate to the task:
    ```csharp
    var metrics = mlContext.BinaryClassification.Evaluate(predictions);
    logger.LogInformation("AUC: {Auc:F4}, F1: {F1:F4}", metrics.AreaUnderRocCurve, metrics.F1Score);
    ```
  4. AutoML first: Prefer `mlContext.Auto()` for initial model selection, then refine manually.
  5. PredictionEngine pooling: In ASP.NET Core, always use the pooled prediction engine — never a singleton:
    ```csharp
    services.AddPredictionEnginePool<ModelInput, ModelOutput>()
        .FromFile(modelPath);
    ```
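The reproducibility and splitting guardrails can be illustrated without ML.NET at all: a seeded shuffle makes the train/test split a pure function of the seed. The helper below is a sketch for illustration only, not an ML.NET API; ML.NET's `TrainTestSplit` behaves equivalently when the `MLContext` is seeded.

```csharp
using System;
using System.Linq;

// Illustrative only: shows why a fixed seed makes a split reproducible.
public static class SplitSketch
{
    public static (int[] Train, int[] Test) Split(int count, double testFraction, int seed)
    {
        var rng = new Random(seed);
        // Fisher-Yates shuffle of row indices, driven entirely by the seed.
        var indices = Enumerable.Range(0, count).ToArray();
        for (int i = count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (indices[i], indices[j]) = (indices[j], indices[i]);
        }
        int testCount = (int)(count * testFraction);
        return (indices.Skip(testCount).ToArray(), indices.Take(testCount).ToArray());
    }
}
```

Running `Split` twice with the same seed yields byte-identical partitions, which is exactly the property an auditable ML pipeline needs.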

LLM integration guardrails


  1. Temperature: Always set explicitly. Use `0` for factual/deterministic tasks:
    ```csharp
    var options = new ChatOptions
    {
        Temperature = 0f,
        MaxOutputTokens = 1024,
    };
    ```
  2. Structured output: Always parse LLM output into strongly-typed objects with fallback handling:
    ```csharp
    var result = await chatClient.GetResponseAsync<MySchema>(prompt, options, cancellationToken);
    ```
  3. Retry logic: Always implement retry with exponential backoff:
    ```csharp
    services.AddChatClient(builder => builder
        .UseOpenAIChatClient(modelId)
        .Use(new RetryingChatClient(maxRetries: 3)));
    ```
  4. Cost control: Always estimate and log token usage. Use Microsoft.ML.Tokenizers to count tokens client-side before sending requests so you can enforce budgets proactively. Choose the smallest model tier that meets quality requirements (e.g., gpt-4o-mini before gpt-4o).
  5. Secret management: Never hardcode API keys. Use Azure Key Vault, user-secrets, or environment variables:
    ```csharp
    var apiKey = configuration["AI:ApiKey"]
        ?? throw new InvalidOperationException("AI:ApiKey not configured");
    ```
  6. Model version pinning: Specify exact model versions to reduce behavioral drift:
    ```csharp
    // Pin to a specific dated version, not just "gpt-4o"
    var modelId = "gpt-4o-2024-08-06";
    ```
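The backoff schedule behind the retry guardrail can be sketched in plain C#. This is a hand-rolled helper for illustration; in production, prefer a pipeline-level retrying client or Polly as described above.

```csharp
using System;
using System.Threading.Tasks;

// Illustrative retry helper: exponential backoff with jitter.
public static class Backoff
{
    // Attempt 1 -> 1s, attempt 2 -> 2s, attempt 3 -> 4s (before jitter).
    public static TimeSpan Delay(int attempt, int baseMs = 1000) =>
        TimeSpan.FromMilliseconds(baseMs * Math.Pow(2, attempt - 1));

    public static async Task<T> RetryAsync<T>(Func<Task<T>> action, int maxRetries = 3)
    {
        var jitter = new Random();
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception) when (attempt < maxRetries)
            {
                // Jitter spreads retries out so concurrent callers do not
                // hammer the provider in lockstep after an outage.
                await Task.Delay(Delay(attempt) + TimeSpan.FromMilliseconds(jitter.Next(250)));
            }
        }
    }
}
```

On the final attempt the exception propagates to the caller, so transient failures are retried but persistent ones still surface.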

Agentic workflow guardrails


  1. Use `Microsoft.Agents.AI` for all agentic workflows. Do not implement tool dispatch loops or multi-step agent reasoning by hand with `IChatClient`. The Agent Framework provides `ChatClientAgent` (or `AgentWorker`) which handles the tool call → result → re-prompt cycle with built-in guardrails. All rules below assume you are using `Microsoft.Agents.AI`.
  2. Iteration limits: Always cap agentic loops to prevent runaway execution:
    ```csharp
    var settings = new AgentInvokeOptions
    {
        MaximumIterations = 10,
    };
    ```
  3. Cost ceiling: Implement a token budget per execution and terminate when reached. Use Microsoft.ML.Tokenizers to count prompt and completion tokens locally and compare against the budget before each iteration.
  4. Observability: Log non-sensitive metadata for every agent step. Never log raw `message.Content` — it may contain user prompts, tool outputs, secrets, or PII that persist in plaintext in central logging systems:
    ```csharp
    await foreach (var message in agent.InvokeStreamingAsync(history, settings))
    {
        logger.LogDebug("Agent step: Role={Role}, ContentLength={Length}",
            message.Role, message.Content?.Length ?? 0);
    }
    ```
  5. Tool schemas: Define explicit tool/function schemas with descriptions. Never rely on implicit tool discovery.
  6. Simplicity preference: Prefer single-agent with tools over multi-agent unless the task genuinely requires agent collaboration.
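The cost-ceiling check in rule 3 can be sketched as a small accumulator. The `TokenBudget` type is illustrative; wire the actual counts from Microsoft.ML.Tokenizers or the provider's usage metadata.

```csharp
using System;

// Illustrative token budget for an agent run: charge tokens before each
// iteration and stop the loop once the ceiling would be exceeded.
public sealed class TokenBudget
{
    private readonly int _limit;
    public int Used { get; private set; }

    public TokenBudget(int limit) => _limit = limit;

    // Returns false when the requested step would exceed the budget,
    // signalling the agent loop to terminate gracefully.
    public bool TryCharge(int tokens)
    {
        if (Used + tokens > _limit) return false;
        Used += tokens;
        return true;
    }
}
```

A typical use: `if (!budget.TryCharge(promptTokens + expectedCompletionTokens)) break;` at the top of each iteration, alongside the `MaximumIterations` cap.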

RAG guardrails


  1. Embedding caching: Never re-embed the same content on every query. Cache embeddings in the vector store.
  2. Chunking strategy: Use semantic chunking (split on paragraph/section boundaries) over fixed-size chunking. Ensure chunks have enough context to be useful on their own.
  3. Relevance thresholds: Do not inject low-relevance chunks into context. Set a minimum similarity score:
    ```csharp
    var results = await vectorStore.SearchAsync(query, new VectorSearchOptions
    {
        Top = 5,
        MinimumScore = 0.75f,
    });
    ```
  4. Source attribution: Track which chunks contributed to the final response. Include source references in the output.
  5. Batch embeddings: Batch embedding API calls where possible to reduce latency and cost.
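A minimal paragraph-boundary chunker illustrating rule 2. This is a sketch, not production code: libraries such as Microsoft.Extensions.AI.DataIngestion ship real chunkers, and this version does not split a single paragraph that exceeds the budget.

```csharp
using System;
using System.Collections.Generic;

// Illustrative semantic chunker: split on blank-line paragraph boundaries,
// then pack adjacent paragraphs into chunks up to a character budget.
public static class Chunker
{
    public static List<string> Chunk(string text, int maxChars = 1000)
    {
        var paragraphs = text.Split(new[] { "\n\n" }, StringSplitOptions.RemoveEmptyEntries);
        var chunks = new List<string>();
        var current = "";
        foreach (var p in paragraphs)
        {
            var paragraph = p.Trim();
            if (paragraph.Length == 0) continue;
            // Start a new chunk when appending would exceed the budget.
            if (current.Length > 0 && current.Length + paragraph.Length + 2 > maxChars)
            {
                chunks.Add(current);
                current = "";
            }
            current = current.Length == 0 ? paragraph : current + "\n\n" + paragraph;
        }
        if (current.Length > 0) chunks.Add(current);
        return chunks;
    }
}
```

Splitting on paragraph boundaries keeps each chunk self-contained, which is the point of rule 2: a chunk that starts mid-sentence retrieves poorly and reads worse when injected into context.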

Step 4: Handle non-determinism


When the solution involves LLM calls or agentic workflows, explicitly address non-determinism:
  1. Acknowledge it: Inform the developer that LLM outputs are non-deterministic even at temperature 0 (due to batching, quantization, and model updates).
  2. Validate outputs: Implement schema validation and content assertion checks on every LLM response.
  3. Graceful degradation: Design a fallback path for when the LLM returns unexpected, malformed, or empty output:
    ```csharp
    var response = await chatClient.GetResponseAsync<ClassificationResult>(prompt, options);
    if (response is null || !response.IsValid())
    {
        logger.LogWarning("LLM returned invalid response, falling back to rule-based classifier");
        return ruleBasedClassifier.Classify(input);
    }
    ```
  4. Evaluation harness: For any prompt that will be iterated on, recommend creating a golden dataset and evaluation scaffold to measure prompt quality over time.
  5. Model version pinning: Pin to specific dated model versions (e.g., `gpt-4o-2024-08-06`) to reduce drift between deployments.
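The validation step in rule 2 can be sketched with System.Text.Json: deserialize into a strong type, then assert its invariants before trusting the output. The type and field names below are illustrative.

```csharp
using System;
using System.Text.Json;

// Illustrative schema validation for an LLM classification response.
public sealed record ClassificationResult(string Label, double Confidence)
{
    public bool IsValid() =>
        !string.IsNullOrWhiteSpace(Label) && Confidence is >= 0.0 and <= 1.0;
}

public static class LlmOutputValidator
{
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Returns null for malformed JSON or outputs that violate the schema,
    // signalling the caller to take the fallback path.
    public static ClassificationResult? TryParse(string llmOutput)
    {
        try
        {
            var result = JsonSerializer.Deserialize<ClassificationResult>(llmOutput, Options);
            return result is { } r && r.IsValid() ? r : null;
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

Returning `null` instead of throwing keeps the degradation path (rule 3) a plain branch rather than exception-driven control flow.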

Step 5: Apply performance and cost controls


  1. Connection pooling: Use `IHttpClientFactory` and DI-managed clients for all external services.
  2. Response caching: Cache repeated or similar queries. Consider semantic caching for LLM responses where appropriate.
  3. Streaming: Use `IAsyncEnumerable` for LLM responses in user-facing scenarios to reduce time-to-first-token:
    ```csharp
    await foreach (var update in chatClient.GetStreamingResponseAsync(prompt, options))
    {
        yield return update.Text;
    }
    ```
  4. Health checks: Implement health checks for external AI service dependencies:
    ```csharp
    services.AddHealthChecks()
        .AddCheck<OpenAIHealthCheck>("openai");
    ```
  5. ML.NET prediction pooling: In web applications, always use `PredictionEnginePool<TIn, TOut>`, never a single `PredictionEngine` instance (it is not thread-safe).
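The streaming pattern in rule 3 can be exercised end to end with a stubbed token source. `FakeStreamAsync` is a stand-in for a real streaming chat client, used here only to show the producer/consumer shape.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Illustrative streaming pipeline: a stub producer yields tokens as they
// "arrive"; the consumer renders them incrementally instead of waiting
// for the full response.
public static class StreamingSketch
{
    // Stand-in for a streaming LLM call.
    public static async IAsyncEnumerable<string> FakeStreamAsync(params string[] tokens)
    {
        foreach (var token in tokens)
        {
            await Task.Yield(); // simulate latency between updates
            yield return token;
        }
    }

    public static async Task<string> ConsumeAsync(IAsyncEnumerable<string> updates)
    {
        var sb = new StringBuilder();
        await foreach (var update in updates)
        {
            sb.Append(update); // in a UI, flush each update to the client here
        }
        return sb.ToString();
    }
}
```

The consumer sees the first token as soon as it is yielded, which is the time-to-first-token win the guardrail is after.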

Step 6: Validate the implementation


  1. Build the project and verify no warnings:
    ```bash
    dotnet build -c Release -warnaserror
    ```
  2. Run tests, including integration tests that validate AI/ML behavior:
    ```bash
    dotnet test -c Release
    ```
  3. For ML.NET pipelines, verify that evaluation metrics meet the project's quality bar and that the model can be serialized and loaded correctly.
  4. For LLM integrations, verify that structured output parsing handles both valid and malformed responses.
  5. For RAG pipelines, verify that retrieval returns relevant results and that irrelevant chunks are filtered out.

Validation


  • Technology selection follows the decision tree — LLMs are not used for tasks ML.NET handles
  • All AI/ML services are registered via dependency injection
  • Configuration uses `IOptions<T>` pattern — no hardcoded values
  • API keys are loaded from secure sources — not in source code or committed config files
  • ML.NET pipelines set a random seed and split data for evaluation
  • LLM calls set temperature, max tokens, and retry logic explicitly
  • Agentic workflows have iteration limits and cost ceilings
  • RAG pipelines implement chunking, relevance thresholds, and source attribution
  • Non-deterministic outputs have validation and fallback paths
  • `dotnet build -c Release -warnaserror` completes cleanly

Anti-Patterns to Reject


When reviewing or generating code, flag and redirect the developer if any of these patterns are detected:
| Anti-pattern | Redirect |
| --- | --- |
| Using an LLM for classification on structured/tabular data | Use ML.NET instead — it is faster, cheaper, and deterministic |
| Calling LLM APIs without retry or timeout logic | Add `RetryingChatClient` or Polly-based retry with exponential backoff |
| Storing API keys in `appsettings.json` committed to source control | Use user-secrets (dev), environment variables, or Azure Key Vault (prod) |
| Using Accord.NET for new projects | Migrate to ML.NET — Accord.NET is archived and unmaintained |
| Building custom neural networks in .NET from scratch | Use a pre-trained model via ONNX Runtime or call an LLM API |
| RAG without chunking strategy or relevance filtering | Implement semantic chunking and set a minimum similarity score threshold |
| Agentic loops without iteration limits or cost ceilings | Add `MaximumIterations` and a token budget ceiling |
| Using MEAI `IChatClient` with raw `HttpClient` calls to the same provider | Pick one abstraction layer and commit to it |
| Implementing tool calling or agentic loops manually with `IChatClient` instead of using `Microsoft.Agents.AI` | Use `Microsoft.Agents.AI` — it provides iteration limits (`MaximumIterations`), built-in tool dispatch, observability hooks, and cost controls. Hand-rolled loops lack these guardrails. |
| Using Agent Framework for a single prompt→response call | Use MEAI `IChatClient` directly — Agent Framework is for multi-step orchestration |
| Using Copilot SDK for general-purpose LLM apps | Copilot SDK is for Copilot platform extensions only — use MEAI + Agent Framework for standalone apps |
| Calling OpenAI SDK directly in business logic instead of through MEAI | Register the provider via `AddChatClient` and depend on `IChatClient` in business code |
| Using `PredictionEngine` as a singleton in ASP.NET Core | Use `PredictionEnginePool<TIn, TOut>` — `PredictionEngine` is not thread-safe |
| Using `Func<ReadOnlySpan<T>>` for delegates with ref struct parameters | Define a custom delegate type — ref structs cannot be generic type arguments |
| Using `Microsoft.SemanticKernel` for new projects | Use `Microsoft.Extensions.AI` + `Microsoft.Agents.AI` — Semantic Kernel is superseded by these newer abstractions for LLM orchestration and tool calling |

Common Pitfalls


| Pitfall | Solution |
| --- | --- |
| Over-engineering with LLMs | Start with the simplest approach (rules, ML.NET) and add LLM capability only when simpler methods fall short |
| Evaluating ML models on training data | Always use `TrainTestSplit` and report metrics on the held-out test set |
| LLM output drift between deployments | Pin to specific dated model versions (e.g., `gpt-4o-2024-08-06`) |
| Token cost surprises | Set `MaxOutputTokens`, use Microsoft.ML.Tokenizers for accurate client-side token counting, log token counts per request, and alert on budget thresholds |
| Non-reproducible ML training | Set `MLContext(seed: N)` and version your training data alongside the code |
| RAG returning irrelevant context | Set a minimum similarity score and limit the number of injected chunks |
| Cold start latency on ML.NET models | Pre-warm the `PredictionEnginePool` during application startup |
| Microsoft Agent Framework + raw OpenAI SDK in same class | Choose one orchestration layer per workflow boundary |