axiom-foundation-models
Foundation Models — On-Device AI for Apple Platforms
When to Use This Skill
Use when:
- Implementing on-device AI features with Foundation Models
- Adding text summarization, classification, or extraction capabilities
- Creating structured output from LLM responses
- Building tool-calling patterns for external data integration
- Streaming generated content for better UX
- Debugging Foundation Models issues (context overflow, slow generation, wrong output)
- Deciding between Foundation Models vs server LLMs (ChatGPT, Claude, etc.)
Related Skills
- `axiom-foundation-models-diag` — Use for systematic troubleshooting (context exceeded, guardrail violations, availability problems)
- `axiom-foundation-models-ref` — Use for complete API reference with all WWDC code examples
Red Flags — Anti-Patterns That Will Fail
❌ Using for World Knowledge
Why it fails: The on-device model is 3 billion parameters, optimized for summarization, extraction, classification — NOT world knowledge or complex reasoning.
Example of wrong use:
```swift
// ❌ BAD - Asking for world knowledge
let session = LanguageModelSession()
let response = try await session.respond(to: "What's the capital of France?")
```
Why: Model will hallucinate or give low-quality answers. It's trained for content generation, not encyclopedic knowledge.
Correct approach: Use server LLMs (ChatGPT, Claude) for world knowledge, or provide factual data through Tool calling.
❌ Blocking Main Thread
Why it fails: `session.respond()` is `async`, but if called synchronously on the main thread, it freezes the UI for seconds.
Example of wrong use:
```swift
// ❌ BAD - Blocking main thread
Button("Generate") {
    let response = try await session.respond(to: prompt) // UI frozen!
}
```
Why: Generation takes 1-5 seconds. User sees frozen app, bad reviews follow.
Correct approach:
```swift
// ✅ GOOD - Async on background
Button("Generate") {
    Task {
        let response = try await session.respond(to: prompt)
        // Update UI with response
    }
}
```
❌ Manual JSON Parsing
Why it fails: Prompting for JSON and parsing with JSONDecoder leads to hallucinated keys, invalid JSON, no type safety.
Example of wrong use:
```swift
// ❌ BAD - Manual JSON parsing
let prompt = "Generate a person with name and age as JSON"
let response = try await session.respond(to: prompt)
let data = response.content.data(using: .utf8)!
let person = try JSONDecoder().decode(Person.self, from: data) // CRASHES!
```
Why: Model might output `{firstName: "John"}` when you expect `{name: "John"}`. Or invalid JSON entirely.
Correct approach:
```swift
// ✅ GOOD - @Generable guarantees structure
@Generable
struct Person {
    let name: String
    let age: Int
}
let response = try await session.respond(
    to: "Generate a person",
    generating: Person.self
)
// response.content is type-safe Person instance
```
❌ Ignoring Availability Check
Why it fails: Foundation Models only runs on Apple Intelligence devices in supported regions. App crashes or shows errors without check.
Example of wrong use:
```swift
// ❌ BAD - No availability check
let session = LanguageModelSession() // Might fail!
```
Correct approach:
```swift
// ✅ GOOD - Check first
switch SystemLanguageModel.default.availability {
case .available:
    let session = LanguageModelSession()
    // proceed
case .unavailable(let reason):
    // Show graceful UI: "AI features require Apple Intelligence"
}
```
❌ Single Huge Prompt
Why it fails: 4096 token context window (input + output). One massive prompt hits limit, gives poor results.
Example of wrong use:
```swift
// ❌ BAD - Everything in one prompt
let prompt = """
Generate a 7-day itinerary for Tokyo including hotels, restaurants,
activities for each day, transportation details, budget breakdown...
"""
// Exceeds context, poor quality
```
Correct approach: Break into smaller tasks, use tools for external data, multi-turn conversation.
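That decomposition can be sketched as a series of small turns on one session, assuming the `@Generable` `DayPlan` type from Pattern 2; the prompt text and loop are illustrative, not from the original:

```swift
// Sketch: one small request per day instead of one giant prompt.
// The session's transcript carries earlier days as context.
func planTrip(days: Int, session: LanguageModelSession) async throws -> [DayPlan] {
    var plans: [DayPlan] = []
    for day in 1...days {
        let response = try await session.respond(
            to: "Plan day \(day) of a Tokyo trip, briefly",
            generating: DayPlan.self
        )
        plans.append(response.content)
    }
    return plans
}
```

Each turn stays well under the 4096-token window, though a long trip still grows the transcript, so overflow handling is still needed.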
❌ Not Handling Context Overflow
Why it fails: Multi-turn conversations grow transcript. Eventually exceeds 4096 tokens, throws error, conversation ends.
Must handle:
```swift
// ✅ GOOD - Handle overflow
do {
    let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // Condense transcript and create new session
    session = condensedSession(from: session)
}
```
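The snippet above leaves `condensedSession(from:)` undefined. One possible sketch, assuming the framework's `Transcript(entries:)` and `LanguageModelSession(transcript:)` initializers; keeping only the first entry plus the last exchange is an illustrative strategy, not the only one:

```swift
// Sketch: start a fresh session seeded with a condensed history.
func condensedSession(from old: LanguageModelSession) -> LanguageModelSession {
    let entries = old.transcript
    // Keep the first entry (often instructions) plus the last exchange.
    let kept = Array(entries.prefix(1)) + Array(entries.suffix(2))
    return LanguageModelSession(transcript: Transcript(entries: kept))
}
```

Alternatively, ask the model to summarize the old conversation and seed the new session's instructions with that summary.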
❌ Not Handling Guardrail Violations
Why it fails: Model has content policy. Certain prompts trigger guardrails, throw error.
Must handle:
```swift
// ✅ GOOD - Handle guardrails
do {
    let response = try await session.respond(to: userInput)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Show message: "I can't help with that request"
}
```
❌ Not Handling Unsupported Language
Why it fails: Model supports specific languages. User input might be unsupported, throws error.
Must check:
```swift
// ✅ GOOD - Check supported languages
let supported = SystemLanguageModel.default.supportedLanguages
guard supported.contains(Locale.current.language) else {
    // Show disclaimer
    return
}
```
Mandatory First Steps
Before writing any Foundation Models code, complete these steps:
1. Check Availability
```swift
switch SystemLanguageModel.default.availability {
case .available:
    // Proceed with implementation
    print("✅ Foundation Models available")
case .unavailable(let reason):
    // Handle gracefully - show UI message
    print("❌ Unavailable: \(reason)")
}
```
Why: Foundation Models requires:
- Apple Intelligence-enabled device
- Supported region
- User opted in to Apple Intelligence
Failure mode: App crashes or shows confusing errors without check.
2. Identify Use Case
Ask yourself: What is my primary goal?
| Use Case | Foundation Models? | Alternative |
|---|---|---|
| Summarization | ✅ YES | |
| Extraction (key info from text) | ✅ YES | |
| Classification (categorize content) | ✅ YES | |
| Content tagging | ✅ YES (built-in adapter!) | |
| World knowledge | ❌ NO | ChatGPT, Claude, Gemini |
| Complex reasoning | ❌ NO | Server LLMs |
| Mathematical computation | ❌ NO | Calculator, symbolic math |
Critical: If your use case requires world knowledge or advanced reasoning, stop. Foundation Models is the wrong tool.
3. Design @Generable Schema
If you need structured output (not just plain text):
Bad approach: Prompt for "JSON" and parse manually
Good approach: Define @Generable type
```swift
@Generable
struct SearchSuggestions {
    @Guide(description: "Suggested search terms", .count(4))
    var searchTerms: [String]
}
```
Why: Constrained decoding guarantees structure. No parsing errors, no hallucinated keys.
4. Consider Tools for External Data
If your feature needs external information:
- Weather → WeatherKit tool
- Locations → MapKit tool
- Contacts → Contacts API tool
- Calendar → EventKit tool
Don't try to get this information from the model (it will hallucinate).
Do define Tool protocol implementations.
5. Plan Streaming for Long Generations
If generation takes >1 second, use streaming:
```swift
let stream = session.streamResponse(
    to: prompt,
    generating: Itinerary.self
)
for try await partial in stream {
    // Update UI incrementally
    self.itinerary = partial
}
```
Why: Users see progress immediately, perceived latency drops dramatically.
Decision Tree
Need on-device AI?
│
├─ World knowledge/reasoning?
│ └─ ❌ NOT Foundation Models
│ → Use ChatGPT, Claude, Gemini, etc.
│ → Reason: 3B parameter model, not trained for encyclopedic knowledge
│
├─ Summarization?
│ └─ ✅ YES → Pattern 1 (Basic Session)
│ → Example: Summarize article, condense email
│ → Time: 10-15 minutes
│
├─ Structured extraction?
│ └─ ✅ YES → Pattern 2 (@Generable)
│ → Example: Extract name, date, amount from invoice
│ → Time: 15-20 minutes
│
├─ Content tagging?
│ └─ ✅ YES → Pattern 3 (contentTagging use case)
│ → Example: Tag article topics, extract entities
│ → Time: 10 minutes
│
├─ Need external data?
│ └─ ✅ YES → Pattern 4 (Tool calling)
│ → Example: Fetch weather, query contacts, get locations
│ → Time: 20-30 minutes
│
├─ Long generation?
│ └─ ✅ YES → Pattern 5 (Streaming)
│ → Example: Generate itinerary, create story
│ → Time: 15-20 minutes
│
└─ Dynamic schemas (runtime-defined structure)?
└─ ✅ YES → Pattern 6 (DynamicGenerationSchema)
→ Example: Level creator, user-defined forms
      → Time: 30-40 minutes
Pattern 1: Basic Session (~1500 words)
Use when: Simple text generation, summarization, or content analysis.
Core Concepts
LanguageModelSession:
- Stateful — retains transcript of all interactions
- Instructions vs prompts:
- Instructions (from developer): Define model's role, static guidance
- Prompts (from user): Dynamic input for generation
- Model trained to obey instructions over prompts (security feature)
Implementation
```swift
import FoundationModels

func respond(userInput: String) async throws -> String {
    let session = LanguageModelSession(instructions: """
        You are a friendly barista in a pixel art coffee shop.
        Respond to the player's question concisely.
        """
    )
    let response = try await session.respond(to: userInput)
    return response.content
}
// WWDC 301:1:05
```
Key Points
- Instructions are optional — Reasonable defaults if omitted
- Never interpolate user input into instructions — Security risk (prompt injection)
- Keep instructions concise — Each token adds latency
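The second bullet can be made concrete with a minimal sketch; the instruction and prompt strings here are illustrative:

```swift
// ❌ BAD - User input interpolated into instructions (prompt injection risk)
let risky = LanguageModelSession(
    instructions: "Summarize the following: \(userInput)"
)

// ✅ GOOD - Instructions stay static; user input arrives as the prompt
let session = LanguageModelSession(
    instructions: "You summarize text concisely."
)
let response = try await session.respond(to: userInput)
```

Because the model is trained to obey instructions over prompts, keeping user input on the prompt side limits what an adversarial input can override.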
Multi-Turn Interactions
```swift
let session = LanguageModelSession()

// First turn
let first = try await session.respond(to: "Write a haiku about fishing")
print(first.content)
// "Silent waters gleam,
// Casting lines in morning mist—
// Hope in every cast."

// Second turn - model remembers context
let second = try await session.respond(to: "Do another one about golf")
print(second.content)
// "Silent morning dew,
// Caddies guide with gentle words—
// Paths of patience tread."

// Inspect full transcript
print(session.transcript)
// WWDC 286:17:46
```
Why this works: Session retains transcript automatically. Model uses context from previous turns.
Transcript Inspection
```swift
let transcript = session.transcript
// Use for:
// - Debugging generation issues
// - Showing conversation history in UI
// - Exporting chat logs
```
Error Handling (Basic)
```swift
do {
    let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Content policy triggered
    print("Cannot generate that content")
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
    // Language not supported
    print("Please use English or another supported language")
}
```
When to Use This Pattern
✅ Good for:
- Simple Q&A
- Text summarization
- Content analysis
- Single-turn generation
❌ Not good for:
- Structured output (use Pattern 2)
- Long conversations (will hit context limit)
- External data needs (use Pattern 4)
Time Cost
Implementation: 10-15 minutes for basic usage
Debugging: +5-10 minutes if hitting errors
Pattern 2: @Generable Structured Output (~2000 words)
Use when: You need structured data from model, not just plain text.
The Problem
Without @Generable:
```swift
// ❌ BAD - Unreliable
let prompt = "Generate a person with name and age as JSON"
let response = try await session.respond(to: prompt)
// Might get: {"firstName": "John"} when you expect {"name": "John"}
// Might get invalid JSON entirely
// Must parse manually, prone to crashes
```
The Solution: @Generable
```swift
@Generable
struct Person {
    let name: String
    let age: Int
}
let session = LanguageModelSession()
let response = try await session.respond(
    to: "Generate a person",
    generating: Person.self
)
let person = response.content // Type-safe Person instance!
// WWDC 301:8:14
```
How It Works (Constrained Decoding)
- `@Generable` macro generates schema at compile-time
- Schema passed to model automatically
- Model generates tokens constrained by schema
- Framework parses output into Swift type
- Guaranteed structural correctness — No hallucinated keys, no parsing errors

"Constrained decoding masks out invalid tokens. Model can only pick tokens valid according to schema."
Supported Types
Primitives:
- `String`, `Int`, `Float`, `Double`, `Bool`
Arrays:
```swift
@Generable
struct SearchSuggestions {
    var searchTerms: [String]
}
```
Nested/Composed:
```swift
@Generable
struct Itinerary {
    var destination: String
    var days: [DayPlan] // Composed type
}

@Generable
struct DayPlan {
    var activities: [String]
}
// WWDC 286:6:18
```
Enums with Associated Values:
```swift
@Generable
struct NPC {
    let name: String
    let encounter: Encounter

    @Generable
    enum Encounter {
        case orderCoffee(String)
        case wantToTalkToManager(complaint: String)
    }
}
// WWDC 301:10:49
```
Recursive Types:
```swift
@Generable
struct Itinerary {
    var destination: String
    var relatedItineraries: [Itinerary] // Recursive!
}
```
@Guide Constraints
Control generated values with @Guide:
Natural Language Description:
```swift
@Generable
struct NPC {
    @Guide(description: "A full name with first and last")
    let name: String
}
```
Numeric Ranges:
```swift
@Generable
struct Character {
    @Guide(.range(1...10))
    let level: Int
}
// WWDC 301:11:20
```
Array Count:
```swift
@Generable
struct Suggestions {
    @Guide(description: "Suggested search terms", .count(4))
    var searchTerms: [String]
}
// WWDC 286:5:32
```
Maximum Count:
```swift
@Generable
struct Result {
    @Guide(.maximumCount(3))
    let topics: [String]
}
```
Regex Patterns:
```swift
@Generable
struct NPC {
    @Guide(Regex {
        Capture {
            ChoiceOf {
                "Mr"
                "Mrs"
            }
        }
        ". "
        OneOrMore(.word)
    })
    let name: String
}
// Output: {name: "Mrs. Brewster"}
// WWDC 301:13:40
```
Property Order Matters
Properties generated in declaration order:
```swift
@Generable
struct Itinerary {
    var destination: String // Generated first
    var days: [DayPlan] // Generated second
    var summary: String // Generated last
}
```
"You may find model produces best summaries when they're last property."
Why: Later properties can reference earlier ones. Put most important properties first for streaming.
Pattern 3: Streaming with PartiallyGenerated (~1500 words)
Use when: Generation takes >1 second and you want progressive UI updates.
The Problem
存在的问题
Without streaming:
```swift
// User waits 3-5 seconds seeing nothing
let response = try await session.respond(to: prompt, generating: Itinerary.self)
// Then entire result appears at once
```
User experience: Feels slow, frozen UI.
不使用流式传输的情况:
swift
// User waits 3-5 seconds seeing nothing
let response = try await session.respond(to: prompt, generating: Itinerary.self)
// Then entire result appears at once用户体验:感觉缓慢,UI冻结。
The Solution: Streaming
```swift
@Generable
struct Itinerary {
    var name: String
    var days: [DayPlan]
}

let stream = session.streamResponse(
    to: "Generate a 3-day itinerary to Mt. Fuji",
    generating: Itinerary.self
)
for try await partial in stream {
    print(partial) // Incrementally updated
}
// WWDC 286:9:40
```
PartiallyGenerated Type
Every `@Generable` type gets a compiler-generated `PartiallyGenerated` counterpart:
```swift
// Compiler generates:
extension Itinerary {
    struct PartiallyGenerated {
        var name: String? // All properties optional!
        var days: [DayPlan]?
    }
}
```
Why optional: Properties fill in as model generates them.
SwiftUI Integration
```swift
struct ItineraryView: View {
    let session: LanguageModelSession
    @State private var itinerary: Itinerary.PartiallyGenerated?

    var body: some View {
        VStack {
            if let name = itinerary?.name {
                Text(name)
                    .font(.title)
            }
            if let days = itinerary?.days {
                ForEach(days, id: \.self) { day in
                    DayView(day: day)
                }
            }
            Button("Generate") {
                Task {
                    let stream = session.streamResponse(
                        to: "Generate 3-day itinerary to Tokyo",
                        generating: Itinerary.self
                    )
                    for try await partial in stream {
                        self.itinerary = partial
                    }
                }
            }
        }
    }
}
// WWDC 286:10:05
```
Animations & Transitions
Add polish:
```swift
if let name = itinerary?.name {
    Text(name)
        .transition(.opacity)
}
if let days = itinerary?.days {
    ForEach(days, id: \.self) { day in
        DayView(day: day)
            .transition(.slide)
    }
}
```
"Get creative with SwiftUI animations to hide latency. Turn waiting into delight."
View Identity
Critical for arrays:
```swift
// ✅ GOOD - Stable identity
ForEach(days, id: \.id) { day in
    DayView(day: day)
}

// ❌ BAD - Identity changes, animations break
ForEach(days.indices, id: \.self) { index in
    DayView(day: days[index])
}
```
Property Order for Streaming UX
```swift
// ✅ GOOD - Title appears first, summary last
@Generable
struct Itinerary {
    var name: String // Shows first
    var days: [DayPlan] // Shows second
    var summary: String // Shows last (can reference days)
}

// ❌ BAD - Summary before content
@Generable
struct Itinerary {
    var summary: String // Doesn't make sense before days!
    var days: [DayPlan]
}
// WWDC 286:11:00
```
When to Use Streaming
✅ Use for:
- Itineraries
- Stories
- Long descriptions
- Multi-section content
❌ Skip for:
- Simple Q&A (< 1 sentence)
- Quick classification
- Content tagging
Time Cost
Implementation: 15-20 minutes with SwiftUI
Polish (animations): +5-10 minutes
Pattern 4: Tool Calling (~2000 words)
Use when: Model needs external data (weather, locations, contacts) to generate response.
The Problem
```swift
// ❌ BAD - Model will hallucinate
let response = try await session.respond(
    to: "What's the temperature in Cupertino?"
)
// Output: "It's about 72°F" (completely made up!)
```
Why: 3B parameter model doesn't have real-time weather data.
The Solution: Tool Calling
Let model autonomously call your code to fetch external data.
```swift
import FoundationModels
import WeatherKit
import CoreLocation

struct GetWeatherTool: Tool {
    let name = "getWeather"
    let description = "Retrieve latest weather for a city"

    @Generable
    struct Arguments {
        @Guide(description: "The city to fetch weather for")
        var city: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        let places = try await CLGeocoder().geocodeAddressString(arguments.city)
        let weather = try await WeatherService.shared.weather(for: places.first!.location!)
        let temp = weather.currentWeather.temperature.value
        return ToolOutput("\(arguments.city)'s temperature is \(temp) degrees.")
    }
}
// WWDC 286:13:42
```
Attaching Tool to Session
```swift
let session = LanguageModelSession(
    tools: [GetWeatherTool()],
    instructions: "Help user with weather forecasts."
)
let response = try await session.respond(
    to: "What's the temperature in Cupertino?"
)
print(response.content)
// "It's 71°F in Cupertino!"
// WWDC 286:15:03
```
Model autonomously:
- Recognizes it needs weather data
- Calls `GetWeatherTool`
- Receives real temperature
- Incorporates into natural response
Tool Protocol Requirements
```swift
protocol Tool {
    var name: String { get }
    var description: String { get }
    associatedtype Arguments: Generable
    func call(arguments: Arguments) async throws -> ToolOutput
}
```
Name: Short, verb-based (e.g. `getWeather`, `findContact`)
Description: One sentence explaining purpose
Arguments: Must be `@Generable` (guarantees valid input)
call: Your code — fetch data, process, return
protocol Tool {
var name: String { get }
var description: String { get }
associatedtype Arguments: Generable
func call(arguments: Arguments) async throws -> ToolOutput
}Name:简短,动词开头(例如、)
Description:一句话说明用途
Arguments:必须是类型(保证输入有效)
call:你的代码——获取数据、处理、返回结果
getWeatherfindContact@GenerableToolOutput
ToolOutput
Two forms:
- Natural language (String):
```swift
return ToolOutput("Temperature is 71°F")
```
- Structured (GeneratedContent):
```swift
let content = GeneratedContent(properties: ["temperature": 71])
return ToolOutput(content)
```
Multiple Tools Example
swift
let session = LanguageModelSession(
tools: [
GetWeatherTool(),
FindRestaurantTool(),
FindHotelTool()
],
instructions: "Plan travel itineraries."
)
let response = try await session.respond(
to: "Create a 2-day plan for Tokyo"
)
// Model autonomously decides:
// - Calls FindRestaurantTool for dining
// - Calls FindHotelTool for accommodation
// - Calls GetWeatherTool to suggest activities
swift
let session = LanguageModelSession(
tools: [
GetWeatherTool(),
FindRestaurantTool(),
FindHotelTool()
],
instructions: "Plan travel itineraries."
)
let response = try await session.respond(
to: "Create a 2-day plan for Tokyo"
)
// Model autonomously decides:
// - Calls FindRestaurantTool for dining
// - Calls FindHotelTool for accommodation
// - Calls GetWeatherTool to suggest activities
Stateful Tools
有状态工具
Tools can maintain state across calls:
swift
class FindContactTool: Tool {
let name = "findContact"
let description = "Find contact from age generation"
var pickedContacts = Set<String>() // State!
@Generable
struct Arguments {
let generation: Generation
@Generable
enum Generation {
case babyBoomers
case genX
case millennial
case genZ
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Use Contacts API
var contacts = fetchContacts(for: arguments.generation)
// Remove already picked
contacts.removeAll(where: { pickedContacts.contains($0.name) })
guard let picked = contacts.randomElement() else {
return ToolOutput("No more contacts")
}
pickedContacts.insert(picked.name) // Update state
return ToolOutput(picked.name)
}
}// WWDC 301:21:55
Why class, not struct: Need to mutate state from call method.
工具可以在多次调用之间保持状态:
swift
class FindContactTool: Tool {
let name = "findContact"
let description = "Find contact from age generation"
var pickedContacts = Set<String>() // State!
@Generable
struct Arguments {
let generation: Generation
@Generable
enum Generation {
case babyBoomers
case genX
case millennial
case genZ
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Use Contacts API
var contacts = fetchContacts(for: arguments.generation)
// Remove already picked
contacts.removeAll(where: { pickedContacts.contains($0.name) })
guard let picked = contacts.randomElement() else {
return ToolOutput("No more contacts")
}
pickedContacts.insert(picked.name) // Update state
return ToolOutput(picked.name)
}
}// WWDC 301:21:55
为什么用class而不是struct:需要在call方法中修改状态。
Tool Calling Flow
工具调用流程
1. Session initialized with tools
2. User prompt: "What's Tokyo's weather?"
3. Model analyzes: "Need weather data"
4. Model generates tool call: getWeather(city: "Tokyo")
5. Framework calls your tool's call() method
6. Your tool fetches real data from API
7. Tool output inserted into transcript
8. Model generates final response using tool output
"Model decides autonomously when and how often to call tools. Can call multiple tools per request, even in parallel."
1. 使用工具初始化会话
2. 用户提示:"东京的天气如何?"
3. 模型分析:"需要天气数据"
4. 模型生成工具调用:getWeather(city: "Tokyo")
5. 框架调用你的工具的call()方法
6. 你的工具从API获取真实数据
7. 工具输出被插入到对话记录中
8. 模型使用工具输出生成最终响应
"模型会自主决定何时调用工具以及调用频率。可以在一次请求中调用多个工具,甚至并行调用。"
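Step 7 of this flow can be verified during development by dumping the session transcript. A sketch only; the Transcript.Entry case names below are assumptions, so check them against the framework before relying on this:

```swift
// Inspect what the model actually did during a tool-calling request.
for entry in session.transcript.entries {
    switch entry {
    case .toolCalls(let calls):
        print("Model requested tools: \(calls)")
    case .toolOutput(let output):
        print("Tool returned: \(output)")
    default:
        break // instructions, prompts, and responses
    }
}
```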
Tool Calling Guarantees
工具调用的保证
✅ Guaranteed:
- Valid tool names (no hallucinated tools)
- Valid arguments (via @Generable)
- Structural correctness
❌ Not guaranteed:
- Tool will be called (model might not need it)
- Specific argument values (model decides based on context)
✅ 保证:
- 有效的工具名称(无幻觉工具)
- 有效的参数(通过@Generable)
- 结构正确性
❌ 不保证:
- 工具会被调用(模型可能不需要)
- 特定的参数值(模型会根据上下文决定)
Real-World Example: Itinerary Planner
真实世界示例:行程规划器
swift
struct FindPointsOfInterestTool: Tool {
let name = "findPointsOfInterest"
let description = "Find restaurants, museums, parks near a landmark"
let landmark: String
@Generable
struct Arguments {
let category: Category
@Generable
enum Category {
case restaurant
case museum
case park
case marina
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Use MapKit
let request = MKLocalSearch.Request()
request.naturalLanguageQuery = "\(arguments.category) near \(landmark)"
let search = MKLocalSearch(request: request)
let response = try await search.start()
let names = response.mapItems.prefix(5).map { $0.name ?? "" }
return ToolOutput(names.joined(separator: ", "))
}
}
From WWDC 259 summary: "Tool fetches points of interest from MapKit. Model uses world knowledge to determine promising categories."
swift
struct FindPointsOfInterestTool: Tool {
let name = "findPointsOfInterest"
let description = "Find restaurants, museums, parks near a landmark"
let landmark: String
@Generable
struct Arguments {
let category: Category
@Generable
enum Category {
case restaurant
case museum
case park
case marina
}
}
func call(arguments: Arguments) async throws -> ToolOutput {
// Use MapKit
let request = MKLocalSearch.Request()
request.naturalLanguageQuery = "\(arguments.category) near \(landmark)"
let search = MKLocalSearch(request: request)
let response = try await search.start()
let names = response.mapItems.prefix(5).map { $0.name ?? "" }
return ToolOutput(names.joined(separator: ", "))
}
}
来自WWDC 259摘要:"工具从MapKit获取兴趣点。模型利用通用知识确定有前景的类别。"
When to Use Tools
何时使用工具
✅ Use for:
- Weather data
- Map/location queries
- Contact information
- Calendar events
- External APIs
❌ Don't use for:
- Information your data model already has
- Information in prompt/instructions
- Simple calculations (model can do these)
✅ 适合:
- 天气数据
- 地图/地点查询
- 联系人信息
- 日历事件
- 外部API
❌ 不适合:
- 数据模型已包含的信息
- 提示词/指令中已有的信息
- 简单计算(模型可以完成)
Time Cost
时间成本
Simple tool: 20-25 minutes
Complex tool with state: 30-40 minutes
简单工具:20-25分钟
带状态的复杂工具:30-40分钟
Pattern 5: Context Management (~1500 words)
模式5:上下文管理(约1500词)
Use when: Multi-turn conversations that might exceed 4096 token limit.
适用场景:可能超过4096token限制的多轮对话。
The Problem
存在的问题
swift
// Long conversation...
for i in 1...100 {
let response = try await session.respond(to: "Question \(i)")
// Eventually...
// Error: exceededContextWindowSize
}
Context window: 4096 tokens (input + output combined)
Average: ~3 characters per token in English
Rough calculation:
- 4096 tokens ≈ 12,000 characters
- ≈ 2,000-3,000 words total
Long conversation or verbose prompts/responses → Exceed limit
swift
// Long conversation...
for i in 1...100 {
let response = try await session.respond(to: "Question \(i)")
// Eventually...
// Error: exceededContextWindowSize
}
上下文窗口:4096个token(输入+输出总和)
平均:英文中每个token约3个字符
粗略计算:
- 4096个token ≈ 12,000个字符
- ≈ 2,000-3,000个单词
长对话或冗长的提示词/响应 → 超出限制
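The arithmetic above can be wrapped into a cheap pre-flight check. This heuristic (roughly 3 characters per token for English) is an estimate only; real counts come from the framework and vary with language and content:

```swift
// Rough pre-check using the ~3 characters/token estimate above.
func roughTokenEstimate(_ text: String) -> Int {
    max(1, text.count / 3)
}

// Leave headroom for the model's output, since the 4096-token
// window covers input and output combined.
func likelyFitsContext(prompt: String, reservedForOutput: Int = 1024) -> Bool {
    roughTokenEstimate(prompt) + reservedForOutput <= 4096
}
```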
Handling Context Overflow
处理上下文溢出
Basic: Start fresh session
基础方案:启动新会话
swift
var session = LanguageModelSession()
do {
let response = try await session.respond(to: prompt)
print(response.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session, no history
session = LanguageModelSession()
}// WWDC 301:3:37
Problem: Loses entire conversation history.
swift
var session = LanguageModelSession()
do {
let response = try await session.respond(to: prompt)
print(response.content)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session, no history
session = LanguageModelSession()
}// WWDC 301:3:37
问题:丢失整个对话历史。
Better: Condense Transcript
更好的方案:压缩对话记录
swift
var session = LanguageModelSession()
do {
let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session with condensed history
session = condensedSession(from: session)
}
func condensedSession(from previous: LanguageModelSession) -> LanguageModelSession {
let allEntries = previous.transcript.entries
var condensedEntries = [Transcript.Entry]()
// Always include first entry (instructions)
if let first = allEntries.first {
condensedEntries.append(first)
// Include last entry (most recent context)
if allEntries.count > 1, let last = allEntries.last {
condensedEntries.append(last)
}
}
let condensedTranscript = Transcript(entries: condensedEntries)
return LanguageModelSession(transcript: condensedTranscript)
}// WWDC 301:3:55
Why this works:
- Instructions always preserved
- Recent context retained
- Total tokens drastically reduced
swift
var session = LanguageModelSession()
do {
let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
// New session with condensed history
session = condensedSession(from: session)
}
func condensedSession(from previous: LanguageModelSession) -> LanguageModelSession {
let allEntries = previous.transcript.entries
var condensedEntries = [Transcript.Entry]()
// Always include first entry (instructions)
if let first = allEntries.first {
condensedEntries.append(first)
// Include last entry (most recent context)
if allEntries.count > 1, let last = allEntries.last {
condensedEntries.append(last)
}
}
let condensedTranscript = Transcript(entries: condensedEntries)
return LanguageModelSession(transcript: condensedTranscript)
}// WWDC 301:3:55
为什么有效:
- 始终保留指令
- 保留最近的上下文
- 大幅减少总token数
Advanced: Summarize Middle Entries
高级方案:总结中间对话记录
For long conversations where recent context isn't enough:
swift
func condensedSession(from previous: LanguageModelSession) async throws -> LanguageModelSession {
let entries = previous.transcript.entries
guard entries.count > 3 else {
return LanguageModelSession(transcript: previous.transcript)
}
// Keep first (instructions) and last (recent)
var condensedEntries = [entries.first!]
// Summarize middle entries
let middleEntries = Array(entries[1..<entries.count-1])
let summaryPrompt = """
Summarize this conversation in 2-3 sentences:
\(middleEntries.map { $0.content }.joined(separator: "\n"))
"""
// Use Foundation Models itself to summarize!
let summarySession = LanguageModelSession()
let summary = try await summarySession.respond(to: summaryPrompt)
condensedEntries.append(Transcript.Entry(content: summary.content))
condensedEntries.append(entries.last!)
return LanguageModelSession(transcript: Transcript(entries: condensedEntries))
}
"You could summarize parts of transcript with Foundation Models itself."
对于需要更多上下文的长对话:
swift
func condensedSession(from previous: LanguageModelSession) async throws -> LanguageModelSession {
let entries = previous.transcript.entries
guard entries.count > 3 else {
return LanguageModelSession(transcript: previous.transcript)
}
// Keep first (instructions) and last (recent)
var condensedEntries = [entries.first!]
// Summarize middle entries
let middleEntries = Array(entries[1..<entries.count-1])
let summaryPrompt = """
Summarize this conversation in 2-3 sentences:
\(middleEntries.map { $0.content }.joined(separator: "\n"))
"""
// Use Foundation Models itself to summarize!
let summarySession = LanguageModelSession()
let summary = try await summarySession.respond(to: summaryPrompt)
condensedEntries.append(Transcript.Entry(content: summary.content))
condensedEntries.append(entries.last!)
return LanguageModelSession(transcript: Transcript(entries: condensedEntries))
}
"你可以使用Foundation Models本身来总结对话记录的部分内容。"
Preventing Context Overflow
防止上下文溢出
1. Keep prompts concise:
swift
// ❌ BAD
let prompt = """
I want you to generate a comprehensive detailed analysis of this article
with multiple sections including summary, key points, sentiment analysis,
main arguments, counter arguments, logical fallacies, and conclusions...
"""
// ✅ GOOD
let prompt = "Summarize this article's key points"
2. Use tools for data:
Instead of putting entire dataset in prompt, use tools to fetch on-demand.
3. Break complex tasks into steps:
swift
// ❌ BAD - One massive generation
let response = try await session.respond(
to: "Create 7-day itinerary with hotels, restaurants, activities..."
)
// ✅ GOOD - Multiple smaller generations
let overview = try await session.respond(to: "Create high-level 7-day plan")
for day in 1...7 {
let details = try await session.respond(to: "Detail activities for day \(day)")
}
1. 保持提示词简洁:
swift
// ❌ BAD
let prompt = """
I want you to generate a comprehensive detailed analysis of this article
with multiple sections including summary, key points, sentiment analysis,
main arguments, counter arguments, logical fallacies, and conclusions...
"""
// ✅ GOOD
let prompt = "Summarize this article's key points"
2. 使用工具获取数据:
不要将整个数据集放入提示词中,使用工具按需获取。
3. 将复杂任务拆分为步骤:
swift
// ❌ BAD - One massive generation
let response = try await session.respond(
to: "Create 7-day itinerary with hotels, restaurants, activities..."
)
// ✅ GOOD - Multiple smaller generations
let overview = try await session.respond(to: "Create high-level 7-day plan")
for day in 1...7 {
let details = try await session.respond(to: "Detail activities for day \(day)")
}
Monitoring Context Usage
监控上下文使用情况
"Each token in instructions and prompt adds latency. Longer outputs take longer."
Use Instruments (Foundation Models template) to:
- See token counts
- Identify verbose prompts
- Optimize context usage
- Quantify improvements
"指令和提示词中的每个token都会增加延迟。更长的输出需要更多时间。"
使用Instruments(Foundation Models模板)来:
- 查看token计数
- 识别冗长的提示词
- 优化上下文使用
- 量化改进效果
Time Cost
时间成本
Basic overflow handling: 5-10 minutes
Condensing strategy: 15-20 minutes
Advanced summarization: 30-40 minutes
基础溢出处理:5-10分钟
压缩策略:15-20分钟
高级总结:30-40分钟
Pattern 6: Sampling & Generation Options (~1000 words)
模式6:采样与生成选项(约1000词)
Use when: You need control over output randomness/determinism.
适用场景:你需要控制输出的随机性/确定性。
Understanding Sampling
理解采样
Model generates output one token at a time:
- Creates probability distribution for next token
- Samples from distribution
- Picks token
- Repeats
Default: Random sampling → Different output each time
模型逐个token生成输出:
- 为下一个token创建概率分布
- 从分布中采样
- 选择token
- 重复
默认:随机采样 → 每次输出不同
Deterministic Output (Greedy)
确定性输出(贪婪采样)
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(sampling: .greedy)
)// WWDC 301:6:14
Use cases:
- Repeatable demos
- Testing/debugging
- Consistent results required
Caveat: Only holds for same model version. OS updates may change output.
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(sampling: .greedy)
)// WWDC 301:6:14
适用场景:
- 可重复的演示
- 测试/调试
- 需要一致的结果
注意事项:仅在相同模型版本下有效。系统更新可能会改变输出。
Temperature Control
温度控制
Low variance (conservative, focused):
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 0.5)
)
High variance (creative, diverse):
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 2.0)
)// WWDC 301:6:14
Temperature scale:
- 0.1-0.5: Very focused, predictable
- 1.0 (default): Balanced
- 1.5-2.0: Creative, varied
Example use cases:
- Low temp: Fact extraction, classification
- High temp: Creative writing, brainstorming
低方差(保守、聚焦):
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 0.5)
)
高方差(创意、多样):
swift
let response = try await session.respond(
to: prompt,
options: GenerationOptions(temperature: 2.0)
)// WWDC 301:6:14
温度范围:
- 0.1-0.5:非常聚焦、可预测
- 1.0(默认):平衡
- 1.5-2.0:创意、多样
示例场景:
- 低温度:事实提取、分类
- 高温度:创意写作、头脑风暴
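One way to keep these choices consistent across a codebase is a small mapping from task type to options. The task categories and exact temperature values here are illustrative choices, not framework API:

```swift
enum GenerationTask {
    case extraction, classification   // focused, repeatable output
    case creativeWriting, brainstorm  // varied, diverse output
}

func generationOptions(for task: GenerationTask) -> GenerationOptions {
    switch task {
    case .extraction, .classification:
        return GenerationOptions(temperature: 0.3)
    case .creativeWriting, .brainstorm:
        return GenerationOptions(temperature: 1.8)
    }
}
```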
When to Adjust Sampling
何时调整采样方式
✅ Greedy for:
- Unit tests
- Demos
- Consistency critical
✅ Low temperature for:
- Factual tasks
- Classification
- Extraction
✅ High temperature for:
- Creative content
- Story generation
- Varied NPC dialog
✅ 贪婪采样适合:
- 单元测试
- 演示
- 对一致性要求高的场景
✅ 低温度适合:
- 事实性任务
- 分类
- 提取
✅ 高温度适合:
- 创意内容
- 故事生成
- 多样的NPC对话
Time Cost
时间成本
Implementation: 2-3 minutes (one line change)
实现:2-3分钟(只需修改一行代码)
Pressure Scenarios
压力场景
Scenario 1: "Just Use ChatGPT API" (~1000 words)
场景1:"直接用ChatGPT API"(约1000词)
Context: You're implementing a new AI feature. PM suggests using ChatGPT API for "better results."
Pressure signals:
- 👔 Authority: PM outranks you
- 💸 Existing integration: Team already uses OpenAI for other features
- ⏰ Speed: "ChatGPT is proven, Foundation Models is new"
Rationalization traps:
- "PM knows best"
- "ChatGPT gives better answers"
- "Faster to implement with existing code"
Why this fails:
- Privacy violation: User data sent to external server
- Medical notes, financial docs, personal messages
- Violates user expectation of on-device privacy
- Potential GDPR/privacy law issues
- Cost: Every API call costs money
- Foundation Models is free
- Scale to millions of users = massive costs
- Offline unavailable: Requires internet
- Airplane mode, poor signal → feature broken
- Foundation Models works offline
- Latency: Network round-trip adds 500-2000ms
- Foundation Models: On-device, <100ms startup
When ChatGPT IS appropriate:
- World knowledge required (e.g. "Who is the president of France?")
- Complex reasoning (multi-step logic, math proofs)
- Very long context (>4096 tokens)
Mandatory response:
"I understand ChatGPT delivers great results for certain tasks. However,
for this feature, Foundation Models is the right choice for three critical reasons:
1. **Privacy**: This feature processes [medical notes/financial data/personal content].
Users expect this data stays on-device. Sending to external API violates that trust
and may have compliance issues.
2. **Cost**: At scale, ChatGPT API calls cost $X per 1000 requests. Foundation Models
is free. For Y million users, that's $Z annually we can avoid.
3. **Offline capability**: Foundation Models works without internet. Users in airplane
mode or with poor signal still get full functionality.
**When to use ChatGPT**: If this feature required world knowledge or complex reasoning,
ChatGPT would be the right choice. But this is [summarization/extraction/classification],
which is exactly what Foundation Models is optimized for.
**Time estimate**: Foundation Models implementation: 15-20 minutes.
Privacy compliance review for ChatGPT: 2-4 weeks."
Time saved: Privacy compliance review vs correct implementation: 2-4 weeks vs 20 minutes
背景:你正在实现一个新的AI功能。产品经理建议使用ChatGPT API以获得“更好的结果”。
压力信号:
- 👔 权威:产品经理的职级比你高
- 💸 现有集成:团队已经在其他功能中使用OpenAI
- ⏰ 速度:"ChatGPT已经成熟,Foundation Models是新技术"
合理化陷阱:
- "产品经理最清楚"
- "ChatGPT的答案更好"
- "用现有代码实现更快"
为什么这种做法会失败:
- 隐私违规:用户数据被发送到外部服务器
- 医疗记录、财务文档、个人消息
- 违反用户对端侧隐私的期望
- 可能违反GDPR或其他隐私法规
- 成本:每次API调用都需要付费
- Foundation Models是免费的
- 扩展到数百万用户会产生巨额成本
- 离线不可用:需要互联网
- 飞行模式、信号差 → 功能失效
- Foundation Models可离线工作
- 延迟:网络往返会增加500-2000ms延迟
- Foundation Models:端侧运行,启动延迟<100ms
何时ChatGPT是合适的:
- 需要通用知识(例如"法国总统是谁?")
- 需要复杂推理(多步骤逻辑、数学证明)
- 需要非常长的上下文(>4096token)
必选回应:
"我理解ChatGPT在某些任务上表现出色。然而,对于这个功能,Foundation Models是更合适的选择,主要有三个关键原因:
1. **隐私**:该功能处理[医疗记录/财务数据/个人内容]。用户期望这些数据保留在设备上。发送到外部API会违背这种信任,还可能引发合规问题。
2. **成本**:大规模使用时,ChatGPT API每1000次请求需要花费$X。而Foundation Models是免费的。对于Y百万用户,我们可以每年节省$Z的成本。
3. **离线能力**:Foundation Models无需互联网即可工作。处于飞行模式或信号差的用户仍然可以使用完整功能。
**何时使用ChatGPT**:如果该功能需要通用知识或复杂推理,ChatGPT会是合适的选择。但当前功能是[摘要/提取/分类],这正是Foundation Models优化的场景。
**时间估算**:Foundation Models实现需要15-20分钟。而ChatGPT的隐私合规审查需要2-4周。"
节省的时间:隐私合规审查(2-4周) vs 正确实现(20分钟)
Scenario 2: "Parse JSON Manually" (~1000 words)
场景2:"手动解析JSON"(约1000词)
Context: Teammate suggests prompting for JSON, parsing with JSONDecoder. Claims it's "simple and familiar."
Pressure signals:
- ⏰ Deadline: Ship in 2 days
- 📚 Familiarity: "Everyone knows JSON"
- 🔧 Existing code: Already have JSON parsing utilities
Rationalization traps:
- "JSON is standard"
- "We parse JSON everywhere already"
- "Faster than learning new API"
Why this fails:
- Hallucinated keys: Model outputs {firstName: "John"} when you expect {name: "John"}
- JSONDecoder crashes: keyNotFound
- No compile-time safety
- Invalid JSON: Model might output:
Here's the person: {name: "John", age: 30}
- Not valid JSON (preamble text)
- Parsing fails
- No type safety: Manual string parsing, prone to errors
Real-world example:
swift
// ❌ BAD - Will fail
let prompt = "Generate a person with name and age as JSON"
let response = try await session.respond(to: prompt)
// Model outputs: {"firstName": "John Smith", "years": 30}
// Your code expects: {"name": ..., "age": ...}
// CRASH: keyNotFound(name)
Debugging time: 2-4 hours finding edge cases, writing parsing hacks
Correct approach:
swift
// ✅ GOOD - 15 minutes, guaranteed to work
@Generable
struct Person {
let name: String
let age: Int
}
let response = try await session.respond(
to: "Generate a person",
generating: Person.self
)
// response.content is type-safe Person, always valid
Mandatory response:
"I understand JSON parsing feels familiar, but for LLM output, @Generable is objectively
better for three technical reasons:
1. **Constrained decoding guarantees structure**: Model can ONLY generate valid Person
instances. Impossible to get wrong keys, invalid JSON, or missing fields.
2. **No parsing code needed**: Framework handles parsing automatically. Zero chance of
parsing bugs.
3. **Compile-time safety**: If we change Person struct, compiler catches all issues.
Manual JSON parsing = runtime crashes.
**Real cost**: Manual JSON approach will hit edge cases. Debugging 'keyNotFound' crashes
takes 2-4 hours. @Generable implementation takes 15 minutes and has zero parsing bugs.
**Analogy**: This is like choosing Swift over Objective-C for new code. Both work, but
Swift's type safety prevents entire categories of bugs."
Time saved: 4-8 hours debugging vs 15 minutes correct implementation
背景:同事建议提示生成JSON,然后用JSONDecoder解析。声称这“简单且熟悉”。
压力信号:
- ⏰ 截止日期:2天内上线
- 📚 熟悉度:"每个人都懂JSON"
- 🔧 现有代码:已经有JSON解析工具
合理化陷阱:
- "JSON是标准"
- "我们已经在各处解析JSON"
- "比学习新API更快"
为什么这种做法会失败:
- 键幻觉:模型输出{firstName: "John"},而你期望的是{name: "John"}
- JSONDecoder会崩溃:keyNotFound
- 无编译时安全
- 无效JSON:模型可能输出:
Here's the person: {name: "John", age: 30}
- 不是有效的JSON(有前缀文本)
- 解析失败
- 无类型安全:手动字符串解析,容易出错
真实世界示例:
swift
// ❌ BAD - Will fail
let prompt = "Generate a person with name and age as JSON"
let response = try await session.respond(to: prompt)
// Model outputs: {"firstName": "John Smith", "years": 30}
// Your code expects: {"name": ..., "age": ...}
// CRASH: keyNotFound(name)
调试时间:2-4小时寻找边缘案例,编写临时解析补丁
正确做法:
swift
// ✅ GOOD - 15 minutes, guaranteed to work
@Generable
struct Person {
let name: String
let age: Int
}
let response = try await session.respond(
to: "Generate a person",
generating: Person.self
)
// response.content is type-safe Person, always valid
必选回应:
"我理解JSON解析感觉很熟悉,但对于大语言模型的输出,@Generable在技术上更优,主要有三个原因:
1. **约束解码保证结构**:模型只能生成有效的Person实例。不可能出现错误的键、无效JSON或缺失字段。
2. **无需解析代码**:框架会自动处理解析。完全没有解析错误的可能。
3. **编译时安全**:如果我们修改Person结构体,编译器会捕获所有问题。手动JSON解析会导致运行时崩溃。
**真实成本**:手动JSON方法会遇到边缘案例。调试'keyNotFound'崩溃需要2-4小时。而@Generable实现只需15分钟,且没有解析错误。
**类比**:这就像为新代码选择Swift而非Objective-C。两者都能工作,但Swift的类型安全可以避免一整类错误。"
节省的时间:4-8小时调试 vs 15分钟正确实现
Scenario 3: "One Big Prompt" (~1000 words)
场景3:"一个大提示词"(约1000词)
Context: Feature requires extracting name, date, amount, category from invoice. Teammate suggests one prompt: "Extract all information."
Pressure signals:
- 🏗️ Architecture: "Simpler with one API call"
- ⏰ Speed: "Why make it complicated?"
- 📉 Complexity: "More prompts = more code"
Rationalization traps:
- "Simpler is better"
- "One prompt means less code"
- "Model is smart enough"
Why this fails:
- Context overflow: Complex prompt + large invoice → Exceeds 4096 tokens
- Poor results: Model tries to do too much at once, quality suffers
- Slow generation: One massive response takes 5-8 seconds
- All-or-nothing: If one field fails, entire generation fails
Better approach: Break into tasks + use tools
swift
// ❌ BAD - One massive prompt
let prompt = """
Extract from this invoice:
- Vendor name
- Invoice date
- Total amount
- Line items (description, quantity, price each)
- Payment terms
- Due date
- Tax amount
...
"""
// 4 seconds, poor quality, might exceed context
// ✅ GOOD - Structured extraction with focused prompts
@Generable
struct InvoiceBasics {
let vendor: String
let date: String
let amount: Double
}
let basics = try await session.respond(
to: "Extract vendor, date, and amount",
generating: InvoiceBasics.self
) // 0.5 seconds, high quality
@Generable
struct LineItem {
let description: String
let quantity: Int
let price: Double
}
let items = try await session.respond(
to: "Extract line items",
generating: [LineItem].self
) // 1 second, high quality
// Total: 1.5 seconds, better quality, graceful partial failures
Mandatory response:
"I understand the appeal of one simple API call. However, this specific task requires
a different approach:
1. **Context limits**: Invoice + complex extraction prompt will likely exceed 4096 token
limit. Multiple focused prompts stay well under limit.
2. **Better quality**: Model performs better with focused tasks. 'Extract vendor name'
gets 95%+ accuracy. 'Extract everything' gets 60-70%.
3. **Faster perceived performance**: Multiple prompts with streaming show progressive
results. Users see vendor name in 0.5s, not waiting 5s for everything.
4. **Graceful degradation**: If line items fail, we still have basics. All-or-nothing
approach means total failure.
**Implementation**: Breaking into 3-4 focused extractions takes 30 minutes. One big
prompt takes 2-3 hours debugging why it hits context limit and produces poor results."
Time saved: 2-3 hours debugging vs 30 minutes proper design
背景:功能需要从发票中提取姓名、日期、金额、类别。同事建议使用一个提示词:"提取所有信息。"
压力信号:
- 🏗️ 架构:"一个API调用更简单"
- ⏰ 速度:"为什么要复杂化?"
- 📉 复杂度:"更多提示词意味着更多代码"
合理化陷阱:
- "越简单越好"
- "一个提示词意味着更少的代码"
- "模型足够智能"
为什么这种做法会失败:
- 上下文溢出:复杂提示词 + 大发票 → 超过4096token限制
- 结果质量差:模型试图一次完成太多任务,质量下降
- 生成缓慢:一个庞大的响应需要5-8秒
- 全有或全无:如果一个字段失败,整个生成都会失败
更好的做法:拆分为任务 + 使用工具
swift
// ❌ BAD - One massive prompt
let prompt = """
Extract from this invoice:
- Vendor name
- Invoice date
- Total amount
- Line items (description, quantity, price each)
- Payment terms
- Due date
- Tax amount
...
"""
// 4 seconds, poor quality, might exceed context
// ✅ GOOD - Structured extraction with focused prompts
@Generable
struct InvoiceBasics {
let vendor: String
let date: String
let amount: Double
}
let basics = try await session.respond(
to: "Extract vendor, date, and amount",
generating: InvoiceBasics.self
) // 0.5 seconds, 高质量
@Generable
struct LineItem {
let description: String
let quantity: Int
let price: Double
}
let items = try await session.respond(
to: "Extract line items",
generating: [LineItem].self
) // 1 second, 高质量
// Total: 1.5 seconds, better quality, graceful partial failures
必选回应:
"我理解一个简单API调用的吸引力。然而,这个特定任务需要不同的方法:
1. **上下文限制**:发票 + 复杂提取提示词很可能会超过4096token限制。多个聚焦的提示词会保持在限制内。
2. **更好的质量**:模型在处理聚焦任务时表现更好。'提取供应商名称'的准确率可达95%以上。而'提取所有信息'的准确率只有60-70%。
3. **更快的感知性能**:多个提示词结合流式传输可以逐步显示结果。用户会在0.5秒内看到供应商名称,而不是等待5秒才能看到所有内容。
4. **优雅降级**:如果行项目提取失败,我们仍然可以获取基础信息。全有或全无的方法会导致完全失败。
**实现**:拆分为3-4个聚焦的提取任务需要30分钟。而一个大提示词需要2-3小时调试,解决上下文限制和结果质量差的问题。"
节省的时间:2-3小时调试 vs 30分钟合理设计
Performance Optimization
性能优化
1. Prewarm Session (~200 words)
1. 预启动会话(约200词)
Problem: First generation takes 1-2 seconds just to load model.
Solution: Create session before user interaction.
swift
class ViewModel: ObservableObject {
private var session: LanguageModelSession?
init() {
// Prewarm on init, not when user taps button
Task {
self.session = LanguageModelSession(instructions: "...")
}
}
func generate(prompt: String) async throws -> String {
// Fall back to a fresh session if prewarming hasn't finished yet
let session = self.session ?? LanguageModelSession(instructions: "...")
let response = try await session.respond(to: prompt)
return response.content
}
}
"Prewarming session before user interaction reduces initial latency."
Time saved: 1-2 seconds off first generation
问题:首次生成需要1-2秒加载模型。
解决方案:在用户交互之前创建会话。
swift
class ViewModel: ObservableObject {
private var session: LanguageModelSession?
init() {
// 在初始化时预启动,而不是用户点击按钮时
Task {
self.session = LanguageModelSession(instructions: "...")
}
}
func generate(prompt: String) async throws -> String {
// 如果预启动尚未完成,则回退为新建会话
let session = self.session ?? LanguageModelSession(instructions: "...")
let response = try await session.respond(to: prompt)
return response.content
}
}
"在用户交互之前预启动会话可以减少初始延迟。"
节省的时间:首次生成减少1-2秒延迟
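A related option, assuming the prewarm() method on LanguageModelSession (verify against the current SDK), is to ask the framework explicitly to load model assets before the first request:

```swift
let session = LanguageModelSession(instructions: "...")
// Hint that a request is coming soon so model assets load early.
session.prewarm()
```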
2. includeSchemaInPrompt: false (~200 words)
2. includeSchemaInPrompt: false(约200词)
Problem: @Generable schemas inserted into prompt, increases token count.
Solution: For subsequent requests with same schema, skip insertion.
swift
let firstResponse = try await session.respond(
to: "Generate first person",
generating: Person.self
// Schema inserted automatically
)
// Subsequent requests with SAME schema
let secondResponse = try await session.respond(
to: "Generate another person",
generating: Person.self,
includeSchemaInPrompt: false
)
"Setting includeSchemaInPrompt to false decreases token count and latency for subsequent requests."
When to use: Multi-turn with same @Generable type
Time saved: 10-20% latency reduction per request
问题:@Generable架构会被插入到提示词中,增加token计数。
解决方案:对于后续使用相同架构的请求,跳过插入。
swift
let firstResponse = try await session.respond(
to: "Generate first person",
generating: Person.self
// 架构会自动插入
)
// 后续使用相同架构的请求
let secondResponse = try await session.respond(
to: "Generate another person",
generating: Person.self,
includeSchemaInPrompt: false
)
"将includeSchemaInPrompt设置为false可以减少后续请求的token计数和延迟。"
何时使用:使用相同@Generable类型的多轮对话
节省的时间:每个请求减少10-20%延迟
3. Property Order for Streaming UX (~200 words)
3. 为流式传输UX优化属性顺序(约200词)
Problem: User waits for entire generation.
Solution: Put important properties first, stream to show early.
swift
// ✅ GOOD - Title shows immediately
@Generable
struct Article {
var title: String // Shows in 0.2s
var summary: String // Shows in 0.8s
var fullText: String // Shows in 2.5s
}
// ❌ BAD - Wait for everything
@Generable
struct Article {
var fullText: String // User waits 2.5s
var title: String
var summary: String
}
UX impact: Perceived latency drops from 2.5s to 0.2s
问题:用户需要等待整个生成过程。
解决方案:将重要属性放在前面,通过流式传输提前显示。
swift
// ✅ GOOD - Title shows immediately
@Generable
struct Article {
var title: String // 0.2秒内显示
var summary: String // 0.8秒内显示
var fullText: String // 2.5秒内显示
}
// ❌ BAD - Wait for everything
@Generable
struct Article {
var fullText: String // 用户需要等待2.5秒
var title: String
var summary: String
}
UX影响:感知延迟从2.5秒降至0.2秒
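Property order only pays off when the UI consumes partial snapshots as they stream. A sketch of that loop; it assumes streamResponse(to:generating:) yields partially generated Article values whose properties are optional until filled in:

```swift
let stream = session.streamResponse(
    to: "Write an article about on-device AI",
    generating: Article.self
)
for try await partial in stream {
    // Earlier properties fill in first: title, then summary, then fullText.
    if let title = partial.title {
        print("Title ready: \(title)")
    }
}
```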
4. Foundation Models Instrument (~100 words)
4. Foundation Models性能分析工具(约100词)
Use Instruments app with Foundation Models template to:
- Profile latency of each request
- See token counts (input/output)
- Identify optimization opportunities
- Quantify improvements
"New Instruments profiling template lets you observe areas of optimization and quantify improvements."
Access: Instruments → Create → Foundation Models template
使用Instruments应用的Foundation Models模板来:
- 分析每个请求的延迟
- 查看token计数(输入/输出)
- 识别优化机会
- 量化改进效果
"新的Instruments性能分析模板可以让你观察优化空间并量化改进效果。"
访问方式:Instruments → 创建 → Foundation Models模板
Checklist
检查清单
Before shipping Foundation Models features:
在发布Foundation Models功能之前:
Required Checks
必选检查
- Availability checked before creating session
- Using @Generable for structured output (not manual JSON)
- Handling context overflow (exceededContextWindowSize)
- Handling guardrail violations (guardrailViolation)
- Handling unsupported language (unsupportedLanguageOrLocale)
- Streaming for long generations (>1 second)
- Not blocking UI (using Task {} for async)
- Tools for external data (not prompting for weather/locations)
- Prewarmed session if latency-sensitive
- 已检查可用性,然后再创建会话
- 使用@Generable处理结构化输出(而非手动JSON)
- 已处理上下文溢出(exceededContextWindowSize)
- 已处理内容规范违规(guardrailViolation)
- 已处理不支持的语言(unsupportedLanguageOrLocale)
- 长生成任务使用流式传输(>1秒)
- 未阻塞UI(使用Task {}处理异步)
- 使用工具获取外部数据(而非提示获取天气/地点)
- 已预启动会话(如果对延迟敏感)
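The first required check has a standard shape. A sketch of the availability gate, assuming the SystemLanguageModel availability API (case payloads may differ; verify against the SDK):

```swift
import FoundationModels

let model = SystemLanguageModel.default
switch model.availability {
case .available:
    let session = LanguageModelSession()
    // proceed with generation
case .unavailable(let reason):
    // e.g. device not eligible, Apple Intelligence disabled, model downloading
    showUnavailableMessage(for: reason) // hypothetical fallback UI helper
}
```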
Best Practices
最佳实践
- Instructions are concise (not verbose)
- Never interpolating user input into instructions
- Property order optimized for streaming UX
- Using appropriate temperature/sampling
- Tested on real device (not just simulator)
- Profiled with Instruments (Foundation Models template)
- Error handling shows graceful UI messages
- Tested offline (airplane mode)
- Tested with long conversations (context handling)
- 指令简洁(不冗长)
- 从未将用户输入插入到指令中
- 属性顺序针对流式传输UX进行了优化
- 使用了合适的温度/采样方式
- 在真实设备上测试(而非仅模拟器)
- 使用Instruments进行性能分析(Foundation Models模板)
- 错误处理显示优雅的UI提示
- 已离线测试(飞行模式)
- 已测试长对话(上下文处理)
Model Capability
模型能力
- Not using for world knowledge
- Not using for complex reasoning
- Use case is: summarization, extraction, classification, or generation
- Have fallback if unavailable (show message, disable feature)
- 未用于通用知识
- 未用于复杂推理
- 使用场景为:摘要、提取、分类或生成
- 有不可用时的回退方案(显示提示、禁用功能)
Resources
资源
WWDC: 286, 259, 301
Skills: axiom-foundation-models-diag, axiom-foundation-models-ref
Last Updated: 2025-12-03
Version: 1.0.0
Target: iOS 26+, macOS 26+, iPadOS 26+, visionOS 26+
WWDC:286, 259, 301
技能:axiom-foundation-models-diag, axiom-foundation-models-ref
最后更新:2025-12-03
版本:1.0.0
支持平台:iOS 26+, macOS 26+, iPadOS 26+, visionOS 26+