Speech Recognition

Transcribe live and pre-recorded audio to text using Apple's Speech framework. Covers SFSpeechRecognizer (iOS 10+) and the new SpeechAnalyzer API (iOS 26+).

SpeechAnalyzer (iOS 26+)

SpeechAnalyzer is an actor-based API introduced in iOS 26 that replaces SFSpeechRecognizer for new projects. It uses Swift concurrency, delivers results through AsyncSequence, and supports modular analysis via SpeechTranscriber.

Basic transcription with SpeechAnalyzer

swift
import Speech
import AVFoundation

// 1. Create a transcriber module
guard let locale = await SpeechTranscriber.supportedLocale(
    equivalentTo: Locale.current
) else { return }
let transcriber = SpeechTranscriber(locale: locale, preset: .offlineTranscription)

// 2. Ensure assets are installed
if let request = try await AssetInventory.assetInstallationRequest(
    supporting: [transcriber]
) {
    try await request.downloadAndInstall()
}

// 3. Create input stream and analyzer
let (inputSequence, inputBuilder) = AsyncStream.makeStream(of: AnalyzerInput.self)
let audioFormat = await SpeechAnalyzer.bestAvailableAudioFormat(
    compatibleWith: [transcriber]
)
let analyzer = SpeechAnalyzer(modules: [transcriber])

// 4. Feed audio buffers (from AVAudioEngine or a file)
Task {
    // `audioBuffers` is a placeholder for your real buffer source;
    // convert each buffer to audioFormat before yielding it.
    for pcmBuffer: AVAudioPCMBuffer in audioBuffers {
        inputBuilder.yield(AnalyzerInput(buffer: pcmBuffer))
    }
    inputBuilder.finish()
}

// 5. Consume results
Task {
    for try await result in transcriber.results {
        let text = String(result.text.characters)
        print(text)
    }
}

// 6. Run analysis
let lastSampleTime = try await analyzer.analyzeSequence(inputSequence)

// 7. Finalize
if let lastSampleTime {
    try await analyzer.finalizeAndFinish(through: lastSampleTime)
} else {
    await analyzer.cancelAndFinishNow()
}

Transcribing an audio file with SpeechAnalyzer

swift
let transcriber = SpeechTranscriber(locale: locale, preset: .offlineTranscription)
let audioFile = try AVAudioFile(forReading: fileURL)
let analyzer = try await SpeechAnalyzer(
    inputAudioFile: audioFile, modules: [transcriber], finishAfterFile: true
)
for try await result in transcriber.results {
    print(String(result.text.characters))
}

Key differences from SFSpeechRecognizer

Feature        SFSpeechRecognizer             SpeechAnalyzer
Concurrency    Callbacks/delegates            async/await + AsyncSequence
Type           class                          actor
Modules        Monolithic                     Composable (SpeechTranscriber, SpeechDetector)
Audio input    append(_:) on request          AsyncStream<AnalyzerInput>
Availability   iOS 10+                        iOS 26+
On-device      requiresOnDeviceRecognition    Asset-based via AssetInventory
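
Since SpeechAnalyzer requires iOS 26, apps that also target earlier releases need a runtime check to choose an API path; a minimal sketch:

swift
// Route to the modern API where available; fall back otherwise.
if #available(iOS 26, *) {
    // SpeechAnalyzer + SpeechTranscriber path (see above)
} else {
    // SFSpeechRecognizer path (see below)
}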

SFSpeechRecognizer Setup

Creating a recognizer with locale

swift
import Speech

// Default locale (user's current language)
let defaultRecognizer = SFSpeechRecognizer()

// Specific locale
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))

// Check that recognition is available for this locale
guard let recognizer, recognizer.isAvailable else {
    print("Speech recognition not available")
    return
}
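
To discover which locales can be recognized at all, SFSpeechRecognizer exposes a static supportedLocales() set; a quick sketch:

swift
// Print every locale the Speech framework can recognize, sorted for readability.
for locale in SFSpeechRecognizer.supportedLocales()
    .sorted(by: { $0.identifier < $1.identifier }) {
    print(locale.identifier)
}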

Monitoring availability changes

swift
final class SpeechManager: NSObject, SFSpeechRecognizerDelegate {
    private let recognizer = SFSpeechRecognizer()!

    override init() {
        super.init()
        recognizer.delegate = self
    }

    func speechRecognizer(
        _ speechRecognizer: SFSpeechRecognizer,
        availabilityDidChange available: Bool
    ) {
        // Update UI — disable record button when unavailable
    }
}

Authorization

Request both speech recognition and microphone permissions before starting live transcription. Add these keys to Info.plist:
  • NSSpeechRecognitionUsageDescription
  • NSMicrophoneUsageDescription
swift
import Speech
import AVFoundation

func requestPermissions() async -> Bool {
    let speechStatus = await withCheckedContinuation { continuation in
        SFSpeechRecognizer.requestAuthorization { status in
            continuation.resume(returning: status)
        }
    }
    guard speechStatus == .authorized else { return false }

    let micStatus: Bool
    if #available(iOS 17, *) {
        micStatus = await AVAudioApplication.requestRecordPermission()
    } else {
        micStatus = await withCheckedContinuation { continuation in
            AVAudioSession.sharedInstance().requestRecordPermission { granted in
                continuation.resume(returning: granted)
            }
        }
    }
    return micStatus
}
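
A usage sketch: gate capture behind the permission check (startRecording() here is a stand-in for your own entry point):

swift
Task {
    guard await requestPermissions() else {
        // Denied — consider directing the user to Settings
        return
    }
    startRecording()  // hypothetical: your capture entry point
}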

Live Microphone Transcription

The standard pattern: AVAudioEngine captures microphone audio → buffers are appended to SFSpeechAudioBufferRecognitionRequest → results stream in.
swift
import Speech
import AVFoundation

final class LiveTranscriber {
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
    private let audioEngine = AVAudioEngine()
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?

    func startTranscribing() throws {
        // Cancel any in-progress task
        recognitionTask?.cancel()
        recognitionTask = nil

        // Configure audio session
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)

        // Create request
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        self.recognitionRequest = request

        // Start recognition task
        recognitionTask = recognizer.recognitionTask(with: request) { [weak self] result, error in
            guard let self else { return }
            if let result {
                let text = result.bestTranscription.formattedString
                print("Transcription: \(text)")

                if result.isFinal {
                    self.stopTranscribing()
                }
            }
            if let error {
                print("Recognition error: \(error)")
                self.stopTranscribing()
            }
        }

        // Install audio tap
        let inputNode = audioEngine.inputNode
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) {
            buffer, _ in
            request.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    }

    func stopTranscribing() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        recognitionRequest?.endAudio()
        recognitionRequest = nil
        recognitionTask?.cancel()
        recognitionTask = nil
    }
}
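
A usage sketch with minimal error handling:

swift
let transcriber = LiveTranscriber()
do {
    try transcriber.startTranscribing()
} catch {
    print("Could not start transcription: \(error)")
}
// Later, e.g. when the user taps stop:
transcriber.stopTranscribing()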

Pre-Recorded Audio File Recognition

Use SFSpeechURLRecognitionRequest for audio files on disk:
swift
import Speech

enum SpeechError: Error {
    case unavailable
}

func transcribeFile(at url: URL) async throws -> String {
    guard let recognizer = SFSpeechRecognizer(), recognizer.isAvailable else {
        throw SpeechError.unavailable
    }
    let request = SFSpeechURLRecognitionRequest(url: url)
    request.shouldReportPartialResults = false

    return try await withCheckedThrowingContinuation { continuation in
        recognizer.recognitionTask(with: request) { result, error in
            if let error {
                continuation.resume(throwing: error)
            } else if let result, result.isFinal {
                continuation.resume(
                    returning: result.bestTranscription.formattedString
                )
            }
        }
    }
}
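
Calling it from an async context (recordingURL is a stand-in for your own file URL):

swift
Task {
    do {
        let text = try await transcribeFile(at: recordingURL)
        print(text)
    } catch {
        print("Transcription failed: \(error)")
    }
}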

On-Device vs Server Recognition

On-device recognition (iOS 13+) works offline but supports fewer locales:
swift
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!

// Check if on-device is supported for this locale
if recognizer.supportsOnDeviceRecognition {
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.requiresOnDeviceRecognition = true  // Force on-device
}
Tip: On-device recognition avoids network latency and the one-minute audio limit imposed by server-based recognition. However, accuracy may be lower and not all locales are supported. Check supportsOnDeviceRecognition before forcing on-device mode.

Handling Results

Partial vs final results

swift
let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = true  // default is true

recognizer.recognitionTask(with: request) { result, error in
    guard let result else { return }

    if result.isFinal {
        // Final transcription — recognition is complete
        let final = result.bestTranscription.formattedString
    } else {
        // Partial result — may change as more audio is processed
        let partial = result.bestTranscription.formattedString
    }
}

Accessing alternative transcriptions and confidence

swift
recognizer.recognitionTask(with: request) { result, error in
    guard let result else { return }

    // Best transcription
    let best = result.bestTranscription

    // All alternatives (sorted by confidence, descending)
    for transcription in result.transcriptions {
        for segment in transcription.segments {
            print("\(segment.substring): \(segment.confidence)")
        }
    }
}
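
Segment confidence is handy for flagging words that may need review. A small sketch (the 0.5 threshold is arbitrary, and confidence is only meaningful on final results):

swift
// Flag low-confidence words along with where they occur in the audio.
let uncertain = result.bestTranscription.segments.filter { $0.confidence < 0.5 }
for segment in uncertain {
    print("Check \"\(segment.substring)\" at \(segment.timestamp)s")
}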

Adding punctuation (iOS 16+)

swift
let request = SFSpeechAudioBufferRecognitionRequest()
request.addsPunctuation = true

Contextual strings

Improve recognition of domain-specific terms:
swift
let request = SFSpeechAudioBufferRecognitionRequest()
request.contextualStrings = ["SwiftUI", "Xcode", "CloudKit"]

Common Mistakes

Not requesting both speech and microphone authorization

swift
// ❌ DON'T: Only request speech authorization for live audio
SFSpeechRecognizer.requestAuthorization { status in
    // Missing microphone permission — audio engine will fail
    self.startRecording()
}

// ✅ DO: Request both permissions before recording
SFSpeechRecognizer.requestAuthorization { status in
    guard status == .authorized else { return }
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        guard granted else { return }
        self.startRecording()
    }
}

Not handling availability changes

swift
// ❌ DON'T: Assume recognizer stays available after initial check
let recognizer = SFSpeechRecognizer()!
// Recognition may fail if network drops or locale changes

// ✅ DO: Monitor availability via delegate
recognizer.delegate = self
func speechRecognizer(
    _ speechRecognizer: SFSpeechRecognizer,
    availabilityDidChange available: Bool
) {
    recordButton.isEnabled = available
}

Not stopping the audio engine when recognition ends

swift
// ❌ DON'T: Leave audio engine running after recognition finishes
recognizer.recognitionTask(with: request) { result, error in
    if result?.isFinal == true {
        // Audio engine still running, wasting resources and battery
    }
}

// ✅ DO: Clean up all audio resources
recognizer.recognitionTask(with: request) { result, error in
    if result?.isFinal == true || error != nil {
        self.audioEngine.stop()
        self.audioEngine.inputNode.removeTap(onBus: 0)
        self.recognitionRequest?.endAudio()
        self.recognitionRequest = nil
    }
}

Assuming on-device recognition is available for all locales

swift
// ❌ DON'T: Force on-device without checking support
let request = SFSpeechAudioBufferRecognitionRequest()
request.requiresOnDeviceRecognition = true // May silently fail

// ✅ DO: Check support before requiring on-device
if recognizer.supportsOnDeviceRecognition {
    request.requiresOnDeviceRecognition = true
} else {
    // Fall back to server-based or inform user
}

Not handling the one-minute recognition limit

swift
// ❌ DON'T: Start one long continuous recognition session
func startRecording() {
    // This will be cut off after ~60 seconds (server-based)
}

// ✅ DO: Restart recognition when approaching the limit
func startRecording() {
    // Use a timer to restart before the limit
    recognitionTimer = Timer.scheduledTimer(withTimeInterval: 55, repeats: false) {
        [weak self] _ in
        self?.restartRecognition()
    }
}
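
One possible restartRecognition(), assuming the LiveTranscriber-style API shown earlier; note that a short gap in transcription at the restart boundary is unavoidable with this approach:

swift
func restartRecognition() {
    stopTranscribing()        // tear down engine tap, request, and task
    try? startTranscribing()  // start a fresh session immediately
}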

Creating multiple simultaneous recognition tasks

swift
// ❌ DON'T: Start a new task without canceling the previous one
func startRecording() {
    recognitionTask = recognizer.recognitionTask(with: request) { ... }
    // Previous task is still running — undefined behavior
}

// ✅ DO: Cancel existing task before creating a new one
func startRecording() {
    recognitionTask?.cancel()
    recognitionTask = nil
    recognitionTask = recognizer.recognitionTask(with: request) { ... }
}

Review Checklist

  • NSSpeechRecognitionUsageDescription is in Info.plist
  • NSMicrophoneUsageDescription is in Info.plist (if using live audio)
  • Authorization is requested before starting recognition
  • SFSpeechRecognizerDelegate is set to handle availabilityDidChange
  • Audio engine is stopped and the tap removed when recognition ends
  • recognitionRequest.endAudio() is called when done recording
  • Previous recognitionTask is canceled before starting a new one
  • supportsOnDeviceRecognition is checked before requiring on-device mode
  • Partial results are handled separately from final (isFinal) results
  • The one-minute limit is accounted for in server-based recognition
  • For iOS 26+: AssetInventory assets are installed before using SpeechAnalyzer
  • For iOS 26+: SpeechTranscriber.supportedLocale(equivalentTo:) is checked
