practical-haskell
Practical Haskell (GHC)
Use this skill when the task involves Haskell code quality, performance, or reasoning about evaluation. Assume GHC with optimizations (`-O` / `-O2`) unless the user says otherwise.
Core ideas
- Purity lets the compiler rewrite code safely; prefer explicit effects in `IO` or an appropriate abstraction.
- Lazy by default: values are evaluated when needed. That enables composition but can hide space leaks.
- Types catch many bugs early; use them to encode intent (including `newtype` for domain distinctions).
- Know what GHC emits: when performance matters, treat Core (`-ddump-simpl`) as ground truth after optimization.
Always
- Be explicit about strict vs lazy data and bindings when modeling accumulators, parsers, or long-lived state.
- Prefer `foldl'` from `Data.List` (or strict folds from the right library) for numeric accumulation over plain `foldl` on strict values.
- Profile (cost-centre profiling, eventlog, `ghc-debug`, etc.) before micro-optimizing.
- Write small composable functions; rely on inlining and specialization rather than giant monoliths.
- Use fusion-friendly pipelines (`map`, `filter`, `foldr`-based idioms) where appropriate; validate hot paths in Core if allocation matters.
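To make the strict-accumulator advice concrete, here is a minimal sketch of a running mean. The names `Stats`, `step`, and `meanOf` are hypothetical and exist only for this illustration.

```haskell
import Data.List (foldl')

-- Hypothetical accumulator: both fields are strict, so each fold step
-- stores evaluated numbers rather than growing chains of thunks.
data Stats = Stats
  { statsCount :: !Int
  , statsTotal :: !Double
  }

-- One fold step. foldl' forces the Stats constructor, and the strict
-- fields force the count and total along with it.
step :: Stats -> Double -> Stats
step (Stats n t) x = Stats (n + 1) (t + x)

-- Mean of a list, or Nothing for an empty list.
meanOf :: [Double] -> Maybe Double
meanOf xs =
  case foldl' step (Stats 0 0) xs of
    Stats 0 _ -> Nothing
    Stats n t -> Just (t / fromIntegral n)
```

A lazy pair such as `(Int, Double)` in place of `Stats` would still be forced only to its outer constructor by `foldl'`, which is why the strict fields matter here.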
Never
- Accidentally build large chains of thunks (classic `foldl (+) 0` on large strict sums).
- Ignore space leaks from unevaluated structure holding onto memory.
- Micro-optimize without evidence from profiling or Core.
- Treat laziness as universally good or bad; decide per use case.
Prefer
- Strict fields (`!`) on accumulator-like constructor fields; `UNPACK` for small strict numeric fields (so they are stored unboxed) when profiling supports it.
- Newtypes for zero-runtime-cost distinctions vs `data` with a single field.
- `INLINE` / `INLINABLE` / `SPECIALIZE` on hot polymorphic glue when dictionaries or lack of specialization show up in Core.
- Worker/wrapper style: a strict internal worker and a small external API.
- Monomorphic hot loops when polymorphism still costs after specialization attempts.
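The following sketch combines several of these preferences in one place: unpacked strict fields, a `newtype` for a domain distinction, and a worker/wrapper split. `Point`, `Meters`, and `pathLength` are hypothetical names chosen for the example. Whether `UNPACK` actually helps in a given program is something to confirm with profiling.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical 2D point: strict Double fields marked UNPACK are stored
-- directly in the constructor instead of behind pointers.
data Point = Point {-# UNPACK #-} !Double {-# UNPACK #-} !Double

-- Domain distinction without runtime cost: Meters shares Double's representation.
newtype Meters = Meters Double
  deriving (Eq, Show)

-- Wrapper: the small public API. It only sets up the initial state.
pathLength :: [Point] -> Meters
pathLength []       = Meters 0
pathLength (p : ps) = Meters (go 0 p ps)
  where
    -- Worker: a monomorphic loop with a strict accumulator.
    go :: Double -> Point -> [Point] -> Double
    go !acc _ [] = acc
    go !acc (Point x0 y0) (q@(Point x1 y1) : rest) =
      let dx = x1 - x0
          dy = y1 - y0
      in go (acc + sqrt (dx * dx + dy * dy)) q rest
```

Callers see only `pathLength` and `Meters`; the strictness decisions stay inside `go`, where they are easy to audit in Core.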
Laziness and strictness
```haskell
import Data.List (foldl')

-- Infinite lists are fine when consumption is bounded.
naturals :: [Integer]
naturals = [1..]

firstTen :: [Integer]
firstTen = take 10 naturals

-- foldl builds a chain of thunks unless the optimizer's strictness
-- analysis rescues it; foldl' forces the accumulator as it goes.
badSum :: [Int] -> Int
badSum = foldl (+) 0

goodSum :: [Int] -> Int
goodSum = foldl' (+) 0
```
Bang patterns (`{-# LANGUAGE BangPatterns #-}`) force a binding to WHNF when the pattern is matched; use them at strategic places (accumulators, fields that must not retain thunks).

Strict fields on `data` constructors are evaluated to WHNF when the constructor application itself is evaluated; combine with profiling to avoid over-forcing.
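A minimal sketch of bang patterns on loop accumulators, together with a reminder that WHNF only reaches the outermost constructor. The function names are hypothetical.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sum and length in one pass. The bangs force both accumulators to WHNF
-- on every iteration, so neither builds up a chain of suspended additions.
sumAndLength :: [Int] -> (Int, Int)
sumAndLength = go 0 0
  where
    go :: Int -> Int -> [Int] -> (Int, Int)
    go !s !n []       = (s, n)
    go !s !n (x : xs) = go (s + x) (n + 1) xs

-- Caveat: a bang on a pair forces the pair constructor only, not its
-- components. The undefined first component is never evaluated here.
lazyInside :: Int
lazyInside =
  let !p = (undefined :: Int, 1 :: Int)
  in snd p
```

For `Int` accumulators WHNF is full evaluation, which is why the bangs in `go` are enough. For structured accumulators, strict fields or deeper forcing may be needed.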
Fusion and lists
List pipelines like `sum . map f . filter p` often fuse under `-O2` into a single loop. If allocation persists, inspect Core. Avoid forcing materialization unnecessarily (e.g. redundant `length` or indexing on huge intermediates in hot code).

GHC applies rewrite rules internally; custom `{-# RULES #-}` pragmas are advanced and must be validated (correctness and phase interactions).
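As an illustration, the sketch below writes the same computation twice: once as a pipeline that GHC can often fuse, and once as a hand-written strict fold. The function names are hypothetical, and whether the pipeline actually fuses depends on the GHC version and flags, so treat the second version as a reference point for comparison rather than a guaranteed equivalent in cost.

```haskell
import Data.List (foldl')

-- Pipeline style: filter, then map, then sum. With optimizations on, GHC's
-- list fusion rules can often remove the intermediate lists.
sumSquaresOfEvens :: [Int] -> Int
sumSquaresOfEvens = sum . map (\x -> x * x) . filter even

-- Hand-fused version: a single strict fold, useful as a baseline when
-- comparing Core or allocation numbers.
sumSquaresOfEvens' :: [Int] -> Int
sumSquaresOfEvens' = foldl' step 0
  where
    step acc x
      | even x    = acc + x * x
      | otherwise = acc
```

If the two versions allocate very differently under `-O2`, that is a signal to inspect Core for the pipeline.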
Newtypes
```haskell
newtype UserId = UserId Int
  deriving (Eq, Ord, Show)

newtype Email = Email String
  deriving (Eq, Show)
```
Use `newtype` for distinct types with identical representation. `GeneralizedNewtypeDeriving` can derive classes when appropriate and policy allows.
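A short sketch of `GeneralizedNewtypeDeriving` in use. `Cents` and `totalCents` are hypothetical names; the point is only that the `Num` instance is reused from the underlying `Int`.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Hypothetical domain type: an amount of money in cents.
-- Eq, Ord, and Show are derived as usual; Num is reused from Int via
-- GeneralizedNewtypeDeriving, so (+) and numeric literals work on Cents.
newtype Cents = Cents Int
  deriving (Eq, Ord, Show, Num)

-- sum needs a Num instance, which Cents now has.
totalCents :: [Cents] -> Cents
totalCents = sum
```

Deriving `Num` is a design choice: it also permits multiplying two `Cents` values, which may not be meaningful in every domain, so derive only the classes the type should really support.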
Specialization and inlining
Polymorphic hot code may pass type-class dictionaries. Mitigations:
- Give a monomorphic variant for the hot path.
- Use `{-# SPECIALIZE #-}` for concrete instantiations.
- Use `{-# INLINABLE #-}` on small polymorphic helpers so call sites can specialize.

Verify with Core, not assumptions.
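The three mitigations can be sketched on one small function. `sumSquares` and `sumSquaresInt` are hypothetical names, and the pragmas are requests to GHC rather than guarantees, so the result should still be checked in Core.

```haskell
-- A small polymorphic helper. INLINABLE keeps its unfolding in the
-- interface file so that modules calling it can specialize it.
{-# INLINABLE sumSquares #-}
sumSquares :: Num a => [a] -> a
sumSquares = go 0
  where
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x * x
      in acc' `seq` go acc' xs

-- Ask GHC for dictionary-free copies at the types the hot path uses.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
{-# SPECIALIZE sumSquares :: [Double] -> Double #-}

-- A monomorphic entry point for the hot path.
sumSquaresInt :: [Int] -> Int
sumSquaresInt = sumSquares
```

Hot callers can use `sumSquaresInt` directly, while other code keeps the polymorphic `sumSquares`.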
Reading Core (quick guide)
Compile with something like `ghc -O2 -ddump-simpl -dsuppress-all -dno-suppress-type-signatures YourModule.hs` (flags vary by need).
- `case` usually forces evaluation; extra `let` bindings can mean allocation.
- Look for fusion: one tight recursive loop vs multiple passes.
- Check whether dictionary calls remain in hot loops.
Mental checklist
- When does each subexpression get forced?
- Where might thunks retain memory (closures, lazy fields, `foldl`-style accumulation)?
- Will this pipeline fuse or allocate intermediates?
- What does simplified Core show for the hot path?
- Is the hot code monomorphic and specialized?
Signature moves (when profiling says so)
- Strict accumulators: `foldl'`, bang patterns, strict fields.
- `UNPACK` small strict numeric fields to reduce indirection.
- `INLINE` / `INLINABLE` / `SPECIALIZE` to recover specialization.
- Fusion-friendly combinators; avoid accidental intermediate lists in inner loops.
- Worker/wrapper refactor for clearer strict internals.
- Re-check Core after each change.
Additional resources
For extended examples and GHC flag recipes, see reference.md.