portfolio-optimization

Portfolio Optimization

Overview

This skill provides guidance for implementing high-performance portfolio optimization algorithms using Python C extensions. It covers the workflow for creating C extensions that interface with NumPy arrays, proper verification strategies, and common pitfalls to avoid when optimizing numerical computations.

When to Apply This Skill

Apply this skill when:

Implementing portfolio risk calculations (variance, volatility, Sharpe ratio)
Optimizing matrix-vector operations for large asset portfolios
Creating C extensions for Python numerical code
Meeting performance requirements that specify a speedup ratio (e.g., >= 1.2x)
Working with covariance matrices and portfolio weights

Recommended Workflow

Phase 1: Codebase Understanding

Before writing any code:

Read all relevant source files completely - Understand the baseline implementation, data structures, and expected interfaces
Identify the mathematical operations - Common operations include:
- Matrix-vector multiplication (covariance matrix times weights)
- Dot products (weights times returns)
- Square root operations (for volatility from variance)
Understand the test suite - Know what correctness tolerances are expected (e.g., 1e-10) and what performance benchmarks must be met
Document the input/output contracts - Array shapes, data types (typically float64), and return value specifications
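
The operations listed above can be written down as a small pure-Python reference before any C is involved. This is only a sketch: the function names are hypothetical and the inputs are plain lists rather than NumPy arrays, so it is useful as a hand-checkable oracle, not as the baseline the codebase actually ships.

```python
import math


def portfolio_variance(weights, cov):
    """Compute w^T * Sigma * w with explicit loops (matrix-vector, then dot)."""
    n = len(weights)
    total = 0.0
    for i in range(n):
        row = cov[i]
        acc = 0.0
        for j in range(n):
            acc += row[j] * weights[j]  # (Sigma @ w)[i]
        total += weights[i] * acc       # w . (Sigma @ w)
    return total


def portfolio_volatility(weights, cov):
    """Volatility is the square root of the variance."""
    return math.sqrt(portfolio_variance(weights, cov))


def portfolio_return(weights, expected_returns):
    """Dot product of weights and expected returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))
```

A two-asset case is small enough to verify by hand: with weights `[0.5, 0.5]` and covariance `[[0.04, 0.01], [0.01, 0.09]]`, the variance is 0.25*0.04 + 2*0.25*0.01 + 0.25*0.09 = 0.0375.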

Phase 2: Implementation Planning

Consider these factors before implementation:

Why C provides speedup:
- Eliminates Python interpreter overhead
- Enables direct memory access without bounds checking
- Allows compiler optimizations (vectorization, loop unrolling)
- Reduces temporary array allocations
Design decisions to make:
- Whether to use NumPy C API for zero-copy array access
- Memory layout assumptions (C-contiguous vs Fortran-contiguous)
- Error handling strategy for type mismatches and dimension errors
Potential algorithmic optimizations:
- Cache-friendly memory access patterns (row-major iteration for C arrays)
- SIMD vectorization opportunities
- Minimizing Python-to-C data conversion overhead
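
To make the memory-layout point concrete, the sketch below mimics in Python what a C loop over a `double *` would do for a C-contiguous (row-major) matrix. The function name is hypothetical and the flat buffer is a standard-library `array`, chosen only to show the index arithmetic; it is not a performance technique in Python itself.

```python
from array import array


def matvec_row_major(flat, x, n):
    """Compute y = A @ x, where A is n x n stored row-major in a flat buffer."""
    y = array("d", [0.0] * n)
    for i in range(n):
        base = i * n  # offset of row i in a C-contiguous layout
        acc = 0.0
        for j in range(n):
            # The inner loop reads adjacent elements, which is the
            # cache-friendly access pattern for row-major storage.
            acc += flat[base + j] * x[j]
        y[i] = acc
    return y


# 2 x 2 example: [[1, 2], [3, 4]] stored as 1, 2, 3, 4
result = matvec_row_major(array("d", [1.0, 2.0, 3.0, 4.0]), [1.0, 1.0], 2)
```

For a Fortran-contiguous matrix the element (i, j) would instead sit at offset `j * n + i`, so the same loop order would stride through memory rather than walk it sequentially.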

Phase 3: C Extension Implementation

When implementing the C extension:

Include proper headers:
- `Python.h` (must be first)
- `numpy/arrayobject.h` for NumPy array access
Initialize NumPy in the module init function:
- Call `import_array()` to initialize the NumPy C API
Use the NumPy C API for array access:
- `PyArray_DATA()` for getting the data pointer
- `PyArray_DIM()` for dimensions
- `PyArray_STRIDE()` for memory strides
- Check `PyArray_IS_C_CONTIGUOUS()` for memory layout
Implement robust error handling:
- Validate that array dimensions match expected shapes
- Check data types (expect `NPY_FLOAT64` for double precision)
- Handle non-contiguous arrays (either reject or handle strides)
- Set appropriate Python exceptions on error
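
The properties that the C code validates can be inspected from Python first, which helps when deciding what the extension should accept or reject. The sketch below assumes NumPy is installed; `describe_array` is a hypothetical helper, and the comments only point at the C-side counterparts rather than reproduce them.

```python
import numpy as np


def describe_array(arr):
    """Collect the properties a C extension would typically validate."""
    return {
        "is_float64": bool(arr.dtype == np.float64),  # cf. the NPY_FLOAT64 check
        "shape": arr.shape,                           # cf. PyArray_DIM()
        "strides": arr.strides,                       # cf. PyArray_STRIDE(), in bytes
        "c_contiguous": arr.flags["C_CONTIGUOUS"],    # cf. PyArray_IS_C_CONTIGUOUS()
    }


matrix = np.zeros((3, 4), dtype=np.float64)
info = describe_array(matrix)          # freshly allocated, row-major
view_info = describe_array(matrix.T)   # transposed view shares the same memory
```

The transposed view has swapped strides and is no longer C-contiguous, which is exactly the kind of input the extension must either reject or handle via strides.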

Phase 4: Python Wrapper Implementation

Create a Python module that:

Imports the C extension module
Provides a clean interface matching the baseline API
Handles any necessary array preparation (ensuring contiguity)
Documents the interface clearly
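
A wrapper along these lines might look as follows. This is a sketch under stated assumptions: `_portfolio_c` and its `portfolio_variance` function are hypothetical names for the compiled extension, and the NumPy fallback exists only so the sketch runs when the extension has not been built.

```python
import numpy as np

try:
    import _portfolio_c as _ext  # hypothetical name of the compiled extension
except ImportError:
    _ext = None  # fall back to NumPy so this sketch still runs


def _prepare(values, ndim, name):
    """Return a C-contiguous float64 array, copying only when necessary."""
    arr = np.ascontiguousarray(values, dtype=np.float64)
    if arr.ndim != ndim:
        raise ValueError(f"{name} must be {ndim}-dimensional, got {arr.ndim}")
    return arr


def portfolio_variance(weights, cov):
    """Portfolio variance w^T * Sigma * w for 1-D weights and a square cov."""
    w = _prepare(weights, 1, "weights")
    c = _prepare(cov, 2, "cov")
    n = w.shape[0]
    if c.shape != (n, n):
        raise ValueError(f"cov must have shape ({n}, {n}), got {c.shape}")
    if _ext is not None:
        return _ext.portfolio_variance(w, c)  # hypothetical C entry point
    return float(w @ c @ w)
```

Doing the contiguity and dtype conversion in the wrapper keeps the C code simple, at the cost of a copy whenever the caller passes a non-contiguous or non-float64 array.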

Phase 5: Verification Strategy

Critical: Verify every change completely

After editing files, re-read them - Confirm edits were applied correctly, especially for multi-line changes
Test incrementally:
- Build the C extension first and verify it compiles
- Test individual functions before running full benchmarks
- Use small test cases for correctness verification before scaling up
Correctness verification:
- Compare outputs against baseline implementation
- Use appropriate numerical tolerances (typically 1e-10 for double precision)
- Test with known inputs where expected outputs can be calculated manually
Performance verification:
- Run benchmarks with representative data sizes
- Verify speedup meets requirements across different portfolio sizes
- Test edge cases: small portfolios (n=1, n=10), large portfolios (n=5000+)
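
The correctness step can be sketched as a small comparison harness. Everything here is illustrative: `check_against_baseline` is a hypothetical helper, and the two variance functions are stand-ins for the real baseline and the optimized implementation (they differ only in evaluation order, which is enough to show why a tolerance is needed).

```python
import math


def check_against_baseline(baseline, candidate, cases, tol=1e-10):
    """Return (args, expected, actual) for every case outside the tolerance."""
    failures = []
    for args in cases:
        expected = baseline(*args)
        actual = candidate(*args)
        if not math.isclose(actual, expected, rel_tol=tol, abs_tol=tol):
            failures.append((args, expected, actual))
    return failures


def baseline_variance(weights, cov):
    """Stand-in baseline: one flat double sum."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


def candidate_variance(weights, cov):
    """Stand-in candidate: accumulate each row first, then weight it."""
    total = 0.0
    for w_i, row in zip(weights, cov):
        total += w_i * sum(c * w_j for c, w_j in zip(row, weights))
    return total


cases = [
    ([1.0], [[0.04]]),                                # n=1, hand-checkable: 0.04
    ([0.5, 0.5], [[0.04, 0.01], [0.01, 0.09]]),       # hand-checkable: 0.0375
]
failures = check_against_baseline(baseline_variance, candidate_variance, cases)
```

An empty `failures` list means every case agreed within the tolerance; a non-empty list carries the inputs needed to reproduce the disagreement.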

Edge Cases to Handle

Ensure the implementation addresses:

Empty portfolios (n=0) - Return appropriate default or error
Single-asset portfolios (n=1) - Degenerate case where the covariance matrix is a single variance
Dimension mismatches - Weights vector length vs covariance matrix dimensions
Invalid inputs:
- Non-square covariance matrices
- NaN or infinity values in inputs
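- Negative variance (mathematically invalid; usually a sign that the covariance matrix is not positive semidefinite or that rounding error has accumulated)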
Memory considerations:
- Non-contiguous NumPy arrays
- Memory allocation failures in C code
- Large portfolios that may stress memory
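
Several of these checks can be expressed as one validation routine. The sketch below uses plain lists and a hypothetical `validate_inputs` helper, and it chooses to raise `ValueError` for every invalid case; whether an empty portfolio should raise or return a default is a decision the text leaves open.

```python
import math


def validate_inputs(weights, cov):
    """Raise ValueError for invalid inputs; return the portfolio size n."""
    n = len(weights)
    if n == 0:
        raise ValueError("empty portfolio (n=0)")
    if len(cov) != n:
        raise ValueError(
            f"dimension mismatch: {n} weights but {len(cov)} covariance rows"
        )
    for i, row in enumerate(cov):
        if len(row) != n:
            raise ValueError(
                f"covariance matrix is not {n} x {n}: row {i} has {len(row)} entries"
            )
    for i, w in enumerate(weights):
        if not math.isfinite(w):
            raise ValueError(f"weights[{i}] is NaN or infinite")
    for i, row in enumerate(cov):
        for j, value in enumerate(row):
            if not math.isfinite(value):
                raise ValueError(f"cov[{i}][{j}] is NaN or infinite")
    return n


def is_rejected(weights, cov):
    """Convenience for demonstrations: True if validation raises ValueError."""
    try:
        validate_inputs(weights, cov)
    except ValueError:
        return True
    return False
```

In a C extension the same checks would set a Python exception and return `NULL` instead of raising directly.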

Common Pitfalls to Avoid

Code Completeness

Never truncate code in edit operations - always provide complete implementations
Verify file contents after editing to confirm changes applied correctly
Document all design choices explicitly

Testing Approach

Avoid going directly from implementation to full benchmark testing
Test each function individually before integration testing
Do not rely solely on "tests pass" for validation - understand why they pass

C Extension Specific

Always check NumPy array types before accessing data
Handle reference counting properly to avoid memory leaks
Initialize the NumPy API with `import_array()` in module init
Use `PyErr_SetString()` to set exceptions on errors

Performance Validation

Verify speedup is consistent across different input sizes
Profile the code to determine whether further optimization is needed
Consider the overhead of Python-to-C transitions for small inputs
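
One way to check consistency across sizes is to measure the speedup ratio per input size with the standard-library `timeit` module. This is a sketch: `speedup_by_size` and the two dot-product functions are hypothetical stand-ins, and the measured ratios depend entirely on the machine, so no particular numbers should be expected.

```python
import timeit


def speedup_by_size(baseline, candidate, make_args, sizes, repeat=5, number=20):
    """Return {n: baseline_time / candidate_time} using best-of-`repeat` timings."""
    results = {}
    for n in sizes:
        args = make_args(n)
        t_base = min(timeit.repeat(lambda: baseline(*args),
                                   repeat=repeat, number=number))
        t_cand = min(timeit.repeat(lambda: candidate(*args),
                                   repeat=repeat, number=number))
        results[n] = t_base / t_cand if t_cand > 0 else float("inf")
    return results


def baseline_dot(w, r):
    """Stand-in baseline: index-based loop."""
    total = 0.0
    for i in range(len(w)):
        total += w[i] * r[i]
    return total


def candidate_dot(w, r):
    """Stand-in candidate: generator over zipped pairs."""
    return sum(a * b for a, b in zip(w, r))


def make_args(n):
    """Build deterministic inputs of size n."""
    return ([1.0 / n] * n, [0.01 * (i % 7) for i in range(n)])


ratios = speedup_by_size(baseline_dot, candidate_dot, make_args, [10, 100, 1000])
```

Taking the minimum over repeats reduces noise from other processes. For very small `n`, fixed call overhead dominates, which is why the ratio there can differ sharply from the ratio at large sizes.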

Build and Test Commands

Typical workflow commands:

```bash
# Build the C extension
python setup.py build_ext --inplace

# Run correctness tests
python -c "from portfolio_optimized import *; # test calls"

# Run benchmark
python benchmark.py

# Run full test suite
pytest test_portfolio.py -v
```
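
The `build_ext --inplace` command presumes a `setup.py` that declares the extension. A minimal sketch is shown below, assuming setuptools and NumPy are installed; the extension name `_portfolio_c`, the source file `portfolio_c.c`, and the `-O3` flag (GCC/Clang syntax) are all assumptions for illustration, not names taken from the project.

```python
# setup.py: a minimal sketch; module and source file names are hypothetical
import numpy
from setuptools import Extension, setup

extension = Extension(
    "_portfolio_c",                      # hypothetical import name of the C module
    sources=["portfolio_c.c"],           # hypothetical C source file
    include_dirs=[numpy.get_include()],  # locates numpy/arrayobject.h
    extra_compile_args=["-O3"],          # assumption: GCC/Clang; MSVC uses other flags
)

if __name__ == "__main__":
    setup(name="portfolio_optimized", ext_modules=[extension])
```

Passing `numpy.get_include()` is what lets the compiler find the NumPy headers; without it the build typically fails at the `#include` line.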
