xlsx

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Requirements for Outputs

输出要求

All Excel files

所有Excel文件

Professional Font

专业字体

Use a consistent, professional font (e.g., Arial, Times New Roman) for all deliverables unless otherwise instructed by the user

除非用户另有指示，所有交付物需使用统一的专业字体（例如Arial、Times New Roman）

Zero Formula Errors

无公式错误

Every Excel model MUST be delivered with ZERO formula errors (#REF!, #DIV/0!, #VALUE!, #N/A, #NAME?)

交付的每个Excel模型必须完全没有公式错误（#REF!、#DIV/0!、#VALUE!、#N/A、#NAME?）

Preserve Existing Templates (when updating templates)

保留现有模板（更新模板时）

Study and EXACTLY match existing format, style, and conventions when modifying files
Never impose standardized formatting on files with established patterns
Existing template conventions ALWAYS override these guidelines

修改文件时，需研究并完全匹配现有格式、样式和约定
不得对已有固定格式的文件强加标准化格式
现有模板约定始终优先于本指南

Financial models

财务模型

Color Coding Standards

颜色编码标准

Unless otherwise stated by the user or existing template

除非用户另有说明或已有模板约定

Industry-Standard Color Conventions

行业标准颜色规范

Blue text (RGB: 0,0,255): Hardcoded inputs, and numbers users will change for scenarios
Black text (RGB: 0,0,0): ALL formulas and calculations
Green text (RGB: 0,128,0): Links pulling from other worksheets within same workbook
Red text (RGB: 255,0,0): External links to other files
Yellow background (RGB: 255,255,0): Key assumptions needing attention or cells that need to be updated

蓝色文本（RGB: 0,0,255）：硬编码输入值，以及用户可根据场景修改的数字
黑色文本（RGB: 0,0,0）：所有公式和计算结果
绿色文本（RGB: 0,128,0）：指向同一工作簿内其他工作表的链接
红色文本（RGB: 255,0,0）：指向其他外部文件的链接
黄色背景（RGB: 255,255,0）：需要关注的关键假设或需更新的单元格

Number Formatting Standards

数字格式标准

Required Format Rules

必填格式规则

Years: Format as text strings (e.g., "2024" not "2,024")
Currency: Use $#,##0 format; ALWAYS specify units in headers ("Revenue ($mm)")
Zeros: Use number formatting to make all zeros "-", including percentages (e.g., "$#,##0;($#,##0);-")
Percentages: Default to 0.0% format (one decimal)
Multiples: Format as 0.0x for valuation multiples (EV/EBITDA, P/E)
Negative numbers: Use parentheses (123) not minus -123

年份：格式化为文本字符串（例如“2024”而非“2,024”）
货币：使用$#,##0格式；表头中必须明确单位（例如“Revenue ($mm)”）
零值：使用数字格式将所有零显示为“-”，包括百分比（例如“$#,##0;($#,##0);-”）
百分比：默认使用0.0%格式（保留一位小数）
倍数：估值倍数（EV/EBITDA、P/E）格式化为0.0x
负数：使用括号(123)表示，而非负号-123

Formula Construction Rules

公式构建规则

Assumptions Placement

假设值放置

Place ALL assumptions (growth rates, margins, multiples, etc.) in separate assumption cells
Use cell references instead of hardcoded values in formulas
Example: Use =B5*(1+$B$6) instead of =B5*1.05

所有假设值（增长率、利润率、倍数等）需放在单独的假设单元格中
公式中使用单元格引用而非硬编码值
示例：使用=B5*(1+$B$6)而非=B5*1.05

Formula Error Prevention

公式错误预防

Verify all cell references are correct
Check for off-by-one errors in ranges
Ensure consistent formulas across all projection periods
Test with edge cases (zero values, negative numbers)
Verify no unintended circular references

验证所有单元格引用是否正确
检查范围是否存在差一错误
确保所有预测周期的公式保持一致
使用边缘情况进行测试（零值、负数）
确认无意外的循环引用

Documentation Requirements for Hardcodes

硬编码值的文档要求

Comment or in cells beside (if end of table). Format: "Source: [System/Document], [Date], [Specific Reference], [URL if applicable]"
Examples:
- "Source: Company 10-K, FY2024, Page 45, Revenue Note, [SEC EDGAR URL]"
- "Source: Company 10-Q, Q2 2025, Exhibit 99.1, [SEC EDGAR URL]"
- "Source: Bloomberg Terminal, 8/15/2025, AAPL US Equity"
- "Source: FactSet, 8/20/2025, Consensus Estimates Screen"

在硬编码值旁的单元格添加注释（如果在表格末尾）。格式：“来源：[系统/文档], [日期], [具体引用], [适用URL]”
示例：
- "来源：公司10-K报告，2024财年，第45页，收入附注，[SEC EDGAR URL]"
- "来源：公司10-Q报告，2025年第二季度，附件99.1，[SEC EDGAR URL]"
- "来源：Bloomberg Terminal，2025年8月15日，AAPL US Equity"
- "来源：FactSet，2025年8月20日，一致预期屏幕"

XLSX creation, editing, and analysis

XLSX文件的创建、编辑与分析

Overview

概述

A user may ask you to create, edit, or analyze the contents of an .xlsx file. You have different tools and workflows available for different tasks.

用户可能会要求你创建、编辑或分析.xlsx文件的内容。针对不同任务，你可以使用不同的工具和工作流。

Important Requirements

重要要求

LibreOffice Required for Formula Recalculation: You can assume LibreOffice is installed for recalculating formula values using the

scripts/recalc.py

script. The script automatically configures LibreOffice on first run, including in sandboxed environments where Unix sockets are restricted (handled by

scripts/office/soffice.py

)

公式重新计算需使用LibreOffice：可假设已安装LibreOffice，用于通过

scripts/recalc.py

脚本重新计算公式值。该脚本在首次运行时会自动配置LibreOffice，包括在Unix套接字受限制的沙箱环境中（由

scripts/office/soffice.py

处理）

Reading and analyzing data

数据读取与分析

Data analysis with pandas

使用pandas进行数据分析

For data analysis, visualization, and basic operations, use pandas which provides powerful data manipulation capabilities:

python

import pandas as pd

对于数据分析、可视化和基础操作，使用pandas，它提供强大的数据处理能力：

python

import pandas as pd

Read Excel

df = pd.read_excel('file.xlsx') # Default: first sheet all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict

Analyze

df.head() # Preview data df.info() # Column info df.describe() # Statistics

Write Excel

df.to_excel('output.xlsx', index=False)

undefined

df.to_excel('output.xlsx', index=False)

undefined

Excel File Workflows

Excel文件工作流

CRITICAL: Use Formulas, Not Hardcoded Values

关键规则：使用公式，而非硬编码值

Always use Excel formulas instead of calculating values in Python and hardcoding them. This ensures the spreadsheet remains dynamic and updateable.

始终使用Excel公式，而非在Python中计算后硬编码结果。这样可确保电子表格保持动态可更新。

❌ WRONG - Hardcoding Calculated Values

❌ 错误示例 - 硬编码计算值

python

undefined

python

undefined

Bad: Calculating in Python and hardcoding result

total = df['Sales'].sum() sheet['B10'] = total # Hardcodes 5000

Bad: Computing growth rate in Python

growth = (df.iloc[-1]['Revenue'] - df.iloc[0]['Revenue']) / df.iloc[0]['Revenue'] sheet['C5'] = growth # Hardcodes 0.15

Bad: Python calculation for average

avg = sum(values) / len(values) sheet['D20'] = avg # Hardcodes 42.5

undefined

avg = sum(values) / len(values) sheet['D20'] = avg # Hardcodes 42.5

undefined

✅ CORRECT - Using Excel Formulas

✅ 正确示例 - 使用Excel公式

python

undefined

python

undefined

Good: Let Excel calculate the sum

sheet['B10'] = '=SUM(B2:B9)'

Good: Growth rate as Excel formula

sheet['C5'] = '=(C4-C2)/C2'

Good: Average using Excel function

sheet['D20'] = '=AVERAGE(D2:D19)'


This applies to ALL calculations - totals, percentages, ratios, differences, etc. The spreadsheet should be able to recalculate when source data changes.

sheet['D20'] = '=AVERAGE(D2:D19)'


此规则适用于所有计算——总计、百分比、比率、差值等。当源数据更改时，电子表格应能重新计算结果。

Common Workflow

通用工作流

Choose tool: pandas for data, openpyxl for formulas/formatting
Create/Load: Create new workbook or load existing file
Modify: Add/edit data, formulas, and formatting
Save: Write to file
Recalculate formulas (MANDATORY IF USING FORMULAS): Use the scripts/recalc.py script
bash
```
python scripts/recalc.py output.xlsx
```
Verify and fix any errors:
- The script returns JSON with error details
- If
```
status
```
  is
```
errors_found
```
  , check
```
error_summary
```
  for specific error types and locations
- Fix the identified errors and recalculate again
- Common errors to fix:
  - ```
  #REF!
```
  : Invalid cell references
- ```
#DIV/0!
```
    : Division by zero
  - ```
  #VALUE!
```
  : Wrong data type in formula
- ```
#NAME?
```
    : Unrecognized formula name

选择工具：数据分析用pandas，公式/格式设置用openpyxl
创建/加载：创建新工作簿或加载现有文件
修改：添加/编辑数据、公式和格式
保存：写入文件
重新计算公式（使用公式时必须执行）：使用scripts/recalc.py脚本
bash
```
python scripts/recalc.py output.xlsx
```
验证并修复错误：
- 脚本会返回包含错误详情的JSON
- 如果
```
status
```
  为
```
errors_found
```
  ，查看
```
error_summary
```
  获取具体错误类型和位置
- 修复已识别的错误并重新计算
- 常见需修复的错误：
  - ```
  #REF!
```
  ：无效单元格引用
- ```
#DIV/0!
```
    ：除以零
  - ```
  #VALUE!
```
  ：公式中数据类型错误
- ```
#NAME?
```
    ：无法识别的公式名称

Creating new Excel files

创建新Excel文件

python

undefined

python

undefined

Using openpyxl for formulas and formatting

from openpyxl import Workbook from openpyxl.styles import Font, PatternFill, Alignment

wb = Workbook() sheet = wb.active

from openpyxl import Workbook from openpyxl.styles import Font, PatternFill, Alignment

wb = Workbook() sheet = wb.active

Add data

sheet['A1'] = 'Hello' sheet['B1'] = 'World' sheet.append(['Row', 'of', 'data'])

Add formula

sheet['B2'] = '=SUM(A1:A10)'

Formatting

sheet['A1'].font = Font(bold=True, color='FF0000') sheet['A1'].fill = PatternFill('solid', start_color='FFFF00') sheet['A1'].alignment = Alignment(horizontal='center')

Column width

sheet.column_dimensions['A'].width = 20

wb.save('output.xlsx')

undefined

sheet.column_dimensions['A'].width = 20

wb.save('output.xlsx')

undefined

Editing existing Excel files

编辑现有Excel文件

python

undefined

python

undefined

Using openpyxl to preserve formulas and formatting

from openpyxl import load_workbook

Load existing file

wb = load_workbook('existing.xlsx') sheet = wb.active # or wb['SheetName'] for specific sheet

Working with multiple sheets

for sheet_name in wb.sheetnames: sheet = wb[sheet_name] print(f"Sheet: {sheet_name}")

for sheet_name in wb.sheetnames: sheet = wb[sheet_name] print(f"Sheet:{sheet_name}")

Modify cells

sheet['A1'] = 'New Value' sheet.insert_rows(2) # Insert row at position 2 sheet.delete_cols(3) # Delete column 3

Add new sheet

new_sheet = wb.create_sheet('NewSheet') new_sheet['A1'] = 'Data'

wb.save('modified.xlsx')

undefined

new_sheet = wb.create_sheet('NewSheet') new_sheet['A1'] = 'Data'

wb.save('modified.xlsx')

undefined

Recalculating formulas

重新计算公式

Excel files created or modified by openpyxl contain formulas as strings but not calculated values. Use the provided

scripts/recalc.py

script to recalculate formulas:

bash

python scripts/recalc.py <excel_file> [timeout_seconds]

Example:

bash

python scripts/recalc.py output.xlsx 30

The script:

Automatically sets up LibreOffice macro on first run
Recalculates all formulas in all sheets
Scans ALL cells for Excel errors (#REF!, #DIV/0!, etc.)
Returns JSON with detailed error locations and counts
Works on both Linux and macOS

由openpyxl创建或修改的Excel文件中，公式以字符串形式存在，未计算出结果。使用提供的

scripts/recalc.py

脚本重新计算公式：

bash

python scripts/recalc.py <excel_file> [timeout_seconds]

示例：

bash

python scripts/recalc.py output.xlsx 30

该脚本：

首次运行时自动设置LibreOffice宏
重新计算所有工作表中的所有公式
扫描所有单元格查找Excel错误（#REF!、#DIV/0!等）
返回包含错误位置和数量详情的JSON
支持Linux和macOS系统

Formula Verification Checklist

公式验证清单

Quick checks to ensure formulas work correctly:

快速检查确保公式正常工作：

Essential Verification

核心验证

Test 2-3 sample references: Verify they pull correct values before building full model
Column mapping: Confirm Excel columns match (e.g., column 64 = BL, not BK)
Row offset: Remember Excel rows are 1-indexed (DataFrame row 5 = Excel row 6)

测试2-3个示例引用：在构建完整模型前，验证引用是否能获取正确值
列映射：确认Excel列匹配（例如，第64列=BL，而非BK）
行偏移：注意Excel行是从1开始索引的（DataFrame第5行=Excel第6行）

Common Pitfalls

常见陷阱

NaN handling: Check for null values with
```
pd.notna()
```
Far-right columns: FY data often in columns 50+
Multiple matches: Search all occurrences, not just first
Division by zero: Check denominators before using
```
/
```
in formulas (#DIV/0!)
Wrong references: Verify all cell references point to intended cells (#REF!)
Cross-sheet references: Use correct format (Sheet1!A1) for linking sheets

NaN处理：使用
```
pd.notna()
```
检查空值
右侧列：财年数据通常在第50列之后
多匹配项：搜索所有匹配结果，而非仅第一个
除以零：在公式中使用
```
/
```
前检查分母是否为零（避免#DIV/0!）
错误引用：验证所有单元格引用指向预期单元格（避免#REF!）
跨工作表引用：使用正确格式（Sheet1!A1）链接工作表

Formula Testing Strategy

公式测试策略

Start small: Test formulas on 2-3 cells before applying broadly
Verify dependencies: Check all cells referenced in formulas exist
Test edge cases: Include zero, negative, and very large values

从小规模开始：在大范围应用前，先在2-3个单元格测试公式
验证依赖项：检查公式中引用的所有单元格是否存在
测试边缘情况：包含零值、负值和极大值

Interpreting scripts/recalc.py Output

解读scripts/recalc.py输出

The script returns JSON with error details:

json

{
  "status": "success",           // or "errors_found"
  "total_errors": 0,              // Total error count
  "total_formulas": 42,           // Number of formulas in file
  "error_summary": {              // Only present if errors found
    "#REF!": {
      "count": 2,
      "locations": ["Sheet1!B5", "Sheet1!C10"]
    }
  }
}

脚本会返回包含错误详情的JSON：

json

{
  "status": "success",           // or "errors_found"
  "total_errors": 0,              // Total error count
  "total_formulas": 42,           // Number of formulas in file
  "error_summary": {              // Only present if errors found
    "#REF!": {
      "count": 2,
      "locations": ["Sheet1!B5", "Sheet1!C10"]
    }
  }
}

Best Practices

最佳实践

Library Selection

库选择

pandas: Best for data analysis, bulk operations, and simple data export
openpyxl: Best for complex formatting, formulas, and Excel-specific features

pandas：最适合数据分析、批量操作和简单数据导出
openpyxl：最适合复杂格式设置、公式编写和Excel专属功能

Working with openpyxl

使用openpyxl的注意事项

Cell indices are 1-based (row=1, column=1 refers to cell A1)

Use

data_only=True

to read calculated values:

load_workbook('file.xlsx', data_only=True)

Warning: If opened with
```
data_only=True
```
and saved, formulas are replaced with values and permanently lost
For large files: Use
```
read_only=True
```
for reading or
```
write_only=True
```
for writing
Formulas are preserved but not evaluated - use scripts/recalc.py to update values

单元格索引从1开始（row=1, column=1对应单元格A1）

使用

data_only=True

读取计算后的值：

load_workbook('file.xlsx', data_only=True)

警告：若使用
```
data_only=True
```
打开并保存文件，公式会被值替换且永久丢失
处理大文件：读取用
```
read_only=True
```
，写入用
```
write_only=True
```
公式会被保留但不会自动计算——需使用scripts/recalc.py更新值

Working with pandas

使用pandas的注意事项

Specify data types to avoid inference issues:

pd.read_excel('file.xlsx', dtype={'id': str})

For large files, read specific columns:

pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])

Handle dates properly:

pd.read_excel('file.xlsx', parse_dates=['date_column'])

指定数据类型以避免推断问题：

pd.read_excel('file.xlsx', dtype={'id': str})

处理大文件时，读取指定列：

pd.read_excel('file.xlsx', usecols=['A', 'C', 'E'])

正确处理日期：

pd.read_excel('file.xlsx', parse_dates=['date_column'])

Code Style Guidelines

代码风格指南

IMPORTANT: When generating Python code for Excel operations:

Write minimal, concise Python code without unnecessary comments
Avoid verbose variable names and redundant operations
Avoid unnecessary print statements

For Excel files themselves:

Add comments to cells with complex formulas or important assumptions
Document data sources for hardcoded values
Include notes for key calculations and model sections

重要：生成用于Excel操作的Python代码时：

编写简洁的Python代码，无需多余注释
避免冗长的变量名和冗余操作
避免不必要的打印语句

对于Excel文件本身：

为包含复杂公式或重要假设的单元格添加注释
为硬编码值添加数据源说明
为关键计算和模型部分添加注释