xlsx

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Excel/Spreadsheet Processing

Excel/电子表格处理

Reading and Analyzing Data

读取与数据分析

python
import pandas as pd
python
import pandas as pd

Read Excel

Read Excel

df = pd.read_excel('file.xlsx') # Default: first sheet all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict
df = pd.read_excel('file.xlsx') # Default: first sheet all_sheets = pd.read_excel('file.xlsx', sheet_name=None) # All sheets as dict

Analyze

Analyze

df.head() # Preview data df.info() # Column info df.describe() # Statistics
df.head() # Preview data df.info() # Column info df.describe() # Statistics

Write Excel

Write Excel

df.to_excel('output.xlsx', index=False)
undefined
df.to_excel('output.xlsx', index=False)
undefined

Creating Excel Files with openpyxl

使用openpyxl创建Excel文件

python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment

wb = Workbook()
sheet = wb.active
python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment

wb = Workbook()
sheet = wb.active

Add data

Add data

sheet['A1'] = 'Hello' sheet['B1'] = 'World' sheet.append(['Row', 'of', 'data'])
sheet['A1'] = 'Hello' sheet['B1'] = 'World' sheet.append(['Row', 'of', 'data'])

Add formula - ALWAYS use formulas, not hardcoded values

Add formula - ALWAYS use formulas, not hardcoded values

sheet['B2'] = '=SUM(A1:A10)'
sheet['B2'] = '=SUM(A1:A10)'

Formatting

Formatting

sheet['A1'].font = Font(bold=True, color='FF0000') sheet['A1'].fill = PatternFill('solid', start_color='FFFF00') sheet['A1'].alignment = Alignment(horizontal='center')
sheet['A1'].font = Font(bold=True, color='FF0000') sheet['A1'].fill = PatternFill('solid', start_color='FFFF00') sheet['A1'].alignment = Alignment(horizontal='center')

Column width

Column width

sheet.column_dimensions['A'].width = 20
wb.save('output.xlsx')
undefined
sheet.column_dimensions['A'].width = 20
wb.save('output.xlsx')
undefined

Editing Existing Files

编辑现有文件

python
from openpyxl import load_workbook

wb = load_workbook('existing.xlsx')
sheet = wb.active
python
from openpyxl import load_workbook

wb = load_workbook('existing.xlsx')
sheet = wb.active

Modify cells

Modify cells

sheet['A1'] = 'New Value' sheet.insert_rows(2) sheet.delete_cols(3)
sheet['A1'] = 'New Value' sheet.insert_rows(2) sheet.delete_cols(3)

Add new sheet

Add new sheet

new_sheet = wb.create_sheet('NewSheet') new_sheet['A1'] = 'Data'
wb.save('modified.xlsx')
undefined
new_sheet = wb.create_sheet('NewSheet') new_sheet['A1'] = 'Data'
wb.save('modified.xlsx')
undefined

Critical: Use Formulas, Not Hardcoded Values

重要提示:使用公式,而非硬编码值

python
undefined
python
undefined

BAD - Hardcoding calculated values

BAD - Hardcoding calculated values

total = df['Sales'].sum() sheet['B10'] = total # Hardcodes 5000
total = df['Sales'].sum() sheet['B10'] = total # Hardcodes 5000

GOOD - Using Excel formulas

GOOD - Using Excel formulas

sheet['B10'] = '=SUM(B2:B9)' sheet['C5'] = '=(C4-C2)/C2' # Growth rate sheet['D20'] = '=AVERAGE(D2:D19)'
undefined
sheet['B10'] = '=SUM(B2:B9)' sheet['C5'] = '=(C4-C2)/C2' # Growth rate sheet['D20'] = '=AVERAGE(D2:D19)'
undefined

Financial Model Standards

财务模型标准

  • Blue text: Hardcoded inputs
  • Black text: ALL formulas
  • Green text: Links from other worksheets
  • Yellow background: Key assumptions
  • 蓝色文本:硬编码输入值
  • 黑色文本:所有公式
  • 绿色文本:来自其他工作表的链接
  • 黄色背景:关键假设

Best Practices

最佳实践

  • Use
    data_only=True
    to read calculated values
  • For large files: Use
    read_only=True
    or
    write_only=True
  • Formulas are preserved but not evaluated by openpyxl
  • 使用
    data_only=True
    读取计算后的值
  • 处理大文件时:使用
    read_only=True
    write_only=True
  • openpyxl会保留公式但不会对其进行计算