taiwan-equity-research-coverage
Taiwan Equity Research Coverage (My-TW-Coverage)
Skill by ara.so — Daily 2026 Skills collection.
A structured equity research database covering 1,735 Taiwan-listed companies (TWSE + OTC) across 99 industry sectors. Each report contains a business overview, supply chain mapping, customer/supplier relationships, and financial data — all cross-referenced through 4,900+ wikilinks forming a searchable knowledge graph.
Installation
```bash
git clone https://github.com/Timeverse/My-TW-Coverage
cd My-TW-Coverage
pip install yfinance pandas tabulate
```
Project Structure
```
My-TW-Coverage/
├── Pilot_Reports/                  # 1,735 ticker reports across 99 sectors
│   ├── Semiconductors/             # 155 tickers
│   ├── Electronic Components/      # 267 tickers
│   ├── Computer Hardware/          # 114 tickers
│   └── ... (99 sector folders)
├── scripts/
│   ├── utils.py                    # Shared utilities
│   ├── add_ticker.py               # Generate new ticker reports
│   ├── update_financials.py        # Refresh financial tables + valuation
│   ├── update_valuation.py         # Refresh valuation multiples only (fast)
│   ├── update_enrichment.py        # Update business descriptions from JSON
│   ├── audit_batch.py              # Quality auditing
│   ├── discover.py                 # Buzzword → related companies search
│   ├── build_wikilink_index.py     # Rebuild WIKILINKS.md index
│   ├── build_themes.py             # Generate thematic investment screens
│   └── build_network.py            # Generate D3.js network graph
├── WIKILINKS.md                    # Auto-generated browsable wikilink index
├── network/index.html              # Interactive D3.js wikilink network
├── themes/                         # Thematic investment screens (auto-generated)
└── task.md                         # Batch definitions and progress tracking
```
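The layout above implies one folder per sector with one markdown file per ticker. A minimal sketch for counting reports per sector, assuming that layout holds; `count_reports_by_sector` is a hypothetical helper, not one of the repository's scripts.

```python
from collections import Counter
from pathlib import Path


def count_reports_by_sector(reports_dir: str = "Pilot_Reports") -> Counter:
    """Count *.md files one level below reports_dir, keyed by folder name.

    Assumes the Pilot_Reports/<Sector>/<TICKER>_<Name>.md layout.
    """
    counts = Counter()
    base = Path(reports_dir)
    if not base.is_dir():
        return counts  # nothing to count if the directory is missing
    for md_file in base.glob("*/*.md"):
        counts[md_file.parent.name] += 1
    return counts
```

Calling `count_reports_by_sector().most_common(5)` from the repository root would list the five largest sector folders.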
Report Format
Each ticker report is a markdown file at `Pilot_Reports/<Sector>/<TICKER>_<Name>.md`:
```markdown
# 2330 - [[台積電]]

## 業務簡介
**板塊:** Technology
**產業:** Semiconductors
**市值:** 47,326,857 百萬台幣
**企業價值:** 44,978,990 百萬台幣

台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程...

## 供應鏈位置
**上游:** [[ASML]], [[Applied Materials]], [[SUMCO]]
**中游:** 台積電 (晶圓代工)
**下游:** [[Apple]], [[NVIDIA]], [[AMD]], [[Broadcom]]

## 主要客戶及供應商
### 主要客戶
- [[Apple]], [[NVIDIA]], [[AMD]], [[Qualcomm]]
### 主要供應商
- [[ASML]], [[Tokyo Electron]], [[Shin-Etsu]]

## 財務概況
### 估值指標
| P/E (TTM) | Forward P/E | P/S (TTM) | P/B | EV/EBITDA |
|---|---|---|---|---|
| 28.5 | 22.1 | 9.3 | 7.2 | 16.4 |

### 年度財務數據
[Annual 3-year financial table with 14 metrics]

### 季度財務數據
[Quarterly 4-quarter financial table]
```
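Because the valuation multiples sit in an ordinary pipe table, they can be read back without any markdown library. The sketch below assumes the table has a header row containing `P/E (TTM)`, a separator row, and a single value row, as in the sample report; `parse_valuation_table` is a hypothetical name.

```python
def parse_valuation_table(markdown: str) -> dict:
    """Parse the first pipe table whose header contains 'P/E (TTM)'.

    Assumes a header row, a |---| separator row, then one value row.
    """
    lines = [ln.strip() for ln in markdown.splitlines()]
    for i, line in enumerate(lines):
        if line.startswith("|") and "P/E (TTM)" in line and i + 2 < len(lines):
            headers = [c.strip() for c in line.strip("|").split("|")]
            values = [c.strip() for c in lines[i + 2].strip("|").split("|")]
            result = {}
            for key, raw in zip(headers, values):
                try:
                    result[key] = float(raw.replace(",", ""))
                except ValueError:
                    result[key] = None  # non-numeric cell such as "N/A"
            return result
    return {}  # no valuation table found
```

Non-numeric cells become `None` rather than raising, so a report with missing data still parses.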
Key Commands
Add a New Ticker
```bash
# Basic (auto-detect sector)
python scripts/add_ticker.py 2330 台積電

# With explicit sector
python scripts/add_ticker.py 2330 台積電 --sector Semiconductors
```
Update Financial Data
```bash
# Single ticker
python scripts/update_financials.py 2330

# Multiple tickers
python scripts/update_financials.py 2330 2454 3034

# By batch number (see task.md for batch definitions)
python scripts/update_financials.py --batch 101

# By sector
python scripts/update_financials.py --sector Semiconductors

# All 1,735 tickers (slow)
python scripts/update_financials.py
```
Update Valuation Only (~3x Faster)
Refreshes only P/E, Forward P/E, P/S, P/B, EV/EBITDA, and stock price; skips the full financial statement re-fetch.
```bash
python scripts/update_valuation.py 2330
python scripts/update_valuation.py --batch 101
python scripts/update_valuation.py --sector Semiconductors
python scripts/update_valuation.py   # All tickers
```
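For reference, the multiples listed above are standard ratios. The sketch below shows only the arithmetic under textbook definitions; it is not taken from `update_valuation.py`, and every function and parameter name here is hypothetical.

```python
from typing import Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator rounded to 2 places, or None if undefined."""
    if numerator is None or denominator is None or denominator <= 0:
        return None  # a multiple on zero or negative earnings is not meaningful
    return round(numerator / denominator, 2)


def valuation_multiples(price, eps_ttm, eps_forward, sales_per_share,
                        book_per_share, enterprise_value, ebitda) -> dict:
    """Compute the five multiples from per-share and company-level inputs."""
    return {
        "P/E (TTM)": safe_ratio(price, eps_ttm),
        "Forward P/E": safe_ratio(price, eps_forward),
        "P/S (TTM)": safe_ratio(price, sales_per_share),
        "P/B": safe_ratio(price, book_per_share),
        "EV/EBITDA": safe_ratio(enterprise_value, ebitda),
    }
```

Returning `None` for loss-making companies keeps the table cell empty instead of showing a misleading negative multiple; that choice is an assumption of this sketch.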
Discover Companies by Buzzword
Find every Taiwan-listed company related to a theme or technology:
```bash
# Basic search
python scripts/discover.py "液冷散熱"

# Auto-detect relevant sectors (skips banks/insurance/real estate for tech terms)
python scripts/discover.py "液冷散熱" --smart

# Tag matching companies with [[wikilinks]] in their reports
python scripts/discover.py "液冷散熱" --apply

# Apply + rebuild themes + rebuild network graph
python scripts/discover.py "液冷散熱" --apply --rebuild

# Limit to a specific sector
python scripts/discover.py "液冷散熱" --sector Semiconductors
```
Common buzzword examples:
- `"CoWoS"`: TSMC advanced packaging supply chain
- `"HBM"`: High Bandwidth Memory ecosystem
- `"電動車"`: EV component suppliers
- `"AI 伺服器"`: AI server supply chain (148 companies)
- `"光阻液"`: Photoresist suppliers and consumers
- `"碳化矽"`: Silicon carbide (SiC) companies
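The source does not show how `discover.py --apply` rewrites a report. One plausible approach, sketched here as an assumption, is to wrap each untagged occurrence of the term in `[[...]]` while leaving already-tagged ones alone. `tag_buzzword` is a hypothetical helper.

```python
import re


def tag_buzzword(text: str, term: str) -> str:
    """Wrap occurrences of term in [[...]] unless they already touch brackets.

    The lookbehind/lookahead skip a term directly preceded by "[[" or
    directly followed by "]]". A term buried in the middle of a longer
    wikilink is NOT detected; this sketch does not handle that case.
    """
    pattern = re.compile(r"(?<!\[\[)" + re.escape(term) + r"(?!\]\])")
    return pattern.sub(lambda m: f"[[{m.group(0)}]]", text)
```

The function is idempotent on plain text: running it twice gives the same result as running it once.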
Update Enrichment Content (Bulk AI Research)
Prepare a JSON file, then apply it to specific tickers, batches, or sectors:
```bash
python scripts/update_enrichment.py --data enrichment.json 2330
python scripts/update_enrichment.py --data enrichment.json --batch 101
python scripts/update_enrichment.py --data enrichment.json --sector Semiconductors
```
Enrichment JSON format:
```json
{
  "2330": {
    "desc": "台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程,為 [[Apple]]、[[NVIDIA]] 等科技巨頭提供晶片製造服務。",
    "supply_chain": "**上游:**\n- [[ASML]] (EUV 微影設備)\n- [[Applied Materials]] (薄膜沉積)\n**中游:**\n- **台積電** (晶圓代工)\n**下游:**\n- [[Apple]]\n- [[NVIDIA]]",
    "cust": "### 主要客戶\n- [[Apple]] (約25%營收)\n- [[NVIDIA]]\n- [[AMD]]\n\n### 主要供應商\n- [[ASML]]\n- [[Tokyo Electron]]"
  },
  "2454": {
    "desc": "...",
    "supply_chain": "...",
    "cust": "..."
  }
}
```
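Before applying an enrichment file it can help to check its shape locally. The sketch below checks only what the format above implies: string ticker keys and non-empty `desc`, `supply_chain`, and `cust` strings. `validate_enrichment` is a hypothetical helper, and the checks that `update_enrichment.py` itself performs are not shown in the source.

```python
REQUIRED_KEYS = ("desc", "supply_chain", "cust")


def validate_enrichment(data: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for ticker, entry in data.items():
        if not isinstance(ticker, str):
            errors.append(f"{ticker!r}: key should be a ticker string such as '2330'")
            continue
        if not isinstance(entry, dict):
            errors.append(f"{ticker}: entry should be an object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value.strip():
                errors.append(f"{ticker}: missing or empty '{key}'")
    return errors
```

A typical use would be `validate_enrichment(json.load(f))` right before running the update command.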
Audit Report Quality
```bash
# Single batch
python scripts/audit_batch.py 101 -v

# All batches
python scripts/audit_batch.py --all -v
```
Audit checks:
- Minimum 8 wikilinks per report
- No generic terms in brackets (e.g. `[[公司]]`, `[[產品]]`)
- No placeholder text remaining
- No English text in Chinese-language sections
- Metadata completeness (板塊, 產業, 市值, 企業價值)
- Section depth (業務簡介, 供應鏈位置, 主要客戶及供應商, 財務概況 all present)
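Several of the checks above are simple enough to approximate in a few lines. The sketch below covers the wikilink count, generic terms, required sections, and metadata fields; it is an illustration of those rules, not the logic of `audit_batch.py`. The generic-term set contains only the two examples given above, and `audit_report` is a hypothetical name.

```python
import re

GENERIC_TERMS = {"公司", "產品"}  # only the two examples named in the checklist
REQUIRED_SECTIONS = ("業務簡介", "供應鏈位置", "主要客戶及供應商", "財務概況")
REQUIRED_METADATA = ("板塊", "產業", "市值", "企業價值")
MIN_WIKILINKS = 8


def audit_report(text: str) -> list:
    """Return a list of issues found in one report; empty means it passes."""
    issues = []
    links = set(re.findall(r"\[\[([^\]]+)\]\]", text))
    if len(links) < MIN_WIKILINKS:
        issues.append(f"too few wikilinks: {len(links)} < {MIN_WIKILINKS}")
    generic = sorted(links & GENERIC_TERMS)
    if generic:
        issues.append(f"generic wikilinks: {generic}")
    for section in REQUIRED_SECTIONS:
        if section not in text:
            issues.append(f"missing section: {section}")
    for field in REQUIRED_METADATA:
        if f"{field}:" not in text:
            issues.append(f"missing metadata: {field}")
    return issues
```

The placeholder and English-text checks are left out because the source does not say how they are defined.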
Rebuild Wikilink Index
```bash
python scripts/build_wikilink_index.py
```
Regenerates `WIKILINKS.md`, a browsable index of all 4,900+ wikilinks categorized as Technologies, Materials, Applications, and Companies. Run after any enrichment update.
Generate Thematic Investment Screens
```bash
# Build all 20 themes
python scripts/build_themes.py

# Single theme
python scripts/build_themes.py "CoWoS"

# List available themes
python scripts/build_themes.py --list
```
Output goes to `themes/`; each page shows companies grouped by upstream/midstream/downstream role.
Generate Interactive Network Graph
```bash
# Default: min 5 co-occurrences
python scripts/build_network.py

# Fewer edges for cleaner view
python scripts/build_network.py --min-weight 10

# Only top 200 nodes
python scripts/build_network.py --top 200
```
Opens `network/index.html` in browser: a D3.js force-directed graph. Node colors:
- 🔴 Red = Taiwan company
- 🔵 Blue = International company
- 🟢 Green = Technology
- 🟠 Orange = Material
- 🟣 Purple = Application
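The `--min-weight` threshold is easier to reason about with the edge weights written out. The sketch below assumes that a co-occurrence means two wikilinks appearing in the same report, which the source implies but does not state. `cooccurrence_edges` is a hypothetical helper, not the code in `build_network.py`.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(reports: dict, min_weight: int = 5) -> dict:
    """Count how many reports mention each pair of wikilinks together.

    reports maps ticker -> iterable of wikilink names found in that report.
    Returns {(name_a, name_b): weight} for pairs with weight >= min_weight,
    with each pair stored in sorted order so (a, b) and (b, a) are one edge.
    """
    weights = Counter()
    for links in reports.values():
        for a, b in combinations(sorted(set(links)), 2):
            weights[(a, b)] += 1
    return {pair: w for pair, w in weights.items() if w >= min_weight}
```

Raising `min_weight` drops weakly connected pairs first, which is why a higher threshold gives a cleaner graph.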
Wikilink Graph: Core Feature
The wikilink graph is what makes this database powerful. Every `[[entity]]` in every report creates edges in a knowledge graph.
Search by entity to find related companies:
| Search | Results | Insight |
|---|---|---|
| | 207 companies | Apple's full Taiwan supplier network |
| | 277 companies | NVIDIA's Taiwan supply chain |
| | 469 companies | Taiwan semiconductor ecosystem |
| | 39 companies | TSMC advanced packaging players |
| | 148 companies | AI server supply chain |
| | 263 companies | Printed circuit board ecosystem |
| | 223 companies | EV component suppliers |

Browse: Open `WIKILINKS.md` for the full categorized index.
Code Examples
Read and Parse a Report
```python
import re
from pathlib import Path


def get_report(ticker: str, reports_dir: str = "Pilot_Reports") -> dict:
    """Find and parse a ticker report."""
    base = Path(reports_dir)
    # Find the file across all sector subdirectories
    matches = list(base.rglob(f"{ticker}_*.md"))
    if not matches:
        return {}
    content = matches[0].read_text(encoding="utf-8")
    # Extract all wikilinks
    wikilinks = re.findall(r'\[\[([^\]]+)\]\]', content)
    # Extract sector metadata
    sector_match = re.search(r'\*\*產業:\*\*\s*(.+)', content)
    board_match = re.search(r'\*\*板塊:\*\*\s*(.+)', content)
    return {
        "ticker": ticker,
        "file": str(matches[0]),
        "sector": sector_match.group(1).strip() if sector_match else None,
        "board": board_match.group(1).strip() if board_match else None,
        "wikilinks": list(set(wikilinks)),
        "wikilink_count": len(set(wikilinks)),
        "content": content
    }


# Usage
report = get_report("2330")
print(f"Sector: {report['sector']}")
print(f"Wikilinks ({report['wikilink_count']}): {report['wikilinks'][:10]}")
```
Build a Custom Wikilink Index
```python
import re
from pathlib import Path
from collections import defaultdict


def build_wikilink_index(reports_dir: str = "Pilot_Reports") -> dict:
    """
    Returns: {entity: [list of tickers that mention it]}
    """
    index = defaultdict(list)
    for md_file in Path(reports_dir).rglob("*.md"):
        # Extract ticker from filename (e.g. "2330_台積電.md" -> "2330")
        ticker = md_file.stem.split("_")[0]
        content = md_file.read_text(encoding="utf-8")
        wikilinks = set(re.findall(r'\[\[([^\]]+)\]\]', content))
        for link in wikilinks:
            index[link].append(ticker)
    # Sort by mention count
    return dict(sorted(index.items(), key=lambda x: len(x[1]), reverse=True))


# Find all companies in Apple's supply chain
index = build_wikilink_index()
apple_suppliers = index.get("Apple", [])
print(f"Apple supply chain: {len(apple_suppliers)} companies")
print(apple_suppliers[:20])

# Find companies involved in CoWoS
cowos_companies = index.get("CoWoS", [])
print(f"\nCoWoS ecosystem: {len(cowos_companies)} companies: {cowos_companies}")
```
Find Supply Chain Overlaps Between Two Entities
```python
# Relies on build_wikilink_index() from the previous example.
def supply_chain_overlap(entity_a: str, entity_b: str, reports_dir: str = "Pilot_Reports"):
    """Find tickers that appear in both entities' supply chains."""
    index = build_wikilink_index(reports_dir)
    set_a = set(index.get(entity_a, []))
    set_b = set(index.get(entity_b, []))
    overlap = set_a & set_b
    print(f"{entity_a}: {len(set_a)} companies")
    print(f"{entity_b}: {len(set_b)} companies")
    print(f"Overlap: {len(overlap)} companies: {sorted(overlap)}")
    return overlap


# Companies in both NVIDIA and Apple supply chains
supply_chain_overlap("NVIDIA", "Apple")

# Companies in both AI server and EV supply chains
supply_chain_overlap("AI 伺服器", "電動車")
```
Batch Financial Update with Error Handling
```python
import subprocess
import sys


def update_sector_financials(sector: str, valuation_only: bool = False):
    """Update financials for all tickers in a sector."""
    script = "update_valuation.py" if valuation_only else "update_financials.py"
    cmd = [sys.executable, f"scripts/{script}", "--sector", sector]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"Error: {result.stderr}")
    else:
        print(result.stdout)
    return result.returncode


# Update valuation multiples for semiconductors (fast)
update_sector_financials("Semiconductors", valuation_only=True)

# Full financial update for a sector
update_sector_financials("Electronic Components", valuation_only=False)
```
Prepare Enrichment JSON
```python
import json


def build_enrichment_entry(ticker: str, company_name: str,
                           description: str, upstream: list[str],
                           midstream: str, downstream: list[str],
                           customers: list[str], suppliers: list[str]) -> dict:
    """
    Build a properly formatted enrichment entry.
    All entity names in lists will be wrapped in [[wikilinks]].
    """
    def wikify(items):
        return "\n".join(f"- [[{item}]]" for item in items)

    supply_chain = (
        f"**上游:**\n{wikify(upstream)}\n"
        f"**中游:**\n- **{company_name}** ({midstream})\n"
        f"**下游:**\n{wikify(downstream)}"
    )
    cust_section = (
        f"### 主要客戶\n{wikify(customers)}\n\n"
        f"### 主要供應商\n{wikify(suppliers)}"
    )
    return {
        "desc": description,
        "supply_chain": supply_chain,
        "cust": cust_section
    }


# Build enrichment for multiple tickers
enrichment = {
    "2330": build_enrichment_entry(
        ticker="2330",
        company_name="台積電",
        description="台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程,為全球領先科技公司提供晶片製造服務。",
        upstream=["ASML", "Applied Materials", "SUMCO", "Tokyo Electron"],
        midstream="晶圓代工",
        downstream=["Apple", "NVIDIA", "AMD", "Broadcom", "Qualcomm"],
        customers=["Apple", "NVIDIA", "AMD", "Qualcomm", "MediaTek"],
        suppliers=["ASML", "Tokyo Electron", "Shin-Etsu", "Applied Materials"]
    )
}

# Save to file
with open("enrichment.json", "w", encoding="utf-8") as f:
    json.dump(enrichment, f, ensure_ascii=False, indent=2)
print("enrichment.json ready. Apply with:")
print("python scripts/update_enrichment.py --data enrichment.json 2330")
```
Audit a Batch Programmatically
```python
import subprocess
import sys


def audit_and_report(batch_id: int) -> dict:
    """Run audit and parse results."""
    # sys.executable keeps the audit on the same interpreter as this script
    result = subprocess.run(
        [sys.executable, "scripts/audit_batch.py", str(batch_id), "-v"],
        capture_output=True, text=True
    )
    output = result.stdout
    # Parse pass/fail counts from output
    passed = output.count("✓")
    failed = output.count("✗")
    return {
        "batch": batch_id,
        "passed": passed,
        "failed": failed,
        "pass_rate": passed / (passed + failed) if (passed + failed) > 0 else 0,
        "output": output
    }


results = audit_and_report(101)
print(f"Batch 101: {results['passed']} passed, {results['failed']} failed")
print(f"Pass rate: {results['pass_rate']:.1%}")
```
Common Workflows
Workflow 1: Research a New Investment Theme
```bash
# 1. Search for related companies
python scripts/discover.py "液冷散熱" --smart

# 2. Apply wikilinks to matching reports
python scripts/discover.py "液冷散熱" --apply

# 3. Rebuild themes to include new theme
python scripts/build_themes.py "液冷散熱"

# 4. Rebuild wikilink index and network
python scripts/build_wikilink_index.py
python scripts/build_network.py

# 5. Browse results
open themes/液冷散熱.md
open network/index.html
```
Workflow 2: Onboard a New Ticker
```bash
# 1. Add the report (Python script, free)
python scripts/add_ticker.py 6669 緯穎 --sector "Computer Hardware"

# 2. Update financial data
python scripts/update_financials.py 6669

# 3. Prepare enrichment JSON (use AI research or manual)
# Edit enrichment.json with business description, supply chain, customers

# 4. Apply enrichment
python scripts/update_enrichment.py --data enrichment.json 6669

# 5. Audit quality
python scripts/audit_batch.py --all -v

# 6. Rebuild index
python scripts/build_wikilink_index.py
```
Workflow 3: Refresh Valuation for Earnings Season
```bash
# Fast valuation refresh only (no full financial re-fetch)
python scripts/update_valuation.py --sector Semiconductors
python scripts/update_valuation.py --sector "Electronic Components"
python scripts/update_valuation.py --sector "Computer Hardware"

# Or refresh everything (slow, run overnight)
python scripts/update_valuation.py
```
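Sector names such as "Electronic Components" contain a space, so they must be quoted in a shell. When driving the refresh from Python, passing each command as an argument list sidesteps quoting altogether. A small sketch, using only the `--sector` flag shown above; `valuation_commands` is a hypothetical helper.

```python
import sys


def valuation_commands(sectors, python=sys.executable):
    """Return one argv list per sector for scripts/update_valuation.py.

    Each sector stays a single list element, so names containing spaces
    reach the script as one argument without any shell quoting.
    """
    return [
        [python, "scripts/update_valuation.py", "--sector", sector]
        for sector in sectors
    ]
```

Each list can be handed to `subprocess.run(cmd)` in a loop, checking `returncode` after every sector so that one failure does not hide the rest.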
Workflow 4: Map a Supply Chain
```python
# Find all Taiwan companies connected to a specific technology
from pathlib import Path


def map_supply_chain(technology: str, reports_dir: str = "Pilot_Reports"):
    results = {"upstream": [], "midstream": [], "downstream": []}
    tag = f"[[{technology}]]"
    for md_file in Path(reports_dir).rglob("*.md"):
        content = md_file.read_text(encoding="utf-8")
        if tag not in content:
            continue
        ticker = md_file.stem.split("_")[0]
        company = md_file.stem.split("_", 1)[1] if "_" in md_file.stem else ""
        # Detect position in supply chain
        if "**上游:**" in content and tag in content.split("**上游:**")[1].split("**中游:**")[0]:
            results["upstream"].append(f"{ticker} {company}")
        elif "**中游:**" in content:
            mid_section = content.split("**中游:**")[1].split("**下游:**")[0] if "**下游:**" in content else ""
            if tag in mid_section:
                results["midstream"].append(f"{ticker} {company}")
            else:
                results["downstream"].append(f"{ticker} {company}")
    return results


chain = map_supply_chain("CoWoS")
print(f"Upstream: {chain['upstream']}")
print(f"Midstream: {chain['midstream']}")
print(f"Downstream: {chain['downstream']}")
```
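The chained `split` calls above work on well-formed reports but are hard to extend. As an alternative sketch, the three role labels can be located once and the text sliced between them. It assumes the `**上游:**` / `**中游:**` / `**下游:**` labels shown in the report format, and that only the 供應鏈位置 section is passed in, since the last block runs to the end of the given text. `split_supply_chain` is a hypothetical helper.

```python
import re

ROLE_LABELS = {"upstream": "上游", "midstream": "中游", "downstream": "下游"}


def split_supply_chain(section_text: str) -> dict:
    """Return {role: [wikilink names]} for each labelled block in the section."""
    # Find every role label, then slice the text between consecutive labels.
    marks = sorted(
        (m.start(), m.end(), role)
        for role, label in ROLE_LABELS.items()
        for m in re.finditer(re.escape(f"**{label}:**"), section_text)
    )
    result = {role: [] for role in ROLE_LABELS}
    for i, (_start, end, role) in enumerate(marks):
        stop = marks[i + 1][0] if i + 1 < len(marks) else len(section_text)
        block = section_text[end:stop]
        result[role].extend(re.findall(r"\[\[([^\]]+)\]\]", block))
    return result
```

Because the labels are found by position rather than by fixed order, a section that omits one role or lists them in a different order still parses.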
Token Cost Reference
| Operation | Tokens Used | Command |
|---|---|---|
| Update financials | Free (yfinance) | |
| Update valuation | Free (yfinance) | |
| Discover (with results) | Free | |
| Audit | Free | |
| Build themes/network/index | Free | |
| | Medium | AI research per ticker |
| | Medium | 3–5 web searches per ticker |
| | Low–High | AI researches online |

Rule of thumb: Use Python scripts for bulk data operations. Use Claude Code slash commands only when AI research is needed for a specific ticker.
Troubleshooting
**`yfinance`:**
```python
import yfinance as yf

# Taiwan tickers need .TW suffix for TWSE, .TWO for OTC
tsmc = yf.Ticker("2330.TW")
print(tsmc.info.get("marketCap"))
```
**Report not found by scripts:**
- Filename must match pattern: `{TICKER}_{CompanyName}.md`
- Must be inside a subfolder of `Pilot_Reports/`
- Use `python scripts/utils.py` to test file discovery

**Audit fails "too few wikilinks":**
- Minimum 8 unique `[[wikilinks]]` required per report
- Use `update_enrichment.py` to add richer content
- Run `discover.py --apply` to auto-tag relevant wikilinks

**`build_network.py` produces empty graph:**
```bash
# Ensure reports exist and have wikilinks first
python scripts/build_wikilink_index.py            # Check WIKILINKS.md has entries
python scripts/build_network.py --min-weight 2    # Lower threshold
```
**Enrichment JSON rejected:**
- Ensure file is valid UTF-8 with `ensure_ascii=False`
- Keys must be ticker strings (`"2330"`, not `2330`)
- Required keys per entry: `desc`, `supply_chain`, `cust`
- Content must be in Traditional Chinese; English only for proper nouns

**Finding batch numbers:**
- See `task.md` for batch definitions and which tickers are in each batch
- Batches are used for incremental processing of the 1,735 tickers
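The suffix rule for `yfinance` symbols (`.TW` for TWSE, `.TWO` for OTC) can be wrapped in a small helper so that the wrong suffix fails loudly instead of silently returning no data. This is a sketch; `to_yfinance_symbol` and the market labels it accepts are assumptions, not part of the repository.

```python
SUFFIX_BY_MARKET = {"TWSE": ".TW", "OTC": ".TWO"}


def to_yfinance_symbol(ticker: str, market: str = "TWSE") -> str:
    """Append the exchange suffix for a Taiwan ticker.

    market is "TWSE" or "OTC" (case-insensitive); anything else raises.
    """
    try:
        suffix = SUFFIX_BY_MARKET[market.upper()]
    except KeyError:
        raise ValueError(f"unknown market: {market!r}") from None
    return f"{ticker.strip()}{suffix}"
```

The result can be passed straight to `yf.Ticker(...)`, for example `yf.Ticker(to_yfinance_symbol("2330"))`.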