taiwan-equity-research-coverage


Taiwan Equity Research Coverage (My-TW-Coverage)

Skill by ara.so — Daily 2026 Skills collection.
A structured equity research database covering 1,735 Taiwan-listed companies (TWSE + OTC) across 99 industry sectors. Each report contains a business overview, supply chain mapping, customer/supplier relationships, and financial data — all cross-referenced through 4,900+ wikilinks forming a searchable knowledge graph.

Installation

```bash
git clone https://github.com/Timeverse/My-TW-Coverage
cd My-TW-Coverage
pip install yfinance pandas tabulate
```

Project Structure

```
My-TW-Coverage/
├── Pilot_Reports/              # 1,735 ticker reports across 99 sectors
│   ├── Semiconductors/         # 155 tickers
│   ├── Electronic Components/  # 267 tickers
│   ├── Computer Hardware/      # 114 tickers
│   └── ... (99 sector folders)
├── scripts/
│   ├── utils.py                # Shared utilities
│   ├── add_ticker.py           # Generate new ticker reports
│   ├── update_financials.py    # Refresh financial tables + valuation
│   ├── update_valuation.py     # Refresh valuation multiples only (fast)
│   ├── update_enrichment.py    # Update business descriptions from JSON
│   ├── audit_batch.py          # Quality auditing
│   ├── discover.py             # Buzzword → related companies search
│   ├── build_wikilink_index.py # Rebuild WIKILINKS.md index
│   ├── build_themes.py         # Generate thematic investment screens
│   └── build_network.py        # Generate D3.js network graph
├── WIKILINKS.md                # Auto-generated browsable wikilink index
├── network/index.html          # Interactive D3.js wikilink network
├── themes/                     # Thematic investment screens (auto-generated)
└── task.md                     # Batch definitions and progress tracking
```

Report Format

Each ticker report is a markdown file at `Pilot_Reports/<Sector>/<TICKER>_<Name>.md`:

```markdown
# 2330 - [[台積電]]

## 業務簡介
**板塊:** Technology
**產業:** Semiconductors
**市值:** 47,326,857 百萬台幣
**企業價值:** 44,978,990 百萬台幣

台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程...

## 供應鏈位置
**上游:** [[ASML]], [[Applied Materials]], [[SUMCO]]
**中游:** 台積電 (晶圓代工)
**下游:** [[Apple]], [[NVIDIA]], [[AMD]], [[Broadcom]]

## 主要客戶及供應商
### 主要客戶
- [[Apple]], [[NVIDIA]], [[AMD]], [[Qualcomm]]

### 主要供應商
- [[ASML]], [[Tokyo Electron]], [[Shin-Etsu]]

## 財務概況
### 估值指標
| P/E (TTM) | Forward P/E | P/S (TTM) | P/B | EV/EBITDA |
|-----------|-------------|-----------|-----|-----------|
| 28.5 | 22.1 | 9.3 | 7.2 | 16.4 |

### 年度財務數據
[Annual 3-year financial table with 14 metrics]

### 季度財務數據
[Quarterly 4-quarter financial table]
```

Key Commands

Add a New Ticker

```bash
# Basic (auto-detect sector)
python scripts/add_ticker.py 2330 台積電

# With explicit sector
python scripts/add_ticker.py 2330 台積電 --sector Semiconductors
```

Update Financial Data

```bash
# Single ticker
python scripts/update_financials.py 2330

# Multiple tickers
python scripts/update_financials.py 2330 2454 3034

# By batch number (see task.md for batch definitions)
python scripts/update_financials.py --batch 101

# By sector
python scripts/update_financials.py --sector Semiconductors

# All 1,735 tickers (slow)
python scripts/update_financials.py
```

Update Valuation Only (~3x Faster)

Refreshes only P/E, Forward P/E, P/S, P/B, EV/EBITDA, and stock price; skips the full financial statement re-fetch.

```bash
python scripts/update_valuation.py 2330
python scripts/update_valuation.py --batch 101
python scripts/update_valuation.py --sector Semiconductors
python scripts/update_valuation.py                          # All tickers
```

Discover Companies by Buzzword

Find every Taiwan-listed company related to a theme or technology:

```bash
# Basic search
python scripts/discover.py "液冷散熱"

# Auto-detect relevant sectors (skips banks/insurance/real estate for tech terms)
python scripts/discover.py "液冷散熱" --smart

# Tag matching companies with [[wikilinks]] in their reports
python scripts/discover.py "液冷散熱" --apply

# Apply + rebuild themes + rebuild network graph
python scripts/discover.py "液冷散熱" --apply --rebuild

# Limit to a specific sector
python scripts/discover.py "液冷散熱" --sector Semiconductors
```

Common buzzword examples:
- `"CoWoS"`: TSMC advanced packaging supply chain
- `"HBM"`: High Bandwidth Memory ecosystem
- `"電動車"`: EV component suppliers
- `"AI 伺服器"`: AI server supply chain (148 companies)
- `"光阻液"`: Photoresist suppliers and consumers
- `"碳化矽"`: Silicon carbide (SiC) companies
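
As a rough illustration of what a buzzword search involves, the sketch below scans every report under `Pilot_Reports/` for a term and returns the matching tickers, optionally limited to one sector folder. It is not the implementation of `discover.py`; the function name `find_mentions` and its parameters are assumptions made for this example.

```python
from pathlib import Path


def find_mentions(term, reports_dir="Pilot_Reports", sector=None):
    """Return (ticker, sector_folder) pairs for reports whose text contains term.

    Hypothetical helper for illustration; discover.py may work differently.
    """
    base = Path(reports_dir)
    hits = []
    for md_file in sorted(base.rglob("*.md")):
        folder = md_file.parent.name
        if sector is not None and folder != sector:
            continue
        if term in md_file.read_text(encoding="utf-8"):
            # Filenames follow <TICKER>_<Name>.md, so the ticker precedes the first underscore
            hits.append((md_file.stem.split("_")[0], folder))
    return hits
```

A call such as `find_mentions("液冷散熱", sector="Semiconductors")` would mirror the `--sector` form of the command, under the assumptions above.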

Update Enrichment Content (Bulk AI Research)

Prepare a JSON file, then apply it to specific tickers, batches, or sectors:

```bash
python scripts/update_enrichment.py --data enrichment.json 2330
python scripts/update_enrichment.py --data enrichment.json --batch 101
python scripts/update_enrichment.py --data enrichment.json --sector Semiconductors
```

Enrichment JSON format:

```json
{
  "2330": {
    "desc": "台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程,為 [[Apple]]、[[NVIDIA]] 等科技巨頭提供晶片製造服務。",
    "supply_chain": "**上游:**\n- [[ASML]] (EUV 微影設備)\n- [[Applied Materials]] (薄膜沉積)\n**中游:**\n- **台積電** (晶圓代工)\n**下游:**\n- [[Apple]]\n- [[NVIDIA]]",
    "cust": "### 主要客戶\n- [[Apple]] (約25%營收)\n- [[NVIDIA]]\n- [[AMD]]\n\n### 主要供應商\n- [[ASML]]\n- [[Tokyo Electron]]"
  },
  "2454": {
    "desc": "...",
    "supply_chain": "...",
    "cust": "..."
  }
}
```
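
Before applying an enrichment file, a quick structural check can catch the common mistakes. The sketch below checks only what this document states: entries are keyed by ticker and carry `desc`, `supply_chain`, and `cust`. `check_enrichment` and `check_enrichment_file` are hypothetical helpers, not part of the repository, and the rule that each value must be a non-empty string is an assumption.

```python
import json

REQUIRED_KEYS = ("desc", "supply_chain", "cust")


def check_enrichment(data):
    """Return a list of problems found in a parsed enrichment dict (empty if none)."""
    problems = []
    for ticker, entry in data.items():
        if not isinstance(entry, dict):
            problems.append(f"{ticker}: entry is not an object")
            continue
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            # Assumption: every required field should be a non-empty string
            if not isinstance(value, str) or not value.strip():
                problems.append(f"{ticker}: missing or empty '{key}'")
    return problems


def check_enrichment_file(path):
    """Load a UTF-8 JSON file and run check_enrichment on it."""
    with open(path, encoding="utf-8") as f:
        return check_enrichment(json.load(f))
```

Running `check_enrichment_file("enrichment.json")` before `update_enrichment.py` gives an empty list when every entry has the three fields.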

Audit Report Quality

```bash
# Single batch
python scripts/audit_batch.py 101 -v

# All batches
python scripts/audit_batch.py --all -v
```

Audit checks:
- Minimum 8 wikilinks per report
- No generic terms in brackets (e.g. `[[公司]]`, `[[產品]]`)
- No placeholder text remaining
- No English text in Chinese-language sections
- Metadata completeness (板塊, 產業, 市值, 企業價值)
- Section depth (業務簡介, 供應鏈位置, 主要客戶及供應商, 財務概況 all present)
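
Three of the checks listed above are simple enough to sketch. The function below is an illustration only, not the code in `audit_batch.py`: `audit_text` and `GENERIC_TERMS` are hypothetical names, and the generic-term set contains just the two examples given in the list.

```python
import re

REQUIRED_SECTIONS = ("業務簡介", "供應鏈位置", "主要客戶及供應商", "財務概況")
GENERIC_TERMS = {"公司", "產品"}  # assumption: only the two examples named in the list
MIN_WIKILINKS = 8


def audit_text(content):
    """Return a list of failed checks for one report's markdown text."""
    failures = []
    # Count unique wikilinks, so repeated mentions of one entity count once
    links = set(re.findall(r"\[\[([^\]]+)\]\]", content))
    if len(links) < MIN_WIKILINKS:
        failures.append(f"too few wikilinks ({len(links)} < {MIN_WIKILINKS})")
    generic = sorted(links & GENERIC_TERMS)
    if generic:
        failures.append(f"generic wikilinks: {', '.join(generic)}")
    missing = [s for s in REQUIRED_SECTIONS if s not in content]
    if missing:
        failures.append(f"missing sections: {', '.join(missing)}")
    return failures
```

An empty return value means the text passed these three checks; the placeholder, language, and metadata checks are not covered here.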

Rebuild Wikilink Index

```bash
python scripts/build_wikilink_index.py
```

Regenerates `WIKILINKS.md`, a browsable index of all 4,900+ wikilinks categorized as Technologies, Materials, Applications, and Companies. Run after any enrichment update.

Generate Thematic Investment Screens

```bash
# Build all 20 themes
python scripts/build_themes.py

# Single theme
python scripts/build_themes.py "CoWoS"

# List available themes
python scripts/build_themes.py --list
```

Output goes to `themes/`; each page shows companies grouped by upstream/midstream/downstream role.

Generate Interactive Network Graph

```bash
# Default: min 5 co-occurrences
python scripts/build_network.py

# Fewer edges for cleaner view
python scripts/build_network.py --min-weight 10

# Only top 200 nodes
python scripts/build_network.py --top 200
```

Opens `network/index.html` in browser as a D3.js force-directed graph. Node colors:
- 🔴 Red = Taiwan company
- 🔵 Blue = International company
- 🟢 Green = Technology
- 🟠 Orange = Material
- 🟣 Purple = Application
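
The `--min-weight` threshold suggests that edges are weighted by how often two wikilinks appear in the same report. A minimal sketch of that idea follows, assuming an edge's weight is the number of reports that mention both entities; `cooccurrence_edges` is a hypothetical name and `build_network.py` may compute weights differently.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_edges(reports, min_weight=5):
    """Count entity pairs that share a report and keep pairs seen at least min_weight times.

    reports: iterable of markdown strings, one per ticker report.
    """
    counts = Counter()
    for content in reports:
        entities = sorted(set(re.findall(r"\[\[([^\]]+)\]\]", content)))
        # Sorting puts each pair in one canonical order, so (A, B) and (B, A) merge
        counts.update(combinations(entities, 2))
    return {pair: n for pair, n in counts.items() if n >= min_weight}
```

Raising `min_weight` drops weakly connected pairs, which is why a higher threshold gives a cleaner view.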

Wikilink Graph — Core Feature

The wikilink graph is what makes this database powerful. Every `[[entity]]` in every report creates edges in a knowledge graph.

Search by entity to find related companies:

| Search | Results | Insight |
|--------|---------|---------|
| `[[Apple]]` | 207 companies | Apple's full Taiwan supplier network |
| `[[NVIDIA]]` | 277 companies | NVIDIA's Taiwan supply chain |
| `[[台積電]]` | 469 companies | Taiwan semiconductor ecosystem |
| `[[CoWoS]]` | 39 companies | TSMC advanced packaging players |
| `[[AI 伺服器]]` | 148 companies | AI server supply chain |
| `[[PCB]]` | 263 companies | Printed circuit board ecosystem |
| `[[電動車]]` | 223 companies | EV component suppliers |

Browse: Open `WIKILINKS.md` for the full categorized index.

Code Examples

Read and Parse a Report

```python
import re
from pathlib import Path

def get_report(ticker: str, reports_dir: str = "Pilot_Reports") -> dict:
    """Find and parse a ticker report."""
    base = Path(reports_dir)
    # Find the file across all sector subdirectories
    matches = list(base.rglob(f"{ticker}_*.md"))
    if not matches:
        return {}

    content = matches[0].read_text(encoding="utf-8")

    # Extract all wikilinks
    wikilinks = re.findall(r'\[\[([^\]]+)\]\]', content)

    # Extract sector metadata
    sector_match = re.search(r'\*\*產業:\*\*\s*(.+)', content)
    board_match = re.search(r'\*\*板塊:\*\*\s*(.+)', content)

    return {
        "ticker": ticker,
        "file": str(matches[0]),
        "sector": sector_match.group(1).strip() if sector_match else None,
        "board": board_match.group(1).strip() if board_match else None,
        "wikilinks": list(set(wikilinks)),
        "wikilink_count": len(set(wikilinks)),
        "content": content
    }

# Usage
report = get_report("2330")
print(f"Sector: {report['sector']}")
print(f"Wikilinks ({report['wikilink_count']}): {report['wikilinks'][:10]}")
```

Build a Custom Wikilink Index

```python
import re
from pathlib import Path
from collections import defaultdict

def build_wikilink_index(reports_dir: str = "Pilot_Reports") -> dict:
    """
    Returns: {entity: [list of tickers that mention it]}
    """
    index = defaultdict(list)

    for md_file in Path(reports_dir).rglob("*.md"):
        # Extract ticker from filename (e.g. "2330_台積電.md" -> "2330")
        ticker = md_file.stem.split("_")[0]
        content = md_file.read_text(encoding="utf-8")
        wikilinks = set(re.findall(r'\[\[([^\]]+)\]\]', content))

        for link in wikilinks:
            index[link].append(ticker)

    # Sort by mention count
    return dict(sorted(index.items(), key=lambda x: len(x[1]), reverse=True))

# Find all companies in Apple's supply chain
index = build_wikilink_index()
apple_suppliers = index.get("Apple", [])
print(f"Apple supply chain: {len(apple_suppliers)} companies")
print(apple_suppliers[:20])

# Find companies involved in CoWoS
cowos_companies = index.get("CoWoS", [])
print(f"\nCoWoS ecosystem: {len(cowos_companies)} companies: {cowos_companies}")
```

Find Supply Chain Overlaps Between Two Entities

```python
# Uses build_wikilink_index() from the previous example
def supply_chain_overlap(entity_a: str, entity_b: str, reports_dir: str = "Pilot_Reports"):
    """Find tickers that appear in both entities' supply chains."""
    index = build_wikilink_index(reports_dir)

    set_a = set(index.get(entity_a, []))
    set_b = set(index.get(entity_b, []))
    overlap = set_a & set_b

    print(f"{entity_a}: {len(set_a)} companies")
    print(f"{entity_b}: {len(set_b)} companies")
    print(f"Overlap: {len(overlap)} companies: {sorted(overlap)}")
    return overlap

# Companies in both NVIDIA and Apple supply chains
supply_chain_overlap("NVIDIA", "Apple")

# Companies in both AI server and EV supply chains
supply_chain_overlap("AI 伺服器", "電動車")
```

Batch Financial Update with Error Handling

```python
import subprocess
import sys

def update_sector_financials(sector: str, valuation_only: bool = False):
    """Update financials for all tickers in a sector."""
    script = "update_valuation.py" if valuation_only else "update_financials.py"
    cmd = [sys.executable, f"scripts/{script}", "--sector", sector]

    result = subprocess.run(cmd, capture_output=True, text=True)

    if result.returncode != 0:
        print(f"Error: {result.stderr}")
    else:
        print(result.stdout)

    return result.returncode

# Update valuation multiples for semiconductors (fast)
update_sector_financials("Semiconductors", valuation_only=True)

# Full financial update for a sector
update_sector_financials("Electronic Components", valuation_only=False)
```

Prepare Enrichment JSON

```python
import json

def build_enrichment_entry(ticker: str, company_name: str,
                           description: str, upstream: list[str],
                           midstream: str, downstream: list[str],
                           customers: list[str], suppliers: list[str]) -> dict:
    """
    Build a properly formatted enrichment entry.
    All entity names in lists will be wrapped in [[wikilinks]].
    """
    def wikify(items):
        return "\n".join(f"- [[{item}]]" for item in items)

    supply_chain = (
        f"**上游:**\n{wikify(upstream)}\n"
        f"**中游:**\n- **{company_name}** ({midstream})\n"
        f"**下游:**\n{wikify(downstream)}"
    )

    cust_section = (
        f"### 主要客戶\n{wikify(customers)}\n\n"
        f"### 主要供應商\n{wikify(suppliers)}"
    )

    return {
        "desc": description,
        "supply_chain": supply_chain,
        "cust": cust_section
    }

# Build enrichment for multiple tickers
enrichment = {
    "2330": build_enrichment_entry(
        ticker="2330",
        company_name="台積電",
        description="台積電為全球最大晶圓代工廠,專注於 [[CoWoS]]、[[3奈米]] 先進製程,為全球領先科技公司提供晶片製造服務。",
        upstream=["ASML", "Applied Materials", "SUMCO", "Tokyo Electron"],
        midstream="晶圓代工",
        downstream=["Apple", "NVIDIA", "AMD", "Broadcom", "Qualcomm"],
        customers=["Apple", "NVIDIA", "AMD", "Qualcomm", "MediaTek"],
        suppliers=["ASML", "Tokyo Electron", "Shin-Etsu", "Applied Materials"]
    )
}

# Save to file
with open("enrichment.json", "w", encoding="utf-8") as f:
    json.dump(enrichment, f, ensure_ascii=False, indent=2)

print("enrichment.json ready. Apply with:")
print("python scripts/update_enrichment.py --data enrichment.json 2330")
```

Audit a Batch Programmatically

```python
import subprocess
import sys

def audit_and_report(batch_id: int) -> dict:
    """Run audit and parse results."""
    # sys.executable runs the script with the same interpreter as this code
    result = subprocess.run(
        [sys.executable, "scripts/audit_batch.py", str(batch_id), "-v"],
        capture_output=True, text=True
    )

    output = result.stdout

    # Parse pass/fail counts from output
    # (relies on the audit output marking each result with ✓ or ✗)
    passed = output.count("✓")
    failed = output.count("✗")

    return {
        "batch": batch_id,
        "passed": passed,
        "failed": failed,
        "pass_rate": passed / (passed + failed) if (passed + failed) > 0 else 0,
        "output": output
    }

results = audit_and_report(101)
print(f"Batch 101: {results['passed']} passed, {results['failed']} failed")
print(f"Pass rate: {results['pass_rate']:.1%}")
```

Common Workflows

Workflow 1: Research a New Investment Theme

```bash
# 1. Search for related companies
python scripts/discover.py "液冷散熱" --smart

# 2. Apply wikilinks to matching reports
python scripts/discover.py "液冷散熱" --apply

# 3. Rebuild themes to include new theme
python scripts/build_themes.py "液冷散熱"

# 4. Rebuild wikilink index and network
python scripts/build_wikilink_index.py
python scripts/build_network.py

# 5. Browse results
open themes/液冷散熱.md
open network/index.html
```

Workflow 2: Onboard a New Ticker

```bash
# 1. Add the report (Python script, free)
python scripts/add_ticker.py 6669 緯穎 --sector "Computer Hardware"

# 2. Update financial data
python scripts/update_financials.py 6669

# 3. Prepare enrichment JSON (use AI research or manual)
# Edit enrichment.json with business description, supply chain, customers

# 4. Apply enrichment
python scripts/update_enrichment.py --data enrichment.json 6669

# 5. Audit quality
python scripts/audit_batch.py --all -v

# 6. Rebuild index
python scripts/build_wikilink_index.py
```

Workflow 3: Refresh Valuation for Earnings Season

```bash
# Fast valuation refresh only (no full financial re-fetch)
python scripts/update_valuation.py --sector Semiconductors
python scripts/update_valuation.py --sector "Electronic Components"
python scripts/update_valuation.py --sector "Computer Hardware"

# Or refresh everything (slow, run overnight)
python scripts/update_valuation.py
```

Workflow 4: Map a Supply Chain

```python
# Find all Taiwan companies connected to a specific technology
from pathlib import Path

def section(text: str, start: str, end: str) -> str:
    """Return the text between two markers, or "" if the start marker is absent."""
    if start not in text:
        return ""
    return text.split(start, 1)[1].split(end, 1)[0]

def map_supply_chain(technology: str, reports_dir: str = "Pilot_Reports") -> dict:
    results = {"upstream": [], "midstream": [], "downstream": []}
    tag = f"[[{technology}]]"

    for md_file in Path(reports_dir).rglob("*.md"):
        content = md_file.read_text(encoding="utf-8")
        if tag not in content:
            continue

        ticker, _, company = md_file.stem.partition("_")
        label = f"{ticker} {company}"

        # Detect position in supply chain; a mention found in neither the
        # upstream nor the midstream text is counted as downstream
        if tag in section(content, "**上游:**", "**中游:**"):
            results["upstream"].append(label)
        elif tag in section(content, "**中游:**", "**下游:**"):
            results["midstream"].append(label)
        else:
            results["downstream"].append(label)

    return results

chain = map_supply_chain("CoWoS")
print(f"Upstream: {chain['upstream']}")
print(f"Midstream: {chain['midstream']}")
print(f"Downstream: {chain['downstream']}")
```

Token Cost Reference

| Operation | Tokens Used | Command |
|-----------|-------------|---------|
| Update financials | Free (yfinance) | `python scripts/update_financials.py` |
| Update valuation | Free (yfinance) | `python scripts/update_valuation.py` |
| Discover (with results) | Free | `python scripts/discover.py "term"` |
| Audit | Free | `python scripts/audit_batch.py` |
| Build themes/network/index | Free | `python scripts/build_*.py` |
| `/add-ticker` (Claude Code) | Medium | AI research per ticker |
| `/update-enrichment` (Claude Code) | Medium | 3–5 web searches per ticker |
| `/discover` (no results found) | Low–High | AI researches online |

Rule of thumb: Use Python scripts for bulk data operations. Use Claude Code slash commands only when AI research is needed for a specific ticker.

Troubleshooting

**`yfinance` returns no data for a Taiwan ticker:**
```python
import yfinance as yf

# Taiwan tickers need .TW suffix for TWSE, .TWO for OTC
tsmc = yf.Ticker("2330.TW")
print(tsmc.info.get("marketCap"))
```
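
When it is unclear whether a ticker trades on TWSE or OTC, one option is to try both suffixes in turn. The sketch below is an illustration under stated assumptions: `candidate_symbols` and `first_symbol_with_data` are hypothetical helpers, and treating a missing `marketCap` as "no data" is a heuristic rather than documented yfinance behavior.

```python
def candidate_symbols(ticker):
    """Yahoo Finance symbols to try for a Taiwan ticker: TWSE first, then OTC."""
    return [f"{ticker}.TW", f"{ticker}.TWO"]


def first_symbol_with_data(ticker):
    """Return the first candidate symbol that reports a market cap, or None."""
    import yfinance as yf  # imported lazily so candidate_symbols works without it

    for symbol in candidate_symbols(ticker):
        try:
            info = yf.Ticker(symbol).info
        except Exception:
            # Assumption: any lookup error means this symbol has no usable data
            continue
        if info.get("marketCap"):
            return symbol
    return None
```

`first_symbol_with_data` needs network access, so results depend on what Yahoo Finance returns at the time of the call.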

**Report not found by scripts:**
- Filename must match pattern: `{TICKER}_{CompanyName}.md`
- Must be inside a subfolder of `Pilot_Reports/`
- Use `python scripts/utils.py` to test file discovery

**Audit fails "too few wikilinks":**
- Minimum 8 unique `[[wikilinks]]` required per report
- Use `update_enrichment.py` to add richer content
- Run `discover.py --apply` to auto-tag relevant wikilinks

**`build_network.py` produces empty graph:**
```bash
# Ensure reports exist and have wikilinks first
python scripts/build_wikilink_index.py            # Check WIKILINKS.md has entries
python scripts/build_network.py --min-weight 2    # Lower threshold
```
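
The filename rule under "Report not found by scripts" can be checked with a short script. This is a sketch, not the discovery logic in `scripts/utils.py`; `misnamed_reports` is a hypothetical name, and the pattern assumes a ticker is made of digits and ASCII letters.

```python
import re
from pathlib import Path

# Assumption: a ticker is digits/letters, followed by "_" and a non-empty company name
REPORT_NAME = re.compile(r"^[0-9A-Za-z]+_.+\.md$")


def misnamed_reports(reports_dir="Pilot_Reports"):
    """List markdown files that are misnamed or not inside a sector subfolder."""
    base = Path(reports_dir)
    bad = []
    for md_file in sorted(base.rglob("*.md")):
        in_subfolder = md_file.parent != base
        if not in_subfolder or not REPORT_NAME.match(md_file.name):
            bad.append(str(md_file.relative_to(base)))
    return bad
```

An empty list means every markdown file under the reports directory sits in a subfolder and follows the naming pattern.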

**Enrichment JSON rejected:**
- Ensure file is valid UTF-8 with `ensure_ascii=False`
- Keys must be ticker strings (`"2330"`, not `2330`)
- Required keys per entry: `desc`, `supply_chain`, `cust`
- Content must be in Traditional Chinese; English only for proper nouns

**Finding batch numbers:**
- See `task.md` for batch definitions and which tickers are in each batch
- Batches are used for incremental processing of the 1,735 tickers