market-data
Market Data
Purpose
Guide the design and management of market data infrastructure for financial services
firms. Covers real-time and delayed market data, depth of book levels, consolidated
tape and direct feeds, data vendor selection and management, market data licensing and
entitlements, data distribution architecture, and market data quality management.
Enables building or evaluating market data infrastructure that delivers accurate,
timely data to trading, portfolio management, and client-facing systems.
Layer
13 — Data Integration (Reference Data & Integration)
Direction
both
When to Use
- A firm is designing or upgrading its market data infrastructure
- Questions arise about real-time vs delayed data or Level 1/2/3 requirements
- A firm is evaluating market data vendors (Bloomberg, Refinitiv, ICE, FactSet)
- Questions involve consolidated tape (SIP) vs direct exchange feeds
- A firm needs to manage exchange data licensing, entitlements, or usage reporting
- A technology team is designing ticker plants, fan-out, or distribution architecture
- Data quality issues arise: stale quotes, missing ticks, erroneous prints
- Trigger phrases: "market data," "Level 1/2/3," "depth of book," "consolidated tape," "SIP," "direct feed," "NBBO," "ticker plant," "B-PIPE," "data license," "non-display use," "market data entitlements," "conflation," "tick data," "real-time feed"
Core Concepts
1. Market Data Types
- Trade data: Last sale price, quantity, timestamp, condition codes (regular, odd lot, opening/closing), cumulative volume, VWAP.
- Quote data: NBBO (best bid/offer across all exchanges), bid/ask sizes. Quote updates vastly outnumber trade updates in liquid instruments.
- Depth of book: Multiple price levels beyond the NBBO with resting order quantities. Aggregated depth (size per level) or order-by-order (individual orders visible).
- Index data: Real-time index values, composition, weightings, total return values. Sources include exchange-disseminated values (e.g., S&P 500 via Cboe) and index providers (MSCI, FTSE Russell).
- Fixed income pricing: Dealer quotes, evaluated pricing (ICE, Bloomberg BVAL), TRACE trade reports for corporates, EMMA for municipals. Inherently less standardized than equities.
- Options data: Chains (strikes/expirations), Greeks, implied volatility, volume, open interest. OPRA provides the consolidated options feed.
- Fundamental data: Earnings, financial statements, corporate actions, analyst estimates. Sourced from vendors (Bloomberg, FactSet, S&P Capital IQ) rather than exchange feeds.
- News and events: Headlines, economic calendar (FOMC, employment), corporate events (earnings dates, ex-dates), sentiment scores.
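As a minimal sketch of how the trade fields above combine, the following computes cumulative volume and VWAP from a list of trades. The `Trade` class, the condition labels, and the rule that only "regular" prints count toward VWAP are illustrative assumptions; actual eligibility rules depend on the venue and vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    price: float
    quantity: int
    condition: str = "regular"  # hypothetical condition label


def running_vwap(trades, eligible=("regular",)):
    """Return (cumulative_volume, vwap) over trades with an eligible condition."""
    notional = 0.0
    volume = 0
    for t in trades:
        if t.condition not in eligible:
            continue  # assumption: ineligible prints are excluded entirely
        notional += t.price * t.quantity
        volume += t.quantity
    vwap = notional / volume if volume else None
    return volume, vwap


# Hypothetical prints for one symbol
trades = [
    Trade(100.00, 200),
    Trade(100.10, 300),
    Trade(99.50, 50, condition="odd_lot"),
    Trade(100.20, 500),
]
volume, vwap = running_vwap(trades)
print(volume, round(vwap, 4))
```

VWAP here is total notional divided by total eligible volume, so the odd-lot print affects neither figure under the stated assumption.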
2. Data Levels
Level 1 — Top of Book: NBBO, last sale, volume, daily OHLC. Sufficient for portfolio
management, client reporting, and order entry. Lowest cost and bandwidth.
Level 2 — Market Depth: Multiple price levels with aggregate size (top 5-20 levels per
side). Reveals liquidity beyond the NBBO. Essential for active trading, market impact
assessment, and algorithmic execution (TWAP, VWAP). Higher cost and bandwidth.
Level 3 — Full Order Book: Individual order detail (price, size, order ID) enabling
complete book reconstruction and order lifecycle tracking. Provided by direct feeds (Nasdaq
ITCH, NYSE Arca). Required for market making, HFT, and queue position modeling. Highest
cost — hundreds of thousands of messages per second per exchange.
| Use Case | Level | Rationale |
|---|---|---|
| Portfolio management / reporting | Level 1 | NBBO and last sale sufficient for valuation |
| Active equity trading desk | Level 2 | Traders assess depth before large orders |
| Algorithmic execution | Level 2 | Algorithms adapt pace based on available liquidity |
| Market making / HFT | Level 3 | Requires queue position and order flow modeling |
| Client-facing app (delayed) | Level 1 (delayed) | Display only, 15-minute delay acceptable |
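The relationship between the three levels can be sketched for a single venue's book: order-by-order data (Level 3) aggregates into per-price totals (Level 2), and the best level on each side is the top of book (Level 1). The `Order` class and field names are hypothetical, and the result is one venue's best bid and offer, not the NBBO.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    side: str  # "B" for bid, "S" for offer
    price: float
    size: int


def aggregate_depth(orders, side, levels=5):
    """Collapse order-by-order data into (price, total_size) per level."""
    totals = defaultdict(int)
    for o in orders:
        if o.side == side:
            totals[o.price] += o.size
    # Best bid is the highest price; best offer is the lowest.
    ranked = sorted(totals.items(), reverse=(side == "B"))
    return ranked[:levels]


def top_of_book(orders):
    """Return the best (price, size) on each side, or None for an empty side."""
    bids = aggregate_depth(orders, "B", levels=1)
    asks = aggregate_depth(orders, "S", levels=1)
    return (bids[0] if bids else None, asks[0] if asks else None)


# Hypothetical resting orders on one venue
orders = [
    Order(1, "B", 10.00, 300), Order(2, "B", 10.00, 200), Order(3, "B", 9.99, 500),
    Order(4, "S", 10.02, 100), Order(5, "S", 10.03, 400), Order(6, "S", 10.02, 150),
]
bid_levels = aggregate_depth(orders, "B")
ask_levels = aggregate_depth(orders, "S")
best_bid, best_ask = top_of_book(orders)
print(bid_levels, ask_levels, best_bid, best_ask)
```

Aggregation discards order IDs and arrival order, which is why queue position modeling needs the Level 3 view.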
3. Consolidated Tape vs Direct Feeds
Securities Information Processors (SIPs): CTA/CQS for Tape A (NYSE-listed) and Tape B
(securities listed on other non-Nasdaq exchanges), UTP for Nasdaq-listed (Tape C), OPRA
for options. SIPs collect data from all exchanges, compute the NBBO, and disseminate a
consolidated stream. Under Reg NMS, the SIP NBBO is the regulatory benchmark for best
execution.
Direct exchange feeds: Proprietary feeds from individual exchanges (NYSE Arca, Nasdaq
TotalView/ITCH, Cboe PITCH, IEX DEEP) delivering order-by-order data with lower latency
than the SIP. A firm must subscribe to multiple feeds and compute NBBO internally. Each
exchange uses different protocols requiring per-exchange parsers.
| Dimension | SIP (Consolidated) | Direct Feeds |
|---|---|---|
| Latency | Higher (~10-50 microseconds SIP processing) | Lower (bypasses SIP) |
| NBBO | Provided directly | Must compute from multiple feeds |
| Data depth | Level 1 (NBBO + last sale) | Level 2/3 (full depth, order-by-order) |
| Cost | Lower, predictable | Higher, scales with exchange count |
| Normalization | Pre-normalized | Requires per-exchange parsers |
| Typical consumer | Buy-side, advisory, retail | Prop trading, market making, HFT |
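Computing an NBBO internally from direct feeds reduces, in its simplest form, to taking the highest bid and lowest offer across venues. The sketch below assumes one current top-of-book quote per venue; the venue names and `VenueQuote` class are hypothetical. It ignores protected-quote rules, round-lot eligibility, timestamps, and locked or crossed markets, all of which a production calculator would need to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VenueQuote:
    venue: str
    bid: float
    bid_size: int
    ask: float
    ask_size: int


def compute_nbbo(quotes):
    """Return best bid/offer across venues, summing size at the best price."""
    if not quotes:
        return None
    best_bid = max(q.bid for q in quotes)
    best_ask = min(q.ask for q in quotes)
    return {
        "bid": best_bid,
        "bid_size": sum(q.bid_size for q in quotes if q.bid == best_bid),
        "bid_venues": sorted(q.venue for q in quotes if q.bid == best_bid),
        "ask": best_ask,
        "ask_size": sum(q.ask_size for q in quotes if q.ask == best_ask),
        "ask_venues": sorted(q.venue for q in quotes if q.ask == best_ask),
    }


# Hypothetical top-of-book quotes from three venues
quotes = [
    VenueQuote("X", 10.00, 300, 10.03, 200),
    VenueQuote("Y", 10.01, 100, 10.03, 500),
    VenueQuote("Z", 10.01, 400, 10.04, 100),
]
nbbo = compute_nbbo(quotes)
print(nbbo)
```

Because each direct feed arrives with its own latency, the result can differ momentarily from the SIP NBBO even when both are correct for their respective inputs.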
4. Market Data Vendors
Bloomberg: Terminal ($20K-$25K/user/year), B-PIPE (enterprise real-time feed), Data
License (bulk EOD/reference data), BEAP (cloud API).
Refinitiv (LSEG): Eikon (desktop, lower cost than Bloomberg, strong FX/FI), Elektron/
LSEG Real-Time (enterprise feed), DataScope (bulk EOD), Tick History (historical ticks).
ICE Data Services: Consolidated feeds, evaluated fixed income pricing (widely used for
NAV and regulatory reporting), ICE Benchmark Administration.
FactSet: Research-oriented, flexible API delivery, competitive pricing for smaller
buy-side, strong Excel/portfolio management integration.
S&P Capital IQ / Market Intelligence: Comprehensive fundamentals, credit ratings,
company filings.
Morningstar: Fund/ETF data, ratings, Morningstar Direct for research.
Free/open sources: Exchange websites and financial portals provide delayed (15-min)
quotes. Useful for non-time-sensitive display but limited reliability and coverage.
Vendor selection criteria: Asset class coverage, latency, reliability/uptime SLA, API
quality, total licensing cost (including exchange fees), historical data depth, support,
data quality handling.
5. Market Data Licensing and Entitlements
License categories: Non-professional (retail, personal use, lower fees) vs professional
(business use, significantly higher). Display (human views on screen) vs non-display
(automated systems: algorithms, risk engines, pricing — fees based on application type, not
per-user). Derived data (substantially transformed; redistribution may be permitted if
original data cannot be reverse-engineered; policies vary by exchange).
Licensing models: Per-user/per-device (exact monthly count required), enterprise (flat
fee covering a defined entity), usage-based non-display (fees by application category:
trading, risk, valuation).
Reporting obligations: Monthly/quarterly subscriber counts submitted to each exchange or
via data vendor. Under-reporting triggers back-billing, penalties, and contract termination.
Redistribution: Raw exchange data requires explicit redistribution agreements and
additional fees for client-facing display. Vendors typically handle redistribution for data
consumed through their platforms.
Cost management: Audit usage periodically to eliminate unused subscriptions. Use delayed
data where real-time is unnecessary. Track non-display use — many firms discover unreported
non-display obligations only during exchange audits.
6. Market Data Distribution Architecture
Ticker plant: Central ingestion and normalization layer. Parses exchange protocols (ITCH,
PITCH, FIX), normalizes to unified schema, maps symbology, caches latest values, applies
conflation, and monitors feed health.
Fan-out patterns: Topic-based pub-sub (dominant pattern; middleware: Solace, TIBCO,
29West, with Kafka for less latency-sensitive consumers), request-reply (REST for
on-demand lookups), multicast (network-level fan-out for ultra-low-latency co-located
environments).
Conflation: Throttles update rates for slower consumers. Time-based (deliver latest
value every N ms), change-based (suppress duplicates), priority-based (never conflate
trades; conflate quotes for slower consumers).
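The three conflation rules can be combined in one small component. This is a sketch under stated assumptions: the `Conflator` class and message tuples are hypothetical, and the caller is assumed to invoke `flush()` on a timer every N ms rather than the class tracking time itself.

```python
class Conflator:
    """Pass trades through immediately; hold quotes until the next flush."""

    def __init__(self):
        self._pending = {}    # symbol -> latest undelivered quote
        self._last_sent = {}  # symbol -> last delivered quote

    def on_message(self, kind, symbol, payload):
        # Priority-based: trades are never conflated.
        if kind == "trade":
            return [(kind, symbol, payload)]
        # Time-based: a newer quote overwrites the older pending one.
        self._pending[symbol] = payload
        return []

    def flush(self):
        """Emit the latest quote per symbol, skipping unchanged values."""
        out = []
        for symbol, payload in self._pending.items():
            # Change-based: suppress a quote identical to the last one sent.
            if self._last_sent.get(symbol) != payload:
                out.append(("quote", symbol, payload))
                self._last_sent[symbol] = payload
        self._pending.clear()
        return out


conflator = Conflator()
delivered = []
delivered += conflator.on_message("quote", "ABC", (10.00, 10.02))
delivered += conflator.on_message("quote", "ABC", (10.01, 10.02))  # replaces previous
delivered += conflator.on_message("trade", "ABC", (10.02, 100))    # immediate
delivered += conflator.flush()                                     # latest quote only
delivered += conflator.on_message("quote", "ABC", (10.01, 10.02))
delivered += conflator.flush()                                     # unchanged, suppressed
print(delivered)
```

One consequence of this design is that a trade can reach the consumer before the quote that preceded it on the wire, so consumers that need strict ordering should take the unconflated feed.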
APIs: REST for historical/reference data, WebSocket for real-time streaming to web/mobile
applications, proprietary binary APIs for ultra-low-latency consumers.
Cloud services: AWS Data Exchange, Google Cloud Marketplace, Azure Data Share. Adds
network latency (unsuitable for latency-sensitive trading) but appropriate for analytics,
portfolio management, and client-facing applications.
7. Historical Market Data
EOD databases: Daily OHLCV. Sufficient for portfolio analytics and long-horizon
backtesting.
Tick-level data: Every trade/quote with microsecond timestamps. Required for
intraday backtesting and microstructure research. A single day of U.S. equity ticks may
exceed 10-20 TB. Providers: Refinitiv Tick History, NYSE TAQ, LOBSTER.
Adjusted vs unadjusted prices: Unadjusted for trade-level analysis and regulatory
records. Split-adjusted and fully adjusted (splits + dividends) for return calculations.
Survivorship bias: Databases including only current listings inflate backtested returns.
Point-in-time databases (showing the universe as it existed historically) are required for
unbiased research. Point-in-time data also applies to fundamentals: initial earnings
reports may be restated; using restated data introduces look-ahead bias.
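To illustrate why return calculations need adjusted prices, the sketch below applies a backward split adjustment to a short series of closes. The function name, dates, and prices are hypothetical; it assumes each ex-date appears in the series and handles splits only, not dividends.

```python
def split_adjust(closes, splits):
    """Backward-adjust closes for splits.

    closes: list of (iso_date, close) in ascending date order.
    splits: {ex_date: ratio}, where ratio is 2.0 for a 2-for-1 split.
    Prices before each ex-date are divided by the cumulative ratio.
    """
    adjusted = []
    factor = 1.0
    for date, close in reversed(closes):
        adjusted.append((date, close / factor))
        if date in splits:
            # Everything earlier than the ex-date is on the pre-split scale.
            factor *= splits[date]
    adjusted.reverse()
    return adjusted


# Hypothetical closes around a 2-for-1 split with ex-date 2024-01-04
closes = [
    ("2024-01-02", 200.0),
    ("2024-01-03", 202.0),
    ("2024-01-04", 101.5),
    ("2024-01-05", 102.0),
]
adjusted = split_adjust(closes, {"2024-01-04": 2.0})

raw_return = closes[2][1] / closes[1][1] - 1      # spurious drop of about 50%
adj_return = adjusted[2][1] / adjusted[1][1] - 1  # about +0.5%
print(adjusted, raw_return, adj_return)
```

The unadjusted series remains the right input for trade-level analysis, since it reflects the prices at which trades actually printed.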
8. Market Data Quality
Stale data detection: Flag quotes not updated within expected timeframes during market
hours. Suppress stale data from trading and valuation decisions.
Gap detection: Feed-level (sequence number gaps in ITCH/PITCH) and application-level
(expected vs actual data frequency).
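Feed-level gap detection can be sketched as a scan over received sequence numbers. The function names are hypothetical, and the sketch assumes a single monotonically numbered stream; real feeds typically number messages per channel or session and offer a retransmission or snapshot service to recover the missing range.

```python
def find_sequence_gaps(seq_numbers, expected_first=1):
    """Return (first_missing, last_missing) ranges, in arrival order."""
    gaps = []
    expected = expected_first
    for seq in seq_numbers:
        if seq > expected:
            gaps.append((expected, seq - 1))
        if seq >= expected:
            expected = seq + 1
        # seq < expected: duplicate or late retransmission, ignored here
    return gaps


def gap_rate(seq_numbers, expected_first=1):
    """Fraction of expected messages that were never received."""
    if not seq_numbers:
        return 0.0
    missing = sum(last - first + 1
                  for first, last in find_sequence_gaps(seq_numbers, expected_first))
    expected_count = max(seq_numbers) - expected_first + 1
    return missing / expected_count


received = [1, 2, 3, 6, 7, 7, 10]  # hypothetical arrival sequence
gaps = find_sequence_gaps(received)
rate = gap_rate(received)
print(gaps, rate)
```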
Erroneous tick filtering: Process exchange trade-bust messages. Filter outlier prints
(prices far from NBBO, adjusted for spread and volatility). Distinguish legitimate unusual
trades (blocks, auctions, after-hours) from errors.
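A simple outlier filter compares each print with a band around the prevailing quote. In this sketch the band is the larger of a spread multiple and a fractional tolerance standing in for recent volatility. The function name, condition labels, and default thresholds are illustrative assumptions, not recommended values.

```python
# Hypothetical condition labels for prints that can legitimately trade away
# from the quote; these are routed to review instead of being auto-flagged.
EXEMPT_CONDITIONS = frozenset({"block", "auction", "after_hours"})


def classify_print(price, bid, ask, condition="regular",
                   spread_multiple=5.0, vol_band=0.02):
    """Return "ok", "suspect", or "review" for a single trade print."""
    if condition in EXEMPT_CONDITIONS:
        return "review"
    mid = (bid + ask) / 2
    # Band widens with the quoted spread and never falls below vol_band * mid.
    band = max(spread_multiple * (ask - bid), vol_band * mid)
    return "suspect" if abs(price - mid) > band else "ok"


normal = classify_print(50.01, bid=50.00, ask=50.02)
outlier = classify_print(55.00, bid=50.00, ask=50.02)
block = classify_print(55.00, bid=50.00, ask=50.02, condition="block")
print(normal, outlier, block)
```

Exchange trade-bust messages remain the authoritative signal; a filter like this only decides what to hold back from downstream consumers in the meantime.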
Monitoring and alerting: Feed health dashboards, latency tracking (exchange-to-receipt),
volume monitoring against baselines, automated alerts for disconnections, latency spikes,
staleness, and gaps.
Failover: Primary/secondary feed architecture with automatic failover on disconnection,
excessive latency, or quality breach. Downstream systems must handle graceful degradation
(e.g., losing Level 3 depth when failing from direct feed to SIP).
| Metric | Target |
|---|---|
| Feed uptime (trading hours) | > 99.95% |
| Median latency | < 1ms (direct), < 50ms (SIP) |
| 99th percentile latency | < 10ms (direct), < 100ms (SIP) |
| Staleness rate | < 0.1% of instruments |
| Gap rate | < 0.01% of expected messages |
Worked Examples
Example 1: Market Data Infrastructure for a Mid-Size RIA
Scenario: A $2B RIA with 3,000 client accounts needs: real-time quotes for 15 portfolio
managers/traders, delayed data for 40 client service associates, EOD data for portfolio
accounting and performance, historical data for research, and a client portal with current
market values.
Data level assessment: Level 1 is sufficient. The firm places client orders and does not
engage in market making or HFT, which significantly reduces cost and infrastructure
complexity.
Vendor evaluation:
| Option | Est. Annual Cost | Key Trade-off |
|---|---|---|
| Bloomberg (15 Terminals + Data License) | $375K-$425K | Deep analytics but expensive per-terminal model |
| Refinitiv Eikon (15 seats) + DataScope | $200K-$275K | Lower cost but smaller user community |
| FactSet (15 seats) + EOD package | $150K-$225K | Flexible pricing, strong API, less real-time trading depth |
FactSet offers the best balance for this firm: real-time quotes and screening for portfolio
managers, historical data and factor tools for research, and API access for internal systems.
Client portal data strategy: Real-time redistribution would add $100K-$200K/year in
exchange fees for 3,000 non-professional users. The firm selects 15-minute delayed data,
eliminating redistribution fees and clearly labeling prices as delayed.
Exchange licensing: 15 professional users for real-time Level 1. 40 associates on delayed
data (no exchange license). Client portal on delayed data (no redistribution fees). One
administrator handles monthly subscriber reporting through the vendor.
Analysis: Total cost of approximately $175K-$250K vs $400K+ for a Bloomberg-centric
setup. The architecture separates real-time (licensed professionals) from delayed
(everyone else), minimizing licensing complexity. Annual vendor reviews and usage audits
ensure compliance.
Example 2: Market Data for an Electronic Trading Platform
Scenario: A broker-dealer building an institutional equity platform with real-time
market data display, smart order routing, execution algorithms (TWAP, VWAP), and post-trade
TCA. Must balance latency, completeness, cost, and Reg NMS compliance.
The firm needs both SIP and direct feeds: SIP provides the authoritative NBBO for best
execution compliance. Direct feeds from major exchanges provide the per-exchange depth that
smart order routing and algorithms require.
Feed selection: Direct feeds from NYSE Arca, Nasdaq TotalView (ITCH), NYSE (Pillar),
Cboe BZX/EDGX (PITCH), and IEX DEEP — covering the majority of volume. Lower-volume
exchanges added later if routing analysis indicates missed liquidity.
Ticker plant design: (1) Feed handlers per exchange with kernel bypass networking,
(2) NBBO calculator comparing internal NBBO against SIP for validation, (3) Book builder
maintaining per-exchange and consolidated order books, (4) Pub-sub publishing layer with
full-rate feeds for algorithms and conflated feeds for client displays, (5) Historical
capture for TCA, regulatory records, and strategy research.
Redistribution licensing: Displaying real-time data to institutional clients requires
redistribution agreements with each exchange, monthly professional user reporting, and
per-user fees — or enterprise redistribution pricing if economical.
Analysis: Total market data cost is substantial: direct feeds ($300K-$500K/year),
SIP ($50K-$100K/year), ticker plant build ($200K-$400K initial), redistribution fees
($100K-$500K/year). Market data is one of the largest operating costs for an electronic
platform. Budget for annual exchange fee increases.
Example 3: Entitlement Management and Exchange Licensing Compliance
Scenario: A 200-employee multi-strategy hedge fund (New York, London, Hong Kong)
receives an NYSE audit notification. Subscriber counts have been estimated rather than
tracked, and the fund is uncertain whether its risk system's use of NYSE pricing
constitutes non-display use.
Data consumption inventory: The fund catalogs all NYSE data consumers: (1) Display
users — every Bloomberg Terminal, Eikon desktop, and internal dashboard showing NYSE
real-time data. Result: 120 professional display users found vs 95 previously reported.
(2) Non-display applications — algorithmic trading, risk (VaR/Greeks), portfolio valuation,
OMS, pricing engines. Result: 8 unreported non-display applications identified.
(3) Derived data — a daily position file with NYSE closing prices sent to the prime broker
requires review against NYSE's derived data policy.
Remediation: File amended subscriber reports (expect back-billing). Register non-display
applications by category (A: trading, B: internal non-trading, C: derived/redistribution).
Deploy an entitlement management platform (Bloomberg EMRS, Refinitiv DACS, or dedicated
tools like TRG Screen). Establish provisioning/deprovisioning policies. Automate monthly
subscriber count generation and reconciliation.
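Automated subscriber counting and reconciliation might look like the following sketch, which derives counts from entitlement records and compares them with the figure previously reported. The record layout, user names, and `reconcile` function are hypothetical. Whether a user with several accesses counts once or once per access depends on each exchange's policy, so both numbers are produced.

```python
def reconcile(entitlements, exchange, reported):
    """Compare active entitlements for one exchange with the reported count.

    entitlements: iterable of (user_id, platform, exchange, active) tuples.
    """
    active = [(user, platform)
              for user, platform, exch, is_active in entitlements
              if exch == exchange and is_active]
    unique_users = len({user for user, _ in active})
    return {
        "accesses": len(active),       # one per user/platform pair
        "unique_users": unique_users,  # assumes netting across platforms is permitted
        "reported": reported,
        "shortfall": max(unique_users - reported, 0),
    }


# Hypothetical entitlement records
records = [
    ("alice", "terminal", "NYSE", True),
    ("alice", "dashboard", "NYSE", True),
    ("bob", "terminal", "NYSE", True),
    ("carol", "terminal", "NYSE", False),  # deprovisioned
    ("dave", "dashboard", "NASDAQ", True),
]
result = reconcile(records, "NYSE", reported=1)
print(result)
```

Running a check like this monthly, before the report is filed, surfaces the kind of drift between actual and reported users that the inventory above uncovered.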
Financial exposure:
| Gap | Estimated Back-Billing |
|---|---|
| Display under-reporting (25 users x 12 months) | $75K-$150K |
| Non-display applications (8, some Category A) | $200K-$500K |
| Potential redistribution (1 flow under review) | $0-$100K |
| Total exposure | $275K-$750K |
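The totals in the table can be checked with a few lines of arithmetic. The dictionary keys are hypothetical labels; the per-user monthly figure is only what the table's display range implies for 25 users over 12 months, not a published exchange fee.

```python
# Low and high back-billing estimates in USD, taken from the table above
exposure_usd = {
    "display_under_reporting": (75_000, 150_000),
    "non_display_applications": (200_000, 500_000),
    "potential_redistribution": (0, 100_000),
}

total_low = sum(low for low, _ in exposure_usd.values())
total_high = sum(high for _, high in exposure_usd.values())

# Display gap: 120 users found vs 95 reported, over 12 months
unreported_users = 120 - 95
user_months = unreported_users * 12
low, high = exposure_usd["display_under_reporting"]
implied_monthly_per_user = (low / user_months, high / user_months)

print(total_low, total_high, user_months, implied_monthly_per_user)
```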
Analysis: Remediation cost ($100K-$200K for entitlement system + ongoing administration)
is modest vs audit exposure. Market data entitlement management must be a formal compliance
function. Conduct internal audits annually before exchanges audit externally.
Common Pitfalls
- Conflating SIP NBBO with direct feed best prices. The SIP NBBO is the Reg NMS regulatory benchmark. A firm's internally computed NBBO from direct feeds may differ due to latency. For best execution compliance, the SIP NBBO is authoritative.
- Under-reporting exchange subscribers. Estimating rather than counting professional users and non-display applications risks material back-billing during exchange audits.
- Ignoring non-display use fees. Any system consuming exchange data for automated purposes (algorithms, risk, pricing) typically requires a separate non-display license.
- Treating delayed data as free. Vendor delivery costs and professional-user fees for delayed data through certain platforms still apply. Verify terms per use case.
- Over-subscribing to market data. Firms accumulate unused subscriptions over time. Periodic usage audits identify significant cost savings.
- Neglecting data quality monitoring. Consuming data without staleness, gap, and erroneous tick monitoring exposes the firm to silent failures. VaR computed on stale prices is dangerously misleading.
- Failing to plan for peak data rates. Volumes spike during market events. Size infrastructure for 2-3x typical peak volumes to avoid failures when data matters most.
- Ignoring survivorship bias in historical data. Use point-in-time, survivorship-free databases for strategy research to avoid inflated backtest returns.
- Distributing raw exchange data without redistribution licenses. Client-facing real-time quotes require explicit redistribution agreements. Violations risk license termination and legal liability.
Cross-References
- reference-data (Layer 13) — Security master and symbology underpin market data infrastructure; market data systems rely on reference data for symbol mapping and corporate action processing.
- exchange-connectivity (Layer 13) — Physical and logical exchange connections over which market data feeds travel; covers co-location and protocol handling.
- trade-execution (Layer 12) — Smart order routers and execution algorithms consume Level 2/3 market data for routing decisions and execution pacing.
- portfolio-management-systems (Layer 10) — PMS platforms consume market data for position valuation, drift monitoring, and rebalancing triggers.
- performance-metrics (Layer 1a) — EOD pricing feeds provide closing prices for daily return calculations; data quality directly affects computed metrics.
- volatility-modeling (Layer 1b) — Implied volatility derived from OPRA options data; GARCH/EWMA models calibrated on historical price series from market data infrastructure.
- equities (Layer 2) — Equity market structure and instruments; this skill covers the data infrastructure delivering equity market information to consuming systems.