algo-social-virality

Viral Spread Models

Overview

Compartmental models (SIR, SIS, SEIR) describe how content or information spreads through a population. Susceptible → Infected → Recovered mirrors unaware → sharing → stopped sharing. Key metric: R0 (basic reproduction number). The models are ODE systems, integrated numerically in O(T × C) time for T timesteps and C compartments (C is used here because N denotes population size below).

When to Use

Trigger conditions:

Modeling how content spreads through a social network
Estimating whether a campaign will achieve viral threshold
Analyzing post-hoc spread dynamics of viral events

When NOT to use:

When predicting individual user behavior (use influence scoring)
When measuring engagement metrics (use engagement rate calculator)

Algorithm

IRON LAW: Viral Spread Occurs ONLY When R0 > 1
R0 = transmission rate (β) / recovery rate (γ).
When R0 < 1, the number of active sharers shrinks from the first step, regardless of initial seed size.
When R0 > 1, sharing grows roughly exponentially at first, then slows and peaks once the susceptible pool falls to N/R0.
Design interventions (seeding, incentives) to push R0 above the threshold.
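
The threshold rule can be sketched in a few lines. The function names and the `susceptible_fraction` argument are illustrative assumptions, not part of the original skill; the second function uses the standard observation that dI/dt > 0 only while R0 × S/N > 1.

```python
def basic_reproduction_number(beta: float, gamma: float) -> float:
    """R0 = beta / gamma; gamma must be positive for the ratio to exist."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return beta / gamma


def can_grow(beta: float, gamma: float, susceptible_fraction: float = 1.0) -> bool:
    """True while the effective reproduction number R0 * S/N exceeds 1."""
    return basic_reproduction_number(beta, gamma) * susceptible_fraction > 1.0


print(can_grow(0.3, 0.1))    # R0 = 3.0, prints True
print(can_grow(0.08, 0.1))   # R0 = 0.8, prints False
# Same R0 = 3.0, but only 30% of the audience is still unaware: growth has stopped.
print(can_grow(0.3, 0.1, susceptible_fraction=0.3))  # prints False
```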

Phase 1: Input Validation

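Define: population size (N), initial seed size (I₀), transmission rate (β: exposures per sharer per unit time multiplied by the probability of sharing upon exposure), recovery rate (γ: rate of losing interest, so 1/γ is the mean sharing duration). Gate: N > 0, 0 < I₀ ≤ N, β ≥ 0, γ > 0 (so that R0 = β/γ is defined); β and γ estimated from historical data or explicitly assumed.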
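
A minimal sketch of this gate follows. `SirParams` and `validate` are hypothetical names, and requiring γ > 0 and I₀ ≥ 1 is an assumption that is stricter than plain non-negativity, chosen so that R0 is defined and there is something to spread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SirParams:
    population: int     # N
    initial_seed: int   # I0
    beta: float         # transmission rate per unit time
    gamma: float        # recovery rate per unit time


def validate(params: SirParams) -> list[str]:
    """Return a list of problems; an empty list means the gate passes."""
    problems = []
    if params.population <= 0:
        problems.append("population must be positive")
    if not 0 < params.initial_seed <= params.population:
        problems.append("initial_seed must be in (0, population]")
    if params.beta < 0:
        problems.append("beta must be non-negative")
    if params.gamma <= 0:
        problems.append("gamma must be positive so that R0 = beta / gamma is defined")
    return problems


print(validate(SirParams(population=10_000, initial_seed=10, beta=0.3, gamma=0.1)))  # []
```

Returning a list rather than raising on the first failure lets a caller report every bad parameter at once.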

Phase 2: Core Algorithm

SIR Model: dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI

Initialize: S=N-I₀, I=I₀, R=0
Iterate using Euler method or RK4 at discrete timesteps
Track peak infected (maximum simultaneous sharers) and total ever-infected
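
The three steps above can be sketched with a forward-Euler loop. This is one possible implementation under stated assumptions: `simulate_sir` is a hypothetical name, `dt = 0.1` is an arbitrary step size, and the rates are taken to be per day. RK4 would be more accurate at the same step size.

```python
def simulate_sir(n, i0, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of the SIR equations.

    Returns a list of (t, S, I, R) tuples, one per step.
    """
    s, i, r = float(n - i0), float(i0), 0.0   # Initialize: S=N-I0, I=I0, R=0
    series = [(0.0, s, i, r)]
    for k in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / n * dt   # flow S -> I during this step
        new_recoveries = gamma * i * dt          # flow I -> R during this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        series.append((k * dt, s, i, r))
    return series


series = simulate_sir(n=10_000, i0=10, beta=0.3, gamma=0.1, days=150)

# Peak infected: the row with the largest I. Total ever-infected: I + R at the end.
peak_t, _, peak_i, _ = max(series, key=lambda row: row[2])
total_ever_infected = series[-1][2] + series[-1][3]
print(round(peak_t, 1), round(peak_i), round(total_ever_infected))
```

Because every step moves the same amount out of one compartment and into the next, S+I+R stays equal to N up to floating-point rounding.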

SIS variant: No recovery to immune state — recovered become susceptible again (recurring content).
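
A sketch of the SIS variant, assuming the usual two-compartment form dS/dt = -βSI/N + γI, dI/dt = βSI/N - γI (these equations are not spelled out in the source). `simulate_sis` is a hypothetical name. For R0 > 1 this form settles at an endemic level I* = N(1 - 1/R0) instead of burning out.

```python
def simulate_sis(n, i0, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of SIS; returns (t, S, I) tuples."""
    s, i = float(n - i0), float(i0)
    series = [(0.0, s, i)]
    for k in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt     # recovered users return to S
        s += new_recoveries - new_infections
        i += new_infections - new_recoveries
        series.append((k * dt, s, i))
    return series


n, beta, gamma = 10_000, 0.3, 0.1
sis_series = simulate_sis(n, i0=10, beta=beta, gamma=gamma, days=200)
endemic_level = n * (1 - gamma / beta)      # N * (1 - 1/R0)
print(round(sis_series[-1][2]), round(endemic_level))
```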

Phase 3: Verification

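Check: S+I+R = N at all timesteps (conservation, within floating-point tolerance). Peak and final sizes plausible for the given R0: when R0 > 1 the peak of I occurs as S falls to N/R0, and when R0 ≤ 1 I never exceeds I₀. Gate: Population conserved, dynamics consistent with R0.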
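
One way to make "plausible for given R0" concrete is to compare simulated values with the standard closed-form results for deterministic SIR. The sketch below assumes S₀ + I₀ = N and R(0) = 0; all function names are hypothetical.

```python
import math


def conserved(series, n, rel_tol=1e-9):
    """True if S + I + R equals n in every (t, S, I, R) row."""
    return all(math.isclose(s + i + r, n, rel_tol=rel_tol) for _, s, i, r in series)


def analytic_peak_fraction(r0, s0):
    """Peak I/N, reached when S/N = 1/r0; s0 is the initial S/N."""
    if r0 * s0 <= 1.0:
        return 1.0 - s0                      # no growth: the peak is the seed
    return 1.0 - (1.0 + math.log(r0 * s0)) / r0


def analytic_final_fraction(r0, s0, iterations=10_000):
    """Fraction ever infected, from s_inf = s0 * exp(-r0 * (1 - s_inf))."""
    s_inf = 0.0
    for _ in range(iterations):              # fixed-point iteration from below
        s_inf = s0 * math.exp(-r0 * (1.0 - s_inf))
    return 1.0 - s_inf


# R0 = 3 with 0.1% of the population seeded.
print(round(analytic_peak_fraction(3.0, 0.999), 3))    # 0.301
print(round(analytic_final_fraction(3.0, 0.999), 3))
```

A simulated peak or final size that differs from these values by more than the integrator's expected error suggests a bug or a timestep that is too coarse.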

Phase 4: Output

Return time series of compartments and summary metrics.
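
A sketch of assembling that result, using the field names from the Output Format section. `build_output` is a hypothetical helper, and the three input rows are hand-written toy values for illustration, not simulation output.

```python
import json


def build_output(series, beta, gamma, population, model="SIR"):
    """Wrap a list of {"t", "S", "I", "R"} rows with summary metrics and metadata."""
    peak = max(series, key=lambda row: row["I"])
    last = series[-1]
    return {
        "time_series": series,
        "summary": {
            "R0": round(beta / gamma, 3),
            "peak_infected": round(peak["I"]),
            "peak_day": peak["t"],
            "total_infected": round(last["I"] + last["R"]),  # ever infected = I + R
        },
        "metadata": {"model": model, "beta": beta, "gamma": gamma, "population": population},
    }


toy_rows = [  # illustrative rows only
    {"t": 0, "S": 9900, "I": 100, "R": 0},
    {"t": 1, "S": 9850, "I": 130, "R": 20},
    {"t": 2, "S": 9800, "I": 120, "R": 80},
]
result = build_output(toy_rows, beta=0.5, gamma=0.2, population=10_000)
print(json.dumps(result["summary"]))
```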

Output Format

```json
{
  "time_series": [{"t": 0, "S": 9900, "I": 100, "R": 0}],
  "summary": {"R0": 2.5, "peak_infected": 3200, "peak_day": 12, "total_infected": 8500},
  "metadata": {"model": "SIR", "beta": 0.5, "gamma": 0.2, "population": 10000}
}
```

Examples

Sample I/O

Input: N=10000, I₀=10, β=0.3, γ=0.1 (R0=3.0, rates per day) Expected: Exponential growth at first, peak of roughly 3,000 simultaneous sharers around day 38, total ever infected about 9,400

Edge Cases

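Input	Expected	Why
R0 = 0.8	Decay from the seed, no outbreak	Below threshold, dies out
I₀ = 1	Slower start but same eventual dynamics	Single seed takes longer to ignite (in the deterministic model; a real single seed can also fizzle by chance)
β = γ (R0=1)	No growth; sharers decline slowly from I₀	Critical threshold: each sharer recruits at most one replacement, and SIR has no endemic equilibrium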
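
These cases can be exercised with a compact Euler loop. This is a sketch under assumptions: `peak_and_total` is a hypothetical helper, the rates are per day, and the seed of 100 in the first and last cases is an arbitrary choice.

```python
def peak_and_total(n, i0, beta, gamma, days=400, dt=0.1):
    """Return (peak_time, peak_infected, total_ever_infected) for an SIR run."""
    s, i, r = float(n - i0), float(i0), 0.0
    peak_i, peak_t = i, 0.0
    for k in range(1, int(round(days / dt)) + 1):
        inf = beta * s * i / n * dt
        rec = gamma * i * dt
        s, i, r = s - inf, i + inf - rec, r + rec
        if i > peak_i:
            peak_i, peak_t = i, k * dt
    return peak_t, peak_i, i + r


sub = peak_and_total(10_000, 100, beta=0.08, gamma=0.1)   # R0 = 0.8
one = peak_and_total(10_000, 1, beta=0.3, gamma=0.1)      # single seed
ten = peak_and_total(10_000, 10, beta=0.3, gamma=0.1)     # ten seeds
crit = peak_and_total(10_000, 100, beta=0.1, gamma=0.1)   # R0 = 1

print(sub[1], crit[1])   # the seed of 100 is never exceeded in either run
print(one[0] > ten[0])   # the single seed peaks later
```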

Gotchas

Homogeneous mixing assumption: SIR assumes everyone interacts equally. Real networks have hubs, clusters, and weak ties. Use network-based models for realistic spread.
Parameter estimation: β and γ are hard to estimate for social content. Use early spread data to fit parameters, then project.
Content ≠ disease: Unlike diseases, content sharing is voluntary and influenced by content quality, platform algorithms, and trends. Models give rough dynamics, not precise predictions.
Platform algorithms: Social media algorithms amplify or suppress content. The "transmission rate" is partly determined by the platform, not just user behavior.
Temporal dynamics: Content virality often has a much shorter lifecycle than disease (hours-days vs weeks-months). Adjust timescales accordingly.
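
For the parameter-estimation point, one common approach is to fit the early exponential phase, where S ≈ N and I(t) ≈ I₀·exp((β - γ)t). The sketch below is an assumption-laden illustration, not the method in the referenced fitting guide: it takes γ as given (for example from an observed mean sharing duration of 1/γ), uses hypothetical function names, and runs on synthetic data generated inside the example.

```python
import math


def fit_growth_rate(times, active_sharers):
    """Least-squares slope of ln(I) against t over the early phase."""
    logs = [math.log(v) for v in active_sharers]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def estimate_beta(times, active_sharers, gamma):
    """Early growth rate is approximately beta - gamma, so beta = slope + gamma."""
    return fit_growth_rate(times, active_sharers) + gamma


# Synthetic early data: 10 sharers growing at 0.2 per day (not real measurements).
days = [0, 1, 2, 3, 4, 5]
counts = [10 * math.exp(0.2 * t) for t in days]

beta_hat = estimate_beta(days, counts, gamma=0.1)
print(round(beta_hat, 3), round(beta_hat / 0.1, 2))   # estimated beta and R0
```

Only the earliest observations should be used, since the log-linear approximation breaks down once a noticeable share of the audience has already seen the content.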

References

For network-based epidemic models, see `references/network-sir.md`.
For parameter estimation from early data, see `references/parameter-fitting.md`.
