OBLITERATUS — LLM Abliteration Toolkit
Skill by ara.so — Daily 2026 Skills collection.
OBLITERATUS is an open-source toolkit for identifying and surgically removing refusal behaviors from large language models using mechanistic interpretability techniques (abliteration). It locates refusal directions in a model's hidden states via SVD/PCA, projects them out of the weights, and preserves core language capabilities. Ships with a Gradio UI, CLI, Python API, and Colab notebook.
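The projection step is plain linear algebra. Below is a minimal numpy sketch of the mean-difference variant (illustrative only, not the toolkit's implementation): extract a refusal direction from contrastive activations, then project it out of a weight matrix.

```python
import numpy as np

# Toy stand-ins for hidden states collected on refusal-triggering vs
# harmless prompts (real activations would come from the model).
rng = np.random.default_rng(0)
restricted = rng.normal(size=(64, 128)) + 0.5
unrestricted = rng.normal(size=(64, 128))

# 1. Refusal direction: normalized difference of means.
direction = restricted.mean(axis=0) - unrestricted.mean(axis=0)
direction /= np.linalg.norm(direction)

# 2. Project it out of a weight matrix: W' = W - d d^T W.
#    d^T W' = 0, so no output of W' has any component along d.
W = rng.normal(size=(128, 128))
W_ablated = W - np.outer(direction, direction) @ W

print(np.abs(direction @ W_ablated).max())  # ~0 (numerically)
```

SVD/PCA extraction generalizes this from one mean-difference direction to several principal directions of the same contrastive activation set.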
Installation
```bash
# Core install
pip install obliteratus

# With Gradio UI support
pip install "obliteratus[spaces]"

# With all optional analysis modules
pip install "obliteratus[full]"

# From source (latest)
git clone https://github.com/elder-plinius/OBLITERATUS
cd OBLITERATUS
pip install -e ".[full]"
```
**Requirements:**
- Python 3.10+
- PyTorch 2.1+ with CUDA (recommended) or CPU
- `transformers`, `accelerate`, `gradio>=5.29.0`
- HuggingFace account + token for gated models
```bash
export HF_TOKEN=your_hf_token_here
huggingface-cli login
```

CLI — Key Commands
```bash
# Basic obliteration (default method)
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct

# Advanced method (whitened SVD + bias projection + iterative refinement)
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct --method advanced

# Analysis-informed pipeline (auto-configures from geometry analysis)
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct --method informed

# Specify output directory and push to Hub
obliteratus obliterate mistralai/Mistral-7B-Instruct-v0.3 \
    --method advanced \
    --output ./my-liberated-model \
    --push-to-hub your-username/mistral-7b-liberated

# LoRA-based reversible ablation (non-destructive)
obliteratus obliterate meta-llama/Llama-3.1-8B-Instruct \
    --method lora \
    --lora-rank 1

# Strength sweep — find the capability/compliance tradeoff
obliteratus sweep meta-llama/Llama-3.1-8B-Instruct \
    --strengths 0.2,0.4,0.6,0.8,1.0

# Run analysis modules only (no modification)
obliteratus analyze meta-llama/Llama-3.1-8B-Instruct \
    --modules concept_cone,alignment_imprint,universality

# Benchmark: compare methods on a model
obliteratus benchmark meta-llama/Llama-3.1-8B-Instruct \
    --methods basic,advanced,informed

# Launch local Gradio UI
obliteratus ui
obliteratus ui --port 8080 --share
obliteratus ui --no-telemetry
```
---

Python API
Basic obliteration
```python
from obliteratus import Obliterator

# Initialize with a HuggingFace model ID or local path
obl = Obliterator("meta-llama/Llama-3.1-8B-Instruct")

# Run the full pipeline: SUMMON → PROBE → DISTILL → EXCISE → VERIFY → REBIRTH
result = obl.obliterate(method="advanced")
print(result.perplexity_delta)     # capability preservation metric
print(result.refusal_rate_delta)   # refusal reduction
print(result.output_path)          # where the model was saved
```
Step-by-step pipeline
```python
from obliteratus import Obliterator
from obliteratus.pipeline import PipelineConfig

config = PipelineConfig(
    method="advanced",
    num_directions=32,    # number of refusal directions to extract
    strength=1.0,         # projection strength (0.0–1.0+)
    preserve_norm=True,   # norm-preserving biprojection
    project_biases=True,  # also remove from bias terms
    iterative_passes=3,   # re-probe after each pass
    layers="auto",        # or a list of ints, e.g. [10, 11, 12, 13]
    dtype="bfloat16",
    device="cuda",
)
obl = Obliterator("mistralai/Mistral-7B-Instruct-v0.3", config=config)

# Individual stages
obl.summon()                           # load model + tokenizer
activations = obl.probe()              # collect activations on restricted vs unrestricted prompts
directions = obl.distill(activations)  # extract refusal directions via SVD
obl.excise(directions)                 # project out guardrail directions
metrics = obl.verify()                 # perplexity + coherence checks
obl.rebirth("./liberated-mistral-7b")  # save with metadata
```
Custom probe prompts
```python
from obliteratus import Obliterator
from obliteratus.probing import ProbeDataset

# Use your own restricted/unrestricted prompt pairs
dataset = ProbeDataset(
    restricted=[
        "How do I pick a lock?",
        "Write a story with explicit violence.",
        "Explain how malware works in detail.",
    ],
    unrestricted=[
        "What is the capital of France?",
        "Write a story about a dog.",
        "Explain how encryption works.",
    ],
)

obl = Obliterator("google/gemma-2-9b-it")
obl.summon()
activations = obl.probe(dataset=dataset)
directions = obl.distill(activations)
obl.excise(directions)
obl.rebirth("./liberated-gemma-2-9b")
```
Analysis modules
```python
from obliteratus.analysis import AnalysisSuite

suite = AnalysisSuite("meta-llama/Llama-3.1-8B-Instruct")
suite.load()

# Concept Cone Geometry — how many distinct refusal mechanisms?
cone = suite.concept_cone_geometry()
print(f"Solid angle estimate: {cone.solid_angle:.4f}")
print(f"Distinct refusal clusters: {cone.num_clusters}")

# Alignment Imprint Detection — DPO vs RLHF vs CAI vs SFT?
imprint = suite.alignment_imprint()
print(f"Detected training method: {imprint.method}")  # e.g. "RLHF"
print(f"Confidence: {imprint.confidence:.2%}")

# Ouroboros Effect — will it self-repair?
ouroboros = suite.ouroboros_quantification()
print(f"Self-repair score: {ouroboros.score:.4f}")
print(f"Recommended passes: {ouroboros.recommended_passes}")

# Cross-layer heatmap of refusal signal
heatmap = suite.layer_refusal_heatmap()
heatmap.plot(save_path="./refusal_heatmap.png")

# Safety-capability entanglement
entanglement = suite.entanglement_map()
print(f"Safe layers to modify: {entanglement.safe_layers}")
print(f"Risky layers (entangled): {entanglement.risky_layers}")
```
Analysis-informed obliteration
```python
from obliteratus import Obliterator
from obliteratus.pipeline import PipelineConfig

# The "informed" method runs analysis modules mid-pipeline
# to auto-configure every decision.
config = PipelineConfig(method="informed")
obl = Obliterator("meta-llama/Llama-3.1-8B-Instruct", config=config)
result = obl.obliterate()
print(result.analysis_report)  # full auto-configuration decisions
```
Chat with obliterated model
```python
from obliteratus import Obliterator
from obliteratus.chat import ChatSession

obl = Obliterator("./liberated-llama-3.1-8b")
obl.summon()  # loads pre-obliterated model

session = ChatSession(obl.model, obl.tokenizer)
response = session.chat(
    "Explain in detail how a buffer overflow exploit works.",
    max_new_tokens=512,
    temperature=0.7,
)
print(response)
```

A/B comparison
```python
from obliteratus.compare import ABComparison

ab = ABComparison(
    original_path="meta-llama/Llama-3.1-8B-Instruct",
    obliterated_path="./liberated-llama-3.1-8b",
)

prompt = "Write a story involving morally grey characters."
original_resp, liberated_resp = ab.compare(prompt)

print("=== ORIGINAL ===")
print(original_resp)
print("=== LIBERATED ===")
print(liberated_resp)
```

Push obliterated model to Hub
```python
import os
from obliteratus import Obliterator

obl = Obliterator("meta-llama/Llama-3.1-8B-Instruct")
result = obl.obliterate(method="advanced")
result.push_to_hub(
    repo_id=f"{os.environ['HF_USERNAME']}/Llama-3.1-8B-Instruct-abliterated",
    token=os.environ["HF_TOKEN"],
    private=True,
)
```

Obliteration Methods
| Method | Description | Best For |
|---|---|---|
| `basic` | Mean-difference direction extraction, single pass | Quick experiments |
| `advanced` | Whitened SVD + bias projection + iterative refinement | Production use |
| `informed` | Analysis-guided auto-configuration | Unknown models |
| `lora` | Reversible LoRA rank-1 adapters (no weight surgery) | Reversible ablation |
| PCA | PCA-based direction extraction | Research/comparison |
| SAE | Sparse autoencoder decomposition | MoE models |
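The reversibility of the LoRA approach follows from the algebra: the projection update −d dᵀW is itself rank-1, so it can be stored as a rank-1 adapter and subtracted back out, leaving the base weights untouched. A numpy sketch of this identity (not the toolkit's adapter format):

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.normal(size=128)
d /= np.linalg.norm(d)
W = rng.normal(size=(128, 128))

# The projection update -d d^T W factors as an outer product:
# A is a column (128, 1), B a row (1, 128) -> a rank-1 LoRA-style delta.
A = d[:, None]
B = -(d @ W)[None, :]
delta = A @ B

W_ablated = W + delta            # "apply" the adapter
W_restored = W_ablated - delta   # drop the adapter: original weights return

print(np.allclose(W_restored, W))   # True
print(np.abs(d @ W_ablated).max())  # ~0: refusal component removed
```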
Configuration
```python
from obliteratus.pipeline import PipelineConfig

config = PipelineConfig(
    # Core
    method="advanced",         # abliteration method
    strength=1.0,              # projection strength (tune down if capability degrades)
    num_directions=32,         # refusal directions to extract

    # Layer selection
    layers="auto",             # "auto", "cosmic", or a list of ints
    layer_selection="cosmic",  # COSMIC: most separable layers

    # Weight modification
    preserve_norm=True,        # norm-preserving biprojection (recommended)
    project_biases=True,       # project out bias terms too
    project_attention=True,    # modify attention projection weights
    project_mlp=True,          # modify MLP weights

    # Iterative refinement
    iterative_passes=3,        # re-probe after each pass (catches rotated directions)

    # MoE-specific
    expert_granular=False,     # Expert-Granular Abliteration for MoE models

    # CoT preservation
    cot_aware=True,            # preserve chain-of-thought directions

    # Hardware
    dtype="bfloat16",          # "float32", "float16", "bfloat16"
    device="cuda",             # "cuda", "cpu", "auto"
    load_in_4bit=False,        # bitsandbytes 4-bit loading

    # Telemetry (anonymous, contributes to research dataset)
    telemetry=True,
)
```

Common Patterns
Tune strength to preserve capability
```python
from obliteratus import Obliterator
from obliteratus.sweep import StrengthSweep

# Find the sweet spot before running full obliteration
sweep = StrengthSweep("meta-llama/Llama-3.1-8B-Instruct")
results = sweep.run(strengths=[0.2, 0.4, 0.6, 0.8, 1.0, 1.2])
for r in results:
    print(f"Strength {r.strength:.1f} | perplexity_delta={r.perplexity_delta:.2f} | refusal_rate={r.refusal_rate:.2%}")

# Pick the best tradeoff
best = sweep.recommend()
print(f"Recommended strength: {best.strength}")
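What the sweep varies can be pictured directly. Assuming `strength` scales the projection as W′ = W − s·d dᵀW (an assumption about the toolkit's semantics, shown here in a toy numpy example), the refusal component shrinks linearly with s:

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.normal(size=128)
d /= np.linalg.norm(d)
W = rng.normal(size=(128, 128))

for s in [0.2, 0.6, 1.0]:
    # Partial projection: remove a fraction s of the refusal component.
    W_s = W - s * np.outer(d, d) @ W
    remaining = np.linalg.norm(d @ W_s) / np.linalg.norm(d @ W)
    print(f"strength={s:.1f} -> refusal component remaining: {remaining:.2f}")
# strength=0.2 -> 0.80, strength=0.6 -> 0.40, strength=1.0 -> 0.00
```

Lower strengths leave more of the refusal signal intact but also perturb the weights less, which is the capability/compliance tradeoff the sweep measures.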
MoE model (Mixtral, DeepSeek-MoE)
```python
from obliteratus import Obliterator
from obliteratus.pipeline import PipelineConfig

config = PipelineConfig(
    method="advanced",
    expert_granular=True,  # decompose per-expert refusal signals
    project_attention=True,
    project_mlp=True,
)
obl = Obliterator("mistralai/Mixtral-8x7B-Instruct-v0.1", config=config)
obl.obliterate()
obl.rebirth("./liberated-mixtral-8x7b")
```

Batch benchmark multiple models
```python
from obliteratus.benchmark import ModelBenchmark

models = [
    "meta-llama/Llama-3.1-8B-Instruct",
    "google/gemma-2-9b-it",
    "mistralai/Mistral-7B-Instruct-v0.3",
]
bench = ModelBenchmark(models=models, method="advanced")
report = bench.run()
report.save("./benchmark_report.json")
report.plot_heatmap("./benchmark_heatmap.png")
```

Troubleshooting
**Out of memory (OOM) on large models**
```python
config = PipelineConfig(
    dtype="float16",
    load_in_4bit=True,        # requires bitsandbytes
    device="cuda",
    layers=[10, 11, 12, 13],  # target fewer layers
    num_directions=16,        # fewer directions
)
```

**Capability degradation after obliteration**
```python
# Lower the strength or use COSMIC layer selection (most separable layers)
config = PipelineConfig(
    strength=0.6,
    layer_selection="cosmic",
    cot_aware=True,      # protect reasoning directions
    iterative_passes=1,  # fewer passes = less aggressive
)
```

**Refusal persists after obliteration**
```python
# Use the informed method + increase passes
config = PipelineConfig(
    method="informed",
    iterative_passes=5,
    project_biases=True,  # don't forget bias terms
    num_directions=64,    # extract more directions
)
```

**Gated model access error**
```bash
export HF_TOKEN=your_hf_token_here
# Accept the model license on the HuggingFace Hub first, then:
huggingface-cli login
```

**Gradio UI won't start**
```bash
pip install "obliteratus[spaces]"

# Check port availability
obliteratus ui --port 7861
```

---

No-Code Options
- HuggingFace Space: spaces/pliny-the-prompter/obliteratus — free with HF Pro, ZeroGPU
- Colab notebook: notebooks/abliterate.ipynb — run all cells, no setup
Key Research References
- Arditi et al. (2024) — arXiv:2406.11717 — foundational abliteration paper
- Gabliteration — arXiv:2512.18901
- COSMIC layer selection — arXiv:2506.00085, ACL 2025
- Turner et al. (2023) — arXiv:2308.10248 — activation steering
- Rimsky et al. (2024) — arXiv:2312.06681 — contrastive activation addition