malware-analysis
Malware Analysis Skill
This skill produces analyst-grade threat reports — not data dumps. Every conclusion must be backed by evidence and reasoning.
Core Principles
- Evidence-based reasoning: Never state a conclusion without explaining WHY
- Connect the dots: Link indicators to behaviors to capabilities to impact
- Assess confidence: State how confident you are and why
- Actionable output: Reports should enable decisions, not just inform
Analysis Workflow
Step 1: Collect Data
Run all scripts to gather raw data:
```bash
# Static analysis - get hashes, PE info, strings, APIs, entropy
python3 scripts/static_analysis.py /path/to/sample -f json > static.json

# Threat intelligence - check reputation across sources
python3 scripts/triage.py -t file /path/to/sample -f json > triage.json

# IOC extraction - extract network/host indicators
python3 scripts/extract_iocs.py /path/to/sample -f json > iocs.json
```
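The three collection commands can also be driven from Python. The following is a minimal sketch, assuming the scripts live under `scripts/` and write JSON to stdout as shown above; `build_commands` and `collect` are hypothetical helper names, not part of the skill.

```python
import subprocess
from pathlib import Path

# Script arguments mirror the shell commands above; "<sample>" is a placeholder.
STEPS = {
    "static.json": ["scripts/static_analysis.py", "<sample>", "-f", "json"],
    "triage.json": ["scripts/triage.py", "-t", "file", "<sample>", "-f", "json"],
    "iocs.json": ["scripts/extract_iocs.py", "<sample>", "-f", "json"],
}

def build_commands(sample):
    """Return {output file: argv} for the three collection steps."""
    return {
        out: ["python3"] + [sample if arg == "<sample>" else arg for arg in args]
        for out, args in STEPS.items()
    }

def collect(sample, out_dir="."):
    """Run each script and redirect its stdout to the matching JSON file."""
    for out_name, argv in build_commands(sample).items():
        with open(Path(out_dir) / out_name, "w") as fh:
            # check=True raises if a script exits with a non-zero status
            subprocess.run(argv, stdout=fh, check=True)
```

Calling `collect("/path/to/sample")` would produce `static.json`, `triage.json`, and `iocs.json` in the current directory, provided the scripts are present.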
Step 2: Analyze and Reason (THIS IS THE KEY STEP)
Using the collected data, perform analyst-grade reasoning:
2.1 Threat Intelligence Assessment
Ask yourself:
- Is this sample known? If found in MalwareBazaar/ThreatFox, it's confirmed malware
- What's the VT detection rate?
  - 0 detections: New sample, false positive, or clean; requires behavioral analysis
  - 1-4 detections: Possibly a new variant or a targeted sample; suspicious
  - 5-14 detections: Flagged as malicious by multiple vendors
  - 15+ detections: Well-known malware
- What family is it attributed to? Research that family's typical behavior
- When was it first seen? A recent first-seen date suggests an active campaign
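The detection-rate bands above can be expressed as a small lookup. This is a sketch rather than part of the skill's scripts: `interpret_vt_detections` is a hypothetical name, and the sketch treats 5 and 15 as the start of the higher band, which is an assumption about how the boundaries are meant to be read.

```python
def interpret_vt_detections(detections):
    """Map a VirusTotal detection count to the interpretation bands above."""
    if detections < 0:
        raise ValueError("detection count cannot be negative")
    if detections == 0:
        return "new sample, false positive, or clean: requires behavioral analysis"
    if detections < 5:
        return "possibly new variant or targeted: suspicious"
    if detections < 15:
        return "flagged as malicious by multiple vendors"
    return "well-known malware"
```

The returned string is only a starting point; the report should still explain why the count supports the conclusion.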
Always explain your reasoning:
"This sample is identified as RedLine Stealer by MalwareBazaar with 45/70 VT detections. The high detection rate and presence in curated malware repositories confirm this is a known threat, not a false positive."
2.2 Behavioral Analysis from Static Indicators
API Analysis - Map APIs to behaviors:
| API Pattern | Likely Behavior | Reasoning |
|---|---|---|
| VirtualAlloc + VirtualProtect + WriteProcessMemory + CreateRemoteThread | Process Injection | This is the classic injection pattern: allocate memory, make it executable, write code, execute in target |
| CredEnumerate, CryptUnprotectData | Credential Theft | These APIs specifically access Windows credential stores and DPAPI-protected data (browser passwords) |
| InternetOpen + URLDownloadToFile | Downloader | Initializes an HTTP session and downloads files: classic downloader behavior |
| RegSetValueEx + Run key paths in strings | Persistence | Writing to Run keys ensures execution at startup |
| IsDebuggerPresent, GetTickCount, NtQuerySystemInformation | Anti-Analysis | Multiple evasion checks suggest the malware hides its behavior during analysis |
| CryptEncrypt + file enumeration APIs | Possible Ransomware | Encryption capability combined with file discovery, but the crypto APIs could also serve encrypted C2 traffic |
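The table above can be applied mechanically to an import list before the analyst adds reasoning. The following is a minimal sketch covering four of the rows; `API_PATTERNS`, `normalize`, and `match_behaviors` are hypothetical names, and stripping a trailing `A`/`W` (the ANSI/Unicode suffix on many Win32 imports) is an assumption about how the import names appear in the static analysis output.

```python
# API sets taken from the table above (rows that need string evidence are omitted).
API_PATTERNS = {
    "Process Injection": {"VirtualAlloc", "VirtualProtect",
                          "WriteProcessMemory", "CreateRemoteThread"},
    "Credential Theft": {"CredEnumerate", "CryptUnprotectData"},
    "Downloader": {"InternetOpen", "URLDownloadToFile"},
    "Anti-Analysis": {"IsDebuggerPresent", "GetTickCount",
                      "NtQuerySystemInformation"},
}
KNOWN_APIS = set().union(*API_PATTERNS.values())

def normalize(name):
    """Drop a trailing A/W suffix when the base name is a known API."""
    if name not in KNOWN_APIS and name[-1:] in ("A", "W") and name[:-1] in KNOWN_APIS:
        return name[:-1]
    return name

def match_behaviors(imports):
    """Return {behavior: sorted matching APIs} for every pattern with a hit."""
    seen = {normalize(n) for n in imports}
    return {
        behavior: sorted(apis & seen)
        for behavior, apis in API_PATTERNS.items()
        if apis & seen
    }
```

The result lists evidence per behavior, not a verdict; a single matching API is weak evidence on its own and still needs the reasoning described in this section.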
Always explain your reasoning:
"The presence of VirtualAlloc, VirtualProtect, and CreateRemoteThread together strongly suggests process injection capability. Individually these APIs have legitimate uses, but this specific combination is the textbook pattern for injecting code into other processes."
Packing Analysis:
| Indicator | Meaning | Confidence |
|---|---|---|
| Entropy > 7.0 | Compressed/encrypted content | High |
| Section entropy > 7.0 (especially .text) | Packed code section | High |
| UPX0, UPX1, .aspack, .packed sections | Known packer signatures | Very High |
| RWX sections | Self-modifying code | Medium |
| Small import table with GetProcAddress/LoadLibrary only | Dynamic API resolution | High |
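The per-section rows of the packing table can be checked with a short routine. This is a sketch under stated assumptions: the `Section` dataclass and its fields are hypothetical and do not reflect the actual JSON layout produced by `static_analysis.py`; the confidence labels are copied from the table.

```python
from dataclasses import dataclass

KNOWN_PACKER_SECTIONS = {"UPX0", "UPX1", ".aspack", ".packed"}
RESOLVER_IMPORTS = {"GetProcAddress", "LoadLibrary", "LoadLibraryA", "LoadLibraryW"}

@dataclass
class Section:
    name: str
    entropy: float
    rwx: bool = False  # True when the section is readable, writable and executable

def packing_indicators(sections, imports):
    """Return a list of (finding, confidence) pairs based on the table above."""
    findings = []
    for s in sections:
        if s.name in KNOWN_PACKER_SECTIONS:
            findings.append((f"packer section name {s.name}", "Very High"))
        if s.entropy > 7.0:
            findings.append((f"section {s.name} entropy {s.entropy:.1f} > 7.0", "High"))
        if s.rwx:
            findings.append((f"section {s.name} is RWX", "Medium"))
    # An import table containing only resolver functions hints at dynamic API resolution.
    if imports and set(imports) <= RESOLVER_IMPORTS:
        findings.append(("imports limited to GetProcAddress/LoadLibrary", "High"))
    return findings
```

An empty result does not prove the sample is unpacked; custom packers may avoid all of these markers.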
If packed, state the implication:
"This sample shows multiple packing indicators (entropy 7.4, UPX sections). The static analysis findings represent the unpacker stub, NOT the actual payload. Dynamic analysis is required to reveal true functionality."
2.3 Capability Assessment
Based on the evidence, determine what the malware CAN DO:
| Capability | Required Evidence | Confidence Level |
|---|---|---|
| Process Injection | 2+ injection APIs | High if 3+, Medium if 2 |
| Credential Theft | Any cred access API | High (these are specific) |
| Keylogging | SetWindowsHookEx | Medium (has legit uses) |
| Network C2 | 2+ network APIs + extracted URLs/IPs | High |
| File Download | URLDownloadToFile or similar | High |
| Persistence | Registry/service APIs + relevant strings | Medium |
| Encryption/Ransomware | Crypto APIs + file enumeration | Medium (needs context) |
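As an illustration of the Process Injection row (High if 3+, Medium if 2), the rule can be written as a small function. This is a sketch: `injection_confidence` is a hypothetical name, and the set of injection APIs is assumed to be the four listed in the API table in section 2.2.

```python
INJECTION_APIS = {"VirtualAlloc", "VirtualProtect",
                  "WriteProcessMemory", "CreateRemoteThread"}

def injection_confidence(imports):
    """Return (confidence, evidence) for process injection, or (None, evidence)."""
    evidence = sorted(INJECTION_APIS & set(imports))
    if len(evidence) >= 3:
        return "High", evidence
    if len(evidence) == 2:
        return "Medium", evidence
    # Fewer than two injection APIs does not meet the table's evidence requirement.
    return None, evidence
```

Returning the evidence alongside the confidence keeps the report traceable: the analyst can cite exactly which APIs supported the rating.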
State confidence and reasoning:
"Credential Theft Capability: HIGH CONFIDENCE. CryptUnprotectData is present, which specifically decrypts DPAPI-protected data including browser passwords. Few applications other than browsers and credential managers have a legitimate reason to call this API."
2.4 Risk Assessment
Determine risk level with justification:
| Risk Level | Criteria |
|---|---|
| CRITICAL | Credential theft APIs, process injection, confirmed malware family known for data theft/ransomware |
| HIGH | Multiple malicious capabilities, network C2, persistence mechanisms |
| MEDIUM | Suspicious indicators but no confirmed malicious capability, or packing hiding true behavior |
| LOW | Few indicators, possibly legitimate software with suspicious patterns |
| UNKNOWN | Insufficient evidence, heavily packed, or no TI hits |
Step 3: Write the Report
Structure your report as follows:
```markdown
Threat Analysis Report: [MALWARE_NAME or "Unknown Sample"]

| Risk Level | [CRITICAL/HIGH/MEDIUM/LOW] |
| Confidence | [High/Medium/Low] |
| Analysis Date | [DATE] |

Executive Summary
[2-3 sentences: What is this? Is it malicious? What can it do? How do we know?]
Key Finding: [One sentence bottom line]

Threat Intelligence Assessment
[What do TI sources tell us? Explain what each finding means]
- VirusTotal: [X/Y detections] - [what this means]
- MalwareBazaar: [Found/Not found] - [what this means]
- Family Attribution: [Family name] - [what this family typically does]
Assessment: [Your reasoned conclusion based on TI]

Behavioral Analysis

Identified Capabilities

[Capability 1: e.g., "Process Injection"]
- Confidence: [High/Medium/Low]
- Evidence: [List the specific APIs/strings found]
- Reasoning: [Explain WHY this evidence indicates this capability]

[Capability 2: e.g., "Credential Theft"]
...

Packing Assessment
[Is it packed? What does this mean for the analysis?]

Anti-Analysis Techniques
[What evasion techniques were identified?]

MITRE ATT&CK Mapping
| Tactic | Technique | ID | Evidence |
|---|---|---|---|
| [Only include techniques you can justify with evidence] |

Indicators of Compromise

File Indicators
[Hashes]

Network Indicators
[Defanged IPs, domains, URLs - only if extracted]

Host Indicators
[Registry keys, file paths, mutexes - only if found]

Risk Assessment
Overall Risk: [LEVEL]
This assessment is based on:
- [Reason 1]
- [Reason 2]
- [Reason 3]
Confidence in Assessment: [High/Medium/Low]
- [Why this confidence level]

Recommendations

Immediate Actions
[What should be done RIGHT NOW based on risk level]

Detection Opportunities
[How to detect this threat]

Further Analysis Needed
[What questions remain unanswered]
```
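The report template asks for defanged network indicators. A common convention is to rewrite `http` as `hxxp` and bracket the dots so the indicator cannot be clicked or auto-resolved. The sketch below follows that convention; `defang` is a hypothetical helper, and bracketing every dot (including those in a URL path) is a simplifying assumption.

```python
def defang(indicator):
    """Make an IP, domain, or URL non-clickable for inclusion in a report."""
    out = indicator.replace("https://", "hxxps://").replace("http://", "hxxp://")
    # Bracket every dot so neither hostnames nor IP addresses auto-link.
    return out.replace(".", "[.]")
```

For example, `defang("10.0.0.1")` yields `10[.]0[.]0[.]1`.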
Entropy Interpretation
| Entropy | Meaning |
|---|---|
| 0-1 | Highly structured (empty, repetitive) |
| 4-5 | Plain text, readable strings |
| 5-6 | Compiled code (normal .text section) |
| 6-7 | Compressed data, some obfuscation |
| 7-8 | Encrypted/compressed (PACKED) |
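The values in this table are Shannon entropy measured in bits per byte, which ranges from 0 (a single repeated byte) to 8 (uniformly random bytes). A minimal sketch of the calculation follows; `shannon_entropy` is an illustrative name, and whether `static_analysis.py` computes entropy exactly this way is an assumption.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Return the Shannon entropy of a bytes object in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over the frequency of each distinct byte value.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )
```

Running this over each PE section separately, rather than the whole file, gives the per-section values used in the packing analysis.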
File Signatures
| Bytes | Type |
|---|---|
| 4D 5A (MZ) | PE executable |
| 50 4B (PK) | ZIP/Office document |
| 7F 45 4C 46 | ELF executable |
| D0 CF 11 E0 | OLE/Legacy Office |
| 25 50 44 46 (%PDF) | PDF document |
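Checking a file against these signatures only requires comparing its first bytes. The following is a small sketch; `SIGNATURES` and `identify` are hypothetical names, and the PDF label for `25 50 44 46` reflects the fact that those bytes spell `%PDF`.

```python
# Magic bytes from the table above, paired with the file type they indicate.
SIGNATURES = [
    (bytes.fromhex("4D5A"), "PE executable"),
    (bytes.fromhex("504B"), "ZIP/Office document"),
    (bytes.fromhex("7F454C46"), "ELF executable"),
    (bytes.fromhex("D0CF11E0"), "OLE/Legacy Office"),
    (bytes.fromhex("25504446"), "PDF document"),
]

def identify(header):
    """Return the file type whose magic bytes prefix `header`, or None."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    return None
```

In practice `header` would be the first few bytes read from the sample, for example `open(path, "rb").read(8)`.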
Example Analysis Reasoning
BAD (data dump):
"Found APIs: VirtualAlloc, CreateRemoteThread, RegSetValueEx. Entropy: 7.2. VT: 34/70."
GOOD (analyst reasoning):
"This sample demonstrates process injection capability (MEDIUM CONFIDENCE, two injection APIs) based on the presence of VirtualAlloc and CreateRemoteThread. These APIs, when used together, form the classic code injection pattern where memory is allocated in a target process and a thread is created to execute the injected code. The high entropy (7.2) suggests the payload is packed, meaning the observed APIs may belong to the unpacker stub rather than the final payload. The 34/70 VirusTotal detection rate confirms this is recognized malware, with multiple vendors identifying it as a variant of Agent Tesla, an info-stealer known for credential harvesting. Given the injection capability and association with a credential-stealing family, this sample poses a CRITICAL risk to credential security on any system where it executes."
Scripts Reference
static_analysis.py
```bash
python3 scripts/static_analysis.py <file> -f [text|json]
```
Extracts: hashes, file type, PE headers, sections, entropy, imports, strings, suspicious indicators
triage.py
```bash
python3 scripts/triage.py <ioc> -f [text|json]
python3 scripts/triage.py -t file <filepath> -f json
python3 scripts/triage.py --status  # Check API config
```
Queries: MalwareBazaar, ThreatFox, URLhaus, VirusTotal, AbuseIPDB
extract_iocs.py
```bash
python3 scripts/extract_iocs.py <file> -f [text|json|csv]
```
Extracts: IPs, domains, URLs, emails, hashes, registry keys, file paths, crypto wallets, mutexes