malware-analysis

Malware Analysis Skill

This skill produces analyst-grade threat reports — not data dumps. Every conclusion must be backed by evidence and reasoning.

Core Principles

  1. Evidence-based reasoning: Never state a conclusion without explaining WHY
  2. Connect the dots: Link indicators to behaviors to capabilities to impact
  3. Assess confidence: State how confident you are and why
  4. Actionable output: Reports should enable decisions, not just inform

Analysis Workflow

Step 1: Collect Data

Run all three scripts to gather raw data:
bash
# Static analysis - get hashes, PE info, strings, APIs, entropy
python3 scripts/static_analysis.py /path/to/sample -f json > static.json

# Threat intelligence - check reputation across sources
python3 scripts/triage.py -t file /path/to/sample -f json > triage.json

# IOC extraction - extract network/host indicators
python3 scripts/extract_iocs.py /path/to/sample -f json > iocs.json

Step 2: Analyze and Reason (THIS IS THE KEY STEP)

Using the collected data, perform analyst-grade reasoning:

2.1 Threat Intelligence Assessment

Ask yourself:
  • Is this sample known? If it is found in MalwareBazaar or ThreatFox, treat it as confirmed malware
  • What's the VT detection rate?
    • 0 detections: New sample, false positive, or clean; requires behavioral analysis
    • 1-4 detections: Possibly a new variant or a targeted sample; suspicious
    • 5-14 detections: Flagged as malicious by multiple vendors
    • 15+ detections: Well-known malware
  • What family is it attributed to? Research that family's typical behavior
  • When was it first seen? A recent first-seen date suggests an active campaign
Always explain your reasoning:
"This sample is identified as RedLine Stealer by MalwareBazaar with 45/70 VT detections. The high detection rate and presence in a curated malware repository confirm this is a known threat, not a false positive."

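The detection-count bands from section 2.1 can be expressed as a small helper. This is a minimal sketch and not part of the skill's scripts: the function name `classify_vt_detections` is hypothetical, and it assumes the bands are read as 1-4, 5-14, and 15 or more so that every count falls into exactly one band.

```python
def classify_vt_detections(detections: int) -> str:
    """Map a VirusTotal detection count to a triage band (assumed boundaries)."""
    if detections < 0:
        raise ValueError("detection count cannot be negative")
    if detections == 0:
        return "new sample, false positive, or clean: requires behavioral analysis"
    if detections <= 4:
        return "possibly a new variant or targeted sample: suspicious"
    if detections <= 14:
        return "flagged as malicious by multiple vendors"
    return "well-known malware"


if __name__ == "__main__":
    for count in (0, 3, 9, 45):
        print(count, "->", classify_vt_detections(count))
```

The band is only a starting point for the written assessment; the report should still explain what the count means for this particular sample.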
2.2 Behavioral Analysis from Static Indicators

API Analysis - Map APIs to behaviors:
| API Pattern | Likely Behavior | Reasoning |
|---|---|---|
| VirtualAlloc + VirtualProtect + WriteProcessMemory + CreateRemoteThread | Process Injection | This is the classic injection pattern: allocate memory (VirtualAllocEx when the target is a remote process), make it executable, write code, execute in target |
| CredEnumerate, CryptUnprotectData | Credential Theft | These APIs specifically access Windows credential stores and DPAPI-protected data (browser passwords) |
| InternetOpen + URLDownloadToFile | Downloader | Initializes HTTP and downloads files: classic downloader behavior |
| RegSetValueEx + Run key paths in strings | Persistence | Writing to Run keys ensures execution at startup |
| IsDebuggerPresent, GetTickCount, NtQuerySystemInformation | Anti-Analysis | Multiple evasion checks suggest the malware hides its behavior during analysis |
| CryptEncrypt + file enumeration APIs | Possible Ransomware | Encryption capability combined with file discovery, but could also be secure C2 |
Always explain your reasoning:
"The presence of VirtualAlloc, VirtualProtect, and CreateRemoteThread together strongly suggests process injection capability. Individually these APIs have legitimate uses, but this specific combination is the textbook pattern for injecting code into other processes."
Packing Analysis:
| Indicator | Meaning | Confidence |
|---|---|---|
| Entropy > 7.0 | Compressed/encrypted content | High |
| Section entropy > 7.0 (especially .text) | Packed code section | High |
| UPX0, UPX1, .aspack, .packed sections | Known packer signatures | Very High |
| RWX sections | Self-modifying code | Medium |
| Small import table with GetProcAddress/LoadLibrary only | Dynamic API resolution | High |
If packed, state the implication:
"This sample shows multiple packing indicators (entropy 7.4, UPX sections). The static analysis findings represent the unpacker stub, NOT the actual payload. Dynamic analysis is required to reveal true functionality."

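The API and packing tables in section 2.2 can be applied mechanically before the reasoning is written up. The sketch below is illustrative only: `BEHAVIOR_RULES`, `match_behaviors`, `packing_indicators`, the minimum-hit thresholds, and the shape of the `sections` input (dicts with `name`, `entropy`, and an optional `rwx` flag) are all assumptions, not the output format of `static_analysis.py`. The persistence and ransomware rows are left out because they also depend on strings and file-enumeration evidence.

```python
# Rules copied from the API table; the minimum-hit thresholds are assumptions.
BEHAVIOR_RULES = {
    "Process Injection": ({"VirtualAlloc", "VirtualProtect",
                           "WriteProcessMemory", "CreateRemoteThread"}, 2),
    "Credential Theft": ({"CredEnumerate", "CryptUnprotectData"}, 1),
    "Downloader": ({"InternetOpen", "URLDownloadToFile"}, 2),
    "Anti-Analysis": ({"IsDebuggerPresent", "GetTickCount",
                       "NtQuerySystemInformation"}, 2),
}
KNOWN_PACKER_SECTIONS = {"UPX0", "UPX1", ".aspack", ".packed"}
RESOLVER_APIS = {"GetProcAddress", "LoadLibrary"}


def base_name(api):
    """Drop the ANSI/Unicode suffix, e.g. InternetOpenA -> InternetOpen."""
    return api[:-1] if api.endswith(("A", "W")) else api


def match_behaviors(imports):
    """Return {behavior: matched APIs} for every rule that meets its threshold."""
    present = {base_name(api) for api in imports} | set(imports)
    findings = {}
    for behavior, (apis, minimum) in BEHAVIOR_RULES.items():
        hits = sorted(apis & present)
        if len(hits) >= minimum:
            findings[behavior] = hits
    return findings


def packing_indicators(file_entropy, sections, imports):
    """Return (indicator, confidence) pairs following the packing table."""
    findings = []
    if file_entropy > 7.0:
        findings.append(("file entropy > 7.0", "High"))
    for section in sections:
        name = section["name"]
        if section["entropy"] > 7.0:
            findings.append((f"section {name} entropy > 7.0", "High"))
        if name in KNOWN_PACKER_SECTIONS:
            findings.append((f"known packer section {name}", "Very High"))
        if section.get("rwx", False):
            findings.append((f"RWX section {name}", "Medium"))
    if imports and all(base_name(api) in RESOLVER_APIS for api in imports):
        findings.append(("imports limited to GetProcAddress/LoadLibrary", "High"))
    return findings
```

The matched API names and indicator strings are meant to be quoted as evidence in the report; they do not replace the explanation of why the combination matters.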
2.3 Capability Assessment

Based on the evidence, determine what the malware CAN DO:
| Capability | Required Evidence | Confidence Level |
|---|---|---|
| Process Injection | 2+ injection APIs | High if 3+, Medium if 2 |
| Credential Theft | Any cred access API | High (these are specific) |
| Keylogging | SetWindowsHookEx | Medium (has legit uses) |
| Network C2 | 2+ network APIs + extracted URLs/IPs | High |
| File Download | URLDownloadToFile or similar | High |
| Persistence | Registry/service APIs + relevant strings | Medium |
| Encryption/Ransomware | Crypto APIs + file enumeration | Medium (needs context) |
State confidence and reasoning:
"Credential Theft Capability: HIGH CONFIDENCE. CryptUnprotectData is present, which decrypts DPAPI-protected data, including browser passwords. Outside browsers and credential managers, few applications have a legitimate reason to call it."

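The confidence rules in the capability table can be sketched as a lookup over the imported API names. This is a hedged illustration, not the skill's implementation: `assess_capabilities` is a hypothetical name, the input is assumed to be a list of import names with any A/W suffix already removed, and the `NETWORK_APIS` set is an illustrative choice. Only the rows that depend on API names and extracted network IOCs are covered.

```python
INJECTION_APIS = {"VirtualAlloc", "VirtualProtect",
                  "WriteProcessMemory", "CreateRemoteThread"}
CREDENTIAL_APIS = {"CredEnumerate", "CryptUnprotectData"}
# Illustrative set of WinINet/URLMon names; extend as needed.
NETWORK_APIS = {"InternetOpen", "InternetConnect",
                "HttpSendRequest", "URLDownloadToFile"}


def assess_capabilities(imports, network_iocs=()):
    """Return {capability: confidence} following the capability table."""
    present = set(imports)
    results = {}

    injection_hits = INJECTION_APIS & present
    if len(injection_hits) >= 3:
        results["Process Injection"] = "High"
    elif len(injection_hits) == 2:
        results["Process Injection"] = "Medium"

    if CREDENTIAL_APIS & present:
        results["Credential Theft"] = "High"

    if "SetWindowsHookEx" in present:
        results["Keylogging"] = "Medium"

    # Network C2 needs both API evidence and at least one extracted URL/IP.
    if len(NETWORK_APIS & present) >= 2 and network_iocs:
        results["Network C2"] = "High"

    return results
```

A capability that is absent from the result was not supported by the evidence supplied; on a packed sample that says little about the final payload.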
2.4 Risk Assessment

Determine risk level with justification:
| Risk Level | Criteria |
|---|---|
| CRITICAL | Credential theft APIs, process injection, confirmed malware family known for data theft/ransomware |
| HIGH | Multiple malicious capabilities, network C2, persistence mechanisms |
| MEDIUM | Suspicious indicators but no confirmed malicious capability, or packing hiding true behavior |
| LOW | Few indicators, possibly legitimate software with suspicious patterns |
| UNKNOWN | Insufficient evidence, heavily packed, or no TI hits |

Step 3: Write the Report

Structure your report as follows:
markdown
# Threat Analysis Report: [MALWARE_NAME or "Unknown Sample"]

| Field | Value |
|---|---|
| Risk Level | [CRITICAL/HIGH/MEDIUM/LOW] |
| Confidence | [High/Medium/Low] |
| Analysis Date | [DATE] |

## Executive Summary

[2-3 sentences: What is this? Is it malicious? What can it do? How do we know?]
Key Finding: [One sentence bottom line]

## Threat Intelligence Assessment

[What do TI sources tell us? Explain what each finding means]
  • VirusTotal: [X/Y detections] - [what this means]
  • MalwareBazaar: [Found/Not found] - [what this means]
  • Family Attribution: [Family name] - [what this family typically does]
Assessment: [Your reasoned conclusion based on TI]

## Behavioral Analysis

### Identified Capabilities

#### [Capability 1: e.g., "Process Injection"]

  • Confidence: [High/Medium/Low]
  • Evidence: [List the specific APIs/strings found]
  • Reasoning: [Explain WHY this evidence indicates this capability]

#### [Capability 2: e.g., "Credential Theft"]

...

### Packing Assessment

[Is it packed? What does this mean for the analysis?]

### Anti-Analysis Techniques

[What evasion techniques were identified?]

## MITRE ATT&CK Mapping

| Tactic | Technique | ID | Evidence |
|---|---|---|---|
| [Only include techniques you can justify with evidence] | | | |

## Indicators of Compromise

### File Indicators

[Hashes]

### Network Indicators

[Defanged IPs, domains, URLs - only if extracted]

### Host Indicators

[Registry keys, file paths, mutexes - only if found]

## Risk Assessment

Overall Risk: [LEVEL]
This assessment is based on:
  1. [Reason 1]
  2. [Reason 2]
  3. [Reason 3]
Confidence in Assessment: [High/Medium/Low]
  • [Why this confidence level]

## Recommendations

### Immediate Actions

[What should be done RIGHT NOW based on risk level]

### Detection Opportunities

[How to detect this threat]

### Further Analysis Needed

[What questions remain unanswered]
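The report template asks for defanged network indicators. One common convention, sketched below, rewrites the `http` scheme as `hxxp` and brackets every dot so the value cannot be clicked or auto-resolved. The function name `defang` is hypothetical and this is only one of several defanging styles in use; it is not taken from `extract_iocs.py`.

```python
import re


def defang(indicator: str) -> str:
    """Rewrite an IP, domain, or URL so it is no longer clickable."""
    # http:// and https:// become hxxp:// and hxxps://
    defanged = re.sub(r"^http", "hxxp", indicator, flags=re.IGNORECASE)
    # Bracket every dot, including dots in the path of a URL.
    return defanged.replace(".", "[.]")


if __name__ == "__main__":
    for value in ("https://evil.example.com/a.php", "192.0.2.10"):
        print(defang(value))
```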

Entropy Interpretation

| Entropy | Meaning |
|---|---|
| 0-1 | Highly structured (empty, repetitive) |
| 4-5 | Plain text, readable strings |
| 5-6 | Compiled code (normal .text section) |
| 6-7 | Compressed data, some obfuscation |
| 7-8 | Encrypted/compressed (PACKED) |
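The entropy values in this table are Shannon entropy measured in bits per byte, which ranges from 0 (a single repeated byte) to 8 (all 256 byte values equally likely). A minimal sketch of the calculation follows; the function name `shannon_entropy` is an assumption, and `static_analysis.py` may compute the value differently.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 1024))     # a single repeated byte
    print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

The same function can be applied per section by passing each section's raw bytes instead of the whole file.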

File Signatures

| Bytes | Type |
|---|---|
| 4D 5A (MZ) | PE executable |
| 50 4B (PK) | ZIP/Office document |
| 7F 45 4C 46 | ELF executable |
| D0 CF 11 E0 | OLE/Legacy Office |
| 25 50 44 46 | PDF |
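These signatures can be checked by comparing the first bytes of a file against each prefix. The sketch below uses only the five signatures listed in the table; the names `SIGNATURES`, `identify`, and `identify_file` are hypothetical.

```python
# (magic prefix, label) pairs taken from the table above.
SIGNATURES = [
    (b"MZ", "PE executable"),
    (b"PK", "ZIP/Office document"),
    (b"\x7fELF", "ELF executable"),
    (b"\xd0\xcf\x11\xe0", "OLE/Legacy Office"),
    (b"%PDF", "PDF"),
]


def identify(header: bytes) -> str:
    """Return the file type whose magic bytes prefix the header."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first few bytes of a file and identify its type."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))
```

A matching signature only identifies the container format; an MZ header, for example, says nothing about whether the executable is malicious.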

Example Analysis Reasoning

BAD (data dump):
"Found APIs: VirtualAlloc, CreateRemoteThread, RegSetValueEx. Entropy: 7.2. VT: 34/70."
GOOD (analyst reasoning):
"This sample demonstrates process injection capability (MEDIUM CONFIDENCE) based on the presence of VirtualAlloc and CreateRemoteThread. Used together, these APIs form the classic code injection pattern, in which memory is allocated and a thread is created in a target process to execute the injected code. Only two injection APIs are present, so the confidence is Medium rather than High. The high entropy (7.2) suggests the payload is packed, meaning the observed APIs may belong to the unpacker stub rather than the final payload. The 34/70 VirusTotal detection rate confirms this is recognized malware, with multiple vendors identifying it as a variant of Agent Tesla, an info-stealer known for credential harvesting. Given the injection capability and association with a credential-stealing family, this sample poses a CRITICAL risk to credential security on any system where it executes."

Scripts Reference

static_analysis.py

bash
python3 scripts/static_analysis.py <file> -f [text|json]
Extracts: hashes, file type, PE headers, sections, entropy, imports, strings, suspicious indicators

triage.py

bash
python3 scripts/triage.py <ioc> -f [text|json]
python3 scripts/triage.py -t file <filepath> -f json
python3 scripts/triage.py --status  # Check API config
Queries: MalwareBazaar, ThreatFox, URLhaus, VirusTotal, AbuseIPDB

extract_iocs.py

bash
python3 scripts/extract_iocs.py <file> -f [text|json|csv]
Extracts: IPs, domains, URLs, emails, hashes, registry keys, file paths, crypto wallets, mutexes