analyzing-linux-elf-malware

Analyzing Linux ELF Malware

When to Use

  • A Linux server or container has been compromised and suspicious ELF binaries are found
  • Analyzing Linux botnets (Mirai, Gafgyt, XorDDoS), cryptominers, or ransomware
  • Investigating malware targeting cloud infrastructure, Docker containers, or Kubernetes pods
  • Reverse engineering Linux rootkits and kernel modules
  • Analyzing cross-platform malware compiled for Linux x86_64, ARM, or MIPS architectures
Do not use for Windows PE binary analysis; use PEStudio, Ghidra, or IDA for Windows malware.

Prerequisites

  • Ghidra or IDA with Linux ELF support for disassembly and decompilation
  • Linux analysis VM (Ubuntu 22.04 recommended) with development tools installed
  • strace, ltrace, and GDB for dynamic analysis and debugging
  • readelf, objdump, and nm from GNU binutils for static inspection
  • Radare2 for quick binary triage and scripted analysis
  • Docker for isolated container-based malware execution

Workflow

Step 1: Identify ELF Binary Properties

Examine the ELF header and basic properties:

```bash
# File type identification
file suspect_binary

# Detailed ELF header analysis
readelf -h suspect_binary

# Section headers
readelf -S suspect_binary

# Program headers (segments)
readelf -l suspect_binary

# Symbol table (if not stripped)
readelf -s suspect_binary
nm suspect_binary 2>/dev/null

# Dynamic linking information
readelf -d suspect_binary
ldd suspect_binary 2>/dev/null  # Only on matching architecture!

# Compute hashes
md5sum suspect_binary
sha256sum suspect_binary

# Check for packing/UPX
upx -t suspect_binary
```

```python
# Python-based ELF analysis (requires pyelftools)
import hashlib
import math
from collections import Counter

from elftools.elf.elffile import ELFFile

with open("suspect_binary", "rb") as f:
    sha256 = hashlib.sha256(f.read()).hexdigest()

# Keep the file open while parsing: pyelftools reads section data lazily
with open("suspect_binary", "rb") as f:
    elf = ELFFile(f)
    print(f"SHA-256:      {sha256}")
    print(f"Class:        {elf.elfclass}-bit")
    print(f"Endian:       {'Little' if elf.little_endian else 'Big'}")
    print(f"Machine:      {elf.header.e_machine}")
    print(f"Type:         {elf.header.e_type}")
    print(f"Entry Point:  0x{elf.header.e_entry:X}")

    # Check if stripped
    symtab = elf.get_section_by_name('.symtab')
    print(f"Stripped:     {'Yes' if symtab is None else 'No'}")

    # Section entropy analysis (values near 8.0 suggest packed or encrypted data)
    for section in elf.iter_sections():
        data = section.data()
        if len(data) > 0:
            entropy = -sum((c / len(data)) * math.log2(c / len(data))
                           for c in Counter(data).values())
            if entropy > 7.0:
                print(f"  [!] High entropy section: {section.name} ({entropy:.2f})")
```
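When binutils is unavailable, or when triage has to run over many samples, the header fields that `readelf -h` prints can be decoded with the Python standard library alone. The following is a minimal sketch: `parse_elf_header` and the two lookup tables are illustrative names, not part of any library, and only a handful of common type and machine values are mapped.

```python
import struct

# Partial lookup tables; values not listed are reported numerically.
ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
ELF_MACHINES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def parse_elf_header(data: bytes) -> dict:
    """Decode class, endianness, type, machine and entry point from raw bytes."""
    if len(data) < 24 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("invalid EI_CLASS or EI_DATA")
    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields at offsets 16 and 18
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    # e_entry sits at offset 24: 4 bytes for ELF32, 8 bytes for ELF64
    entry_fmt = endian + ("I" if ei_class == 1 else "Q")
    if len(data) < 24 + struct.calcsize(entry_fmt):
        raise ValueError("truncated ELF header")
    (e_entry,) = struct.unpack_from(entry_fmt, data, 24)
    return {
        "class": 32 if ei_class == 1 else 64,
        "endian": "little" if ei_data == 1 else "big",
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": ELF_MACHINES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
    }
```

Passing the first 64 bytes of a sample is enough. Checking the `machine` value before any dynamic analysis helps avoid trying to run an ARM or MIPS sample on an x86_64 host.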

Step 2: Extract Strings and Indicators

Search for embedded IOCs and functionality clues:

```bash
# ASCII strings
strings suspect_binary > strings_output.txt

# Search for network indicators
grep -iE "(http|https|ftp)://" strings_output.txt
grep -iE "([0-9]{1,3}\.){3}[0-9]{1,3}" strings_output.txt
grep -iE "[a-zA-Z0-9.-]+\.(com|net|org|io|ru|cn)" strings_output.txt

# Search for shell commands
grep -iE "(bash|sh|wget|curl|chmod|/tmp/|/dev/)" strings_output.txt

# Search for crypto mining indicators
grep -iE "(stratum|xmr|monero|pool\.|mining)" strings_output.txt

# Search for SSH/credential theft
grep -iE "(ssh|authorized_keys|id_rsa|shadow|passwd)" strings_output.txt

# Search for persistence mechanisms
grep -iE "(crontab|systemd|init\.d|rc\.local|ld\.so\.preload)" strings_output.txt

# FLOSS for obfuscated strings (if available)
floss suspect_binary
```
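The network-indicator searches above can also be run directly against the raw bytes of the sample, which is convenient in an automated pipeline. This is a rough sketch using only the standard library; `extract_strings` and `find_network_iocs` are hypothetical helper names, and the patterns are deliberately simple, so expect false positives such as version numbers that look like IP addresses.

```python
import re

URL_RE = re.compile(rb"(?:https?|ftp)://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII, similar to the strings utility."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def find_network_iocs(data: bytes) -> dict:
    """Collect URLs and plausible IPv4 addresses from embedded strings."""
    urls, ips = set(), set()
    for s in extract_strings(data):
        urls.update(m.decode("ascii") for m in URL_RE.findall(s))
        for m in IPV4_RE.findall(s):
            text = m.decode("ascii")
            # Reject dotted quads with an octet above 255
            if all(int(part) <= 255 for part in text.split(".")):
                ips.add(text)
    return {"urls": sorted(urls), "ipv4": sorted(ips)}
```

A typical call would read the sample with `open("suspect_binary", "rb").read()` and pass the bytes to `find_network_iocs`.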

Step 3: Analyze System Calls and Library Usage

Identify what system calls and libraries the malware uses:

```bash
# List imported functions (dynamically linked)
readelf -r suspect_binary | grep -E "socket|connect|exec|fork|open|write|bind|listen"

# Trace system calls during execution (in isolated VM only)
strace -f -e trace=network,process,file -o strace_output.txt ./suspect_binary

# Trace library calls
ltrace -f -o ltrace_output.txt ./suspect_binary

# Key system calls to watch:
# Network: socket, connect, bind, listen, accept, sendto, recvfrom
# Process: fork, execve, clone, kill, ptrace
# File: open, read, write, unlink, rename, chmod
# Persistence: inotify_add_watch (file monitoring)
```
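A long `strace_output.txt` is easier to review once the calls are grouped by the categories listed above. The sketch below counts system calls per category; `summarize_strace` is an illustrative name, the line format it expects is the one `strace -f -o` writes (an optional PID prefix followed by `name(`), and `openat` has been added to the file group as an assumption because current C libraries usually issue it instead of `open`.

```python
import re
from collections import Counter

CATEGORIES = {
    "network": {"socket", "connect", "bind", "listen", "accept", "sendto", "recvfrom"},
    "process": {"fork", "execve", "clone", "kill", "ptrace"},
    # openat is not in the original list; added as an assumption (see lead-in)
    "file": {"open", "openat", "read", "write", "unlink", "rename", "chmod"},
    "persistence": {"inotify_add_watch"},
}

# Optional "[pid N]" or bare PID prefix, then the syscall name and "("
LINE_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def summarize_strace(lines) -> dict:
    """Count traced system calls per category; unmatched lines are skipped."""
    summary = {name: Counter() for name in CATEGORIES}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # signals, exit notices, "<... resumed>" fragments
        syscall = match.group(1)
        for name, members in CATEGORIES.items():
            if syscall in members:
                summary[name][syscall] += 1
    return summary
```

Calling `summarize_strace(open("strace_output.txt"))` returns one `Counter` per category, which makes an unexpected burst of `connect` or `execve` calls stand out quickly.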

Step 4: Dynamic Analysis with GDB

Debug the malware to observe runtime behavior:

```bash
# Start GDB with the binary
gdb ./suspect_binary

# Set breakpoints on key functions
(gdb) break main
(gdb) break socket
(gdb) break connect
(gdb) break execve
(gdb) break fork

# Run and analyze
(gdb) run
(gdb) info registers    # View register state
(gdb) x/20s $rdi        # Examine string argument
(gdb) bt                # Backtrace
(gdb) continue

# For stripped binaries, break on entry point
(gdb) break *0x400580   # Entry point from readelf
(gdb) run

# Monitor network connections during execution
# In another terminal:
ss -tlnp   # List listening sockets
ss -tnp    # List established connections
```
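When connection monitoring needs to be scripted rather than watched in a terminal, the socket table that `ss` summarizes can be read from `/proc/net/tcp`. The sketch below decodes that file's hex-encoded addresses. It is a simplified illustration under stated assumptions: IPv4 only, a little-endian analysis host (as on x86_64), and only two TCP states named. `decode_addr` and `parse_proc_net_tcp` are hypothetical helper names.

```python
import socket
import struct

# Subset of kernel TCP state codes; others are returned as raw hex
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}


def decode_addr(hex_addr: str) -> str:
    """Convert an entry such as '0100007F:1F90' to '127.0.0.1:8080'."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumes a little-endian host: the address is printed as a native integer
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def parse_proc_net_tcp(text: str) -> list:
    """Return (local, remote, state) tuples from /proc/net/tcp content."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        rows.append((decode_addr(local), decode_addr(remote),
                     TCP_STATES.get(state, state)))
    return rows
```

On the analysis VM, the input would come from `open("/proc/net/tcp").read()`. Unlike `ss -p`, this file does not name the owning process, so it complements rather than replaces the commands above.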

Step 5: Reverse Engineer with Ghidra

Perform deep code analysis on the ELF binary:
Ghidra Analysis for Linux ELF:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Import: File -> Import -> Select ELF binary
   - Ghidra auto-detects ELF format and architecture
   - Accept default analysis options

2. Key analysis targets:
   - main() function (or entry point if stripped)
   - Socket creation and connection functions
   - Command dispatch logic (switch/case on received data)
   - Encryption/encoding routines
   - Persistence installation code
   - Self-propagation/scanning functions

3. For Mirai-like botnets, look for:
   - Credential list for brute-forcing (telnet/SSH)
   - Attack module selection (UDP flood, SYN flood, ACK flood)
   - Scanner module (port scanning for vulnerable devices)
   - Killer module (killing competing botnets)

4. For cryptominers, look for:
   - Mining pool connection (stratum protocol)
   - Wallet address strings
   - CPU/GPU utilization functions
   - Process hiding techniques
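Encoding routines found during decompilation are often simple, and a single-byte XOR over configuration strings is a common case worth ruling out early. The sketch below brute-forces all 255 non-zero keys and reports those that reveal a known keyword. It is an illustration, not a description of any specific family's scheme; `xor_bytes` and `find_xor_keys` are hypothetical names, and the keyword list is up to the analyst.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_keys(data: bytes, keywords: list) -> dict:
    """Map each single-byte key to the keywords it exposes in data."""
    hits = {}
    for key in range(1, 256):  # key 0 would leave the data unchanged
        decoded = xor_bytes(data, key)
        found = [kw for kw in keywords if kw in decoded]
        if found:
            hits[key] = found
    return hits
```

Reasonable keywords include strings already named in this workflow, such as `b"stratum+tcp://"` or `b"authorized_keys"`. A hit gives a candidate key to confirm against the decoding loop seen in Ghidra.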

Step 6: Analyze Linux-Specific Persistence

Check for persistence mechanisms:

```bash
# Check for LD_PRELOAD rootkit
# Malware writing to /etc/ld.so.preload can hook all dynamic library calls
strings suspect_binary | grep -F "ld.so.preload"

# Check for crontab persistence
strings suspect_binary | grep -i "cron"

# Check for systemd service creation
strings suspect_binary | grep -iE "systemd|\.service|systemctl"

# Check for init script creation
strings suspect_binary | grep -iE "init\.d|rc\.local|update-rc"

# Check for SSH key injection
strings suspect_binary | grep -i "authorized_keys"

# Check for kernel module (rootkit) loading
strings suspect_binary | grep -iE "insmod|modprobe|init_module"

# Check for process hiding
strings suspect_binary | grep -iE "proc|readdir|getdents"
```
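The individual greps above can be folded into one pass that labels each matching string with the persistence technique it suggests, which is handy when filling in a report. The following is a minimal sketch: the category names and `classify_persistence` are illustrative choices, and the patterns simply mirror the grep expressions in this step (the broad process-hiding pattern is left out because it matches too many benign strings).

```python
import re

PERSISTENCE_PATTERNS = {
    "ld_preload": re.compile(r"ld\.so\.preload"),
    "cron": re.compile(r"cron", re.IGNORECASE),
    "systemd": re.compile(r"systemd|\.service|systemctl", re.IGNORECASE),
    "init_script": re.compile(r"init\.d|rc\.local|update-rc", re.IGNORECASE),
    "ssh_key": re.compile(r"authorized_keys", re.IGNORECASE),
    "kernel_module": re.compile(r"insmod|modprobe|init_module", re.IGNORECASE),
}


def classify_persistence(strings) -> dict:
    """Group extracted strings by the persistence mechanism they hint at."""
    findings = {}
    for s in strings:
        for category, pattern in PERSISTENCE_PATTERNS.items():
            if pattern.search(s):
                findings.setdefault(category, []).append(s)
    return findings
```

The input can be the lines of `strings_output.txt` from Step 2. A string hit only shows that the binary mentions a mechanism; confirm actual use in the strace log or the decompiled code.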

Key Concepts

| Term | Definition |
|------|------------|
| ELF (Executable and Linkable Format) | Standard binary format for Linux executables, shared libraries, and core dumps containing headers, sections, and segments |
| Stripped Binary | ELF binary with debug symbols removed, making reverse engineering more difficult as function names are lost |
| LD_PRELOAD | Linux environment variable specifying shared libraries to load before all others; abused by rootkits to intercept system library calls |
| strace | Linux system call tracer that logs all system calls and signals made by a process, revealing file, network, and process operations |
| GOT/PLT | Global Offset Table and Procedure Linkage Table; ELF structures for dynamic linking that can be hijacked for function hooking |
| Statically Linked | Binary compiled with all library code included; common in IoT malware to run on systems without matching shared libraries |
| Mirai | Prolific Linux botnet targeting IoT devices via telnet brute-force; source code leaked, leading to many variants |
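The distinction between statically and dynamically linked binaries can be checked programmatically: a dynamically linked executable normally carries a `PT_INTERP` program header naming the loader, while a statically linked one does not. The sketch below walks the program header table with the standard library. `has_interpreter` is a hypothetical helper, and the result is a heuristic only, since static-PIE builds and deliberately malformed samples can blur the line.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interpreter(data: bytes) -> bool:
    """Return True if the ELF image contains a PT_INTERP program header."""
    if len(data) < 52 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file or header truncated")
    is64 = data[4] == 2
    endian = "<" if data[5] == 1 else ">"
    if is64:
        # ELF64: e_phoff at 32, e_phentsize at 54, e_phnum at 56
        (phoff,) = struct.unpack_from(endian + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        # ELF32: e_phoff at 28, e_phentsize at 42, e_phnum at 44
        (phoff,) = struct.unpack_from(endian + "I", data, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 42)
    for i in range(phnum):
        offset = phoff + i * phentsize
        if offset + 4 > len(data):
            break  # table runs past the end of the data
        # p_type is the first 32-bit field in both ELF32 and ELF64 entries
        (p_type,) = struct.unpack_from(endian + "I", data, offset)
        if p_type == PT_INTERP:
            return True
    return False
```

A `False` result on an IoT sample is consistent with the static linking described in the table and is a hint that library call tracing with ltrace will show little.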

Tools & Systems

  • Ghidra: NSA reverse engineering tool with full ELF support for x86, x86_64, ARM, MIPS, and other Linux architectures
  • Radare2: Open-source reverse engineering framework with command-line interface for quick binary analysis and scripting
  • strace: Linux system call tracing tool for observing binary behavior including file, network, and process operations
  • GDB: GNU Debugger for setting breakpoints, examining memory, and stepping through Linux binary execution
  • pyelftools: Python library for parsing ELF files programmatically for automated analysis pipelines

Common Scenarios

Scenario: Analyzing a Cryptominer Found on a Compromised Linux Server

Context: A cloud server shows 100% CPU usage. Investigation reveals an unknown binary running from /tmp with a suspicious name. The binary needs analysis to confirm it is a cryptominer and identify the attacker's wallet and pool.
Approach:
  1. Copy the binary to an analysis VM and compute SHA-256 hash
  2. Run `file` and `readelf` to identify architecture and linking type
  3. Extract strings and search for mining pool addresses (stratum+tcp://) and wallet addresses
  4. Run with strace in a sandbox to observe network connections (mining pool connection)
  5. Import into Ghidra to identify the mining algorithm and configuration extraction
  6. Check for persistence mechanisms (crontab, systemd service, SSH keys)
  7. Document all IOCs including pool address, wallet, C2 for updates, and persistence artifacts
Pitfalls:
  • Running `ldd` on malware outside a sandbox (ldd can execute code in the binary)
  • Not checking for ARM/MIPS architecture before attempting x86_64 execution
  • Missing companion scripts (.sh files) that may handle persistence and cleanup
  • Ignoring the initial access vector (how the miner was deployed: SSH brute force, web exploit, container escape)
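The pool and wallet search in the approach can be sketched as a pair of regular expressions run over the extracted strings. This is an illustrative example rather than a complete detector: `find_miner_config` is a hypothetical name, only the `stratum+tcp://` scheme mentioned above is matched, and the wallet pattern assumes the standard 95-character Monero address form beginning with `4`, so subaddresses and other coins would need their own patterns.

```python
import re

# Pool URL as written in the approach: stratum+tcp://host:port
STRATUM_RE = re.compile(r"stratum\+tcp://[A-Za-z0-9.\-]+:\d{1,5}")
# Assumed format: standard Monero address, 95 base58 characters starting with 4
MONERO_ADDR_RE = re.compile(r"\b4[0-9AB][1-9A-HJ-NP-Za-km-z]{93}\b")


def find_miner_config(strings) -> dict:
    """Collect candidate mining pool URLs and wallet addresses."""
    pools, wallets = set(), set()
    for s in strings:
        pools.update(STRATUM_RE.findall(s))
        wallets.update(MONERO_ADDR_RE.findall(s))
    return {"pools": sorted(pools), "wallets": sorted(wallets)}
```

Miners often receive their configuration on the command line or from a companion script, so it is worth running the same search over any `.sh` files found next to the binary.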

Output Format

LINUX ELF MALWARE ANALYSIS REPORT
====================================
File:             /tmp/.X11-unix/.rsync
SHA-256:          e3b0c44298fc1c149afbf4c8996fb924...
Type:             ELF 64-bit LSB executable, x86-64
Linking:          Statically linked (all libraries embedded)
Stripped:         Yes
Size:             2,847,232 bytes
Packer:           UPX 3.96 (unpacked for analysis)

CLASSIFICATION
Family:           XMRig Cryptominer (modified)
Variant:          Custom build with C2 update mechanism

FUNCTIONALITY
[*] XMR (Monero) mining via RandomX algorithm
[*] Stratum pool connection for work submission
[*] C2 check-in for configuration updates
[*] Process name masquerading (argv[0] = "[kworker/0:0]")
[*] Competitor process killing (kills other miners)
[*] SSH key injection for re-access

NETWORK INDICATORS
Mining Pool:      stratum+tcp://pool.minexmr[.]com:4444
C2 Server:        hxxp://update.malicious[.]com/config
Wallet:           49jZ5Q3b...Monero_Wallet_Address...

PERSISTENCE
[1] Crontab entry: */5 * * * * /tmp/.X11-unix/.rsync
[2] SSH key added to /root/.ssh/authorized_keys
[3] Systemd service: /etc/systemd/system/rsync-daemon.service
[4] Modified /etc/ld.so.preload for process hiding

PROCESS HIDING
LD_PRELOAD:       /usr/lib/.libsystem.so
Hook:             readdir() to hide /tmp/.X11-unix/.rsync from ls
Hook:             fopen() to hide from /proc/*/maps reading
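The network indicators in the report are defanged (`hxxp://`, `[.]`) so they cannot be clicked or auto-resolved by accident. A small helper can apply the same convention consistently. The sketch below is one possible approach, with `defang` as a hypothetical name; it brackets only the last dot of the host and rewrites an `http` scheme, matching the style shown in the report.

```python
import re


def defang(indicator: str) -> str:
    """Rewrite a URL, domain, or IP so it is no longer directly usable."""
    if "://" in indicator:
        scheme, rest = indicator.split("://", 1)
        sep = "://"
    else:
        scheme, sep, rest = "", "", indicator
    # The host ends at the first ":" (port) or "/" (path)
    match = re.match(r"[^/:]+", rest)
    host = match.group(0) if match else ""
    tail = rest[len(host):]
    head, dot, last = host.rpartition(".")
    if dot:
        host = f"{head}[.]{last}"
    if scheme.lower().startswith("http"):
        scheme = "hxxp" + scheme[4:]
    return f"{scheme}{sep}{host}{tail}"
```

Running every pool, C2, and download URL through one function before it reaches the report keeps the output uniform and avoids leaving a live link in a ticket or email.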