mm-cli-skill

mm CLI

`mm` is a high-performance multimodal context management CLI. It indexes directories instantly (~60ms for 700 files), then exposes Unix-style commands for exploring, querying, and extracting content from images, videos, PDFs, code, and other files.

Always use `--format json` for machine-readable output when parsing results programmatically.

Installation

```bash
# First run `mm --help` or `mm --version` to confirm mm isn't already installed
pip install mm-ctx

# Alternative: shell installer
# macOS / Linux
curl -LsSf https://vlm-run.github.io/mm/install/install.sh | sh

# Windows (PowerShell)
irm https://vlm-run.github.io/mm/install/install.ps1 | iex
```

Commands

| Command   | Purpose                                                                     |
| --------- | --------------------------------------------------------------------------- |
| `find`    | Locate/list files by name/kind/ext/size, tabular listing, tree view, schema |
| `cat`     | Content extraction (auto-detected by file type × mode)                      |
| `grep`    | Content search: text and semantic (via embeddings)                          |
| `wc`      | Count files, bytes, lines, tokens                                           |
| `bench`   | Benchmark suite with statistical analysis                                   |
| `config`  | Extraction mode settings (show, init, set, reset-db, reset-profiles, reset) |
| `profile` | Manage LLM provider profiles (list, add, update, use, remove)               |

Workflow

- Start with `mm find <dir> --tree --depth 1` to see the directory structure.
- Use `mm wc <dir> --by-kind` to estimate token counts for LLM context budgeting.
- Explore with `find`, `grep`, `cat` as needed.
- Use `mm cat <file> -m accurate` for LLM-powered descriptions.

find — locate files, tabular listing, tree view, schema

```bash
mm find <dir> --kind image                           # all images
mm find <dir> --kind video                           # all videos
mm find <dir> --kind document                        # all PDFs/docs
mm find <dir> --kind audio                           # audio files
mm find <dir> --name "test_.*\.py"                   # filter by name (regex)
mm find <dir> -n config                              # filter by name (substring)
mm find <dir> --ext .png,.webp                       # by extension
mm find <dir> --min-size 1mb --max-size 10mb         # by size range
mm find <dir> --kind image --limit 5 --format json   # JSON output, capped
mm find <dir> --sort size --reverse --limit 10       # largest files
```

~63ms via Rust fast path. Piped output is one path per line. `--format json` returns full metadata.
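When a script consumes `--format json` output, a standard JSON parser is enough. The sketch below is a minimal illustration, assuming the output is a JSON array of objects keyed by the documented `files` columns (`path`, `name`, `kind`, `size`). The sample payload and the helper `paths_of_kind` are hypothetical and were not captured from a real run.

```python
import json

# Hypothetical sample of what `mm find <dir> --format json` might print.
# The exact fields are an assumption based on the documented columns.
SAMPLE_OUTPUT = """
[
  {"path": "assets/logo.png", "name": "logo.png", "kind": "image", "size": 20480},
  {"path": "docs/paper.pdf", "name": "paper.pdf", "kind": "document", "size": 1048576},
  {"path": "assets/hero.webp", "name": "hero.webp", "kind": "image", "size": 51200}
]
"""


def paths_of_kind(raw_json: str, kind: str) -> list[str]:
    """Return the `path` of every row whose `kind` matches."""
    rows = json.loads(raw_json)
    return [row["path"] for row in rows if row.get("kind") == kind]


image_paths = paths_of_kind(SAMPLE_OUTPUT, "image")
print(image_paths)
```

In a real script the raw JSON would typically come from the standard output of the `mm find` process, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.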

Tabular listing (default)

```bash
mm find <dir>                                        # all files
mm find <dir> --columns name,kind,size --limit 10    # select columns
mm find <dir> --sort size --reverse --format json    # sorted JSON
```

Tree view

```bash
mm find <dir> --tree                 # full tree with sizes
mm find <dir> --tree --depth 1       # top-level dirs only
mm find <dir> --tree --kind image    # only image files
mm find <dir> --tree --format json   # JSON tree structure
```

Schema inspection

```bash
mm find <dir> --schema                 # Rich table with column docs
mm find <dir> --schema --format json   # machine-readable
```

Include gitignored files

```bash
mm find <dir> --no-ignore                # bypass .gitignore rules
mm find <dir> --no-ignore --kind video   # gitignored videos
mm find <dir> --no-ignore --tree         # tree including ignored dirs
```

Columns in the `files` table:

| Column    | Type      | Description                                                                      |
| --------- | --------- | -------------------------------------------------------------------------------- |
| path      | string    | Relative path from scan root                                                     |
| name      | string    | File name with extension                                                         |
| stem      | string    | File name without extension                                                      |
| ext       | string    | Extension including dot (`.png`, `.pdf`)                                         |
| size      | uint64    | File size in bytes                                                               |
| modified  | timestamp | Last modification time                                                           |
| created   | timestamp | Creation time                                                                    |
| mime      | string    | MIME type (`image/png`, `application/pdf`)                                       |
| kind      | string    | `image`, `video`, `document`, `code`, `audio`, `data`, `config`, `text`, `other` |
| is_binary | bool      | Whether file is binary                                                           |
| depth     | uint16    | Directory depth (0 = top-level)                                                  |
| parent    | string    | Parent directory path                                                            |
| width     | uint32    | Pixel width (images from header, videos via native parsing). Null for non-media. |
| height    | uint32    | Pixel height (images from header, videos via native parsing). Null for non-media.|
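
For downstream processing it can help to load rows into a typed record. The following is a sketch only: `FileRow` and `largest` are hypothetical names, the field subset follows the table above, and the sort mirrors the idea behind `--sort size --reverse --limit N` without claiming to reproduce mm's exact ordering or tie-breaking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRow:
    """Subset of the documented `files` columns (hypothetical record type)."""
    path: str
    kind: str
    size: int                      # bytes
    depth: int                     # 0 = top-level
    width: Optional[int] = None    # null for non-media files
    height: Optional[int] = None   # null for non-media files


def largest(rows: list[FileRow], limit: int) -> list[FileRow]:
    """Sort by size, largest first, and keep the first `limit` rows."""
    return sorted(rows, key=lambda r: r.size, reverse=True)[:limit]


rows = [
    FileRow("a.png", "image", 2_000, 0, 640, 480),
    FileRow("docs/b.pdf", "document", 9_000, 1),
    FileRow("src/c.py", "code", 500, 1),
]
top_two = largest(rows, 2)
print([r.path for r in top_two])
```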

cat — content extraction (pipeline-driven)

Behaviour is auto-detected from file type. The default `--mode fast` runs local extraction (no LLM). Use `-m accurate` for LLM-powered descriptions.

Fast mode (default) — local extraction, no LLM

```bash
mm cat <file>       # text/metadata extraction
mm cat photo.png    # image metadata (dims, MIME, hash, EXIF)
mm cat video.mp4    # video metadata (resolution, duration, codecs)
mm cat paper.pdf    # text extraction via pypdfium2
```

Accurate mode — LLM-powered descriptions

```bash
mm cat photo.png -m accurate   # VLM caption
mm cat video.mp4 -m accurate   # mosaic → VLM description
mm cat audio.mp3 -m accurate   # transcript → LLM summary
mm cat paper.pdf -m accurate   # text → LLM summary
```

Head / tail

```bash
mm cat <file> -n 20    # first 20 lines
mm cat <file> -n -10   # last 10 lines
```
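
The sign convention for `-n` can be pictured with list slicing. This is an illustrative sketch of the documented behaviour (positive means head, negative means tail), not mm's implementation; `head_or_tail` is a hypothetical helper, and how mm treats `-n 0` is not stated in the source.

```python
def head_or_tail(lines: list[str], n: int) -> list[str]:
    """Positive n keeps the first n lines; negative n keeps the last |n| lines."""
    if n >= 0:
        return lines[:n]
    return lines[n:]


sample = [f"line {i}" for i in range(1, 31)]   # 30 numbered lines
first_20 = head_or_tail(sample, 20)
last_10 = head_or_tail(sample, -10)
print(first_20[0], "...", first_20[-1])
print(last_10[0], "...", last_10[-1])
```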

Cache control

```bash
mm cat <file> --no-cache   # bypass cache, force fresh run
```

Output formats

```bash
mm cat <file> --format json   # JSON output
```

Fast mode behavior by file type (<100ms target):

- **PDF** (.pdf): text extraction via pypdfium2. Scanned/image-only PDFs return empty.
- **Document** (.docx, .pptx): text extraction.
- **Image** (.png/.jpg/.webp/.gif/.bmp/.tiff/.svg): dimensions, MIME, xxh3 hash, EXIF data.
- **Video** (.mp4/.mkv/.webm/.avi/.mov): resolution, duration, FPS, codecs (metadata only, no ffmpeg).
- **Audio** (.mp3/.wav/.flac/.aac/.ogg/.m4a): duration, codec, bitrate (metadata only).

cat -p — named encoders and pipeline YAMLs

The `-p` / `--pipeline` flag accepts either a registered encoder name or a YAML file path.

Named encoder (encodes media into VLM-ready JSON messages)

```bash
mm cat photo.png -p image-resize             # Fit to 1024px, base64 encode
mm cat photo.png -p image-tile               # Resized overview + all tiles in one Message
mm cat video.mp4 -p video-frame-sample       # Extract frames at 1fps
mm cat video.mp4 -p video-chunk              # Chunk into 60s segments
mm cat doc.pdf -p document-rasterize         # Render pages as images
mm cat doc.pdf -p document-rasterize-text    # Rasterize + extract text
```

YAML pipeline file

```bash
mm cat photo.png -p custom-pipeline.yaml
```

Multiple pipelines (dispatched by kind field in YAML)

```bash
mm cat *.jpg *.mp4 -p image.yaml -p video.yaml
```

List available encoders and pipelines

```bash
mm cat --list-pipelines
```

Built-in encoders

Use either the bare name or the kind-prefixed display name.

| Name                      | Media    | Description                                          |
| ------------------------- | -------- | ---------------------------------------------------- |
| `image-resize`            | image    | Default. Fit to 1024px bounding box                  |
| `image-tile`              | image    | Resized overview + tile crops in one Message         |
| `video-frame-sample`      | video    | Extract frames at fps (requires ffmpeg)              |
| `video-frames-transcript` | video    | Frames + Whisper transcript (accurate mode default)  |
| `video-chunk`             | video    | Chunk into time-based segments with overlap          |
| `video-mosaic`            | video    | Build mosaic grids from sampled frames               |
| `video-shot-frames`       | video    | Scene detection → representative frames per shot     |
| `video-shot-mosaic`       | video    | Scene detection → mosaic grid per shot               |
| `video-gemini`            | video    | Pass video file as a Gemini Part                     |
| `video-gemini-chunked`    | video    | Chunk video into Gemini Parts                        |
| `audio-transcribe`        | audio    | Transcribe audio via Whisper (fast/accurate default) |
| `audio-gemini`            | audio    | Pass audio file as a Gemini Part                     |
| `document-page-text`      | document | Extract text per page from PDF/DOCX/PPTX             |
| `document-rasterize`      | document | Render PDF pages as images (requires pypdfium2)      |
| `document-rasterize-text` | document | Rasterize + extract text, interleaved                |
| `document-gemini`         | document | Pass document file as a Gemini Part                  |

Writing custom encoders

Create a `.py` file in `python/mm/encoders/image/`, `python/mm/encoders/video/` (auto-discovered) or `~/.config/mm/encoders/`. The `name` is optional; it defaults to the function name with underscores replaced by hyphens:

```python
from pathlib import Path
from mm.encoders import register_encoder

@register_encoder(media_types=("image",))
def my_custom(path: Path, **kw):
    """Registered as 'my-custom' (auto-named from function)."""
    import base64, io
    from PIL import Image
    # Convert to RGB first: JPEG cannot store alpha or palette modes.
    img = Image.open(path).convert("RGB")
    img.thumbnail((1024, 1024))  # shrink in place, preserving aspect ratio
    buf = io.BytesIO()
    img.save(buf, "JPEG", quality=90)
    b64 = base64.b64encode(buf.getvalue()).decode()
    # Yield one chat-style message carrying the image as a data URL.
    yield {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}
    ]}
```
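
The naming rule (underscores replaced by hyphens when no name is given) can be illustrated with a toy decorator. This is a sketch under stated assumptions: `REGISTRY` and `register` are hypothetical stand-ins written for this example and are not mm's `register_encoder`.

```python
from typing import Callable, Optional

# Hypothetical stand-in for an encoder registry; not part of mm.
REGISTRY: dict[str, Callable] = {}


def register(name: Optional[str] = None) -> Callable[[Callable], Callable]:
    """Toy decorator: store the function under `name`, or a name derived from it."""
    def decorator(func: Callable) -> Callable:
        # Default name: function name with underscores replaced by hyphens.
        key = name or func.__name__.replace("_", "-")
        REGISTRY[key] = func
        return func
    return decorator


@register()
def my_custom(path, **kw):
    yield {"role": "user", "content": []}


@register(name="thumb")
def make_thumbnail(path, **kw):
    yield {"role": "user", "content": []}


print(sorted(REGISTRY))
```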

Python API

```python
from mm import process_image, process_image_tiled, process_video, process_document
from pathlib import Path

msg = process_image(Path("photo.png"), max_width=1024)       # Single Message dict
tiles = list(process_image_tiled(Path("scan.png"), tile_size=1024))  # Multiple Messages
chunks = list(process_video(Path("video.mp4")))               # Multiple Messages
pages = list(process_document(Path("doc.pdf")))               # Multiple Messages
```

Via Context

```python
from mm import Context

ctx = Context("~/data")
messages = ctx.encode("photo.png", strategy="resize")
```

cat — pipeline overrides

Pipelines are 2-stage YAMLs: encode (convert to LLM-ready parts) → generate (LLM call). Key parameters can be overridden from the CLI.

Encode overrides (--encode.*)

```bash
mm cat photo.png -m accurate --encode.strategy image-tile      # override encoder
mm cat photo.png -m accurate --encode.pyfunc ~/my_filter.py    # custom transform
```

Generate overrides (--generate.*)

```bash
mm cat photo.png -m accurate --generate.max-tokens 1024          # increase token limit
mm cat photo.png -m accurate --generate.temperature 0.5           # higher temperature
mm cat photo.png -m accurate --generate.prompt "List 3 main objects in this image."
mm cat photo.png -m accurate --generate.json-mode true            # request JSON response
```

Combining overrides

```bash
mm cat photo.png -m accurate \
  --encode.strategy image-tile \
  --generate.max-tokens 512 \
  --generate.prompt "Analyze this architecture diagram."
```

cat -p — explicit pipeline YAML

Load custom pipeline configurations from YAML files. The YAML's `kind` field dispatches the pipeline to the correct media type. The `generate` field is optional; omit it for encode-only pipelines.

Single pipeline

```yaml
# custom-image.yaml
kind: image
mode: accurate
encode:
  strategy: resize
  max_width: 512
generate:
  prompt: "What is in this image? One sentence only."
  max_tokens: 64
```

```bash
mm cat photo.png -p custom-image.yaml
```

Encode-only pipeline (no LLM)

```yaml
# encode-only.yaml
kind: document
mode: fast
encode:
  strategy: null
# generate omitted = encode-only, no LLM call
```

Multi-document YAML

```yaml
kind: image
mode: accurate
encode:
  strategy: image-tile
  max_width: 2048
generate:
  prompt: "Describe this image in detail."
  max_tokens: 512
---
kind: video
mode: accurate
encode:
  mosaic_tile: "8x6"
  mosaic_count: 2
  frame_selection: scene
generate:
  prompt: "Summarize this video."
  max_tokens: 1024
```

```bash
mm cat *.jpg *.mp4 -p multi-pipeline.yaml
```
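
Dispatch by `kind` amounts to a lookup from each file's kind to the pipeline document that declares it. The sketch below works on already-parsed documents and uses a hypothetical extension table (`EXT_TO_KIND`) and helper (`pick_pipeline`); mm determines a file's kind itself, so this only illustrates the idea.

```python
from pathlib import PurePath
from typing import Optional

# Already-parsed pipeline documents, mirroring the multi-document YAML above.
PIPELINES = [
    {"kind": "image", "mode": "accurate", "encode": {"strategy": "image-tile"}},
    {"kind": "video", "mode": "accurate", "encode": {"frame_selection": "scene"}},
]

# Hypothetical extension-to-kind table, for illustration only.
EXT_TO_KIND = {".jpg": "image", ".png": "image", ".mp4": "video", ".mov": "video"}


def pick_pipeline(filename: str, pipelines: list[dict]) -> Optional[dict]:
    """Return the pipeline whose `kind` matches the file, or None if none does."""
    kind = EXT_TO_KIND.get(PurePath(filename).suffix.lower())
    by_kind = {p["kind"]: p for p in pipelines}
    return by_kind.get(kind)


for name in ["holiday.jpg", "clip.mp4", "notes.txt"]:
    chosen = pick_pipeline(name, PIPELINES)
    print(name, "->", chosen["kind"] if chosen else "no matching pipeline")
```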

CLI overrides layer on top of -p

```bash
mm cat photo.png -p my-pipeline.yaml --generate.max-tokens 128
```
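
Layering can be thought of as applying dotted-key overrides to the configuration loaded from YAML. The following is a minimal sketch, not mm's implementation: `apply_overrides` is a hypothetical helper, and the mapping from the CLI spelling `max-tokens` to the YAML key `max_tokens` is an assumption inferred from the examples in this document.

```python
import copy


def apply_overrides(pipeline: dict, overrides: dict) -> dict:
    """Return a copy of `pipeline` with dotted-key overrides applied on top."""
    merged = copy.deepcopy(pipeline)          # leave the loaded config untouched
    for dotted_key, value in overrides.items():
        stage, _, param = dotted_key.partition(".")
        # Assumption: CLI-style `max-tokens` maps to YAML-style `max_tokens`.
        merged.setdefault(stage, {})[param.replace("-", "_")] = value
    return merged


from_yaml = {
    "kind": "image",
    "encode": {"strategy": "resize", "max_width": 512},
    "generate": {"prompt": "Analyze this image.", "max_tokens": 512},
}
effective = apply_overrides(from_yaml, {"generate.max-tokens": 128})
print(effective["generate"])
```

Values not named in an override keep what the YAML file specified, which is the behaviour the `-p` plus `--generate.*` combination describes.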

TOML pipeline path overrides

Override default pipeline paths in `~/.config/mm/mm.toml`:

```toml
[pipelines]
image.fast = "~/.config/mm/pipelines/image/fast.yaml"
video.accurate = "/path/to/my-video-accurate.yaml"
```

cat — custom Python transforms (pyfunc)

The `--encode.pyfunc` flag runs a custom Python transform on the encoded content parts before they are sent to the LLM.

File-based pyfunc

`my_transform.py`:

```python
def transform(parts, context):
    extra = {"type": "text", "text": "Focus on the data flow."}
    return parts + [extra]
```

```bash
mm cat photo.png -m accurate --encode.pyfunc ~/my_transform.py
```

Pyfunc in pipeline YAML

```yaml
kind: image
mode: accurate
encode:
  strategy: resize
  max_width: 512
  pyfunc: ~/my_transforms/filter.py
generate:
  prompt: "Analyze this image."
  max_tokens: 128
```

Inline `def` syntax in YAML:

```yaml
encode:
  pyfunc: |
    def transform(parts, context):
        return [p for p in parts if p.get("type") == "image_url"]
```
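
Because a transform is a plain function over a list of parts, it can be exercised in isolation before being wired into a pipeline. The sketch below runs the image-only filter from the inline example on hypothetical sample parts. The part layout and the empty `context` are assumptions made for illustration, since the source does not describe what `context` contains.

```python
def transform(parts, context):
    """Keep only image parts, as in the inline YAML example."""
    return [p for p in parts if p.get("type") == "image_url"]


# Hypothetical encoded parts; real ones are produced by the encode stage.
parts = [
    {"type": "text", "text": "File: photo.png"},
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,AAAA"}},
]

filtered = transform(parts, context={})   # the shape of `context` is an assumption
print(len(parts), "->", len(filtered))
```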

wc — count files, bytes, lines, tokens

```bash
mm wc <dir>                      # summary totals
mm wc <dir> --by-kind            # breakdown by file kind
mm wc <dir> --kind document      # only documents
mm wc <dir> --format json        # machine-readable
```

Estimates LLM tokens (~chars/4 for text, tile-based for images). ~65ms.
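
The text heuristic (about one token per four characters) is easy to reproduce when budgeting context by hand. This is a rough sketch: the helper names are hypothetical, rounding up is an assumption, and the tile-based image estimate is not covered because the source does not specify its formula.

```python
def estimate_text_tokens(text: str) -> int:
    """Rough estimate: one token per four characters, rounded up (assumption)."""
    return -(-len(text) // 4)   # ceiling division without importing math


def fits_budget(texts: list[str], budget: int) -> bool:
    """True if the summed estimates stay within `budget` tokens."""
    return sum(estimate_text_tokens(t) for t in texts) <= budget


docs = ["a" * 400, "b" * 1000]              # 400 and 1000 characters
total = sum(estimate_text_tokens(d) for d in docs)
print(total, fits_budget(docs, 500))
```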

grep — content search (text + semantic)

Text search (regex matching)

```bash
mm grep "pattern" <dir>                                  # search all files
mm grep "attention" <dir> --kind document                # search only documents
mm grep "TODO" <dir> --kind code                         # search code files
mm grep "invoice" <dir> --kind document --format json    # JSON output
mm grep "error" <dir> -C 2                               # 2 context lines
mm grep "invoice" <dir> --count                          # match counts per file
mm grep "Quantum Phase" <dir> -i                         # case-insensitive search
mm grep "TODO" <dir> --ignore-case --kind code           # case-insensitive in code
mm grep "secret" <dir> --no-ignore                       # search gitignored files too
```

Semantic search (vector similarity via embeddings)

```bash
mm grep "financial projections" <dir> -s                  # semantic search across all files
mm grep "architecture overview" <dir> -s --format json    # JSON with distances
mm grep "revenue forecast" <dir> -s --index               # auto-index unindexed files before search
```

**Warning**: grep runs extraction on every matching file. On large document directories (500+ PDFs), this can take minutes. Prefer `--kind code` or `--kind text` for fast text searches.

bench — benchmark suite

```bash
mm bench <dir>                          # full benchmark
mm bench <dir> --rounds 5               # more measurement rounds
mm bench <dir> --mode accurate          # include accurate-mode benchmarks
mm bench <dir> --format json            # JSON output for archival
```

config — extraction mode settings

```bash
mm config show                                  # show current config
mm config init                                  # create config with default profile
mm config init --force                          # overwrite existing config
mm config set mode.fast.whisper_model tiny      # set a config value
mm config set mode.accurate.beam_size 5         # set a config value
mm config reset-db                              # delete all databases and caches
mm config reset-profiles                        # restore profiles to defaults
mm config reset                                 # reset everything (db + profiles)
```

profile — LLM provider management

Provider settings are managed through profiles stored in `~/.config/mm/mm.toml`.

```bash
mm profile list                                            # list all profiles (● = active)
mm profile add openrouter --base-url https://openrouter.ai/api/v1 --model vlm-1
mm profile update openrouter --model gemma4:e2b
mm profile use openrouter                                  # switch active profile
mm profile remove openrouter
```

Per-command profile selection:

```bash
mm --profile openrouter cat photo.png -m accurate    # one-off override
MM_PROFILE=openrouter mm cat photo.png -m accurate   # env override
```

Output formats

- TTY: Rich formatted tables/panels (human-friendly).
- Piped / non-TTY: plain TSV/text or one-path-per-line (machine-readable, no ANSI codes).
- `--format json`: JSON output. Always use this when parsing results programmatically.
- `--format tsv`: Tab-separated values. Maximum token efficiency.
- `--format csv`: Comma-separated values.
- `--format dataset-jsonl`: JSONL for dataset export.
- `--format dataset-hf`: HuggingFace Datasets format.

Pipe composability

```bash
mm find <dir> --kind image | mm cat                        # find images, extract metadata
mm find <dir> --kind document --min-size 10mb | wc -l      # count large PDFs
mm find <dir> --kind video --format json | jq '.[].name'   # extract video names
```

Tips

- All metadata commands (`find`, `wc`) run in ~60ms via the Rust fast path.
- Start with `find --tree --depth 1`, then `wc --by-kind`, for the fastest directory overview.
- Use `--format json` when you need to parse output programmatically.
- `find` returns only paths when piped; otherwise it returns full metadata rows.
- For PDFs, `cat` extracts text in fast mode; if the output is empty, the PDF contains only scanned images.
- For videos, `mm cat video.mp4 -m accurate` auto-generates keyframe mosaics and sends them to the LLM.
- Use `--mode fast` for quick metadata/text extraction (default) and `--mode accurate` for LLM descriptions.
- Use `--no-cache` with `-m accurate` to force a fresh LLM call.
- Use `-p` to load custom pipeline YAMLs or named encoders; CLI overrides layer on top.
- Use `--encode.pyfunc` to inject custom Python transforms.
- Use `--list-pipelines` to see all available encoders and built-in pipelines.
