mm-cli-skill

mm CLI

`mm` is a high-performance multimodal context management CLI. It indexes directories instantly (~60ms for 700 files), then exposes Unix-style commands for exploring, querying, and extracting content from images, videos, PDFs, code, and other files.

Always use `--format json` for machine-readable output when parsing results programmatically.
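
As a minimal sketch of programmatic use, the Python below builds an `mm find` invocation with `--format json` and parses the result. The helper names (`build_find_command`, `parse_find_output`, `find_files`) are hypothetical. The assumption that the JSON output is an array of per-file objects is inferred from the `jq '.[].name'` example in this document, not from a documented schema.

```python
import json
import subprocess


def build_find_command(directory, kind=None, limit=None):
    """Build the argv list for `mm find` with JSON output."""
    cmd = ["mm", "find", directory, "--format", "json"]
    if kind is not None:
        cmd += ["--kind", kind]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    return cmd


def parse_find_output(text):
    """Parse JSON text; assumed (not documented) to be an array of file objects."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of file records")
    return rows


def find_files(directory, kind=None, limit=None):
    """Run `mm find` and return the parsed rows (requires mm on PATH)."""
    result = subprocess.run(
        build_find_command(directory, kind, limit),
        capture_output=True, text=True, check=True,
    )
    return parse_find_output(result.stdout)
```

Keeping command construction and parsing separate from the subprocess call makes both easy to check without mm installed.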

Installation

First run `mm --help` or `mm --version` to confirm mm isn't already installed.

```bash
pip install mm-ctx
```

Alternative: shell installer

macOS / Linux

Windows (PowerShell)

Commands

| Command   | Purpose                                                                       |
| --------- | ----------------------------------------------------------------------------- |
| `find`    | Locate/list files by name/kind/ext/size, tabular listing, tree view, schema   |
| `cat`     | Content extraction (auto-detected by file type × mode)                        |
| `grep`    | Content search: text and semantic (via embeddings)                            |
| `wc`      | Count files, bytes, lines, tokens                                             |
| `bench`   | Benchmark suite with statistical analysis                                     |
| `config`  | Extraction mode settings (show, init, set, reset-db, reset-profiles, reset)   |
| `profile` | Manage LLM provider profiles (list, add, update, use, remove)                 |

Workflow

  1. Start with `mm find <dir> --tree --depth 1` to see the directory structure.
  2. Use `mm wc <dir> --by-kind` to estimate token counts for LLM context budgeting.
  3. Explore with `find`, `grep`, and `cat` as needed.
  4. Use `mm cat <file> -m accurate` for LLM-powered descriptions.

find — locate files, tabular listing, tree view, schema

```bash
mm find <dir> --kind image                           # all images
mm find <dir> --kind video                           # all videos
mm find <dir> --kind document                        # all PDFs/docs
mm find <dir> --kind audio                           # audio files
mm find <dir> --name "test_.*\.py"                   # filter by name (regex)
mm find <dir> -n config                              # filter by name (substring)
mm find <dir> --ext .png,.webp                       # by extension
mm find <dir> --min-size 1mb --max-size 10mb         # by size range
mm find <dir> --kind image --limit 5 --format json   # JSON output, capped
mm find <dir> --sort size --reverse --limit 10       # largest files
```

~63ms via the Rust fast path. Piped output is one path per line; `--format json` returns full metadata.

Tabular listing (default)

```bash
mm find <dir>                                        # all files
mm find <dir> --columns name,kind,size --limit 10    # select columns
mm find <dir> --sort size --reverse --format json    # sorted JSON
```

Tree view

```bash
mm find <dir> --tree                                 # full tree with sizes
mm find <dir> --tree --depth 1                       # top-level dirs only
mm find <dir> --tree --kind image                    # only image files
mm find <dir> --tree --format json                   # JSON tree structure
```

Schema inspection

```bash
mm find <dir> --schema                               # Rich table with column docs
mm find <dir> --schema --format json                 # machine-readable
```

Include gitignored files

```bash
mm find <dir> --no-ignore                            # bypass .gitignore rules
mm find <dir> --no-ignore --kind video               # gitignored videos
mm find <dir> --no-ignore --tree                     # tree including ignored dirs
```

Columns in the `files` table:

| Column    | Type      | Description                                                                      |
| --------- | --------- | -------------------------------------------------------------------------------- |
| path      | string    | Relative path from scan root                                                     |
| name      | string    | File name with extension                                                         |
| stem      | string    | File name without extension                                                      |
| ext       | string    | Extension including dot (`.png`, `.pdf`)                                         |
| size      | uint64    | File size in bytes                                                               |
| modified  | timestamp | Last modification time                                                           |
| created   | timestamp | Creation time                                                                    |
| mime      | string    | MIME type (`image/png`, `application/pdf`)                                       |
| kind      | string    | `image`, `video`, `document`, `code`, `audio`, `data`, `config`, `text`, `other` |
| is_binary | bool      | Whether file is binary                                                           |
| depth     | uint16    | Directory depth (0 = top-level)                                                  |
| parent    | string    | Parent directory path                                                            |
| width     | uint32    | Pixel width (images from header, videos via native parsing). Null for non-media. |
| height    | uint32    | Pixel height (images from header, videos via native parsing). Null for non-media.|
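
To illustrate how these columns might be used once parsed, here is a small Python sketch that aggregates rows by `kind` and filters images by `width`. It assumes each JSON row is an object keyed by the column names in the table above. The sample rows are hand-written for illustration and are not real mm output.

```python
from collections import defaultdict


def size_by_kind(rows):
    """Sum the `size` column (bytes) per `kind` value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["kind"]] += row["size"]
    return dict(totals)


def wide_images(rows, min_width):
    """Return image rows at least `min_width` pixels wide, widest first.

    `width` is null (None) for non-media files, so those rows are skipped.
    """
    images = [
        r for r in rows
        if r["kind"] == "image"
        and r.get("width") is not None
        and r["width"] >= min_width
    ]
    return sorted(images, key=lambda r: r["width"], reverse=True)


# Hand-written sample rows, not real mm output.
sample = [
    {"name": "a.png", "kind": "image", "size": 2048, "width": 1920},
    {"name": "b.png", "kind": "image", "size": 1024, "width": 640},
    {"name": "notes.md", "kind": "text", "size": 512, "width": None},
]
print(size_by_kind(sample))
print([r["name"] for r in wide_images(sample, 1000)])
```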

cat — content extraction (pipeline-driven)

Behavior is auto-detected from file type. The default `--mode fast` runs local extraction (no LLM). Use `-m accurate` for LLM-powered descriptions.

Fast mode (default) — local extraction, no LLM

```bash
mm cat <file>                        # text/metadata extraction
mm cat photo.png                     # image metadata (dims, MIME, hash, EXIF)
mm cat video.mp4                     # video metadata (resolution, duration, codecs)
mm cat paper.pdf                     # text extraction via pypdfium2
```

Accurate mode — LLM-powered descriptions

```bash
mm cat photo.png -m accurate         # VLM caption
mm cat video.mp4 -m accurate         # mosaic → VLM description
mm cat audio.mp3 -m accurate         # transcript → LLM summary
mm cat paper.pdf -m accurate         # text → LLM summary
```

Head / tail

```bash
mm cat <file> -n 20                  # first 20 lines
mm cat <file> -n -10                 # last 10 lines
```
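
The sign convention for `-n` can be sketched in a few lines of Python. This is only an illustration of the documented behavior (positive for the first lines, negative for the last), not mm's implementation. The handling of `n == 0` is an assumption, since the source does not describe it.

```python
def head_or_tail(lines, n):
    """Return the first n lines if n > 0, or the last |n| lines if n < 0."""
    if n > 0:
        return lines[:n]
    if n < 0:
        return lines[n:]
    return []  # n == 0 is not documented; returning nothing is an assumption


lines = [f"line {i}" for i in range(1, 31)]
print(head_or_tail(lines, 20)[-1])
print(head_or_tail(lines, -10)[0])
```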

Cache control

```bash
mm cat <file> --no-cache             # bypass cache, force fresh run
```

Output formats

```bash
mm cat <file> --format json          # JSON output
```

Fast mode behavior by file type (<100ms target):

- **PDF** (.pdf): text extraction via pypdfium2. Scanned/image-only PDFs return empty.
- **Document** (.docx, .pptx): text extraction.
- **Image** (.png/.jpg/.webp/.gif/.bmp/.tiff/.svg): dimensions, MIME, xxh3 hash, EXIF data.
- **Video** (.mp4/.mkv/.webm/.avi/.mov): resolution, duration, FPS, codecs (metadata only, no ffmpeg).
- **Audio** (.mp3/.wav/.flac/.aac/.ogg/.m4a): duration, codec, bitrate (metadata only).

cat -p — named encoders and pipeline YAMLs

The `-p` / `--pipeline` flag accepts either a registered encoder name or a YAML file path.

Named encoder (encodes media into VLM-ready JSON messages)

```bash
mm cat photo.png -p image-resize             # Fit to 1024px, base64 encode
mm cat photo.png -p image-tile               # Resized overview + all tiles in one Message
mm cat video.mp4 -p video-frame-sample       # Extract frames at 1fps
mm cat video.mp4 -p video-chunk              # Chunk into 60s segments
mm cat doc.pdf -p document-rasterize         # Render pages as images
mm cat doc.pdf -p document-rasterize-text    # Rasterize + extract text
```

YAML pipeline file

```bash
mm cat photo.png -p custom-pipeline.yaml
```

Multiple pipelines (dispatched by kind field in YAML)

```bash
mm cat *.jpg *.mp4 -p image.yaml -p video.yaml
```

List available encoders and pipelines

```bash
mm cat --list-pipelines
```

Built-in encoders

Use either the bare name or the kind-prefixed display name.

| Name                      | Media    | Description                                            |
| ------------------------- | -------- | ------------------------------------------------------ |
| `image-resize`            | image    | Default. Fit to 1024px bounding box                    |
| `image-tile`              | image    | Resized overview + tile crops in one Message           |
| `video-frame-sample`      | video    | Extract frames at a given fps (requires ffmpeg)        |
| `video-frames-transcript` | video    | Frames + Whisper transcript (accurate mode default)    |
| `video-chunk`             | video    | Chunk into time-based segments with overlap            |
| `video-mosaic`            | video    | Build mosaic grids from sampled frames                 |
| `video-shot-frames`       | video    | Scene detection → representative frames per shot       |
| `video-shot-mosaic`       | video    | Scene detection → mosaic grid per shot                 |
| `video-gemini`            | video    | Pass video file as a Gemini Part                       |
| `video-gemini-chunked`    | video    | Chunk video into Gemini Parts                          |
| `audio-transcribe`        | audio    | Transcribe audio via Whisper (fast/accurate default)   |
| `audio-gemini`            | audio    | Pass audio file as a Gemini Part                       |
| `document-page-text`      | document | Extract text per page from PDF/DOCX/PPTX               |
| `document-rasterize`      | document | Render PDF pages as images (requires pypdfium2)        |
| `document-rasterize-text` | document | Rasterize + extract text, interleaved                  |
| `document-gemini`         | document | Pass document file as a Gemini Part                    |

Writing custom encoders

Create a `.py` file in `python/mm/encoders/image/`, `python/mm/encoders/video/` (auto-discovered) or `~/.config/mm/encoders/`. The `name` is optional: it defaults to the function name with underscores replaced by hyphens:

```python
import base64
import io
from pathlib import Path

from PIL import Image

from mm.encoders import register_encoder


@register_encoder(media_types=("image",))
def my_custom(path: Path, **kw):
    """Registered as 'my-custom' (auto-named from function)."""
    # Convert to RGB first: JPEG cannot store alpha or palette modes.
    img = Image.open(path).convert("RGB")
    img.thumbnail((1024, 1024))  # fit within 1024x1024, keeping aspect ratio
    buf = io.BytesIO()
    img.save(buf, "JPEG", quality=90)
    b64 = base64.b64encode(buf.getvalue()).decode()
    # Yield one message carrying the image as a base64 data URL.
    yield {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}}
    ]}
```
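
The default-naming rule can be illustrated with a toy decorator. This is a sketch only: `REGISTRY` and `register` are hypothetical stand-ins written for this example and say nothing about how `mm.encoders.register_encoder` is actually implemented.

```python
REGISTRY = {}  # hypothetical stand-in, not mm's actual registry


def register(name=None):
    """Toy decorator: store a function under `name`, or a derived default."""
    def decorator(func):
        # Default name: function name with underscores replaced by hyphens.
        key = name or func.__name__.replace("_", "-")
        REGISTRY[key] = func
        return func
    return decorator


@register()
def my_custom(path, **kw):
    return path


@register(name="thumb")
def make_thumbnail(path, **kw):
    return path


print(sorted(REGISTRY))
```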

Python API

```python
from mm import process_image, process_image_tiled, process_video, process_document
from pathlib import Path

msg = process_image(Path("photo.png"), max_width=1024)       # Single Message dict
tiles = list(process_image_tiled(Path("scan.png"), tile_size=1024))  # Multiple Messages
chunks = list(process_video(Path("video.mp4")))               # Multiple Messages
pages = list(process_document(Path("doc.pdf")))               # Multiple Messages
```

Via Context

```python
from mm import Context

ctx = Context("~/data")
messages = ctx.encode("photo.png", strategy="resize")
```

cat — pipeline overrides

Pipelines are 2-stage YAMLs: encode (convert to LLM-ready parts) → generate (LLM call). Key parameters can be overridden from the CLI.

Encode overrides (--encode.*)

```bash
mm cat photo.png -m accurate --encode.strategy image-tile      # override encoder
mm cat photo.png -m accurate --encode.pyfunc ~/my_filter.py    # custom transform
```

Generate overrides (--generate.*)

```bash
mm cat photo.png -m accurate --generate.max-tokens 1024          # increase token limit
mm cat photo.png -m accurate --generate.temperature 0.5           # higher temperature
mm cat photo.png -m accurate --generate.prompt "List 3 main objects in this image."
mm cat photo.png -m accurate --generate.json-mode true            # request JSON response
```

Combining overrides

```bash
mm cat photo.png -m accurate \
  --encode.strategy image-tile \
  --generate.max-tokens 512 \
  --generate.prompt "Analyze this architecture diagram."
```

cat -p — explicit pipeline YAML

Load custom pipeline configurations from YAML files. The YAML's `kind` field dispatches the pipeline to the correct media type. The `generate` field is optional: omit it for encode-only pipelines.

Single pipeline

```yaml
# custom-image.yaml
kind: image
mode: accurate
encode:
  strategy: resize
  max_width: 512
generate:
  prompt: "What is in this image? One sentence only."
  max_tokens: 64
```

```bash
mm cat photo.png -p custom-image.yaml
```

Encode-only pipeline (no LLM)

```yaml
# encode-only.yaml
kind: document
mode: fast
encode:
  strategy: null
# generate omitted = encode-only, no LLM call
```

Multi-document YAML

```yaml
kind: image
mode: accurate
encode:
  strategy: image-tile
  max_width: 2048
generate:
  prompt: "Describe this image in detail."
  max_tokens: 512
---
kind: video
mode: accurate
encode:
  mosaic_tile: "8x6"
  mosaic_count: 2
  frame_selection: scene
generate:
  prompt: "Summarize this video."
  max_tokens: 1024
```

```bash
mm cat *.jpg *.mp4 -p multi-pipeline.yaml
```
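
Dispatch by `kind` can be pictured as a lookup from each file's kind to the matching pipeline document. The Python below is a rough sketch under stated assumptions: `KIND_BY_EXT` and `pick_pipeline` are hypothetical, the extension table covers only a few extensions mentioned in this document, and mm's real detection may work differently (for example via MIME type).

```python
from pathlib import Path

# Small illustrative table; mm's real kind detection may differ.
KIND_BY_EXT = {
    ".png": "image", ".jpg": "image", ".webp": "image",
    ".mp4": "video", ".mkv": "video", ".mov": "video",
    ".pdf": "document", ".docx": "document",
}


def pick_pipeline(path, pipelines):
    """Return the first pipeline whose `kind` matches the file, else None."""
    kind = KIND_BY_EXT.get(Path(path).suffix.lower())
    if kind is None:
        return None
    for pipeline in pipelines:
        if pipeline.get("kind") == kind:
            return pipeline
    return None


# The two documents of a multi-document YAML, written here as plain dicts.
pipelines = [
    {"kind": "image", "generate": {"prompt": "Describe this image in detail."}},
    {"kind": "video", "generate": {"prompt": "Summarize this video."}},
]
print(pick_pipeline("clip.mp4", pipelines)["generate"]["prompt"])
```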

CLI overrides layer on top of -p

```bash
mm cat photo.png -p my-pipeline.yaml --generate.max-tokens 128
```
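
One way to picture the layering is as a dotted-key update applied on top of the loaded pipeline. The sketch below is an illustration, not mm's code: `apply_override` is a hypothetical helper, and the mapping from the CLI flag `--generate.max-tokens` to the YAML key `generate.max_tokens` is an assumption based on the examples in this document.

```python
import copy


def apply_override(config, dotted_key, value):
    """Return a copy of `config` with `dotted_key` set to `value`."""
    result = copy.deepcopy(config)  # leave the loaded pipeline untouched
    node = result
    *parents, leaf = dotted_key.split(".")
    for part in parents:
        node = node.setdefault(part, {})  # create missing sections on the way
    node[leaf] = value
    return result


# Pipeline as it might look after loading the YAML (hand-written here).
pipeline = {
    "kind": "image",
    "encode": {"strategy": "resize", "max_width": 512},
    "generate": {"prompt": "Analyze this image.", "max_tokens": 64},
}
merged = apply_override(pipeline, "generate.max_tokens", 128)
print(merged["generate"])
```

Only the overridden key changes; every other value from the YAML is kept.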

TOML pipeline path overrides

Override default pipeline paths in `~/.config/mm/mm.toml`:

```toml
[pipelines]
image.fast = "~/.config/mm/pipelines/image/fast.yaml"
video.accurate = "/path/to/my-video-accurate.yaml"
```

cat — custom Python transforms (pyfunc)

The `--encode.pyfunc` flag runs a custom Python transform on the encoded content parts before they are sent to the LLM.

File-based pyfunc

```python
# my_transform.py
def transform(parts, context):
    # Append an extra text part after the encoded parts.
    extra = {"type": "text", "text": "Focus on the data flow."}
    return parts + [extra]
```

```bash
mm cat photo.png -m accurate --encode.pyfunc ~/my_transform.py
```
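
Because a transform is a plain function, it can be dry-run on hand-written parts before it is wired into a pipeline. The sketch below assumes parts are dicts with a `type` key, as in the examples in this document. The shape of `context` is not documented here, so passing an empty dict is an assumption.

```python
def transform(parts, context):
    """Keep image parts and append one text instruction."""
    images = [p for p in parts if p.get("type") == "image_url"]
    return images + [{"type": "text", "text": "Focus on the data flow."}]


# Hand-written parts for a dry run, not real encoder output.
sample_parts = [
    {"type": "text", "text": "file: photo.png"},
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,AAAA"}},
]
result = transform(sample_parts, {})
print([p["type"] for p in result])
```

Returning a new list, as here, leaves the input parts unmodified.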

Pyfunc in pipeline YAML

```yaml
kind: image
mode: accurate
encode:
  strategy: resize
  max_width: 512
  pyfunc: ~/my_transforms/filter.py
generate:
  prompt: "Analyze this image."
  max_tokens: 128
```

Inline `def` syntax in YAML:

```yaml
encode:
  pyfunc: |
    def transform(parts, context):
        return [p for p in parts if p.get("type") == "image_url"]
```

wc — count files, bytes, lines, tokens

```bash
mm wc <dir>                      # summary totals
mm wc <dir> --by-kind            # breakdown by file kind
mm wc <dir> --kind document      # only documents
mm wc <dir> --format json        # machine-readable
```

Estimates LLM tokens (~chars/4 for text, tile-based for images). Runs in ~65ms.
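
The text heuristic is easy to reproduce for a quick budget check. The sketch below applies the ~chars/4 rule with integer division, which is an assumption about rounding. It does not cover the tile-based image estimate, whose details are not given here. Both function names are hypothetical.

```python
def estimate_text_tokens(text):
    """Rough token estimate: about one token per four characters."""
    return len(text) // 4


def fits_budget(texts, budget):
    """Check whether the summed estimates stay within a token budget."""
    return sum(estimate_text_tokens(t) for t in texts) <= budget


docs = ["a" * 400, "b" * 800]
print(estimate_text_tokens(docs[0]))
print(fits_budget(docs, 300))
```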

grep — content search (text + semantic)

Text search (regex matching)

```bash
mm grep "pattern" <dir>                                  # search all files
mm grep "attention" <dir> --kind document                # search only documents
mm grep "TODO" <dir> --kind code                         # search code files
mm grep "invoice" <dir> --kind document --format json    # JSON output
mm grep "error" <dir> -C 2                               # 2 context lines
mm grep "invoice" <dir> --count                          # match counts per file
mm grep "Quantum Phase" <dir> -i                         # case-insensitive search
mm grep "TODO" <dir> --ignore-case --kind code           # case-insensitive in code
mm grep "secret" <dir> --no-ignore                       # search gitignored files too
```

Semantic search (vector similarity via embeddings)

```bash
mm grep "financial projections" <dir> -s                 # semantic search across all files
mm grep "architecture overview" <dir> -s --format json   # JSON with distances
mm grep "revenue forecast" <dir> -s --index              # auto-index unindexed files before search
```

**Warning**: grep runs extraction on every matching file. On large document directories (500+ PDFs), this can take minutes. Prefer `--kind code` or `--kind text` for fast text searches.

bench — benchmark suite

```bash
mm bench <dir>                          # full benchmark
mm bench <dir> --rounds 5               # more measurement rounds
mm bench <dir> --mode accurate          # include accurate-mode benchmarks
mm bench <dir> --format json            # JSON output for archival
```

config — extraction mode settings

```bash
mm config show                                  # show current config
mm config init                                  # create config with default profile
mm config init --force                          # overwrite existing config
mm config set mode.fast.whisper_model tiny      # set a config value
mm config set mode.accurate.beam_size 5         # set a config value
mm config reset-db                              # delete all databases and caches
mm config reset-profiles                        # restore profiles to defaults
mm config reset                                 # reset everything (db + profiles)
```

profile — LLM provider management

Provider settings are managed through profiles stored in `~/.config/mm/mm.toml`.

```bash
mm profile list                                            # list all profiles (● = active)
mm profile add openrouter --base-url https://openrouter.ai/api/v1 --model vlm-1
mm profile update openrouter --model gemma4:e2b
mm profile use openrouter                                  # switch active profile
mm profile remove openrouter
```

Per-command profile selection:

```bash
mm --profile openrouter cat photo.png -m accurate    # one-off override
MM_PROFILE=openrouter mm cat photo.png -m accurate   # env override
```

Output formats

  • TTY: Rich formatted tables/panels (human-friendly).
  • Piped / non-TTY: plain TSV/text or one-path-per-line (machine-readable, no ANSI codes).
  • `--format json`: JSON output. Always use this when parsing results programmatically.
  • `--format tsv`: Tab-separated values. Maximum token efficiency.
  • `--format csv`: Comma-separated values.
  • `--format dataset-jsonl`: JSONL for dataset export.
  • `--format dataset-hf`: HuggingFace Datasets format.

Pipe composability

```bash
mm find <dir> --kind image | mm cat                        # find images, extract metadata
mm find <dir> --kind document --min-size 10mb | wc -l      # count large documents
mm find <dir> --kind video --format json | jq '.[].name'   # extract video names
```
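
Since piped `find` output is one path per line, it is also easy to consume from a short script. The following Python sketch counts paths by extension from standard input. The script name in the usage line is hypothetical, and nothing here depends on mm beyond the one-path-per-line format described above.

```python
import sys
from collections import Counter
from pathlib import PurePosixPath


def count_by_extension(lines):
    """Count non-empty path lines by lowercased file extension."""
    counts = Counter()
    for line in lines:
        path = line.strip()
        if path:
            counts[PurePosixPath(path).suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Reads paths from stdin and prints "extension<TAB>count" rows.
    for ext, n in count_by_extension(sys.stdin).most_common():
        print(f"{ext}\t{n}")
```

Usage, assuming the script is saved as `count_ext.py`: `mm find <dir> --kind image | python count_ext.py`.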

Tips

  • All metadata commands (`find`, `wc`) run in ~60ms via the Rust fast path.
  • Start with `find --tree --depth 1`, then `wc --by-kind`, for the fastest directory overview.
  • Use `--format json` when you need to parse output programmatically.
  • `find` returns paths only when piped; otherwise it returns full metadata rows.
  • For PDFs, `cat` extracts text in fast mode; if the result is empty, the PDF contains scanned images only.
  • For videos, `mm cat video.mp4 -m accurate` auto-generates keyframe mosaics and sends them to the LLM.
  • Use `--mode fast` for quick metadata/text extraction (default), `--mode accurate` for LLM descriptions.
  • Use `--no-cache` with `-m accurate` to force a fresh LLM call.
  • Use `-p` to load custom pipeline YAMLs or named encoders; CLI overrides layer on top.
  • Use `--encode.pyfunc` to inject custom Python transforms.
  • Use `--list-pipelines` to see all available encoders and built-in pipelines.