datasheet-intelligence
Convert mixed-format datasheets and hardware reference files (PDF, DOCX, HTML, Markdown, XLSX/CSV) into normalized Markdown knowledge files for AI coding agents. Use when a user asks to ingest datasheets, register maps, pinout/timing sheets, revision histories, or internal hardware notes before searching datasheet content or generating code. Produce RAG-ready section chunks, anchors, image references, and metadata under .context/knowledge.
Source: dhkimxx/ai-agent-skills
NPX Install
```bash
npx skill4agent add dhkimxx/ai-agent-skills datasheet-intelligence
```
SKILL.md Content
Datasheet Intelligence
Purpose
Ingest multi-format datasheets into a single Markdown-first knowledge base so agents can answer hardware questions using consistent structure.
Prerequisites
Check `uv` first:
```bash
uv --version
```
If `uv` is not installed, install it by OS:
```sh
# macOS (Homebrew)
brew install uv
# Linux (official installer)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (WinGet)
winget install --id=astral-sh.uv -e
```
After installation, restart the shell and verify `uv --version` again.
Workflow
- Place source datasheets in a folder (for example `docs/datasheets/`).
- Run `scripts/ingest_docs.py` with `uv run --with docling python3`.
- Read outputs from `.context/knowledge/`:
  - `<doc>/<doc>.md`: normalized markdown
  - `<doc>/<doc>.sections.jsonl`: section chunks for retrieval
  - `<doc>/<doc>.tables.md`: table-focused markdown
  - `<doc>/<doc>.meta.json`: conversion metadata and validation info
  - `<doc>/<doc>.docling.json`: raw Docling structured export
  - `<doc>/_images/*`: extracted images
  - `knowledge.index.json`: corpus manifest
- Use search/read helpers:
  - `scripts/search_docs.py` for corpus search
  - `scripts/read_docs.py` for focused section reads
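After an ingest run, it can help to confirm that the per-document files listed above actually exist. The following is a minimal sketch, not part of the skill's scripts: the helper name `missing_outputs` is hypothetical, and it assumes every listed file is produced for every document, which the output contract may not guarantee (for example, a document without tables).

```python
from pathlib import Path

# Per-document artifact suffixes taken from the workflow list above.
DOC_SUFFIXES = (".md", ".sections.jsonl", ".tables.md", ".meta.json", ".docling.json")


def missing_outputs(knowledge_dir: str, doc: str) -> list[str]:
    """Return expected per-document files that are absent (hypothetical helper)."""
    doc_dir = Path(knowledge_dir) / doc
    expected = [doc_dir / f"{doc}{suffix}" for suffix in DOC_SUFFIXES]
    # Report paths relative to the knowledge directory, using forward slashes.
    return [p.relative_to(knowledge_dir).as_posix() for p in expected if not p.is_file()]
```

Calling `missing_outputs(".context/knowledge", "exynos_spi_v1")` would return an empty list when all five files are present for that document.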
Commands
```bash
# Ingest all supported datasheets from a directory
uv run --with docling python3 scripts/ingest_docs.py docs/datasheets --output-dir .context/knowledge
# Ingest only top-level files and skip OCR for faster runs
uv run --with docling python3 scripts/ingest_docs.py docs/datasheets --non-recursive --no-ocr
# Search and focused read
uv run python3 scripts/search_docs.py "SPI0 address" --knowledge-dir .context/knowledge
uv run python3 scripts/read_docs.py exynos_spi_v1 --anchor section-4-2
```
Operational Rules
- Use command-first guidance (above) instead of low-level converter internals.
- Prefer `uv run` commands and do not assume a global `python` alias.
- Keep output location at `.context/knowledge` unless user asks otherwise.
- Before the first run, ensure generated artifacts are ignored by git (`.context/` or `.context/knowledge/` in `.gitignore`).
- Preserve image presence in markdown with image reference entries.
- Keep section anchors and normalize textual references like `See Table 4.2` into markdown links when anchor targets exist.
- Run on all provided files and summarize failures without stopping the whole batch unless explicitly requested.
- For detailed CLI options and execution presets, read `references/execution-options.md`.
- If `uv` is unavailable, follow the fallback instructions in `references/execution-options.md`.
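The reference-normalization rule above can be illustrated with a short sketch. This is not the pipeline's implementation: the function name `link_references` and the anchor scheme (`table-4-2`, modeled on the `section-4-2` anchor used in the commands) are assumptions made for illustration only.

```python
import re

# Matches textual references such as "Table 4.2" or "Section 4.2".
REF_PATTERN = re.compile(r"\b(Table|Section)\s+(\d+(?:\.\d+)*)")


def link_references(text: str, anchors: set[str]) -> str:
    """Link "Table 4.2"-style mentions only when a matching anchor exists."""

    def replace(match: re.Match) -> str:
        kind, number = match.group(1), match.group(2)
        # Assumed anchor scheme: lowercase kind plus the number with dots as dashes.
        anchor = f"{kind.lower()}-{number.replace('.', '-')}"
        if anchor not in anchors:
            return match.group(0)  # no target: leave the reference as plain text
        return f"[{match.group(0)}](#{anchor})"

    return REF_PATTERN.sub(replace, text)
```

With `anchors = {"table-4-2"}`, the text `See Table 4.2 and Section 9.1.` keeps `Section 9.1` as plain text and links only `Table 4.2`. The sketch does not guard against references that are already inside a markdown link.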
Resources
- `scripts/ingest_docs.py`: main ingestion pipeline
- `scripts/search_docs.py`: chunk-level search helper
- `scripts/read_docs.py`: focused reader helper
- `references/output-contract.md`: output schema and retrieval contract
- `references/execution-options.md`: detailed runtime flags and command presets