Convert a PDF (research paper, technical report, or project document) into a beautiful single-page academic/project website with a structured outline JSON. Trigger this skill when the user wants to make a paper page, project homepage, or academic website from a PDF — in Chinese or English.
npx skill4agent add zai-org/glm-skills glmv-pdf-to-web

Key files: outline.json, generate_web.py (in {SKILL_DIR}/scripts/)

Dependencies: pip install pymupdf pillow, plus curl

Output directory: {WORKSPACE}/web/<pdf_stem>_<timestamp>/

web/
└── <pdf_stem>_<timestamp>/
    ├── outline.json ← structured web plan (WebPlan schema)
    ├── crops/ ← locally-saved cropped images
    │   ├── fig_arch_crop.png
    │   ├── table_results_crop.png
    │   └── ...
    └── index.html ← the website

<pdf_stem> is the PDF file name without its extension, and <timestamp> has the form YYYYMMDD_HHMMSS. Crops are saved as crops/<name>_crop.png. The input PDF is passed as $ARGUMENTS.

import os, datetime
pdf_stem = os.path.splitext(os.path.basename(pdf_path))[0]
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
out_dir = os.path.join(workspace, "web", f"{pdf_stem}_{timestamp}")

Create the output directory:

mkdir -p "<out_dir>/crops"

If $ARGUMENTS is a URL, download the PDF first:

pdf_stem=$(basename "$ARGUMENTS" .pdf)
curl -L -o "/tmp/${pdf_stem}.pdf" "$ARGUMENTS"

Render the pages to images:

python {SKILL_DIR}/scripts/pdf_to_images.py "<pdf_path>" --dpi 120

The output is a JSON list that gives the page → path mapping:

[{"page": 1, "path": "/abs/path/page_001.png"}, ...]

| Section | Purpose |
|---|---|
| hero | Title, authors, venue badge, link buttons |
| abstract | Full abstract text |
| contributions | 3–5 key contribution cards |
| method | Architecture figure + method explanation |
| results | Quantitative table + qualitative figures |
| conclusion | Brief conclusion |
| citation | BibTeX block |
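
The section list can be turned into an outline skeleton before any images are chosen. The sketch below is an illustration, not the skill's own code. It assumes the seven section ids named in this document (hero, abstract, contributions, method, results, conclusion, citation) and the outline.json field names from the example that follows. The helper name `build_outline_skeleton` is hypothetical, and using each section's purpose as placeholder `content` is an assumption.

```python
import json

# Section ids and purposes as listed in this document.
SECTIONS = [
    ("hero", "Title, authors, venue badge, link buttons"),
    ("abstract", "Full abstract text"),
    ("contributions", "3–5 key contribution cards"),
    ("method", "Architecture figure + method explanation"),
    ("results", "Quantitative table + qualitative figures"),
    ("conclusion", "Brief conclusion"),
    ("citation", "BibTeX block"),
]


def build_outline_skeleton(project_title, authors, lang="English"):
    """Hypothetical helper: build an outline dict with empty image lists."""
    # Assumption: lang is restricted to the two values the document mentions.
    if lang not in ("Chinese", "English"):
        raise ValueError("lang must be 'Chinese' or 'English'")
    sections_plan = []
    for index, (section_id, purpose) in enumerate(SECTIONS, start=1):
        sections_plan.append({
            "section_index": index,
            "section_id": section_id,
            "title": section_id.capitalize(),
            "content": purpose,          # placeholder text, to be replaced
            "required_images": [],       # filled in once figures are chosen
        })
    return {
        "project_title": project_title,
        "lang": lang,
        "authors": list(authors),
        "sections_plan": sections_plan,
    }


if __name__ == "__main__":
    skeleton = build_outline_skeleton("Paper Title", ["Author One", "Author Two"])
    print(json.dumps(skeleton, indent=2, ensure_ascii=False))
```

Each `required_images` entry would then be filled with the `url`, `visual_description`, and `usage_reason` fields used in the outline example.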
<out_dir>/outline.json

{
  "project_title": "Paper Title",
  "lang": "English",
  "authors": ["Author One", "Author Two"],
  "sections_plan": [
    {
      "section_index": 1,
      "section_id": "hero",
      "title": "Hero",
      "content": "Title, authors, venue, teaser figure description",
      "required_images": [
        {
          "url": "<local_page_path_from_phase1>",
          "visual_description": "Figure 1: teaser showing input-output examples",
          "usage_reason": "Hero section visual to immediately show the paper's output"
        }
      ]
    }
  ]
}

lang: "Chinese" or "English"

required_images[]: url is the path of the rendered page image

Save outline.json to <out_dir>/outline.json.

{SKILL_DIR}/scripts/crop.py crops the images listed in outline.json.

Agent tool call:
description: "Grounding crop page N"
prompt: |
You are a visual grounding and cropping assistant. Your task is to precisely
locate specified visual elements in a page image and crop them out.
## Grounding method
Use visual grounding to locate each target:
1. Read the source image using the Read tool to view it
2. Identify the target element described below
3. Determine its bounding box as normalized coordinates in the 0–999 range:
- 0 = left/top edge of the image
- 999 = right/bottom edge of the image
- These are thousandths, NOT pixels, NOT percentages (0–100)
- Format: [x1, y1, x2, y2] where (x1,y1) is top-left, (x2,y2) is bottom-right
- Example: [0, 0, 500, 500] = top-left quarter of the image
4. Be precise: tightly bound the target element with a small margin (~10–20 units)
around it. Do NOT crop too wide or too narrow.
## Source image
<page_image_path>
## Crops needed
For each crop below, first do grounding (locate the element), then crop:
1. Name: "<descriptive_name>"
Target: "<visual_description from outline.json>"
Context: "<usage_reason from outline.json>"
## Crop command
After determining the bounding box [X1, Y1, X2, Y2] for each target, run:
```bash
python <SKILL_DIR>/scripts/crop.py \
--path "<page_image_path>" \
--box X1 Y1 X2 Y2 \
--name "<crop_name>" \
--out-dir "<out_dir>/crops"
```
## Verification
After each crop, READ the output image to visually verify the correct region
was captured. If the crop missed the target or is too wide/narrow, adjust the
coordinates and re-run crop.py.
## Output
Report the final results as a list:
- crop_name: <name>, file: <output_filename>, box: [X1, Y1, X2, Y2]

Replace <page_image_path>, <SKILL_DIR>, and <out_dir> in the prompt with actual values.

Each crop yields {"path": "/abs/path/<name>_crop.png"}

Collect the results as section_id → [crop filename, ...]

python3 -c "
from PIL import Image; import os, json
d = '<out_dir>/crops'
sizes = {}
# Record the pixel size and aspect ratio (width / height) of every crop
for f in sorted(os.listdir(d)):
    if f.endswith('.png'):
        w, h = Image.open(os.path.join(d, f)).size
        sizes[f] = {'width': w, 'height': h, 'aspect': round(w/h, 2)}
print(json.dumps(sizes, indent=2))
"

| Aspect ratio | Layout recommendation |
|---|---|
| < 0.7 (tall/narrow) | |
| 0.7 – 1.3 (square-ish) | |
| > 1.3 (wide) | Full-width, |
| > 2.0 (very wide, e.g. tables) | Full-width with horizontal scroll fallback |
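
Two pieces of arithmetic in the crop workflow are easy to get wrong: the 0–999 box convention and the aspect-ratio buckets in the table above. The sketch below illustrates both. It is not the code in crop.py: the linear mapping `value * size / 999` is an assumption based on "0 = left/top edge, 999 = right/bottom edge", and the function names `box_to_pixels` and `aspect_bucket` are hypothetical.

```python
def box_to_pixels(box, width, height):
    """Map a normalized [x1, y1, x2, y2] box (0–999) to pixel coordinates.

    Assumption: 0 maps to the first edge and 999 to the opposite edge,
    with linear interpolation in between. The real crop.py may differ.
    """
    x1, y1, x2, y2 = box
    if not all(0 <= v <= 999 for v in box):
        raise ValueError("coordinates must be in the 0–999 range")
    if x1 >= x2 or y1 >= y2:
        raise ValueError("(x1, y1) must be top-left of (x2, y2)")
    return (
        round(x1 * width / 999),
        round(y1 * height / 999),
        round(x2 * width / 999),
        round(y2 * height / 999),
    )


def aspect_bucket(width, height):
    """Return the aspect-ratio bucket from the table for a crop size."""
    aspect = width / height
    # Check the narrowest condition first, since > 2.0 also satisfies > 1.3.
    if aspect > 2.0:
        return "very wide"
    if aspect > 1.3:
        return "wide"
    if aspect >= 0.7:
        return "square-ish"
    return "tall/narrow"


if __name__ == "__main__":
    # A full-page box on a 1000 x 2000 pixel page image.
    print(box_to_pixels([0, 0, 999, 999], 1000, 2000))
    # A 1500 x 1000 crop falls in the "wide" bucket.
    print(aspect_bucket(1500, 1000))
```

Checking the "very wide" bucket first matters because the table's last two rows overlap.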
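Write the page HTML to /tmp/website.html, with each <img src="..."> pointing to crops/<name>_crop.png.

python {SKILL_DIR}/scripts/generate_web.py \
--html-file /tmp/website.html \
--title "<paper title>" \
--out-dir "<out_dir>/"

Page requirements:
- Width: 900px
- hero: [📄 Paper] [💻 Code] [🗄️ Dataset] link buttons
- abstract
- contributions
- method: <figure> with <img> and <figcaption>
- results: <table>
- conclusion
- citation: <pre><code> block, copied with navigator.clipboard
- Every <img>: src crops/<name>_crop.png, loading="lazy", and alt
- Figures: <figure> with <figcaption>
- IntersectionObserver

Output: <pdf_stem>_<timestamp>/ containing outline.json, crops/ (crops/<name>_crop.png), and the page written by generate_web.py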
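
The image conventions above can be checked before the HTML is handed to generate_web.py. The following is a minimal sketch using only the Python standard library. It assumes the requirements are that every `<img>` uses a `crops/<name>_crop.png` source, `loading="lazy"`, and a non-empty `alt`. The class `ImgAudit` and function `audit_images` are hypothetical names and are not part of the skill's scripts.

```python
from html.parser import HTMLParser


class ImgAudit(HTMLParser):
    """Collect (src, problem) pairs for <img> tags that break the conventions."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        # Assumed convention: images are relative paths into crops/.
        if not (src.startswith("crops/") and src.endswith("_crop.png")):
            self.problems.append((src, "src is not crops/<name>_crop.png"))
        if attr.get("loading") != "lazy":
            self.problems.append((src, 'missing loading="lazy"'))
        if not attr.get("alt"):
            self.problems.append((src, "missing alt text"))


def audit_images(html_text):
    """Return a list of (src, problem) tuples; an empty list means no findings."""
    parser = ImgAudit()
    parser.feed(html_text)
    parser.close()
    return parser.problems


if __name__ == "__main__":
    sample = (
        '<figure><img src="crops/fig_arch_crop.png" loading="lazy" '
        'alt="Architecture overview"><figcaption>Architecture</figcaption></figure>'
    )
    print(audit_images(sample))
```

To check a generated page, read /tmp/website.html and pass its text to `audit_images`.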