# Research Deep — Batch Item Research

Execute deep research on every item in a research outline, producing structured JSON per item and a final markdown report. Use after running /research to generate an outline. Reads outline.yaml and fields.yaml, launches parallel research agents in batches, validates output, generates a consolidated report, and supports resume on interruption. Trigger when the user says "start deep research", "research these items", "run the deep phase", "fill in the fields for each item", or "generate the research report".

Install:

```
npx skill4agent add marco-machado/agent-skills research-deep
```
Read a research outline (`outline.yaml` + `fields.yaml`) produced by `/research`, then research each item in parallel batches, producing one structured JSON file per item and a final consolidated markdown report.

## Variables

| Variable | Source | Description |
|---|---|---|
| `{topic}` | `outline.yaml` | The research topic |
| `{outline_dir}` | Discovered | Directory containing `outline.yaml` |
| `{batch_size}` | `outline.yaml` | Number of agents launched in parallel per batch |
| `{items_per_agent}` | `outline.yaml` | Maximum number of items assigned to one agent |
| `{output_dir}` | `outline.yaml` | Directory where per-item JSON files are written |
| `{fields_path}` | Derived | Absolute path to `fields.yaml` |
| `{validate_script_path}` | Derived | Absolute path to `scripts/validate_json.py` |
| `{item_name}` | Per item | The item's name as listed in the outline |
| `{item_slug}` | Derived | Slugified item name: lowercase, spaces to underscores, strip non-alphanumeric except underscores, collapse consecutive underscores. E.g. "GitHub Copilot" becomes `github_copilot` |
| `{output_path}` | Derived | `{output_dir}/{item_slug}.json` |
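The slug derivation described above can be sketched in Python (a hypothetical `slugify` helper for illustration; the skill states the rules only in prose):

```python
import re

def slugify(name: str) -> str:
    """Slugify an item name: lowercase, spaces to underscores,
    strip non-alphanumeric except underscores, collapse repeats."""
    slug = name.lower().replace(" ", "_")
    slug = re.sub(r"[^a-z0-9_]", "", slug)  # strip everything but [a-z0-9_]
    slug = re.sub(r"_+", "_", slug)         # collapse consecutive underscores
    return slug

print(slugify("GitHub Copilot"))  # github_copilot
```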
## Step 1: Locate Outline

Search for `*/outline.yaml` in the current working directory.

- If exactly one is found: read it along with the sibling `fields.yaml`. Store the containing directory as `{outline_dir}`.
- If multiple are found: list them and ask the user which to use.
- If none found: tell the user to run `/research` first and stop.

Read both files. Extract the items list and execution config. Report to the user:

- Topic: `{topic}`
- Items count: N items
- Batch config: `{batch_size}` parallel agents, `{items_per_agent}` items each
- Output directory: `{output_dir}`
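The three discovery branches above could look like this (a hypothetical `locate_outline` sketch, not part of the skill's scripts; it follows the `*/outline.yaml` pattern literally):

```python
from pathlib import Path

def locate_outline(cwd: str = "."):
    """Find outline.yaml one level below cwd and resolve its siblings."""
    matches = sorted(Path(cwd).glob("*/outline.yaml"))
    if not matches:
        return None       # caller tells the user to run /research first
    if len(matches) > 1:
        return matches    # caller lists these and asks the user to choose
    outline = matches[0]
    return {
        "outline": outline,
        "fields": outline.with_name("fields.yaml"),  # sibling file
        "outline_dir": outline.parent,
    }
```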
## Step 2: Resume Check

Check `{output_dir}` for existing `.json` files.

For each existing JSON file:

- Parse the filename back to an item name (reverse the slug: `github_copilot.json` -> match against items list)
- Run the validation script to check completeness:

  ```bash
  python3 scripts/validate_json.py -f {fields_path} -j {output_path}
  ```

- If validation passes (exit code 0): mark the item as completed — skip it
- If validation fails (exit code 1): mark the item as incomplete — include it in the run

Report resume status to the user:

- "Found N/{total} completed items. Resuming with {remaining} items."
- List the completed items so the user can verify

If all items are already completed, report this and stop.
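The resume scan can be sketched as follows (hypothetical `resume_scan` helper; it only assumes the exit-code contract of the validation script described above):

```python
import subprocess
from pathlib import Path

def resume_scan(output_dir: str, fields_path: str, validate_script: str):
    """Classify existing per-item JSON files as completed or incomplete
    by re-running the validation script on each one."""
    completed, incomplete = [], []
    for json_file in sorted(Path(output_dir).glob("*.json")):
        result = subprocess.run(
            ["python3", validate_script, "-f", fields_path, "-j", str(json_file)],
            capture_output=True,
        )
        # exit code 0 -> completed (skip); exit code 1 -> re-research
        bucket = completed if result.returncode == 0 else incomplete
        bucket.append(json_file.stem)
    return completed, incomplete
```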
## Step 3: Batch Execution

Partition the remaining items into batches:

- Each agent handles up to `{items_per_agent}` items
- Launch up to `{batch_size}` agents in parallel per batch
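The partitioning can be sketched as (a hypothetical `make_batches` helper, shown only to pin down the chunking semantics):

```python
def make_batches(items, items_per_agent, batch_size):
    """Chunk items into per-agent assignments of up to items_per_agent,
    then group assignments into batches of up to batch_size agents."""
    assignments = [items[i:i + items_per_agent]
                   for i in range(0, len(items), items_per_agent)]
    return [assignments[i:i + batch_size]
            for i in range(0, len(assignments), batch_size)]

# 7 items, 2 per agent, 2 agents per batch -> 4 assignments in 2 batches
batches = make_batches(list("ABCDEFG"), items_per_agent=2, batch_size=2)
```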
Before launching each batch, show the user which items are in this batch and ask for approval:
- "Batch {N}/{total_batches}: items [list]. Launch?"
For each agent, build the prompt from the template below. Preserve the structure and goals; only substitute the `{variables}`.

Read `references/web-search-guide.md` for search methodology guidance to include in the agent context.

Sub-agent prompt template:
## Task
Research the following item(s) and output structured JSON.
Topic: {topic}
### Items to Research
{for each item assigned to this agent:}
- name: {item_name}
description: {item_description}
{end for}
## Field Definitions
Read the field definitions file to understand what data to collect for each item:
{fields_path}
Use all field categories and fields defined in that file. Each item gets its own JSON object with every field populated.
## Research Instructions
- Search for authoritative, current information on each item
- Use 2-3 search query variations per item
- Prefer official sources (project websites, documentation, release announcements)
- Cross-reference claims across multiple sources when possible
- Note publication dates — flag anything older than 12 months
## Output Format
For each item, write a JSON file to its output path:
{for each item:}
- {item_name} -> {output_path}
{end for}
Each JSON file must follow this structure:
```json
{
  "name": "{item_name}",
  "category_name": {
    "field_name": "value",
    "field_name": "value"
  },
  "another_category": {
    "field_name": "value"
  },
  "uncertain": ["field_name_1", "field_name_2"],
  "sources": [
    {"description": "Source description", "url": "https://..."}
  ]
}
```

Field value rules:
- Populate every field defined in fields.yaml
- If a value cannot be confidently determined, write your best estimate and append "[uncertain]" to the string value
- Add the field name to the top-level "uncertain" array
- All values must be in English
- Use the detail_level from fields.yaml to calibrate response length:
- brief: single value or short phrase
- moderate: 1-3 sentences
- detailed: full paragraph or structured breakdown
## Validation

After writing each JSON file, run:

```bash
python3 {validate_script_path} -f {fields_path} -j {output_path}
```

If validation fails, read the error output, fix the JSON, and re-run until it passes.
The task is complete only after all items pass validation.
**One-shot example** (single item, topic "AI Coding History"):

## Task
Research the following item(s) and output structured JSON.
Topic: AI Coding History
### Items to Research
- name: GitHub Copilot
  description: Developed by Microsoft/GitHub, first mainstream AI coding assistant
## Field Definitions
Read the field definitions file to understand what data to collect for each item:
/Users/you/ai-coding-history/fields.yaml
Use all field categories and fields defined in that file. Each item gets its own JSON object with every field populated.
## Research Instructions
- Search for authoritative, current information on each item
- Use 2-3 search query variations per item
- Prefer official sources (project websites, documentation, release announcements)
- Cross-reference claims across multiple sources when possible
- Note publication dates — flag anything older than 12 months
## Output Format
For each item, write a JSON file to its output path:
- GitHub Copilot -> /Users/you/ai-coding-history/results/github_copilot.json
Each JSON file must follow this structure:
```json
{
  "name": "GitHub Copilot",
  "basic_info": {
    "release_date": "2021-06-29",
    "company": "Microsoft / GitHub"
  },
  "technical_features": {
    "underlying_model": "OpenAI Codex (initially), GPT-4 (current)",
    "context_window": "Varies by tier; up to 128k tokens in Copilot Enterprise [uncertain]"
  },
  "uncertain": ["context_window"],
  "sources": [
    {"description": "GitHub Copilot official documentation", "url": "https://docs.github.com/copilot"}
  ]
}
```

Field value rules:
- Populate every field defined in fields.yaml
- If a value cannot be confidently determined, write your best estimate and append "[uncertain]" to the string value
- Add the field name to the top-level "uncertain" array
- All values must be in English
- Use the detail_level from fields.yaml to calibrate response length:
- brief: single value or short phrase
- moderate: 1-3 sentences
- detailed: full paragraph or structured breakdown
## Validation

After writing each JSON file, run:

```bash
python3 /Users/you/agent-skills/skills/research-deep/scripts/validate_json.py -f /Users/you/ai-coding-history/fields.yaml -j /Users/you/ai-coding-history/results/github_copilot.json
```

If validation fails, read the error output, fix the JSON, and re-run until it passes.
The task is complete only after all items pass validation.
## Step 4: Monitor and Continue
After launching a batch:
1. **Wait** for all agents in the batch to complete
2. **Collect results**: for each agent, check that its output JSON files exist and pass validation
3. **Handle failures**:
- If an agent fails entirely (no output): log the item names and add them to a retry list
- If validation fails after the agent finishes: log which fields are missing/invalid
- **Retry failed items once** in the next batch. If they fail again, mark them as failed and move on.
4. **Report batch progress**: "Batch {N} complete: {succeeded}/{total} items succeeded."
5. **Launch next batch** (with user approval)
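The retry-once bookkeeping in steps 2–3 above can be sketched as (a hypothetical `record_result` helper; the caller drains `retry_next` when building the following batch):

```python
def record_result(item: str, passed: bool, state: dict) -> None:
    """Track per-item outcomes across batches: retry once, then fail."""
    if passed:
        state["succeeded"].append(item)
    elif item in state["retried"]:
        state["failed"].append(item)      # second failure: mark failed, move on
    else:
        state["retried"].add(item)
        state["retry_next"].append(item)  # queue for one retry in the next batch

state = {"succeeded": [], "failed": [], "retried": set(), "retry_next": []}
record_result("github_copilot", False, state)  # first failure: queued for retry
record_result("github_copilot", False, state)  # second failure: marked failed
```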
## Step 5: Summary Report

After all batches complete, output:

### Research Complete

Topic: {topic}
Output directory: {output_dir}

### Results

- Completed: {count} / {total} items
- Failed: {count} items {list names if any}
- Items with uncertain fields: {count}

### Uncertain Fields Summary

{For each item with uncertain fields:}
- {item_name}: {list of uncertain field names}
{end for}

### Failed Items

{If any:}
- {item_name}: {reason for failure}
{end for}
## Step 6: Generate Report
After Step 5's summary (or after resume finds all items already completed), generate a markdown report.
**Ask the user**: "Which fields should appear as summary columns in the table of contents? (Pick from the available fields — e.g. release_date, company, github_stars)"
To help the user choose, scan the completed JSON files and list fields that have short values (single numbers, dates, short strings) — these work well as TOC columns.
Run the report generation script:
```bash
python3 scripts/generate_report.py \
  -f {fields_path} \
  -d {output_dir} \
  -o {outline_dir}/report.md \
  --toc-fields field1,field2,field3
```

If the script exits with an error, show the error output to the user and stop.
Otherwise, confirm: "Report written to `{outline_dir}/report.md`" and show the first ~30 lines as a preview.

## Rules
- NEVER modify `outline.yaml` or `fields.yaml` — they are read-only inputs
- NEVER skip the user approval step before each batch
- NEVER retry a failed item more than once
- Always run the validation script after writing each JSON — do not mark an item as complete until validation passes
- Write JSON files atomically: write to `{output_path}.tmp` first, then rename to `{output_path}` after validation passes
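The atomic-write rule can be sketched as (a hypothetical helper; validation of the `.tmp` file would run between the write and the rename):

```python
import json
import os

def write_item_atomic(output_path: str, data: dict) -> None:
    """Write JSON to a .tmp sibling first, then rename into place, so an
    interruption mid-write never leaves a truncated .json behind."""
    tmp_path = output_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(data, f, indent=2)
    # ... run the validation script against tmp_path here ...
    os.replace(tmp_path, output_path)  # atomic rename on POSIX and Windows
```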
## Gotchas

- Slug collisions: two items could slug to the same filename (e.g. "C++" and "C" could both become `c.json`). If detected, append a numeric suffix: `c.json`, `c_2.json`.
- Large item counts: if there are 50+ items, warn the user about total agent cost before starting.
- fields.yaml changes: if the user modifies fields.yaml between runs, previously completed items won't have the new fields. The validation script will catch this — those items will be re-researched on resume.
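The slug-collision handling can be sketched as (a hypothetical `dedupe_slug` helper, applied before deriving `{output_path}`):

```python
def dedupe_slug(slug: str, taken: set) -> str:
    """Return slug unchanged if unseen; otherwise append _2, _3, ...
    until the result is unique, and record it as taken."""
    if slug not in taken:
        taken.add(slug)
        return slug
    n = 2
    while f"{slug}_{n}" in taken:
        n += 1
    unique = f"{slug}_{n}"
    taken.add(unique)
    return unique

taken = set()
print(dedupe_slug("c", taken))  # c
print(dedupe_slug("c", taken))  # c_2
```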