skillpack-harvest — Editorial workflow for lifting host skills into gbrain
Convention: see _brain-filing-rules.md for
file placement rules. This skill writes into gbrain's own tree, not the
brain repo's notes.
This skill is the inverse of gbrain skillpack scaffold. Scaffold ships
skills downstream (gbrain → host). Harvest lifts proven patterns
upstream (host → gbrain) so they become references every other client
can scaffold.
Contract
A harvest is "properly done" when:
- The host skill is mature (used in production, recent routing-eval
cases pass).
- The editorial genericization in Phase 3 has scrubbed every
fork-specific reference (names, real entities, internal channels).
- gbrain skillpack harvest --dry-run previewed the file set.
- The real gbrain skillpack harvest <slug> --from <host> succeeded
with no privacy-lint hits.
- bun test test/skills-conformance.test.ts passes on the new
SKILL.md.
- The user has reviewed the diff in gbrain and explicitly approved
the commit.
If any of these is incomplete, the skill is NOT yet harvested — the
files may sit in gbrain's working tree, but they're not landed.
Output Format
This skill produces three artifacts in gbrain's working tree:
- skills/<harvested-slug>/SKILL.md (and any sibling files)
- Paired source files at their mirror paths, when the host SKILL.md
declared them in frontmatter
- An updated bundle manifest with the new slug added (sorted)
The session output to the user is a one-line success summary plus
a list of files written. JSON mode returns the full result shape
for machine consumption.
Anti-Patterns
- Skipping the dry-run. Always preview first. Files land in
gbrain's working tree; they can be cleaned up afterwards, but you
shouldn't need to.
- Trusting the linter alone. The default regex set catches the
common cases. It doesn't catch every proper noun. Phase 3 (the
editorial pass) is the primary defense.
- Harvesting with --no-lint without justification. The lint exists
for a reason. If you bypass it, document why in the commit.
- Harvesting a skill that's still in flux. Wait until the host
version stabilizes. Otherwise you'll harvest, then re-harvest,
then re-harvest, and that churns gbrain's bundle for no benefit.
- Moving files instead of copying. Harvest is a copy. The host
retains its skill. Don't delete the source after harvesting.
- Batch harvesting (multiple skills at once). Not supported, and
for good reason: the editorial review per skill is real work.
When to invoke
- The user developed a skill in their host fork (Wintermute, Neuromancer,
Zion, etc.) and wants other gbrain clients to be able to use it
- A skill has proven itself in production and is ready to generalize
- The user explicitly asks to "harvest" or "publish" a skill upstream
Do NOT invoke when:
- The skill is still in flux locally — let it stabilize first
- The skill references private content that can't be generalized
- The user just wants to share a one-off draft (use a gist instead)
Preconditions
Before running this skill, confirm:
- The skill is mature. Recent routing-eval cases pass; the
skill has been used in production at least a few times.
- The skill is generalizable. Strip-test in your head: replace
every fork-specific name. Does it still make sense as a skill?
- The user owns the gbrain checkout. The harvest writes into
gbrain's working tree. They'll review and commit. Don't harvest
into a checkout the user doesn't intend to commit from.
Workflow
Phase 1 — Plan
Ask the user:
- What slug should the harvested skill have? (Slugs must be kebab-case,
globally unique in the gbrain bundle.)
- Which host repo is the source? (Path to repo root, not to the skill
directory, e.g. ~/git/wintermute, not ~/git/wintermute/skills/foo.)
- Should paired source files come along? (Check the paired-sources
array in the host SKILL.md's frontmatter.)
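
The answers to these questions can be sanity-checked before anything is run. A minimal sketch, assuming "kebab-case" means lowercase alphanumeric words joined by single hyphens, and that the host keeps each skill at skills/<slug>/SKILL.md under its repo root. Both helper names are hypothetical and neither is part of gbrain; uniqueness within the gbrain bundle is not checked here.

```bash
# is_kebab_case SLUG
# Succeeds when SLUG is lowercase alphanumeric words joined by single
# hyphens. The exact rule gbrain enforces is an assumption.
is_kebab_case() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'
}

# has_host_skill REPO_ROOT SLUG
# Succeeds when REPO_ROOT looks like a repo root that holds the skill,
# i.e. REPO_ROOT/skills/SLUG/SKILL.md exists (assumed host layout).
has_host_skill() {
  [ -f "$1/skills/$2/SKILL.md" ]
}
```

Passing the skill directory instead of the repo root makes has_host_skill fail, which is the mistake the second question warns about.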
Phase 2 — Dry-run + privacy-lint preview
```bash
gbrain skillpack harvest <slug> --from <host-repo-root> --dry-run
```
The output shows:
- Which files would land in gbrain's tree
- Whether paired sources are included
- (Implicit) The skill's frontmatter triggers — read them and check
they generalize
Do not skip the dry-run. The privacy linter only runs on a real
harvest, but the dry-run preview lets you see the files before they
land. Spot-check the SKILL.md and any paired source for things the
linter might miss (proper nouns, internal project names, etc.).
Phase 3 — Genericization checklist (the editorial pass)
Before running the real harvest, walk the host skill's files and
apply this checklist. If anything matches, edit the host file
FIRST, then run harvest.
- Fork-specific names → generic phrasing
  - Fork names (Wintermute, Neuromancer, Zion, etc.) → a generic
    phrase such as "the host"
  - Personal first names → "the user" or a generic placeholder
- Real entities → placeholders
  - Real people, companies, deals, funds → placeholder slugs
  - Email addresses → strip entirely OR use a placeholder address
  - Internal Slack channels → a generic placeholder, or strip
  - Specific tracker IDs / Linear ticket numbers → strip
- Fork-specific conventions → generic references
  - Mentions of fork-only convention docs → either lift the doc
    into gbrain OR replace with a generic placeholder explanation
  - Mentions of <host-repo>/skills/<other-fork-only-skill> → either
    decide to harvest that one too, or replace with a generic
    pattern reference
- Triggers array generalizes
  - Read every entry in the frontmatter triggers array. None should
    reference the user's name, fork name, or internal tools.
  - "Have garry sign off on it" → "have the user sign off on it"
- routing-eval.jsonl examples are scrubbed
  - Open skills/<slug>/routing-eval.jsonl. Every field gets the
    same scrub as the triggers.
- Code comments + log strings
  - If a paired source is going to be harvested, walk it for the
    same private-pattern leaks. Comments are the most common
    hiding spot.
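
Part of this checklist can be backed by a quick scan of the host skill directory before the real harvest. The sketch below is not the gbrain privacy linter: scan_for_leaks is a hypothetical helper, and the pattern-file format (one extended regex per line, matched case-insensitively) is an assumption rather than the documented format of ~/.gbrain/harvest-private-patterns.txt.

```bash
# scan_for_leaks DIR PATTERNS_FILE
# Prints every line under DIR that matches any pattern in PATTERNS_FILE
# (one extended regex per line, case-insensitive). Returns 1 when
# something matched and 0 when nothing did.
# Caveat: a blank line in PATTERNS_FILE matches every line.
scan_for_leaks() {
  if grep -rniE -f "$2" "$1"; then
    return 1
  fi
  return 0
}
```

A clean scan only means none of the listed patterns appeared. Proper nouns that are not in the pattern file still need the manual read-through.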
Phase 4 — Real harvest
Once Phase 3 is complete, run the real harvest:
```bash
gbrain skillpack harvest <slug> --from <host-repo-root>
```
Default behavior:
- Path-confinement + symlink rejection at file copy
- Privacy linter runs against ~/.gbrain/harvest-private-patterns.txt
(plus built-in defaults, including email addresses and Slack channels)
- On any match → rollback (delete the harvested files) + exit non-zero
- Manifest updated to add the slug, sorted
Outcomes:
- Success: manifest updated, files in gbrain's tree.
- Privacy-lint failure: the linter caught something. Go back to
Phase 3, scrub the host file, retry.
- Slug collision: gbrain already has a skill at that slug. Either
use a different slug, or pass the overwrite option if you really
mean to replace.
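
Phases 2 through 4 can be chained so that the real harvest never runs without a preview and an explicit confirmation. This is a sketch under stated assumptions: harvest_gated is a hypothetical wrapper, and it assumes the dry-run exits zero on success. The only exit behaviour the text above confirms is that a lint match on the real harvest exits non-zero.

```bash
# harvest_gated SLUG HOST_ROOT  (hypothetical wrapper, not part of gbrain)
# Previews with --dry-run, asks for confirmation on stdin, then runs the
# real harvest. Returns non-zero if a step fails or the operator declines.
harvest_gated() {
  # Assumption: --dry-run exits zero when the preview succeeds.
  gbrain skillpack harvest "$1" --from "$2" --dry-run || {
    echo "dry-run failed; nothing harvested" >&2
    return 1
  }
  printf 'Phase 3 editorial pass complete? [y/N] ' >&2
  read -r answer || answer=""
  if [ "$answer" != "y" ]; then
    echo "aborted before the real harvest" >&2
    return 1
  fi
  if gbrain skillpack harvest "$1" --from "$2"; then
    echo "harvested $1; review the diff before committing"
  else
    # Per the text above, a lint match rolls back and exits non-zero.
    echo "harvest failed; scrub the host file or pick another slug, then retry" >&2
    return 1
  fi
}
```

The prompt is deliberately manual: the dry-run output is meant to be read by a person, so the wrapper pauses instead of proceeding automatically.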
Phase 5 — Verify in gbrain
After a successful harvest:
- bun test test/skills-conformance.test.ts: confirms the new
SKILL.md meets the frontmatter contract.
- gbrain skillpack check --strict: confirms no drift between
bundle and gbrain's own checkout.
- List the bundle's skills and confirm the slug shows up.
- Review the diff: cd <gbrainRoot> && git diff -- skills/<slug>/
- Commit the additions in gbrain (do NOT commit any leftover files
in the host repo — harvest is a copy, not a move).
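
The verification steps can be run as one sequence that stops at the first failure. A minimal sketch; verify_harvest is a hypothetical helper name, and the commands are the ones listed above plus git status. The extra git status call is there because git diff only reports tracked files, so if the harvested files are still untracked they would not appear in the diff output.

```bash
# verify_harvest SLUG GBRAIN_ROOT  (hypothetical helper, not part of gbrain)
# Runs the conformance test and the strict bundle check, then shows what
# changed under skills/SLUG/. Runs in a subshell so the caller's working
# directory is left alone.
verify_harvest() {
  (
    cd "$2" || exit 1
    bun test test/skills-conformance.test.ts || exit 1
    gbrain skillpack check --strict || exit 1
    # Untracked files show up here but not in git diff.
    git status --short -- "skills/$1/"
    git diff -- "skills/$1/"
  )
}
```

The commit itself stays a manual step, since the contract requires the user to review the diff and approve it explicitly.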
Phase 6 — Downstream announcement (optional)
If other gbrain clients should pick up the new skill:
- Note it under "Skills added" for the next release
- Tag the user / contributor in the PR if the skill came from
someone outside the core team
Bypass: --no-lint
The privacy linter is the safety net. The editorial pass is the
primary defense. If you've completed Phase 3 thoroughly and the
linter is still firing on a false positive, use --no-lint:
```bash
gbrain skillpack harvest <slug> --from <host-repo-root> --no-lint
```
Document the bypass in the commit message. Future maintainers
should be able to see WHY the lint was bypassed (e.g. "Wintermute
appears in a citation, not a real reference — verified manually").
Never bypass the linter on a casual basis. The whole point of the
default-on lint is that real names occasionally slip through the
editorial pass.
What harvest does NOT do
- It does NOT move files (it copies). The host's skill directory
stays in place.
- It does NOT auto-scrub names. The editorial pass is human-driven.
- It does NOT publish to npm or a remote bundle. It writes to
gbrain's working tree; the user commits + ships via the normal
gbrain release process.
- It does NOT support batch harvest. One skill at a time keeps the
editorial review tractable.
Files this skill touches
- gbrain's skills/<slug>/: every file in the host skill dir
(copy)
- gbrain's mirror path for declared paired sources
(only if the host SKILL.md declares them in frontmatter)
- gbrain's bundle manifest: adds the slug to its skill list,
sorted alphabetically
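
The "add the slug, sorted" step can be illustrated on a plain list of slugs. The real manifest format is not described here, so this sketch works on newline-separated slugs only. add_slug_sorted is a hypothetical helper, and byte-order sorting (LC_ALL=C) is an assumption about what "alphabetically" means.

```bash
# add_slug_sorted NEW_SLUG
# Reads existing slugs from stdin (one per line), adds NEW_SLUG, and
# prints the result sorted with duplicates removed.
add_slug_sorted() {
  { cat; printf '%s\n' "$1"; } | LC_ALL=C sort -u
}
```

For example, `printf 'alpha\nzeta\n' | add_slug_sorted meeting-prep` prints alpha, meeting-prep, and zeta on separate lines. Keeping the list sorted means each harvest adds a single line at a predictable position in the diff.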