SKOS Taxonomist
Design operational taxonomies with SKOS so Panda and all sub-agents classify work consistently, retrieve better context, and avoid tag soup.
This skill is for execution, not theory: define concepts, classify agent data, validate constraints, and wire taxonomy fields into Typesense.
When to Use
Use this skill when any request involves:
- "taxonomy", "SKOS", "controlled vocabulary", "concept scheme"
- classifying agent interactions, runs, docs, memory, events, or tasks
- designing or revising the agent classification or Typesense integration contracts
- mapping joelclaw concepts to another vocabulary
- deciding whether SKOS is needed or a simpler tag model is enough
Primary Outcomes
- A versioned SKOS concept scheme with stable concept URIs.
- Agent-classification rules that produce deterministic concept metadata.
- Typesense field mappings that support lexical, faceted, and hybrid retrieval.
- Governance rules for candidate concepts, alias drift, deprecation, and mappings.
Non-Negotiable SKOS Rules (Normative)
Follow these in every scheme:
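- skos:Concept and skos:ConceptScheme are distinct classes.
- skos:broader is not transitive; use its transitive super-property (skos:broaderTransitive) for closure queries.
- skos:prefLabel takes at most one value per language tag for a resource.
- skos:prefLabel, skos:altLabel, and skos:hiddenLabel are pairwise disjoint for the same resource+language form.
- skos:related is disjoint with skos:broaderTransitive.
- Mapping relations are for cross-scheme linking. skos:exactMatch is transitive and should be used sparingly.
- skos:closeMatch is intentionally non-transitive.
- skos:Collection / skos:OrderedCollection are for grouping; they are disjoint with skos:Concept.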
If any of these fail, stop and fix data before rollout.
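The two label rules above lend themselves to an automated check. The following is a minimal sketch, assuming each concept has already been loaded into a plain dict with hypothetical "pref", "alt", and "hidden" keys holding (text, language) tuples; it is not tied to any particular RDF library.

```python
from collections import defaultdict


def check_labels(concept):
    """Return a list of label-rule violations for one concept.

    `concept` is a hypothetical dict with "pref", "alt", and "hidden" keys,
    each a list of (text, lang) tuples.
    """
    problems = []

    # Rule: at most one prefLabel per language tag.
    per_lang = defaultdict(list)
    for text, lang in concept.get("pref", []):
        per_lang[lang].append(text)
    for lang, texts in sorted(per_lang.items()):
        if len(texts) > 1:
            problems.append(f"multiple prefLabel values for @{lang}: {sorted(texts)}")

    # Rule: pref/alt/hidden labels are pairwise disjoint per (text, lang) form.
    kinds = ["pref", "alt", "hidden"]
    for i, a in enumerate(kinds):
        for b in kinds[i + 1:]:
            shared = set(concept.get(a, [])) & set(concept.get(b, []))
            for text, lang in sorted(shared):
                problems.append(f"{a}/{b} overlap: {text!r}@{lang}")
    return problems


pipeline = {
    "pref": [("Pipeline", "en")],
    "alt": [("workflow", "en"), ("event flow", "en")],
    "hidden": [],
}
broken = {
    "pref": [("Pipeline", "en"), ("Pipelines", "en")],
    "alt": [("Pipeline", "en")],
    "hidden": [],
}

print(check_labels(pipeline))
print(check_labels(broken))
```

A non-empty result is a reason to block rollout until the data is fixed.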
JoelClaw Operational Scheme (Workload v1)
Define a dedicated scheme for workload classification:
- Scheme URI: joelclaw:scheme:workload:v1
- Taxonomy version string: workload-v1
- Concept URI pattern: joelclaw:concept:<top-level>[:<subconcept>]
- Notation style: upper snake or dotted operational codes (stable and immutable)
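The concept URI pattern can be enforced at write time. This sketch assumes the pattern allows exactly one top-level segment plus at most one subconcept segment, each in lowercase kebab-case; the function name `parse_concept_uri` is hypothetical.

```python
import re

# Assumed reading of the pattern: lowercase kebab-case segments,
# one top-level segment and at most one subconcept segment.
SEGMENT = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"
CONCEPT_URI = re.compile(rf"^joelclaw:concept:({SEGMENT})(?::({SEGMENT}))?$")


def parse_concept_uri(uri):
    """Split a concept URI into (top_level, subconcept_or_None)."""
    match = CONCEPT_URI.match(uri)
    if match is None:
        raise ValueError(f"not a valid workload concept URI: {uri!r}")
    return match.group(1), match.group(2)


print(parse_concept_uri("joelclaw:concept:pipeline:ingest"))
print(parse_concept_uri("joelclaw:concept:platform"))
```

Rejecting malformed URIs early keeps free-form strings from leaking into stored concept metadata.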
Top-Level Concepts (Required)
| Notation | URI | Purpose |
|---|---|---|
| PLATFORM | joelclaw:concept:platform | Runtime/platform infrastructure and hosting substrate |
| INTEGRATION | joelclaw:concept:integration | External system connections, APIs, webhooks, adapters |
| TOOLING | joelclaw:concept:tooling | CLI/dev tooling, operator commands, local automation |
| PIPELINE | joelclaw:concept:pipeline | Inngest/event workflows, ingestion and processing chains |
| BUILD | joelclaw:concept:build | Code implementation, tests, CI/CD, packaging |
| KNOWLEDGE | joelclaw:concept:knowledge | Docs, memory, taxonomy, retrieval context |
| COMMS | joelclaw:concept:comms | Messaging channels, notifications, agent/user communication |
| OBSERVE | joelclaw:concept:observe | OTEL/logging/metrics/diagnostics/reliability telemetry |
| META | joelclaw:concept:meta | Governance, ADRs, policies, lifecycle and process controls |
Core Turtle Skeleton
```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix jcw: <joelclaw:> .

jcw:scheme:workload:v1 a skos:ConceptScheme ;
  skos:prefLabel "JoelClaw Workload Taxonomy v1"@en ;
  skos:definition "Operational workload taxonomy for Panda and sub-agents."@en ;
  skos:hasTopConcept
    jcw:concept:platform,
    jcw:concept:integration,
    jcw:concept:tooling,
    jcw:concept:pipeline,
    jcw:concept:build,
    jcw:concept:knowledge,
    jcw:concept:comms,
    jcw:concept:observe,
    jcw:concept:meta .

jcw:concept:pipeline a skos:Concept ;
  skos:inScheme jcw:scheme:workload:v1 ;
  skos:topConceptOf jcw:scheme:workload:v1 ;
  skos:notation "PIPELINE" ;
  skos:prefLabel "Pipeline"@en ;
  skos:altLabel "workflow"@en, "event flow"@en ;
  skos:definition "Durable event-driven processing sequences."@en ;
  skos:scopeNote "Use for Inngest functions, ingest chains, and orchestration logic."@en ;
  skos:narrower jcw:concept:pipeline:ingest, jcw:concept:pipeline:enrichment ;
  skos:related jcw:concept:observe, jcw:concept:build .
```
Agent Classification Contract
Every sub-agent output that can be stored/retrieved must emit:
- primary_concept_id (single canonical concept URI)
- concept_ids (ordered list: primary first, then secondary)
- concept_source (rules|llm|backfill|manual|fallback)
- classification_confidence (0-1 float, optional but recommended)
Recommended envelope:
```json
{
  "primary_concept_id": "joelclaw:concept:pipeline:ingest",
  "concept_ids": [
    "joelclaw:concept:pipeline:ingest",
    "joelclaw:concept:knowledge",
    "joelclaw:concept:observe"
  ],
  "taxonomy_version": "workload-v1",
  "concept_source": "rules",
  "classification_confidence": 0.88
}
```
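A producer can validate the envelope before storing it. The sketch below is one possible shape, assuming the field names from the envelope above; the `Classification` class and its checks are illustrative, not an existing joelclaw type.

```python
from dataclasses import dataclass
from typing import List, Optional

ALLOWED_SOURCES = {"rules", "llm", "backfill", "manual", "fallback"}


@dataclass(frozen=True)
class Classification:
    """Hypothetical in-memory form of the classification envelope."""

    primary_concept_id: str
    concept_ids: List[str]
    taxonomy_version: str
    concept_source: str
    classification_confidence: Optional[float] = None

    def __post_init__(self):
        # The ordered list must start with the primary concept.
        if not self.concept_ids or self.concept_ids[0] != self.primary_concept_id:
            raise ValueError("concept_ids must list the primary concept first")
        if self.concept_source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown concept_source: {self.concept_source!r}")
        confidence = self.classification_confidence
        if confidence is not None and not 0.0 <= confidence <= 1.0:
            raise ValueError("classification_confidence must be within 0-1")


example = Classification(
    primary_concept_id="joelclaw:concept:pipeline:ingest",
    concept_ids=[
        "joelclaw:concept:pipeline:ingest",
        "joelclaw:concept:knowledge",
        "joelclaw:concept:observe",
    ],
    taxonomy_version="workload-v1",
    concept_source="rules",
    classification_confidence=0.88,
)
print(example.primary_concept_id)
```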
Classification Procedure
- Normalize candidate labels (lowercase, slugify, punctuation collapse).
- Match against skos:prefLabel, then skos:altLabel, then alias tables.
- Disambiguate using skos:scopeNote and neighboring concepts (those linked by skos:broader, skos:narrower, or skos:related).
- Emit one primary concept plus optional secondary concepts.
- If unresolved, map to a controlled fallback concept and log unmapped labels.
- Emit OTEL metadata for mapping diagnostics (concept_ids, primary_concept_id, taxonomy_version).
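The procedure above can be sketched as a small rules-based classifier. This is an illustration under stated assumptions: the lookup tables are in-memory dicts, NFKC normalization is an added choice, the alias entry and the fallback URI `joelclaw:concept:meta:unclassified` are hypothetical, and the disambiguation step is omitted.

```python
import re
import unicodedata


def normalize(label):
    """Lowercase, collapse punctuation and whitespace, and slugify a label."""
    text = unicodedata.normalize("NFKC", label).lower()  # NFKC is an assumption
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


# Lookup tables keyed by normalized label.
PREF = {
    "pipeline": "joelclaw:concept:pipeline",
    "knowledge": "joelclaw:concept:knowledge",
}
ALT = {
    "workflow": "joelclaw:concept:pipeline",
    "event-flow": "joelclaw:concept:pipeline",
}
ALIASES = {"inngest-fn": "joelclaw:concept:pipeline"}  # hypothetical alias
FALLBACK = "joelclaw:concept:meta:unclassified"  # hypothetical fallback concept


def classify(labels):
    """Map raw labels to concept metadata: prefLabel, then altLabel, then aliases."""
    matched, unmapped = [], []
    for raw in labels:
        key = normalize(raw)
        for table in (PREF, ALT, ALIASES):
            if key in table:
                if table[key] not in matched:
                    matched.append(table[key])
                break
        else:
            unmapped.append(raw)  # candidates for OTEL logging
    source = "rules" if matched else "fallback"
    concept_ids = matched or [FALLBACK]
    return {
        "primary_concept_id": concept_ids[0],
        "concept_ids": concept_ids,
        "taxonomy_version": "workload-v1",
        "concept_source": source,
        "unmapped_labels": unmapped,
    }


print(classify(["Event  Flow!", "Knowledge", "mystery tag"]))
```

Because matching is table-driven and ordered, the same input labels always produce the same concept metadata.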
SKOS-XL (When Labels Need First-Class Metadata)
Use SKOS-XL only when label objects need metadata or relationships:
- acronym/abbreviation management with provenance
- per-label source attribution
- multilingual/transliteration workflows with label-level auditing
- label-to-label relationships (deprecated term -> replacement term)
If labels are plain synonyms only, stay with core SKOS labels.
SKOS-XL Example
```turtle
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix jcw: <joelclaw:> .

jcw:label:comms:imessage a skosxl:Label ;
  skosxl:literalForm "iMessage"@en .

jcw:concept:comms:imessage skosxl:altLabel jcw:label:comms:imessage .
```
Mapping to External Vocabularies
Use mapping properties across schemes, not inside one scheme:
- skos:exactMatch: interchangeable meaning (rare, high bar)
- skos:closeMatch: near-equivalent, safe default for most interop
- skos:broadMatch / skos:narrowMatch: granularity mismatch
- skos:relatedMatch: associative cross-scheme link
Internal Cross-Scheme Example (Workload -> Existing Docs Scheme)
```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix jcw: <joelclaw:> .
@prefix jcd: <jc:> .

jcw:concept:build skos:closeMatch jcd:docs:programming .
jcw:concept:observe skos:broadMatch jcd:docs:operations .
jcw:concept:knowledge skos:relatedMatch jcd:docs:education .
```
Mapping guardrails:
- Start with skos:closeMatch; escalate to skos:exactMatch only with explicit review.
- Do not chain skos:exactMatch blindly across multiple schemes.
- Review inferred collisions caused by transitive/symmetric mapping behavior.
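These guardrails can be linted mechanically. The sketch below assumes mappings are available as (subject, property, target) triples and that a lookup from concept to scheme exists; the docs scheme identifier `jc:scheme:docs` and the `reviewed` set are hypothetical.

```python
WORKLOAD = "joelclaw:scheme:workload:v1"
DOCS = "jc:scheme:docs"  # hypothetical identifier for the existing docs scheme

scheme_by_concept = {
    "joelclaw:concept:build": WORKLOAD,
    "joelclaw:concept:observe": WORKLOAD,
    "joelclaw:concept:knowledge": WORKLOAD,
    "jc:docs:programming": DOCS,
    "jc:docs:operations": DOCS,
    "jc:docs:education": DOCS,
}

mappings = [
    ("joelclaw:concept:build", "closeMatch", "jc:docs:programming"),
    ("joelclaw:concept:observe", "broadMatch", "jc:docs:operations"),
    ("joelclaw:concept:knowledge", "relatedMatch", "jc:docs:education"),
]


def lint_mappings(triples, schemes, reviewed=frozenset()):
    """Flag same-scheme mapping links and exactMatch links without review."""
    problems = []
    for subject, prop, target in triples:
        if schemes[subject] == schemes[target]:
            problems.append(f"{prop} stays inside one scheme: {subject} -> {target}")
        if prop == "exactMatch" and (subject, target) not in reviewed:
            problems.append(f"exactMatch lacks explicit review: {subject} -> {target}")
    return problems


print(lint_mappings(mappings, scheme_by_concept))
risky = mappings + [("joelclaw:concept:build", "exactMatch", "joelclaw:concept:observe")]
print(lint_mappings(risky, scheme_by_concept))
```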
Collections and Ordered Collections
Use collections for non-hierarchical grouping:
- skos:Collection for thematic bundles (example: all communication channels)
- skos:OrderedCollection for deterministic sequences (example: escalation stages)
Do not encode hierarchy with collections. Use skos:broader / skos:narrower for taxonomy structure.
```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix jcw: <joelclaw:> .

jcw:collection:comms-channels a skos:Collection ;
  skos:prefLabel "Comms channels"@en ;
  skos:member
    jcw:concept:comms:telegram,
    jcw:concept:comms:slack,
    jcw:concept:comms:imessage .
```
URI and Notation Policy (skos:notation)
- URI local parts are lowercase kebab or colon-separated operational paths.
- skos:notation is immutable once released.
- Never repurpose a concept URI. Deprecate old URI and add mappings.
- Keep human names in labels, not in IDs.
- Encode scheme version in scheme URI and metadata, not in every concept URI unless required.
Recommended patterns:
```
joelclaw:scheme:workload:v1
joelclaw:concept:observe:otel
joelclaw:collection:agent-lifecycle
```
Typesense Integration Contract
Treat SKOS as source-of-truth semantics and Typesense as retrieval runtime.
Field Mapping
| SKOS | Typesense field | Type | Notes |
|---|---|---|---|
| Concept URI | | | Canonical identifier |
| skos:inScheme | | facet | Filter by scheme/version |
| skos:notation | | facet | Operational code lookups |
| skos:prefLabel | pref_label | | Primary lexical form |
| skos:altLabel | alt_labels | | Alias lookup/query expansion |
| skos:hiddenLabel | | | Misspelling/legacy term recovery |
| skos:definition | definition | | Long-form semantic context |
| skos:scopeNote | scope_note | | Disambiguation guidance |
| skos:broader | | facet | Direct parents |
| skos:narrower | | facet | Direct children |
| skos:related | | facet | Lateral associations |
| Mapping props | | | Cross-scheme links |
| Version/governance | taxonomy_version | facet | Rollout control |
For retrievable entities (docs, memory, events), also persist:
- concept_ids (string[], faceted)
- primary_concept_id, taxonomy_version, concept_source, classification_confidence (where applicable)
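The field mapping could translate into a Typesense collection schema along these lines. This is a sketch, not the deployed schema: `pref_label`, `alt_labels`, `definition`, `scope_note`, `taxonomy_version`, and the `taxonomy_concepts` collection name appear in this document's query examples, while every name in `ASSUMED_FIELDS` and all type choices are assumptions made for illustration.

```python
# Field names that appear in this document's query examples.
DOCUMENTED_FIELDS = [
    {"name": "pref_label", "type": "string"},
    {"name": "alt_labels", "type": "string[]", "optional": True},
    {"name": "definition", "type": "string", "optional": True},
    {"name": "scope_note", "type": "string", "optional": True},
    {"name": "taxonomy_version", "type": "string", "facet": True},
]

# Hypothetical names: chosen for illustration, not confirmed by the source.
ASSUMED_FIELDS = [
    {"name": "scheme", "type": "string", "facet": True},
    {"name": "notation", "type": "string", "facet": True},
    {"name": "hidden_labels", "type": "string[]", "optional": True},
    {"name": "broader", "type": "string[]", "facet": True, "optional": True},
    {"name": "narrower", "type": "string[]", "facet": True, "optional": True},
    {"name": "related", "type": "string[]", "facet": True, "optional": True},
]

taxonomy_concepts_schema = {
    "name": "taxonomy_concepts",
    "fields": DOCUMENTED_FIELDS + ASSUMED_FIELDS,
}


def facet_fields(schema):
    """List the fields that can be used for fast filters and facet counts."""
    return [field["name"] for field in schema["fields"] if field.get("facet")]


print(facet_fields(taxonomy_concepts_schema))
```

Keeping identifier-like fields faceted and label fields plain strings mirrors the split between filtering and lexical search.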
Query Patterns
Classification candidate lookup:
```bash
curl -s "http://localhost:8108/collections/taxonomy_concepts/documents/search?q=ingest+pipeline&query_by=pref_label,alt_labels,scope_note,definition&per_page=10" \
  -H "X-TYPESENSE-API-KEY: panda-typesense-key"
```
Concept-constrained retrieval:
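```bash
# -G plus --data-urlencode keeps the spaces, brackets, and && in filter_by from breaking the URL
curl -s -G "http://localhost:8108/collections/documents/documents/search" \
  --data-urlencode "q=*" --data-urlencode "query_by=content" --data-urlencode "per_page=20" \
  --data-urlencode "filter_by=concept_ids:=[joelclaw:concept:pipeline] && taxonomy_version:=workload-v1" \
  -H "X-TYPESENSE-API-KEY: panda-typesense-key"
```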
Operational notes:
- Use q=* when filtering/faceting without lexical query terms.
- Keep concept fields faceted for fast filters and diagnostics.
- Synonyms operate on query tokens, not filter_by values; concept IDs must be canonical.
- For transitive hierarchy retrieval, precompute ancestor closures into a dedicated field.
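Precomputing ancestor closures is a small graph walk over direct-parent links. The sketch below assumes the hierarchy is available as a dict from concept to its direct parents; the `c:leaf`, `c:mid`, and `c:root` identifiers are placeholders, and the target field name `ancestor_concept_ids` is hypothetical.

```python
def ancestor_closure(broader):
    """Map every concept to the set of all its transitive ancestors.

    `broader` maps a concept to its list of direct parents.
    Raises ValueError if the hierarchy contains a cycle.
    """
    closure = {}

    def visit(concept, path):
        if concept in closure:
            return closure[concept]
        if concept in path:
            raise ValueError(f"hierarchy cycle through {concept}")
        ancestors = set()
        for parent in broader.get(concept, []):
            ancestors.add(parent)
            ancestors |= visit(parent, path | {concept})
        closure[concept] = ancestors
        return ancestors

    for concept in broader:
        visit(concept, frozenset())
    return closure


# Placeholder three-level hierarchy to show the transitive step.
broader = {"c:leaf": ["c:mid"], "c:mid": ["c:root"]}
closure = ancestor_closure(broader)

# Hypothetical document shape: store the closure next to concept_ids.
doc = {
    "concept_ids": ["c:leaf"],
    "ancestor_concept_ids": sorted(closure["c:leaf"]),
}
print(doc)
```

With the closure stored, a filter on a parent concept also matches records classified under any of its descendants, without hierarchy traversal at query time.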
Local Access Troubleshooting
If localhost:8108 is unreachable, port-forward first:
```bash
kubectl port-forward -n joelclaw svc/typesense 8108:8108
```
Quality Gates and Validation
Run these checks before shipping taxonomy changes:
- Label integrity:
  - one skos:prefLabel per language per concept
  - no overlap among pref/alt/hidden labels
- Structural integrity:
  - every concept in exactly one expected scheme (or explicit multi-scheme design)
  - no accidental hierarchy cycles
  - skos:hasTopConcept and skos:topConceptOf coherence
- Mapping integrity:
  - mapping links only target external scheme concepts
  - skos:exactMatch links reviewed and justified
- Runtime integrity:
  - classification coverage target met (>=95% for new records)
  - unmapped labels observable in OTEL
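The runtime coverage gate reduces to a ratio over recent records. This sketch assumes a record counts as covered when it carries a primary_concept_id and was not classified by the fallback path; that definition of coverage is an assumption, as is the `COVERAGE_TARGET` name.

```python
COVERAGE_TARGET = 0.95  # the >=95% gate for new records


def classification_coverage(records):
    """Share of records with a real (non-fallback) primary concept."""
    if not records:
        return 0.0
    covered = sum(
        1
        for record in records
        if record.get("primary_concept_id")
        and record.get("concept_source") != "fallback"
    )
    return covered / len(records)


classified = {
    "primary_concept_id": "joelclaw:concept:pipeline",
    "concept_source": "rules",
}
fell_back = {
    "primary_concept_id": "joelclaw:concept:meta",
    "concept_source": "fallback",
}

healthy = [classified] * 39 + [fell_back]
degraded = [classified] * 8 + [fell_back] * 2

print(classification_coverage(healthy) >= COVERAGE_TARGET)
print(classification_coverage(degraded) >= COVERAGE_TARGET)
```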
Recommended operational probes:
```
joelclaw otel search "concept_ids|primary_concept_id|taxonomy_version" --hours 24
joelclaw recall "<query>" --category <mapped-category>
joelclaw docs search "<query>" --concept <concept-uri>
```
Governance Workflow
- Propose each new concept as a candidate, not canonical.
- Check collisions against existing labels and aliases.
- Validate impact on classifier rules and retrieval filters.
- Promote to canonical only after review and observed usage.
- Deprecate by state transition + mapping hints, never by URI reuse.
- Version changes explicitly (workload-v1 -> workload-v2) with migration notes.
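The lifecycle above behaves like a small state machine. The following sketch assumes three states (candidate, canonical, deprecated) stored in a hypothetical `state` field, with `replaced_by` standing in for the mapping hint; none of these names are confirmed by the source.

```python
ALLOWED_TRANSITIONS = {
    "candidate": {"canonical", "deprecated"},
    "canonical": {"deprecated"},
    "deprecated": set(),
}


def transition(concept, new_state, *, reviewed=False, usage_count=0, replaced_by=None):
    """Return a copy of `concept` in `new_state`; the URI is never changed."""
    current = concept["state"]
    if new_state not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move {current} -> {new_state}")
    if new_state == "canonical" and not (reviewed and usage_count > 0):
        raise ValueError("promotion needs review and observed usage")
    updated = dict(concept, state=new_state)
    if new_state == "deprecated" and replaced_by is not None:
        updated["replaced_by"] = replaced_by  # mapping hint to the successor
    return updated


candidate = {"uri": "joelclaw:concept:comms:imessage", "state": "candidate"}
promoted = transition(candidate, "canonical", reviewed=True, usage_count=12)
retired = transition(promoted, "deprecated", replaced_by="joelclaw:concept:comms")
print(promoted["state"], retired["state"], retired["uri"])
```

Deprecated concepts keep their URI, so historical records stay resolvable while new records move to the successor.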
Anti-Patterns
Do not do these:
- "Tag soup" growth: free-form tags affecting retrieval without concept mapping.
- Using skos:exactMatch as a convenience synonym.
- Treating collection membership as hierarchy.
- Building deep trees without retrieval use-cases.
- Skipping skos:scopeNote, then trying to fix ambiguity downstream with prompts only.
When SKOS Is Overkill
Use a simpler tagging/enum model when all are true:
- Fewer than ~30 stable labels.
- No hierarchy/mapping requirements.
- No multilingual or alias governance needs.
- Labels are not reused across systems.
- Retrieval quality does not depend on concept semantics.
If any of those stop being true, migrate to SKOS before scaling.
Research-Derived Operational Guidance
Use SKOS metadata to improve retrieval quality in agent pipelines:
- Metadata-aware retrieval pipelines can materially improve multi-hop QA accuracy (for example, reported F1/EM gains in Multi-Meta-RAG benchmarks).
- Controlled vocabularies plus free text generally improve precision/recall versus text-only querying in domain IR studies.
- Practical implication for joelclaw: always combine lexical/vector retrieval with concept filtering or re-ranking signals when classification confidence is high.
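One way to act on the last point is to add a concept filter only when the classifier is confident. The sketch below builds Typesense search parameters using the filter syntax from the query patterns in this document; the function name and the 0.8 threshold are assumptions, not values taken from the cited studies.

```python
def build_search_params(query, classification, min_confidence=0.8):
    """Add a concept filter to a lexical search only for confident classifications."""
    params = {"q": query, "query_by": "content", "per_page": 20}
    confidence = classification.get("classification_confidence")
    if confidence is not None and confidence >= min_confidence:
        params["filter_by"] = (
            f"concept_ids:=[{classification['primary_concept_id']}]"
            f" && taxonomy_version:={classification['taxonomy_version']}"
        )
    return params


confident = {
    "primary_concept_id": "joelclaw:concept:pipeline:ingest",
    "taxonomy_version": "workload-v1",
    "classification_confidence": 0.88,
}
unsure = dict(confident, classification_confidence=0.4)

print(build_search_params("video ingest retries", confident))
print(build_search_params("video ingest retries", unsure))
```

Low-confidence classifications fall back to plain lexical search, which avoids filtering out relevant records on a weak signal.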
References