marketplace-search-recsys-planning
Use this skill whenever planning, designing, reviewing, or improving search and recommendation systems for a two-sided trust marketplace built on OpenSearch — covers user-intent framing, product-surface architecture, index design, query understanding, retrieval strategy, ranking, search-plus-recs blending, measurement, and a dashboard-and-alerting layer for ongoing decision making. Triggers on tasks involving marketplace search, homefeeds, ranking, relevance tuning, OpenSearch query DSL, analyzers, synonyms, golden sets, NDCG, A/B testing, or diagnosing an existing retrieval system. Use this skill BEFORE marketplace-personalisation when planning new work; hand off when the diagnosed bottleneck is personalisation-specific.
Source: pproenca/dot-skills
NPX Install
npx skill4agent add pproenca/dot-skills marketplace-search-recsys-planning
Marketplace Engineering Two-Sided Search and Recsys Planning Best Practices
Comprehensive planning, design and diagnostic guide for search and recommendation systems
in two-sided trust marketplaces. Covers OpenSearch index, query and ranking patterns, the
methodology for planning retrieval work, the handoff points to recommendation-specific
tooling, and the instrumentation and dashboard layer that turns measurement into ongoing
decision making. Contains 57 rules across 10 categories ordered by cascade impact, plus
two playbooks (plan a new system from scratch, diagnose an existing one) and explicit
living-artefact conventions (decisions log, golden set, gotchas).
When to Apply
Reference this skill when:
- Planning a new marketplace retrieval project from scratch
- Reviewing an existing retrieval system that feels stale, unfair, or unpersonalised
- Designing the OpenSearch index mapping, analyzers, or query DSL
- Choosing retrieval primitives per product surface (search, recs, hybrid, curated)
- Deciding which search quality metrics to track and dashboard
- Running the weekly search-quality review ritual
- Diagnosing a silent regression in ranking, coverage, or zero-result rate
- Deciding when a retrieval problem is actually a personalisation problem
This skill is the precursor to marketplace-personalisation. Start here for
planning and search work; hand off to the personalisation skill when the diagnosed
bottleneck is impression tracking, feedback-loop bias, or AWS Personalize-specific
design.
Living Context
This skill treats the system as evolving. Three living artefacts carry context across
sessions, releases, and team changes — read them before making suggestions, update them
after every shipped change:
- gotchas.md (in this skill folder): append-only diagnostic lessons. Every gotcha has a date and a short description of what surprised the team and how it was resolved.
- Decisions log (maintained in the product repo, typically decisions/*.md): every ranking change, schema tweak, and synonym edit recorded with its hypothesis, offline and online evidence, ship criterion, outcome, and rollback path. See rule plan-maintain-a-decisions-log.
- Golden query set (frozen per eval cycle, committed to the product repo): the reference set of queries against which every ranking change is offline-evaluated before an online test. See rule plan-version-the-golden-set.
Rule Categories
Categories are ordered by cascade impact on the retrieval lifecycle: intent
misunderstanding poisons architecture; wrong architecture poisons index; wrong index
poisons retrieval forever until a reindex; every downstream layer inherits the upstream
error.
| # | Category | Prefix | Impact |
|---|---|---|---|
| 1 | Problem Framing and User Intent | intent- | CRITICAL |
| 2 | Surface Taxonomy and Architecture | arch- | CRITICAL |
| 3 | Index Design and Mapping | index- | HIGH |
| 4 | Planning and Improvement Methodology | plan- | HIGH |
| 5 | Query Understanding | query- | MEDIUM-HIGH |
| 6 | Retrieval Strategy | retrieve- | MEDIUM-HIGH |
| 7 | Relevance and Ranking | rank- | MEDIUM-HIGH |
| 8 | Search and Recommender Blending | blend- | MEDIUM |
| 9 | Measurement and Experimentation | measure- | MEDIUM |
| 10 | Instrumentation, Dashboards and Decision Triggers | monitor- | MEDIUM |
Quick Reference
1. Problem Framing and User Intent (CRITICAL)
- intent-map-queries-to-intent-classes: classify before retrieving
- intent-separate-known-item-from-discovery: different failure modes, different strategies
- intent-audit-live-query-logs-first: design from real data, not imagined data
- intent-distinguish-transactional-from-exploratory: precision vs diversity
- intent-reject-one-search-for-everything: per-surface query shapes
- intent-treat-no-search-as-first-class-choice: curated is a legitimate answer
2. Surface Taxonomy and Architecture (CRITICAL)
- arch-map-surface-to-retrieval-primitive: a single-source-of-truth routing table
- arch-split-candidate-generation-from-ranking: two-stage pipelines
- arch-design-zero-result-fallback: declare fallback owner per surface
- arch-design-for-cold-start-from-day-one: cold start is permanent, not bootstrap
- arch-avoid-mono-stack-retrieval: diversify primary dependencies
- arch-route-surfaces-deliberately: every routing decision recorded
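The routing-table idea behind arch-map-surface-to-retrieval-primitive and arch-design-zero-result-fallback could be sketched roughly as below. This is a minimal illustration, not the skill's own implementation: the surface names, team names, and the `SurfaceRoute` type are hypothetical, and only the four primitives (search, recs, hybrid, curated) come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfaceRoute:
    primitive: str       # "search", "recs", "hybrid", or "curated"
    fallback: str        # primitive used when the primary returns nothing
    fallback_owner: str  # team accountable for the fallback (hypothetical names)


# Hypothetical single-source-of-truth routing table: one row per product surface.
ROUTES = {
    "search_results": SurfaceRoute("search", "curated", "search-team"),
    "homefeed": SurfaceRoute("recs", "curated", "recs-team"),
    "similar_listings": SurfaceRoute("hybrid", "search", "search-team"),
}


def route(surface: str) -> SurfaceRoute:
    """Look up a surface, failing loudly if nobody declared a route for it."""
    try:
        return ROUTES[surface]
    except KeyError:
        raise KeyError(f"surface {surface!r} has no declared route") from None
```

Keeping the table in one place means an undeclared surface fails at lookup time instead of silently inheriting another surface's retrieval behaviour.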
3. Index Design and Mapping (HIGH)
- index-design-mappings-conservatively: reindex is expensive
- index-use-keyword-and-text-as-multi-fields: full-text plus exact match
- index-match-index-and-query-time-analyzers: tokens must agree
- index-use-language-analyzers-for-language-fields: language-aware stemming
- index-separate-searchable-from-display-fields: index only what you search
- index-use-index-templates-for-consistency: prevent mapping drift
- index-stream-listing-updates-via-cdc: freshness in seconds, not hours
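As a rough illustration of several index rules at once (multi-fields, a language analyzer, display-only fields, conservative mappings), an index body might look like the sketch below. The field names (`title`, `category`, `price`, `image_url`) are assumptions made for the example and do not come from the skill; the body is built as a plain Python dict so it can be passed to whichever OpenSearch client is in use.

```python
import json


def listing_index_body() -> dict:
    """Build a hypothetical index body for a listings index."""
    return {
        "mappings": {
            # Reject unmapped fields so the mapping cannot drift silently.
            "dynamic": "strict",
            "properties": {
                "title": {
                    "type": "text",
                    # Built-in English analyzer; with no search_analyzer set,
                    # the same analyzer is applied at query time.
                    "analyzer": "english",
                    # Keyword sub-field for exact match, sorting, aggregations.
                    "fields": {"raw": {"type": "keyword", "ignore_above": 256}},
                },
                "category": {"type": "keyword"},
                "price": {"type": "scaled_float", "scaling_factor": 100},
                # Display-only field: returned in _source but not searchable.
                "image_url": {"type": "keyword", "index": False, "doc_values": False},
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(listing_index_body(), indent=2))
```

Because `title` declares no separate `search_analyzer`, index-time and query-time tokens agree by default, which is the property index-match-index-and-query-time-analyzers asks for.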
4. Planning and Improvement Methodology (HIGH)
- plan-audit-before-you-build: instrumentation gate on kick-off
- plan-build-golden-query-set-first: the first artefact, not the last
- plan-find-bottleneck-before-optimising: theory of constraints
- plan-maintain-a-decisions-log: living context across team changes
- plan-version-the-golden-set: frozen per eval cycle
- plan-handoff-to-personalisation-skill: recognise the boundary
5. Query Understanding (MEDIUM-HIGH)
- query-normalise-before-anything-else: canonical string in
- query-use-language-analyzers-for-stemming: double-digit recall wins
- query-curate-synonyms-by-domain: domain vocabulary not thesaurus
- query-use-fuzzy-matching-for-typos: 10-15% of queries have typos
- query-classify-before-routing: single-pass classifier
- query-build-autocomplete-on-separate-index: latency isolation
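A minimal sketch of query-normalise-before-anything-else and query-use-fuzzy-matching-for-typos follows. The normalisation steps chosen here (Unicode NFKC, case folding, whitespace collapsing) are one plausible reading of "canonical string in" and are an assumption, as are the function names and the `title` field.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")


def normalise_query(raw: str) -> str:
    """Return a canonical form of the raw query string."""
    text = unicodedata.normalize("NFKC", raw)  # fold width/compatibility variants
    text = text.casefold()                     # aggressive, locale-free lowercasing
    return _WHITESPACE.sub(" ", text).strip()  # collapse runs of whitespace


def fuzzy_match_clause(field: str, query: str) -> dict:
    """Build a match clause that tolerates typos via edit-distance fuzziness."""
    return {
        "match": {
            field: {
                "query": normalise_query(query),
                # "AUTO" scales the allowed edit distance with term length.
                "fuzziness": "AUTO",
            }
        }
    }
```

Normalising once, before classification, routing, and caching, keeps every downstream component working from the same string.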
6. Retrieval Strategy (MEDIUM-HIGH)
- retrieve-use-filter-clauses-for-exact-matches: filter cache wins
- retrieve-use-bool-structure-deliberately: must vs should vs filter
- retrieve-run-expensive-signals-in-rescore: rescore window limits cost
- retrieve-combine-bm25-and-knn-via-hybrid-search: lexical plus semantic
- retrieve-paginate-with-search-after: constant-cost deep pagination
- retrieve-choose-embedding-model-deliberately: re-embedding is expensive
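The bool-structure, filter-clause, and search_after rules can be combined in one request body, sketched below as a plain dict. The field names (`title`, `category`, `listing_id`) and the function name are hypothetical; `listing_id` is assumed to be a unique keyword field used as a sort tiebreaker, which search_after needs for stable paging.

```python
from typing import Optional, Sequence


def build_search_body(
    text: str,
    category: str,
    page_size: int = 20,
    search_after: Optional[Sequence] = None,
) -> dict:
    """Build a bool query with deliberate must/filter/should roles."""
    body = {
        "size": page_size,
        "query": {
            "bool": {
                # must: has to match and contributes to the score.
                "must": [{"match": {"title": {"query": text}}}],
                # filter: has to match, is not scored, and is cacheable.
                "filter": [{"term": {"category": category}}],
                # should: optional, boosts documents with the exact phrase.
                "should": [{"match_phrase": {"title": {"query": text}}}],
            }
        },
        # Deterministic sort with a unique tiebreaker for search_after.
        "sort": [{"_score": "desc"}, {"listing_id": "asc"}],
    }
    if search_after is not None:
        # Sort values of the last hit on the previous page.
        body["search_after"] = list(search_after)
    return body
```

For the next page, pass the `sort` array of the last hit from the previous response as `search_after`, rather than increasing a `from` offset.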
7. Relevance and Ranking (MEDIUM-HIGH)
- rank-tune-bm25-parameters-last: upstream levers first
- rank-use-function-score-for-business-signals: explicit named functions
- rank-deploy-ltr-only-after-golden-set-exists: supervised learning needs labels
- rank-apply-diversity-at-rank-time: after scoring, not before
- rank-normalise-scores-across-retrieval-primitives: comparable scales
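One simple way to apply diversity after scoring, in the spirit of rank-apply-diversity-at-rank-time, is a per-seller cap over the already-ranked list. The sketch below is an illustrative technique chosen for brevity, not necessarily the method the rule file prescribes; the tuple layout and the `max_per_key` default are assumptions.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence, TypeVar

T = TypeVar("T")


def diversify(
    ranked: Sequence[T],
    key: Callable[[T], Hashable],
    max_per_key: int = 2,
) -> List[T]:
    """Cap how many items per key (e.g. seller) appear before the overflow.

    `ranked` must already be sorted by score, best first. Items beyond the cap
    are not dropped; they are moved behind all capped items, keeping their
    relative order.
    """
    kept, deferred = [], []
    seen = defaultdict(int)
    for item in ranked:
        group = key(item)
        if seen[group] < max_per_key:
            seen[group] += 1
            kept.append(item)
        else:
            deferred.append(item)
    return kept + deferred
```

Because the cap runs on the scored list, relevance scoring stays untouched and the diversity step can be tuned or removed independently.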
8. Search and Recommender Blending (MEDIUM)
- blend-use-search-alone-for-specific-intent: precision queries
- blend-combine-search-and-personalisation-scores: normalised weighted sum
- blend-keep-hybrid-blending-explainable: traceable results
- blend-never-return-zero-results: guaranteed cascade to non-empty
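The "normalised weighted sum" and "traceable results" ideas can be sketched as follows. Min-max normalisation and the 0.7 search weight are assumptions picked for the example; the rule files may recommend a different normaliser or weighting.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so different primitives become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (value - lo) / (hi - lo) for doc, value in scores.items()}


def blend(
    search_scores: Dict[str, float],
    recs_scores: Dict[str, float],
    search_weight: float = 0.7,
) -> List[dict]:
    """Blend two score sets and keep the per-component trace for each result."""
    search_norm, recs_norm = min_max(search_scores), min_max(recs_scores)
    results = []
    for doc in set(search_norm) | set(recs_norm):
        parts = {
            "search": search_norm.get(doc, 0.0),
            "recs": recs_norm.get(doc, 0.0),
        }
        score = search_weight * parts["search"] + (1 - search_weight) * parts["recs"]
        results.append({"id": doc, "score": score, "explain": parts})
    # Sort by score descending, then id, so the output is deterministic.
    results.sort(key=lambda r: (-r["score"], r["id"]))
    return results
```

Returning the `explain` parts with each result lets a reviewer see why a listing ranked where it did without rerunning the query.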
9. Measurement and Experimentation (MEDIUM)
- measure-define-session-success-per-surface: one definition per surface
- measure-track-ndcg-mrr-zero-result-rate: three metrics for one picture
- measure-track-reformulation-rate-as-failure-signal: cheapest failure metric
- measure-use-click-models-for-implicit-judgments: scale beyond human judges
- measure-run-interleaving-as-cheap-ab-proxy: 10x less sample needed
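The three metrics named in measure-track-ndcg-mrr-zero-result-rate can be computed with a few lines each. This sketch assumes linear gains with a log2 discount for DCG, and it builds the ideal ranking from the graded results passed in unless a fuller list of judged gains is supplied; both choices are assumptions, not taken from the skill.

```python
import math
from typing import Optional, Sequence


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top k positions (linear gain)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg(gains: Sequence[float], k: int,
         judged_gains: Optional[Sequence[float]] = None) -> float:
    """DCG divided by the DCG of the best possible ordering."""
    pool = judged_gains if judged_gains is not None else gains
    ideal = dcg(sorted(pool, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


def mrr(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Mean reciprocal rank; None means the query had no relevant result."""
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / rank for rank in first_relevant_ranks if rank is not None)
    return total / len(first_relevant_ranks)


def zero_result_rate(result_counts: Sequence[int]) -> float:
    """Share of queries that returned no results at all."""
    if not result_counts:
        return 0.0
    return sum(1 for count in result_counts if count == 0) / len(result_counts)
```

In practice `gains` would come from the golden query set's graded judgments, and the rank and count inputs from the query logs.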
10. Instrumentation, Dashboards and Decision Triggers (MEDIUM)
- monitor-log-every-query-with-full-context: structured replayable events
- monitor-scrub-pii-from-query-logs: redact before warehouse ingestion
- monitor-build-search-health-dashboard: threshold lines, colour bands
- monitor-alert-on-decision-triggers: quality metrics, not error rates
- monitor-track-ranking-stability-churn: RBO churn as leading indicator
- monitor-run-weekly-search-quality-review: calendar-driven ritual
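For monitor-track-ranking-stability-churn, rank-biased overlap (RBO) between yesterday's and today's top results for the same query can be sketched as below. This is the truncated prefix sum without extrapolation, and the `churn` helper rescales it so that identical lists give zero churn; the rescaling, the persistence value `p = 0.9`, and the function names are assumptions made for the example. Each list is assumed to contain no duplicate ids.

```python
from typing import Hashable, Sequence


def rbo_truncated(a: Sequence[Hashable], b: Sequence[Hashable], p: float = 0.9) -> float:
    """Truncated RBO: (1 - p) * sum over depth d of p^(d-1) * overlap(d) / d."""
    depth = min(len(a), len(b))
    seen_a, seen_b = set(), set()
    overlap = 0
    total = 0.0
    for d in range(1, depth + 1):
        x, y = a[d - 1], b[d - 1]
        if x == y:
            overlap += 1
        else:
            # Each new item counts if the other list already contained it.
            overlap += (x in seen_b) + (y in seen_a)
        seen_a.add(x)
        seen_b.add(y)
        total += p ** (d - 1) * overlap / d
    return (1 - p) * total


def churn(a: Sequence[Hashable], b: Sequence[Hashable], p: float = 0.9) -> float:
    """Return 0.0 for identical rankings and 1.0 for fully disjoint ones."""
    depth = min(len(a), len(b))
    if depth == 0:
        raise ValueError("both rankings must be non-empty")
    max_rbo = 1 - p ** depth  # truncated RBO of a list against itself
    return 1.0 - rbo_truncated(a, b, p) / max_rbo
```

Tracking the mean churn over the golden query set between consecutive index or model versions gives a single number that can sit on the dashboard with a threshold line.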
Planning and Improving
Two playbooks compose the rules into end-to-end workflows:
- references/playbooks/planning.md: Plan a new marketplace retrieval system from scratch. Nine-step workflow from intent audit through the first A/B-tested online lift, with explicit exit criteria per step.
- references/playbooks/improving.md: Diagnose and improve an existing retrieval system. Decision tree that walks through telemetry, index freshness, coverage, baseline gap, cold start, segment regressions, and algorithm iteration in that order, with hand-off points to marketplace-personalisation when the bottleneck is personalisation-specific.
Read the playbooks first when the task is "design a new search and recommender project"
or "this retrieval system needs to get better". Read individual rules when a specific
question arises during implementation or review.
How to Use
- Read references/_sections.md for category structure and cascade rationale.
- Read gotchas.md for diagnostic lessons accumulated from prior incidents.
- Read references/playbooks/planning.md to plan a new system.
- Read references/playbooks/improving.md to diagnose an existing one.
- Read individual rule files when a specific task matches the rule title.
- Use assets/templates/_template.md to author new rules as the skill grows.
Related Skills
- marketplace-personalisation: The companion skill covering AWS Personalize implementation, impression tracking, schema design, two-sided matching, feedback loops, and the personalisation-specific diagnostic playbook. Hand off to this skill when the diagnostic identifies a personalisation-specific bottleneck.
Reference Files
| File | Description |
|---|---|
| references/_sections.md | Category definitions and impact ordering |
| references/playbooks/planning.md | Plan a new retrieval system |
| references/playbooks/improving.md | Diagnose an existing retrieval system |
| gotchas.md | Accumulated diagnostic lessons (living) |
| assets/templates/_template.md | Template for authoring new rules |
| metadata.json | Version, discipline, references |