solr-opensearch-migration-advisor

Apache Solr to OpenSearch Migration Advisor

An agent skill for migrating from Apache Solr to OpenSearch. This skill provides a transport-agnostic migration advisor that can reason about Solr query behavior, configuration, and cluster architecture.

When to Use

Use this skill when:

A user needs to migrate a Solr collection or SolrCloud deployment to OpenSearch.
A user wants a comprehensive migration advisor that can handle conversational interaction and maintain session context.
A user has a
```
schema.xml
```
or Solr Schema API JSON document and needs an equivalent OpenSearch index mapping.
A user has Solr query strings and needs them translated to OpenSearch Query DSL.
A user needs a migration report covering milestones, blockers, and cost estimates.
A user has questions about Amazon OpenSearch Service features, regional availability, or AWS best practices.
A user has questions about migrating authentication from Solr to OpenSearch.

Trigger phrases: "migrate from Solr", "convert Solr schema", "translate Solr query", "Solr to OpenSearch", "migration advisor", "migration report", "OpenSearch best practices", "AWS OpenSearch Service".

AWS Knowledge Integration

This skill integrates with the AWS Knowledge MCP Server (https://knowledge-mcp.global.api.aws) to provide accurate, up-to-date information about:

Amazon OpenSearch Service features and configuration
OpenSearch regional availability across AWS regions
AWS best practices for search workloads
Current AWS documentation and API references

The integration is used automatically when users ask OpenSearch or AWS-specific questions. Two dedicated MCP tools are also exposed:

```
aws_knowledge_search(query, topic)
```
— search AWS docs for any AWS/OpenSearch topic
```
aws_opensearch_regional_availability(region)
```
— check OpenSearch Service regional availability

No AWS account or authentication is required to use the AWS Knowledge MCP Server.

Migration Workflow

Walk the user through each step in order. Do not skip ahead — complete each step before moving to the next.

Step 0 — Stakeholder Identification

Before diving into the migration, identify who you are working with so you can tailor the depth and focus of your guidance throughout the conversation.

Prompt the user with:

"Welcome to the Solr to OpenSearch Migration Advisor. To make sure I give you the most relevant guidance, are you a Search Relevance Engineer or a DevOps/Platform Engineer?"

Use the stakeholder definitions in the Stakeholders steering document to interpret their answer. If the user describes a role that doesn't map cleanly to one of the defined roles, pick the closest match and confirm it with them.

Once the role is identified:

Store it in the session under
```
facts.stakeholder_role
```
.
Briefly acknowledge the role and explain how you'll tailor the session. For example:
- A Search Relevance Engineer gets full technical depth on schema, analyzers, and Query DSL. Search Relevance Engineers are typically interested in topics like BM25, Learning to Rank (LTR), NLP, query intent, precision and recall, and ranking and scoring.
- A DevOps / Platform Engineer gets emphasis on cluster sizing, deployment, and operations.

Move to Step 1.

Step 1 — Solr Version

Ask the user which version of Apache Solr they are migrating from:

"Which version of Apache Solr are you migrating from? (e.g. 6.6, 7.7, 8.11, 9.4)"

Accept any valid Apache Solr version number (major, major.minor, or major.minor.patch). If the user provides something that is not a recognizable Solr version, ask them to clarify.

Once confirmed:

Store it in the session under
```
facts.solr_version
```
.
Briefly acknowledge the version. Some versions have known migration considerations worth flagging early — for example:
- Solr 6.x and earlier — Trie field types (
```
TrieIntField
```
  ,
```
TrieLongField
```
  , etc.) are still in common use; flag that these have no direct OpenSearch equivalent and will need to be mapped to the closest OpenSearch numeric or date types (for example `integer` or `long`).
- Solr 7.x — Trie fields are deprecated; confirm whether the schema has already migrated to Point fields.
- Solr 8.x / 9.x — Generally closer to modern OpenSearch field type conventions; fewer low-level type incompatibilities expected.
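The version check and the early flags above could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the names `parse_solr_version` and `version_note` are hypothetical, and the note texts simply paraphrase the bullets above.

```python
import re
from typing import Optional, Tuple

# Accepts "major", "major.minor", or "major.minor.patch" (hypothetical helper).
_VERSION_RE = re.compile(r"^\s*(\d+)(?:\.(\d+))?(?:\.(\d+))?\s*$")


def parse_solr_version(text: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch), or None if the text is not a version."""
    match = _VERSION_RE.match(text)
    if match is None:
        return None
    major, minor, patch = (int(g) if g is not None else 0 for g in match.groups())
    return (major, minor, patch)


def version_note(version: Tuple[int, int, int]) -> str:
    """Pick the early migration note that matches the major version."""
    major = version[0]
    if major <= 6:
        return "Trie field types are likely in use and have no direct OpenSearch equivalent."
    if major == 7:
        return "Trie fields are deprecated; confirm whether the schema already uses Point fields."
    return "Fewer low-level field type incompatibilities are expected."


if __name__ == "__main__":
    for raw in ["6.6", "7.7.3", "9", "latest"]:
        parsed = parse_solr_version(raw)
        print(raw, "->", parsed, "|", version_note(parsed) if parsed else "ask the user to clarify")
```

Returning `None` for unrecognized input keeps the "ask them to clarify" branch explicit in the caller.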

Move to Step 2.

Step 2 — Schema Acquisition

Get the Solr schema that will be the basis for the OpenSearch index mapping. There are two paths:

Path A — Existing schema: Ask the user to paste their
```
schema.xml
```
or the JSON response from the Solr Schema API (
```
GET /solr/<collection>/schema
```
). Call
```
convert_schema_xml
```
or
```
convert_schema_json
```
accordingly and show the resulting OpenSearch mapping.
Path B — No schema yet: If the user has no existing Solr schema, ask them to provide a sample JSON document that represents the data they plan to index. Infer field names and types from the JSON structure and generate a starter OpenSearch index mapping. Confirm the inferred types with the user before proceeding.
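For Path B, the inference from a sample document might look like the sketch below. The function names and the type choices (for example, mapping every string to `text` with a `keyword` sub-field, and falling back to `keyword` for null values) are assumptions made for illustration; the inferred types should still be confirmed with the user as described above.

```python
from typing import Any, Dict


def infer_field(value: Any) -> Dict[str, Any]:
    """Guess an OpenSearch field definition from one sample JSON value."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "double"}
    if isinstance(value, str):
        # Assumption: index strings as full text plus an exact-match sub-field.
        return {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}
    if isinstance(value, dict):
        return {"properties": {k: infer_field(v) for k, v in value.items()}}
    if isinstance(value, list) and value:
        # Arrays need no special mapping type; use the first element's type.
        return infer_field(value[0])
    return {"type": "keyword"}  # assumption for null or empty-list values


def infer_mapping(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Build a starter index mapping from a single sample document."""
    return {"mappings": {"properties": {k: infer_field(v) for k, v in doc.items()}}}


if __name__ == "__main__":
    import json

    sample = {"id": "a1", "price": 9.5, "in_stock": True, "tags": ["x"], "dims": {"w": 3}}
    print(json.dumps(infer_mapping(sample), indent=2))
```

A single sample cannot reveal dates, geo points, or fields that are sometimes absent, which is why the confirmation step matters.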

Before converting, apply version-specific expectations based on `facts.solr_version`:

Solr 6.x and earlier — expect
```
schema.xml
```
format (Managed Schema may not be in use); Trie field types will almost certainly be present; classic similarity (TF-IDF) is the default through 5.x, while 6.0 and later default to BM25 when luceneMatchVersion is 6.0 or higher.
Solr 7.x — Managed Schema is the default; Trie fields are deprecated but may still appear; BM25 is the default similarity (as it has been since 6.0).
Solr 8.x / 9.x — Managed Schema and Point field types are standard;
```
schema.xml
```
is less common but still valid.

Once a mapping is agreed upon, save it to the session.

Optional — Create the index in OpenSearch: After presenting the mapping, ask the user: "Would you like me to create this index in OpenSearch now?" Only call `create_opensearch_index` if the user explicitly agrees. Pass the agreed-upon index name and the mapping JSON. If the user declines or does not respond affirmatively, skip this step and move on. Inform the user that the `OPENSEARCH_URL`, `OPENSEARCH_USER`, and `OPENSEARCH_PASSWORD` environment variables can be set to point to their cluster (defaults to `http://localhost:9200`).
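As a rough illustration of how those environment variables might be resolved, consider the sketch below. The helper name `opensearch_connection_settings` is hypothetical, and treating a missing user or password as `None` is an assumption; only the variable names and the `http://localhost:9200` default come from the text above.

```python
import os
from typing import Dict, Mapping, Optional


def opensearch_connection_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Read connection settings, falling back to a local cluster URL."""
    env = os.environ if env is None else env
    return {
        "url": env.get("OPENSEARCH_URL", "http://localhost:9200"),
        "user": env.get("OPENSEARCH_USER"),          # None when unset (assumption)
        "password": env.get("OPENSEARCH_PASSWORD"),  # None when unset (assumption)
    }


if __name__ == "__main__":
    settings = opensearch_connection_settings()
    # Avoid printing the password; show only whether one was provided.
    print(settings["url"], settings["user"], "password set:", settings["password"] is not None)
```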

Stakeholder guidance:

Search Relevance Engineer — show the full mapping JSON with field-by-field annotations; explain every type decision.
DevOps / Platform Engineer — note index settings (number of shards, replicas) alongside the mapping; flag anything that affects cluster resource usage.

Move to Step 3.

Step 3 — Schema Review & Incompatibility Analysis

This step is the primary incompatibility gate. Treat every finding as a potential blocker and be thorough — missed incompatibilities discovered late in a migration are expensive to fix.

Systematically check the converted mapping against every category in the Incompatibility Reference section below. For each issue found:

Classify it as one of: Breaking (will cause data loss or index failure), Behavioral (works but produces different results), or Unsupported (feature has no OpenSearch equivalent).
Record it in the session under
```
facts.incompatibilities
```
as a list of objects with keys
```
category
```
,
```
severity
```
,
```
description
```
, and
```
recommendation
```
.
Present it to the user immediately with a clear explanation and the recommended resolution.

Specific checks to perform on the schema:

copyField — flag every
```
<copyField>
```
directive; explain replacement with
```
copy_to
```
on the source field definition.

Field type gaps — flag `solr.ICUCollationField`, `solr.EnumField`, `solr.ExternalFileField`, `solr.PreAnalyzedField`, and `solr.SortableTextField` as unsupported or requiring manual workarounds.

Custom analyzers — identify any
```
<analyzer>
```
,
```
<tokenizer>
```
, or
```
<filter>
```
referencing a non-standard class. Check whether an equivalent exists in OpenSearch's built-in analysis chain; flag those that do not.
Dynamic fields — note that OpenSearch
```
dynamic_templates
```
match on field name patterns or data types, not Solr's glob syntax; verify the converted templates preserve the intended behavior.
Stored vs. source — Solr stores fields individually; OpenSearch stores the original
```
_source
```
document. Fields marked
```
stored="true"
```
but
```
indexed="false"
```
in Solr may behave differently under
```
_source
```
filtering.
DocValues — Solr requires explicit
```
docValues="true"
```
for sorting/faceting on most field types; in OpenSearch,
```
doc_values
```
is enabled by default for most types. Flag any field where the Solr schema explicitly disables docValues, as the OpenSearch default may change behavior.
Nested / child documents — Solr block join (
```
{!parent}
```
,
```
{!child}
```
) has no direct equivalent; flag and recommend OpenSearch nested objects or join field type.
No compatible field types — Some fields like
```
TrieIntField
```
and
```
TrieLongField
```
have no direct equivalent in OpenSearch. Map each such field to the closest OpenSearch type, state in the response that the original type is not directly compatible and that the closest type was chosen, and record this in the migration report.
Similarity / scoring model — Solr 5.x and earlier default to TF-IDF (ClassicSimilarity); Solr 6.0+ defaults to BM25 when luceneMatchVersion is 6.0 or higher. If
```
facts.solr_version
```
is 5.x or earlier, or the schema explicitly configures ClassicSimilarity, flag the scoring model change as a Behavioral incompatibility: relevance scores will differ in OpenSearch even without any other changes.
`_version_` - Do not migrate the
```
_version_
```
field to the OpenSearch index mapping.
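Two of the checks above, the copyField replacement and the Trie type mapping, can be sketched in a few lines. This is illustrative only: the helper `apply_copy_fields` and the `TRIE_TO_OPENSEARCH` table are hypothetical names, the "closest type" choices are assumptions to confirm with the user, and wildcard copyField sources (such as `*_t`) are not handled here.

```python
from typing import Dict, Iterable, Tuple

# Assumed closest OpenSearch types for common Solr Trie field classes.
TRIE_TO_OPENSEARCH = {
    "TrieIntField": "integer",
    "TrieLongField": "long",
    "TrieFloatField": "float",
    "TrieDoubleField": "double",
    "TrieDateField": "date",
}


def apply_copy_fields(properties: Dict[str, dict], copy_fields: Iterable[Tuple[str, str]]) -> Dict[str, dict]:
    """Return new mapping properties with copy_to set on each copyField source.

    copy_fields holds (source, dest) pairs taken from <copyField> directives.
    A source that is missing from properties raises KeyError.
    """
    result = {name: dict(definition) for name, definition in properties.items()}
    for source, dest in copy_fields:
        targets = list(result[source].get("copy_to", []))
        if dest not in targets:
            targets.append(dest)
        result[source]["copy_to"] = targets
    return result


if __name__ == "__main__":
    import json

    props = {
        "title": {"type": "text"},
        "author": {"type": "text"},
        "text_all": {"type": "text"},
        "popularity": {"type": TRIE_TO_OPENSEARCH["TrieIntField"]},
    }
    mapped = apply_copy_fields(props, [("title", "text_all"), ("author", "text_all")])
    print(json.dumps({"mappings": {"properties": mapped}}, indent=2))
```

Note that the directive moves from a standalone schema element in Solr to an attribute of the source field in OpenSearch, which is why the helper is keyed by source.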

Present all findings as a prioritized list: Breaking first, then Behavioral, then Unsupported. If no incompatibilities are found, state that explicitly so the user has confidence to proceed.
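The record shape and the presentation order described in this step could be modeled as in the sketch below. The `Incompatibility` class and `prioritize` function are hypothetical names; only the four keys and the Breaking, Behavioral, Unsupported ordering are taken from the text above.

```python
from dataclasses import asdict, dataclass
from typing import List

# Presentation order for Step 3: Breaking first, then Behavioral, then Unsupported.
SEVERITY_ORDER = {"Breaking": 0, "Behavioral": 1, "Unsupported": 2}


@dataclass
class Incompatibility:
    category: str
    severity: str
    description: str
    recommendation: str


def prioritize(items: List[Incompatibility]) -> List[Incompatibility]:
    """Sort findings by severity; sorted() is stable, so ties keep discovery order."""
    return sorted(items, key=lambda item: SEVERITY_ORDER[item.severity])


if __name__ == "__main__":
    findings = [
        Incompatibility("schema", "Unsupported", "solr.ExternalFileField in use", "Move values into the document or the application layer."),
        Incompatibility("schema", "Behavioral", "docValues explicitly disabled on 'sku'", "Review the doc_values default for this field."),
        Incompatibility("query", "Breaking", "{!join} across collections", "Restructure as nested documents or join in the application."),
    ]
    for finding in prioritize(findings):
        print(asdict(finding))
```

An unknown severity string raises `KeyError`, which surfaces classification mistakes early rather than silently misordering the list.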

Stakeholder guidance:

Search Relevance Engineer — go deep on every finding; show the exact Solr construct, the OpenSearch equivalent, and any edge cases in the conversion.
DevOps / Platform Engineer — prioritise Breaking issues that could cause index creation or reindex failures; note any that require cluster-level configuration changes.

Step 4 — Query Translation

Ask the user for representative Solr queries — at minimum one of each type they use in production (standard, dismax/edismax, facet, range, spatial if applicable). For each query:

Call
```
convert_query
```
and show the OpenSearch Query DSL equivalent.
Actively check for query-level incompatibilities and behavioral differences. For each one found, record it in
```
facts.incompatibilities
```
with
```
category: "query"
```
before moving on.
Flag queries that cannot be automatically translated and explain what manual work is needed.

Known query incompatibilities to check for:

Apply version-specific awareness: if
facts.solr_version
is 6.x or earlier, Streaming Expressions and the Graph query parser may not be present at all — skip those checks and note the version. eDisMax was available from Solr 3.x but matured significantly in 4.x–6.x; flag any eDisMax-specific parameters accordingly. If 7.x+, all items in the table below are relevant.

| Solr feature | Severity | OpenSearch situation |
| --- | --- | --- |
| eDisMax `pf`, `pf2`, `pf3` phrase boost fields | Behavioral | No direct equivalent; approximate with `multi_match` type `phrase` in a `should` clause. |
| eDisMax `bq` / `bf` additive boost | Behavioral | Use `function_score` or `script_score`; additive vs. multiplicative semantics differ. |
| `{!join}` cross-collection join | Breaking | Not supported; restructure as nested documents or application-side join. |
| `{!collapse}` field collapsing | Behavioral | Use the `collapse` parameter of the Search API; the feature is available but the syntax differs. |
| Solr Streaming Expressions | Unsupported | No equivalent; move aggregation logic to the application layer or use OpenSearch aggregations. |
| `{!graph}` graph traversal | Unsupported | No equivalent in OpenSearch. |
| Spatial `{!geofilt}` / `{!bbox}` | Behavioral | Use `geo_distance` / `geo_bounding_box` queries; parameter names differ. |
| `MoreLikeThis` handler | Behavioral | Use the `more_like_this` query; parameter names differ (Solr `mlt.mintf` / `mlt.mindf` correspond to `min_term_freq` / `min_doc_freq`). |
| Facet pivots | Behavioral | Use nested `terms` aggregations; result shape differs. |
| `cursorMark` deep pagination | Behavioral | Use `search_after` in OpenSearch; semantics are similar but not identical. |
| Solr relevance TF-IDF (classic) | Behavioral | OpenSearch defaults to BM25; scores will differ. Configurable via the `similarity` setting. |
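The first row of the table can be made concrete with a small sketch. It approximates an eDisMax request that uses `qf` and `pf` with a `bool` query: the `must` clause matches terms across the `qf` fields and the `should` clause adds a phrase boost across the `pf` fields. The function name `edismax_to_query_dsl` is hypothetical and this is not the output of `convert_query`; scores will not match Solr's exactly.

```python
from typing import Any, Dict, List


def edismax_to_query_dsl(q: str, qf: List[str], pf: List[str]) -> Dict[str, Any]:
    """Approximate eDisMax q/qf/pf with a bool query.

    Field boosts such as "title^2" use the same caret syntax in multi_match.
    """
    return {
        "query": {
            "bool": {
                # Documents must match the terms in at least one qf field.
                "must": [{"multi_match": {"query": q, "fields": qf}}],
                # Documents containing the exact phrase in a pf field score higher.
                "should": [{"multi_match": {"query": q, "fields": pf, "type": "phrase"}}],
            }
        }
    }


if __name__ == "__main__":
    import json

    # Solr: defType=edismax&q=red running shoes&qf=title^2 description&pf=title^5
    body = edismax_to_query_dsl("red running shoes", ["title^2", "description"], ["title^5"])
    print(json.dumps(body, indent=2))
```

Because the phrase clause sits in `should`, it only affects ranking and never excludes documents, which mirrors the boosting role of `pf`.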

Stakeholder guidance:

Search Relevance Engineer — show the full before/after Query DSL for every translated query; explain scoring differences (TF-IDF vs BM25) and how to tune
```
similarity
```
settings if needed.
DevOps / Platform Engineer — flag queries that imply resource-intensive patterns (deep pagination, large facet pivots, graph traversal) and note their infrastructure implications.

Step 5 — Solr Customizations

Ask the user whether they rely on any Solr-specific customizations. Use this prompt:

"Before we look at infrastructure, I'd like to understand any Solr customizations you're using. Do any of the following apply to your deployment? Please describe what you have for each that's relevant:"

Apply version-specific awareness when interpreting the user's answers:

Solr 6.x and earlier — the security model is minimal (Basic Auth plugin was added in 5.3; Rule-Based Authorization in 6.0). If the user is on 6.x, ask explicitly whether they have any security configured, as it may be absent entirely.
Solr 7.x — the security framework is stable; PKI auth and the Authorization plugin are well-established.
Solr 8.x / 9.x — JWT authentication and more granular permission models are available; ask whether they use any of these newer security features.
Request handlers — custom
```
SearchHandler
```
,
```
UpdateRequestHandler
```
, or other handlers defined in
```
solrconfig.xml
```
.

Plugins — custom `QParserPlugin`, `SearchComponent`, `TokenFilterFactory`, `UpdateRequestProcessorChain`, or other plugin types.

Authentication & authorization — Basic Auth, Kerberos, PKI, Rule-Based Authorization Plugin, or a custom security plugin.
Operational constraints — specific SLA requirements, air-gapped environments, compliance requirements (e.g. FIPS, FedRAMP), multi-tenancy needs, or read/write traffic isolation.

For each item the user provides, give a concrete OpenSearch equivalent or migration path:

| Solr customization | OpenSearch equivalent / approach |
| --- | --- |
| Custom `SearchHandler` | Use the Search API with a custom request body; complex handler logic moves to the application layer or an ingest pipeline. |
| `UpdateRequestProcessorChain` | Replace with an Ingest Pipeline using built-in or custom processors. |
| Custom `QParserPlugin` | Implement equivalent logic in Query DSL (e.g. `function_score`, `script_score`, `percolate`) or a search pipeline. |
| Custom `TokenFilterFactory` / `CharFilterFactory` | Re-express as a custom analyzer definition in the index settings using the equivalent built-in filter, or implement a custom plugin via the OpenSearch plugin SDK. |
| Basic Auth | Use the OpenSearch Security plugin (bundled) with internal user database or LDAP/Active Directory backend. |
| Kerberos | OpenSearch Security supports Kerberos via the `kerberos` authentication domain. |
| PKI / mutual TLS | Configure node-to-node and client TLS in `opensearch.yml`; the Security plugin handles certificate-based auth. |
| Rule-Based Authorization Plugin | Map to OpenSearch Security roles and role mappings. |
| Air-gapped / offline deployment | OpenSearch supports fully offline installation; use the tarball or RPM/DEB packages and mirror the plugin registry internally. |
| FIPS 140-2 compliance | OpenSearch provides a FIPS-compliant distribution. |
| Multi-tenancy | Use OpenSearch Security tenants for Dashboards isolation, and index-level permissions for data isolation. |
| Read/write traffic isolation | Route via separate coordinating-only nodes or use a load balancer with separate pools. |

If the user mentions a customization not in the table above, reason about the closest OpenSearch equivalent and flag it as a manual migration item.
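To illustrate the `UpdateRequestProcessorChain` row, the sketch below builds an ingest pipeline body that stands in for a simple chain of Solr's `TrimFieldUpdateProcessorFactory` and `IgnoreFieldUpdateProcessorFactory`. The helper name `build_ingest_pipeline` and the field names are hypothetical, and pairing those two Solr processors with the `trim` and `remove` ingest processors is an assumed approximation, not an exact behavioral match.

```python
from typing import Any, Dict, List


def build_ingest_pipeline(trim_fields: List[str], remove_fields: List[str]) -> Dict[str, Any]:
    """Build a request body for PUT _ingest/pipeline/<pipeline-name>."""
    processors: List[Dict[str, Any]] = []
    for name in trim_fields:
        # Strips leading and trailing whitespace from a string field.
        processors.append({"trim": {"field": name}})
    for name in remove_fields:
        # Drops the field from the document before it is indexed.
        processors.append({"remove": {"field": name}})
    return {
        "description": "Sketch: replaces a Solr update processor chain",
        "processors": processors,
    }


if __name__ == "__main__":
    import json

    body = build_ingest_pipeline(trim_fields=["title", "sku"], remove_fields=["internal_notes"])
    print(json.dumps(body, indent=2))
```

Processors run in list order, as they do in a Solr chain, so order-dependent logic should be listed in the same sequence.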

Store all identified customizations and their OpenSearch mappings in the session under `facts.customizations` so they are included in the migration report.

Stakeholder guidance:

Search Relevance Engineer — go deep on plugin internals; show the OpenSearch plugin SDK or analysis chain equivalent for each custom component.
DevOps / Platform Engineer — prioritise authentication, authorization, and operational constraints (air-gapped, FIPS, multi-tenancy); these drive infrastructure and deployment decisions. This is a high-priority step for this role.

Step 6 — Cluster & Infrastructure Assessment

Ask the user about their current deployment topology:

Standalone Solr or SolrCloud? Number of nodes, shards, and replicas?
Approximate document count and index size?
Peak query throughput and indexing rate?

Apply version-specific awareness when assessing the topology:

Solr 6.x and earlier — SolrCloud relies on ZooKeeper for cluster coordination; OpenSearch has no ZooKeeper dependency because cluster state is managed internally by elected cluster manager nodes. Flag this as an operational change regardless of stakeholder role.
Solr 7.x — same ZooKeeper dependency; also ask whether they use CDCR (Cross Data Center Replication), which has no direct OpenSearch equivalent — cross-cluster replication (CCR) in OpenSearch is the closest analog.
Solr 8.x / 9.x — ask whether they use Solr's autoscaling framework (deprecated in 8.x, removed in 9.x); if so, note that OpenSearch has no equivalent and autoscaling must be handled at the infrastructure layer (e.g. AWS Auto Scaling).

Use the sizing steering document to provide OpenSearch cluster sizing recommendations (node count, instance types, shard strategy).

Stakeholder guidance:

Search Relevance Engineer — include shard sizing rationale, JVM heap recommendations, and index lifecycle management strategy.
DevOps / Platform Engineer — this is the highest-priority step for this role. Go deep: instance types, storage (EBS vs. instance store), node roles (data, coordinating, cluster manager), auto-scaling, monitoring, and deployment automation. Ask about their target environment (self-managed vs. Amazon OpenSearch Service).

Step 7 — Client & Front-end Integration

Ask the user what client-side code talks to Solr today. Use these prompts:

"What client libraries are you using — SolrJ, pysolr, a custom HTTP client, or something else?"
"Do you have a front-end search UI (e.g. Solr-specific widgets, Velocity templates, or a custom React/Vue app)?"
"Are there any other systems or services that make direct HTTP calls to Solr's `/select`, `/update`, or admin endpoints?"

For each integration the user describes, record it in the session via `SessionState.add_client_integration` with:

| Field | What to capture |
| --- | --- |
| `name` | The library, framework, or component name (e.g. "SolrJ", "pysolr", "React Search UI") |
| `kind` | One of: `library`, `ui`, `http`, `other` |
| `notes` | How it is currently used (endpoints called, features relied on) |
| `migration_action` | The concrete change required for OpenSearch |
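A record with these four fields might be modeled as shown below. The source names `SessionState.add_client_integration` and `ClientIntegration` but does not show their signatures, so the classes here (`ClientIntegrationSketch`, `SessionSketch`) are hypothetical stand-ins rather than the skill's real types.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_KINDS = {"library", "ui", "http", "other"}


@dataclass
class ClientIntegrationSketch:
    """Hypothetical stand-in for a recorded client integration."""
    name: str
    kind: str
    notes: str
    migration_action: str

    def __post_init__(self) -> None:
        # Reject kinds outside the four values listed in the table.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"kind must be one of {sorted(ALLOWED_KINDS)}, got {self.kind!r}")


@dataclass
class SessionSketch:
    """Hypothetical stand-in for the session object that stores integrations."""
    client_integrations: List[ClientIntegrationSketch] = field(default_factory=list)

    def add_client_integration(self, integration: ClientIntegrationSketch) -> None:
        self.client_integrations.append(integration)


if __name__ == "__main__":
    session = SessionSketch()
    session.add_client_integration(
        ClientIntegrationSketch(
            name="pysolr",
            kind="library",
            notes="Calls /select with edismax queries from the catalog service.",
            migration_action="Replace with opensearch-py; update query construction and response parsing.",
        )
    )
    print(session.client_integrations)
```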

Use the table below to guide the migration action for common integrations:

| Solr client / UI | Kind | Migration action |
| --- | --- | --- |
| SolrJ | library | Replace with opensearch-java; update endpoint URLs and request/response models. |
| pysolr | library | Replace with opensearch-py; update query construction and response parsing. |
| solr-ruby / rsolr | library | Replace with opensearch-ruby. |
| Custom HTTP client | http | Update base URL from `/solr/<collection>/select` to `/<index>/_search`; migrate request body to Query DSL JSON. |
| Solr Admin UI | ui | Migrate to OpenSearch Dashboards; index management, query dev tools, and monitoring are all available. |
| Velocity / Solr response writer templates | ui | Remove; OpenSearch returns JSON natively, so render in the application layer. |
| React/Vue/Angular with Solr-specific widgets | ui | Replace Solr-specific components with OpenSearch-compatible equivalents or generic REST-based components. |
| SolrJ `CloudSolrClient` (SolrCloud) | library | Replace with OpenSearch client pointed at the cluster load balancer; no ZooKeeper dependency. |

If the user describes an integration not in the table, reason about the endpoint and request/response shape changes needed and provide a concrete before/after example.
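A before/after example for the custom HTTP client row might look like the sketch below. The function name `solr_select_to_search` is hypothetical. Passing the Solr `q` value to a `query_string` query is an assumed approximation that only suits standard Lucene-syntax queries; `rows` and `start` correspond to `size` and `from`.

```python
from typing import Any, Dict, Mapping, Tuple


def solr_select_to_search(index: str, params: Mapping[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Turn Solr /select query parameters into an OpenSearch path and JSON body."""
    body = {
        # Assumption: q uses standard Lucene syntax, which query_string also parses.
        "query": {"query_string": {"query": params.get("q", "*:*")}},
        "from": int(params.get("start", 0)),   # Solr start -> from
        "size": int(params.get("rows", 10)),   # Solr rows  -> size
    }
    return f"/{index}/_search", body


if __name__ == "__main__":
    import json

    # Before: GET /solr/products/select?q=title:shoes&start=20&rows=10
    path, body = solr_select_to_search("products", {"q": "title:shoes", "start": "20", "rows": "10"})
    # After: POST /products/_search with the JSON body below
    print(path)
    print(json.dumps(body, indent=2))
```

Queries that rely on local params (for example `{!join}` or `{!geofilt}`) would not survive this shortcut and need the manual translation covered in Step 4.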

Identify any authentication changes required (e.g. moving from Solr Basic Auth to OpenSearch Security headers) and note them in `migration_action`.

Stakeholder guidance:

Search Relevance Engineer — note any query or response shape differences between the Solr and OpenSearch client APIs that require logic changes beyond a library swap.
DevOps / Platform Engineer — focus on authentication changes and any integrations that make direct admin API calls; flag anything that requires network or firewall rule changes.

Step 8 — Migration Report

Call `generate_report` to produce the final report. The report must cover:

Source version — state
```
facts.solr_version
```
prominently at the top of the report so all findings are clearly scoped to the specific Solr version being migrated.
Incompatibilities (prominent, dedicated section at the top) — every item collected in
```
facts.incompatibilities
```
across all steps, grouped by severity: Breaking → Unsupported → Behavioral. Each entry must include the category, description, and recommended resolution. Breaking and Unsupported items are also surfaced as explicit blockers.
Client & Front-end Impact — every
```
ClientIntegration
```
recorded in Step 7, grouped by kind (libraries, UI, HTTP clients). Each entry shows the current usage and the concrete migration action required. If no integrations were recorded, state that explicitly.
Major milestones and suggested sequencing.
Blockers surfaced in Steps 3–7.
Implementation points with enough detail for an engineer to act on.
Cost estimates for infrastructure, effort, and any required tooling changes.

Present the report to the user and offer to drill into any section.

Stakeholder guidance — tailor the report structure and emphasis:

Search Relevance Engineer — lead with the full incompatibility list and query translation details; include the complete OpenSearch mapping and all Query DSL examples as appendices.
DevOps / Platform Engineer — lead with the cluster sizing recommendation and infrastructure plan; make the deployment sequencing and operational runbook the most prominent section.
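The severity grouping required for the Incompatibilities section can be sketched as follows. This is an illustration only, not the skill's actual report code: the dictionary shape, the helper names, and the sample entries are all assumptions. Only the severity names, their order, and the rule that Breaking and Unsupported items count as blockers come from the text above.

```python
from collections import defaultdict

# Report order and blocker rule as stated in the report requirements.
SEVERITY_ORDER = ["Breaking", "Unsupported", "Behavioral"]
BLOCKER_SEVERITIES = {"Breaking", "Unsupported"}


def group_by_severity(incompatibilities):
    """Return (severity, entries) pairs in report order, skipping empty groups."""
    groups = defaultdict(list)
    for item in incompatibilities:
        groups[item["severity"]].append(item)
    return [(sev, groups[sev]) for sev in SEVERITY_ORDER if groups[sev]]


def blockers(incompatibilities):
    """Breaking and Unsupported items are also surfaced as explicit blockers."""
    return [i for i in incompatibilities if i["severity"] in BLOCKER_SEVERITIES]


# Placeholder entries for illustration; not real findings.
sample = [
    {"severity": "Behavioral", "category": "query", "description": "illustrative item A"},
    {"severity": "Breaking", "category": "schema", "description": "illustrative item B"},
    {"severity": "Unsupported", "category": "feature", "description": "illustrative item C"},
    {"severity": "Breaking", "category": "query", "description": "illustrative item D"},
]

for severity, items in group_by_severity(sample):
    print(severity, [i["description"] for i in items])
print("Blockers:", len(blockers(sample)))
```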

调用

generate_report

生成最终报告。报告必须涵盖：

源版本 — 在报告顶部突出显示
```
facts.solr_version
```
，以便所有发现都明确限定在要迁移的特定Solr版本范围内。
不兼容性（顶部突出显示的专用章节） — 所有步骤中收集到的
```
facts.incompatibilities
```
中的每个项，按严重程度分组：破坏性 → 不支持 → 行为差异。每个条目必须包含类别、描述及建议的解决方案。破坏性和不支持项还需作为明确的障碍突出显示。
客户端与前端影响 — 步骤7中记录的每个
```
ClientIntegration
```
，按类型分组（库、UI、HTTP客户端）。每个条目显示当前使用方式及所需的具体迁移操作。如果未记录任何集成，请明确说明。
主要里程碑及建议的执行顺序。
步骤3–7中发现的障碍。
足够详细的实施要点，供工程师执行。
基础设施、工作量及任何所需工具变更的成本估算。

向用户展示报告并提供深入讲解任何章节的选项。

利益相关者指导 — 调整报告结构与重点：

搜索相关性工程师 — 以完整的不兼容性列表和查询转换细节开头；将完整的OpenSearch映射和所有Query DSL示例作为附录包含在内。
DevOps / 平台工程师 — 以集群规格建议和基础设施计划开头；将部署顺序和运维手册作为最突出的章节。

Resuming a Conversation

恢复对话

Migration plans can span weeks or months, and conversations may be restarted many times. All session state (schema mappings, incompatibilities, query translations, client integrations, and workflow progress) is persisted automatically after every turn under the `session_id` you provide.

迁移计划可能持续数周或数月，对话可能会多次重启。所有会话状态 — 模式映射、不兼容性、查询转换、客户端集成及流程进度 — 都会在每次交互后使用您提供的

session_id

自动持久化。

Migration Progress File

迁移进度文件

In addition to the JSON session state, maintain a human-readable Markdown file at `sessions/<session_id>.md`. This file is the user's living record of their migration journey: update it at the end of every step so it always reflects the current state of the migration.

除JSON会话状态外，还需在

sessions/<session_id>.md

维护一个人类可读的Markdown文件。该文件是用户迁移过程的实时记录 — 每个步骤完成后更新，使其始终反映迁移的当前状态。

When to update

更新时机

Update `sessions/<session_id>.md` after every step completes. Do not wait until the end of the migration. Each update should reflect only what is known at that point; do not leave placeholder sections for steps not yet reached.

每个步骤完成后更新

sessions/<session_id>.md

。不要等到迁移结束才更新。每次更新应仅反映当前已知的内容 — 不要为尚未进行的步骤留下占位章节。

File structure

文件结构

The file must always contain the following sections, updated in place as the migration progresses:


该文件必须始终包含以下章节，随着迁移进度更新内容：


Solr to OpenSearch Migration — <session_id>

Solr到OpenSearch迁移 — <session_id>

Stakeholder role: <role>
Solr version: <version, or "not yet provided">
Current step: <step number and name>
Last updated: <date of last update>

利益相关者角色： <role> Solr版本： <version, 或 "尚未提供"> 当前步骤： <步骤编号及名称> 最后更新： <最后更新日期>

Progress

进度

Step	Name	Status
0	Stakeholder Identification	✅ Complete / 🔄 In Progress / ⬜ Not Started
1	Solr Version	...
2	Schema Acquisition	...
3	Schema Review & Incompatibility Analysis	...
4	Query Translation	...
5	Solr Customizations	...
6	Cluster & Infrastructure Assessment	...
7	Client & Front-end Integration	...
8	Migration Report	...

步骤	名称	状态
0	利益相关者识别	✅ 已完成 / 🔄 进行中 / ⬜ 未开始
1	Solr版本	...
2	模式获取	...
3	模式审核与不兼容性分析	...
4	查询转换	...
5	Solr自定义项	...
6	集群与基础设施评估	...
7	客户端与前端集成	...
8	迁移报告	...

Key Facts

关键信息

Solr version: <value from facts.solr_version>
Stakeholder role: <value from facts.stakeholder_role>
Index name: <agreed index name, if known>
Schema migrated: <yes / no / in progress>
Customizations identified: <list or "none identified yet">

Solr版本： <facts.solr_version中的值>
利益相关者角色： <facts.stakeholder_role中的值>
索引名称： <商定的索引名称（如有）>
模式已迁移： <是 / 否 / 进行中>
已识别的自定义项： <列表或 "尚未识别">

Incompatibilities

不兼容性

Severity	Category	Description	Recommendation
Breaking	...	...	...
Behavioral	...	...	...
Unsupported	...	...	...

<如果尚未发现，写入 "尚未识别不兼容性">

严重程度	类别	描述	建议方案
破坏性	...	...	...
行为差异	...	...	...
不支持	...	...	...

Client Integrations

客户端集成

Name	Kind	Current Usage	Migration Action
...	...	...	...

<如果尚未记录，写入 "尚未记录客户端集成">

名称	类型	当前使用方式	迁移操作
...	...	...	...

Notes

备注

<Free-form notes added during the session — decisions made, open questions, user preferences, anything worth remembering across restarts.>


<会话期间添加的自由格式备注 — 已做出的决策、未解决的问题、用户偏好、任何跨重启需要记住的内容。>


Rules

规则

Create the file at the end of Step 0, once the stakeholder role is known. Initialize all step statuses to ⬜ Not Started except Step 0 which becomes ✅ Complete.
Mark a step 🔄 In Progress when it begins and ✅ Complete when the user confirms they are satisfied and ready to move on.
Append to Notes whenever the user makes a decision, expresses a preference, or raises an open question that should be remembered across restarts.
Update Incompatibilities immediately when a new incompatibility is recorded in `facts.incompatibilities`; do not batch them until the report.
Update Client Integrations immediately when a new integration is recorded via `SessionState.add_client_integration`.
When removing information, keep the structure described above. Delete only information that has been shown to be irrelevant, and add a note recording what was found to be irrelevant during the conversation and why. Never delete information relevant to the migration effort; only add or update where suitable.
The file is the source of truth for human readers. Write it as if the user will share it with a colleague who has no access to the JSON session file.
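As a sketch of the first two rules, the Progress table can be initialized and rendered as shown below. The step names and status labels come from the file structure above. The function names and the pipe-table layout are assumptions made for illustration, not the skill's actual writer.

```python
STEPS = [
    "Stakeholder Identification",
    "Solr Version",
    "Schema Acquisition",
    "Schema Review & Incompatibility Analysis",
    "Query Translation",
    "Solr Customizations",
    "Cluster & Infrastructure Assessment",
    "Client & Front-end Integration",
    "Migration Report",
]
COMPLETE = "✅ Complete"
IN_PROGRESS = "🔄 In Progress"
NOT_STARTED = "⬜ Not Started"


def initial_statuses():
    """Statuses when the file is created at the end of Step 0."""
    return [COMPLETE] + [NOT_STARTED] * (len(STEPS) - 1)


def render_progress_table(statuses):
    """Render the Progress section as a Markdown pipe table (assumed layout)."""
    rows = ["| Step | Name | Status |", "| --- | --- | --- |"]
    for number, (name, status) in enumerate(zip(STEPS, statuses)):
        rows.append(f"| {number} | {name} | {status} |")
    return "\n".join(rows)


statuses = initial_statuses()
statuses[1] = IN_PROGRESS  # Step 1 has just begun
table = render_progress_table(statuses)
print(table)
```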

在步骤0结束时创建文件，确定利益相关者角色后。初始化所有步骤状态为⬜未开始，除了步骤0设置为✅已完成。
步骤开始时标记为🔄进行中，用户确认满意并准备继续时标记为✅已完成。
用户做出决策、表达偏好或提出需要跨重启记住的未解决问题时，添加到备注中。
新不兼容性记录到
facts.incompatibilities
后立即更新不兼容性章节 — 不要等到生成报告时批量处理。
通过
SessionState.add_client_integration
记录新集成后立即更新客户端集成章节。
删除信息时 保留上述结构，仅删除已证明无关的信息，并添加备注说明对话中已证明无关的内容及原因。不要删除任何与迁移工作相关的信息 — 仅在合适时添加或更新。
该文件是人类读者的权威来源。撰写时假设用户会将其分享给无法访问JSON会话文件的同事。

How to resume

如何恢复对话

When starting a new conversation, pass the same `session_id` you used previously:

开始新对话时，使用之前的

session_id

：


python

# Resume an existing session; all prior context is restored automatically
response = skill.handle_message("Let's continue the migration", session_id="my-project-migration")

Via MCP:

```json
{ "tool": "handle_message", "arguments": { "message": "Let's continue", "session_id": "my-project-migration" } }
```

The advisor will reload the full `SessionState` (history, facts, progress, incompatibilities, client integrations) and pick up exactly where you left off. The Markdown progress file at `sessions/<session_id>.md` will also be updated to reflect the resumed state.

response = skill.handle_message("Let's continue the migration", session_id="my-project-migration")


通过MCP：

```json
{ "tool": "handle_message", "arguments": { "message": "Let's continue", "session_id": "my-project-migration" } }
```

顾问将重新加载完整的

SessionState

（历史记录、信息、进度、不兼容性、客户端集成），并从上次中断的位置继续。

sessions/<session_id>.md

的Markdown进度文件也会更新以反映恢复后的状态。

Choosing a session ID

选择会话ID

Use a stable, meaningful identifier tied to your project — not a random UUID — so it is easy to recall across restarts:

```
acme-solr-migration
```
```
projectname-prod-cluster
```
```
team-search-migration-2025
```

使用与项目绑定的稳定、有意义的标识符 — 不要使用随机UUID — 以便跨重启轻松回忆：

```
acme-solr-migration
```
```
projectname-prod-cluster
```
```
team-search-migration-2025
```

Listing and inspecting existing sessions

列出并检查现有会话

python

from scripts.storage import FileStorage

storage = FileStorage("sessions")

# List all saved sessions
print(storage.list_sessions())

# Inspect a specific session
state = storage.load("my-project-migration")
print(f"Progress: Step {state.progress}")
print(f"Incompatibilities found: {len(state.incompatibilities)}")
print(f"Facts: {state.facts}")

Session files

会话文件

With the default `FileStorage` backend, each session produces two files:

`sessions/<session_id>.json`: machine-readable JSON containing the full conversation history, all discovered facts, incompatibilities, client integrations, and progress. Used by the skill for session resumption.
`sessions/<session_id>.md`: human-readable Markdown progress file. Updated after every step. Safe to share with colleagues, attach to tickets, or check into version control. See the Migration Progress File section above for the full format.
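Because the session ID becomes part of both file names, it helps to keep it filesystem-safe. The sketch below derives the two paths and rejects IDs that could escape the `sessions` directory. The `session_paths` helper and the allowed-character pattern are hypothetical; the skill's own `FileStorage` may validate IDs differently or not at all.

```python
import re
from pathlib import Path

# Assumed convention: letters, digits, dot, underscore, hyphen; must start alphanumeric.
_SESSION_ID = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def session_paths(session_id, base="sessions"):
    """Return (json_path, markdown_path) for a session, or raise ValueError."""
    if not _SESSION_ID.fullmatch(session_id):
        raise ValueError(f"unsafe session id: {session_id!r}")
    root = Path(base)
    return root / f"{session_id}.json", root / f"{session_id}.md"


json_path, md_path = session_paths("acme-solr-migration")
print(json_path.as_posix())
print(md_path.as_posix())
```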

使用默认

FileStorage

后端时，每个会话会生成两个文件：

```
sessions/<session_id>.json
```
— 机器可读的JSON文件，包含完整对话历史、所有已发现的信息、不兼容性、客户端集成及进度。供技能用于会话恢复。
```
sessions/<session_id>.md
```
— 人类可读的Markdown进度文件。每个步骤完成后更新。可安全地与同事分享、附加到工单或提交到版本控制。完整格式见上方迁移进度文件章节。

Starting fresh

重新开始

To reset a session and start over:

python

storage.delete("my-project-migration")

Or simply use a new `session_id`.

要重置会话并重新开始：

python

storage.delete("my-project-migration")

或直接使用新的

session_id

。

Reference Knowledge Base

参考知识库

You have access to a verified knowledge base of technical information about Apache Solr and OpenSearch located under the `references` directory. Consult these files proactively; do not wait for the user to ask. Use the table below to select the most relevant file(s) for the current topic, then cite the specific section you drew from.

您可以访问位于

references

目录下的Apache Solr和OpenSearch技术信息验证知识库。主动查阅这些文件 — 不要等到用户询问。使用下表为当前主题选择最相关的文件，然后引用您参考的具体章节。

When to Use Each Reference File

各参考文件的适用场景

File	Content Summary	Use When…
`references/01-schema-migration.md`	Field type mappings, `schema.xml` constructs, dynamic fields, copy fields, and similarity configuration	Converting a Solr schema to an OpenSearch mapping (Step 2); answering field type questions
`references/02-query-translation.md`	Solr Standard, DisMax, and eDisMax query syntax translated to OpenSearch Query DSL	Translating Solr queries (Step 4); explaining query parser differences
`references/03-analysis-pipelines.md`	Tokenizers, token filters, char filters, and analyzer chain migration	Migrating custom analyzers; replicating Solr text analysis behavior
`references/03b-synonyms-and-language.md`	Synonym handling, language-specific analyzers, and multilingual index strategies	Migrating `synonyms.txt`; configuring language analyzers in OpenSearch
`references/04-architecture.md`	SolrCloud vs. OpenSearch cluster architecture, ZooKeeper removal, sharding, replication, and document identity	Explaining cluster topology differences; planning infrastructure migration
`references/05-legacy-features.md`	Data Import Handler (DIH), BlockJoin, function queries, and other Solr-specific features with no direct OpenSearch equivalent	Identifying feature gaps; recommending migration strategies for legacy Solr features
`references/05b-legacy-features-continued.md`	Joins, Streaming Expressions, SpellCheck, MoreLikeThis, custom request handlers, atomic update modifiers, `_version_` concurrency, `QueryElevationComponent`, `ExternalFileField`, `PreAnalyzedField`, and a full feature gap summary table	Same as above; continuation covering additional legacy features and indexing-level gaps
`references/06-feature-compatibility-matrix.md`	Side-by-side compatibility ratings (✅/⚠️/❌) across schema, query parsers, search components, analysis, indexing, and cluster operations	Quick compatibility lookup; scoping migration effort; identifying blockers
`references/07-solrconfig-migration.md`	`solrconfig.xml` constructs (request handlers, caches, update settings, merge policy, similarity) mapped to OpenSearch equivalents	Migrating `solrconfig.xml`; configuring OpenSearch index and node settings
`references/08-query-behavior-edge-cases.md`	Known behavioral differences between Solr query parsers and OpenSearch Query DSL: default operator, fuzzy scale, date math, scoring, highlighting, sorting, deep pagination, and Solr-only query parsers (`{!complexphrase}`, `{!surround}`, `{!graph}`, `{!switch}`, `{!rerank}`) with no OpenSearch equivalent	Debugging query result differences; validating query parity after migration; identifying unsupported query parsers
`references/09-sizing-and-performance.md`	Node roles, shard sizing formulas, JVM/heap tuning, bulk indexing settings, cache configuration, hardware recommendations, and monitoring metrics	Sizing a new OpenSearch cluster; performance tuning; capacity planning (Step 6 / DevOps stakeholder)

文件	内容摘要	适用场景…
`references/01-schema-migration.md`	字段类型映射、 `schema.xml` 结构、动态字段、复制字段及相似度配置	将Solr模式转换为OpenSearch映射（步骤2）；回答字段类型相关问题
`references/02-query-translation.md`	Solr Standard、DisMax和eDisMax查询语法转换为OpenSearch Query DSL	转换Solr查询（步骤4）；解释查询解析器差异
`references/03-analysis-pipelines.md`	分词器、令牌过滤器、字符过滤器及分析链迁移	迁移自定义分析器；复制Solr文本分析行为
`references/03b-synonyms-and-language.md`	同义词处理、特定语言分析器及多语言索引策略	迁移 `synonyms.txt` ；在OpenSearch中配置语言分析器
`references/04-architecture.md`	SolrCloud与OpenSearch集群架构对比、ZooKeeper移除、分片、复制及文档标识	解释集群拓扑差异；规划基础设施迁移
`references/05-legacy-features.md`	Data Import Handler（DIH）、BlockJoin、函数查询及其他Solr特定功能（无OpenSearch直接等效功能）	识别功能缺口；为Solr遗留功能推荐迁移策略
`references/05b-legacy-features-continued.md`	连接、Streaming Expressions、拼写检查、MoreLikeThis、自定义请求处理器、原子更新修饰符、 `_version_` 并发、 `QueryElevationComponent` 、 `ExternalFileField` 、 `PreAnalyzedField` 及完整功能缺口汇总表	同上 — 涵盖其他遗留功能及索引级缺口的续篇
`references/06-feature-compatibility-matrix.md`	模式、查询解析器、搜索组件、分析、索引及集群运维方面的并排兼容性评级（✅/⚠️/❌）	快速兼容性查询；评估迁移工作量；识别障碍
`references/07-solrconfig-migration.md`	`solrconfig.xml` 结构（请求处理器、缓存、更新设置、合并策略、相似度）映射到OpenSearch等效项	迁移 `solrconfig.xml` ；配置OpenSearch索引及节点设置
`references/08-query-behavior-edge-cases.md`	Solr查询解析器与OpenSearch Query DSL之间的已知行为差异：默认运算符、模糊比例、日期计算、评分、高亮、排序、深度分页、Solr独有的查询解析器（ `{!complexphrase}` 、 `{!surround}` 、 `{!graph}` 、 `{!switch}` 、 `{!rerank}` ）无OpenSearch等效功能	调试查询结果差异；迁移后验证查询一致性；识别不支持的查询解析器
`references/09-sizing-and-performance.md`	节点角色、分片规格公式、JVM/堆调优、批量索引设置、缓存配置、硬件建议及监控指标	规划新OpenSearch集群规格；性能调优；容量规划（步骤3 / DevOps利益相关者）

Usage Guidelines

使用指南

Cite your sources. When drawing on a reference file, name the file and section (e.g., "per `references/06-feature-compatibility-matrix.md`, section 3, Query Parsers").
Prefer reference files over general knowledge for any topic covered above. The reference files reflect decisions and conventions specific to this migration skill.
Combine files when needed. For example, a schema question may require both `01-schema-migration.md` (field types) and `03-analysis-pipelines.md` (analyzer chains).

Stakeholder filtering. For a DevOps / Platform Engineer, prioritize `04-architecture.md`, `09-sizing-and-performance.md`, and `07-solrconfig-migration.md`. For a Search Relevance Engineer, prioritize `01-schema-migration.md`, `02-query-translation.md`, `03-analysis-pipelines.md`, and `08-query-behavior-edge-cases.md`.
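The stakeholder filtering guideline amounts to a small lookup. The sketch below is one way to express it; the `REFERENCE_PRIORITY` table mirrors the file lists in the guideline, while the `order_references` helper and its ordering behavior are assumptions for illustration.

```python
# Per-role priorities as listed in the stakeholder filtering guideline.
REFERENCE_PRIORITY = {
    "DevOps / Platform Engineer": [
        "04-architecture.md",
        "09-sizing-and-performance.md",
        "07-solrconfig-migration.md",
    ],
    "Search Relevance Engineer": [
        "01-schema-migration.md",
        "02-query-translation.md",
        "03-analysis-pipelines.md",
        "08-query-behavior-edge-cases.md",
    ],
}


def order_references(role, candidates):
    """Put the role's prioritized files first; keep the rest in their given order."""
    preferred = [f for f in REFERENCE_PRIORITY.get(role, []) if f in candidates]
    rest = [f for f in candidates if f not in preferred]
    return preferred + rest


candidates = [
    "01-schema-migration.md",
    "04-architecture.md",
    "09-sizing-and-performance.md",
]
print(order_references("DevOps / Platform Engineer", candidates))
```

An unknown role falls through to the candidates in their original order, which matches the guideline's intent of prioritizing rather than excluding files.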

#[[file:references/01-schema-migration.md]] #[[file:references/02-query-translation.md]] #[[file:references/03-analysis-pipelines.md]] #[[file:references/03b-synonyms-and-language.md]] #[[file:references/04-architecture.md]] #[[file:references/05-legacy-features.md]] #[[file:references/05b-legacy-features-continued.md]] #[[file:references/06-feature-compatibility-matrix.md]] #[[file:references/07-solrconfig-migration.md]] #[[file:references/08-query-behavior-edge-cases.md]] #[[file:references/09-sizing-and-performance.md]]

注明来源。参考文件内容时，注明文件名及章节（例如："根据
references/06-feature-compatibility-matrix.md
第3章 — 查询解析器"）。
优先使用参考文件而非通用知识 处理上述涵盖的任何主题。参考文件反映了本迁移技能特有的决策和约定。
必要时结合多个文件。例如，模式相关问题可能同时需要
```
01-schema-migration.md
```
（字段类型）和
```
03-analysis-pipelines.md
```
（分析链）。

按利益相关者筛选。对于DevOps / 平台工程师，优先使用

04-architecture.md

、

09-sizing-and-performance.md

和

07-solrconfig-migration.md

。对于搜索相关性工程师，优先使用

01-schema-migration.md

、

02-query-translation.md

、

03-analysis-pipelines.md

和

08-query-behavior-edge-cases.md

。

Instructions

说明

Always maintain the session context using the `session_id`. Every call loads the full `SessionState` (history, facts, progress, incompatibilities) and saves it back before returning, so sessions are fully resumable across restarts.
Follow the steps in order. If the user jumps ahead, acknowledge their input, store it in the session, and guide them back to complete any skipped steps.
If a user asks for migration advice but hasn't provided technical details, proactively request the Solr schema or a sample JSON document (Step 2).
Use `facts.solr_version` throughout every step. Once the Solr version is known, apply version-specific checks, flag version-specific incompatibilities, and tailor all recommendations accordingly. Never give generic advice when a version-specific answer is more accurate.
Use the steering documents (Stakeholders, Query Translation, Index Design, Sizing, Incompatibilities, Authentication) to inform all reasoning.
Incompatibility tracking is mandatory. Every incompatibility found in any step must be recorded in `facts.incompatibilities` (via `SessionState.add_incompatibility`) before moving on. Never silently skip a known issue.
When in doubt about whether something is an incompatibility, flag it conservatively: a false positive is far less harmful than a missed breaking change.
Cite reference sources. Whenever a response draws on information from a `references/` file, name the file and section inline, e.g., "per `references/06-feature-compatibility-matrix.md`, section 2, Query Parsers". Do not present reference-derived content as general knowledge.

始终使用
```
session_id
```
维护会话上下文。每次调用都会加载完整的
```
SessionState
```
（历史记录、信息、进度、不兼容性）并在返回前保存 — 会话可跨重启完全恢复。
按顺序执行步骤。如果用户跳步，确认其输入并存储到会话中，然后引导用户返回完成跳过的步骤。
如果用户请求迁移建议但未提供技术细节，请主动请求Solr模式或示例JSON文档（步骤2）。
在每个步骤中始终使用
facts.solr_version
。确定Solr版本后，应用版本特定检查，标记版本特定不兼容性，并相应调整所有建议。当版本特定答案更准确时，切勿给出通用建议。
使用指导文档（利益相关者、查询转换、索引设计、规格、不兼容性、身份认证）指导所有推理。
必须跟踪不兼容性。任何步骤中发现的每个不兼容性都必须在继续之前记录到
```
facts.incompatibilities
```
中（通过
```
SessionState.add_incompatibility
```
）。切勿忽略已知问题。
不确定某内容是否为不兼容性时，保守标记 — 误报比遗漏破坏性变更的危害小得多。
注明参考来源。每当响应引用
```
references/
```
目录下文件的信息时，在内容中注明文件名及章节 — 例如："根据
references/06-feature-compatibility-matrix.md
第2章 — 查询解析器"。不要将参考来源的内容作为通用知识呈现。

Session State Fields

会话状态字段

The `SessionState` object persisted for each session contains:

Field	Type	Purpose
`session_id`	`str`	Unique session identifier
`history`	`list[{user, assistant}]`	Full conversation turns
`facts`	`dict`	Discovered migration facts (e.g. `schema_migrated`, `customizations`)
`progress`	`int`	Current workflow step (0 = not started; advances forward only)
`incompatibilities`	`list[Incompatibility]`	All incompatibilities found, with `category`, `severity`, `description`, `recommendation`
`client_integrations`	`list[ClientIntegration]`	Client-side and front-end integrations collected in Step 7, with `name`, `kind`, `notes`, `migration_action`
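One plausible shape for these objects, written as dataclasses, is sketched below. The field names and types follow the table. The method signatures are assumptions: the source names `SessionState.add_incompatibility` but not its parameters, and `advance` is a hypothetical helper that models the "advances forward only" rule.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Incompatibility:
    category: str
    severity: str  # "Breaking", "Unsupported", or "Behavioral"
    description: str
    recommendation: str


@dataclass
class ClientIntegration:
    name: str
    kind: str  # e.g. "library", "ui", "http"
    notes: str
    migration_action: str


@dataclass
class SessionState:
    session_id: str
    history: list = field(default_factory=list)
    facts: dict = field(default_factory=dict)
    progress: int = 0
    incompatibilities: list = field(default_factory=list)
    client_integrations: list = field(default_factory=list)

    def add_incompatibility(self, category, severity, description, recommendation):
        # Assumed signature; the real method may differ.
        self.incompatibilities.append(
            Incompatibility(category, severity, description, recommendation)
        )

    def advance(self, step):
        # Hypothetical helper: progress never moves backwards.
        self.progress = max(self.progress, step)


state = SessionState("acme-solr-migration")
state.add_incompatibility("query", "Breaking", "illustrative description", "illustrative fix")
state.advance(3)
state.advance(1)  # ignored: lower than the current step
print(asdict(state))
```

`asdict` recurses into the nested dataclasses, so the result is plain dictionaries and lists that can be serialized to the JSON session file.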

每个会话持久化的

SessionState

对象包含：

字段	类型	用途
`session_id`	`str`	唯一会话标识符
`history`	`list[{user, assistant}]`	完整对话轮次
`facts`	`dict`	已发现的迁移信息（例如 `schema_migrated` 、 `customizations` ）
`progress`	`int`	当前流程步骤（0 = 未开始；仅向前推进）
`incompatibilities`	`list[Incompatibility]`	所有已发现的不兼容性，包含 `category` 、 `severity` 、 `description` 、 `recommendation`
`client_integrations`	`list[ClientIntegration]`	步骤7中收集的客户端及前端集成，包含 `name` 、 `kind` 、 `notes` 、 `migration_action`

Pluggable Storage Backends

可插拔存储后端

The storage backend is injected at construction time. Built-in options:

`InMemoryStorage`: ephemeral, process-scoped; useful for tests and single-turn use.
`FileStorage(base_path)`: JSON file per session on disk; the default for persistent deployments.

Custom backends implement `StorageBackend` (four methods: `_save_raw`, `_load_raw`, `delete`, `list_sessions`) and are drop-in replacements with no changes to skill logic.
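A custom backend could look roughly like the sketch below. The four method names come from the text above. The base class defined here is only a stand-in so the example runs on its own, and the parameter and return types are assumptions; check the skill's real `StorageBackend` before relying on them. `DictStorage` is a hypothetical name.

```python
import json
from abc import ABC, abstractmethod
from typing import List, Optional


class StorageBackend(ABC):
    """Stand-in for the skill's base class; the real signatures may differ."""

    @abstractmethod
    def _save_raw(self, session_id: str, data: dict) -> None: ...

    @abstractmethod
    def _load_raw(self, session_id: str) -> Optional[dict]: ...

    @abstractmethod
    def delete(self, session_id: str) -> None: ...

    @abstractmethod
    def list_sessions(self) -> List[str]: ...


class DictStorage(StorageBackend):
    """Hypothetical backend that keeps serialized sessions in a dict."""

    def __init__(self):
        self._blobs = {}

    def _save_raw(self, session_id, data):
        # Store a serialized copy so later mutations of `data` do not leak in.
        self._blobs[session_id] = json.dumps(data)

    def _load_raw(self, session_id):
        blob = self._blobs.get(session_id)
        return json.loads(blob) if blob is not None else None

    def delete(self, session_id):
        self._blobs.pop(session_id, None)

    def list_sessions(self):
        return sorted(self._blobs)


store = DictStorage()
store._save_raw("acme-solr-migration", {"progress": 2, "facts": {}})
print(store.list_sessions())
print(store._load_raw("acme-solr-migration"))
```

Swapping the dict for a database or object-store client would keep the same four methods, which is what makes the backend replaceable without touching skill logic.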

存储后端在构造时注入。内置选项：

```
InMemoryStorage
```
— 临时存储，进程范围内有效；适用于测试和单轮次使用。
```
FileStorage(base_path)
```
— 每个会话对应磁盘上的一个JSON文件；持久化部署的默认选项。

自定义后端实现

StorageBackend

（四个方法：

_save_raw

、

_load_raw

、

delete

、

list_sessions

），无需修改技能逻辑即可直接替换。

Usage

使用方法

Library Usage

库使用方式

python

import os
import sys

# Add the scripts directory to sys.path
sys.path.append(os.path.join(os.getcwd(), ".kiro/skills/solr-to-opensearch/scripts"))

from skill import SolrToOpenSearchMigrationSkill

# Initialize the advisor
skill = SolrToOpenSearchMigrationSkill()

# Handle a conversational message
session_id = "user-123"
response = skill.handle_message("Help me migrate my Solr schema: <schema>...</schema>", session_id)
print(response)

# Generate the final report
report = skill.generate_report(session_id)
print(report)

MCP Server Usage

MCP Server使用方式

Install dependencies and run the MCP server over stdio:

bash

pip install -e ".kiro/skills/solr-to-opensearch[mcp]"
python .kiro/skills/solr-to-opensearch/scripts/mcp_server.py

Or configure it in your MCP client (e.g. `.kiro/settings/mcp.json`):

json

{
  "mcpServers": {
    "solr-to-opensearch": {
      "command": "python3",
      "args": [".kiro/skills/solr-to-opensearch/scripts/mcp_server.py"],
      "disabled": false,
      "autoApprove": []
    }
  }
}

安装依赖并通过标准输入输出运行MCP服务器：

bash

pip install -e ".kiro/skills/solr-to-opensearch[mcp]"
python .kiro/skills/solr-to-opensearch/scripts/mcp_server.py

或在MCP客户端中配置（例如

.kiro/settings/mcp.json

）：

json

{
  "mcpServers": {
    "solr-to-opensearch": {
      "command": "python3",
      "args": [".kiro/skills/solr-to-opensearch/scripts/mcp_server.py"],
      "disabled": false,
      "autoApprove": []
    }
  }
}

Persistence Fallback

持久化备选方案

If the provided session persistence tools fail to write the JSON file at `sessions/<session_id>.json`, write the file yourself at that location within the solr-opensearch-migration-advisor directory. The file is human-readable and contains the full conversation history, all discovered facts, and migration progress. Likewise, always maintain the Markdown progress file at `sessions/<session_id>.md` as described in the Migration Progress File section. Even if the JSON session file cannot be written, the Markdown file must still be kept up to date: it is the human-readable record of the migration and must never be skipped.

如果无法使用提供的会话持久化工具将JSON文件持久化到

sessions/<session_id>.json

，请自行在solr-opensearch-migration-advisor目录下的指定位置创建该文件。该文件是人类可读的，包含完整对话历史、所有已发现的信息及迁移进度。同样，始终按照迁移进度文件章节的描述维护Markdown进度文件

sessions/<session_id>.md

。如果无法写入JSON会话文件，仍需更新Markdown文件 — 它是迁移的人类可读记录，绝不能跳过。