amazon-elasticache

ElastiCache

A modular ElastiCache toolkit organized as a registry of sub-skills. Each sub-skill handles one domain of ElastiCache work. The router below matches user intent to the right sub-skill, then loads only the references needed for that sub-skill.

How this skill works

  1. Match the user's request against the semantic categories in the registry below. Match on meaning, not exact wording ("help me figure out which data structures to use" matches `data-modeling` even without the word "pattern").
  2. Disambiguation: If the user's intent matches multiple sub-skills, apply these rules in order:
    • If `.elasticache/requirements.json` exists with `infrastructure.endpoint` set, prefer `monitoring` or `data-modeling` (the user has an existing cache).
    • If no cache exists (no requirements.json or no endpoint), prefer `requirements`.
    • If still ambiguous, ask one clarifying question: "Are you looking to set up something new, or troubleshoot something existing?"
  3. Check the Guardrails section before recommending an engine or deployment model.
  4. Read `references/{sub-skill-id}/instructions.md` for the matched sub-skill. If the file is not found at a relative path, check your prompt or environment for the skill directory absolute path and retry with `{skill-directory}/references/{sub-skill-id}/instructions.md`.
  5. If the request spans multiple sub-skills, execute them in pipeline order.
  6. If a sub-skill requires upstream context (engine, deployment model, endpoint) not yet in session memory, route to the upstream sub-skill first.
  7. If no sub-skill matches, activate `requirements` first.
  8. If a script or CLI call fails, show the error to the user and suggest a specific fix before retrying.
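The disambiguation rules in step 2 can be sketched as a small function. This is only an illustration under stated assumptions: the function names, the return shape, and the choice to treat an unreadable state file as "no cache" are hypothetical and not part of the skill.

```python
import json
from pathlib import Path

CLARIFYING_QUESTION = (
    "Are you looking to set up something new, or troubleshoot something existing?"
)


def has_existing_cache(project_dir="."):
    """True when .elasticache/requirements.json exists with infrastructure.endpoint set."""
    state_file = Path(project_dir) / ".elasticache" / "requirements.json"
    if not state_file.is_file():
        return False
    try:
        state = json.loads(state_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False  # assumption: an unreadable file is treated as "no cache"
    if not isinstance(state, dict):
        return False
    infrastructure = state.get("infrastructure") or {}
    return bool(infrastructure.get("endpoint"))


def disambiguate(candidates, project_dir="."):
    """Return (sub_skill_id, None), or (None, question) when still ambiguous."""
    if not candidates:
        return "requirements", None  # step 7: no match activates requirements
    if len(candidates) == 1:
        return candidates[0], None
    if has_existing_cache(project_dir):
        preferred = [c for c in candidates if c in ("monitoring", "data-modeling")]
    else:
        preferred = [c for c in candidates if c == "requirements"]
    if len(preferred) == 1:
        return preferred[0], None
    return None, CLARIFYING_QUESTION
```

When both `monitoring` and `data-modeling` match and a cache exists, the sketch falls through to the clarifying question, since the rules do not rank one above the other.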

Sub-skill registry

Each entry has: an ID (directory name under `references/`), a domain description, semantic categories for matching, and upstream/downstream dependencies.

| ID | Name | Domain | Semantic Categories | Upstream | Downstream |
| --- | --- | --- | --- | --- | --- |
| `requirements` | Solution Fit | Gathers workload, stack, scale, latency, persistence, and budget through workspace scan + structured interview. Decides whether ElastiCache is the right service and hands off with a routing recommendation. | I need a cache, speed up my app, reduce database load, lower Bedrock cost, should I use ElastiCache, what's best for my workload, evaluating cache options, ElastiCache vs X, Valkey vs X, vague new workload | | `setup`, `data-modeling`, `genai`, `monitoring`, `migration` |
| `setup` | Create and Connect | Provisioning, connectivity, security, authentication, IaC, deployment choice. Gets the user to a working cache with least friction. Covers engine selection, serverless vs node-based, VPC, TLS, RBAC/IAM, jump-host/SSM tunnels, CLI/SDK/CFN/CDK/Terraform starters. | create a cache, set up ElastiCache, provision, Valkey cluster, connect Lambda/ECS/EKS/EC2, VPC, security groups, TLS, RBAC, IAM auth, jump host, SSM tunnel, CloudFormation, CDK, Terraform, engine selection, serverless vs node-based, backup, snapshot, restore, export | `requirements` (optional) | `data-modeling`, `genai`, `monitoring` |
| `data-modeling` | Application Patterns | Picks data structures, key schema, TTL strategy, invalidation approach, and client code for non-AI patterns: cache-aside, session store, rate limiting, leaderboards, counters, pub/sub, streams, shopping carts, job queues, activity feeds. | session store, rate limiting, leaderboard, cache-aside, query caching, counters, streams, pub/sub, shopping cart, job queue, activity feed, key schema, TTL, invalidation, data structures | `setup` (cache must exist) | `monitoring` |
| `genai` | AI and Vector Workloads | Classifies request into Mode 1 (plain cache), Mode 2 (semantic response cache), or Mode 3 (full vector search). Selects Valkey and forces node-based Valkey 8.2 or above (recommend 9.0) when server-side vector similarity is needed. Covers semantic caching, agent memory, RAG retrieval, recommendation, personalization, conversation/session persistence for AI agents, and framework wiring (Strands, mem0, LangChain). | semantic cache, RAG, agent memory, conversational memory, vector search, embeddings, recommendation, personalization, Bedrock latency, Bedrock cost, LLM caching, Strands, mem0, LangChain, conversation history, AI session store, embedding provider, framework integration | `setup` (cache must exist) | `monitoring` |
| `monitoring` | Operate and Observe | Diagnoses performance, cost, and reliability using metrics first, then recommends the smallest change. Covers dashboards, alarms, log delivery, cost reporting, event routing, troubleshooting high CPU / memory / replication lag / connection spikes / low hit rate / hot keys / big keys / slot imbalance / latency spike root cause. | cache is slow, cost too high, hit rate low, high CPU, memory pressure, replication lag, connection spikes, dashboards, alarms, CloudWatch, cost comparison, troubleshoot, hot key, uneven shard load, one node pinned, big key, memory bloat, which key is biggest, keyspace distribution, prefix analysis, cost attribution by tenant, memory imbalance, one shard full, slot memory skew, latency spike, slow command incident, root cause for latency bump | | `setup`, `migration` |
| `migration` | Engine and Platform Migration | Selects the migration path and sequences preflight, validation, cutover, and rollback. Covers self-managed Redis → ElastiCache, Redis OSS → Valkey, node-based ↔ serverless, version upgrades. Hard validate-before-migrate gate. | migrate, Redis OSS to Valkey, self-managed to ElastiCache, node-based to serverless, serverless to node-based, engine upgrade, version upgrade, zero-downtime cutover, rollback | | `setup`, `monitoring` |

Pipeline order

Sub-skills run independently, but common multi-step journeys follow these pipelines:
  • `requirements` → `setup` → (`data-modeling` | `genai`) → `monitoring`
  • `migration` → `setup` → `monitoring`
  • `monitoring` → `setup` | `migration` (if metrics indicate)
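One way to put a set of matched sub-skills into pipeline order is to give each ID a rank. The ranks below are an assumption derived from the first two pipelines only; the `monitoring` → `setup` | `migration` loop is conditional on metrics and is deliberately not modeled. `PIPELINE_RANK` and `order_sub_skills` are hypothetical names.

```python
# Rank of each sub-skill along the forward pipelines (assumed, see lead-in).
PIPELINE_RANK = {
    "requirements": 0,
    "migration": 0,
    "setup": 1,
    "data-modeling": 2,
    "genai": 2,
    "monitoring": 3,
}


def order_sub_skills(matched):
    """Return the matched sub-skill IDs, de-duplicated, in pipeline order."""
    unknown = sorted(set(matched) - set(PIPELINE_RANK))
    if unknown:
        raise ValueError(f"unknown sub-skill IDs: {unknown}")
    # Ties (e.g. data-modeling vs genai) are broken alphabetically for determinism.
    return sorted(set(matched), key=lambda s: (PIPELINE_RANK[s], s))
```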

State handoff: requirements.json

`.elasticache/requirements.json` is the single source of truth for cross-sub-skill state. Each sub-skill reads it at start and writes its section after completing work. Read before writing; merge, do not overwrite.

| Section | Owner | Key fields |
| --- | --- | --- |
| top-level | `requirements` | `engine`, `deployment_model`, `region`, `runtime`, `patterns`, `use_case`, `vpc_id`, `subnet_ids`, `security_group_ids` |
| `infrastructure` | `setup` | `cache_name`, `resource_id`, `engine_version`, `topology`, `endpoint`, `port`, `auth_model`, `tls`, `client_library`, `execution_path`, `access_mode`, `tunnel_instance_id`, `embedding_provider`, `embedding_model`, `embedding_dim`, `embedding_module` |
| `genai` | `genai` | `mode`, `mode_2_path`, `framework` |
| `migration` | `migration` | `source_type`, `source_host`, `migration_path`, `cutover_status` |

Ownership note: `deployment_model` is set by `requirements` during initial interview. `migration` may update it after an engine or deployment model switch (e.g., node-based to serverless).
requirements.json should include `"schema_version": 1` and `"last_updated": "<ISO timestamp>"` at the top level. Every sub-skill that writes to requirements.json must update `last_updated`. If `last_updated` is older than 7 days, warn the user that cached state may be stale.
requirements.json tracks one active cache. If the user works with multiple caches in the same project, confirm which cache is active before reading or writing state.
When a sub-skill needs upstream context (engine, endpoint, auth model), check requirements.json first. If the field is `null` or the file does not exist, route to the upstream sub-skill.
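The read-merge-write cycle and the 7-day staleness check could look roughly like the sketch below. The helper names (`load_state`, `write_section`, `is_stale`) are hypothetical, and treating a timestamp without a timezone as UTC is an assumption the text does not make.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

STALE_AFTER = timedelta(days=7)


def load_state(path):
    """Read requirements.json; a missing file is treated as empty state."""
    p = Path(path)
    if not p.is_file():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def is_stale(state, now=None):
    """True when last_updated is more than 7 days old."""
    stamp = state.get("last_updated")
    if not stamp:
        return False
    now = now or datetime.now(timezone.utc)
    # Accept a trailing "Z", which older fromisoformat versions reject.
    updated = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    if updated.tzinfo is None:
        updated = updated.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return now - updated > STALE_AFTER


def write_section(path, section, fields, now=None):
    """Merge fields into one section (None means top-level) without overwriting others."""
    p = Path(path)
    state = load_state(p)  # read before writing
    if section is None:
        state.update(fields)
    else:
        merged = dict(state.get(section) or {})
        merged.update(fields)
        state[section] = merged
    state.setdefault("schema_version", 1)
    state["last_updated"] = (now or datetime.now(timezone.utc)).isoformat()
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state
```

For example, `write_section(".elasticache/requirements.json", "infrastructure", {"endpoint": "..."})` would leave any existing `genai` or `migration` section untouched.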

Global rules (apply to every sub-skill)

  1. Execution path. Use AWS CLI, SDK (boto3), CloudFormation, or CDK as the primary path for control-plane work. Use valkey-py as the primary path for data-plane work.
  2. Response depth. Summary (2-3 sentences) for "should I" or "which" questions. Standard (recommendation + config + code + next steps) by default. Expert (full decision matrix with alternatives, cost, security caveats) for "why" or "compare all" questions. Escalate on user request; never downgrade unprompted.
  3. Session memory. Track region, VPC, engine, deployment model, auth model, compute runtime, and language. Carry forward across sub-skills. Do not re-ask. If the user overrides a value, update it everywhere. Inferred values (from workspace scan or IaC) must be re-confirmed before high-risk decisions (engine, deployment model, security posture); low-risk inferences (language, framework, region) can be used as defaults silently.
  4. Source priority. Always answer from skill-local files first (sub-skill references, then `scripts/`). Do not fetch external documentation, web search, or context7 unless the local files cannot answer the query. When local files are insufficient, fall back to official AWS docs: https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/ for features and https://aws.amazon.com/elasticache/pricing/ for pricing. Never invent price points or version constraints. If the user references a Valkey or Redis version, feature, or pricing tier not covered in local files, fall back to https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/ before answering. Do not extrapolate from local content that may be outdated.
  5. Freshness disclaimer. When outputting pricing, version constraints, or feature availability, include a one-line disclaimer: "For current pricing see https://aws.amazon.com/elasticache/pricing/. For current feature availability see https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/."
  6. Connection safety. Never create a Valkey/Redis client at module level (top of file, import time). Initialize connections inside a function or on first use. Module-level connections crash applications that import the module before the cache is reachable.
  7. Code environment. Use `python3` (not `python`) in all commands and shebangs. Do not generate `pip install` or `pytest` commands in application code. Prefer the agent's file-writing tool over shell redirection for writing files.
  8. Input validation (write operations only). Before running any create, modify, or delete CLI command or SDK call that uses user-provided values (cache name, region, subnet IDs, security group IDs, VPC ID, KMS key, snapshot name), validate them with `scripts/input_validator.py`. Run `python3 scripts/input_validator.py cache_name=<val> region=<val> ...` or import `validate_all` as a library. Skip validation for read-only operations (describe, list) where bad input is harmless.
  9. Resource tagging (always apply on resource creation). When creating any cache (`create-serverless-cache`, `create-replication-group`, `create-cache-cluster`), ALWAYS include these tags:
    --tags Key=managed_by,Value=aws-skills Key=skill,Value=elasticache Key=skill_version,Value=1.0.0 Key=created_by,Value=elasticache-skill Key=generation_model,Value={your-model-id}
    Example:
    --tags Key=managed_by,Value=aws-skills Key=skill,Value=elasticache Key=skill_version,Value=1.0.0 Key=created_by,Value=elasticache-skill Key=generation_model,Value=claude-sonnet-4-20250514
    Set `generation_model` to your own model identifier. Include these tags even if the user does not mention tagging. If the user provides additional tags, append these to their tags rather than replacing them.

Reference loading

Load additional references only when the current turn's answer requires them.
On-demand pointers (not preloaded; fetch when the trigger applies):
  • `references/shared-ux/production-readiness.md`: when the user asks if their cache is ready for production, or after setup completes and the user wants to go to production
  • `references/shared-ux/action-safety.md`: before any destructive action (risk levels, never-auto-execute list)
  • `references/shared-ux/error-remediation.md`: when the user hits a specific ElastiCache error code (MOVED, CROSSSLOT, CLUSTERDOWN, MULTI/EXEC+IAM, etc.)
  • `references/shared-foundation/boundary-doc.md`: when the user asks what this skill covers
  • `references/shared-foundation/attribution.md`: when generating CLI commands, SDK code, or IaC templates
  • `references/shared-foundation/architecture-diagrams.md`: when the user asks for architecture diagrams or visual reference
  • `references/shared-runtime/lambda.md`: when connecting from Lambda (cold start gotchas, IAM auth code, lazy init)
  • `references/shared-runtime/ecs.md`: when connecting from ECS (SIGTERM shutdown, connection pool drain, task definition)
  • `references/shared-runtime/eks.md`: when connecting from EKS (IRSA, service mesh bypass, SecurityGroupPolicy CRD)
  • `references/shared-runtime/api-gateway.md`: when integrating with API Gateway (no direct path, caching layers comparison)
  • `references/shared-runtime/rds-acceleration.md`: when caching RDS/Aurora queries (thundering herd, stampede protection, invalidation)
  • `references/shared-runtime/secret-injection.md`: when the user asks about credential management per compute platform
  • `references/shared-security/encryption-defaults.md`: when adding encryption to an existing unencrypted cluster (TLS two-step migration, at-rest immutability)
  • `references/shared-security/config-guardrails.md`: when the user wants continuous compliance monitoring (AWS Config rules, custom Lambda rules)
  • `references/shared-security/vpc-patterns.md`: when debugging port/security-group issues (port 6380 serverless reader, anti-patterns)
Folder convention: `references/` contains 10 folders. 6 match the sub-skills (`requirements`, `setup`, `data-modeling`, `genai`, `monitoring`, `migration`) and are routing destinations. The 4 `shared-*` folders (`shared-foundation`, `shared-ux`, `shared-security`, `shared-runtime`) are cross-cutting material loaded on demand, not routing destinations.

Guardrails

| Priority | Rule |
| --- | --- |
| CRITICAL | Vector search MUST use node-based Valkey 8.2 or above. Serverless does NOT support vector search. Never suggest serverless for vector search. Apply this regardless of which sub-skill activates. |
| CRITICAL | Do not invent price points or version constraints. Use `scripts/price_calculator.py` and current AWS docs when precision matters. |
| HIGH | Do not recommend Memcached when the user needs persistence, replication, RBAC or IAM auth, sorted sets, streams, pub/sub, or vector search. |
| HIGH | Do not assume local laptop access works directly. ElastiCache is VPC-centric; explain VPC, tunnel, or jump-host access when needed. |
| STANDARD | Do not trigger on every generic Redis mention. Trigger when the user is clearly asking about AWS, managed caching, migration, connectivity, pricing, operations, or AWS service integration. |
| STANDARD | For ambiguous "cache" requests inside AWS contexts, activate this skill and start with `requirements`. |
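The two engine-related guardrails (vector search and Memcached) lend themselves to a mechanical check before a recommendation is emitted. The sketch below is hypothetical: the function name, the feature labels in `MEMCACHED_BLOCKERS`, and the engine and deployment strings are assumptions chosen for illustration.

```python
# Features that rule out Memcached, per the HIGH guardrail (labels are assumed).
MEMCACHED_BLOCKERS = {
    "persistence", "replication", "rbac", "iam_auth",
    "sorted_sets", "streams", "pubsub", "vector_search",
}


def _version(text):
    """Turn '8.2' into (8, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_guardrails(engine, deployment_model, engine_version, needs):
    """Return a list of guardrail violations for a proposed recommendation."""
    needs = set(needs)
    violations = []
    if "vector_search" in needs:
        if deployment_model != "node-based":
            violations.append("vector search requires a node-based deployment")
        if engine != "valkey" or _version(engine_version) < (8, 2):
            violations.append("vector search requires Valkey 8.2 or above")
    if engine == "memcached":
        blocked = sorted(needs & MEMCACHED_BLOCKERS)
        if blocked:
            violations.append(f"Memcached does not fit: {', '.join(blocked)}")
    return violations
```

An empty list means neither guardrail objects; it says nothing about pricing or the other rules in the table.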

Product truths

  • ElastiCache Serverless deploys in under a minute and removes infrastructure management.
  • Valkey serverless pricing is 33% lower than other supported engines; node-based Valkey pricing is 20% lower.
  • Serverless caches have in-transit encryption always enabled (cannot be disabled).
  • IAM auth is available for all ElastiCache Valkey versions (7.2 is the baseline Valkey version on ElastiCache) and Redis OSS 7.0+.
  • Valkey version ladder: 7.2 (baseline), 8.0 (20% more data per node, a capacity improvement; per-slot metrics), 8.1 (Bloom filters, COMMANDLOG, SET IFEQ; 20% less memory via a new hash table, an efficiency improvement), 8.2 (vector search), 9.0 (recommended default for new clusters). Recommend Valkey 9.0 for new clusters unless a specific feature dictates otherwise.
  • Vector search is available for Valkey 8.2 or above on node-based clusters (recommend 9.0).
  • Global Datastore is available for node-based clusters only. It does not support IPv6 or Local Zones. Global Datastore supports AUTH and RBAC. Cross-region failover must be promoted manually (no autofailover across regions). At-rest encryption must be enabled on all clusters in the Global Datastore, but each cluster can use a separate KMS key per region.
  • Online migration from self-managed Redis to ElastiCache requires: (source) AUTH must not be enabled, `protected-mode` set to `no`, replication and administrative commands must not be renamed (e.g., `sync`, `psync`, `info`, `config`, `command`, `cluster`); (target) encryption in-transit disabled, Multi-AZ enabled, engine version Redis OSS 5.0.6+ or Valkey 7.2+, not part of a Global Datastore, data tiering disabled. Shard counts must match between source and target. All source Redis instances must use the same port. Online migration is not supported for serverless caches (node-based targets only). See `references/migration/topology-validation.md` for the full checklist.
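The online-migration prerequisites in the last bullet can be expressed as a preflight check. This sketch covers only the conditions named in that bullet, not the full checklist in `references/migration/topology-validation.md`; the `Source` and `Target` dataclasses and their field names are hypothetical.

```python
from dataclasses import dataclass, field

# Commands the bullet gives as examples of ones that must not be renamed.
PROTECTED_COMMANDS = {"sync", "psync", "info", "config", "command", "cluster"}
# Minimum target versions named in the bullet.
MIN_TARGET_VERSION = {"redis": (5, 0, 6), "valkey": (7, 2)}


def _version(text):
    return tuple(int(part) for part in text.split("."))


@dataclass
class Source:
    auth_enabled: bool
    protected_mode: str  # value of protected-mode: "yes" or "no"
    shard_count: int
    ports: set  # ports used across all source instances
    renamed_commands: set = field(default_factory=set)


@dataclass
class Target:
    engine: str  # "redis" or "valkey"
    engine_version: str
    deployment_model: str  # "node-based" or "serverless"
    shard_count: int
    transit_encryption: bool = False
    multi_az: bool = True
    in_global_datastore: bool = False
    data_tiering: bool = False


def preflight(source, target):
    """Return the list of unmet prerequisites; empty means none were found."""
    failures = []
    if source.auth_enabled:
        failures.append("source: AUTH must not be enabled")
    if source.protected_mode != "no":
        failures.append("source: protected-mode must be set to no")
    renamed = sorted(PROTECTED_COMMANDS & source.renamed_commands)
    if renamed:
        failures.append(f"source: commands must not be renamed: {', '.join(renamed)}")
    if len(source.ports) != 1:
        failures.append("source: all instances must use the same port")
    if target.deployment_model != "node-based":
        failures.append("target: online migration needs a node-based target")
    if target.transit_encryption:
        failures.append("target: in-transit encryption must be disabled")
    if not target.multi_az:
        failures.append("target: Multi-AZ must be enabled")
    minimum = MIN_TARGET_VERSION.get(target.engine)
    if minimum is None or _version(target.engine_version) < minimum:
        failures.append("target: needs Redis OSS 5.0.6+ or Valkey 7.2+")
    if target.in_global_datastore:
        failures.append("target: must not be part of a Global Datastore")
    if target.data_tiering:
        failures.append("target: data tiering must be disabled")
    if source.shard_count != target.shard_count:
        failures.append("shard counts must match between source and target")
    return failures
```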