together-embeddings
Together Embeddings & Reranking
Overview
Use this skill for semantic retrieval components:
- create embeddings
- batch embeddings
- build retrieval or RAG pipelines
- rerank retrieved candidates
This skill is for retrieval plumbing, not for the final language-model response itself.
When This Skill Wins
- Build vector search or semantic similarity features
- Add embedding generation to a data pipeline
- Improve retrieval quality with reranking
- Assemble a retrieval stage before calling a chat model
Hand Off To Another Skill
- Use together-chat-completions for the final answer-generation step
- Use together-batch-inference for very large offline embedding backfills
- Use together-dedicated-endpoints when reranking requires a dedicated deployment
Quick Routing
- Embeddings API usage
  - Read references/api-reference.md
  - Start with scripts/embed_and_rerank.py or scripts/embed_and_rerank.ts
- Semantic search (embed, store, query)
  - Start with scripts/semantic_search.py, which includes an in-memory vector store, cosine-similarity retrieval, and optional rerank
- RAG pipeline composition
  - Start with scripts/rag_pipeline.py
- Model selection and rerank constraints
  - Read references/models.md
Workflow
- Confirm that the user needs vectors or retrieval, not direct generation.
- Choose the embedding model and batch shape.
- Generate embeddings for corpus and query paths consistently.
- Retrieve candidates. An in-memory cosine-similarity store works for prototyping and small corpora (see semantic_search.py). Use a dedicated vector database for production scale.
- Rerank only when the extra latency and endpoint requirement are justified. When no dedicated rerank endpoint is available, cosine-similarity ranking is a reasonable fallback.
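The embed, batch, and retrieve steps above can be sketched with the standard library alone. This is a minimal illustration, not the contents of semantic_search.py: `embed_in_batches`, `top_k`, and the `toy_embed` keyword counter are hypothetical names introduced here, and `toy_embed` stands in for a real embedding call so the example runs offline. The point it shows is that the corpus and the query pass through the same embedding function before cosine-similarity ranking.

```python
import math


def embed_in_batches(texts, embed_fn, batch_size=32):
    """Embed texts in fixed-size batches, preserving input order.

    embed_fn is any callable that maps a list of strings to one vector per
    string. In a real pipeline it would wrap the embeddings API call.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors.extend(embed_fn(batch))
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus_vecs, k=3):
    """Return (index, score) pairs for the k most similar corpus vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def toy_embed(batch):
    """Illustrative stand-in embedder (an assumption): counts three keywords."""
    vocab = ["cat", "dog", "car"]
    return [[float(text.lower().count(word)) for word in vocab] for text in batch]


corpus = [
    "The cat sat with another cat",
    "A dog chased the car",
    "Car engines and car tires",
]

# Use the same embed function for the corpus and the query.
corpus_vecs = embed_in_batches(corpus, toy_embed, batch_size=2)
query_vec = embed_in_batches(["cat"], toy_embed)[0]

results = top_k(query_vec, corpus_vecs, k=2)
for index, score in results:
    print(f"{score:.3f}  {corpus[index]}")
```

The same `top_k` ordering can serve as the fallback ranking when no dedicated rerank endpoint is available; a reranker, when present, would reorder only the candidates this stage returns.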
High-Signal Rules
- Python scripts require the Together v2 SDK (together>=2.0.0). If the user is on an older version, they must upgrade first: uv pip install --upgrade "together>=2.0.0"
- Keep embeddings and reranking conceptually separate; rerank is a second-stage precision step.
- Reranking in this repo assumes a dedicated endpoint. Do not promise serverless rerank unless the product changes. When no endpoint is available, fall back to cosine-similarity ranking.
- The embedding model has a 514-token context limit. Chunk longer documents before embedding.
- The rag_pipeline.py example demonstrates retrieval plus generation; treat generation as a hand-off to chat completions.
- Preserve model consistency across indexing and querying.
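The chunking rule above could look roughly like the following. This is a sketch under a loud assumption: it counts whitespace-separated words rather than real tokens, and a model tokenizer usually produces more tokens than words, so the 300-word budget and 30-word overlap are conservative illustrative defaults, not values taken from this repo. `chunk_words` is a hypothetical helper; for exact limits, count tokens with the embedding model's own tokenizer.

```python
def chunk_words(text, max_words=300, overlap=30):
    """Split text into overlapping word windows.

    Word counts only approximate token counts (assumption), so max_words is
    kept well below the 514-token context limit.
    """
    if max_words <= 0 or overlap < 0 or overlap >= max_words:
        raise ValueError("require max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    if not words:
        return []

    chunks = []
    step = max_words - overlap  # how far each window advances
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Example: a 700-word document becomes three overlapping chunks.
document = " ".join(f"w{i}" for i in range(700))
chunks = chunk_words(document, max_words=300, overlap=30)
print(len(chunks), [len(c.split()) for c in chunks])
```

The overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk, at the cost of embedding some words twice.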
Resource Map
- API details: references/api-reference.md
- Model guide: references/models.md
- Python embeddings example: scripts/embed_and_rerank.py
- TypeScript embeddings example: scripts/embed_and_rerank.ts
- Python semantic search: scripts/semantic_search.py
- Python RAG pipeline: scripts/rag_pipeline.py