Together Embeddings & Reranking
Overview
Generate vector embeddings for text and rerank documents by relevance.
- Embeddings endpoint: /v1/embeddings
- Rerank endpoint: /v1/rerank
Embeddings
Generate Embeddings
```python
from together import Together

client = Together()
response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="What is the meaning of life?",
)
print(response.data[0].embedding[:5])  # First 5 dimensions
```

```typescript
import Together from "together-ai";

const together = new Together();
const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: "What is the meaning of life?",
});
console.log(response.data[0].embedding.slice(0, 5));
```

```shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"BAAI/bge-base-en-v1.5","input":"What is the meaning of life?"}'
```

Batch Embeddings
```python
texts = ["First document", "Second document", "Third document"]
response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input=texts,
)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions")
```

```typescript
import Together from "together-ai";

const together = new Together();
const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: ["First document", "Second document", "Third document"],
});
for (const item of response.data) {
  console.log(`Index ${item.index}: ${item.embedding.length} dimensions`);
}
```

```shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-base-en-v1.5",
    "input": ["First document", "Second document", "Third document"]
  }'
```

Embedding Models
| Model | API String | Dimensions | Max Input |
|---|---|---|---|
| BGE Base EN v1.5 | BAAI/bge-base-en-v1.5 | 768 | 512 tokens |
| Multilingual E5 Large (recommended) | intfloat/multilingual-e5-large-instruct | 1024 | 514 tokens |
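Embeddings from these models are usually compared with cosine similarity. A minimal sketch in plain Python (no SDK required; toy 3-dimensional vectors stand in for real 768- or 1024-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings returned by the API
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0]))  # 0.5
```

The same function works unchanged on the `response.data[i].embedding` lists returned above; in practice you would store document embeddings in a vector database rather than compare them in a loop.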
Reranking
Rerank a set of documents by relevance to a query:
```python
response = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "London is the capital of England.",
        "The Eiffel Tower is in Paris.",
    ],
)
for result in response.results:
    print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")
```

```typescript
import Together from "together-ai";

const together = new Together();
const documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany.",
  "London is the capital of England.",
  "The Eiffel Tower is in Paris.",
];
const response = await together.rerank.create({
  model: "mixedbread-ai/Mxbai-Rerank-Large-V2",
  query: "What is the capital of France?",
  documents,
  top_n: 2,
});
for (const result of response.results) {
  console.log(`Index: ${result.index}, Score: ${result.relevance_score}`);
}
```

```shell
curl -X POST "https://api.together.xyz/v1/rerank" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mixedbread-ai/Mxbai-Rerank-Large-V2",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Berlin is the capital of Germany."]
  }'
```

Rerank Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Rerank model (required) |
| query | string | Search query (required) |
| documents | string[] or object[] | Documents to rerank (required). Pass objects with named fields for structured documents. |
| top_n | int | Return top N results |
| return_documents | bool | Include document text in response |
| rank_fields | string[] | Fields to use for ranking when documents are JSON objects |
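Each rerank result refers back to the input list by index, so mapping scores onto your original documents is a small post-processing step. A sketch with hypothetical (index, score) pairs rather than a live API response:

```python
# Hypothetical (index, relevance_score) pairs, as a reranker might return them
results = [(3, 0.91), (0, 0.87), (2, 0.14), (1, 0.02)]
documents = ["doc A", "doc B", "doc C", "doc D"]

# Sort by score descending and map indices back to the original documents
top_two = [documents[i] for i, _ in sorted(results, key=lambda r: r[1], reverse=True)[:2]]
print(top_two)  # ['doc D', 'doc A']
```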
RAG Pipeline Pattern
```python
# 1. Generate query embedding
query_embedding = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="How does photosynthesis work?",
).data[0].embedding

# 2. Retrieve candidates from vector DB (your code)
candidates = vector_db.search(query_embedding, top_k=20)

# 3. Rerank for precision
reranked = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="How does photosynthesis work?",
    documents=[c.text for c in candidates],
    top_n=5,
)

# 4. Use top results as context for LLM
context = "\n".join([candidates[r.index].text for r in reranked.results])
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": "How does photosynthesis work?"},
    ],
)
```

Resources
- Model details: See references/models.md
- Runnable script: See scripts/embed_and_rerank.py — embed, compute similarity, and rerank pipeline (v2 SDK)
- Official docs: Embeddings Overview
- Official docs: Rerank Overview
- API reference: Embeddings API
- API reference: Rerank API