
Together Embeddings & Reranking


Overview


Generate vector embeddings for text and rerank documents by relevance.
  • Embeddings endpoint:
    /v1/embeddings
  • Rerank endpoint:
    /v1/rerank

Embeddings


Generate Embeddings


python
from together import Together
client = Together()

response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="What is the meaning of life?",
)
print(response.data[0].embedding[:5])  # First 5 dimensions
typescript
import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: "What is the meaning of life?",
});
console.log(response.data[0].embedding.slice(0, 5));
shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"BAAI/bge-base-en-v1.5","input":"What is the meaning of life?"}'

Batch Embeddings


python
texts = ["First document", "Second document", "Third document"]
response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input=texts,
)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions")
typescript
import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: [
    "First document",
    "Second document",
    "Third document",
  ],
});
for (const item of response.data) {
  console.log(`Index ${item.index}: ${item.embedding.length} dimensions`);
}
shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-base-en-v1.5",
    "input": [
      "First document",
      "Second document",
      "Third document"
    ]
  }'
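A common use of batch embeddings is nearest-neighbor lookup: embed your documents once, embed the query, and pick the document with the highest cosine similarity. A minimal sketch with toy 3-dimensional vectors standing in for the `item.embedding` values a batch call would return (pure Python, standard library only):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors standing in for item.embedding from a batch embeddings call
doc_embeddings = [
    [0.9, 0.1, 0.0],   # "First document"
    [0.0, 1.0, 0.1],   # "Second document"
    [0.1, 0.0, 1.0],   # "Third document"
]
query_embedding = [1.0, 0.0, 0.0]

# Index of the document most similar to the query
best = max(range(len(doc_embeddings)), key=lambda i: cosine(query_embedding, doc_embeddings[i]))
print(best)  # 0
```

In practice you would store the real 768-dimensional vectors in a vector database and let it perform this search at scale.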

Embedding Models


| Model | API String | Dimensions | Max Input |
|---|---|---|---|
| BGE Base EN v1.5 | BAAI/bge-base-en-v1.5 | 768 | 512 tokens |
| Multilingual E5 Large (recommended) | intfloat/multilingual-e5-large-instruct | 1024 | 514 tokens |

Reranking


Rerank a set of documents by relevance to a query:
python
response = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "London is the capital of England.",
        "The Eiffel Tower is in Paris.",
    ],
)
for result in response.results:
    print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")
typescript
import Together from "together-ai";
const together = new Together();

const documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany.",
  "London is the capital of England.",
  "The Eiffel Tower is in Paris.",
];

const response = await together.rerank.create({
  model: "mixedbread-ai/Mxbai-Rerank-Large-V2",
  query: "What is the capital of France?",
  documents,
  top_n: 2,
});

for (const result of response.results) {
  console.log(`Index: ${result.index}, Score: ${result.relevance_score}`);
}
shell
curl -X POST "https://api.together.xyz/v1/rerank" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mixedbread-ai/Mxbai-Rerank-Large-V2",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Berlin is the capital of Germany."]
  }'

Rerank Parameters


| Parameter | Type | Description |
|---|---|---|
| model | string | Rerank model (required) |
| query | string | Search query (required) |
| documents | string[] or object[] | Documents to rerank (required). Pass objects with named fields for structured documents. |
| top_n | int | Return top N results |
| return_documents | bool | Include document text in response |
| rank_fields | string[] | Fields to use for ranking when documents are JSON objects (e.g., `["title", "text"]`) |

RAG Pipeline Pattern

A typical retrieve-then-rerank pipeline: embed the query, fetch a broad candidate set from your vector database, rerank for precision, then pass the top results to an LLM as context.
python
# 1. Generate query embedding
query_embedding = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="How does photosynthesis work?",
).data[0].embedding

# 2. Retrieve candidates from vector DB (your code)
candidates = vector_db.search(query_embedding, top_k=20)

# 3. Rerank for precision
reranked = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="How does photosynthesis work?",
    documents=[c.text for c in candidates],
    top_n=5,
)

# 4. Use top results as context for LLM
context = "\n".join([candidates[r.index].text for r in reranked.results])
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": "How does photosynthesis work?"},
    ],
)

Resources
