
Together Embeddings & Reranking


Overview


Generate vector embeddings for text and rerank documents by relevance.
  • Embeddings endpoint:
    /v1/embeddings
  • Rerank endpoint:
    /v1/rerank

Embeddings


Generate Embeddings


python
from together import Together
client = Together()

response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="What is the meaning of life?",
)
print(response.data[0].embedding[:5])  # First 5 dimensions
typescript
import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: "What is the meaning of life?",
});
console.log(response.data[0].embedding.slice(0, 5));
shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"BAAI/bge-base-en-v1.5","input":"What is the meaning of life?"}'

Batch Embeddings


python
texts = ["First document", "Second document", "Third document"]
response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input=texts,
)
for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions")
typescript
import Together from "together-ai";
const together = new Together();

const response = await together.embeddings.create({
  model: "BAAI/bge-base-en-v1.5",
  input: [
    "First document",
    "Second document",
    "Third document",
  ],
});
for (const item of response.data) {
  console.log(`Index ${item.index}: ${item.embedding.length} dimensions`);
}
shell
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-base-en-v1.5",
    "input": [
      "First document",
      "Second document",
      "Third document"
    ]
  }'
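A common use of batch embeddings is nearest-neighbor lookup: embed your documents once, embed the query, and pick the document with the highest cosine similarity. A minimal sketch with toy 3-dimensional vectors standing in for the `item.embedding` values a batch call would return (pure Python, standard library only):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors standing in for item.embedding from a batch embeddings call
doc_embeddings = [
    [0.9, 0.1, 0.0],   # "First document"
    [0.0, 1.0, 0.1],   # "Second document"
    [0.1, 0.0, 1.0],   # "Third document"
]
query_embedding = [1.0, 0.0, 0.0]

# Index of the document most similar to the query
best = max(range(len(doc_embeddings)), key=lambda i: cosine(query_embedding, doc_embeddings[i]))
print(best)  # 0
```

In practice you would store the real 768-dimensional vectors in a vector database and let it perform this search at scale.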

Embedding Models


| Model | API String | Dimensions | Max Input |
|---|---|---|---|
| BGE Base EN v1.5 | BAAI/bge-base-en-v1.5 | 768 | 512 tokens |
| Multilingual E5 Large (recommended) | intfloat/multilingual-e5-large-instruct | 1024 | 514 tokens |

Reranking


Rerank a set of documents by relevance to a query:
python
response = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "London is the capital of England.",
        "The Eiffel Tower is in Paris.",
    ],
)
for result in response.results:
    print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")
typescript
import Together from "together-ai";
const together = new Together();

const documents = [
  "Paris is the capital of France.",
  "Berlin is the capital of Germany.",
  "London is the capital of England.",
  "The Eiffel Tower is in Paris.",
];

const response = await together.rerank.create({
  model: "mixedbread-ai/Mxbai-Rerank-Large-V2",
  query: "What is the capital of France?",
  documents,
  top_n: 2,
});

for (const result of response.results) {
  console.log(`Index: ${result.index}, Score: ${result.relevance_score}`);
}
shell
curl -X POST "https://api.together.xyz/v1/rerank" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mixedbread-ai/Mxbai-Rerank-Large-V2",
    "query": "What is the capital of France?",
    "documents": ["Paris is the capital of France.", "Berlin is the capital of Germany."]
  }'

Rerank Parameters


| Parameter | Type | Description |
|---|---|---|
| model | string | Rerank model (required) |
| query | string | Search query (required) |
| documents | string[] or object[] | Documents to rerank (required). Pass objects with named fields for structured documents. |
| top_n | int | Return top N results |
| return_documents | bool | Include document text in response |
| rank_fields | string[] | Fields to use for ranking when documents are JSON objects (e.g., `["title", "text"]`) |

RAG Pipeline Pattern

A typical retrieve-then-rerank pipeline: embed the query, fetch a broad candidate set from your vector database, rerank for precision, then pass the top results to an LLM as context.
python
# 1. Generate query embedding
query_embedding = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input="How does photosynthesis work?",
).data[0].embedding

# 2. Retrieve candidates from vector DB (your code)
candidates = vector_db.search(query_embedding, top_k=20)

# 3. Rerank for precision
reranked = client.rerank.create(
    model="mixedbread-ai/Mxbai-Rerank-Large-V2",
    query="How does photosynthesis work?",
    documents=[c.text for c in candidates],
    top_n=5,
)

# 4. Use top results as context for LLM
context = "\n".join([candidates[r.index].text for r in reranked.results])
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": f"Answer based on this context:\n{context}"},
        {"role": "user", "content": "How does photosynthesis work?"},
    ],
)

Resources
