Chat with Sarvam-M


Sarvam-M is Sarvam AI's flagship language model optimized for Indian languages with hybrid thinking capabilities for complex reasoning.

Key Features


  • 24B parameters optimized for Indic languages
  • Hybrid thinking mode for complex reasoning
  • Free to use (no cost)
  • OpenAI-compatible API
  • Multilingual chat in English + 10 Indian languages

Installation


```bash
pip install sarvamai
```

or

```bash
pip install openai  # OpenAI-compatible
```

Quick Start


Sarvam SDK


```python
from sarvamai import SarvamAI

client = SarvamAI()

response = client.chat.completions.create(
    model="sarvam-m",
    messages=[
        {"role": "user", "content": "भारत की राजधानी क्या है?"}
    ]
)

print(response.choices[0].message.content)
```

OpenAI-Compatible


```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SARVAM_API_KEY"],
    base_url="https://api.sarvam.ai/v1"
)

response = client.chat.completions.create(
    model="sarvam-m",
    messages=[
        {"role": "user", "content": "What is the capital of India?"}
    ]
)

print(response.choices[0].message.content)
```

Hybrid Thinking Mode


Enable step-by-step reasoning for complex problems:

```python
response = client.chat.completions.create(
    model="sarvam-m",
    messages=[
        {
            "role": "user",
            "content": "Solve: If a train travels 120 km in 2 hours, what is its average speed?"
        }
    ],
    thinking=True  # Enable reasoning
)
```

Access the thinking process:

```python
print("Thinking:", response.choices[0].message.thinking)
print("Answer:", response.choices[0].message.content)
```

System Prompts


Guide the model's behavior:

```python
response = client.chat.completions.create(
    model="sarvam-m",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful Hindi tutor. Always respond in Hindi with English transliteration in parentheses."
        },
        {
            "role": "user",
            "content": "How do I say 'Good morning'?"
        }
    ]
)
```

Multi-turn Conversations


Maintain context across turns:

```python
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant."},
    {"role": "user", "content": "Tell me about the Taj Mahal"},
    {"role": "assistant", "content": "The Taj Mahal is a white marble mausoleum..."},
    {"role": "user", "content": "Who built it and when?"}
]

response = client.chat.completions.create(
    model="sarvam-m",
    messages=messages
)
```

Streaming


Stream responses token by token:

```python
stream = client.chat.completions.create(
    model="sarvam-m",
    messages=[
        {"role": "user", "content": "Write a short poem about India"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Temperature Control


Control randomness:

```python
# Creative (higher temperature)
response = client.chat.completions.create(
    model="sarvam-m",
    messages=[{"role": "user", "content": "Write a creative story"}],
    temperature=0.9
)

# Factual (lower temperature)
response = client.chat.completions.create(
    model="sarvam-m",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    temperature=0.1
)
```

JavaScript


```javascript
import { SarvamAI } from "sarvamai";

const client = new SarvamAI();

const response = await client.chat.completions.create({
  model: "sarvam-m",
  messages: [
    { role: "user", content: "भारत की राजधानी क्या है?" }
  ]
});

console.log(response.choices[0].message.content);
```

OpenAI SDK


```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SARVAM_API_KEY,
  baseURL: "https://api.sarvam.ai/v1"
});

const response = await client.chat.completions.create({
  model: "sarvam-m",
  messages: [
    { role: "user", content: "Hello!" }
  ]
});
```

cURL


```bash
curl -X POST "https://api.sarvam.ai/v1/chat/completions" \
  -H "api-subscription-key: $SARVAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sarvam-m",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of India?"
        }
    ]
}'
```

Parameters


| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | `sarvam-m` |
| `messages` | array | Yes | Conversation history |
| `temperature` | float | No | 0.0-2.0 (default: 1.0) |
| `max_tokens` | int | No | Max response length |
| `stream` | bool | No | Enable streaming |
| `thinking` | bool | No | Enable hybrid thinking |
| `top_p` | float | No | Nucleus sampling (0.0-1.0) |
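Several of these parameters are typically combined in one request. The sketch below assembles the keyword arguments for a call that caps response length and tightens sampling; `build_chat_kwargs` is a hypothetical helper for illustration, not part of the Sarvam SDK.

```python
def build_chat_kwargs(prompt, max_tokens=256, temperature=0.3, top_p=0.9):
    # Assemble keyword arguments for client.chat.completions.create().
    # Illustrative helper, not part of the Sarvam SDK.
    return {
        "model": "sarvam-m",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,    # cap response length
        "temperature": temperature,  # lower = more deterministic
        "top_p": top_p,              # nucleus sampling cutoff
    }

kwargs = build_chat_kwargs("Summarise the history of Delhi in two sentences.")
# response = client.chat.completions.create(**kwargs)
```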

Response


```json
{
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "sarvam-m",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "भारत की राजधानी नई दिल्ली है।"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 15,
        "completion_tokens": 12,
        "total_tokens": 27
    }
}
```

See references/prompting.md for advanced prompting techniques.
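If you call the HTTP endpoint directly (e.g. via the cURL example above) and handle raw JSON rather than SDK objects, the fields can be read back like this; the sample payload mirrors the response shown above.

```python
import json

# Sample payload mirroring the response shown above.
raw = '''{
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "sarvam-m",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "भारत की राजधानी नई दिल्ली है।"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 15,
        "completion_tokens": 12,
        "total_tokens": 27
    }
}'''

data = json.loads(raw)
answer = data["choices"][0]["message"]["content"]
total = data["usage"]["total_tokens"]
print(answer)
print(total)  # 27
```

Checking `finish_reason` (here `"stop"`) tells you whether the model completed naturally or was cut off by a `max_tokens` limit.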