Multi-Turn Chat
The Chat API is built on LMDeploy; refer to the LMDeploy documentation if you want to deploy the same API privately.
🚀 News
Intern-S1 Released: Intern-S1 is a state-of-the-art open-source multimodal model built on a Mixture-of-Experts (MoE) architecture, with enhanced scientific reasoning capabilities. It currently delivers the best overall performance among open-source models.
- To use this model via the Chat API, set the model field to intern-s1. By default, the model operates in standard (non-deep-thinking) mode.
- You can enable deep thinking by setting the optional thinking_mode field (boolean) to true when using the intern-s1 model.
Rate Limit
- API Rate Limit: By default, each user is limited to 30 requests per minute. If you require a higher request limit, you can apply for an upgraded rate limit configuration at Rate Limiting Policy.
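If you automate calls close to this limit, it is worth backing off and retrying when the server rejects a request for rate reasons. A minimal Python sketch of exponential backoff; the RateLimited exception and call_with_backoff helper are illustrative names, not part of the API:

```python
import time

class RateLimited(Exception):
    """Illustrative exception; raise it when the API answers with an HTTP 429 status."""

def call_with_backoff(send, max_attempts=4, base_delay=1.0):
    """Retry send() with exponential backoff: base_delay, then 2x, 4x, ..."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** attempt)
```

Wrap each request-sending callable with this helper so transient rate-limit rejections do not abort a batch job.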
Request Examples
Currently, the Intern Chat API is compatible with a subset of the OpenAI Python SDK methods, and more adaptations are in progress. We still recommend using native Python requests or curl to access the Intern API.
For users opting to use the OpenAI SDK, please install it first:
pip install openai
Non-Streaming Request
(1) Python Example
- Python Requests
import requests
import json

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please provide a valid token!'
}
data = {
    "model": "intern-latest",
    "messages": [{
        "role": "user",
        "content": "Hello!"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9
}
res = requests.post(url, headers=headers, data=json.dumps(data))
print(res.status_code)
print(res.json())
print(res.json()["choices"][0]["message"]["content"])
- Using the OpenAI Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",  # Token is passed here without 'Bearer'
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)
chat_rsp = client.chat.completions.create(
    model="intern-latest",
    messages=[{"role": "user", "content": "hello"}],
)
for choice in chat_rsp.choices:
    print(choice.message.content)
Parameter Description:
- Supported parameters: model, messages, n, temperature, top_p, stream, max_tokens, tools
- Other parameters are not supported yet
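Putting the supported parameters together, a full non-streaming request body might be assembled like this sketch (all values are illustrative):

```python
import json

# All parameters the API currently accepts; anything else is unsupported.
payload = {
    "model": "intern-latest",
    "messages": [{"role": "user", "content": "Hello!"}],
    "n": 1,               # number of completions to generate
    "temperature": 0.8,   # sampling temperature
    "top_p": 0.9,         # nucleus-sampling threshold
    "stream": False,      # set to True for incremental output
    "max_tokens": 512,    # cap on output length; 0 means the maximum allowed
}
body = json.dumps(payload)  # serialize before POSTing with requests
```

A tools array may also be added; see the Tool Call Request Example below.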
(2) CLI Example
openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please provide a valid token!" \
api chat.completions.create \
-m "intern-latest" \
-g user hello
Note:
- Supported parameters: -g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P]
- --stop STOP is currently not supported
(3) curl Example
curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "intern-latest",
    "messages": [{
        "role": "user",
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a renowned Chinese science fiction writer and engineer, having won multiple awards, including the Hugo and Nebula awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8,
    "top_p": 0.9
}'
Streaming Request Example
(1) Python Example
- Python Requests
import requests
import json

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please fill in the correct token!'
}
data = {
    "model": "intern-latest",
    "messages": [{
        "role": "user",
        "content": "Hello~"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": True,
}
response = requests.post(url, headers=headers, data=json.dumps(data), stream=True)
for chunk in response.iter_lines(chunk_size=8192, decode_unicode=False, delimiter=b'\n'):
    if not chunk:
        continue
    decoded = chunk.decode('utf-8')
    if not decoded.startswith("data:"):
        raise Exception(f"error message: {decoded}")
    # Note: str.strip("data:") removes characters, not a prefix; slice instead.
    decoded = decoded[len("data:"):].strip()
    if decoded == "[DONE]":
        print("finish!")
        break
    output = json.loads(decoded)
    if output["object"] == "error":
        raise Exception(f"logic error: {output}")
    print(output["choices"][0]["delta"]["content"])
- Using the OpenAI Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please fill in the correct token!",  # Pass token here, without 'Bearer'
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)
chat_rsp = client.chat.completions.create(
    model="intern-latest",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)
for chunk in chat_rsp:
    if chunk.choices and chunk.choices[0].delta.content:  # delta.content can be None on the final chunk
        print(chunk.choices[0].delta.content)
(2) CLI Example
openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please fill in the correct token!" \
api chat.completions.create \
-m "intern-latest" \
--stream \
-g user hello
Note:
- Supported parameters: -g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P] --stream
- [--stop STOP] is currently not supported
(3) curl Example
curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "intern-latest",
    "messages": [{
        "role": "user",
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": true
}'
Image-Text Request Example
The following example shows the HTTP request body. For SDK-specific usage, refer to the Python, CLI, or curl examples above.
// Request
{
    "model": "internvl2.5-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello, I am internvl"
        },
        {
            "role": "user",
            "content": [ // user's input with image and text (array)
                {
                    "type": "text", // type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64 image
                    }
                },
                {
                    "type": "image_url", // multiple images per round are supported
                    "image_url": {
                        "url": "data:image/jpeg;base64,{<encode_image(image_path)>}" // replace <encode_image(image_path)> with your base64-encoded image
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
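To build the base64 form of the image_url field, a local file can be encoded like this sketch (the helper names are illustrative, and the data-URL media type should match your file format):

```python
import base64

def encode_image(image_path):
    """Read a local image file and return its contents base64-encoded as ASCII text."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def image_data_url(image_path, mime="image/jpeg"):
    """Build the data URL expected by the image_url field of the request body."""
    return f"data:{mime};base64,{encode_image(image_path)}"
```

The returned string goes directly into "url" in place of the {<encode_image(image_path)>} placeholder above.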
Tool Call Request Example
// Request
{
    "model": "intern-latest", // Default model is the latest version
    "messages": [
        {
            "role": "user",
            "content": "What's the weather today?"
        },
        {
            "role": "assistant",
            "content": "I need to use the get_current_weather API to check today's weather in Shanghai",
            "tool_calls": [
                {
                    "id": "97102",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "content": "{'location': 'shanghai', 'temperature': '40', 'unit': 'celsius'}",
            "tool_call_id": "97102"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
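The tool-call flow above has three steps: the model returns tool_calls, the client runs the named function locally, and the result goes back as a role "tool" message whose tool_call_id matches the call's id. A sketch of the client-side dispatch, assuming the arguments string arrives as valid JSON; the local get_current_weather implementation is illustrative:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Illustrative local implementation; a real client would query a weather service.
    return {"location": location, "temperature": "40", "unit": unit}

def run_tool_call(tool_call):
    """Execute one entry of the model's tool_calls and build the follow-up message."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = {"get_current_weather": get_current_weather}[fn["name"]](**args)
    return {
        "role": "tool",
        "content": json.dumps(result),   # result is sent back as a JSON string
        "tool_call_id": tool_call["id"],  # must match the id the model produced
    }
```

Append the returned message to the conversation and send a new request so the model can produce its final answer.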
Response Examples
Example of Non-streaming Response
// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
Authorization: $BearerToken
// Body
{
    "model": "intern-latest", // Default model is the latest version
    "messages": [{
        "role": "user", // Supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "n": 1 // default=1
}
// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // Unique identifier for this response
    "model": "intern-latest", // Model ID
    "created": 1677652288, // Timestamp
    "choices": [{ // Model's response content
        "index": 0, // First choice
        "message": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015. This was also the first time an Asian science fiction writer won this prestigious award...." // Response from Puyu
        },
        "finish_reason": "stop" // Options: length / stop / tool_calls; length means the result exceeded max_tokens
    }],
    "moderation": { // Content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}
Example of Streaming Response
// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
Authorization: $BearerToken
// Body
{
    "model": "intern-latest", // Default model is the latest version
    "messages": [{
        "role": "user", // Supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": true
}
// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // Unique identifier for this response
    "model": "intern-latest", // Model ID
    "created": 1677652288, // Timestamp
    "choices": [{ // Model's response content
        "index": 0, // First choice
        "delta": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015" // Response from Puyu
        },
        "finish_reason": "" // Options: length / stop / empty string; length means the result exceeded max_tokens, and an empty string means output is still in progress
    }],
    "moderation": { // Content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}
Example of Tool Call Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // Unique identifier for this response
    "model": "intern-latest", // Model ID
    "created": 1677652288, // Creation timestamp
    "choices": [{ // Model's response content
        "index": 0, // Entry 0
        "message": {
            "role": "assistant",
            "content": "get_current_weather",
            "tool_calls": [
                {
                    "id": "97102",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
                    }
                }
            ]
        },
        "finish_reason": "tool_calls"
    }],
    "moderation": {
        "sensitive": false,
        "category": ""
    }
}
Parameter Explanation
The API sets a 120-second timeout on each model call: if the model has not finished its output within that window, generation is interrupted and the partial result is returned.
Request Parameters Explanation
Parameter | Type | Example | Description |
---|---|---|---|
model | string | intern-latest | Name of the model being called |
messages | array | LLM: [{"role":"user","content":"Hello"}]; multimodal model: see "Image-Text Messages" below | Conversation history and the current query. Supported roles: user/assistant/system/tool. "role": "system" is not supported for the internthinker-beta model |
thinking_mode | Optional, boolean | true | Only effective when model is set to intern-s1; controls whether the model responds in deep-thinking mode. When thinking_mode is enabled, setting a system message is not allowed. |
tools | Optional, array | See tools example below | An array of functions available for the model to call. Based on this array, the model will respond with the corresponding function's JSON object. |
tool_choice | Optional, string or object | {"type": "function", "function": {"name": "my_function"}} | Options: none (forces no function call), auto (automatic function call), required (must call a function), or provide a specific function object that must be called. |
temperature | Optional, float | 0.8 | Sampling temperature |
top_p | Optional, float | 0.9 | The probability threshold for candidate tokens |
max_tokens | Optional, int | 100 | The maximum number of tokens for the output. Default 0 adjusts to the maximum allowed length |
stream | Optional, bool | true | Stream incremental results. Does not support streaming with tool calls. |
- Image-Text Messages
{
    "model": "internvl-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello, I am internvl"
        },
        {
            "role": "user",
            "content": [ // user's input with image and text (array)
                {
                    "type": "text", // type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64 image
                    }
                },
                {
                    "type": "image_url", // multiple images per round are supported
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/puyu/demo/000-2x.jpg"
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
- Tools Example
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
Response Parameters Explanation
Parameter | Type | Example | Description |
---|---|---|---|
id | string | chatcmpl-123 | Unique session ID |
model | string | intern-latest | Model name used |
created | int | 1677652288 | Creation timestamp |
choices | array | [{"index":0,"message":{"role":"assistant","content":"hi"},"finish_reason":"stop"}] | Model's reply |
moderation | object | {"sensitive":false,"category":""} | Content moderation info (not yet supported) |
Explanation of Related Structures
- Choice Structure Explanation
Parameter | Type | Example | Description |
---|---|---|---|
index | int | 0 | Index of the result in case of multiple generations |
message | object | {"role":"assistant", "content":"Hello"} | Represents a query or response. Only exists in non-streaming requests |
delta | object | {"role":"assistant", "content":"Hello"} | Represents a query or response. Only exists in streaming requests, and does not support tool calls during streaming. |
finish_reason | string | stop | Options: length / stop / tool_calls. length indicates that the response exceeds max_tokens. |
- Message Structure Explanation
Parameter | Type | Example | Description |
---|---|---|---|
role | string | user | Indicates user query (user) or model response (assistant) |
content | string | Hello | Dialogue content. When sending a "tool" request, the content is the JSON string of the function call's response |
tool_call_id | string | 97102 | Required when sending a "tool" request or if the model generates tool_calls content. These IDs must match. |
tool_calls | array | See tools_calls example below | The JSON-formatted function call returned by the model. |
- Tool Calls Example
tool_calls = [
    {
        "id": "97102",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
        }
    }
]
- Moderation Structure Explanation
Parameter | Type | Example | Description |
---|---|---|---|
sensitive | bool | false | Indicates whether content is sensitive |
category | string | ad | Classification of sensitive content |
Common Errors
Error Code | Description | Suggested Solution |
---|---|---|
-10002 | Invalid parameter | Check if the query is empty |
-20004 | Non-whitelisted user | Apply for access through the website |
-20013 | Requested model does not exist | Verify if the model parameter is supported |
-20018 | Incorrect message format | Verify the message format |
-20035 | Account not linked to a phone number | Bind a phone number via the website |
-20053 | Exceeded frequency limit (req/min or tokens/min) | Ensure the request frequency is within allowed limits, or apply for higher request frequency |
A0202 | User authentication failed | Verify if the provided token is correct and formatted consistently (including Bearer) |
A0211 | Token expired | Confirm if the token has expired |
C1114 | Token is not for this service | Verify if the token is meant for this service |
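When automating requests, it can help to branch on these documented codes. A rough sketch of a client-side classifier; how the code is surfaced in the error payload is not specified here, so inspect the actual response body before wiring this up:

```python
# Map the documented Intern API error codes to a coarse client-side reaction.
RETRYABLE = {"-20053"}                     # rate limit exceeded: back off and retry
AUTH_ERRORS = {"A0202", "A0211", "C1114"}  # token problems: refresh or replace credentials

def classify_error(code):
    """Return a coarse recovery action for a documented error code (as a string)."""
    if code in RETRYABLE:
        return "retry"
    if code in AUTH_ERRORS:
        return "fix-token"
    return "fix-request"  # e.g. -10002, -20013, -20018: correct the payload itself
```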