Multi-Turn Chat
The Chat API is deployed with LMDeploy. You can refer to the LMDeploy documentation to deploy the same API privately.
News
- The first model of the InternLM3 series, InternLM3-8B-Instruct, has been released as open source, and API support is available. Set the `model` field to `internlm3-latest` to run inference with this model.
- The API now supports the InternVL2.5 multimodal model series. To use these models for image-text inference, set the `model` field to `internvl2.5-latest`.
Rate Limit
- API Rate Limit: By default, each user is limited to 10 requests per minute. If you require a higher limit, you can apply for an upgraded rate-limit configuration under the Rate Limiting Policy. A simple client-side retry strategy is sketched below.
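If the limit is exceeded, requests are rejected (error code -20053 in the Common Errors table at the end of this page). The following is a minimal retry sketch; the assumption that the rejection arrives as HTTP 429 is ours, so adapt the check to the status or error code you actually observe.

import time
import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    """POST with exponential backoff on rate-limit rejections.

    Assumes the rate limiter answers with HTTP 429; if your deployment
    instead returns 200 with error code -20053 in the body, adapt the check.
    """
    for attempt in range(max_retries):
        res = requests.post(url, headers=headers, json=payload)
        if res.status_code != 429:  # assumption: 429 signals rate limiting
            return res
        time.sleep(2 ** attempt)  # back off: 1 s, 2 s, 4 s, ...
    raise RuntimeError("still rate-limited after retries")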
Request Examples
Currently, the InternLM Chat API is compatible with some methods of the OpenAI Python SDK, and more adaptations are in progress. We still recommend accessing the Intern API with native Python requests or curl.
For users opting to use the OpenAI SDK, please install it first:
pip install openai
Non-Streaming Request Example
(1) Python Example
- Python Requests
import requests
import json

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please provide a valid token!'
}
data = {
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Hello!"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9
}

res = requests.post(url, headers=headers, data=json.dumps(data))
print(res.status_code)
print(res.json())
print(res.json()["choices"][0]["message"]["content"])
- Using the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",  # pass the token here without the 'Bearer' prefix
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

chat_rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "hello"}],
)

for choice in chat_rsp.choices:
    print(choice.message.content)
Parameter Description:
- Supports `model`, `messages`, `n`, `temperature`, `top_p`, `stream`, `max_tokens`, and `tools`
- Other parameters are not supported yet
(2) CLI Example
openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please provide a valid token!" \
api chat.completions.create \
-m "internlm3-latest" \
-g user hello
Note:
- Supported parameters: `-g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P]`
- `--stop STOP` is currently not supported
(3) curl Example
curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
"model": "internlm3-latest",
"messages": [{
"role": "user",
"content": "Do you know Liu Cixin?"
}, {
"role": "assistant",
"content": "As an AI assistant, I know Liu Cixin. He is a renowned Chinese science fiction writer and engineer, having won multiple awards, including the Hugo and Nebula awards."
},{
"role": "user",
"content": "Which of his works won the Hugo Award?"
}],
"temperature": 0.8,
"top_p": 0.9
}'
Streaming Request Example
(1) Python Example
- Python Requests
import requests
import json

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please fill in the correct token!'
}
data = {
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Hello~"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": True,
}

response = requests.post(url, headers=headers, data=json.dumps(data), stream=True)
for chunk in response.iter_lines(chunk_size=8192, decode_unicode=False, delimiter=b'\n'):
    if not chunk:
        continue
    decoded = chunk.decode('utf-8')
    if not decoded.startswith("data:"):
        raise Exception(f"error message {decoded}")
    # Slice off the "data:" prefix; str.strip("data:") would also eat
    # leading/trailing characters of the payload itself.
    decoded = decoded[len("data:"):].strip()
    if decoded == "[DONE]":
        print("finish!")
        break
    output = json.loads(decoded)
    if output["object"] == "error":
        raise Exception(f"logic err: {output}")
    # The first delta may carry only the role, so default to an empty string.
    print(output["choices"][0]["delta"].get("content", ""))
- Using the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please fill in the correct token!",  # pass the token here without the 'Bearer' prefix
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

chat_rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)

for chunk in chat_rsp:
    # delta.content can be None on the first/last chunk
    print(chunk.choices[0].delta.content or "", end="")
(2) CLI Example
openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please fill in the correct token!" \
api chat.completions.create \
-m "internlm3-latest" \
--stream \
-g user hello
Note:
- Supported parameters: `-g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P] --stream`
- `[--stop STOP]` is currently not supported
(3) curl Example
curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
"model": "internlm3-latest",
"messages": [{
"role": "user",
"content": "Do you know Liu Cixin?"
}, {
"role": "assistant",
"content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
},{
"role": "user",
"content": "Which of his works won the Hugo Award?"
}],
"temperature": 0.8,
"top_p": 0.9,
"stream": true
}'
Image-Text Request Example
The following shows the HTTP request body. For concrete SDK usage, refer to the Python Example, CLI Example, or curl Example above.
// Request
{
    "model": "internvl2.5-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello! I am internvl"
        },
        {
            "role": "user",
            "content": [ // the user's input with image and text (array)
                {
                    "type": "text", // the type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64-encoded image
                    }
                },
                {
                    "type": "image_url", // multiple images in one round are supported
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/puyu/demo/000-2x.jpg"
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
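For instance, the body above can be sent with Python requests exactly like the text-only examples. Everything in this sketch comes from this document; only the token is a placeholder.

import requests

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please provide a valid token!'
}
data = {
    "model": "internvl2.5-latest",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the image please"},
            {"type": "image_url", "image_url": {
                "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png"
            }}
        ]
    }],
    "temperature": 0.8,
    "top_p": 0.9,
    "max_tokens": 100
}

res = requests.post(url, headers=headers, json=data)
print(res.json()["choices"][0]["message"]["content"])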
Tool Call Request Example
// New Request
{
    "model": "internlm3-latest", // Default model is the latest version
    "messages": [
        {
            "role": "user",
            "content": "What's the weather today?"
        },
        {
            "role": "assistant",
            "content": "I need to use the get_current_weather API to check today's weather in Shanghai",
            "tool_calls": [
                {
                    "id": "97102",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "content": "{'location': 'shanghai', 'temperature': '40', 'unit': 'celsius'}",
            "tool_call_id": "97102"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": [
                                "celsius",
                                "fahrenheit"
                            ]
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
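The request above shows the wire format. End to end, a tool call is a loop: the model emits tool_calls, the client runs the function locally, then replies with a tool message whose tool_call_id matches. The sketch below uses the OpenAI SDK and assumes the endpoint's OpenAI compatibility covers non-streaming tool calls (the document lists tools among the supported parameters); get_current_weather is a local stub.

import ast
import json
from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string",
                             "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}]

def get_current_weather(location, unit="celsius"):
    # Stub standing in for a real weather lookup.
    return json.dumps({"location": location, "temperature": "40", "unit": unit})

messages = [{"role": "user", "content": "What's the weather today?"}]
rsp = client.chat.completions.create(model="internlm3-latest", messages=messages, tools=tools)
msg = rsp.choices[0].message

if msg.tool_calls:
    messages.append(msg)  # keep the assistant turn that requested the call
    for call in msg.tool_calls:
        try:
            args = json.loads(call.function.arguments)
        except json.JSONDecodeError:
            # The examples in this document show single-quoted arguments,
            # which are not valid JSON; fall back to a Python-literal parse.
            args = ast.literal_eval(call.function.arguments)
        result = get_current_weather(**args)
        messages.append({"role": "tool", "content": result,
                         "tool_call_id": call.id})
    final = client.chat.completions.create(model="internlm3-latest",
                                           messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)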
Response Examples
Example of Non-streaming Request
// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
    Authorization: $BearerToken
// Body
{
    "model": "internlm3-latest", // Default model is the latest version
    "messages": [{
        "role": "user", // Supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "n": 1 // default=1
}
// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // Unique identifier for this response
    "model": "internlm3-latest", // Model ID
    "created": 1677652288, // Timestamp
    "choices": [{ // Model's response content
        "index": 0, // First choice
        "message": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015. This was also the first time an Asian science fiction writer won this prestigious award...." // Response from Puyu
        },
        "finish_reason": "stop" // Options: length / stop / tool_calls; length means the result exceeded max_tokens
    }],
    "moderation": { // Content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}
Example of Streaming Request
// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
    Authorization: $BearerToken
// Body
{
    "model": "internlm3-latest", // Default model is the latest version
    "messages": [{
        "role": "user", // Supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": true
}
// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // Unique identifier for this response
    "model": "internlm3-latest", // Model ID
    "created": 1677652288, // Timestamp
    "choices": [{ // Model's response content
        "index": 0, // First choice
        "delta": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015" // Response from Puyu
        },
        "finish_reason": "" // Options: length / stop / empty string; length means the result exceeded max_tokens, and an empty string means the output is not finished yet
    }],
    "moderation": { // Content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}
Example of Tool Call Response
Status Code: 200
Body:
{
"id": "chatcmpl-123", // Unique identifier for this response
"model": "internlm3-latest", // Model ID
"created": 1677652288, // Creation timestamp
"choices": [{ // Model's response content
"index": 0, // Entry 0
"message": {
"role": "assistant",
"content": "get_current_weather",
"tool_calls": [
{
"id": "97102",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
}
}
]
},
"finish_reason": "tool_calls"
}],
"moderation": {
"sensitive": false,
"category": ""
}
}
Parameter Explanation
The API enforces a timeout on model calls: if the model has not completed its output within 120 seconds, generation is interrupted and the result so far is returned.
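On the client side it is worth setting a request timeout slightly above that cap, so a slow generation still delivers its partial result instead of being abandoned locally. A minimal sketch with Python requests follows; the 130-second figure is our suggestion, not a documented value.

import requests

url = 'https://chat.intern-ai.org.cn/api/v1/chat/completions'
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer eyJ0eXBlIjoiSl...please provide a valid token!'
}
data = {
    "model": "internlm3-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
}

# The server stops generating after 120 s and returns what it has, so the
# client timeout only needs to be a bit above that (130 s is an assumption).
res = requests.post(url, headers=headers, json=data, timeout=130)
print(res.json()["choices"][0]["message"]["content"])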
Request Parameters Explanation
| Parameter | Type | Example | Description |
|---|---|---|---|
| model | string | internlm3-latest | Name of the model being called |
| messages | array | LLM: `[{"role":"user","content":"Hello"}]`; multimodal model: see "Image-Text Messages" below | Conversation history and current query. Supported roles: user/assistant/system/tool. `"role": "system"` is not supported for the internthinker-beta model |
| tools | Optional, array | See the tools example below | An array of functions available for the model to call. Based on this array, the model will respond with the corresponding function's JSON object |
| tool_choice | Optional, string or object | `{"type": "function", "function": {"name": "my_function"}}` | Options: none (forces no function call), auto (automatic function call), required (must call a function), or provide a specific function object that must be called |
| temperature | Optional, float | 0.8 | Sampling temperature |
| top_p | Optional, float | 0.9 | The probability threshold for candidate tokens |
| max_tokens | Optional, int | 100 | The maximum number of tokens for the output. The default of 0 adjusts to the maximum allowed length |
| stream | Optional, bool | true | Stream incremental results. Streaming does not support tool calls |
- Image-Text Messages
{
    "model": "internvl-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello! I am internvl"
        },
        {
            "role": "user",
            "content": [ // the user's input with image and text (array)
                {
                    "type": "text", // the type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64-encoded image
                    }
                },
                {
                    "type": "image_url", // multiple images in one round are supported
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/puyu/demo/000-2x.jpg"
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
- Tools Example
tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
},
}
]
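The `tool_choice` parameter from the table above can be combined with this tools array to force a particular function call. Below is a minimal sketch with the OpenAI SDK, reusing the document's endpoint and example function; whether this endpoint honors every tool_choice variant is not documented, so treat it as illustrative.

from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "What's the weather today?"}],
    tools=tools,  # the tools array defined in the example above
    # Force a call to get_current_weather instead of a free-form answer.
    tool_choice={"type": "function", "function": {"name": "get_current_weather"}},
)
print(rsp.choices[0].message.tool_calls)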
Response Parameters Explanation
| Parameter | Type | Example | Description |
|---|---|---|---|
| id | string | chatcmpl-123 | Unique session ID |
| model | string | internlm3-latest | Model name used |
| created | int | 1677652288 | Creation timestamp |
| choices | array | `[{"index":0,"message":{"role":"assistant","content":"hi"},"finish_reason":"stop"}]` | Model's reply |
| moderation | object | `{"sensitive":false,"category":""}` | Content moderation info (not yet supported) |
Explanation of Related Structures
- Choice Structure Explanation
| Parameter | Type | Example | Description |
|---|---|---|---|
| index | int | 0 | Index of the result when multiple generations are returned |
| message | object | `{"role":"assistant", "content":"Hello"}` | Represents a query or response. Present only in non-streaming responses |
| delta | object | `{"role":"assistant", "content":"Hello"}` | Represents a query or response. Present only in streaming responses; tool calls are not supported during streaming |
| finish_reason | string | stop | Options: length / stop / tool_calls. length indicates that the response exceeded max_tokens |
- Message Structure Explanation
| Parameter | Type | Example | Description |
|---|---|---|---|
| role | string | user | Indicates a user query (user) or a model response (assistant) |
| content | string | Hello | Dialogue content. When sending a "tool" request, the content is the JSON string of the function call's response |
| tool_call_id | string | 97102 | Required when sending a "tool" request or when the model generates tool_calls content; the IDs must match |
| tool_calls | array | See the tool_calls example below | The JSON-formatted function call returned by the model |
- Tool Calls Example
tool_calls = [
{
"id": "97102",
"type": "function",
"function": {
"name": "get_current_weather",
"arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
}
}
]
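Note that the documented `arguments` string is single-quoted, which `json.loads` rejects. If responses really arrive in this form, Python's `ast.literal_eval` parses them safely, as in this sketch.

import ast
import json

arguments = "{'location': 'shanghai', 'unit': 'celsius'}"  # as documented above

try:
    args = json.loads(arguments)        # standard JSON-encoded arguments
except json.JSONDecodeError:
    args = ast.literal_eval(arguments)  # single-quoted, Python-literal style

print(args["location"])  # -> shanghai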
- Moderation Structure Explanation
| Parameter | Type | Example | Description |
|---|---|---|---|
| sensitive | bool | false | Indicates whether content is sensitive |
| category | string | ad | Classification of sensitive content |
Common Errors
| Error Code | Description | Suggested Solution |
|---|---|---|
| -10002 | Invalid parameter | Check if the query is empty |
| -20004 | Non-whitelisted user | Apply for access through the website |
| -20013 | Requested model does not exist | Verify if the model parameter is supported |
| -20018 | Incorrect message format | Verify the message format |
| -20035 | Account not linked to a phone number | Bind a phone number via the website |
| -20053 | Exceeded frequency limit (req/min or tokens/min) | Ensure the request frequency is within allowed limits, or apply for a higher request frequency |
| A0202 | User authentication failed | Verify if the provided token is correct and formatted consistently (including Bearer) |
| A0211 | Token expired | Confirm if the token has expired |
| C1114 | Token is not for this service | Verify if the token is meant for this service |
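When a call fails, matching the returned code against this table speeds up debugging. The document does not show a full error response body, so the helper below assumes a non-200 status with "code" and "msg" fields in a JSON body; verify the actual shape against a live response before relying on it.

import requests

# Codes from the table above; the body layout ("code"/"msg") is an assumption,
# since this document does not show a full error response.
KNOWN_ERRORS = {
    -10002: "Invalid parameter: check whether the query is empty",
    -20013: "Requested model does not exist: verify the model parameter",
    -20053: "Rate limit exceeded: slow down or apply for a higher limit",
}

def check_response(res: requests.Response) -> dict:
    if res.status_code != 200:
        try:
            body = res.json()
        except ValueError:
            raise RuntimeError(f"HTTP {res.status_code}: {res.text}")
        code = body.get("code")
        hint = KNOWN_ERRORS.get(code, body.get("msg", "unknown error"))
        raise RuntimeError(f"API error {code}: {hint}")
    return res.json()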