Multi-Turn Chat

The Chat API is deployed with LMDeploy. You can refer to the LMDeploy documentation to stand up a private deployment of the same API.

πŸš€ News

  • The first model of the InternLM3 series, InternLM3-8B-Instruct, has been released as open-source, and API support is available. Use internlm3-latest in the model field to enable inference with this model.
  • The API now supports InternVL2.5 multimodal model series. To use this model for image-text inference, simply set the model field to internvl2.5-latest.

Rate Limit

  • API rate limit: by default, each user is limited to 10 requests per minute. If you need a higher limit, you can apply for an upgraded configuration under the Rate Limiting Policy.
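
If you only occasionally exceed the limit, client-side backoff is usually enough. The following is a minimal sketch, assuming a rate-limited call is signaled by HTTP status 429 or by error code -20053 (see Common Errors below); verify against your actual responses.

import time

import requests

API_URL = "https://chat.intern-ai.org.cn/api/v1/chat/completions"

def post_with_backoff(headers, payload, max_retries=5):
    """POST to the Chat API, retrying with exponential backoff when rate-limited."""
    for attempt in range(max_retries):
        res = requests.post(API_URL, headers=headers, json=payload)
        # Assumption: a rate-limited call returns HTTP 429; adjust this check if
        # the API signals -20053 in the response body instead.
        if res.status_code != 429:
            return res
        time.sleep(2 ** attempt)  # wait 1 s, 2 s, 4 s, ...
    raise RuntimeError("still rate-limited after retries")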

Request Examples

The InternLM Chat API is currently compatible with a subset of the OpenAI Python SDK methods, and further adaptation is in progress. For now, we still recommend accessing the Intern API with plain Python requests or curl.

For users opting to use the OpenAI SDK, please install it first:

pip install openai

Non-Streaming Request Example

(1) Python Example

  • Python Requests
import requests
import json

url = "https://chat.intern-ai.org.cn/api/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer eyJ0eXBlIjoiSl...please provide a valid token!"
}
data = {
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Hello!"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9
}

res = requests.post(url, headers=headers, data=json.dumps(data))
print(res.status_code)
print(res.json())
print(res.json()["choices"][0]["message"]["content"])
  • Using the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",  # pass the token here without the 'Bearer' prefix
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

chat_rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "hello"}],
)

for choice in chat_rsp.choices:
    print(choice.message.content)

Parameter Description:

  • Supports model, messages, n, temperature, top_p, stream, max_tokens, tools
  • Other parameters are not supported yet
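
For reference, the supported parameters map directly onto the OpenAI SDK's keyword arguments. A short sketch (token placeholder as above; the max_tokens value is arbitrary):

from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

# Only the parameters listed above are accepted; others are not supported yet.
chat_rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "hello"}],
    n=1,
    temperature=0.8,
    top_p=0.9,
    max_tokens=512,
)
print(chat_rsp.choices[0].message.content)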

(2) CLI Example

openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please provide a valid token!" \
api chat.completions.create \
-m "internlm3-latest" \
-g user hello

Note:

  • Supports parameters -g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P]
  • --stop STOP is currently not supported

(3) curl Example

curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a renowned Chinese science fiction writer and engineer, having won multiple awards, including the Hugo and Nebula awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8,
    "top_p": 0.9
}'

Streaming Request Example

(1) Python Example

  • Python Requests
import requests
import json

url = "https://chat.intern-ai.org.cn/api/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer eyJ0eXBlIjoiSl...please provide a valid token!"
}
data = {
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Hello~"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": True,
}

response = requests.post(url, headers=headers, data=json.dumps(data), stream=True)
for chunk in response.iter_lines(chunk_size=8192, decode_unicode=False, delimiter=b'\n'):
    if not chunk:
        continue
    decoded = chunk.decode("utf-8")
    if not decoded.startswith("data:"):
        raise Exception(f"error message {decoded}")
    # Slice off the "data:" prefix; str.strip("data:") would wrongly remove
    # any of the characters 'd', 'a', 't', ':' from both ends.
    decoded = decoded[len("data:"):].strip()
    if decoded == "[DONE]":
        print("finish!")
        break
    output = json.loads(decoded)
    if output["object"] == "error":
        raise Exception(f"logic err: {output}")
    print(output["choices"][0]["delta"]["content"])
  • Using the OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",  # pass the token here without the 'Bearer' prefix
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

chat_rsp = client.chat.completions.create(
    model="internlm3-latest",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
)

for chunk in chat_rsp:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk may carry no content
        print(delta, end="")
print()

(2) CLI Example

openai -b "https://chat.intern-ai.org.cn/api/v1/" \
-k "eyJ0eXBlIjoiSl...please provide a valid token!" \
api chat.completions.create \
-m "internlm3-latest" \
--stream \
-g user hello

Note:

  • Supports -g ROLE CONTENT -m MODEL [-n N] [-t TEMPERATURE] [-P TOP_P] --stream parameters.
  • Does not currently support [--stop STOP].

(3) curl Example

curl --location 'https://chat.intern-ai.org.cn/api/v1/chat/completions' \
--header 'Authorization: Bearer xxxxxxx' \
--header 'Content-Type: application/json' \
--data '{
    "model": "internlm3-latest",
    "messages": [{
        "role": "user",
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": true
}'

Image-Text Request Example

The following shows the HTTP request body. For concrete SDK usage, refer to the Python Example, CLI Example, or curl Example above.

// Request
{
    "model": "internvl2.5-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello, I am internvl"
        },
        {
            "role": "user",
            "content": [ // a user turn combining image and text is an array
                {
                    "type": "text", // the type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64-encoded image
                    }
                },
                {
                    "type": "image_url", // multiple images in one turn are supported
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/puyu/demo/000-2x.jpg"
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
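
To send the same image-text body from Python, the non-streaming requests pattern above works unchanged apart from the payload. A minimal sketch (the demo image URL is taken from the body above; a base64 data URL in the url field is assumed to work the same way):

import requests

url = "https://chat.intern-ai.org.cn/api/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer eyJ0eXBlIjoiSl...please provide a valid token!",
}
data = {
    "model": "internvl2.5-latest",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the image please"},
            {"type": "image_url", "image_url": {
                "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png"
            }},
        ],
    }],
    "temperature": 0.8,
    "top_p": 0.9,
}

res = requests.post(url, headers=headers, json=data)
print(res.json()["choices"][0]["message"]["content"])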

Tool Call Request Example

// Request
{
    "model": "internlm3-latest", // default model is the latest version
    "messages": [
        {
            "role": "user",
            "content": "What's the weather today?"
        },
        {
            "role": "assistant",
            "content": "I need to use the get_current_weather API to check today's weather in Shanghai",
            "tool_calls": [
                {
                    "id": "97102",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "content": "{'location': 'shanghai', 'temperature': '40', 'unit': 'celsius'}",
            "tool_call_id": "97102"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
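
Putting the pieces together, a tool-call exchange has three steps: send the query with tools, run the function the model asks for, then return the result as a role "tool" message so the model can produce the final answer. Below is a minimal sketch with the OpenAI SDK; get_current_weather here is a stand-in you would replace with a real lookup.

import ast
import json

from openai import OpenAI

client = OpenAI(
    api_key="eyJ0eXBlIjoiSl...please provide a valid token!",
    base_url="https://chat.intern-ai.org.cn/api/v1/",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}]

def get_current_weather(location, unit="celsius"):
    # Stand-in implementation; replace with a real weather lookup.
    return {"location": location, "temperature": "40", "unit": unit}

messages = [{"role": "user", "content": "What's the weather today in Shanghai?"}]
rsp = client.chat.completions.create(model="internlm3-latest", messages=messages, tools=tools)
call = rsp.choices[0].message.tool_calls[0]

# The examples above show single-quoted arguments; fall back to ast.literal_eval
# if the string turns out not to be valid JSON.
try:
    args = json.loads(call.function.arguments)
except json.JSONDecodeError:
    args = ast.literal_eval(call.function.arguments)

messages.append(rsp.choices[0].message)  # keep the assistant's tool_calls turn
messages.append({
    "role": "tool",
    "content": json.dumps(get_current_weather(**args)),
    "tool_call_id": call.id,
})
final = client.chat.completions.create(model="internlm3-latest", messages=messages, tools=tools)
print(final.choices[0].message.content)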

Response Examples

Non-Streaming Response Example

// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
    Authorization: $BearerToken

// Body
{
    "model": "internlm3-latest", // default model is the latest version
    "messages": [{
        "role": "user", // supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "n": 1 // default=1
}

// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // unique identifier for this response
    "model": "internlm3-latest", // model ID
    "created": 1677652288, // timestamp
    "choices": [{ // the model's response content
        "index": 0, // first choice
        "message": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015. This was also the first time an Asian science fiction writer won this prestigious award...." // response from Puyu
        },
        "finish_reason": "stop" // options: length / stop / tool_calls; length means the result exceeded max_tokens
    }],
    "moderation": { // content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}

Streaming Response Example

// Request
Schema: HTTP
Path: /api/v1/chat/completions
Method: POST
Header:
    Authorization: $BearerToken

// Body
{
    "model": "internlm3-latest", // default model is the latest version
    "messages": [{
        "role": "user", // supported roles: user/assistant/system/tool
        "content": "Do you know Liu Cixin?"
    }, {
        "role": "assistant",
        "content": "As an AI assistant, I know Liu Cixin. He is a famous Chinese science fiction writer and engineer, having won several awards including the Hugo and Nebula Awards."
    }, {
        "role": "user",
        "content": "Which of his works won the Hugo Award?"
    }],
    "n": 1,
    "temperature": 0.8,
    "top_p": 0.9,
    "stream": true
}

// Response
Status Code: 200
Body:
{
    "id": "chatcmpl-123", // unique identifier for this response
    "model": "internlm3-latest", // model ID
    "created": 1677652288, // timestamp
    "choices": [{ // the model's response content
        "index": 0, // first choice
        "delta": {
            "role": "assistant",
            "content": "Liu Cixin's 'The Three-Body Problem' series won the Hugo Award for Best Novel in 2015" // response from Puyu
        },
        "finish_reason": "" // options: length / stop / empty string; length means the result exceeded max_tokens, and an empty string means output is still in progress
    }],
    "moderation": { // content moderation, currently omitted
        "sensitive": false,
        "category": ""
    }
}

Tool Call Response Example

Status Code: 200
Body:
{
    "id": "chatcmpl-123", // unique identifier for this response
    "model": "internlm3-latest", // model ID
    "created": 1677652288, // creation timestamp
    "choices": [{ // the model's response content
        "index": 0, // entry 0
        "message": {
            "role": "assistant",
            "content": "get_current_weather",
            "tool_calls": [
                {
                    "id": "97102",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
                    }
                }
            ]
        },
        "finish_reason": "tool_calls"
    }],
    "moderation": {
        "sensitive": false,
        "category": ""
    }
}

Parameter Explanation

The API sets a timeout on each model call: if the model has not finished its output within 120 seconds, generation is interrupted and the result produced so far is returned.
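
On the client side it is sensible to set a timeout slightly above the server's 120-second cap, so a hung connection does not block forever. A brief sketch with requests:

import requests

url = "https://chat.intern-ai.org.cn/api/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer eyJ0eXBlIjoiSl...please provide a valid token!",
}
data = {"model": "internlm3-latest", "messages": [{"role": "user", "content": "Hello!"}]}

try:
    # Allow a little more than the server-side 120 s so slow generations can finish.
    res = requests.post(url, headers=headers, json=data, timeout=130)
except requests.exceptions.Timeout:
    print("no response within 130 s")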

Request Parameters Explanation

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| model | string | internlm3-latest | Name of the model being called |
| messages | array | LLM: [{"role":"user","content":"Hello"}]; multimodal models: see "Image-Text Messages" below | Conversation history and the current query. Supported roles: user/assistant/system/tool. "role": "system" is not supported by the internthinker-beta model |
| tools | optional, array | see the tools example below | An array of functions available for the model to call. Based on this array, the model responds with the corresponding function's JSON object |
| tool_choice | optional, string or object | {"type": "function", "function": {"name": "my_function"}} | Options: none (forbids function calls), auto (the model decides), required (must call a function), or a specific function object that must be called |
| temperature | optional, float | 0.8 | Sampling temperature |
| top_p | optional, float | 0.9 | Probability threshold for candidate tokens |
| max_tokens | optional, int | 100 | Maximum number of output tokens. The default 0 means up to the maximum allowed length |
| stream | optional, bool | true | Stream incremental results. Tool calls are not supported while streaming |
  • Image-Text Messages
{
    "model": "internvl-latest",
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        },
        {
            "role": "assistant",
            "content": "Hello, I am internvl"
        },
        {
            "role": "user",
            "content": [ // a user turn combining image and text is an array
                {
                    "type": "text", // the type field supports text/image_url
                    "text": "Describe the image please"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/internvl/demo/visionpro.png" // image URL or base64-encoded image
                    }
                },
                {
                    "type": "image_url", // multiple images in one turn are supported
                    "image_url": {
                        "url": "https://static.openxlab.org.cn/puyu/demo/000-2x.jpg"
                    }
                }
            ]
        }
    ],
    "temperature": 0.8, // float [0,1], default=0.5
    "top_p": 0.9, // float [0,1], default=1
    "max_tokens": 100 // default=0
}
  • Tools Example
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]

Response Parameters Explanation

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| id | string | chatcmpl-123 | Unique session ID |
| model | string | internlm3-latest | Model name used |
| created | int | 1677652288 | Creation timestamp |
| choices | array | [{"index":0,"message":{"role":"assistant","content":"hi"},"finish_reason":"stop"}] | The model's reply |
| moderation | object | {"sensitive":false,"category":""} | Content moderation info (not yet supported) |
  • Choice Structure Explanation
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| index | int | 0 | Index of the result when multiple generations are requested |
| message | object | {"role":"assistant", "content":"Hello"} | A query or response. Present only in non-streaming responses |
| delta | object | {"role":"assistant", "content":"Hello"} | A query or response. Present only in streaming responses; tool calls are not supported while streaming |
| finish_reason | string | stop | Options: length / stop / tool_calls. length means the response exceeded max_tokens |
  • Message Structure Explanation
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| role | string | user | Indicates a user query (user) or a model response (assistant) |
| content | string | Hello | Dialogue content. When sending a "tool" message, the content is the JSON string returned by the function call |
| tool_call_id | string | 97102 | Required when sending a "tool" message or when the model generates tool_calls content. The IDs must match |
| tool_calls | array | see the tool_calls example below | The JSON-formatted function call returned by the model |
  • Tool Calls Example
tool_calls = [
    {
        "id": "97102",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{'location': 'shanghai', 'unit': 'celsius'}"
        }
    }
]
  • Moderation Structure Explanation
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| sensitive | bool | false | Whether the content is sensitive |
| category | string | ad | Classification of the sensitive content |

Common Errors

| Error Code | Description | Suggested Solution |
| --- | --- | --- |
| -10002 | Invalid parameter | Check whether the query is empty |
| -20004 | Non-whitelisted user | Apply for access through the website |
| -20013 | Requested model does not exist | Verify that the model parameter is supported |
| -20018 | Incorrect message format | Verify the message format |
| -20035 | Account not linked to a phone number | Bind a phone number via the website |
| -20053 | Exceeded frequency limit (req/min or tokens/min) | Keep the request frequency within the allowed limits, or apply for a higher one |
| A0202 | User authentication failed | Verify that the token is correct and consistently formatted (including the Bearer prefix) |
| A0211 | Token expired | Check whether the token has expired |
| C1114 | Token is not for this service | Verify that the token is meant for this service |
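
When a call fails, it is more robust to branch on these codes than on message text. A minimal sketch, assuming the error body is tagged with "object": "error" as in the streaming example above and carries the code in a code field; verify the exact field names against a real failing response.

RETRYABLE = {-20053}                 # rate limit: back off and retry
FATAL = {"A0202", "A0211", "C1114"}  # token problems: fix credentials first

def check_response(body: dict) -> dict:
    """Return the body if the call succeeded, otherwise raise a descriptive error."""
    if body.get("object") == "error":
        code = body.get("code")  # assumption: the error payload exposes a code field
        if code in RETRYABLE:
            raise RuntimeError(f"rate limited ({code}); retry with backoff")
        if code in FATAL:
            raise RuntimeError(f"authentication problem ({code}); check your token")
        raise RuntimeError(f"API error: {body}")
    return body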