Create Chat Completion

The chat completions endpoint enables multi-turn conversations with KrosAI’s language models. This is ideal for chatbots, virtual assistants, and interactive applications.


Last updated 1 month ago

Create Chat Completion

POST /v1/chat/completions

Request Body

messages array required

An array of messages comprising the conversation history

model string required

The ID of the model to use. Currently supported: KrosMLingual1.0.1

max_tokens integer default: 100

The maximum number of tokens to generate

temperature number default: 0.7

Controls randomness in the output. Accepts values between 0 and 1; lower values give more focused, deterministic output.

Message Object

role string required

The role of the message author. Must be one of: system, user, or assistant

content string required

The content of the message
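
The field rules above can be checked on the client before a request is sent. The following is a minimal sketch of such a check; validate_message is a hypothetical helper written for illustration, not part of any KrosAI SDK.

```python
# Roles listed in the Message Object description.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_message(message: dict) -> None:
    """Raise ValueError if a message does not match the documented Message Object.

    Hypothetical helper: the API itself may apply further rules not listed here.
    """
    role = message.get("role")
    if not isinstance(role, str) or role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}, got {role!r}")
    content = message.get("content")
    if not isinstance(content, str):
        raise ValueError("content is required and must be a string")
```

Running the check locally catches malformed messages early instead of spending a request on a 400 response.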

Example Request

{
  "model": "KrosMLingual1.0.1",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that translates English to Yoruba."
    },
    {
      "role": "user",
      "content": "Translate: I love you"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 50
}
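
For readers calling the endpoint from code, here is one way the example request could be assembled in Python using only the standard library. This is a sketch under assumptions: the base URL is a placeholder, and the Bearer-token Authorization header is a guess at the authentication scheme, since this page documents only the path and the request body. build_chat_request and send are hypothetical names.

```python
import json
import urllib.request

# Assumption: placeholder host. This page documents only the path
# POST /v1/chat/completions, not the base URL.
BASE_URL = "https://api.example.com"


def build_chat_request(api_key, messages, model="KrosMLingual1.0.1",
                       temperature=0.7, max_tokens=100):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth; check the authentication docs.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.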

Example Response

{
  "id": "chatcmpl-456def",
  "object": "chat.completion",
  "created": 1677649420,
  "model": "KrosMLingual1.0.1",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Mo nife re"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 3,
    "total_tokens": 23
  }
}
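
The reply text and token counts can be read from the response as shown in this short sketch. It assumes the response has already been decoded from JSON into a Python dict with the shape of the example above; extract_reply and summarize_usage are hypothetical helper names.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant's text from the first choice."""
    return response["choices"][0]["message"]["content"]


def summarize_usage(response: dict) -> str:
    """Format the token counts reported in the usage object."""
    usage = response["usage"]
    return (f"prompt={usage['prompt_tokens']} "
            f"completion={usage['completion_tokens']} "
            f"total={usage['total_tokens']}")


# The example response from this page, as a Python dict.
example_response = {
    "id": "chatcmpl-456def",
    "object": "chat.completion",
    "created": 1677649420,
    "model": "KrosMLingual1.0.1",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Mo nife re"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 20, "completion_tokens": 3, "total_tokens": 23},
}

reply = extract_reply(example_response)
usage_line = summarize_usage(example_response)
```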

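The chat completion API maintains conversation context through the messages array: each request carries the conversation history, so include prior turns whenever the model should take them into account.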
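
A multi-turn exchange can be sketched as follows, assuming (as the description of the messages field suggests) that the client sends the accumulated history with each request. The Conversation class is a hypothetical illustration, not a KrosAI client.

```python
class Conversation:
    """Accumulates messages so each request carries the full history."""

    def __init__(self, system_prompt, model="KrosMLingual1.0.1"):
        self.model = model
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, content):
        self.messages.append({"role": "user", "content": content})

    def add_assistant(self, content):
        # Store the model's reply so the next request includes it.
        self.messages.append({"role": "assistant", "content": content})

    def to_request_body(self, temperature=0.7, max_tokens=100):
        """Return a request body containing a copy of the history so far."""
        return {
            "model": self.model,
            "messages": list(self.messages),
            "temperature": temperature,
            "max_tokens": max_tokens,
        }


chat = Conversation("You are a helpful assistant that translates English to Yoruba.")
chat.add_user("Translate: I love you")
first_body = chat.to_request_body(max_tokens=50)

# After the reply arrives, record it and ask a follow-up question.
chat.add_assistant("Mo nife re")
chat.add_user("Translate: Thank you")
second_body = chat.to_request_body(max_tokens=50)
```

The second request body contains all four messages, which is what lets the model see the earlier turn.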

Error Responses

400: Bad Request object

Invalid request parameters or message format

Invalid or missing API key

Rate limit exceeded

Best Practices

System Messages: Use system messages to set the behavior and context for your assistant.
Message History: Keep message history concise to stay within token limits.
Temperature: Use a lower temperature (0.2-0.4) for more focused, deterministic responses.
Rate Limits: Handle rate-limit errors explicitly, for example by retrying after a delay instead of failing immediately.
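
Two of these practices lend themselves to a short sketch: trimming history and retrying on rate limits. Everything named here is hypothetical. trim_history caps history by message count as a rough stand-in for a token budget, because this page does not describe a tokenizer. RateLimitError stands for whatever error your HTTP client raises when the API reports that the rate limit was exceeded.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by client code on a rate-limit response."""


def trim_history(messages, max_messages):
    """Keep all system messages plus the most recent other messages.

    Counts messages, not tokens, so it only approximates a token limit.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    keep = max(max_messages - len(system), 0)
    return system + (rest[-keep:] if keep else [])


def call_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimitError wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

The sleep parameter is injectable so the retry logic can be tested without real delays; in production the default time.sleep applies.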
