
API Reference

Complete API documentation for CPAI endpoints.

Base URL: http://localhost:8080/v1

All API requests except the public health endpoint require authentication. Include your API key as a Bearer token in the Authorization header:

Authorization: Bearer cpai-xxxxx
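
As a minimal sketch of attaching this header from a client, the snippet below builds an authenticated JSON request using only the Python standard library. The helper name build_request and the BASE_URL constant are hypothetical, introduced here for illustration; the URL, path, and key format are the ones shown on this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # base URL as given on this page


def build_request(api_key: str, path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer token scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "cpai-xxxxx",
    "/chat/completions",
    {"model": "kimi-k2.5", "messages": [{"role": "user", "content": "Hello!"}]},
)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would send the request; that step is left out so the sketch runs without a server.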

OpenAI-compatible chat completions endpoint:

POST /v1/chat/completions

Request body:

{
  "model": "kimi-k2.5",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}

Response:

{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "kimi-k2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
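
To illustrate how a client might read this response, the sketch below pulls out the assistant text and the token total. The function name extract_reply is hypothetical, and the sample string is a trimmed copy of the response shown above rather than real server output.

```python
import json


def extract_reply(response_text: str) -> tuple[str, int]:
    """Return (assistant text, total tokens) from a non-streaming response."""
    data = json.loads(response_text)
    content = data["choices"][0]["message"]["content"]  # first choice only
    total_tokens = data["usage"]["total_tokens"]
    return content, total_tokens


SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30}
}
"""

reply, total = extract_reply(SAMPLE_RESPONSE)
print(reply)
print(total)
```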

Set "stream": true in the request body to receive the response as Server-Sent Events (SSE):

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
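
This page does not show the format of the streamed events. Since the endpoint is described as OpenAI-compatible, the sketch below assumes the usual OpenAI-style framing: each event is a line beginning with "data:" that carries a JSON chunk with the new text under choices[].delta.content, and the stream ends with "data: [DONE]". Treat both the chunk shape and the sentinel as assumptions to verify against a real response. The function name iter_stream_text is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (assumed OpenAI-style chunks)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events in the assumed format, not captured server output.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

Against a live server, the same generator could be fed the decoded lines of the HTTP response body as they arrive.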

Available models:

  • kimi-k2.5 - Kimi K2.5 (default)
  • deepseek-coder - DeepSeek Coder
  • codellama - CodeLlama

Error codes:

Code  Description
400   Bad Request - Invalid request body
401   Unauthorized - Invalid or missing API key
403   Forbidden - IP not whitelisted
429   Too Many Requests - Concurrency limit exceeded
503   Service Unavailable - No GPU nodes available
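
One way a client might react to these codes is sketched below. It treats 429 and 503 as transient and retries them with exponential backoff, and returns any other status immediately. This retry policy is an assumption made for illustration and is not documented behavior of the API; the names RETRYABLE and call_with_retry are hypothetical. The send callable stands in for whatever performs the HTTP request and returns its status code.

```python
import time
from typing import Callable

# Descriptions copied from the error code table on this page.
ERROR_MEANINGS = {
    400: "Bad Request - Invalid request body",
    401: "Unauthorized - Invalid or missing API key",
    403: "Forbidden - IP not whitelisted",
    429: "Too Many Requests - Concurrency limit exceeded",
    503: "Service Unavailable - No GPU nodes available",
}

# Assumption: only capacity-related errors are worth retrying.
RETRYABLE = {429, 503}


def call_with_retry(
    send: Callable[[], int],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
        status = send()
    return status


# Simulated run: two transient failures, then success. Delays are recorded
# instead of slept so the example finishes instantly.
statuses = iter([429, 503, 200])
delays = []
final = call_with_retry(lambda: next(statuses), sleep=delays.append)
print(final, delays)
```

Errors such as 400, 401, and 403 indicate a problem with the request or credentials, so repeating the same request is unlikely to help.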

Public health endpoint (no authentication required):

GET /health
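
A minimal liveness probe could look like the sketch below. The page does not describe the response body, nor whether /health sits under the /v1 prefix, so the function takes the full URL as a parameter and only checks for HTTP 200; both choices are assumptions. The name is_healthy is hypothetical.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if GET url answers with HTTP 200 (assumed to mean healthy)."""
    try:
        # No Authorization header: the endpoint is documented as public.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError, HTTPError (non-2xx), refused connections, timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever /health is actually served.
    print(is_healthy("http://localhost:8080/health"))
```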