FAQ

Frequently asked questions about CPAI.

What is codingplan.ai?

codingplan.ai (CPAI) is an AI inference API service that provides access to open-source coding models such as Kimi K2.5. Instead of paying per token, you pay for a fixed number of concurrency lanes and get unlimited token usage.

How does pricing work?

We use a subscription model based on concurrency lanes: each plan includes a set number of concurrent request lanes, and higher-tier plans get higher scheduling priority and faster response times. All plans include unlimited token usage.

Which models do you support?

Currently we support Kimi K2.5, DeepSeek Coder, and CodeLlama. We’re constantly adding new state-of-the-art open-source models.

Is the API compatible with the OpenAI SDK?

Yes! Our API is fully compatible with the OpenAI API format, so you can use the official OpenAI SDKs by simply pointing the base URL at our endpoint.
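A minimal sketch of what "OpenAI API format" means in practice: the request below builds a standard chat-completions payload with only the base URL swapped. The endpoint URL and model name here are illustrative assumptions, not the documented values; with the official SDK you would pass the same base URL via its `base_url` parameter.

```python
import json

# Hypothetical endpoint; check your dashboard for the real base URL.
BASE_URL = "https://api.codingplan.ai/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> tuple:
    """Build an OpenAI-format chat completion request for a CPAI-style endpoint."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages}).encode()
    return url, headers, body

url, headers, body = build_chat_request(
    "YOUR_API_KEY",
    "kimi-k2.5",  # model identifier is illustrative
    [{"role": "user", "content": "Write a binary search in Python."}],
)
```

Because the wire format is unchanged, any HTTP client or OpenAI-compatible SDK can send this request as-is.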

What is a concurrency lane?

A concurrency lane represents one simultaneous request you can make to the API. If you have 3 lanes, you can have 3 requests processing at the same time. Additional requests will be queued until a lane becomes available.
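The lane behavior above can be simulated client-side with a semaphore: a sketch, assuming a 3-lane plan, where ten requests are launched but at most three are ever in flight and the rest wait for a free lane.

```python
import asyncio

LANES = 3  # concurrency lanes on the plan (illustrative)

async def send_request(sem: asyncio.Semaphore, tracker: dict) -> None:
    # Queue until a lane is free, exactly like server-side lane queueing.
    async with sem:
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        await asyncio.sleep(0.01)  # stand-in for an actual API call
        tracker["active"] -= 1

async def main() -> int:
    sem = asyncio.Semaphore(LANES)
    tracker = {"active": 0, "peak": 0}
    # Fire 10 requests at once; the semaphore caps in-flight requests.
    await asyncio.gather(*(send_request(sem, tracker) for _ in range(10)))
    return tracker["peak"]

peak = asyncio.run(main())
print(peak)  # observed peak concurrency never exceeds LANES
```

Matching your client-side concurrency limit to your plan's lane count avoids surprise queueing latency on the server.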

How do I keep my API key secure?

We recommend using IP whitelisting to restrict which IP addresses can use your API key; you can configure this in your dashboard under Security settings. Never commit your API key to a public repository.
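One common way to keep the key out of your source tree is to read it from an environment variable at startup. A minimal sketch; the variable name CPAI_API_KEY is an illustrative convention, not something the service mandates.

```python
import os

def load_api_key(var: str = "CPAI_API_KEY") -> str:
    """Fetch the API key from the environment; never hard-code it in source."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} before running")
    return key
```

Combined with IP whitelisting in the dashboard, a leaked key from a misconfigured machine is then unusable from other addresses.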

Can I self-host CPAI?

Yes! CPAI is open source and can be self-hosted. You’ll need GPUs to run the inference backend (SGLang). Check our GitHub repository for deployment instructions.
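As a rough sketch of what standing up the SGLang backend looks like: the commands below follow SGLang's standard install-and-launch pattern. The model path and port are placeholders, and the exact flags your deployment needs are in our GitHub repository, not here.

```shell
# Install SGLang with its serving extras (requires a CUDA-capable GPU).
pip install "sglang[all]"

# Launch an OpenAI-compatible inference server.
# MODEL_PATH is a placeholder for the model weights you want to serve.
python -m sglang.launch_server --model-path "$MODEL_PATH" --port 30000
```

Once the server is up, the CPAI layer routes lane-limited requests to it.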

How do I get support?

For support, please email us at support@codingplan.ai or open an issue on our GitHub repository.