FAQ

Frequently asked questions about CPAI.

What is codingplan.ai?

codingplan.ai (CPAI) is an AI inference API service that provides access to open-source coding models such as Kimi K2.5. Instead of paying per token, you pay for a fixed number of concurrency lanes and get unlimited token usage.

How does pricing work?

We use a subscription model based on concurrency lanes: each plan includes a set number of concurrent request lanes, and higher-tier plans get higher scheduling priority and faster response times. All plans include unlimited token usage.

Which models do you support?

Currently we support Kimi K2.5, DeepSeek Coder, and CodeLlama. We’re constantly adding new state-of-the-art open-source models.

Is the API compatible with the OpenAI SDK?

Yes! Our API is fully compatible with the OpenAI API format, so you can use the official OpenAI SDKs by simply pointing the base URL at our endpoint.
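A minimal sketch of what "OpenAI API format" means in practice: the request below builds a standard chat-completions payload with only the base URL swapped. The endpoint URL and model name here are illustrative assumptions, not the documented values; with the official SDK you would pass the same base URL via its `base_url` parameter.

```python
import json

# Hypothetical endpoint; check your dashboard for the real base URL.
BASE_URL = "https://api.codingplan.ai/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> tuple:
    """Build an OpenAI-format chat completion request for a CPAI-style endpoint."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages}).encode()
    return url, headers, body

url, headers, body = build_chat_request(
    "YOUR_API_KEY",
    "kimi-k2.5",  # model identifier is illustrative
    [{"role": "user", "content": "Write a binary search in Python."}],
)
```

Because the wire format is unchanged, any HTTP client or OpenAI-compatible SDK can send this request as-is.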

What is a concurrency lane?

A concurrency lane represents one simultaneous request you can make to the API. If you have 3 lanes, you can have 3 requests processing at the same time. Additional requests will be queued until a lane becomes available.
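The lane behavior above can be simulated client-side with a semaphore: a sketch, assuming a 3-lane plan, where ten requests are launched but at most three are ever in flight and the rest wait for a free lane.

```python
import asyncio

LANES = 3  # concurrency lanes on the plan (illustrative)

async def send_request(sem: asyncio.Semaphore, tracker: dict) -> None:
    # Queue until a lane is free, exactly like server-side lane queueing.
    async with sem:
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        await asyncio.sleep(0.01)  # stand-in for an actual API call
        tracker["active"] -= 1

async def main() -> int:
    sem = asyncio.Semaphore(LANES)
    tracker = {"active": 0, "peak": 0}
    # Fire 10 requests at once; the semaphore caps in-flight requests.
    await asyncio.gather(*(send_request(sem, tracker) for _ in range(10)))
    return tracker["peak"]

peak = asyncio.run(main())
print(peak)  # observed peak concurrency never exceeds LANES
```

Matching your client-side concurrency limit to your plan's lane count avoids surprise queueing latency on the server.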

How do I keep my API key secure?

We recommend using IP whitelisting to restrict which IP addresses can use your API key; you can configure this in your dashboard under Security settings. Never commit your API key to a public repository.
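One common way to keep the key out of your source tree is to read it from an environment variable at startup. A minimal sketch; the variable name CPAI_API_KEY is an illustrative convention, not something the service mandates.

```python
import os

def load_api_key(var: str = "CPAI_API_KEY") -> str:
    """Fetch the API key from the environment; never hard-code it in source."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} before running")
    return key
```

Combined with IP whitelisting in the dashboard, a leaked key from a misconfigured machine is then unusable from other addresses.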

Can I self-host CPAI?

Yes! CPAI is open source and can be self-hosted. You’ll need GPUs to run the inference backend (SGLang). Check our GitHub repository for deployment instructions.
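As a rough sketch of what standing up the SGLang backend looks like: the commands below follow SGLang's standard install-and-launch pattern. The model path and port are placeholders, and the exact flags your deployment needs are in our GitHub repository, not here.

```shell
# Install SGLang with its serving extras (requires a CUDA-capable GPU).
pip install "sglang[all]"

# Launch an OpenAI-compatible inference server.
# MODEL_PATH is a placeholder for the model weights you want to serve.
python -m sglang.launch_server --model-path "$MODEL_PATH" --port 30000
```

Once the server is up, the CPAI layer routes lane-limited requests to it.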

How do I get support?

For support, please email us at support@codingplan.ai or open an issue on our GitHub repository.