How to Leverage Cloud-hosted LLM and Pay Per Usage?

By Dmitry Trifonov, May 21, 2025
Tags: llm, inference, cloudrift, llmops

Overview of the CloudRift LLM-as-a-service

With LLM-as-a-Service (LLMaaS), developers can tap into state-of-the-art language models through simple APIs and pay only for what they use, down to the token. The service gives you API access to pre-trained large language models hosted in the cloud: no training, no servers, no DevOps headaches. You send a prompt; it sends a response. That’s it.


Register

First, you need to register at the CloudRift console. Follow the link and click the “Sign Up” button.


Select the billing tab and click “Add Credit”.


Use the Stripe payment interface to add balance to your account using one of the available methods. $10 is enough to start.


Create an API Key

Next, create an API key, which allows you to authenticate with the service. To do so, open the API Keys panel and press “Create API Key.”


Name your key and click “Create API Key”.

Finally, copy or download your key. The key is stored encrypted, so you won’t be able to view it again.
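Since the key can’t be retrieved later, it’s convenient to keep it in an environment variable rather than hard-coding it into scripts. A minimal sketch (the variable name `CLOUDRIFT_API_KEY` is my own convention, not mandated by CloudRift):

```python
import os

def load_api_key(var_name: str = "CLOUDRIFT_API_KEY") -> str:
    """Read the CloudRift API key from an environment variable.

    The variable name is a hypothetical convention; use whatever
    naming your project already follows.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"Set {var_name} before running, e.g. export {var_name}=<your key>"
        )
    return key
```

This keeps the key out of version control; set it once in your shell profile and every script can pick it up.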


Select a Model

Click the “Inference” button and select a model you want to run. Generally, choosing the right model for the task and balancing quality against cost is an involved process. To begin with, however, a large, general-purpose model like DeepSeek V3 is a reasonable default.


Once you click on the model you want to use, the instruction window pops up.


The quickest way to test a model is to copy the cURL instructions, paste them into the command line, and replace the API key placeholder with the value you saved earlier. You should see the model streaming the response right away.
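If you prefer Python to cURL, the same request can be sketched with the standard-library `urllib`. The endpoint URL and model name below are assumptions based on the OpenAI-compatible format; substitute the actual values shown in the console’s instruction window:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request.

    The URL here is an assumed example; copy the real endpoint from
    the instruction window in the CloudRift console.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://inference.cloudrift.ai/v1/chat/completions",  # assumed endpoint
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request (requires a valid key and network access):
# req = build_chat_request(key, "deepseek-ai/DeepSeek-V3", "Hello!")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```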

Streaming response from an inference server when invoked via cURL

What Next?

You can use the OpenAI-style API endpoint as a drop-in replacement for ChatGPT in any service, and leverage the LLM for tasks such as chatbots, code generation, and translation.
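Because the endpoint speaks the OpenAI wire format, the official `openai` Python package can point at it by overriding `base_url`. A sketch under that assumption (the base URL and model name are examples, not confirmed values; requires `pip install openai`):

```python
def cloudrift_client_config(api_key: str) -> dict:
    """Keyword arguments for an OpenAI-style client aimed at CloudRift.

    The base_url is an assumed example; take the real one from the
    instruction window in the CloudRift console.
    """
    return {
        "api_key": api_key,
        "base_url": "https://inference.cloudrift.ai/v1",  # assumed endpoint
    }

# Usage with the openai SDK (needs a valid key and network access):
# from openai import OpenAI
# client = OpenAI(**cloudrift_client_config(api_key))
# reply = client.chat.completions.create(
#     model="deepseek-ai/DeepSeek-V3",
#     messages=[{"role": "user", "content": "Translate 'hello' to French."}],
# )
# print(reply.choices[0].message.content)
```

Swapping only the key and base URL is what makes the service usable anywhere an OpenAI client is already wired in.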

LLM-as-a-Service provides a service similar to OpenAI, Anthropic, and Google Gemini at a much lower price point. Additionally, you can host your own models optimized for your specific applications and ensure your data is handled securely.