Deploy and scale LLM inference—serve models like Llama 4 and DeepSeek
on our low-cost GPU platform.
Pay-as-you-go APIs mean you pay only for the inference you use.
import OpenAI from "openai";

// Point the official OpenAI SDK at the CloudRift-hosted endpoint.
const openai = new OpenAI({
  baseURL: "https://inference.cloudrift.ai/v1",
  apiKey: "YOUR_RIFT_API_KEY",
});

// Stream a chat completion from Llama 4 Maverick.
const completion = await openai.chat.completions.create({
  model: "llama4:maverick",
  messages: [
    {
      role: "user",
      content: "What is the meaning of life?"
    }
  ],
  stream: true,
});

// Print tokens as they arrive; skip chunks with no delta content.
for await (const chunk of completion) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
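For simple request/response use, the same endpoint also works without streaming. The sketch below is a minimal variant that reuses the client, API key placeholder, and model id from the example above and reads the full message once the call returns.

// Non-streaming variant: await the complete response, then read the message.
const response = await openai.chat.completions.create({
  model: "llama4:maverick",
  messages: [{ role: "user", content: "Summarize the meaning of life in one sentence." }],
});

console.log(response.choices[0]?.message?.content);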
Instant access to high-performance models—no queues, no GPUs to reserve.
Just straightforward model options you can build on.
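If you want to check which model ids are currently exposed before wiring one into your app, an OpenAI-compatible endpoint typically also serves a model listing. The sketch below assumes the standard /v1/models route is available on CloudRift and reuses the client configured in the example above.

// List the model ids exposed by the endpoint (assumes the standard /v1/models route).
const models = await openai.models.list();
for (const model of models.data) {
  console.log(model.id); // e.g. "llama4:maverick"
}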
Maverick beats GPT‑4o on coding, vision, and reasoning, and remains lightweight enough for efficient local deployment.
Rivals closed models on math and code; open weights, vetted safety, and low pricing simplify enterprise deployment.
The May 28th update to the original DeepSeek R1. Performance is on par with OpenAI o1, but open-sourced and with fully open reasoning tokens.
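Switching to the R1-0528 update only means changing the model id in the same chat-completions call. In the sketch below the id "deepseek-r1:0528" is illustrative rather than confirmed (check the model listing for the exact name), and where the open reasoning tokens are surfaced depends on the serving stack.

// Ask the DeepSeek R1 (0528) model a question; "deepseek-r1:0528" is an assumed, illustrative id.
const r1 = await openai.chat.completions.create({
  model: "deepseek-r1:0528",
  messages: [{ role: "user", content: "Prove that the square root of 2 is irrational." }],
});

// Depending on the server, reasoning tokens may appear inline in the content or in a
// vendor-specific field; here we simply print the final message content.
console.log(r1.choices[0]?.message?.content);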
We're here to support your compute and AI needs. Let us know what you're looking for. Businesses of any size are welcome.