High-Performance LLM Inference at Low Cost

Deploy and scale LLM inference: serve models like Llama 4 and DeepSeek
on our low-cost GPU platform.
Pay-as-you-go APIs mean you pay only for the inference you use.

Kimi-K2-Instruct

$0.30 in | $1.75 out | 131.07K Context

Kimi K2 is a mixture-of-experts (MoE) language model with 1 trillion total parameters. Kimi K2 excels at knowledge, reasoning, and coding tasks, with strong agentic capabilities.

DeepSeek-R1-0528

$0.25 in | $1.00 out | 32.77K Context

May 28th update to the original DeepSeek R1. Performance is on par with OpenAI o1, but the model is open-sourced and exposes fully open reasoning tokens.

Llama-4-Maverick-17B-128E-Instruct

$0.10 in | $0.35 out | 1.05M Context

Maverick beats GPT-4o on coding, vision, and reasoning, and remains lightweight enough for efficient local deployment.

Quick Start

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://inference.cloudrift.ai/v1",
  apiKey: "YOUR_RIFT_API_KEY",
});

const completion = await openai.chat.completions.create({
  model: "llama4:maverick",
  messages: [
    {
      role: "user",
      content: "What is the meaning of life?"
    }
  ],
  stream: true,
});

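// Print each streamed token as it arrives. Some chunks carry no content
// (for example the final one), so fall back to an empty string.
for await (const chunk of completion) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}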
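
If you need the full reply as a single string rather than printing tokens as they arrive, the loop above can be wrapped in a small helper. This is a minimal sketch, not part of the platform's API: `collectStream`, `StreamChunk`, and `fakeStream` are hypothetical names introduced for illustration, and `StreamChunk` assumes only the chunk fields that the Quick Start already reads.

```typescript
// Minimal chunk shape (assumption): only the fields the Quick Start reads.
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Concatenate every streamed delta into one string, skipping empty chunks.
async function collectStream(
  stream: AsyncIterable<StreamChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}

// Hypothetical stand-in for a real response stream, so the helper can be
// exercised without a network call or an API key.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { choices: [{ delta: { content: "Hel" } }] };
  yield { choices: [{ delta: { content: null } }] }; // chunk with null content
  yield { choices: [] };                             // chunk with no choices
  yield { choices: [{ delta: { content: "lo" } }] };
}

collectStream(fakeStream()).then((text) => console.log(text));
```

In the Quick Start, `completion` is consumed the same way (`for await`), so passing it to a helper of this shape should work, provided the SDK's chunk type is structurally compatible with `StreamChunk`.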

All Available Models

Instant access to high-performance models—no queues, no GPUs to reserve.
Just straightforward model options you can build on.

Kimi-K2-Instruct

$0.30 in | $1.75 out | 131.07K Context

Kimi K2 is a mixture-of-experts (MoE) language model with 1 trillion total parameters. Kimi K2 excels at knowledge, reasoning, and coding tasks, with strong agentic capabilities.

DeepSeek-R1-0528

$0.25 in | $1.00 out | 32.77K Context

May 28th update to the original DeepSeek R1. Performance is on par with OpenAI o1, but the model is open-sourced and exposes fully open reasoning tokens.

Llama-4-Maverick-17B-128E-Instruct

$0.10 in | $0.35 out | 1.05M Context

Maverick beats GPT-4o on coding, vision, and reasoning, and remains lightweight enough for efficient local deployment.

DeepSeek-V3

$0.15 in | $0.40 out | 163.84K Context

Rivals closed models on math and code; open weights, vetted safety, and low pricing simplify enterprise deployment.

DeepSeek-R1

$0.15 in | $0.40 out | 163.84K Context

Rivals closed models on math and code; open weights, vetted safety, and low pricing simplify enterprise deployment.

Meta-Llama-3.1-70B-Instruct-FP8

$0.00 in | $0.00 out | 16.38K Context

Llama 3.1 70B Instruct FP8 model running on 4x RTX 4090, optimized with a Pliops XDP LightningAI accelerator card.
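
Because input and output tokens are priced separately, a rough per-request cost can be estimated from the "in" and "out" figures listed for each model. The sketch below is illustrative only: the page does not state the pricing unit, so it assumes prices are in USD per 1 million tokens, and `ModelPricing` and `estimateCostUsd` are hypothetical names, not part of any SDK.

```typescript
// ASSUMPTION: listed prices are USD per 1,000,000 tokens.
// The page does not state the unit; adjust this constant if it differs.
const TOKENS_PER_PRICE_UNIT = 1000000;

interface ModelPricing {
  inputPrice: number;  // the "in" figure from the model list
  outputPrice: number; // the "out" figure from the model list
}

// Estimate the cost of a workload from its input and output token counts.
function estimateCostUsd(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number
): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  const total =
    inputTokens * pricing.inputPrice + outputTokens * pricing.outputPrice;
  return total / TOKENS_PER_PRICE_UNIT;
}

// Prices copied from the Llama-4-Maverick entry above.
const maverick: ModelPricing = { inputPrice: 0.1, outputPrice: 0.35 };

// Example workload: 2M input tokens and 1M output tokens.
console.log(estimateCostUsd(maverick, 2000000, 1000000).toFixed(2));
```

Keeping the unit in a single named constant makes the estimate easy to correct if the actual billing unit turns out to be different.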

Get in Touch

We're here to support your compute and AI needs. Let us know if you're looking to:

  • Find an affordable GPU provider
  • Sell your compute online
  • Manage on-prem infrastructure
  • Build a hybrid cloud solution
  • Optimize your AI deployment

Businesses of any size are welcome.

hello@cloudrift.ai
CloudRift Inc., a Delaware corporation
PO Box 1224, Santa Clara, CA 95052, USA
+1 (831) 534-3437
