The Operating System for Sovereign AI Deployments

Manage large-scale GPU infrastructure within your security perimeter: more than eight sovereign AI operators across five countries run on the CloudRift platform. Full virtualization. Best-in-class AI inference performance. Enterprise-grade security.

SOC 2 Certified
NVIDIA Inception Member

Powering Infrastructure For

Agile Ascent
Automatico
FlyMy
HyperCloud
Kazteleport
Mixedbread
Nebula Block
NeuralRack
Parasail
TheStage.ai
Vexpower
Yotta Labs

Infrastructure Partners

Trusted by Sovereign AI Providers Worldwide

Our platform powers GPU infrastructure for leading providers across 5 countries. Here are some of our partners.


HyperCloud

Central Asia, Europe, Middle East

  • Subsidiary of a major telecom provider across Central Asia, Europe, and the Middle East.
  • Multi-region data centers with enterprise-grade GPU capacity.
  • Serving regulated industries including banking and government.

Kazteleport

Central Asia

  • 25+ years operating telecom and data center infrastructure in Central Asia.
  • 5 TIER III certified data centers with 500+ engineers on staff.
  • Enterprise customers including Halyk Bank, the largest bank in Central Asia.

Konst

APAC, EU, USA

  • 8 data center locations across Taiwan, Japan, Singapore, USA, and Europe.
  • ISO 27001 certified with H100, H200, and RTX 5090 GPU fleet.
  • Turnkey AI data centers delivered in 4–6 months at a third of hyperscaler cost.

Platform

One Platform, Full Control

Manage GPU clusters, tenants, and workloads across your own data centers — from a single control plane.

Control Plane

Manage your entire GPU fleet, tenants, and workloads from a single dashboard — across all sites.

Full Virtualization

VMs, containers, and bare metal. Choose the isolation level each workload requires — from bare-metal control to lightweight containers.

SKU & Pricing Control

Define GPU offerings, set pricing tiers, and configure placement rules. Full control over your brand and commercial terms.

Usage & Utilization

Per-user, per-GPU transparency. Track utilization, costs, and billing in real time across your entire fleet.

Multi-Site Management

Federate workloads across geographies. One control plane, many data centers, unified monitoring.

White-Label Console

Your customers use your brand. Embed rental tools into your portal or use our white-label console.

Deep Visibility

Real-Time Insight Across Your Infrastructure

Fleet Management

Monitor Every Node in Your Fleet

Track users, GPU allocation, and system health across all nodes. Real-time visibility into temperature, utilization, memory, and power draw.

Node status: online

Current users:
  • alex.m (inst-4892): 2x NVIDIA H100 SXM
  • skenio (inst-6901): 1x NVIDIA H100 SXM
  • dev-team (inst-4915): 4x NVIDIA H100 SXM

System health (GPUs):
  • GPU-001 (H100 SXM): 62°C, 87% utilization
  • GPU-002 (H100 SXM): 58°C, 72% utilization
  • GPU-003 (H100 SXM): 65°C, 94% utilization
  • +2 more GPUs

Hardware Reporting

Instance-Level GPU Telemetry

Drill into any instance to see per-GPU utilization, memory usage, thermals, power consumption, and clock speeds. Direct Grafana integration for deep analysis.

Instance telemetry (Node 3982c, instance inst-4892, user alex.m):
  • GPU 0: 87% GPU utilization, 85% memory utilization, 62 °C
  • GPU 1: 72% GPU utilization, 67% memory utilization, 58 °C

User & Spending

Tenant Management at Scale

Full visibility into every tenant — balances, active instances, spending history, and credit limits. Manage users and teams from a single dashboard.

Users:
  • alex.m (alex.m@acme.ai): balance $38.65, 2 instances, $4,046.34 total spent
  • mlops-team (ops@infraco.net): balance $3,895.53, 2 instances, $1,704.46 total spent
  • skenio (deploy@skenio.dev): balance $278.83, 1 instance, $781.16 total spent
  • jordan.w (jw@startup.co): balance $10.94, 1 instance, $229.06 total spent
  • +3 more users

No Vendor Lock-In

Hardware Agnostic by Design

CloudRift abstracts the hardware layer so you can choose the GPUs that fit your workloads and budget — not the ones a vendor requires.

NVIDIA AI Enterprise

Full support for MIG, vGPU, and the NVIDIA virtualization stack. Certified enterprise workloads with proper isolation.

AMD GPU Support

Run inference and training on AMD GPUs with ROCm. No NVIDIA lock-in required — lower cost, same control.

Open-Source Virtualization

Built on QEMU/KVM, open container runtimes, and open networking. No proprietary dependencies, no licensing surprises.

Developer Tools

Ship Faster on GPU Infrastructure

Instant GPU access, pre-built ML environments, persistent storage, and full API control — from experiment to production.

Real-Time Monitoring

Track GPU instances, usage, and costs from a single dashboard across all providers.

Recipes & Templates

Pre-configured environments for common AI workloads. One-click setup for PyTorch, vLLM, and more.

GPU Rental

Flexible Hourly Pricing

Deploy VMs, containers, or bare metal in minutes. No long-term commitments required.

  • H100 (NVIDIA): $2.50/hr, 80GB VRAM, up to 8x, ca-central-kz (KZ)
  • RTX PRO 6000 (NVIDIA): $1.20/hr, 48GB VRAM, up to 8x, us-east-fl-nr (USA)
  • L40S (NVIDIA): $1.80/hr, 48GB VRAM, up to 8x, us-east-fl-nr (USA)
  • H200 (NVIDIA): $3.20/hr, 141GB VRAM, up to 8x, ca-central-kz (KZ)
  • B200 (NVIDIA): $4.90/hr, 192GB VRAM, up to 8x, us-east-nc-nr (USA)
  • MI350X (AMD): $3.50/hr, 288GB VRAM, up to 8x, eu-west-it (IT)

Persistent Storage

Your data persists across sessions. Attach volumes to any instance and pick up where you left off.

Full API Access

Programmatic control over your infrastructure. Automate deployments and integrate with CI/CD.
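As an illustration of what API-driven automation could look like, the sketch below only composes an instance-launch request. The base URL, endpoint path, payload fields, and template name here are hypothetical assumptions for illustration, not CloudRift's actual API schema; consult the API reference for the real one.

```python
import json

# Hypothetical endpoint and payload shape for illustration only;
# the real CloudRift API schema may differ.
API_BASE = "https://api.cloudrift.ai/v1"  # assumed base URL

def build_launch_request(api_key: str, gpu_model: str, gpu_count: int, region: str):
    """Compose (url, headers, body) for a hypothetical instance-launch call."""
    url = f"{API_BASE}/instances"  # assumed resource path
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "gpu_model": gpu_model,
        "gpu_count": gpu_count,
        "region": region,
        "image": "pytorch-latest",  # hypothetical template name
    })
    return url, headers, body

url, headers, body = build_launch_request("YOUR_RIFT_API_KEY", "H100", 2, "ca-central-kz")
# An HTTP client (requests, httpx, or urllib) would POST this from a
# CI/CD job, then poll the instance status until it reports running.
```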

LLM-as-a-Service

Inference on Your Terms

Serve open-weight models on your own infrastructure with built-in inference endpoints, autoscaling, and pay-per-token pricing.

Pay-per-token

Only pay for what you use — no idle GPU costs for inference workloads.

Popular Models

Llama, DeepSeek, GLM, Kimi, Qwen, Mistral — optimized and ready to serve out of the box.

OpenAI-compatible API

Drop-in replacement — switch your base URL and keep your existing code.

API identifier: qwen/qwen3.5-35b-a3b

import openai

client = openai.OpenAI(
    api_key="YOUR_RIFT_API_KEY",
    base_url="https://inference.cloudrift.ai/v1"
)

completion = client.chat.completions.create(
    model="qwen/qwen3.5-35b-a3b",
    messages=[
        {"role": "user", "content": "Hello"}
    ],
    stream=True
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")

Model Specs

  • Provider: Qwen
  • Parameters: 35B (3B active)
  • Context: 256K
  • Architecture: MoE

Pricing

Per million tokens:
  • Input: $0.16
  • Output: $1.30
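At these rates, per-request cost is simple arithmetic: (input tokens / 1M) x $0.16 plus (output tokens / 1M) x $1.30. A minimal sketch:

```python
# Listed per-million-token rates for this model.
INPUT_PRICE_PER_M = 0.16   # USD per million input tokens
OUTPUT_PRICE_PER_M = 1.30  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 50,000-token prompt with a 10,000-token completion:
# 0.05 * $0.16 + 0.01 * $1.30 = $0.021
```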

Data Privacy

Your Data Never Leaves Your Infrastructure

Full data sovereignty by design. Every workload runs on your hardware, in your jurisdiction, under your control.

SOC 2 Certified

Independently audited security controls. Your customers get the compliance documentation they need.

End-to-End Encryption

All data encrypted at rest and in transit. Zero-trust architecture with no exceptions.

Full Tenant Isolation

Dedicated resources per customer — no shared infrastructure, no noisy neighbors, no data leakage between tenants.

Data Residency & Compliance

Keep all data and compute within national borders. Meet GDPR, data sovereignty, and industry-specific requirements.

Full Audit Logging

Every action tracked and logged. Complete visibility into who accessed what, when, and from where.

On-Premise Control

Everything runs on your hardware, in your jurisdiction. No data ever passes through third-party clouds.

Proven Credibility

Built by Experts, Recognized by Industry

Our founding team brings deep experience from major tech and gaming companies.

Apple
Roblox
Ubisoft
HP

CloudRift Has Been Featured In

Tom's Hardware, TechPowerUp, VideoCardz, Hot Hardware

Deploy in minutes

Ready to Take Control of Your GPU Infrastructure?

Talk to our team about deploying CloudRift in your data centers.