Manage large-scale GPU infrastructure within your security perimeter. More than eight sovereign AI operators across five countries run on the CloudRift platform. Full virtualization, best-in-class AI inference performance, and enterprise-grade security.
SOC 2 Certified
NVIDIA Inception Member
Powering Infrastructure For
Infrastructure Partners
Our platform powers GPU infrastructure for leading providers across 5 countries. Here are some of our partners.
HyperCloud
Central Asia, Europe, Middle East
Kazteleport
Central Asia
Konst
APAC, EU, USA
Platform
Manage GPU clusters, tenants, and workloads across your own data centers — from a single control plane.
Manage your entire GPU fleet, tenants, and workloads from a single dashboard — across all sites.
VMs, containers, and bare metal. Choose the isolation level each workload requires — from bare-metal control to lightweight containers.
Define GPU offerings, set pricing tiers, and configure placement rules. Full control over your brand and commercial terms.
Per-user, per-GPU transparency. Track utilization, costs, and billing in real time across your entire fleet.
Federate workloads across geographies. One control plane, many data centers, unified monitoring.
Your customers use your brand. Embed rental tools into your portal or use our white-label console.
Deep Visibility
Fleet Management
Track users, GPU allocation, and system health across all nodes. Real-time visibility into temperature, utilization, memory, and power draw.
| Username | Instance ID | Type | Model | vRAM | CPU | RAM |
|---|---|---|---|---|---|---|
| alex.m | inst-4892 | GPU | 2x NVIDIA H100 SXM | 160Gi | 128 vCPU | 512Gi |
| skenio | inst-6901 | GPU | 1x NVIDIA H100 SXM | 80Gi | 64 vCPU | 256Gi |
| dev-team | inst-4915 | GPU | 4x NVIDIA H100 SXM | 320Gi | 256 vCPU | 1Ti |

| GPU | Model | Temperature | GPU Utilization | Memory Usage | Power |
|---|---|---|---|---|---|
| GPU-001 | H100 SXM | 62°C | 87% | 68.2 / 80.0 GiB | 580W / 700W |
| GPU-002 | H100 SXM | 58°C | 72% | 54.1 / 80.0 GiB | 520W / 700W |
| GPU-003 | H100 SXM | 65°C | 94% | 76.8 / 80.0 GiB | 640W / 700W |
Hardware Reporting
Drill into any instance to see per-GPU utilization, memory usage, thermals, power consumption, and clock speeds. Direct Grafana integration for deep analysis.
| GPU | GPU Utilization | Memory Utilization | Memory Usage | Temperature | Power | Clock Speed |
|---|---|---|---|---|---|---|
| GPU 0 | 87% | 85% | 68.2 / 80.0 GiB | 62 °C | 580 W / 700 W | 1980 MHz |
| GPU 1 | 72% | 67% | 54.1 / 80.0 GiB | 58 °C | 520 W / 700 W | 1935 MHz |
User & Spending
Full visibility into every tenant — balances, active instances, spending history, and credit limits. Manage users and teams from a single dashboard.
| User | Balance | Credit Limit | Instances | Last Login | Registered | Total Spent |
|---|---|---|---|---|---|---|
| alex.m (alex.m@acme.ai) | $38.65 | Prepaid only | 2 | Mar 10, 2026 | Sep 1, 2025 | $4,046.34 |
| mlops-team (ops@infraco.net) | $3,895.53 | Prepaid only | 2 | Mar 12, 2026 | Jul 9, 2024 | $1,704.46 |
| skenio (deploy@skenio.dev) | $278.83 | Prepaid only | 1 | — | Sep 25, 2025 | $781.16 |
| jordan.w (jw@startup.co) | $10.94 | Prepaid only | 1 | Mar 13, 2026 | Nov 27, 2025 | $229.06 |
No Vendor Lock-In
CloudRift abstracts the hardware layer so you can choose the GPUs that fit your workloads and budget — not the ones a vendor requires.
Full support for MIG, vGPU, and the NVIDIA virtualization stack. Certified enterprise workloads with proper isolation.
Run inference and training on AMD GPUs with ROCm. Lower cost, same control, no NVIDIA lock-in.
Built on QEMU/KVM, open container runtimes, and open networking. No proprietary dependencies, no licensing surprises.
Developer Tools
Instant GPU access, pre-built ML environments, persistent storage, and full API control — from experiment to production.
Track GPU instances, usage, and costs from a single dashboard across all providers.
Pre-configured environments for common AI workloads. One-click setup for PyTorch, vLLM, and more.
GPU Rental
Deploy VMs, containers, or bare metal in minutes. No long-term commitments required.
Your data persists across sessions. Attach volumes to any instance and pick up where you left off.
Programmatic control over your infrastructure. Automate deployments and integrate with CI/CD.
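As an illustrative sketch only: the endpoint path, field names, and authentication header below are assumptions for this example, not the documented CloudRift API. A CI/CD job that launches a GPU instance programmatically might look like:

```python
import json
import urllib.request

# Hypothetical API base URL, used here purely for illustration.
API_BASE = "https://api.cloudrift.ai/v1"


def build_launch_request(api_key: str, gpu_model: str, gpu_count: int) -> urllib.request.Request:
    """Assemble an instance-launch request; all field names are illustrative assumptions."""
    body = json.dumps({"gpu_model": gpu_model, "gpu_count": gpu_count}).encode()
    return urllib.request.Request(
        f"{API_BASE}/instances",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_launch_request("YOUR_RIFT_API_KEY", "H100-SXM", 2)
# urllib.request.urlopen(req)  # a real pipeline would submit the request here
```

The same pattern extends to teardown and status polling, which is what makes the automation composable inside CI/CD.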
LLM-as-a-Service
Serve open-weight models on your own infrastructure with built-in inference endpoints, autoscaling, and pay-per-token pricing.
Pay-per-token
Only pay for what you use — no idle GPU costs for inference workloads.
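To make the billing model concrete (with hypothetical rates, since actual per-model pricing varies), pay-per-token cost is simply tokens consumed times the per-million-token rate:

```python
def inference_cost(input_tokens: int, output_tokens: int,
                   input_price_per_m: float, output_price_per_m: float) -> float:
    """Pay-per-token billing: tokens / 1e6 * price per million tokens."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)


# Hypothetical rates: $0.50 per M input tokens, $1.50 per M output tokens.
cost = inference_cost(40_000_000, 10_000_000, 0.50, 1.50)
print(f"${cost:.2f}")  # → $35.00
```

Because the bill tracks tokens rather than GPU-hours, an idle endpoint costs nothing.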
Popular Models
Llama, DeepSeek, GLM, Kimi, Qwen, Mistral — optimized and ready to serve out of the box.
OpenAI-compatible API
Drop-in replacement — switch your base URL and keep your existing code.
```python
import openai

client = openai.OpenAI(
    api_key="YOUR_RIFT_API_KEY",
    base_url="https://inference.cloudrift.ai/v1",
)

completion = client.chat.completions.create(
    model="qwen/qwen3.5-35b-a3b",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")
```
Model Specs
Pricing
Per million tokens

Data Privacy
Full data sovereignty by design. Every workload runs on your hardware, in your jurisdiction, under your control.
Independently audited security controls. Your customers get the compliance documentation they need.
All data encrypted at rest and in transit. Zero-trust architecture with no exceptions.
Dedicated resources per customer — no shared infrastructure, no noisy neighbors, no data leakage between tenants.
Keep all data and compute within national borders. Meet GDPR, data sovereignty, and industry-specific requirements.
Every action tracked and logged. Complete visibility into who accessed what, when, and from where.
Everything runs on your hardware, in your jurisdiction. No data ever passes through third-party clouds.
Proven Credibility
Our founding team brings deep experience from major tech and gaming companies.
CloudRift Has Been Featured In
Talk to our team about deploying CloudRift in your data centers.