Simple, Transparent Pricing

Start free, scale to enterprise. No hidden fees. Cancel anytime.

Starter
Free
 
Up to 10k inferences/day
  • Inference optimizer
  • Real-time monitoring
  • REST API access
  • Auto-scaling
  • Multi-hardware
  • SLA guarantee
Get Started
Enterprise
Custom
 
Unlimited inferences
  • Everything in Pro
  • Custom model deploy
  • On-premise option
  • 99.99% SLA
  • Priority 24/7 support
  • Dedicated engineer
Contact Sales

Compare Plans

Everything you need to choose the right plan.

Feature Starter Pro Enterprise
Inference latency optimizer
Auto-scaling
Multi-hardware support
Real-time monitoring
Custom model deployment
SLA guarantee 99.9% 99.99%
Dedicated support Email Priority 24/7
On-premise deploy

Frequently Asked Questions

What counts as an inference?

One inference is one model forward pass on a single input. A batched request counts as one inference per input item, so a batch of 32 inputs counts as 32 inferences.
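To estimate usage against a daily quota, the counting rule above can be sketched in a few lines of Python. This is an illustration only: `billable_inferences` and `remaining_quota` are hypothetical helper names, not part of any Inferex API, and the sketch assumes the Starter limit of "10k inferences/day" means exactly 10,000.

```python
from typing import Iterable, Sequence

# Assumption: "Up to 10k inferences/day" on Starter means exactly 10,000.
STARTER_DAILY_LIMIT = 10_000


def billable_inferences(requests: Iterable[Sequence[object]]) -> int:
    """Count inferences for a series of requests.

    Each request is a sequence of input items. Per the rule above,
    every input item counts as one inference, batched or not.
    """
    return sum(len(batch) for batch in requests)


def remaining_quota(used: int, daily_limit: int = STARTER_DAILY_LIMIT) -> int:
    """Return how many inferences are left today (never negative)."""
    return max(daily_limit - used, 0)


if __name__ == "__main__":
    requests = [
        ["img-1"],                       # single input: 1 inference
        ["img-2", "img-3", "img-4"],     # batch of 3: 3 inferences
        list(range(32)),                 # batch of 32: 32 inferences
    ]
    used = billable_inferences(requests)
    print(used)                   # 36
    print(remaining_quota(used))  # 9964
```

How the quota resets and what happens when it is exceeded are not covered here, so the sketch only tracks the count.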

Can I upgrade or downgrade at any time?

Yes. Plan changes take effect immediately. Downgrades are prorated to your billing cycle.

Does Inferex store my model weights?

No. Model weights remain in your infrastructure. Inferex injects optimization at the runtime layer only. We never see your model artifacts.

What hardware does Inferex support?

Pro and Enterprise plans support NVIDIA GPUs (A100, H100, L40S), Intel and AMD CPUs, and edge TPU/NPU devices. Starter supports cloud CPUs only.

Not Sure Which Plan?

Talk to our team. We'll help you find the right plan for your workload.

Contact Sales