Engineers, product leaders, and researchers obsessed with making AI inference faster than anyone thought possible.
James led inference infrastructure at Google Brain before founding Inferex in 2022. He has filed 8 patents in distributed systems and ML optimization. Under his leadership, Inferex has grown to 45+ enterprise customers and 12B+ inferences served.
The story of Inferex told through the people who built it.
James Liu and Sarah Kim left their respective roles at Google Brain and NVIDIA to tackle the biggest unsolved problem in production ML: inference at scale. They wrote the first Inferex kernel optimizer over a long weekend in James's garage.
Marcus Webb joined as VP of Engineering, bringing deep expertise in distributed systems from his time at AWS. The first 5 enterprise customers saw 60%+ latency reductions in production within weeks of deployment.
Priya Patel joined as Head of Product, transforming Inferex from a powerful but rough SDK into a full platform with monitoring, auto-scaling, and a self-service dashboard. Customer count tripled.
45+ enterprise customers. 12B+ inferences served. SOC 2 Type II certified. Sub-8ms average P99 latency. The journey has only just begun.
Sarah Kim: Former NVIDIA research engineer. Architect of Inferex's core kernel optimization layer and hardware abstraction stack.
Marcus Webb: Ex-AWS distributed systems lead. Built the auto-scaling infrastructure that powers Inferex's 1M+ req/s throughput.
Priya Patel: Former product lead at Databricks. Transformed Inferex from an SDK into a full platform used by 45+ enterprise teams.
We're looking for engineers who are obsessed with performance. If you think in nanoseconds and dream in distributed systems, we want to hear from you.
Get in Touch