Philip Kelly — Sonic AI
Philip Kelly
Person
11
Mentions
Episodes
11
Claims
Claims
Philip Kelly's book, "Inference Engineering," posits that AI inference is a full-stack problem combining CUDA, on-GPU optimization, and distributed systems.
Expert perspective
Philip Kelly
Apr 28
The gpt-oss-120b model is considered too large for many use cases, making smaller models like the Gemma family more suitable for fine-tuning.
Expert perspective
Philip Kelly
Apr 28
Baseten was a day-zero support partner for the launch of Google's Gemma 4 model.
Expert perspective
Philip Kelly
Apr 28
Baseten operates a multi-region deployment that pools GPUs from America and Europe into a unified compute pool to minimize user latency.
Expert perspective
Philip Kelly
Apr 28
Baseten was one of the first companies to use NVIDIA Dynamo for inference in a production environment.
Expert perspective
Philip Kelly
Apr 28
The Gemma family of models from Google supports native image inputs, which is beneficial for enterprise use cases like KYC and document extraction.
Expert perspective
Philip Kelly
Apr 28
The low-latency networking of Google Kubernetes Engine (GKE) saves Baseten a couple dozen milliseconds per turn between models in multi-model agentic systems.
Expert perspective
Philip Kelly
Apr 28
Baseten utilizes Google Kubernetes Engine (GKE) to build its systems on top of Google Cloud GPUs.
Expert perspective
Philip Kelly
Apr 28
The Gemma family of models provides a wide range of sizes, from 2 billion to 30 billion parameters.
Expert perspective
Philip Kelly
Apr 28
Baseten offers NVIDIA's Nemotron models, such as Nemotron Super, through a pay-per-token API.
Expert perspective
Philip Kelly
Apr 28
Migrating an inference system from NVIDIA's Hopper architecture to the Blackwell architecture requires significant software and kernel layer work.
Expert perspective
Philip Kelly
Apr 28