4 Learnings From Load Testing LLMs
The way LLMs run in Kubernetes is quite a bit different than running web apps or APIs. Recently I was digging into the benefits of the Inference Extensions for the Kubernetes Gateway API and I need...
Recently, I’ve been building AI agents to help automate some parts of my workflow such as deep, meaningful technical research to contribute to technical material that I build. I am using the AutoGe...
Organizations need to think about what data gets sent to any AI services. They also need to consider the LLM may respond with some unexpected or risky results. This is where guardrails come in. The...
NVIDIA NIM is a great way to run AI inference workloads in containers. I deploy primarily to Kubernetes, so I wanted to dig into deploying NIM using the Kubernetes NIM Operator and use GPUs in Goog...
Things change quickly in the land of technology. AI is the “hot” thing. I feel for the platform engineers out there struggling with technologies like Docker, Kubernetes, Prometheus, Istio, ArgoCD, ...
You probably wouldn’t be surprised if I told you modern networking based on open source projects like Istio, SPIFFE, Cilium and others (see my paper about the CAKES stack) are typically consumed by...
Platform engineering has emerged recently in part because organizations recognize the value in improving developer experience and the need to improve app developer delivery speed. And in typical or...
It’s been a while since I’ve blogged, and just like other posts in the past, this one is meant as a way to dig into something and for me to catalog my own thoughts for later. While digging into som...
Istio is a powerful service mesh built on Envoy Proxy that solves the problem of connecting services deployed in cloud infrastructure (like Kubernetes) and does so in a secure, resilient, and observa...
This post may not be able to break through the noise around API Gateways and Service Mesh. However, it’s 2020 and there is still abundant confusion around these topics. I have chosen to write this ...
I’ve been pretty invested in helping organizations with their cloud-native journeys for the last five years. Modernizing and improving a team’s (and eventually an organization’s) velocity to deliver ...
Recently I wrote a piece for DZone and their Migrating to Microservices Report on the challenges of adopting service mesh in an enterprise organization. One of the first things we tackle in that pi...
Service mesh is an important set of capabilities that solve some difficult service-to-service communication challenges when operating a services-style architecture. Just as Kubernetes and container...
This is part 5 of a series that explores building a control plane for Envoy Proxy. Follow along @christianposta and @soloio_inc for more! In this blog series, we’ll take a look at the following a...
This is part 4 of a series that explores building a control plane for Envoy Proxy. Follow along @christianposta and @soloio_inc for the next part coming out in a week. In this blog series, we’ll t...
This is part 3 of a series that explores building a control plane for Envoy Proxy. In this blog series, we’ll take a look at the following areas: Adopting a mechanism to dynamically update Env...
This is part 2 of a series that explores building a control plane for Envoy Proxy. In this blog series, we’ll take a look at the following areas: Adopting a mechanism to dynamically update Env...
Envoy has become a popular networking component as of late. Matt Klein wrote a blog a couple years back talking about Envoy’s dynamic configuration API and how it has been part of the reason the ad...
So you’ve decided to run your Kubernetes workloads in AWS. As we’ve seen before, setting up AWS EKS requires a lot of patience and brings plenty of headaches. You may be able to get it working. For others, you should...
API Gateways are going through a bit of an identity crisis these days. Are they centralized, shared resources that facilitate the exposure and governance of APIs to external entities? ...