kubernetes

12 posts tagged “kubernetes”

June 6, 2026

Building a Hybrid LLM Platform on EKS, Part 4: Platform Add-ons, the Load Balancer Controller, and Karpenter

Part 4 of our hands-on EKS series. We install the two add-ons every production EKS cluster needs: the AWS Load Balancer Controller so Kubernetes Ingress objects provision real ALBs, and Karpenter for cost-aware autoscaling — including the GPU NodePool that scales to zero between inference workloads.

eks kubernetes aws-cdk karpenter load-balancer-controller autoscaling irsa ai-infrastructure typescript

June 6, 2026

Building a Hybrid LLM Platform on EKS, Part 3: Node Groups, GPU AMIs, and the NVIDIA Device Plugin

Part 3 of our hands-on EKS series. We add worker nodes to the empty cluster from Part 2: a CPU system pool for add-ons and the hybrid router, a GPU pool for vLLM model servers, the NVIDIA device plugin DaemonSet, and the taints and labels that make scheduling predictable.

eks kubernetes aws-cdk gpu nvidia node-groups ai-infrastructure typescript

May 30, 2026

Building a Hybrid LLM Platform on EKS, Part 2: The Control Plane, IAM, and IRSA

Part 2 of our hands-on EKS series. We provision the EKS cluster into the VPC from Part 1, wire up OIDC federation and IRSA so pods authenticate without static credentials, and end with a working kubectl connection to a real cluster.

eks kubernetes aws-cdk iam irsa oidc ai-infrastructure typescript

May 29, 2026

Securing Self-Hosted LLMs and AI Agents on Kubernetes

Harden self-hosted vLLM and AI agents on Kubernetes: an auth/rate-limit gateway, gVisor tool sandboxing, prompt-injection guardrails, scoped secrets, and signed model weights — mapped to the OWASP LLM Top 10.

security ai agents kubernetes llm prompt-injection supply-chain

May 24, 2026

Building a Hybrid LLM Platform on EKS, Part 1: Architecture and the Network Foundation

Part 1 of a hands-on series building the EKS-based hybrid LLM platform referenced throughout this blog. We map out the full architecture, then provision the VPC, subnets, NAT, and VPC endpoints with AWS CDK — the network foundation every later part builds on.

eks kubernetes aws-cdk llm ai-infrastructure hybrid-ai vpc typescript

May 22, 2026

The Agent Control Plane: Frontier Models Plan, Your Kubernetes Fleet Executes

How to orchestrate a fleet of AI agents using a shared task queue — frontier models like Claude handle planning and decomposition, while a local Kubernetes worker pool runs the high-volume execution tasks. Covers the task ledger, dynamic task creation, lane-based routing, and KEDA autoscaling.

ai agents orchestration kubernetes llm hybrid

May 21, 2026

Observability for LLM Applications on Kubernetes: Tokens, Traces, and Cost per Request

How to instrument self-hosted and hybrid LLM workloads with OpenTelemetry, Prometheus, and Langfuse — tracking time-to-first-token, tokens per second, GPU utilization, and unit economics down to the individual request.

kubernetes llm observability opentelemetry finops ai-infrastructure

April 3, 2026

Self-Hosting LLMs on Kubernetes: A Practical Guide

How to deploy, serve, and autoscale open-source large language models on Kubernetes with vLLM — from GPU node pools and deployment manifests to KEDA-based autoscaling and production guardrails.

kubernetes llm gpu ai-infrastructure self-hosting

February 8, 2026

Container Security on Kubernetes: A Practical Guide with Trivy, Falco, and Kyverno

Many Kubernetes clusters run containers with known vulnerabilities, no runtime monitoring, and no policy enforcement. Here is how to fix that with three open-source tools: Trivy, Falco, and Kyverno.

kubernetes security trivy falco kyverno containers

February 7, 2025

Using AI to Monitor Kubernetes Clusters and Make Dynamic Scaling Decisions

How to move beyond static thresholds and use AI-driven observability to detect anomalies, predict traffic patterns, and automate scaling decisions across your Kubernetes infrastructure.

kubernetes ai monitoring autoscaling observability

January 29, 2025

Building a CI/CD Pipeline with Dagger That Deploys to Kubernetes

A practical guide to building a containerized CI/CD pipeline using Dagger's TypeScript SDK — from local Kind clusters to production EKS with GitHub Actions, AWS CDK, and multi-environment promotion.

dagger kubernetes cicd helm eks github-actions aws-cdk

January 15, 2025

GPU Cost Optimization on Kubernetes: A Practical Guide

Learn how to reduce GPU infrastructure costs by up to 60% through Kubernetes scheduling, GPU time-slicing, and right-sizing.

kubernetes gpu cost-optimization infrastructure