Architecture overview

Poolside runs on Amazon EKS (Elastic Kubernetes Service) with a Terraform-managed infrastructure:
  • EKS cluster: Managed Kubernetes cluster with an AWS-managed control plane
  • Specialized node groups:
    • CPU nodes: Host platform services, APIs, and web components
    • GPU nodes: Run model inference workloads
  • IRSA (IAM roles for service accounts): Pod-level AWS permissions using EKS OIDC integration
  • Amazon S3 storage: Centralized storage for models, logs, and application data
  • Ingress controllers: NGINX Ingress or AWS Load Balancer Controller, depending on deployment configuration

Kubernetes namespaces

Poolside organizes resources across these namespaces:
Namespace                     | Purpose                                    | Key components
poolside                      | Core platform services                     | Core API, reconciliation loop
poolside-models               | Model inference workloads                  | Inference pods, model services
ingress-nginx                 | Ingress (NGINX-based)                      | NGINX controller, admission webhooks
aws-load-balancer-controller  | AWS ALB/NLB integration (when ALB is used) | AWS Load Balancer Controller
kube-system                   | Kubernetes system components               | CoreDNS, kube-proxy, AWS components
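
As a quick sanity check, the namespaces listed above can be compared against a live cluster. The following is a minimal sketch, not part of the Poolside tooling: the function names `expected_namespaces` and `check_namespaces` and the `nginx`/`alb` argument values are hypothetical, and it assumes `kubectl` is already configured for the Poolside cluster.

```shell
# Hypothetical helper: print the namespaces expected for a given ingress mode.
# The "nginx" / "alb" argument values are illustrative, not a Poolside setting.
expected_namespaces() {
  local extra
  case "${1:-nginx}" in
    nginx) extra=ingress-nginx ;;
    alb)   extra=aws-load-balancer-controller ;;
    *)     echo "unknown ingress mode: $1" >&2; return 1 ;;
  esac
  printf '%s\n' poolside poolside-models kube-system "$extra"
}

# Report every expected namespace that is missing from the current cluster.
# Returns non-zero if at least one namespace is missing.
check_namespaces() {
  local ns missing=0
  for ns in $(expected_namespaces "$1"); do
    if ! kubectl get namespace "$ns" >/dev/null 2>&1; then
      echo "missing namespace: $ns"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_namespaces alb` would check for the aws-load-balancer-controller namespace instead of ingress-nginx.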

Key services

Poolside namespace

The poolside namespace contains:
  • core-api: Main application API service
  • core-api-models-reconciliation-loop: Manages model lifecycle

Poolside models namespace

The poolside-models namespace contains:
  • Inference services: Named inference-<UUID>-internal (ClusterIP)
  • Inference pods: Run model inference workloads
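
To illustrate the naming pattern, the sketch below builds the Service name and its in-cluster DNS name for a given model UUID. The function names are hypothetical, and the DNS form assumes the default Kubernetes cluster domain (`cluster.local`), which a given cluster may override.

```shell
# Build the Service name for a model, following the
# inference-<UUID>-internal pattern described above.
inference_service_name() {
  printf 'inference-%s-internal\n' "$1"
}

# Build the in-cluster DNS name for that Service.
# Assumption: the cluster uses the default domain, cluster.local.
inference_service_dns() {
  printf '%s.poolside-models.svc.cluster.local\n' "$(inference_service_name "$1")"
}

# List the Services in the poolside-models namespace, including their type.
list_inference_services() {
  kubectl get services -n poolside-models -o wide
}
```

Because these Services are of type ClusterIP, the DNS name resolves only from inside the cluster.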

Ingress namespace

The ingress-nginx namespace contains:
  • ingress-nginx-controller: Handles HTTP and HTTPS traffic routing
  • ingress-nginx-controller-admission: Admission webhooks for ingress validation

Ingress controllers

Poolside supports different ingress controllers depending on the AWS deployment configuration.

NGINX ingress controller

The NGINX ingress controller is used in deployments that rely on Kubernetes-native ingress management.
  • Routes HTTP and HTTPS traffic within the cluster
  • Terminates TLS using in-cluster certificates
  • Commonly used in restricted or non-ALB-based environments

AWS Load Balancer Controller

In AWS-native deployments, Poolside can use the AWS Load Balancer Controller to provision Application Load Balancers (ALBs).
  • Integrates with AWS Elastic Load Balancing
  • Uses AWS Certificate Manager (ACM) for TLS certificate management
  • Automatically provisions and manages ALBs based on Kubernetes ingress resources
Which ingress controller is used depends on the deployment model and the AWS environment (commercial, GovCloud, or restricted).
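
One way to see which controller a cluster actually uses is to inspect its IngressClass resources. The sketch below is a hedged example: the function names are hypothetical, and the two controller identifiers are the upstream defaults for ingress-nginx and the AWS Load Balancer Controller, so a customized installation may report different values.

```shell
# Map an IngressClass controller string to a human-readable name.
# Assumption: the identifiers below are the upstream defaults.
describe_ingress_controller() {
  case "$1" in
    k8s.io/ingress-nginx) echo "NGINX ingress controller" ;;
    ingress.k8s.aws/alb)  echo "AWS Load Balancer Controller" ;;
    *)                    echo "unrecognized controller: $1" ;;
  esac
}

# Print each IngressClass in the cluster with the controller behind it.
list_ingress_controllers() {
  kubectl get ingressclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.controller}{"\n"}{end}' |
  while IFS="$(printf '\t')" read -r name controller; do
    printf '%s: %s\n' "$name" "$(describe_ingress_controller "$controller")"
  done
}
```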

Node groups

CPU node group

  • Instance type: m5.4xlarge (default)
  • Workloads: Core API, ingress controller, monitoring services
  • Configuration:
    • Minimum and default size: 3 nodes
    • Maximum size: 10 nodes
    • 200 GB gp3 EBS volume per node

GPU node group

  • Instance types:
    • p5.48xlarge (H100 GPUs)
    • p5e.48xlarge (H200 GPUs)
  • Workloads: Model inference
  • Configuration:
    • GPU taints to prevent non-GPU workloads
    • Support for capacity reservations
    • Support for capacity blocks for cost optimization
    • 200 GB gp3 EBS volume per node
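
The node group layout can be checked from the cluster side with standard kubectl queries. This is a minimal sketch with hypothetical function names. It assumes the default instance types listed above (a deployment may override the CPU default), and it does not assume a particular GPU taint key, since that is not specified here.

```shell
# Classify an EC2 instance type using the defaults listed above.
node_group_for_instance_type() {
  case "$1" in
    p5.48xlarge)  echo "gpu (H100)" ;;
    p5e.48xlarge) echo "gpu (H200)" ;;
    m5.4xlarge)   echo "cpu" ;;
    *)            echo "unknown" ;;
  esac
}

# Show each node with its instance type, using the well-known
# node.kubernetes.io/instance-type label as an extra column.
list_node_instance_types() {
  kubectl get nodes -L node.kubernetes.io/instance-type
}

# Show the taint keys on each node; GPU nodes should list a GPU taint,
# and CPU nodes typically show <none>.
list_node_taints() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```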

Storage components

Amazon S3 buckets

Poolside uses the following S3 buckets:
  • Primary bucket: poolside-[deployment-name]
    • Models, checkpoints, and application data
    • HTTPS-only access enforcement
    • Access logging enabled
  • Access logs bucket: poolside-[deployment-name]-access-logs
    • Logs for primary bucket access
    • HTTPS-only access enforcement

IAM roles for service accounts (IRSA)

Poolside uses IAM Roles for Service Accounts (IRSA) to grant AWS permissions to Kubernetes workloads.
  • IAM roles are associated with Kubernetes service accounts
  • Permissions are scoped at the pod level
  • No node-level AWS credentials are shared across workloads
IRSA is used to access:
  • Amazon S3
  • Amazon Bedrock, if enabled
  • Marketplace metering, if applicable
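
To see which IAM role a workload receives through IRSA, it can help to list the role annotation on each service account. The following sketch uses hypothetical function names; `eks.amazonaws.com/role-arn` is the standard IRSA annotation key, and the service account names in a given deployment are not assumed here.

```shell
# Extract the role name (last path segment) from an IAM role ARN.
role_name_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Print each service account in a namespace with its IRSA role annotation.
# Service accounts without the annotation show <none>.
list_irsa_roles() {
  kubectl get serviceaccounts -n "${1:-poolside}" \
    -o custom-columns='NAME:.metadata.name,ROLE_ARN:.metadata.annotations.eks\.amazonaws\.com/role-arn'
}
```

Calling `list_irsa_roles poolside-models` runs the same query against the inference namespace.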

Essential management operations

Viewing pods

# List pods in the poolside namespace (Core API and reconciliation loop)
kubectl get pods -n poolside

# List model inference pods
kubectl get pods -n poolside-models

Getting logs

# Show labels to confirm selectors
kubectl get pods -n poolside --show-labels

# Get logs for all Core API pods
kubectl logs -n poolside -l 'app.kubernetes.io/name=core-api'

# Follow Core API logs
kubectl logs -f -n poolside -l 'app.kubernetes.io/name=core-api'

# Show labels to confirm the inference selector
kubectl get pods -n poolside-models --show-labels

# Get logs for all inference pods (replace with the correct label selector)
kubectl logs -n poolside-models -l '<label-selector>'

# Follow inference logs
kubectl logs -f -n poolside-models -l '<label-selector>'

Checking GPU utilization

# Check GPU status on inference pods
kubectl exec -it <inference-pod-name> -n poolside-models -- nvidia-smi
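
For a more compact view than the full nvidia-smi table, its CSV query mode can be combined with a small filter. This is a sketch under stated assumptions: the function names and the 10% idle threshold are hypothetical, and it relies on `nvidia-smi` being available inside the inference container, as in the command above.

```shell
# Print one CSV line per GPU: index, utilization %, memory used, memory total.
gpu_report() {
  kubectl exec "$1" -n poolside-models -- \
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
               --format=csv,noheader,nounits
}

# Read gpu_report-style CSV on stdin and flag GPUs whose utilization is
# below a threshold (default 10, an arbitrary illustrative value).
flag_idle_gpus() {
  awk -F', *' -v min="${1:-10}" \
    '$2 + 0 < min { printf "GPU %s idle (%s%%)\n", $1, $2 }'
}
```

A typical invocation would be `gpu_report <inference-pod-name> | flag_idle_gpus 10`.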