Overview

This guide breaks down the AWS infrastructure cost structure for a recommended Poolside deployment on Amazon EKS, using the reference architecture defaults as the baseline. Total cost depends on the GPU instance type, payment model (On-Demand vs Reserved), high-availability configuration, data transfer volumes, and AWS region. To model your specific deployment, use the AWS Pricing Calculator.

Cost categories

Poolside deployment cost on AWS comes from four categories:
  • GPU compute: The dominant line item. The reference architecture’s minimum supported GPU instance is p5e.48xlarge.
  • Platform compute: CPU nodes that run the Poolside platform services, the AWS Load Balancer Controller, the External Secrets Operator, and other cluster add-ons.
  • Data plane: Amazon RDS for PostgreSQL, Amazon S3 buckets (data, access logs, and an optional models bucket), and Amazon ECR repositories.
  • Network: VPC, NAT gateways, ALB, and inter-AZ data transfer.

Reference architecture baseline

The following table reflects the reference architecture defaults at recommended sizing in us-east-1, with On-Demand pricing as of April 2026. Use these figures as a starting point for the AWS Pricing Calculator. Actual cost depends on usage, region, and payment model.

| Resource | Default | Approx. $/hr | Notes |
| --- | --- | --- | --- |
| GPU instances | 1× p5en.48xlarge | ~$63.30 | Full profile only. See Reduce GPU cost for payment-model options. |
| CPU instances | 3× m5.4xlarge | ~$2.30 | Platform workloads, AWS Load Balancer Controller, External Secrets Operator, cluster add-ons. |
| RDS PostgreSQL | db.m7g.xlarge, Multi-AZ | ~$0.67 | Multi-AZ on by default. Scale via database_instance_class. |
| NAT gateways | 2 (one per AZ) | ~$0.09 | Set single_nat_gateway = true to halve this at the cost of AZ-level NAT redundancy. |
| EKS control plane | 1 cluster | $0.10 | Fixed. |
| KMS, S3, ECR, CloudWatch, ALB | n/a | < $0.05 | Collectively negligible at platform-baseline volumes. |

The reference architecture lists p5e.48xlarge (8× H200) as the minimum supported GPU instance. p5e.48xlarge is not offered On-Demand and is currently only available via EC2 Capacity Blocks for ML in us-east-2. The table above uses p5en.48xlarge (also 8× H200, with EFAv3 networking) as the closest On-Demand-priced reference in us-east-1. For p5e.48xlarge, model the Capacity Block reservation cost separately.
Approximate totals at baseline:
  • platform-only profile: ~$3.22/hr (~$2,350/month)
  • full profile (1 GPU node): ~$66.52/hr (~$48,560/month)
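
The totals can be reproduced from the table with a few lines of Python. This is a minimal sketch, not an official calculator: the hourly figures are copied from the baseline table, the "< $0.05" row is treated as its $0.05 upper bound, and the 730 hours per month conversion is an assumption inferred from the stated monthly figures. Because the table entries are rounded, the computed sums land about a cent below the stated totals.

```python
# Hourly figures copied from the baseline table (us-east-1, On-Demand, approximate).
BASELINE_HOURLY_USD = {
    "cpu_instances": 2.30,      # 3x m5.4xlarge
    "rds_postgresql": 0.67,     # db.m7g.xlarge, Multi-AZ
    "nat_gateways": 0.09,       # 2 gateways, one per AZ
    "eks_control_plane": 0.10,  # 1 cluster
    "misc_services": 0.05,      # upper bound for KMS, S3, ECR, CloudWatch, ALB
}
GPU_HOURLY_USD = 63.30          # 1x p5en.48xlarge

# Assumption: 8,760 hours per year / 12 months.
HOURS_PER_MONTH = 730


def hourly_total(include_gpu: bool) -> float:
    """Sum the baseline rows; add the GPU row for the full profile."""
    total = sum(BASELINE_HOURLY_USD.values())
    if include_gpu:
        total += GPU_HOURLY_USD
    return round(total, 2)


def monthly_total(hourly: float) -> float:
    """Convert an hourly rate to a monthly figure."""
    return round(hourly * HOURS_PER_MONTH, 2)


platform_only = hourly_total(include_gpu=False)
full = hourly_total(include_gpu=True)
print(f"platform-only: ${platform_only:.2f}/hr, ${monthly_total(platform_only):,.2f}/month")
print(f"full:          ${full:.2f}/hr, ${monthly_total(full):,.2f}/month")
```

Swapping in your own region's rates in the dictionary gives a quick sanity check before building a full estimate in the AWS Pricing Calculator.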

Reduce GPU cost

GPU cost dominates the full profile. The main levers against On-Demand pricing:
  • EC2 Instance Savings Plans or Reserved Instances: A 1- or 3-year commitment locked to a specific instance family and region, for example p5en in us-east-1. Deepest discount, typically ~30–50% off On-Demand on GPU families, in exchange for the least flexibility.
  • Compute Savings Plans: The same commitment dollars apply across EC2, Fargate, and Lambda, and across any instance family or region. Smaller discount than EC2 Instance Savings Plans, but worth it if you already run a fleet-wide commitment or expect to migrate to a future GPU generation.
  • EC2 Capacity Blocks for ML: Prepaid, reserved GPU capacity for a fixed time window. Often the only way to reliably secure p5, p5e, or p5en capacity at scale. The effective rate is typically below On-Demand.
For cost-sensitive testing, set gpu_desired_size = 0 to keep the GPU node group defined but empty.
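
To see what these levers mean in dollars, the sketch below applies the ~30–50% discount range quoted above to the baseline p5en.48xlarge On-Demand rate. The function names are hypothetical, the 730 hours per month figure is an assumption, and the discounts are the typical range from this guide rather than quoted AWS prices.

```python
ON_DEMAND_GPU_HOURLY_USD = 63.30  # p5en.48xlarge, from the baseline table
HOURS_PER_MONTH = 730             # assumption: 8,760 hours per year / 12


def discounted_hourly(on_demand: float, discount: float) -> float:
    """Apply a fractional discount (0.30 means 30% off) to an hourly rate."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return round(on_demand * (1.0 - discount), 2)


def gpu_monthly(gpu_nodes: int, hourly: float) -> float:
    """Monthly GPU instance cost for a given node count and hourly rate."""
    return round(gpu_nodes * hourly * HOURS_PER_MONTH, 2)


# On-Demand, then the low and high ends of the typical commitment discount.
for discount in (0.0, 0.30, 0.50):
    hourly = discounted_hourly(ON_DEMAND_GPU_HOURLY_USD, discount)
    print(f"{discount:.0%} off: ${hourly:.2f}/hr, ${gpu_monthly(1, hourly):,.2f}/month")

# gpu_desired_size = 0 keeps the node group defined but bills no GPU instance hours.
print(f"empty node group: ${gpu_monthly(0, ON_DEMAND_GPU_HOURLY_USD):,.2f}/month")
```

Actual commitment discounts vary by term, payment option, and region, so treat the output as a range to refine with your AWS account team.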

Components and sizing knobs

Amazon EKS cluster

EKS control plane

  • Managed Kubernetes 1.29+ with an OIDC provider for IRSA
  • Managed add-ons: vpc-cni, kube-proxy, coredns, metrics-server, snapshot-controller, aws-ebs-csi-driver
  • The AWS Load Balancer Controller, External Secrets Operator, and (for local GPU inference) the NVIDIA GPU Operator are installed by Helm

GPU node group

  • p5e.48xlarge minimum instance type
  • 200 GB gp3 EBS volume per node, KMS-encrypted
  • On-Demand, Reserved Instances, Savings Plans, or EC2 capacity reservations supported
  • Scale via gpu_desired_size, gpu_max_size

CPU node group

  • m5.4xlarge default instance type
  • 200 GB gp3 EBS volume per node, KMS-encrypted
  • 3 nodes recommended for production
  • Runs core-api and supporting services
  • Scale via cpu_instance_type, cpu_desired_size, cpu_max_size

Database and storage

Amazon RDS

  • PostgreSQL 16+ on db.m7g.xlarge (default), Multi-AZ on by default
  • 64 GB gp3 storage (default), KMS-encrypted, AWS-managed master password
  • Performance Insights and CloudWatch log exports enabled
  • Scale via database_instance_class, database_allocated_storage_gib, database_multi_az

Amazon S3

  • Data bucket (model artifacts, telemetry, repositories)
  • Access log bucket
  • Optional models bucket for the inference checkpoint workflow (named <deployment>-models when provisioned by the reference architecture)
  • All buckets SSE-KMS encrypted with public access blocked
  • Standard S3 storage pricing

Network infrastructure

VPC components

  • Dedicated VPC with public, private worker, and private control-plane subnet tiers across multiple AZs
  • NAT gateways (one per AZ by default; toggleable to a single NAT)
  • S3 VPC gateway endpoint to bypass NAT for image pulls and model artifact downloads
  • Internet-facing ALB managed by the AWS Load Balancer Controller
  • Security Groups

Data transfer

  • Inter-AZ data transfer between worker nodes and Multi-AZ RDS
  • Internet ingress and egress through the ALB and NAT gateways
  • S3 traffic routed through the VPC gateway endpoint at no per-GB cost

Cost estimation process

  1. Work with your Poolside representative to determine:
    • Expected concurrent user load
    • Model inference requirements (which models, replica counts)
    • High availability needs (Multi-AZ RDS, GPU replica counts)
    • Data retention requirements (S3 lifecycle, audit retention)
  2. Use the AWS Pricing Calculator with the baseline configuration above. See the example EC2 calculator for a starting point.
Key factors that influence total cost:
  • GPU instance type and quantity
  • Payment model selection (On-Demand vs Reserved Instances vs Savings Plans)
  • Multi-AZ deployment for RDS and the worker subnets
  • Data transfer volumes between AZs and to the internet
  • Geographic region selection
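
Before opening the AWS Pricing Calculator, it can help to compare a few configurations side by side. The sketch below is a hypothetical helper, not part of the reference architecture: the field names mirror the sizing variables mentioned in this guide, the per-unit rates are derived by dividing the baseline table rows by their default counts, and data transfer, storage, and regional price differences are deliberately left out.

```python
from dataclasses import dataclass

# Per-unit hourly rates derived from the baseline table (us-east-1, On-Demand).
GPU_NODE_HOURLY_USD = 63.30        # p5en.48xlarge
CPU_NODE_HOURLY_USD = 2.30 / 3     # table lists 3x m5.4xlarge at ~$2.30
NAT_GATEWAY_HOURLY_USD = 0.09 / 2  # table lists 2 gateways at ~$0.09
RDS_HOURLY_USD = 0.67              # db.m7g.xlarge, Multi-AZ
EKS_HOURLY_USD = 0.10              # control plane
MISC_HOURLY_USD = 0.05             # upper bound for KMS, S3, ECR, CloudWatch, ALB


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for the knobs that move the hourly total."""
    gpu_desired_size: int = 1
    cpu_desired_size: int = 3
    single_nat_gateway: bool = False

    def hourly_usd(self) -> float:
        nat_count = 1 if self.single_nat_gateway else 2
        return (
            self.gpu_desired_size * GPU_NODE_HOURLY_USD
            + self.cpu_desired_size * CPU_NODE_HOURLY_USD
            + nat_count * NAT_GATEWAY_HOURLY_USD
            + RDS_HOURLY_USD
            + EKS_HOURLY_USD
            + MISC_HOURLY_USD
        )


scenarios = {
    "baseline (full profile)": Scenario(),
    "platform-only": Scenario(gpu_desired_size=0),
    "single NAT gateway": Scenario(single_nat_gateway=True),
    "two GPU nodes": Scenario(gpu_desired_size=2),
}
for name, scenario in scenarios.items():
    print(f"{name}: ${scenario.hourly_usd():.2f}/hr")
```

The output is only as good as the rates fed in, so replace the constants with figures for your region and payment model before relying on the comparison.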
Contact your Poolside representative for detailed pricing information and optimization recommendations tailored to your workload.