Overview
This guide breaks down the AWS infrastructure cost structure for a recommended Poolside deployment on Amazon EKS, using the reference architecture defaults as the baseline. Total cost depends on the GPU instance type, payment model (On-Demand vs Reserved), high-availability configuration, data transfer volumes, and AWS region. To model your specific deployment, use the AWS Pricing Calculator.
Cost categories
Poolside deployment cost on AWS comes from four categories:
- GPU compute: The dominant line item. The reference architecture’s minimum supported GPU instance is p5e.48xlarge.
- Platform compute: CPU nodes that run the Poolside platform services, the AWS Load Balancer Controller, the External Secrets Operator, and other cluster add-ons.
- Data plane: Amazon RDS for PostgreSQL, Amazon S3 buckets (data, access logs, and an optional models bucket), and Amazon ECR repositories.
- Network: VPC, NAT gateways, ALB, and inter-AZ data transfer.
Reference architecture baseline
The following table reflects the reference architecture defaults at recommended sizing in us-east-1, at On-Demand pricing as of April 2026. Use these figures as a starting point for the AWS Pricing Calculator. Actual cost depends on usage, region, and payment model.
| Resource | Default | Approx. $/hr | Notes |
|---|---|---|---|
| GPU instances | 1× p5en.48xlarge | ~$63.30 | Applies to the full profile only. See Reduce GPU cost for payment-model options. |
| CPU instances | 3× m5.4xlarge | ~$2.30 | Platform workloads, AWS Load Balancer Controller, External Secrets Operator, cluster add-ons. |
| RDS PostgreSQL | db.m7g.xlarge, Multi-AZ | ~$0.67 | Multi-AZ on by default. Scale via database_instance_class. |
| NAT gateways | 2 (one per AZ) | ~$0.09 | Set single_nat_gateway = true to halve this at the cost of AZ-level NAT redundancy. |
| EKS control plane | 1 cluster | $0.10 | Fixed. |
| KMS, S3, ECR, CloudWatch, ALB | n/a | < $0.05 | Collectively negligible at platform-baseline volumes. |
The reference architecture lists p5e.48xlarge (8× H200) as the minimum supported GPU instance. p5e.48xlarge is not offered On-Demand and is currently only available via EC2 Capacity Blocks for ML in us-east-2. The table above uses p5en.48xlarge (also 8× H200, with EFAv3 networking) as the closest On-Demand-priced reference in us-east-1. For p5e.48xlarge, model the Capacity Block reservation cost separately.
Approximate monthly totals from the table above:
- platform-only profile: ~$2,350/month
- full profile (1 GPU node): ~$48,560/month
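The monthly totals can be cross-checked by summing the hourly column of the table and scaling to a month. The following is a minimal sketch, not an official calculator. It assumes 730 hours per month (the guide does not state which convention it uses) and treats the "< $0.05" row as its $0.05 upper bound. The dictionary keys and the function name are illustrative.

```python
# Hourly On-Demand figures copied from the baseline table (us-east-1).
HOURLY_USD = {
    "gpu_instances": 63.30,      # 1x p5en.48xlarge, full profile only
    "cpu_instances": 2.30,       # 3x m5.4xlarge
    "rds_postgresql": 0.67,      # db.m7g.xlarge, Multi-AZ
    "nat_gateways": 0.09,        # 2 gateways, one per AZ
    "eks_control_plane": 0.10,   # 1 cluster
    "misc_services": 0.05,       # assumption: upper bound of the "< $0.05" row
}

HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year divided by 12


def monthly_cost(include_gpu: bool) -> float:
    """Sum the hourly line items and scale them to one month."""
    hourly = sum(
        rate
        for name, rate in HOURLY_USD.items()
        if include_gpu or name != "gpu_instances"
    )
    return round(hourly * HOURS_PER_MONTH, 2)


platform_only = monthly_cost(include_gpu=False)
full = monthly_cost(include_gpu=True)
print(f"platform-only profile: ~${platform_only:,.0f}/month")
print(f"full profile (1 GPU node): ~${full:,.0f}/month")
```

Under these assumptions the results land within rounding distance of the monthly figures quoted in this section. Small differences come from the rounded hourly values in the table.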
Reduce GPU cost
GPU cost dominates the full profile. The main levers against On-Demand pricing:
- EC2 Instance Savings Plans or Reserved Instances: A 1- or 3-year commitment locked to a specific instance family and region, for example p5en in us-east-1. Deepest discount, typically ~30–50% off On-Demand on GPU families, in exchange for the least flexibility.
- Compute Savings Plans: The same commitment dollars apply across EC2, Fargate, and Lambda, and across any instance family or region. Smaller discount than EC2 Instance Savings Plans, but worth it if you already run a fleet-wide commitment or expect to migrate to a future GPU generation.
- EC2 Capacity Blocks for ML: Prepaid reserved GPU capacity for a fixed window. Often the only way to reliably secure p5, p5e, or p5en capacity at scale; the effective rate is typically below On-Demand.
To run without GPU nodes, set gpu_desired_size = 0, which keeps the GPU node group defined but empty.
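To weigh a commitment against On-Demand, it can help to turn the discount range into a monthly figure and a break-even utilization. The sketch below is a rough illustration under stated assumptions: it applies the ~30–50% range quoted above to the p5en.48xlarge On-Demand rate from the baseline table, uses 730 hours per month, and assumes a commitment is billed for every hour while On-Demand is billed only for hours the node actually runs. Capacity Blocks are not modeled. All function names are hypothetical.

```python
ON_DEMAND_HOURLY = 63.30  # p5en.48xlarge figure from the baseline table
HOURS_PER_MONTH = 730     # assumption: 8,760 hours per year divided by 12


def committed_monthly(discount: float) -> float:
    """Monthly cost of one GPU node under a commitment billed every hour."""
    return round(ON_DEMAND_HOURLY * (1 - discount) * HOURS_PER_MONTH, 2)


def on_demand_monthly(utilization: float) -> float:
    """Monthly On-Demand cost when the node runs for a fraction of the month."""
    return round(ON_DEMAND_HOURLY * utilization * HOURS_PER_MONTH, 2)


def break_even_utilization(discount: float) -> float:
    """Fraction of hours above which the commitment is cheaper than On-Demand."""
    return 1 - discount


for discount in (0.30, 0.50):
    print(
        f"{discount:.0%} discount: ${committed_monthly(discount):,.2f}/month, "
        f"break-even at {break_even_utilization(discount):.0%} utilization"
    )
```

Read the break-even value this way: if the GPU node group is scaled to zero often enough that the node runs for less than that fraction of the month, On-Demand may cost less than a commitment at that discount.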
Components and sizing knobs
Amazon EKS cluster
EKS control plane
- Managed Kubernetes 1.29+ with an OIDC provider for IRSA
- Managed add-ons: vpc-cni, kube-proxy, coredns, metrics-server, snapshot-controller, aws-ebs-csi-driver
- The AWS Load Balancer Controller, External Secrets Operator, and (for local GPU inference) the NVIDIA GPU Operator are installed by Helm
GPU node group
- p5e.48xlarge minimum instance type
- 200 GB gp3 EBS volume per node, KMS-encrypted
- On-Demand, Reserved Instances, Savings Plans, or EC2 capacity reservations supported
- Scale via gpu_desired_size, gpu_max_size
CPU node group
- m5.4xlarge default instance type
- 200 GB gp3 EBS volume per node, KMS-encrypted
- 3 nodes recommended for production
- Runs core-api and supporting services
- Scale via cpu_instance_type, cpu_desired_size, cpu_max_size
Database and storage
Amazon RDS
- PostgreSQL 16+ on db.m7g.xlarge (default), Multi-AZ on by default
- 64 GB gp3 storage (default), KMS-encrypted, AWS-managed master password
- Performance Insights and CloudWatch log exports enabled
- Scale via database_instance_class, database_allocated_storage_gib, database_multi_az
Amazon S3
- Data bucket (model artifacts, telemetry, repositories)
- Access log bucket
- Optional models bucket for the inference checkpoint workflow (named <deployment>-models when provisioned by the reference architecture)
- All buckets SSE-KMS encrypted with public access blocked
- Standard S3 storage pricing
Network infrastructure
VPC components
- Dedicated VPC with public, private worker, and private control-plane subnet tiers across multiple AZs
- NAT gateways (one per AZ by default; toggleable to a single NAT)
- S3 VPC gateway endpoint to bypass NAT for image pulls and model artifact downloads
- Internet-facing ALB managed by the AWS Load Balancer Controller
- Security Groups
Data transfer
- Inter-AZ data transfer between worker nodes and Multi-AZ RDS
- Internet ingress and egress through the ALB and NAT gateways
- S3 traffic routed through the VPC gateway endpoint at no per-GB cost
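The three data transfer paths above can be compared with a small estimator. This is a hedged sketch: the per-GB rates and the monthly volumes in the example are placeholders chosen for illustration, not figures from this guide, so replace them with values from the current AWS pricing pages for your region. It also assumes inter-AZ traffic is billed in both directions, and it does not model internet egress. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRates:
    """Per-GB rates in USD. Fill these in from the current AWS pricing pages."""
    nat_processing_per_gb: float
    inter_az_per_gb_each_direction: float


def monthly_transfer_cost(rates: TransferRates, nat_gb: float,
                          inter_az_gb: float, s3_endpoint_gb: float) -> dict:
    """Estimate monthly charges for NAT, inter-AZ, and S3 endpoint traffic."""
    nat = nat_gb * rates.nat_processing_per_gb
    # Assumption: inter-AZ traffic is billed on both the sending and receiving side.
    inter_az = inter_az_gb * rates.inter_az_per_gb_each_direction * 2
    # NAT processing charge this S3 traffic would incur without the gateway endpoint.
    nat_avoided = s3_endpoint_gb * rates.nat_processing_per_gb
    return {
        "nat_processing": round(nat, 2),
        "inter_az": round(inter_az, 2),
        "s3_via_gateway_endpoint": 0.0,  # no per-GB charge on this path
        "nat_charge_avoided": round(nat_avoided, 2),
    }


# Placeholder rates and volumes, for illustration only.
example_rates = TransferRates(nat_processing_per_gb=0.045,
                              inter_az_per_gb_each_direction=0.01)
estimate = monthly_transfer_cost(example_rates, nat_gb=500,
                                 inter_az_gb=1000, s3_endpoint_gb=5000)
for item, usd in estimate.items():
    print(f"{item}: ${usd:,.2f}")
```

The nat_charge_avoided entry shows why the gateway endpoint matters for large image pulls and model artifact downloads: that traffic would otherwise pass through the NAT gateways and be charged per GB.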
Cost estimation process
- Work with your Poolside representative to determine:
  - Expected concurrent user load
  - Model inference requirements (which models, replica counts)
  - High availability needs (Multi-AZ RDS, GPU replica counts)
  - Data retention requirements (S3 lifecycle, audit retention)
- Use the AWS Pricing Calculator with the baseline configuration above. See the example EC2 calculator for a starting point.
Key factors that influence total cost:
- GPU instance type and quantity
- Payment model selection (On-Demand vs Reserved Instances vs Savings Plans)
- Multi-AZ deployment for RDS and the worker subnets
- Data transfer volumes between AZs and to the internet
- Geographic region selection
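Before opening the AWS Pricing Calculator, a quick what-if script can show how these factors move the total. The sketch below is an approximation under stated assumptions: per-unit rates are derived by dividing the baseline table rows by their default counts, a month is 730 hours, the GPU discount is a single flat percentage, and data transfer and regional price differences are ignored. The Scenario fields loosely mirror the sizing knobs named in this guide, but the class and function are hypothetical and not part of the reference architecture.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year divided by 12

# Per-unit hourly rates derived from the baseline table (us-east-1, On-Demand).
GPU_NODE_HOURLY = 63.30            # one p5en.48xlarge
CPU_NODE_HOURLY = 2.30 / 3         # table lists 3x m5.4xlarge at ~$2.30
NAT_GATEWAY_HOURLY = 0.09 / 2      # table lists 2 NAT gateways at ~$0.09
FIXED_HOURLY = 0.67 + 0.10 + 0.05  # Multi-AZ RDS, EKS control plane, misc upper bound


@dataclass(frozen=True)
class Scenario:
    gpu_nodes: int = 1
    cpu_nodes: int = 3
    single_nat_gateway: bool = False
    gpu_discount: float = 0.0  # for example 0.30 for a 30% commitment discount


def estimate_monthly(s: Scenario) -> float:
    """Approximate monthly infrastructure cost for a scenario, in USD."""
    nat_count = 1 if s.single_nat_gateway else 2
    hourly = (
        s.gpu_nodes * GPU_NODE_HOURLY * (1 - s.gpu_discount)
        + s.cpu_nodes * CPU_NODE_HOURLY
        + nat_count * NAT_GATEWAY_HOURLY
        + FIXED_HOURLY
    )
    return round(hourly * HOURS_PER_MONTH, 2)


scenarios = {
    "baseline (full profile)": Scenario(),
    "platform-only, single NAT": Scenario(gpu_nodes=0, single_nat_gateway=True),
    "2 GPU nodes, 30% discount": Scenario(gpu_nodes=2, gpu_discount=0.30),
}
for label, scenario in scenarios.items():
    print(f"{label}: ~${estimate_monthly(scenario):,.0f}/month")
```

Treat the output as a sanity check on the order of magnitude only, and confirm any figure you plan to budget against in the AWS Pricing Calculator.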