Overview
This guide describes how to install the Poolside platform on Amazon Web Services (AWS) using either Amazon Bedrock or Amazon Elastic Kubernetes Service (EKS) with provisioned GPU nodes for inference. The installation process is organized into the following phases:
- Create the base infrastructure: Create a VPC (or use an existing one), an EKS cluster, an RDS database, and optional ECR repositories, IAM roles, and KMS keys.
- Provision node groups and add-ons: Create CPU and GPU node groups and install EKS add-ons (AWS CNI, EBS CSI).
- Upload container images and checkpoints: Upload Poolside and third-party container images to your ECR registries.
- Deploy the application: Deploy Poolside workloads and configure ingress.
- Complete post-deployment setup: Configure DNS, OIDC, and model loading.
In standard AWS Commercial accounts, IAM/OIDC trust for EKS IRSA is handled by Terraform automatically (no manual OIDC steps required).
Deployment bundle
The bundle contains Terraform containers, example variable files, and wrapper scripts. The installation_steps/aws/* directories include example terraform.tfvars files and wrapper scripts (run_terraform.sh or run.sh) for each step.
Container directories are formatted as <container_name>---<container_tag> so you can infer the container name and tags from the directory.
These directories include multi-architecture images for amd64 and arm64.
Prerequisites
- A bastion host or local machine that can access the AWS environment.
- Tools installed on the host, including the AWS CLI, kubectl, skopeo, and Podman or Docker.
- An AWS account with permissions to create VPC, EKS, RDS, IAM roles and policies, and ECR.
- If you are using Amazon Bedrock for inference, your AWS account must be allow-listed to access Poolside models (contact Poolside).
- A DNS hosted zone for the deployment hostname.
- An OCI-compatible container registry (Amazon ECR recommended). Terraform creates additional ECR repositories automatically in Step 1: Create the base infrastructure.
- Optional: A Terraform remote state S3 bucket with versioning. If you use the S3 backend and multiple users or locations run Terraform, enable DynamoDB state locking. If you run Terraform from a single host, you can use local state instead.
- If you are providing your own VPC, ensure it meets the requirements in the AWS VPC deployment guide.
If you use aws-load-balancer-controller (ALB), you must provide an ACM certificate for your deployment domain in the same AWS region as the cluster. If you use ingress-nginx (private or air-gapped), TLS terminates inside the cluster and Terraform expects Base64-encoded certificate and key values to create the Kubernetes TLS secret.
If you need to import or request an ACM certificate, see the AWS Certificate Manager documentation.
Environment configuration
Set the following shell variables before starting. The bash commands throughout this guide reference these variables so you can copy and paste them directly. Provided by you:
Preparation
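As an illustrative sketch, the environment configuration might look like the following. The values, and the DEPLOYMENT_NAME variable name, are assumptions; AWS_REGION and BUNDLE_DIR are referenced by commands later in this guide.

```shell
# Hypothetical values -- replace with your own deployment settings.
export DEPLOYMENT_NAME="acme-prod"          # assumed variable name for your Poolside deployment
export AWS_REGION="us-east-1"               # region referenced by later AWS CLI commands
export BUNDLE_DIR="$HOME/poolside-bundle"   # bundle root (contains containers/ and installation_steps/)
```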
Step A: Download the deployment bundle
Download the Poolside bundle from the provided S3 location, then set BUNDLE_DIR to the bundle root (the directory that contains containers/ and installation_steps/):
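For example, assuming a hypothetical S3 path (substitute the location Poolside provides):

```shell
# Download the bundle locally and point BUNDLE_DIR at its root.
aws s3 sync "s3://<provided-bucket>/<bundle-path>/" ./poolside-bundle
export BUNDLE_DIR="$PWD/poolside-bundle"

# Sanity-check the expected top-level directories.
ls "$BUNDLE_DIR/containers" "$BUNDLE_DIR/installation_steps"
```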
Step B (optional): Create container registry repositories
If you do not already have a container registry, create repositories in Amazon ECR for the deployment containers. If you are using a different container registry, follow your internal guidance to create repositories.
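A minimal sketch using the AWS CLI, assuming the four deployment image names listed in Step C:

```shell
# Create one ECR repository per deployment container image.
for repo in \
  poolside-self-managed-infra-phase-1 \
  poolside-self-managed-infra-phase-2 \
  poolside-self-managed-deployment \
  poolside-container-uploader-full; do
  aws ecr create-repository --repository-name "$repo" --region "$AWS_REGION"
done
```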
Step C: Sync Terraform containers to your registry
- Authenticate to your container registry before syncing the Terraform containers.
- Import the Terraform container images from the bundle into your registry using skopeo. Use the dir: transport to reference the extracted bundle directories. There are four deployment container images to synchronize:
  - poolside-self-managed-infra-phase-1: Creates VPC, EKS control plane, RDS, and ECR.
  - poolside-self-managed-infra-phase-2: Creates CPU/GPU node groups and installs EKS add-ons.
  - poolside-self-managed-deployment: Deploys the Poolside application.
  - poolside-container-uploader-full: Contains and pushes all required images for the deployment.
  This step syncs only the deployment container images. Application container images are synced in Step 2.5: Upload container images and model checkpoints.
  If your registry does not support multi-architecture images, copy a single architecture and omit the --multi-arch all flag. For more details on the skopeo copy command, see the skopeo-copy documentation.
- Optional: If you are using manual IAM, also import the poolside-aws-iam container image into your registry.
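A hedged example of one such copy to ECR, with placeholder account and tag values inferred from the <container_name>---<container_tag> directory layout:

```shell
# Authenticate skopeo to ECR, then copy one image from the extracted bundle.
aws ecr get-login-password --region "$AWS_REGION" \
  | skopeo login --username AWS --password-stdin "<account-id>.dkr.ecr.${AWS_REGION}.amazonaws.com"

# The dir: transport reads the image directly from the bundle directory.
skopeo copy --multi-arch all \
  "dir:${BUNDLE_DIR}/containers/poolside-self-managed-infra-phase-1---<container_tag>" \
  "docker://<account-id>.dkr.ecr.${AWS_REGION}.amazonaws.com/poolside-self-managed-infra-phase-1:<container_tag>"
```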
Step D: Configure remote state
Each installation step must have its own unique Terraform state file. All steps can share the same S3 bucket and DynamoDB lock table, but each must use a distinct key value in remote.tf to isolate its state file.
Example remote.tf for Step 1: Create the base infrastructure:
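A sketch of what that remote.tf might contain, assuming an S3 backend with DynamoDB locking; the bucket, table, and region values are placeholders:

```hcl
terraform {
  backend "s3" {
    bucket         = "<your-terraform-state-bucket>"
    key            = "infra-phase-1.tfstate"
    region         = "<your-aws-region>"
    dynamodb_table = "<your-lock-table>" # omit if you do not use DynamoDB state locking
    encrypt        = true
  }
}
```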
Copy the remote.tf file to the 2_infra_phase_2 and 3_deployment directories and update the key to a unique value (for example, infra-phase-2.tfstate, deploy.tfstate).
Step E: Set AWS credentials
Each Terraform container requires AWS credentials to deploy resources. You can pass these credentials in several ways depending on your organization’s security model:
- Environment variables (recommended for most users).
- Shared credentials volume (default for AWS CLI users): mount your ~/.aws directory into the container.
- Alternative authentication: If you use IAM roles, SSO, or STS federation, follow your internal process for credential injection. Poolside supports all standard AWS authentication mechanisms.
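Two hedged sketches of credential injection when launching a Terraform container; the image path is a placeholder for your full container image URL:

```shell
# Option 1: pass credentials from your shell environment into the container.
podman run -it \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -e AWS_SESSION_TOKEN \
  "<container_repository>/poolside-self-managed-infra-phase-1:<container_version_tag>"

# Option 2: mount your shared AWS credentials directory read-only.
podman run -it \
  -v "$HOME/.aws:/root/.aws:ro" \
  "<container_repository>/poolside-self-managed-infra-phase-1:<container_version_tag>"
```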
Step F (optional): Manage IAM roles and policies
This section applies only if you are not using the recommended default installation approach, which creates IAM roles and policies automatically. If you need to manage IAM roles externally, you can use the optional poolside-aws-iam container and the example file in installation_steps/aws/0_aws_iam/terraform.tfvars. You will also need to apply OIDC trust relationships after Step 1: Create the base infrastructure. A copy of the Terraform files is available in the bundle at terraform/terraform-poolside-aws-iam if you need to create the roles through your own process.
If you choose the external IAM path, update the infra terraform.tfvars files to disable IAM creation and provide the role ARNs your deployment should use.
- Configure installation_steps/aws/0_aws_iam/terraform.tfvars. The deployment_name, region, and kms_keys values must align with your chosen deployment settings. The KMS keys must already exist.
- Run the poolside-aws-iam container.
- Inside the container, run Terraform commands.
- After you complete Step 1, create the EKS OIDC provider using the irsa_requirements.eks_oidc_url output and set the audience to sts.amazonaws.com. Then apply IRSA trust relationships to the IAM roles listed in irsa_requirements.role_arns. See Step 1.1: OIDC and IRSA trust relationships below.
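The OIDC provider creation can be sketched with the AWS CLI; the URL comes from the irsa_requirements.eks_oidc_url output, and the thumbprint is a placeholder you must obtain for your region's OIDC endpoint:

```shell
# Create the IAM OIDC provider for the EKS cluster with the sts.amazonaws.com audience.
aws iam create-open-id-connect-provider \
  --url "https://oidc.eks.<region>.amazonaws.com/id/<cluster-oidc-id>" \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list "<certificate-thumbprint>"
```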
Installation
Step 1: Create the base infrastructure
Create the VPC (or use an existing one), EKS control plane, RDS, and optional ECR repositories. This step typically takes 15 to 30 minutes. If you have chosen to create the IAM roles externally, they must be created before running this step.
- Update installation_steps/aws/1_infra_phase_1/terraform.tfvars.
- Set inference_config for your inference path:
  - Bedrock: enable_bedrock = true, enable_local_inference = false.
  - Local GPU: enable_bedrock = false, enable_local_inference = true.
- Run the 1_infra_phase_1 container. You can use the included wrapper script or run the container manually. The wrapper script uses podman by default. Change it to docker if needed. Replace <container_repository>/poolside-self-managed-infra-phase-1:<container_version_tag> with your full container image URL.
- Inside the container, run Terraform commands.
- Record the following outputs for use in later steps:
  - cluster_name (for example, poolside-<your-deployment-name>)
  - models_bucket (S3 bucket, for example, poolside-<your-deployment-name>)
  - bastion_public_ip (if a bastion host was created)
- If you are using manual IAM, capture the irsa_requirements output and complete Step 1.1: OIDC and IRSA trust relationships after this step and before Step 2: Provision node groups and add-ons.
- Exit the container.
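"Run Terraform commands" typically means the standard init/plan/apply cycle; a sketch inside the container might be:

```shell
# Initialize the backend and providers, review the plan, then apply it.
terraform init
terraform plan -out=tfplan
terraform apply tfplan

# Print the outputs to record (cluster_name, models_bucket, bastion_public_ip).
terraform output
```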
Step 1.1: OIDC and IRSA trust relationships
This section only applies if you are using manual IAM role management. If you are using the default IAM path, skip this section.
- Create the IAM OIDC provider for the EKS cluster using the irsa_requirements.eks_oidc_url output. Set the audience to sts.amazonaws.com.
- Note the ARN and provider of the OIDC provider you created. For example:
  - ARN: arn:aws:iam::xxxxxxxxxxxx:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/AC1FAD5XXXXXXXXB5A016995E65314B1
  - Provider: oidc.eks.us-east-2.amazonaws.com/id/AC1FAD5XXXXXXXXB5A016995E65314B1
- If you used the poolside-aws-iam container for manual IAM, update its terraform.tfvars with the eks_oidc values and re-run terraform apply to apply the trust relationships automatically.
- If you are managing IAM outside of the provided Terraform, add the following trust policy statements to each role. Replace <ARN> and <PROVIDER> with the OIDC provider ARN and provider you created above:
  - poolside-deploy-aws-ebs-csi
  - poolside-deploy-cni-pod-role
  - poolside-deploy-nginx-pod-role
  - poolside-deploy-eks-pod-role
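For illustration, an IRSA trust policy statement generally takes the following shape. The namespace and service account values are placeholders; the exact values for each role come from the irsa_requirements output:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "<ARN>" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "<PROVIDER>:aud": "sts.amazonaws.com",
          "<PROVIDER>:sub": "system:serviceaccount:<namespace>:<service-account>"
        }
      }
    }
  ]
}
```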
Step 2: Provision node groups and add-ons
Create CPU/GPU node groups and install EKS add-ons. This step typically takes 5 to 15 minutes (mostly dependent on GPU nodes if using local inference).
- Update installation_steps/aws/2_infra_phase_2/terraform.tfvars.
- If using remote state, copy remote.tf from step 1 and adjust the key field.
- Run the 2_infra_phase_2 Terraform container using the wrapper script or manually. The wrapper script uses podman by default. Change it to docker if needed. Replace <container_registry>/poolside-self-managed-infra-phase-2:<container_image_tag> with your full container image URL.
- Inside the container, run Terraform commands.
- Exit the container.
Step 2.5: Upload containers and model checkpoints
Upload Poolside and third-party images to your registries and place model checkpoints in S3. You can run this in parallel with Step 2: Provision node groups and add-ons. If you enabled ecr_create_repositories = true in Step 1: Create the base infrastructure, the required ECR repositories for application images were created automatically and will be used as targets in this step. The uploader is located in the bundle at containers/poolside-container-uploader-full---<tag>. It includes the scripts and all container images required for fully air-gapped deployments.
- Run the container uploader using the wrapper script or manually. The wrapper script uses podman by default. Change it to docker if needed. Replace <container_repository>/poolside-container-uploader-full:<container_version_tag> with your full container image URL. Edit the run.sh file to pass the AWS_REGION environment variable using the -e flag: -e AWS_REGION={AWS_REGION} \
- Inside the container, run upload-containers.sh to populate ECR. The script reads registry targets from AWS Systems Manager Parameter Store (SSM), populated by Step 1: Create the base infrastructure if you enabled ECR creation, and uploads the required Poolside and third-party images.
- Copy model checkpoints to the S3 bucket created in Step 1: Create the base infrastructure. The bucket is typically named poolside-<your-deployment-name>. Poolside will provide the checkpoint details before you start this process. This can run in the background and does not block system availability.
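The checkpoint copy can be sketched as follows; the local path and any destination prefix are placeholders, so use the details Poolside provides:

```shell
# Recursively copy the provided model checkpoints into the models bucket.
aws s3 cp "/path/to/checkpoints/" \
  "s3://poolside-<your-deployment-name>/" \
  --recursive
```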
Step 3: Deploy the application
Deploy the Poolside workloads, services, and ingress to the EKS cluster. If you are using a bastion host, connect to it before running this step:
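A connection sketch using the bastion_public_ip output from Step 1; the key path and login user are assumptions that depend on your bastion AMI:

```shell
# SSH to the bastion host created in Step 1.
ssh -i "$HOME/.ssh/<your-key>.pem" ec2-user@"<bastion_public_ip>"
```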
- Update installation_steps/aws/3_deployment/terraform.tfvars.
- If using remote state, copy remote.tf and adjust the key field.
- Run the deployment container using the wrapper script or manually.
- Inside the container, run Terraform commands. A successful apply produces output similar to:
- Record the value of ingress.hostname. You will use it to create your DNS record in Step 4: Complete post-deployment setup.
Step 4: Complete post-deployment setup
- Configure DNS. Create a DNS CNAME (or an ALIAS record in Route 53) that points your domain name to the ingress.hostname output from Step 3: Deploy the application. For Route 53 instructions, see the Route 53 resource record set creation guide. The load balancer validates the request hostname. If the DNS CNAME does not match the ingress.hostname, requests are rejected. If you use Cloudflare for DNS, disable proxying for the deployment hostname. Cloudflare proxying overrides the provided certificate and breaks TLS.
- Complete the initial Poolside setup. After DNS propagation completes, open a browser and navigate to your deployment hostname. Follow the on-screen setup to bind your OIDC provider:
  - If you use AWS Cognito, use the values output from Terraform.
  - If you use an enterprise IdP, follow your organization’s OIDC registration flow.
- Load Poolside models. Load models through the Poolside Console after deployment and after you upload the checkpoints to S3. For more information, see Models.
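If your zone is in Route 53, the DNS step can be sketched with the AWS CLI; the hosted zone ID, record name, and target value below are placeholders:

```shell
# UPSERT a CNAME pointing your deployment hostname at the ingress hostname.
aws route53 change-resource-record-sets \
  --hosted-zone-id "<your-hosted-zone-id>" \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "poolside.example.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "<ingress.hostname output from Step 3>"}]
      }
    }]
  }'
```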
Verification
Verify that your Poolside deployment is running successfully. You can also view this information in the AWS Management Console using the EKS Kubernetes resources viewing guide.
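A generic health check sketch (the exact namespaces to inspect depend on your deployment):

```shell
# Confirm nodes are Ready and workloads are Running.
kubectl get nodes
kubectl get pods --all-namespaces
```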
Bedrock model detection
If you are using Bedrock for inference, verify that the Poolside models are being detected:
Local GPU model verification
If you are using local GPU inference, verify that the model pods are running:
Troubleshooting
DNS verification
Ensure your DNS record resolves to the correct AWS load balancer:
Registry authentication errors
If an image push or pull fails, re-authenticate:
EKS access issues
If kubectl cannot connect, refresh your credentials:
Bedrock region
If Bedrock models are not detected, verify that they are available in your selected region. Use us-east-1 or us-west-2, or set the bedrock_region variable in your terraform.tfvars.
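The checks above can be sketched with standard tooling; the account ID, hostnames, and cluster name are placeholders:

```shell
# DNS: confirm the deployment hostname resolves to the load balancer.
dig +short "<your-deployment-hostname>"

# Registry: re-authenticate to ECR.
aws ecr get-login-password --region "$AWS_REGION" \
  | podman login --username AWS --password-stdin "<account-id>.dkr.ecr.${AWS_REGION}.amazonaws.com"

# EKS: refresh kubeconfig credentials for the cluster.
aws eks update-kubeconfig --name "poolside-<your-deployment-name>" --region "$AWS_REGION"

# Bedrock: list foundation models available in a candidate region.
aws bedrock list-foundation-models --region us-east-1
```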