Overview
This guide describes how to update an existing Poolside deployment on Amazon Web Services (AWS). The update process is organized into the following phases:
- Update the base infrastructure: Update the EKS cluster, VPC, RDS database, and supporting components.
- Update node groups and add-ons: Update CPU and GPU node groups and EKS add-ons.
- Upload container images and checkpoints: Upload updated container images and model checkpoints.
- Deploy the application update: Deploy the updated Poolside application.
Deployment bundle
The updated bundle follows the same structure as the initial install. For more information, see Install on AWS.
Prerequisites
- A working Poolside deployment completed with the Install on AWS guide.
- The updated deployment bundle provided by Poolside.
- Tools installed on the host:
Environment configuration
Set the following shell variables before starting. The bash commands throughout this guide reference these variables so you can copy and paste them directly. These values should match the ones used during the initial install.
Only the Poolside-provided values should change from release to release. The values that you provide should not change unless you are making a change to your environment.
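Before running later steps, it can help to confirm that every variable you rely on is actually set in the current shell. The helper below is a minimal sketch, not part of the Poolside bundle: require_vars is a hypothetical function name, and apart from BUNDLE_DIR the variable names in the usage comment are placeholders for whatever your install used.

```bash
# Report any required variables that are unset or empty.
# Hypothetical helper; pass the variable names your deployment uses.
require_vars() (
  missing=0
  for name in "$@"; do
    # Indirectly read the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Succeed only when nothing was missing.
  [ "$missing" -eq 0 ]
)

# Example (names other than BUNDLE_DIR are hypothetical):
# require_vars BUNDLE_DIR AWS_REGION REGISTRY || exit 1
```

The function prints each missing name to standard error and returns a non-zero status, so it can gate a script with `|| exit 1`.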
Preparation
Step A: Download the updated deployment bundle
Download the updated bundle from the Poolside-provided S3 bucket:
Set BUNDLE_DIR to the bundle root (the directory that contains containers/ and installation_steps/):
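As a quick sanity check, you can confirm that BUNDLE_DIR points at the bundle root by testing for the two directories named above. This is a sketch under that single assumption; check_bundle_dir is a hypothetical helper name and is not shipped with the bundle.

```bash
# Verify that a directory looks like the bundle root, meaning it contains
# both containers/ and installation_steps/.
check_bundle_dir() (
  dir="${1:?usage: check_bundle_dir <bundle-root>}"
  for sub in containers installation_steps; do
    if [ ! -d "$dir/$sub" ]; then
      echo "not a bundle root: $dir (missing $sub/)" >&2
      exit 1
    fi
  done
  echo "bundle root OK: $dir"
)

# Example:
# check_bundle_dir "$BUNDLE_DIR" || exit 1
```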
Step B: Sync Terraform containers to your registry
- Authenticate to your container registry:
- Import the updated Terraform containers from the bundle into your registry using skopeo.
This step syncs only the deployment containers. Application containers are synced in Step 2.5: Upload containers and model checkpoints. If your registry does not support multi-architecture images, copy a single architecture and omit the --multi-arch all flag.
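To make the multi-architecture choice concrete, the sketch below prints a skopeo copy command with or without --multi-arch all so you can review it before running it. It is illustrative only: skopeo_copy_cmd is a hypothetical helper, and the source transport (shown as oci-archive: in the usage comment) and all image names are assumptions, since the actual bundle layout decides them.

```bash
# Print (rather than run) a skopeo copy command so it can be reviewed first.
# $1: source image reference including its transport (assumption: depends on
#     how the bundle stores images, for example oci-archive:<path>)
# $2: destination repository in your registry, without the docker:// prefix
# $3: "single" to omit --multi-arch all; anything else keeps the flag
skopeo_copy_cmd() {
  if [ "${3:-all}" = "single" ]; then
    printf 'skopeo copy %s docker://%s\n' "$1" "$2"
  else
    printf 'skopeo copy --multi-arch all %s docker://%s\n' "$1" "$2"
  fi
}

# Example (paths and registry are hypothetical):
# skopeo_copy_cmd "oci-archive:$BUNDLE_DIR/containers/<image>" "$REGISTRY/<repo>:<tag>"
```

Printing the command first keeps the sketch side-effect free; once the output looks right for your registry, you can run it directly.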
Step C: Set AWS credentials
Each Terraform container requires AWS credentials. You can pass these credentials in several ways depending on your organization’s security model:
- Environment variables (recommended for most users):
- Shared credentials volume (default for AWS CLI users): Mount your ~/.aws directory into the container:
- Alternative authentication: If using IAM roles, SSO, or STS federation, follow your internal process for credential injection. Poolside supports all standard AWS authentication mechanisms.
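The first two options map onto container runtime arguments roughly as follows. This is a sketch that assumes a Docker-compatible runtime and a container that runs as root (hence the /root/.aws mount target); aws_credential_args is a hypothetical helper, and the run_terraform.sh wrapper in the bundle may already handle this for you.

```bash
# Print the container run arguments for one of the two credential styles.
# "env":    forward AWS_* variables already exported in the host shell.
# "volume": mount ~/.aws read-only; /root/.aws assumes the container user is root.
aws_credential_args() {
  case "${1:-env}" in
    env)
      printf '%s\n' '-e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_SESSION_TOKEN'
      ;;
    volume)
      printf '%s\n' "-v $HOME/.aws:/root/.aws:ro"
      ;;
    *)
      echo "unknown mode: $1" >&2
      return 1
      ;;
  esac
}

# Example (image name is a placeholder):
# docker run --rm -it $(aws_credential_args env) <image>
```

Passing `-e NAME` without a value forwards the variable from the host environment, which keeps secrets out of the command line and shell history.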
Step D: Review configuration files
Verify that your remote.tf and terraform.tfvars files from the initial deployment are available for each step directory. Each step directory should contain:
- remote.tf: Configures remote Terraform state storage
- terraform.tfvars: Deployment variables specific to your environment
- run_terraform.sh: Wrapper script for launching the Terraform container
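The check for these three files can be scripted. The helper below is a minimal sketch; check_step_dir is a hypothetical name, and the loop in the usage comment assumes the step directories live under installation_steps/aws/ in the bundle.

```bash
# Confirm that a step directory holds the three files listed above.
check_step_dir() (
  dir="${1:?usage: check_step_dir <step-directory>}"
  status=0
  for f in remote.tf terraform.tfvars run_terraform.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  exit "$status"
)

# Example (directory layout is an assumption):
# for step in "$BUNDLE_DIR"/installation_steps/aws/*/; do
#   check_step_dir "${step%/}"
# done
```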
If you do not have these files, obtain them from the initial deployment. Your terraform.tfvars should not need changes unless you are intentionally modifying your environment configuration.
Update
Step 1: Update the base infrastructure
Update EKS, VPC, RDS, and other supporting components using the poolside-self-managed-infra-phase-1 container.
- Run the 1_infra_phase_1 container:
  Using the wrapper script:
  Or run manually:
- Inside the container, run Terraform commands:
- Exit the container.
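A typical Terraform update run follows an init, plan, apply sequence. The sketch below reflects that general workflow and is an assumption, not the exact commands Poolside prescribes for this container; terraform_update is a hypothetical helper and the plan file name tfplan is arbitrary.

```bash
# Run init, plan, and apply in order, stopping at the first failure.
# TERRAFORM can be overridden (for example TERRAFORM=echo) to preview the calls.
terraform_update() (
  tf="${TERRAFORM:-terraform}"
  "$tf" init &&
    "$tf" plan -out=tfplan &&
    "$tf" apply tfplan
)

# Example, run inside the container from the directory holding remote.tf:
# terraform_update
```

Saving the plan to a file and applying that file means the changes you reviewed are the changes that get applied, which is useful when updating shared infrastructure such as the EKS cluster or RDS database.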
If you created a bastion host during the initial deployment, you will need to access it in later steps.
Step 2: Update node groups and add-ons
Update EKS node groups (CPU and GPU) and cluster add-ons using the poolside-self-managed-infra-phase-2 container.
- Run the 2_infra_phase_2 container:
  Using the wrapper script:
  Or run manually:
- Inside the container, run Terraform commands:
- Exit the container.
Step 2.5: Upload containers and model checkpoints
Upload updated Poolside and third-party images to your registries and update model checkpoints in S3. You can run this in parallel with Step 2: Update node groups and add-ons. Poolside provides a container image in the bundle at containers/poolside-container-uploader-full---<tag>. It includes the scripts and images required for fully air-gapped deployments.
- Run the container uploader:
  Using the wrapper script:
  Or run manually:
- Inside the container, run upload-containers.sh to populate ECR. The script reads registry targets from AWS Systems Manager Parameter Store (SSM).
- If new model checkpoints are available, copy them to the S3 bucket:
  Poolside will provide the checkpoint details before you start this process.
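For the checkpoint copy, one common approach is aws s3 sync, which uploads only files that differ from what is already in the bucket. The sketch below assumes the checkpoints are available in a local directory; sync_checkpoints is a hypothetical helper, and the source path and bucket URI are placeholders to be replaced with the details Poolside provides.

```bash
# Sync a local checkpoint directory to an S3 location.
# AWS can be overridden (for example AWS=echo) to preview the call.
sync_checkpoints() (
  src="${1:?usage: sync_checkpoints <source-dir> <s3-uri>}"
  dest="${2:?usage: sync_checkpoints <source-dir> <s3-uri>}"
  case "$dest" in
    s3://*) ;;
    *) echo "destination must be an s3:// URI: $dest" >&2; exit 1 ;;
  esac
  "${AWS:-aws}" s3 sync "$src" "$dest"
)

# Example (both arguments are placeholders):
# sync_checkpoints /path/to/checkpoints s3://<checkpoint-bucket>/<prefix>/
```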
Step 3: Deploy the application update
Deploy the updated Poolside workloads and ingress using the poolside-self-managed-deployment container.
If you are using a bastion host, connect to it before running this step:
Ensure that you have the remote.tf and terraform.tfvars files available. You can copy the run_terraform.sh script from installation_steps/aws/3_deployment to the bastion host.
- Run the deployment container:
  Using the wrapper script:
  Or run manually:
- Inside the container, run Terraform commands:
  Example output:
Verification
Verify that your updated Poolside deployment is running successfully.
You can also view this information in the AWS Management Console using the EKS Kubernetes resources viewing guide.
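One way to check the workloads from the command line is to list the pods and count any that are not in a healthy state. The sketch below assumes kubectl is configured for the cluster; count_unhealthy_pods is a hypothetical helper and the namespace in the usage comment is a placeholder.

```bash
# Read `kubectl get pods --no-headers` output on stdin and print how many
# pods have a STATUS (third column) other than Running or Completed.
count_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Example (namespace is a placeholder):
# kubectl get pods -n <namespace> --no-headers | count_unhealthy_pods
```

The column position assumes a single-namespace listing; with --all-namespaces, kubectl adds a leading NAMESPACE column and the status moves to the fourth field.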