This guide assumes that you deployed Poolside using the instructions in the Install on-premises guide.
Overview
This guide describes how to upgrade an existing on-premises Poolside deployment running on RKE2. The upgrade process includes the following phases:
- Prepare the new bundle: Transfer the new bundle to the server and migrate state from the previous deployment.
- Upgrade RKE2 infrastructure: Upgrade the RKE2 cluster and refresh credentials.
- Upgrade infrastructure services: Upgrade container images, supporting services (database, storage, identity, cert-manager), and certificates.
- Deploy the application upgrade: Deploy the upgraded Poolside application.
- Upload model checkpoints: Upload any new or changed model checkpoints.
- Deploy Poolside models: Deploy the uploaded models into the upgraded cluster.
Deployment bundle
The new bundle follows the same structure as the initial install. For more information, see Install on-premises.
Prerequisites
- A working Poolside deployment completed with the Install on-premises guide.
- The new deployment bundle provided by Poolside.
- SSH access to the deployment server.
- Tools installed on the server (same as initial install):
- terraform v1.8.5
- kubectl (matching the RKE2 cluster version or later)
- jq
- yq v4.49.2+
- skopeo v1.18+
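Before continuing, a quick preflight loop can confirm that each required tool is on the PATH (this sketch only reports what it finds; check the versions against the list above yourself):

```shell
# Preflight: report which required CLI tools are installed.
for tool in terraform kubectl jq yq skopeo; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "MISSING: $tool" >&2
  fi
done
```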
Downtime
The upgrade process requires a maintenance window. Poolside services are unavailable during steps 3 and 4 while the process redeploys infrastructure services and application workloads. Plan for approximately 1-2 hours, depending on the number of changed container images. Notify end users before starting the upgrade.
Back up the database
Before starting the upgrade, back up the PostgreSQL database. The database runs as a pod in the poolside-services namespace.
The -Fc flag produces a compressed custom-format archive, which supports selective and parallel restore using pg_restore.
Verify that the backup file is not empty:
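A sketch of the backup and the non-empty check; the pod label (app=postgresql), superuser (postgres), and database name (poolside) are assumptions to adjust for your deployment:

```shell
NS=poolside-services
POD=$(kubectl -n "$NS" get pods -l app=postgresql \
  -o jsonpath='{.items[0].metadata.name}')

# -Fc writes a compressed custom-format archive for pg_restore.
kubectl -n "$NS" exec "$POD" -- \
  pg_dump -U postgres -Fc poolside > /tmp/poolside-db-backup.dump

# test -s fails if the file is missing or empty.
test -s /tmp/poolside-db-backup.dump && ls -lh /tmp/poolside-db-backup.dump
```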
Preparation
Step A: Download and extract the new bundle
Poolside provides a download link for the new bundle. Download the bundle on a system with network access, then transfer it to /tmp on the deployment server before extracting it.
Extract the bundle:
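For example (the archive name and format are placeholders; use the actual filename Poolside provided):

```shell
cd /tmp
tar -xzf poolside-bundle.tar.gz
```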
Step B: Air-gapped upgrade setup (optional)
This configuration is required for air-gapped installations. It uses a bundled Terraform CLI configuration file instead of a standard .terraformrc file so that Terraform behaves consistently when commands run as both root and a non-root user. In internet-accessible environments, you can skip this step. The bundle ships its Terraform providers pre-cached in a terraform.d directory.
1. Locate poolside-terraform.tfrc in the root of the unpacked bundle.
2. Replace the $POOLSIDE_INSTALL_DIR placeholder with the fully qualified path to the new bundle’s root directory.
3. For all Terraform commands in this guide, prefix the command with the Terraform CLI configuration path:
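For example, using Terraform's standard TF_CLI_CONFIG_FILE environment variable (the bundle path here is a placeholder; pass the variable explicitly on each command so it also survives sudo):

```shell
TF_CLI_CONFIG_FILE=/path/to/new-bundle/poolside-terraform.tfrc terraform init
sudo TF_CLI_CONFIG_FILE=/path/to/new-bundle/poolside-terraform.tfrc terraform apply
```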
You can configure Terraform using alternative methods, such as a .terraformrc file, as described in the official HashiCorp documentation. Because the installation process runs as both root and a local user, you must ensure that both accounts are configured to reference the cached providers correctly.
Step C: Migrate configuration from the previous deployment
Poolside preserves deployment credentials in dot files in poolside-install. If this is your first upgrade to a bundle that uses preserved credential files, some files may be missing and you must create them. On later upgrades, these files should already exist as long as you continue to preserve the poolside-install directory.
1. Copy the poolside-install directory (contains the kubeconfig, certificates, and service configuration shared across phases):
2. Review configuration changes before copying your terraform.tfvars files. Compare them against the new bundle’s defaults to identify any new or changed variables. Poolside notes required changes in the release notes.
3. Copy your custom terraform.tfvars files if you modified them from the defaults during the initial install. If the diff in step 2 showed new variables in the new bundle, add them to your copied terraform.tfvars files now.
4. Copy your Transport Layer Security (TLS) certificates into the new bundle’s phase 3 certs directory. This command assumes that custom TLS certificates are stored in $OLD_BUNDLE/03-infra-services/certs; if certificates are stored at a different location outside the bundle directory, this step is not required.
5. Retrieve dotfiles from the previous deployment. Check whether .dbpassword already exists:
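A quick existence check, assuming the new bundle root is in $NEW_BUNDLE (the default path is a placeholder):

```shell
NEW_BUNDLE=${NEW_BUNDLE:-/tmp/poolside-bundle}
if [ -f "$NEW_BUNDLE/poolside-install/.dbpassword" ]; then
  echo ".dbpassword present"
else
  echo ".dbpassword missing -- copy it from the old bundle or create it"
fi
```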
Destroy previous Terraform
This step destroys the Terraform state, but Poolside retains the database and your existing model checkpoints. Change into the old bundle root before running the following steps:
Step 1: Delete existing models
In the Poolside Console, navigate to the Models page and delete all existing models. Models are stateless, and deleting them does not affect your data. This step ensures there are no stale model references in the Poolside Console during the upgrade process.
Step 2: Phase 5
Destroy the 05-poolside-model-upload module.
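Each destroy step follows the same pattern, shown here for phase 5 as a sketch; use sudo where the corresponding install phase required it, and in air-gapped environments add the TF_CLI_CONFIG_FILE prefix from Step B:

```shell
# Run from the old bundle root; repeat with the matching
# module directory for phases 4 through 1.
cd 05-poolside-model-upload
terraform init
terraform destroy
cd ..
```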
Step 3: Phase 4
Destroy the 04-poolside-deployment module.
Step 4: Phase 3
Destroy the 03-infra-services module.
Step 5: Phase 2
Destroy the 02-rke2-credentials module.
Step 6: Phase 1
Destroy the 01-infra-rke2 module.
Validate
Confirm that RKE2 is no longer running or present as a service:
Upgrade
Step 1: Upgrade RKE2
Upgrade the RKE2 cluster using the 01-infra-rke2 module.
Using sudo, run the following commands from the 01-infra-rke2 directory.
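A sketch of this phase's commands, assuming the standard init/apply flow (in air-gapped environments, prefix each terraform command with the TF_CLI_CONFIG_FILE setting from Step B):

```shell
cd 01-infra-rke2
sudo terraform init
sudo terraform apply
```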
Step 2: Refresh cluster credentials
Re-extract cluster credentials to ensure they are current. The 02-rke2-credentials module connects to the RKE2 cluster and regenerates the configuration files required by later stages.
Using sudo, run the following commands from the 02-rke2-credentials directory.
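As with the previous phase, this is a sketch of the standard flow (add the TF_CLI_CONFIG_FILE prefix from Step B in air-gapped environments):

```shell
cd 02-rke2-credentials
sudo terraform init
sudo terraform apply
```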
After the module completes, verify that the cluster nodes report a Ready status:
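For example:

```shell
# Every node should show STATUS "Ready".
kubectl get nodes
```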
If you receive the following errors, your kubeconfig is likely out of date and you need to refresh it.
Copy /etc/rancher/rke2/rke2.yaml to ~/.kube/config and retry:
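A minimal sketch of the refresh (rke2.yaml is the standard RKE2 kubeconfig location; the ownership fix assumes you run kubectl as a non-root user):

```shell
mkdir -p ~/.kube
sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
sudo chown "$(id -u):$(id -g)" ~/.kube/config
chmod 600 ~/.kube/config
```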
Step 3: Upgrade infrastructure services
Upgrade supporting services (container registry, database, storage, identity provider, cert-manager) using the 03-infra-services module.
Using sudo, run the following commands from the 03-infra-services directory.
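Again as a sketch of the standard flow (with the TF_CLI_CONFIG_FILE prefix from Step B in air-gapped environments):

```shell
cd 03-infra-services
sudo terraform init
sudo terraform apply
```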
This step reloads all container images into the local registry from container-images/. This can take significant time, as the full image set is synced on each upgrade.
Step 4: Deploy the application upgrade
Deploy the upgraded Poolside application workloads using the 04-poolside-deployment module.
Run the following commands from the 04-poolside-deployment directory.
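A sketch of the standard flow for this phase, run without sudo:

```shell
cd 04-poolside-deployment
terraform init
terraform apply
```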
Note: this step does not require sudo because the RKE2 cluster is accessible to the current user.
Air-gapped environment:
Step 5: Upload model checkpoints
If the upgrade includes new or changed model checkpoints, copy them to the local volume and run the upload job.
1. Copy new or changed model files to the local host directory /opt/poolside/poolside-model-uploads. If you customized host volume paths during the initial install (01-infra-rke2), use the corresponding directory instead.
2. Run the following commands from the 05-poolside-model-upload directory to trigger the model upload job.
3. To upload additional or changed models later, repeat these steps. Uploads are additive and do not remove existing models from the deployment.
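The first two steps above can be sketched as follows; the source path is a placeholder, and driving the upload job through the module's standard init/apply flow is an assumption:

```shell
# Copy checkpoints to the default host volume path from 01-infra-rke2.
sudo cp -r /path/to/new-checkpoints/. /opt/poolside/poolside-model-uploads/

# Trigger the upload job.
cd 05-poolside-model-upload
terraform init
terraform apply
```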
Step 6: Deploy Poolside models
The final phase, 06-poolside-inference, deploys the Poolside models from S3-compatible storage into the RKE2 cluster.
The directory contains a Terraform module, which wraps the inference Helm chart used to deploy models across all modalities.
1. Verify the model paths for all S3-uploaded Poolside models. The default bucket path is s3://poolside-models/, followed by the model folder. Use these paths for model_s3_uri in terraform.tfvars. Example output:
From this example, the model_s3_uri values would be:
   - s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026
   - s3://poolside-models/malibu_v2_agent_int4_04_2026
   - s3://poolside-models/point_v2_04_2026
2. In the 06-poolside-inference directory, create or update the terraform.tfvars file:
   - deployment_name should match the value used for each previous phase.
   - model_s3_uris should include the S3 URIs for each model listed in the previous step.
   - models_config should include the configuration for each model, including the number of replicas and GPUs for your deployment’s hardware configuration.
3. Run the following commands to deploy the models.
4. After deploying the models, run the following command to retrieve the name and url for each model.
5. To deploy additional models, or to change the GPU or replica counts, modify terraform.tfvars and run terraform apply again.
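Putting steps 2-4 together as one sketch: the deployment_name value and the single model URI are illustrative placeholders, the exact models_config structure comes from the bundled module's variables, and terraform output is assumed to expose each model's name and url:

```shell
cd 06-poolside-inference

# Illustrative terraform.tfvars; adjust values to your deployment.
cat > terraform.tfvars <<'EOF'
deployment_name = "poolside"
model_s3_uris = [
  "s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026",
]
# models_config: replicas and GPU counts per model -- see the bundled
# module's variable definitions for the exact structure.
EOF

terraform init
terraform apply

# Retrieve the name and url for each deployed model.
terraform output
```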
Verification
Verify that the upgraded deployment is running successfully. Check all pods in the application namespace, then open the Poolside Console at https://<poolside-hostname> (for example, https://poolside.poolside.local or your configured hostname).
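For example, assuming the application runs in the same poolside-services namespace used for the database backup:

```shell
# All pods should be Running (or Completed for one-shot jobs).
kubectl get pods -n poolside-services
```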
Link the new models in the Poolside Console:
- In the Poolside Console, navigate to Agents > Models.
- Click New Model.
- Enter the Model Name and Base URL retrieved from Step 6: Deploy Poolside models, and optionally a Description or Name override.
- Click Connect to Model.