This guide assumes that you deployed Poolside using the instructions in the Install on-premises guide.

Overview

This guide describes how to upgrade an existing on-premises Poolside deployment running on RKE2. The upgrade process includes the following phases:
  1. Back up the database: Back up the PostgreSQL database before starting the upgrade.
  2. Prepare the new bundle: Transfer the new bundle to the server and migrate state from the previous deployment.
  3. Upgrade RKE2 infrastructure: Upgrade the RKE2 cluster and refresh credentials.
  4. Upgrade infrastructure services: Upgrade container images, supporting services (database, storage, identity, cert-manager), and certificates.
  5. Deploy the application upgrade: Deploy the upgraded Poolside application.
  6. Upload model checkpoints: Upload any new or changed model checkpoints.
  7. Deploy Poolside models: Deploy the uploaded models into the upgraded cluster.
  8. Restore the database: Restore the Poolside database from the backup.

Deployment bundle

The new bundle follows the same structure as the initial install. For more information, see Install on-premises.

Prerequisites

  • A working Poolside deployment completed with the Install on-premises guide.
  • The new deployment bundle provided by Poolside.
  • SSH access to the deployment server.
  • Tools installed on the server (same as initial install):
    • terraform v1.8.5
    • kubectl (matching the RKE2 cluster version or later)
    • jq, yq v4.49.2+
    • skopeo v1.18+
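
The minimum versions above can be checked with a small helper. This is a minimal sketch, not part of the bundle: version_ge is a hypothetical function name, and it assumes sort -V (version sort) from GNU coreutils is available on the server.

```shell
# version_ge INSTALLED REQUIRED
# Succeeds when INSTALLED is the same as or newer than REQUIRED.
# Hypothetical helper; relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  # The smaller of the two versions sorts first; the check passes
  # when that smaller version is the required one.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Pass it the version string each tool reports, for example version_ge 1.18.2 1.18 for skopeo. How each tool prints its version differs, so extracting the string is left to you.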

Downtime

The upgrade process requires a maintenance window. Poolside services are unavailable during steps 3 and 4 while the process redeploys infrastructure services and application workloads. Plan for approximately 1-2 hours depending on the number of changed container images. Notify end users before starting the upgrade.

Preparation

Step A: Back up the database

Before starting the upgrade, back up the PostgreSQL database using the bundled db-snapshot.sh script. See Create a backup.

Step B: Obtain the new bundle

Obtain the new bundle from Poolside and extract it on the deployment server, using the same approach you used during the original install (see Install on-premises). Set shell variables for the old and new bundle root directories:
export OLD_BUNDLE="<path-to-previous-bundle>"
export NEW_BUNDLE="<path-to-new-bundle>"
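
Before continuing, it may help to confirm that both variables point at complete bundles. The sketch below assumes the phase directory names used later in this guide. check_bundle is a hypothetical helper, not a script shipped in the bundle.

```shell
# check_bundle ROOT DIR...
# Prints each expected subdirectory that is missing under ROOT and
# returns non-zero if any is missing. Hypothetical helper.
check_bundle() {
  root=$1
  shift
  missing=0
  for dir in "$@"; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_bundle "$NEW_BUNDLE" 01-infra-rke2 02-rke2-credentials 03-infra-services 04-poolside-deployment 05-poolside-model-upload 06-poolside-inference reports any phase directory that did not extract as expected.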

Step C: Air-gapped upgrade setup (optional)

This configuration is required for air-gapped installations. It uses a bundled Terraform CLI configuration file instead of a standard .terraformrc file so Terraform behaves consistently when commands run as both root and a non-root user. In internet-accessible environments, you can skip this step.
To use the local Terraform provider cache included in the bundle, configure Terraform to load providers from the bundled terraform.d directory.
  1. Locate poolside-terraform.tfrc in the root of the unpacked bundle.
  2. Replace the $POOLSIDE_INSTALL_DIR placeholder with the fully qualified path to the new bundle’s root directory.
  3. For all Terraform commands in this guide, prefix the command with the Terraform CLI configuration path. Because the upgrade runs against the new bundle, point at $NEW_BUNDLE:
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform <command>
    
Setting this variable ensures that both root and non-root users reference the same cached Terraform providers.
You can configure Terraform using alternative methods, such as a .terraformrc file, as described in the official HashiCorp documentation. Because the installation process runs as both root and a local user, you must ensure that both accounts are configured to reference the cached providers correctly.
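
The placeholder replacement in step 2 can be scripted. This is a sketch under two assumptions: the placeholder appears in the file literally as $POOLSIDE_INSTALL_DIR, and the bundle path contains no | or & characters. render_tfrc is a hypothetical helper name.

```shell
# render_tfrc TFRC_FILE BUNDLE_ROOT
# Replaces every literal $POOLSIDE_INSTALL_DIR in TFRC_FILE with BUNDLE_ROOT,
# keeping a .bak copy of the original file. Hypothetical helper.
render_tfrc() {
  tfrc=$1
  root=$2
  cp "$tfrc" "$tfrc.bak"
  # '|' is the sed delimiter so slashes in the path need no escaping.
  # '\$' matches a literal dollar sign rather than end-of-line.
  sed 's|\$POOLSIDE_INSTALL_DIR|'"$root"'|g' "$tfrc.bak" > "$tfrc"
}
```

To pass a fully qualified path, resolve it first, for example render_tfrc "$NEW_BUNDLE/poolside-terraform.tfrc" "$(cd "$NEW_BUNDLE" && pwd)".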

Step D: Migrate configuration from the previous deployment

Poolside preserves deployment credentials in dotfiles in the poolside-install directory. If this is your first upgrade to a bundle that uses preserved credential files, some files may be missing and you must create them.
On later upgrades, these files should already exist as long as you continue to preserve the poolside-install directory.
Do not skip this step.
If a file is missing, the upgrade will generate a new value, while the existing services and encrypted data still depend on the old one. This can make the deployment inaccessible or leave stored data unreadable.
  1. Retrieve or recreate dotfiles in the previous deployment’s poolside-install directory. These files must exist in $OLD_BUNDLE/poolside-install before the directory copy in the next step, so the new bundle inherits the same credentials.
    cd "$OLD_BUNDLE/03-infra-services"
    terraform output identity_config > identity_config.json
    
    # Retrieve the encryption key from the old bundle's 03-infra-services/locals.tf (poolside_deployment_encryption_key_static_value), unless it was manually changed during initial installation.
    echo "<encryption-key-secret>" > "$OLD_BUNDLE/poolside-install/.poolside_encryption_key"
    echo "<identity-client-secret>" > "$OLD_BUNDLE/poolside-install/.identity_client_secret"
    echo "<identity-admin-password>" > "$OLD_BUNDLE/poolside-install/.identity_admin_password"
    
    Check whether .dbpassword already exists:
    ls "$OLD_BUNDLE/poolside-install/.dbpassword"
    
    If the file doesn’t exist, extract the password from sensitive Terraform output and store it:
    terraform output -json database_config | jq -r '.password' > "$OLD_BUNDLE/poolside-install/.dbpassword"
    
  2. Copy the poolside-install directory (contains kubeconfig, BYO TLS certificates under byo-certs/, the dotfiles created in the previous step, and other persistent installation state shared across phases):
    cp -aT "$OLD_BUNDLE/poolside-install" "$NEW_BUNDLE/poolside-install"
    
  3. Review configuration changes before copying your terraform.tfvars files. Compare them against the new bundle’s defaults to identify any new or changed variables. Poolside notes required changes in the release notes.
    for phase in 01-infra-rke2 03-infra-services 04-poolside-deployment; do
      echo "$phase"
      diff "$OLD_BUNDLE/$phase/terraform.tfvars" "$NEW_BUNDLE/$phase/terraform.tfvars" || true
      echo
    done
    
  4. Apply your customizations to the new bundle’s terraform.tfvars. Use the diff from step 3 as a migration checklist. Do not replace the new bundle’s terraform.tfvars with the old one, because the new file may ship with new variables or updated defaults that the upgrade depends on. For each phase, open $NEW_BUNDLE/$phase/terraform.tfvars and copy across only the values you customized in the old bundle (for example, domain names, Classless Inter-Domain Routing (CIDR) blocks, sizing). Leave any net-new variables at their new defaults unless the release notes specify otherwise.
    If you used bring-your-own (BYO) TLS certificates during the original install, the custom_certificates and custom_ca_trust_chain paths in your 03-infra-services/terraform.tfvars reference files inside <path-to-bundle>/poolside-install/byo-certs/. Update those paths to point at $NEW_BUNDLE/poolside-install/byo-certs/ so they resolve against the new bundle.
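
After editing the new terraform.tfvars files, one way to catch leftover references to the previous bundle is to search them for the old bundle path. This is a minimal sketch that assumes the paths were written using the same absolute path stored in $OLD_BUNDLE. find_stale_paths is a hypothetical helper.

```shell
# find_stale_paths OLD_ROOT FILE...
# Prints file name, line number, and content of every line that still
# mentions OLD_ROOT. Exit status follows grep: 0 when stale references
# were found, 1 when none were found. Hypothetical helper.
find_stale_paths() {
  old_root=$1
  shift
  # -F treats OLD_ROOT as a fixed string, not a regular expression.
  grep -HnF -- "$old_root" "$@"
}
```

For example, find_stale_paths "$OLD_BUNDLE" "$NEW_BUNDLE"/0*/terraform.tfvars lists any custom_certificates or custom_ca_trust_chain entries that still resolve against the old bundle.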

Destroy previous Terraform

This step is required to upgrade RKE2. Destroying the Terraform phases permanently removes the identity and database credentials from Terraform state. Before continuing, confirm that you have saved them to the dotfiles in poolside-install (Step D). Otherwise, the upgraded deployment will be unable to access your existing database and identity provider.
Destroying Terraform state does not delete your Poolside application data: the PostgreSQL database and uploaded model checkpoints persist on the cluster, and the new bundle reuses them.
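
The confirmation can be sketched as a check that each dotfile named in Step D exists and is not empty. check_dotfiles is a hypothetical helper. It only verifies that the files are present, not that their contents are correct.

```shell
# check_dotfiles INSTALL_DIR
# Reports each credential dotfile from Step D that is missing or empty
# and returns non-zero if any is. Hypothetical helper.
check_dotfiles() {
  install_dir=$1
  rc=0
  for name in .poolside_encryption_key .identity_client_secret \
              .identity_admin_password .dbpassword; do
    # -s is true only when the file exists and has a size greater than zero.
    if [ ! -s "$install_dir/$name" ]; then
      echo "missing or empty: $install_dir/$name" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_dotfiles "$NEW_BUNDLE/poolside-install" before the first terraform destroy gives a quick signal that the directory copy carried all four files across.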

Step 1: Delete existing models

In the Poolside Console, navigate to Agents > Models and delete all existing models. Models are stateless, and deleting them does not affect your data. This step ensures there are no stale model references in the Poolside Console during the upgrade process.

Step 2: Destroy Phase 5

Destroy the 05-poolside-model-upload module.
cd $OLD_BUNDLE/05-poolside-model-upload
terraform destroy

Step 3: Destroy Phase 4

Destroy the 04-poolside-deployment module.
cd $OLD_BUNDLE/04-poolside-deployment
terraform destroy

Step 4: Destroy Phase 3

Destroy the 03-infra-services module.
cd $OLD_BUNDLE/03-infra-services
sudo /usr/local/bin/terraform destroy

Step 5: Destroy Phase 2

Destroy the 02-rke2-credentials module.
cd $OLD_BUNDLE/02-rke2-credentials
sudo /usr/local/bin/terraform destroy

Step 6: Destroy Phase 1

Destroy the 01-infra-rke2 module.
cd $OLD_BUNDLE/01-infra-rke2
sudo /usr/local/bin/terraform destroy

Validate

Confirm that RKE2 is no longer running or present as a service:
ps -ef | grep '[r]ke2'
sudo systemctl status rke2-server

Upgrade

Step 1: Upgrade RKE2

Upgrade the RKE2 cluster using the 01-infra-rke2 module. Using sudo, run the following commands from the 01-infra-rke2 directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set ownership and permissions required by later stages.
cd $NEW_BUNDLE/01-infra-rke2
Air-gapped environment:
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform apply
Online environment:
sudo /usr/local/bin/terraform init
sudo /usr/local/bin/terraform plan
sudo /usr/local/bin/terraform apply

Step 2: Refresh cluster credentials

Re-extract cluster credentials to ensure they are current. The 02-rke2-credentials module connects to the RKE2 cluster and regenerates the configuration files required by later stages. Using sudo, run the following commands from the 02-rke2-credentials directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set the permissions required for RKE2 cluster access in later stages.
cd $NEW_BUNDLE/02-rke2-credentials
Air-gapped environment:
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform apply
Online environment:
sudo /usr/local/bin/terraform init
sudo /usr/local/bin/terraform plan
sudo /usr/local/bin/terraform apply
Verify cluster access:
kubectl get nodes

The node should show Ready status. If you receive either of the following errors, your kubeconfig is likely out of date and you need to refresh it. Copy /etc/rancher/rke2/rke2.yaml to ~/.kube/config and retry:
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority
# or 
E0427 13:03:37.926499  176601 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
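
The kubeconfig refresh described above can be sketched as a small function that keeps a backup of the existing file and restricts the new copy to its owner. refresh_kubeconfig is a hypothetical helper, and the backup and mode 600 are choices made for this sketch, not requirements stated in this guide.

```shell
# refresh_kubeconfig SOURCE DEST
# Copies SOURCE to DEST, creating the parent directory if needed and
# limiting the copy to its owner (mode 600). An existing DEST is saved
# as DEST.bak first. Hypothetical helper.
refresh_kubeconfig() {
  src=$1
  dest=$2
  mkdir -p "$(dirname "$dest")"
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak"
  fi
  cp "$src" "$dest"
  chmod 600 "$dest"
}
```

For example, refresh_kubeconfig /etc/rancher/rke2/rke2.yaml ~/.kube/config. If the source file is readable only by root on your server, the copy needs elevated privileges and the resulting file must be owned by your user again before kubectl can read it.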

Step 3: Upgrade infrastructure services

Upgrade supporting services (container registry, database, storage, identity provider, cert-manager) using the 03-infra-services module. Using sudo, run the following commands from the 03-infra-services directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set the permissions required for RKE2 cluster access in later stages.
This step reloads all container images into the local registry from container-images/. This can take significant time as the full image set is synced on each upgrade.
cd $NEW_BUNDLE/03-infra-services
Air-gapped environment:
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc /usr/local/bin/terraform apply
Online environment:
sudo /usr/local/bin/terraform init
sudo /usr/local/bin/terraform plan
sudo /usr/local/bin/terraform apply
After apply completes, re-export cluster certificates if the output indicates changes:
sudo bash $NEW_BUNDLE/poolside-install/export_cluster_certificates.sh

Step 4: Deploy the application upgrade

Deploy the upgraded Poolside application workloads using the 04-poolside-deployment module. Run the following commands from the 04-poolside-deployment directory. This step does not require sudo because the RKE2 cluster is accessible to the current user.
cd $NEW_BUNDLE/04-poolside-deployment
Air-gapped environment:
TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform init
TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform plan
TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform apply
Online environment:
terraform init
terraform plan
terraform apply
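
For the phases that run without sudo, the air-gapped and online variants differ only in the TF_CLI_CONFIG_FILE prefix, so a thin wrapper can select between them. This is a sketch. The tf function and the AIRGAPPED and TERRAFORM_BIN variables are hypothetical names introduced here, not part of the bundle.

```shell
# tf ARGS...
# Runs terraform with the given arguments. When AIRGAPPED is set to a
# non-empty value, the bundled CLI configuration under $NEW_BUNDLE is used.
# TERRAFORM_BIN (hypothetical override) selects the binary and defaults
# to "terraform".
tf() {
  if [ -n "${AIRGAPPED:-}" ]; then
    TF_CLI_CONFIG_FILE="$NEW_BUNDLE/poolside-terraform.tfrc" \
      "${TERRAFORM_BIN:-terraform}" "$@"
  else
    "${TERRAFORM_BIN:-terraform}" "$@"
  fi
}
```

With AIRGAPPED=1 set in the shell, tf init, tf plan, and tf apply behave like the air-gapped commands above. The wrapper does not cover the phases that run under sudo, where the variable must be passed on the sudo command line as shown in those steps.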

Step 5: Upload model checkpoints

If the upgrade includes new or changed model checkpoints, copy them to the local volume and run the upload job.
  1. Copy new or changed model files to the local host directory /opt/poolside/poolside-model-uploads. If you customized host volume paths during initial install (01-infra-rke2), use the corresponding directory instead.
  2. Run the following commands from the 05-poolside-model-upload directory to trigger the model upload job.
    cd $NEW_BUNDLE/05-poolside-model-upload
    
    Air-gapped environment:
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform init
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform plan
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform apply
    
    Online environment:
    terraform init
    terraform plan
    terraform apply
    
  3. To upload additional or changed models later, repeat these steps. Uploads are additive and do not remove existing models from the deployment.

Step 6: Deploy Poolside models

The final phase, 06-poolside-inference, deploys the Poolside models from S3-compatible storage into the RKE2 cluster. The directory contains a Terraform module that wraps the inference Helm chart used to deploy models across all modalities.
  1. Verify the model paths for all uploaded Poolside models in S3-compatible storage. The default bucket path is s3://poolside-models/, followed by the model folder. Use these paths for model_s3_uris in terraform.tfvars.
    ls /opt/poolside/poolside-model-uploads
    
    Example output:
    laguna_xs_fp8_fp8kv_04_2026  malibu_v2_agent_int4_04_2026  point_v2_04_2026
    
    From this example, the model_s3_uris values would be:
    • s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026
    • s3://poolside-models/malibu_v2_agent_int4_04_2026
    • s3://poolside-models/point_v2_04_2026
  2. In the 06-poolside-inference directory, create or update the terraform.tfvars file.
    • deployment_name should match the value used for each previous phase.
    • model_s3_uris should include the S3 URIs for each model listed in the previous step.
    • models_config should include the configuration for each model, including the number of replicas and GPUs for your deployment’s hardware configuration.
    deployment_name = "poolside-server"
    
    model_s3_uris = {
      agent_small = "s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026"
      agent = "s3://poolside-models/malibu_v2_agent_int4_04_2026"
      completion = "s3://poolside-models/malibu_v2_completion_int4_04_2026"
    }
    
    models_config = {
      agent_small = {
        replicas = 1
        gpus = 1
      }
      agent = {
        replicas = 1
        gpus = 2
      }
      completion = {
        replicas = 1
        gpus = 1
      }
    }
    
  3. Run the following commands from the 06-poolside-inference directory to deploy the models.
    cd $NEW_BUNDLE/06-poolside-inference
    
    Air-gapped environment:
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform init
    TF_CLI_CONFIG_FILE=$NEW_BUNDLE/poolside-terraform.tfrc terraform apply
    
    Online environment:
    terraform init
    terraform apply
    
  4. After deploying the models, run the following command to retrieve the name and URL for each model.
    terraform output model_urls
    
  5. To deploy additional models, or to change the GPU or replica counts, modify terraform.tfvars and run terraform apply again.
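
The mapping in step 1 from uploaded model folders to S3 URIs can be sketched as a short function. It assumes, as in the example output above, that each folder name under the upload directory matches the folder name in the bucket. list_model_uris is a hypothetical helper.

```shell
# list_model_uris UPLOAD_DIR [BUCKET]
# Prints one S3 URI per folder found directly under UPLOAD_DIR.
# BUCKET defaults to poolside-models, the default bucket named in step 1.
# Hypothetical helper.
list_model_uris() {
  upload_dir=$1
  bucket=${2:-poolside-models}
  for entry in "$upload_dir"/*/; do
    # Skip the unexpanded pattern when UPLOAD_DIR has no subdirectories.
    [ -d "$entry" ] || continue
    echo "s3://$bucket/$(basename "$entry")"
  done
}
```

For example, list_model_uris /opt/poolside/poolside-model-uploads prints candidate values for model_s3_uris. Which URI belongs to which key (agent_small, agent, completion) still has to be decided by hand.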

Step 7: Restore the database

Using the bundled db-snapshot.sh script, restore the Poolside database from the backup you created in Step A: Back up the database. See Restore from a backup.

Verification

Verify that the upgraded deployment is running successfully. Check all pods in the application namespace:
kubectl get pods -n poolside
Example output (pod suffixes vary):
NAME                                    READY   STATUS    RESTARTS   AGE
core-api-58fb46cb65-h7zrk               1/1     Running   0          6m
core-api-58fb46cb65-x8t58               1/1     Running   0          6m
core-api-58fb46cb65-z2t6b               1/1     Running   0          6m
core-api-models-reconciliation-loop-0   1/1     Running   0          6m
web-assistant-6f5977c886-hcttp          1/1     Running   0          6m
web-assistant-6f5977c886-jnrjt          1/1     Running   0          6m
web-assistant-6f5977c886-tb6rm          1/1     Running   0          6m
Check infrastructure services:
kubectl get pods -n poolside-services
Check model inference pods:
kubectl get pods -n poolside-models
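To spot unhealthy pods in long listings, the output of the commands above can be filtered. This is a minimal sketch that assumes the default column layout of kubectl get pods -n <namespace> (NAME, READY, STATUS, RESTARTS, AGE). not_ready_pods is a hypothetical helper.

```shell
# not_ready_pods
# Reads `kubectl get pods -n <namespace>` output on stdin and prints the
# name and status of every pod whose STATUS is neither Running nor
# Completed. Hypothetical helper.
not_ready_pods() {
  # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

For example, kubectl get pods -n poolside | not_ready_pods prints nothing when every pod in the namespace is Running or Completed.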
Verify the web interface is accessible at https://<poolside-hostname> (for example, https://poolside.poolside.local or your configured hostname).
Link the new models in the Poolside Console:
  • In the Poolside Console, navigate to Agents > Models.
  • Click New Model.
  • Enter the Model Name and Base URL retrieved from Step 6: Deploy Poolside models, and optionally a Description or Model Name Override.
  • Click Connect to Model.