This guide assumes that you deployed Poolside using the instructions in the Install on-premises guide.

Overview

This guide describes how to upgrade an existing on-premises Poolside deployment running on RKE2. The upgrade process includes the following phases:
  1. Prepare the new bundle: Transfer the new bundle to the server and migrate state from the previous deployment.
  2. Upgrade RKE2 infrastructure: Upgrade the RKE2 cluster and refresh credentials.
  3. Upgrade infrastructure services: Upgrade container images, supporting services (database, storage, identity, cert-manager), and certificates.
  4. Deploy the application upgrade: Deploy the upgraded Poolside application.
  5. Upload model checkpoints: Upload any new or changed model checkpoints.
  6. Deploy Poolside models: Deploy the uploaded models into the upgraded cluster.

Deployment bundle

The new bundle follows the same structure as the initial install. For more information, see Install on-premises.

Prerequisites

  • A working Poolside deployment completed with the Install on-premises guide.
  • The new deployment bundle provided by Poolside.
  • SSH access to the deployment server.
  • Tools installed on the server (same as initial install):
    • terraform v1.8.5
    • kubectl (matching the RKE2 cluster version or later)
    • jq, yq v4.49.2+
    • skopeo v1.18+

Downtime

The upgrade process requires a maintenance window. Poolside services are unavailable while infrastructure services and application workloads are redeployed (upgrade steps 3 and 4). Plan for approximately 1-2 hours, depending on the number of changed container images. Notify end users before starting the upgrade.

Back up the database

Before starting the upgrade, back up the PostgreSQL database. The database runs as a pod in the poolside-services namespace.
kubectl exec -n poolside-services postgres-0 -- \
  pg_dump -h postgres.poolside-services.svc.cluster.local -U poolside -d poolside -Fc \
  > poolside-db-backup-$(date +%Y%m%d-%H%M%S).dump
The -Fc flag produces a compressed custom-format archive, which supports selective and parallel restore using pg_restore. Verify that the backup file is not empty:
ls -lh poolside-db-backup-*.dump
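As an extra sanity check, list the archive's table of contents; a valid custom-format dump prints its object entries, while a truncated or corrupt file causes pg_restore to fail. This assumes only one backup file matches the glob:
pg_restore --list poolside-db-backup-*.dump | head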
Store the backup in a safe location outside the bundle directory.

Preparation

Step A: Download and extract the new bundle

Poolside provides a download link for the new bundle. Download the bundle on a system with network access:
curl -o /tmp/poolside-bundle-<version>.tar "<download-url-provided-by-poolside>"
Transfer the bundle to /tmp on the deployment server before extracting it. Extract the bundle:
cd /opt
sudo tar xf /tmp/poolside-bundle-<version>.tar
Set shell variables for the old and new bundle root directories:
export OLD_BUNDLE="<path-to-previous-bundle>"
export NEW_BUNDLE="<path-to-new-bundle>"

Step B: Air-gapped upgrade setup (optional)

This configuration is required for air-gapped installations. It uses a bundled Terraform CLI configuration file instead of a standard .terraformrc file so Terraform behaves consistently when commands run as both root and a non-root user. In internet-accessible environments, you can skip this step.
To use the local Terraform provider cache included in the bundle, configure Terraform to load providers from the bundled terraform.d directory.
  1. Locate poolside-terraform.tfrc in the root of the unpacked bundle.
  2. Replace the $POOLSIDE_INSTALL_DIR placeholder with the fully qualified path to the new bundle’s root directory.
  3. For all Terraform commands in this guide, prefix the command with the Terraform CLI configuration path:
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform <command>
    
Setting this variable ensures that both root and non-root users reference the same cached Terraform providers.
You can configure Terraform using alternative methods, such as a .terraformrc file, as described in the official HashiCorp documentation. Because the installation process runs as both root and a local user, you must ensure that both accounts are configured to reference the cached providers correctly.
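A minimal sketch of the placeholder substitution from step 2, assuming GNU sed and the NEW_BUNDLE variable exported in Step A:
# Replace the $POOLSIDE_INSTALL_DIR placeholder with the new bundle path, in place.
sed -i "s|\$POOLSIDE_INSTALL_DIR|$NEW_BUNDLE|g" "$NEW_BUNDLE/poolside-terraform.tfrc"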

Step C: Migrate configuration from the previous deployment

Poolside preserves deployment credentials in dotfiles in the poolside-install directory. If this is your first upgrade to a bundle that uses preserved credential files, some files may be missing and you must create them.
On later upgrades, these files should already exist as long as you continue to preserve the poolside-install directory.
Do not skip this step.
If a file is missing, the upgrade generates a new value even though the existing services and encrypted data still depend on the old one. This can make the deployment inaccessible or leave stored data unreadable.
  1. Copy the poolside-install directory (contains kubeconfig, certificates, and service configuration shared across phases):
    cp -r "$OLD_BUNDLE/poolside-install" "$NEW_BUNDLE/poolside-install"
    
  2. Review configuration changes before copying your terraform.tfvars files. Compare them against the new bundle’s defaults to identify any new or changed variables. Poolside notes required changes in the release notes.
    for phase in 01-infra-rke2 03-infra-services 04-poolside-deployment; do
      echo "$phase"
      diff "$OLD_BUNDLE/$phase/terraform.tfvars" "$NEW_BUNDLE/$phase/terraform.tfvars" || true
      echo
    done
    
  3. Copy custom terraform.tfvars if you modified them from defaults during initial install:
    for phase in 01-infra-rke2 03-infra-services 04-poolside-deployment; do
      cp "$OLD_BUNDLE/$phase/terraform.tfvars" "$NEW_BUNDLE/$phase/terraform.tfvars"
      echo "Copied tfvars for $phase"
    done
    
    If the diff in step 2 showed new variables in the new bundle, add them to your copied terraform.tfvars files now.
  4. Copy your Transport Layer Security (TLS) certificates into the new bundle’s phase 3 certs directory:
    cp -r "$OLD_BUNDLE/03-infra-services/certs" "$NEW_BUNDLE/03-infra-services/certs"
    
    This command assumes that custom TLS certificates are stored in $OLD_BUNDLE/03-infra-services/certs. If certificates are stored at a different location outside the bundle directory, this step is not required.
  5. Retrieve dotfiles from the previous deployment.
    cd "$OLD_BUNDLE/03-infra-services"
    terraform output identity_config > identity_config.json
    
    # .poolside_encryption_key is a static value, unless manually changed during initial installation.
    echo "TbHrnO0WIOdAk50fVclUJqWRxbtFUM9k" > "$OLD_BUNDLE/poolside-install/.poolside_encryption_key"
    
    Read the client secret and admin password from identity_config.json, then write them to the corresponding dotfiles:
    echo "<identity-client-secret>" > "$OLD_BUNDLE/poolside-install/.identity_client_secret"
    echo "<identity-admin-password>" > "$OLD_BUNDLE/poolside-install/.identity_admin_password"
    
    Check whether .dbpassword already exists:
    ls "$OLD_BUNDLE/poolside-install/.dbpassword"
    
    If it does not, read the database password from the database_config output and write it to the file:
    terraform output database_config
    echo "<db-password>" > "$OLD_BUNDLE/poolside-install/.dbpassword"
    

Destroy previous Terraform

This step is required to upgrade RKE2.
Once the Terraform phases have been destroyed, there is no way to recover the identity or database credentials if they were not retrieved in the previous step.
This step destroys the Terraform state, but the database contents and your existing model checkpoints are retained. Each step below changes into the relevant module directory inside the old bundle. In an air-gapped environment, prefix each terraform command with TF_CLI_CONFIG_FILE pointing at the old bundle's poolside-terraform.tfrc, as described in Step B.

Step 1: Delete existing models

In the Poolside Console, navigate to the Models page and delete all existing models. Models are stateless, and deleting them does not affect your data. This step ensures there are no stale model references in the Poolside Console during the upgrade process.

Step 2: Phase 5

Destroy the 05-poolside-model-upload module.
cd $OLD_BUNDLE/05-poolside-model-upload
terraform destroy

Step 3: Phase 4

Destroy the 04-poolside-deployment module.
cd $OLD_BUNDLE/04-poolside-deployment
terraform destroy

Step 4: Phase 3

Destroy the 03-infra-services module.
cd $OLD_BUNDLE/03-infra-services
sudo /usr/local/bin/terraform destroy

Step 5: Phase 2

Destroy the 02-rke2-credentials module.
cd $OLD_BUNDLE/02-rke2-credentials
sudo /usr/local/bin/terraform destroy

Step 6: Phase 1

Destroy the 01-infra-rke2 module.
cd $OLD_BUNDLE/01-infra-rke2
sudo /usr/local/bin/terraform destroy

Validate

Confirm that RKE2 is no longer running or present as a service:
ps -ef | grep rke2
sudo systemctl status rke2-server
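If the uninstall completed, the process listing should show only your own grep, and systemctl should report that the unit no longer exists:
Unit rke2-server.service could not be found.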

Upgrade

Step 1: Upgrade RKE2

Upgrade the RKE2 cluster using the 01-infra-rke2 module. Using sudo, run the following commands from the 01-infra-rke2 directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set ownership and permissions required by later stages.
cd $NEW_BUNDLE/01-infra-rke2
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform apply

Step 2: Refresh cluster credentials

Re-extract cluster credentials to ensure they are current. The 02-rke2-credentials module connects to the RKE2 cluster and regenerates the configuration files required by later stages. Using sudo, run the following commands from the 02-rke2-credentials directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set the permissions required for RKE2 cluster access in later stages.
cd $NEW_BUNDLE/02-rke2-credentials
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform apply
Verify cluster access:
kubectl get nodes
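Example output for a single-node RKE2 cluster (node name, age, and version vary):
NAME     STATUS   ROLES                       AGE   VERSION
<node>   Ready    control-plane,etcd,master   10m   <version>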

The node should show a Ready status. If you receive either of the following errors, your kubeconfig is likely out of date and you need to refresh it. Copy /etc/rancher/rke2/rke2.yaml to ~/.kube/config and retry, as shown in the sketch after the error examples:
Unable to connect to the server: tls: failed to verify certificate: x509: certificate signed by unknown authority
# or 
E0427 13:03:37.926499  176601 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"http://localhost:8080/api?timeout=32s\": dial tcp [::1]:8080: connect: connection refused"
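A minimal refresh sketch, using the RKE2 kubeconfig path mentioned above:
# Copy the cluster kubeconfig to the current user and restrict its permissions.
mkdir -p ~/.kube
sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
sudo chown "$(id -u):$(id -g)" ~/.kube/config
chmod 600 ~/.kube/config
kubectl get nodes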

Step 3: Upgrade infrastructure services

Upgrade supporting services (container registry, database, storage, identity provider, cert-manager) using the 03-infra-services module. Using sudo, run the following commands from the 03-infra-services directory.
You must run this step using sudo from the same user account that owns the deployment. Terraform uses the original user and group IDs from the sudo environment to set the permissions required for RKE2 cluster access in later stages.
This step reloads all container images into the local registry from container-images/. This can take significant time as the full image set is synced on each upgrade.
cd $NEW_BUNDLE/03-infra-services
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform init
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform plan
sudo TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc /usr/local/bin/terraform apply
After apply completes, re-export cluster certificates if the output indicates changes:
sudo bash $NEW_BUNDLE/poolside-install/export_cluster_certificates.sh
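Before moving on, you can wait for the infrastructure services to settle. A sketch, assuming the poolside-services namespace used elsewhere in this guide:
kubectl wait --for=condition=Ready pods --all -n poolside-services --timeout=15m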

Step 4: Deploy the application upgrade

Deploy the upgraded Poolside application workloads using the 04-poolside-deployment module. Run the following commands from the 04-poolside-deployment directory. This step does not require sudo because the RKE2 cluster is accessible to the current user. In an air-gapped environment, keep the TF_CLI_CONFIG_FILE prefix:
cd $NEW_BUNDLE/04-poolside-deployment
TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform init
TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform plan
TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform apply
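Optionally confirm the rollout finished before continuing. A sketch, assuming the core-api deployment name shown in the Verification section:
kubectl get pods -n poolside
kubectl rollout status deployment/core-api -n poolside --timeout=10m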

Step 5: Upload model checkpoints

If the upgrade includes new or changed model checkpoints, copy them to the local volume and run the upload job.
  1. Copy new or changed model files to the local host directory /opt/poolside/poolside-model-uploads (see the example after this list). If you customized host volume paths during the initial install (01-infra-rke2), use the corresponding directory instead.
  2. Run the following commands from the 05-poolside-model-upload directory to trigger the model upload job.
    cd $NEW_BUNDLE/05-poolside-model-upload
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform init
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform plan
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform apply
    
  3. To upload additional or changed models later, repeat these steps. Uploads are additive and do not remove existing models from the deployment.
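For step 1, a hypothetical copy command, assuming the new checkpoints were staged under /tmp and the default upload path is unchanged:
# <new-model-folder> is a placeholder for the checkpoint directory provided by Poolside.
sudo cp -r /tmp/<new-model-folder> /opt/poolside/poolside-model-uploads/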

Step 6: Deploy Poolside models

The final phase 06-poolside-inference deploys the Poolside models from S3-compatible storage into the RKE2 cluster. The directory contains a Terraform module, which wraps the inference Helm chart used to deploy models across all modalities.
  1. Verify the S3 paths for all uploaded Poolside models. The default bucket path is s3://poolside-models/, followed by the model folder name. Use these paths for model_s3_uris in terraform.tfvars.
    ls /opt/poolside/poolside-model-uploads
    
    Example output:
    laguna_xs_fp8_fp8kv_04_2026  malibu_v2_agent_int4_04_2026  point_v2_04_2026
    
    From this example, the model_s3_uri values would be:
    • s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026
    • s3://poolside-models/malibu_v2_agent_int4_04_2026
    • s3://poolside-models/point_v2_04_2026
  2. In the 06-poolside-inference directory, create or update the terraform.tfvars file.
    • deployment_name should match the value used for each previous phase.
    • model_s3_uris should include the S3 URIs for each model listed in the previous step.
    • models_config should include the configuration for each model, including the number of replicas and GPUs for your deployment’s hardware configuration (see the capacity check after this list).
    deployment_name = "poolside-server"
    
    model_s3_uris = {
      agent_small = "s3://poolside-models/laguna_xs_fp8_fp8kv_04_2026"
      agent = "s3://poolside-models/malibu_v2_agent_int4_04_2026"
      completion = "s3://poolside-models/malibu_v2_completion_int4_04_2026"
    }
    
    models_config = {
      agent_small = {
        replicas = 1
        gpus = 1
      }
      agent = {
        replicas = 1
        gpus = 2
      }
      completion = {
        replicas = 1
        gpus = 1
      }
    }
    
  3. Run the following commands to deploy the models.
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform init
    TF_CLI_CONFIG_FILE=<path-to-bundle>/poolside-terraform.tfrc terraform apply
    
  4. After deploying the models, run the following command to retrieve the name and URL for each model.
    terraform output model_urls
    
  5. To deploy additional models, or to change the GPU or replica counts, modify terraform.tfvars and run terraform apply again.
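The gpus and replicas totals in models_config must fit within the cluster's available GPU capacity. One way to check allocatable GPUs, assuming they are exposed through the standard nvidia.com/gpu resource:
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'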

Verification

Verify that the upgraded deployment is running successfully. Check all pods in the application namespace:
kubectl get pods -n poolside
Example output (pod suffixes vary):
NAME                                        READY   STATUS    RESTARTS   AGE
core-api-58fb46cb65-h7zrk                   1/1     Running   0          6m
core-api-58fb46cb65-x8t58                   1/1     Running   0          6m
core-api-58fb46cb65-z2t6b                   1/1     Running   0          6m
core-api-models-reconciliation-loop-0       1/1     Running   0          6m
web-assistant-6f5977c886-hcttp              1/1     Running   0          6m
web-assistant-6f5977c886-jnrjt              1/1     Running   0          6m
web-assistant-6f5977c886-tb6rm              1/1     Running   0          6m
Check infrastructure services:
kubectl get pods -n poolside-services
Check model inference pods:
kubectl get pods -n poolside-models
Verify that the web interface is accessible at https://<poolside-hostname> (for example, https://poolside.poolside.local or your configured hostname). Then link the new models in the Poolside Console:
  • In the Poolside Console, navigate to Agents > Models.
  • Click New Model.
  • Enter the Model Name and Base URL retrieved from Step 6: Deploy Poolside models, and optionally a Description or Name override.
  • Click Connect to Model.