1. Introduction

This guide assumes the administrator performing the deployment is familiar with Infrastructure as Code (IaC), AWS services, and network architectures. Before beginning, Poolside recommends that you review the entire document to understand the requirements. If you have any questions, contact Poolside before starting the installation process.

2. Architecture

The following diagram illustrates the Poolside architecture after a successful deployment.

3. AWS and tooling requirements

3.1. Requirements overview

Ensure you meet the following deployment requirements before starting the installation. This guide assumes general knowledge of the technologies involved, but provides detailed instructions to ensure clarity and ease of implementation. Poolside is available to provide assistance throughout the deployment process as needed.

3.2. AWS resources

For each instance of a Poolside deployment on AWS, you need:
  • A target VPC for Poolside installation
    • With available P5 GPU instance capacity
  • Subnet IDs for:
    • EKS control plane
    • Private pod networking
  • A work environment with:
    • Valid AWS credentials
    • Access to EKS control plane subnet
    • Access to pod networking subnet (for validation)
These resources must not be subject to restrictive service quotas. If limits apply, adjust them or request increases through Service Quotas.

3.3. Tools

Install the required deployment tools in the environment where you run the deployment commands. The recommended approach is to run the deployment from a bastion host in the target VPC. If you deploy from a bastion host, install these tools on that host. If you deploy from a local machine, install them locally and ensure network access to the required AWS subnets.

3.4. Poolside assets

Obtain access to the following components from the Poolside team before starting the deployment:
  • Terraform and Helm charts
  • Three container images to store in ECR:
    • API
    • Web Assistant
    • Inference
  • Splash client
  • Model weights
  • IDE extension
Verify that you have access to all required assets before proceeding with the installation.

3.4.1. Confirm the environment has the necessary permissions

To verify the AWS CLI configuration and authentication setup, run the following command to list the IAM users, groups, and policies in the target AWS account:
aws iam get-account-authorization-details
The command output should list IAM users, groups, and policies. If it does not, verify the authentication configuration and run the command again.
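Because `get-account-authorization-details` returns a large JSON document, it can help to capture the output to a file and check for the expected top-level fields. The sketch below runs against a truncated, illustrative sample of the output so it can be exercised offline; in a live environment, redirect the real command's output instead.

```shell
# Truncated sample of `aws iam get-account-authorization-details` output,
# used as an offline stand-in. In a live environment, capture the real output:
#   aws iam get-account-authorization-details > /tmp/auth-details.json
cat > /tmp/auth-details.json <<'EOF'
{
  "UserDetailList": [ { "UserName": "deployer" } ],
  "GroupDetailList": [],
  "Policies": []
}
EOF

# The response should contain the IAM detail lists; if not, re-check auth.
if grep -q '"UserDetailList"' /tmp/auth-details.json; then
  echo "auth details retrieved"
else
  echo "authentication configuration problem" >&2
fi
```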

3.4.2. AWS VPC and subnet setup

Use the provided network deployment Terraform script as a starting point to create the required AWS networking resources. Modify the Terraform configuration or use an alternative provisioning method if organizational requirements or constraints apply. The provided Terraform serves as a reference for the minimum resources needed. Prepare the following AWS VPC resources before proceeding with the Poolside deployment:
  • A dedicated AWS VPC
  • Two VPC subnets for inter-pod connectivity (private_subnet_ids)
  • Two VPC subnets for control plane services (control_plane_subnet_ids)

3.4.3. Network access requirements

Allow outbound network access so Poolside can ingest Git repositories and user-approved URLs. Ensure routing rules and network access controls permit these connections.

3.4.4. Select a deployment method and set up tooling

After completing the AWS networking setup, select a deployment method:
  1. Bastion host (recommended)
  2. Local machine
    • Install all required dependencies
    • Ensure network access to the target EKS environment

4. Infrastructure deployment

Use Terraform to deploy the required infrastructure. After this step completes, the core Poolside infrastructure components are in place. Before running any commands, configure the AWS CLI to use the correct AWS account and region in the environment where you run the deployment. For setup instructions, see the AWS CLI documentation.

4.1. Extract the Poolside bundle and set the working directory

Extract the Poolside files into a new directory. Set this directory as the working directory by running:
# set the working directory to the current directory
export WORKDIR=$PWD
Use the WORKDIR variable to move between the installer's file and script directories in the following steps.

4.2. Set up the configuration files

Navigate to the iac directory:
cd $WORKDIR/iac
Create the file $WORKDIR/iac/terraform.tfvars using the following template. Replace each placeholder with the appropriate values from the previous sections. If you plan to use permissions_boundary_arn, uncomment the line and provide the required IAM boundary ARN.
vpc_id = "vpc-..."
private_subnet_ids = [ "subnet-...", "subnet-..." ]
control_plane_subnet_ids = [ "subnet-...", "subnet-..." ]
enable_public_access = false
#ami_id = "ami-..."
#user_data = <<-EOT
# echo "Starting EKS node..."
# EOT
#permissions_boundary_arn = "arn:aws:iam::..."
For the GPU node group, use the AL2_x86_64_GPU AMI type or an AMI that includes NVIDIA drivers.
Verify that the file is populated correctly by running the following command and reviewing the output:
cat $WORKDIR/iac/terraform.tfvars
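Beyond reviewing the file by eye, a quick grep can catch template placeholders (`...`) left unfilled. This is an offline sketch against a sample file with one deliberate leftover placeholder; point it at $WORKDIR/iac/terraform.tfvars in a real deployment.

```shell
# Sample tfvars with one unfilled placeholder left in on purpose.
cat > /tmp/terraform.tfvars <<'EOF'
vpc_id = "vpc-0abc1234"
private_subnet_ids = [ "subnet-...", "subnet-0def5678" ]
EOF

# Flag any "..." placeholders remaining from the template.
if grep -q '\.\.\.' /tmp/terraform.tfvars; then
  echo "unfilled placeholders found"
else
  echo "terraform.tfvars looks complete"
fi
```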

4.3. Deploy the infrastructure

Prepare the working directory for Terraform. Run terraform init to download the required providers and modules, initialize the backend for state storage, and validate the configuration.
terraform init
The command should complete successfully and display output similar to the following:
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to view any changes required for your infrastructure. All Terraform commands should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.
If multiple operators will run Terraform, configure a remote state backend to prevent concurrent runs from conflicting. Add a remote.tf file in $WORKDIR/iac and replace the S3 bucket and DynamoDB table values with the appropriate resources. For additional details, see the HashiCorp S3 backend documentation.
remote.tf
terraform {
  backend "s3" {
  bucket = "TERRAFORM-STATE-BUCKET"
  key = "poolside-terraform/terraform.tfstate"
  region = "us-east-1"
  encrypt = true
  dynamodb_table = "TERRAFORM-STATE-TABLE"
  }
}
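The state bucket and lock table must exist before `terraform init` can use them. If they are not already provisioned, commands along the following lines create them (Terraform's S3 backend expects the DynamoDB table's partition key to be a string attribute named LockID). The sketch prints the commands instead of executing them so it can be reviewed offline; remove the `echo`s to run for real.

```shell
BUCKET="TERRAFORM-STATE-BUCKET"   # replace with your bucket name
TABLE="TERRAFORM-STATE-TABLE"     # replace with your lock table name
REGION="us-east-1"

# Printed rather than executed so the sketch can be reviewed offline.
echo aws s3api create-bucket --bucket "$BUCKET" --region "$REGION"
echo aws dynamodb create-table --table-name "$TABLE" \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
```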
Re-run terraform init to validate the remote backend configuration and confirm that no configuration or access errors occur. After configuring the variables and, if applicable, the backend, run terraform plan to review the execution plan and see the list of resources Terraform will create:
terraform plan
The command should complete without errors and display the resource changes that Terraform will apply. To create the resources, run:
terraform apply
Terraform prompts for confirmation. Type yes to proceed. A typical run takes approximately 15 minutes and results in a configured, functional EKS cluster. On successful completion, Terraform displays several output values similar to the following:
artifacts_bucket = "poolside-xxx"
cluster_endpoint = "https://Exxx6865xxxF72DA59xxxxxxxx27B724.gr7.eu-west-1.eks.amazonaws.com"
cluster_name = "poolside-xxx"
cluster_security_group_id = "sg-0410xxxxxxxxc8392"
fluent_bit_role_arn = "arn:aws:iam::xxxxxxxxxxxx:role/eks-fluent-bit-xxxx"
inference_role_arn = ""
logs_bucket = "poolside-logs-xxxx"
logs_bucket_region = "eu-west-1"
pod_identities_enabled = true
public_access_enabled = false
rds_database = "poolside"
rds_hostname = "poolside-rds-xxx.xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com"
rds_password_output = "xxxxxx"
rds_port = 5432
rds_username = "poolside"
tenant_suffix = "xxx"
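Several later steps need these output values (the cluster name for kubeconfig, the bucket names, and so on). `terraform output -raw <name>` retrieves a single value directly; the sketch below parses a saved copy of the output text instead, so it can run without Terraform state. The file path and sample values are illustrative.

```shell
# Saved sample of the `terraform apply` outputs (illustrative values).
cat > /tmp/tf-outputs.txt <<'EOF'
cluster_name = "poolside-xxx"
logs_bucket_region = "eu-west-1"
EOF

# In a live deployment, prefer:  CLUSTER_NAME=$(terraform output -raw cluster_name)
CLUSTER_NAME=$(awk -F' = ' '/^cluster_name/ { gsub(/"/, "", $2); print $2 }' /tmp/tf-outputs.txt)
echo "cluster: $CLUSTER_NAME"
```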

4.4. Manage database credentials

Rotate RDS database credentials on a regular basis. Use existing internal processes or third-party services to manage credential rotation. To rotate the credentials using Terraform:
  1. Update the random_string keeper value in prepare_values.sh.
  2. Run terraform plan.
  3. Run terraform apply.
  4. Run the prepare_values.sh script.
You can also configure the Terraform scripts to store RDS database credentials in AWS Secrets Manager. Coordinate with your Poolside representative to adapt the deployment scripts as needed.

5. Kubernetes configuration

5.1. Connect to the Kubernetes cluster

After deploying the EKS cluster, verify that it is running. Configure your kubeconfig to point to the newly deployed cluster:
aws eks update-kubeconfig --name <your-cluster-name> --region <region-name>
Verify that the nodes are present and in a healthy, active state:
kubectl get nodes
If you encounter connection errors, review the network configuration. Ensure the EKS security group includes an inbound rule that allows access from the bastion host or local machine used for deployment.
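The `kubectl get nodes` output can also be scanned mechanically for nodes that are not Ready. The sketch below runs the check against a saved sample (with one NotReady node for illustration) so it can be exercised offline; in a live cluster, pipe `kubectl get nodes` into the same awk program.

```shell
# Saved sample of `kubectl get nodes` output, with one unhealthy node.
cat > /tmp/nodes.txt <<'EOF'
NAME     STATUS     ROLES    AGE   VERSION
node-1   Ready      <none>   5m    v1.29.0
node-2   NotReady   <none>   5m    v1.29.0
EOF

# Live usage:  kubectl get nodes | awk '...'
awk 'NR > 1 && $2 != "Ready" { print $1 " is " $2; bad = 1 }
     END { if (!bad) print "all nodes Ready" }' /tmp/nodes.txt
```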

5.2. Configure the Kubernetes cluster

Configure the Kubernetes cluster with the correct container image locations. Poolside uses two images: one for the API and one for inference. Provide the appropriate registry and tag values to the prepare_values.sh script based on where the images are stored.
  • --api-registry: Registry for the API image
  • --api-tag: Tag for the API image
  • --inference-registry: Registry for the inference image
  • --inference-tag: Tag for the inference image
Navigate to the scripts directory:
cd $WORKDIR/scripts
Ensure that the script has execute permissions:
chmod +x prepare_values.sh
Run the script and supply the registry and tag values for both images, for example:
./prepare_values.sh \
  --api-registry 93xxxxxx6136.dkr.ecr.us-east-1.amazonaws.com/runtime/forge_api \
  --api-tag latest \
  --inference-registry 93xxxxxx6136.dkr.ecr.us-east-1.amazonaws.com/poolside/vllm \
  --inference-tag latest
The script generates two files in the kube/values directory:
  • core-api-generated.yaml
  • fluent-bit-generated.yaml
These files include the parameters required to configure the Kubernetes workloads.
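A quick existence check confirms the script produced both files before moving on. The sketch below creates stand-in files under /tmp so it can run offline; in a real run, check $WORKDIR/kube/values instead.

```shell
VALUES_DIR=/tmp/kube/values            # stand-in for $WORKDIR/kube/values
mkdir -p "$VALUES_DIR"
touch "$VALUES_DIR/core-api-generated.yaml" "$VALUES_DIR/fluent-bit-generated.yaml"

# Verify both generated files are present before proceeding.
for f in core-api-generated.yaml fluent-bit-generated.yaml; do
  if [ -f "$VALUES_DIR/$f" ]; then echo "$f present"; else echo "$f missing" >&2; fi
done
```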

5.3. Deploy the Poolside infra-check application with Helmfile

This step requires Helm and Helmfile. See Tools. Navigate to the kube directory:
cd $WORKDIR/kube
Install the following helm plugins:
  • helm diff
  • helm s3
  • helm secrets
  • helm git
helm plugin install https://github.com/databus23/helm-diff
helm plugin install https://github.com/hypnoglow/helm-s3.git
helm plugin install https://github.com/jkroepke/helm-secrets
helm plugin install https://github.com/aslafy-z/helm-git
After installing the required plugins, synchronize the Helm releases with the desired state defined in the Helmfile by running:
helmfile sync
On success, the terminal displays output similar to the following:
UPDATED RELEASES:
NAME           CHART                            VERSION
core-api       charts/core-api                  0.0.1
fluent-bit     fluent/fluent-bit                0.46.1
gpu-operator   nvidia/gpu-operator              v23.9.2
ingress-nginx  ingress-nginx/ingress-nginx      4.10.0

5.4. Verify that the Poolside API is running

Confirm that the system is operational by checking that the pods are running:
kubectl get pods
Retrieve and save the API URL for later use by running:
kubectl get ingress
The command outputs information similar to the following:
NAME       CLASS   HOSTS   ADDRESS                                PORTS   AGE
core-api   nginx   *       hostname.elb.us-east-2.amazonaws.com   80      22h
Verify that the API is running and accessible by opening /docs in a browser using the URL associated with the core-api service. For example, if the output shows http://hostname.elb.us-east-2.amazonaws.com, open:
http://hostname.elb.us-east-2.amazonaws.com/docs
The API documentation should load in the browser.
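For scripted checks, curl's `-w '%{http_code}'` flag reduces this verification to a status-code comparison. The block below substitutes a stand-in value so the logic can run offline; the commented line shows the live form, where the ingress hostname is whatever `kubectl get ingress` reported.

```shell
# Live form (requires network access to the ingress):
#   status=$(curl -s -o /dev/null -w '%{http_code}' "http://<ingress-host>/docs")
status=200   # stand-in for the curl result
if [ "$status" = "200" ]; then
  echo "API docs reachable"
else
  echo "unexpected status: $status" >&2
fi
```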

6. DNS and SSL configuration

Configure DNS and SSL to ensure secure and reliable communication between Poolside components. These instructions describe how to use the default Poolside NGINX Ingress Controller, an AWS Certificate Manager (ACM) certificate, and a single domain with conditional logic.
This section provides high-level guidance. Adjust the steps as needed to meet specific deployment requirements.

6.1. Overview

This guide follows these steps:
  1. Select a domain or subdomain for the API, referred to as <api.domain.com> (for example, api.poolside.com).
  2. Create a DNS CNAME record that points to the Elastic Load Balancer address returned by kubectl get ingress.
  3. Request an SSL/TLS certificate for the selected domain using AWS Certificate Manager (ACM) in the same AWS region as the EKS cluster.
  4. Configure the NGINX Ingress Controller service to use the SSL certificate.
  5. Configure host-based routing so requests to the domain route correctly.
  6. Allow inbound HTTPS traffic (port 443) to the API endpoint in firewall and network policies.

6.2. SSL configuration file

Use the ingress-nginx-ssl.yaml file in the kube/values directory to configure SSL for the ingress. Populate this file with the SSL certificate details. The default configuration uses AWS Certificate Manager. Modify the configuration if a different certificate provider is required.
controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
      # Replace with your ACM certificate ARN
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:<region>:<account-id>:certificate/<certificate-id>"
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
Replace the placeholders in the file with the appropriate certificate information. This configuration applies only when the enableSSL setting in the next step is set to true.

6.3. Custom values configuration

In the kube/values directory, open custom-values.yaml and replace the placeholders with the required values:
enableSSL: true # Set to false to disable SSL configuration
ingress:
  host: "<api.domain.com>" # Your domain or subdomain
When enableSSL is set to false (the default), the Helmfile deployment uses the non-SSL ingress configuration.

6.4. Apply changes

Save all modified files, navigate to the top-level kube directory, and reapply the Helmfile by running:
helmfile apply

6.5. Confirm the ELB hostname

Changes to the Ingress Controller Service configuration may cause Kubernetes to provision a new Elastic Load Balancer with a different DNS hostname. If this occurs, update the DNS CNAME record accordingly. In some cases, Kubernetes updates the existing load balancer without changing the hostname. To confirm the current load balancer hostname, run:
kubectl get service ingress-nginx-controller \
  -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{"\n"}'
If the hostname matches the value obtained in 5.4. Verify that the Poolside API is running, DNS is configured correctly. If the hostname differs, update the CNAME record for the domain with the domain registrar.
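The comparison itself can be scripted. The sketch below uses stand-in values; in a live check, set RECORDED from your DNS record and CURRENT from the kubectl jsonpath query above.

```shell
RECORDED="hostname.elb.us-east-2.amazonaws.com"   # value your CNAME points at
CURRENT="hostname.elb.us-east-2.amazonaws.com"    # value from kubectl (stand-in)

if [ "$RECORDED" = "$CURRENT" ]; then
  echo "DNS CNAME still valid"
else
  echo "update the CNAME record to: $CURRENT"
fi
```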

6.6. Validate the HTTPS endpoint

After DNS and certificate changes propagate (typically within minutes, but up to 48 hours), test the HTTPS endpoint by running:
curl -v https://<api.domain.com>/docs
You can also open the URL directly in a web browser. The page should display the same API documentation shown in 5.4. Verify that the Poolside API is running.

7. OAuth and SSO configuration

Configure OAuth and Single Sign-On (SSO) by integrating Poolside with an existing identity provider (IdP) using the OpenID Connect (OIDC) protocol. This guide describes configuration using Amazon Cognito or Keycloak, and the same approach applies to any OIDC-compliant provider.

7.1. Overview

Follow these high-level steps to configure OAuth and SSO:
  1. Identify an existing (or set up a new) Identity Provider (IdP)
  2. Obtain OIDC credentials from your IdP:
    • Client ID
    • Client Secret
    • Provider URL (often called Discovery URL or Issuer URL)
  3. Configure your IdP for Poolside:
    • Set the Redirect URI to: https://<api.domain.com>/auth/callback
    • Ensure that the necessary scopes are enabled: openid, profile, email
  4. Update your values/custom-values.yaml file with the OIDC configuration
  5. Apply the configuration changes to your Poolside deployment
  6. Verify the SSO setup
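The step 4 update to values/custom-values.yaml mirrors the OIDC fields that the splash-config file in section 8.3 uses. The key names below are placeholders modeled on that file, not confirmed chart values; verify the exact structure against the Poolside-provided charts.

```yaml
oidc:
  providerURL: "<OIDC Provider URL>"      # often called Discovery or Issuer URL
  clientID: "<OIDC Client ID>"
  clientSecret: "<OIDC Client secret>"
```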

7.2. Example: Configure Amazon Cognito

Complete the following steps to configure Amazon Cognito as the identity provider.
  1. Sign in to the AWS Management Console and open the Amazon Cognito service.
  2. Create a new user pool or select an existing one. If you create a new user pool:
    1. Create a new user pool in Amazon Cognito.
      1. Select Traditional web application for Define your application.
      2. Under Options for sign-in identifiers, select Email.
      3. Create the user pool and open the user pool overview.
    2. Under Authentication methods, keep the default password policy or define a custom policy as required.
    3. Under Sign-in:
      • Configure multi-factor authentication as needed.
      • Leave user account recovery settings at their defaults or adjust as required.
    4. Under Sign-up:
      • Keep self-service sign-up enabled.
      • Use the default attribute verification and confirmation settings unless specific requirements apply.
      • For proof-of-concept deployments, enable Cognito-assisted verification and confirmation.
  3. After the user pool is created, open Applications > App clients and select the existing app client created by Cognito.
    1. Edit the app client information.
      • Under Authentication flows, enable ALLOW_USER_PASSWORD_AUTH and disable all other flows.
    2. Under Login pages, edit the managed login page configuration.
      • Set Allowed callback URLs to https://<api.domain.com>/auth/callback.
      • Enable the Authorization code grant OAuth 2.0 flow.
      • Enable the OpenID Connect scopes openid, email, and profile.
    3. Copy the Client ID and Client secret. These values are required for the Poolside configuration.
  4. Configure branding and a domain for authentication. For proof-of-concept deployments, the default Cognito domain is sufficient.
  5. Copy the issuerURL value from the app client example code. Use this value as the providerURL in the Poolside configuration.
  6. Proceed to Configure Poolside to complete the OAuth and SSO integration.

7.3. Example: Configure Keycloak

To use a self-managed identity provider, configure an existing Keycloak deployment or deploy Keycloak in the target environment. Ensure Keycloak is accessible over HTTPS and uses a valid SSL certificate before proceeding.
  1. Sign in to the Keycloak Admin Console using administrator credentials.
  2. Create a new realm or select an existing realm.
  3. Create a new client:
    • Set Client ID to a meaningful name for the Poolside deployment.
    • Set Client Protocol to openid-connect.
    • Set Access Type to confidential.
    • Set Valid Redirect URIs to https://<api.domain.com>/auth/callback.
  4. After creating the client, open the Credentials tab and record the Secret value.
  5. Configure client scopes:
    • Open the Client Scopes tab.
    • Ensure the openid, email, and profile scopes are selected and assigned to the client.
  6. Record the providerURL, clientID, and clientSecret.

8. Configure Poolside

Use the Splash CLI to complete the initial Poolside setup.

8.1. Prerequisites and required information

Before starting, ensure the following are available:
  1. A system with a graphical interface, browser access, and network connectivity to the Poolside deployment
  2. The Splash CLI, provided by Poolside via S3, installed and available in the system PATH
  3. The Poolside API URL (https://<api.domain.com>), as configured in 6.6. Validate the HTTPS endpoint
  4. OIDC credentials from the selected identity provider:
    • OIDC Client ID
    • OIDC Client secret
    • OIDC Provider URL
    • Redirect URL: https://<api.domain.com>/auth/callback

8.2. Retrieve the bootstrap token

On initial startup, Poolside has no tenants and generates a bootstrap token for first-time setup. Run the following command to identify a pod, retrieve its logs, and extract the bootstrap token:
kubectl logs $(kubectl get pods -l app.kubernetes.io/name=core-api -o name | head -n 1) \
  | grep -i 'ps-bootstrap' \
  | grep -i 'token:' \
  | awk -F 'token: *' '{print $2}' \
  | head -n 1
Save the token for use in the next steps.
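The pipeline's extraction logic can be sanity-checked against a sample log line before running it against live pods (the log format shown is illustrative; only the `token:` marker is assumed from the command above):

```shell
# Illustrative log line; the live command reads real core-api pod logs.
sample='2024-01-01T00:00:00Z INFO ps-bootstrap token: abc123def456'

echo "$sample" \
  | grep -i 'ps-bootstrap' \
  | grep -i 'token:' \
  | awk -F 'token: *' '{ print $2 }' \
  | head -n 1
```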

8.3. Configure the tenant

Create a splash-config file and replace all <...> placeholders with the appropriate configuration values:
baseURL: https://<api.domain.com>
tenantName: <any name you want for this tenant>
bootstrapToken: <bootstrap token>
oidc:
  providerURL: <OIDC Provider URL>
  clientID: <OIDC Client ID>
  clientSecret: <OIDC Client secret>
Verify that the file is saved correctly by running cat splash-config. Then run the Poolside CLI as follows:
splash bootstrap --config splash-config
On successful execution, the command displays output similar to the following:
POOLSIDE CONFIG SUMMARY

API: https://<api.domain.com>
Tenant: your tenant name
Tenant UUID: 0190exx3-f0a1-7x0a-bcx7-83xxx3484c2d

Please update your OIDC provider with the following callback URI: https://<api.domain.com>/auth/callback

Once done you can call splash login to authenticate cli and continue setup.
The Cognito configuration sets the callback URI automatically. If you use a different authentication provider, ensure the callback URI is configured on that provider.
Log in to Poolside. The first user created becomes the administrator. Run the following command:
splash login
The CLI prompts you to open a browser and authenticate with a valid user from the identity provider. On successful authentication, the Splash CLI displays Login Successful.
When using Cognito for a proof-of-concept deployment, create a user account first by completing the sign-up flow before attempting to log in.

8.4. Create a team

The first user to log in becomes the administrator by default. Create teams to onboard additional, non-administrator users. Use the splash teams create command to create a team. For example, to create a team named engineering and automatically include users with email addresses ending in @my-domain.com, run:
splash teams create engineering --condition='email.endsWith("@my-domain.com")'

9. Load models into Poolside

Poolside supports two inference mechanisms. Select the option that matches the deployment requirements.

  • Amazon Bedrock
    • Serverless inference
    • VPC inference
  • Self-managed compute

10. Configure the IDE extension

The IDE extensions are not available in public marketplaces and are provided by the Poolside team for internal distribution. Open the Poolside extension settings in the IDE and set the poolside URI to the API hostname configured in the earlier deployment steps.