Introduction
Poolside on-premises allows you to deploy the Poolside platform on your own hardware. This is useful for organizations that want to keep their data on-premises, have specific security requirements, must be air-gapped, have limited internet access, or have compliance requirements that prevent them from using cloud-based solutions.
On-premises hardware options
Poolside offers multiple on-premises deployment options optimized for different team sizes and workload mixes. Larger HGX-based systems are intended for enterprise-scale usage, while RTX-based workstation configurations are suitable for smaller teams and departmental deployments.

| | Option 1: Customer-provided hardware (BYO) | Option 2: Turnkey HGX rack (Dell or Supermicro) | Option 3: Turnkey GPU workstation tower | Option 4: Turnkey GPU workstation rack |
|---|---|---|---|---|
| GPU configuration | 8× H200 (recommended)* | 8× H200 | 4× RTX 6000 | 8× RTX 6000 (5U) |
| Description | Suitable for large enterprise teams | Fully integrated HGX rack solution validated by Poolside | Workstation-based option for smaller teams and individual groups | Rack-mounted workstation option for mid-sized teams |
| Maximum recommended users | 150–200 developers | 150–200 developers | Up to 20 developers | Up to 50 developers |
| Operating system | Ubuntu 22.04 LTS or RHEL 9.6 | - | - | - |
| CPU | Customer-provided CPU (128+ cores, 3.0 GHz or higher) | 2× AMD EPYC 9555 (64 cores, 3.2–4.4 GHz) | Intel Xeon w9-3575X (44 cores) | 2× Intel Xeon 6960P (72 cores each) |
| GPU | 8× NVIDIA H200 SXM (HGX baseboard, 1128 GB total VRAM) | 8× NVIDIA H200 SXM (HGX baseboard, 1128 GB total VRAM) | 4× NVIDIA RTX 6000 Blackwell Max-Q (PCIe, 96 GB GDDR7 each) | 8× NVIDIA RTX 6000 Blackwell Server Edition (PCIe, 96 GB GDDR7 each) |
| Memory | 1 TB DDR5 recommended (512 GB minimum for low-concurrency or PoC environments) | 1 TB DDR5 (12× 96 GB, 4800 MT/s) | 512 GB DDR5 (8× 64 GB, 4800 MT/s) | 1 TB DDR5 (16× 64 GB, 4800 MT/s) |
| Storage | 1 TB NVMe (OS) + 10 TB NVMe (scratch) | 1.92 TB NVMe (OS) + 10 TB NVMe (scratch) | 2× 2 TB NVMe (OS) + 2× 4 TB NVMe (data, RAID1) | 2× 4 TB NVMe (OS) + 2× 13 TB NVMe (data, RAID1) |
| Network | Dual 10G+ ethernet, 1G IPMI | Dual 10G RJ45, 1G IPMI | 10 GbE NIC | Dual 10 GbE NICs |
Sizing and deployment notes
- Customer-provided hardware (option 1) can start with 4× H200 GPUs; however, capacity must be validated based on your intended workload.
- Team size estimates assume mixed usage of chat, completion, and agent workloads.
- Actual capacity depends on concurrency levels, model selection, and usage patterns.
- Poolside validates all on-premises hardware configurations before deployment.
- Review the official power and electrical specifications provided by the hardware vendor before deployment. Certain workstation configurations may require dedicated high-capacity circuits and specialized cooling. Confirm your hosting environment meets the required power and cooling specifications.
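As a rough illustration of how the table's user limits and GPU figures can be used for a first-pass sizing check, the Python sketch below encodes three of the configurations and picks the smallest one whose recommended maximum covers a given team size. The `HardwareOption` class and `smallest_fitting_option` function are hypothetical names introduced here, the per-GPU figure of 141 GB for the H200 is derived from the table's 1128 GB total across 8 GPUs, and the upper bound of each recommended range is assumed as the cutoff. It does not replace workload validation with Poolside.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HardwareOption:
    """One row of sizing data taken from the hardware options table."""
    name: str
    gpu_count: int
    vram_per_gpu_gb: int
    max_developers: int  # upper bound of the recommended range

    @property
    def total_vram_gb(self) -> int:
        return self.gpu_count * self.vram_per_gpu_gb


# Option 1 (BYO) shares the 8x H200 figures of Option 2, so it is not listed separately.
OPTIONS: List[HardwareOption] = [
    HardwareOption("Option 3: Turnkey GPU workstation tower", 4, 96, 20),
    HardwareOption("Option 4: Turnkey GPU workstation rack", 8, 96, 50),
    HardwareOption("Option 2: Turnkey HGX rack", 8, 141, 200),
]


def smallest_fitting_option(developers: int) -> Optional[HardwareOption]:
    """Return the smallest option whose recommended maximum covers the team.

    Returns None when the team exceeds every listed recommendation.
    """
    if developers < 1:
        raise ValueError("developers must be at least 1")
    for option in sorted(OPTIONS, key=lambda o: o.max_developers):
        if developers <= option.max_developers:
            return option
    return None


if __name__ == "__main__":
    for team in (15, 35, 120, 500):
        choice = smallest_fitting_option(team)
        label = choice.name if choice else "no single listed option"
        print(f"{team} developers: {label}")
```

Because actual capacity depends on concurrency, model selection, and usage patterns, a result from a helper like this is only a starting point for the validation described in the notes above.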
Architecture
All on-premises deployments currently operate as single-node Kubernetes clusters. While you can configure multiple model replicas for increased throughput, this architecture does not provide high availability against hardware failures. Multi-node Kubernetes clusters and high availability for on-premises deployments are not currently supported.
The on-premises architecture includes:
- RKE2 Kubernetes
- PostgreSQL database
- S3-compatible object storage
- Keycloak identity management (optional)
- Container registry
- cert-manager
- Poolside platform

Installation
Poolside on-premises uses a step-based Terraform deployment process:
- Infrastructure provisioning: Provision and configure the Kubernetes (RKE2) cluster.
- Access configuration: Configure cluster credentials and access required for deployment.
- Supporting services: Deploy the infrastructure services required by the Poolside platform.
- Platform deployment: Deploy the Poolside platform components.
- Model upload: Upload Poolside models so they are available to the platform.
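The ordering of these steps matters, since each one depends on the previous step having succeeded. The Python sketch below illustrates that idea only: it runs named steps in sequence and stops at the first failure. The `run_steps` function and the `placeholder` action are hypothetical and stand in for whatever each Terraform step actually executes; no real Poolside or Terraform commands are shown.

```python
from typing import Callable, List, Tuple

# A step is a display name plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


def placeholder() -> bool:
    """Hypothetical stand-in for the real work of a deployment step."""
    return True


# Step names follow the list above; the actions are placeholders.
DEPLOYMENT_STEPS: List[Step] = [
    ("Infrastructure provisioning", placeholder),
    ("Access configuration", placeholder),
    ("Supporting services", placeholder),
    ("Platform deployment", placeholder),
    ("Model upload", placeholder),
]

if __name__ == "__main__":
    done = run_steps(DEPLOYMENT_STEPS)
    print(f"Completed {len(done)} of {len(DEPLOYMENT_STEPS)} steps")
```

Stopping at the first failed step keeps later steps, such as platform deployment, from running against a cluster or supporting services that are not ready.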
Important considerations
- Single-node architecture: Multi-node Kubernetes and high availability are not supported
- Model requirements: Unquantized models require H100/H200 GPUs
- GPU allocation: Malibu (chat) models require 2×, 4×, or 8× GPU configurations
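The model and GPU constraints above can be expressed as a small pre-deployment check. The Python sketch below encodes only the two rules stated in this section: unquantized models need H100 or H200 GPUs, and Malibu models need 2, 4, or 8 GPUs. The function name `check_model_placement` and its parameters are hypothetical, and matching the GPU type by substring is an assumption made for illustration.

```python
from typing import List

ALLOWED_MALIBU_GPU_COUNTS = {2, 4, 8}
UNQUANTIZED_GPU_FAMILIES = ("H100", "H200")


def check_model_placement(
    gpu_model: str,
    gpu_count: int,
    is_malibu: bool,
    quantized: bool,
) -> List[str]:
    """Return a list of rule violations; an empty list means no rule was broken."""
    problems: List[str] = []

    # Rule 1: unquantized models require H100/H200 GPUs.
    if not quantized and not any(
        family in gpu_model.upper() for family in UNQUANTIZED_GPU_FAMILIES
    ):
        problems.append(
            f"unquantized models require H100/H200 GPUs, got {gpu_model!r}"
        )

    # Rule 2: Malibu (chat) models require 2, 4, or 8 GPUs.
    if is_malibu and gpu_count not in ALLOWED_MALIBU_GPU_COUNTS:
        problems.append(
            f"Malibu models require 2, 4, or 8 GPUs, got {gpu_count}"
        )

    return problems


if __name__ == "__main__":
    for issue in check_model_placement(
        "NVIDIA RTX 6000", 3, is_malibu=True, quantized=False
    ):
        print("Problem:", issue)
```

Passing such a check does not imply the configuration has enough capacity; it only flags combinations that this section rules out.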
Operational responsibilities
For all on-premises deployments, your organization is responsible for:
- Infrastructure resilience: Power redundancy, cooling, and physical security
- Data protection: Backup strategies for the PostgreSQL database and object storage
- System monitoring: Resource utilization, health checks, and alerting infrastructure
- Network security: Firewall rules, network segmentation, and access controls
- Network bandwidth: Sufficient bandwidth for model downloads (100 GB+) and user traffic
- Capacity planning: Scaling decisions based on user load and model requirements
- Disaster recovery: Business continuity planning and recovery procedures
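To make the network bandwidth item concrete, the Python sketch below estimates how long a model download takes at a given link speed. The `transfer_time_seconds` function and its `efficiency` parameter are hypothetical; sizes are treated as decimal gigabytes (1 GB = 8 × 10^9 bits), and real transfers are usually slower than the ideal figure because of protocol overhead and shared links.

```python
def transfer_time_seconds(
    size_gb: float, link_gbps: float, efficiency: float = 1.0
) -> float:
    """Estimate transfer time for size_gb gigabytes over a link_gbps link.

    efficiency is the assumed fraction of link capacity actually achieved
    (1.0 means the ideal, overhead-free case).
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = size_gb * 8e9  # decimal gigabytes to bits
    return bits / (link_gbps * 1e9 * efficiency)


if __name__ == "__main__":
    for gbps in (1, 10):
        minutes = transfer_time_seconds(100, gbps) / 60
        print(f"100 GB over {gbps} Gbps: about {minutes:.1f} minutes (ideal)")
```

Under these assumptions, a 100 GB download takes roughly 800 seconds on a 1 Gbps link and 80 seconds on a 10 Gbps link, before accounting for overhead or competing user traffic.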