The UberCloud Blog

Kubernetes, Containers, and HPC Applications in Hybrid and Multi-Cloud Environments

Written by Daniel Gruber | Sep 5, 2021 7:25:11 PM

At UberCloud, our maxim is to constantly evaluate new and better HPC cloud solutions for our customers. Rolling out HPC applications in private and public clouds is complex, and we want to hide those details from engineers so that they can keep innovating in their product design process. This has led us to move to more general abstraction layers over the years. We started with installation scripts and configuration management and, more than half a decade ago, quickly moved to robust, all-purpose HPC containers that run unchanged in any environment.
 
But containers don’t exist alone. What we ultimately needed were flexible, robust, and fully pre-configured HPC clusters capable of running any kind of HPC application in any cloud. We see Kubernetes as a modern, widely accepted and adopted solution for managing container clusters. But Kubernetes has its own challenges in the high-performance computing space, which have to be researched and understood. Over the past two years, we have published a series of three articles on HPCwire, sharing with the community our views, experience, and developments in using Kubernetes to manage engineering simulation workloads in any cloud.

The first article, "Kubernetes, Containers and HPC" (2019), started this discussion by highlighting the value of containers for HPC applications and showing the major differences between Kubernetes and traditional HPC workload managers.

The second article, "Kubernetes and HPC Applications in Hybrid Cloud Environments" (2020), dived deeper into HPC workload management with Kubernetes.

 
Our most recent publication, "Deploying Kubernetes based HPC Clusters in a Multi-Cloud Environment", discusses Kubernetes as an option for multi-cloud infrastructures. It highlights our recent experience of creating 3,000 Google Kubernetes Engine clusters that ran Abaqus simulations very cost-effectively, while the management infrastructure ran on Azure and SUSE Rancher served as our Kubernetes management platform.

Enjoy reading!