resource manager

 

We were all accustomed to on-premise clusters managed through our favorite resource manager: Univa GridEngine, IBM LSF, Altair PBS Pro or another. But since the Cloud came along, our world has been changing.

For a few years, we ignored the Cloud. It was a good fit for enterprise workloads like CRM, ERP and office automation, “but NOT for HPC!” we exclaimed in unison. That didn’t last long.

The Cloud operators busted down our doors. At the Supercomputing 2017 Conference in Denver, it wasn’t only Amazon AWS and Microsoft Azure. Google Compute, Oracle Cloud, Penguin on Demand, IBM Softlayer and many others were there, too. The Cloud is here to stay and it will continue to change HPC.

If you’ve been sitting on the sidelines and not exploring the Cloud, you are not alone, and this is not your fault. You’ve been told the Cloud is complex, expensive and not secure, and that you need a whole new software stack to manage it. On the giant Supercomputing 2017 Exhibition floor, I asked my colleagues: "Does the Cloud mean that we have to learn how to do HPC all over again?"

The good news is that it isn’t as scary as you may think. Providers have addressed the cost and security concerns over the years. Yes, you need to do your budget math and yes, you need to decide how to secure your cloud. But the templates and tools are in place now.

You will not need to learn a new way of working because most popular resource managers are now Cloud aware. Univa GridEngine, IBM LSF and Altair PBS Pro all tie Cloud resources to on-premise resources to manage the Hybrid environment.

Univa Resource Manager

I attended a Univa lunch-and-learn in a quiet hotel conference room near the Supercomputing 2017 show floor. I learned that Univa® UniCloud® is the solution for organizations experiencing increasing volumes of workloads. UniCloud dynamically adjusts Cloud usage according to rules you define. It monitors workloads queuing up in your on-premise Univa Grid Engine® resource manager, then sends eligible workloads to your Cloud provider, such as Microsoft Azure.
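To make the idea of rule-driven bursting concrete, here is a minimal Python sketch. It is not UniCloud's actual interface or rule format; the `Job` class, the `select_jobs_to_burst` function, and both thresholds are hypothetical names chosen for illustration. It only shows the general pattern: look at the queue, apply a rule, and pick the eligible workloads to send to the Cloud.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """A queued workload (hypothetical model, not a real resource manager object)."""
    name: str
    wait_minutes: int     # how long the job has been waiting in the on-premise queue
    cloud_eligible: bool  # e.g. no license or data restrictions tie it to on-premise


def select_jobs_to_burst(queue, max_wait_minutes=30, max_cloud_jobs=2):
    """Apply an assumed rule: burst eligible jobs that have waited too long.

    max_cloud_jobs stands in for a budget cap on concurrent Cloud jobs.
    """
    candidates = [
        job for job in queue
        if job.cloud_eligible and job.wait_minutes > max_wait_minutes
    ]
    # Send the longest-waiting jobs first.
    candidates.sort(key=lambda job: job.wait_minutes, reverse=True)
    return candidates[:max_cloud_jobs]


queue = [
    Job("cfd_run", 45, True),
    Job("fea_run", 10, True),
    Job("licensed_solver", 90, False),
    Job("mesh_gen", 60, True),
]

print([job.name for job in select_jobs_to_burst(queue)])
# ['mesh_gen', 'cfd_run']
```

In this example `licensed_solver` has waited the longest but stays on-premise because it is not eligible, and `fea_run` has not waited long enough to justify Cloud spend.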

I noticed that this feature set is becoming quite common. IBM, Altair and others have built similar features. Plus, the list of Cloud providers the resource managers support is getting longer.

Why is Cloud support by resource managers important for HPC users?

Because resource managers tie HPC workloads together. This is how we abstract the complexity of HPC workloads.

Every engineer uses applications with a variety of compute requirements. Some applications require distributed memory via MPI, while others are single threaded. Workloads also differ in their hardware requirements. For example, some workloads run faster with the aid of a co-processor.

Managing hybrid infrastructures and assigning the most appropriate resources to each workload is what resource managers are very good at. A resource manager is also good at tying workloads together to form a workflow.
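The matching step can be sketched in a few lines of Python. This is an illustration only, not how any particular resource manager implements scheduling; `Workload`, `NodePool`, `pick_pool`, and the pool names are all hypothetical. The sketch assumes pools are listed in order of preference (for example, cheapest first) and that a non-MPI job must fit on a single node.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    cores: int
    needs_mpi: bool = False          # distributed memory across nodes
    needs_coprocessor: bool = False  # e.g. benefits from an accelerator


@dataclass
class NodePool:
    name: str
    cores_per_node: int
    has_interconnect: bool = False   # fast network suitable for MPI
    has_coprocessor: bool = False


def pick_pool(workload, pools):
    """Return the first pool that satisfies the workload, or None if none does."""
    for pool in pools:
        if workload.needs_coprocessor and not pool.has_coprocessor:
            continue
        if workload.needs_mpi and not pool.has_interconnect:
            continue
        # Without MPI the job cannot span nodes, so it must fit on one.
        if not workload.needs_mpi and workload.cores > pool.cores_per_node:
            continue
        return pool
    return None


# Pools in assumed order of preference: on-premise first, Cloud last.
pools = [
    NodePool("onprem_general", cores_per_node=16),
    NodePool("onprem_hpc", cores_per_node=32, has_interconnect=True),
    NodePool("cloud_accelerated", cores_per_node=8, has_coprocessor=True),
]

for workload in [
    Workload("post_process", cores=1),
    Workload("cfd_mpi", cores=128, needs_mpi=True),
    Workload("accelerated_solver", cores=4, needs_coprocessor=True),
]:
    pool = pick_pool(workload, pools)
    print(workload.name, "->", pool.name if pool else "no match")
```

A real resource manager weighs far more than this (queues, priorities, licenses, cost), but the principle is the same: the user states requirements, and the resource manager decides where the workload runs.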

Cloud is here. Popular resource managers are supporting the Cloud and making it possible to mix on-premise compute resources and Cloud resources to achieve a Hybrid infrastructure. Your turn now.


Posted by: Burak Yenier

Burak is a regular speaker about High Performance Computing, Cloud and Software Containers. He is an expert in large-scale, high availability systems and cloud. As an early SaaS proponent, Burak's management experience spans software development and operations. His most recent role was as the Vice President of Operations of a Silicon Valley SaaS company in banking. Burak built the company's cloud infrastructure and operations from scratch and for scale. He also managed all the data centers and the digital payment operations. Burak simplifies the lives of engineers with powerful, easy to use compute environments in the Cloud.