Building an HPC app with Singularity and Cloud Build

Dr. Joseph Schoonover

Last updated: March 23, 2020

How we containerized our HPC application

The goal of containerization, for most development teams, is to accelerate a continuous integration/continuous delivery or deployment (CI/CD) pipeline. High Performance Computing (HPC) applications present challenges for implementing a CI/CD pipeline that web service applications typically don’t encounter. These challenges include long dependency build times, container runtimes other than Docker (such as Singularity), and the need for specialized hardware, like GPUs, to run tests.

This article is the beginning of a short series, CI/CD in the cloud for HPC applications. In this article, we’ll share how we leveraged Google Cloud Platform (GCP) to automatically build Singularity images for a GPU-accelerated HPC application. In a future article, we’ll show how we automate testing of our application using GCP Compute Engine resources.

If you're interested in leveraging Google Cloud's built-in container registry and Docker, rather than Singularity, check out our article on Building HPC apps with Docker and Cloud Build.

Our HPC Application: SELF-Fluids

SELF-Fluids is a Fortran application that solves the 3-D compressible Navier-Stokes equations using a Discontinuous Galerkin Spectral Element Method (learn more about SELF-Fluids). The code depends on HDF5 and METIS. To build with GPU acceleration, we need to use the PGI compilers.

Building HDF5 can take anywhere from 30 minutes to an hour, whereas building SELF-Fluids takes less than one minute. Because of this, we opt to build a “dependency image” that contains all of SELF-Fluids’ dependencies. Once the dependency image is built, it is used to build SELF-Fluids.

How to use Cloud Build with an HPC app

Google Cloud Build is a service on GCP that builds container images in a private, secure virtual machine. Our goal is to use Cloud Build to create a Singularity image with our application installed. From a high level, the process for achieving this is as follows:

1. Create a GCP project and enable the Cloud Build API.
2. Upload the compilers and dependency source code to a Google Cloud Storage bucket.
3. Create a custom Singularity build step for Cloud Build.
4. Build a dependency image that contains the compilers and prerequisite libraries.
5. Build the application image from a Singularity definition file that starts from the dependency image.

All of these steps are well documented in Google Cloud’s documentation and in the community tutorials. However, the subtle nuances of using Cloud Build for HPC apps with dependencies require extra development in steps 4 and 5.


Creating a dependency image

A container image that contains all of the prerequisites can reduce the amount of time it takes to build your application. It’s possible that a container registry like Docker Hub or Singularity Hub already hosts container images with your software’s dependencies. In this article, we’ll show how we build a dependency container image for our application.


As mentioned earlier, our HPC application depends on HDF5 and METIS and requires the PGI compilers for GPU acceleration. To start, we download the PGI community edition compilers and HDF5 source code. We then upload both to a new Google Cloud Storage (GCS) Bucket called gs://self-fluids_dependencies.
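As a rough sketch, the upload might look something like the commands below; the tarball file names are assumptions and will depend on the versions you download.

$ gsutil mb gs://self-fluids_dependencies
$ gsutil cp pgilinux-x86-64.tar.gz hdf5.tar.gz gs://self-fluids_dependencies/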


From here, we set up a Dockerfile that is used to install PGI, HDF5, and METIS. Additionally, Google Cloud Build instructions are written, in YAML format, to build an image from the Dockerfile. This image, called the “dependency image”, is stored in our GCP project’s container registry after it is built.


The cloudbuild.yaml file copies the HDF5, METIS, and PGI tarballs from our GCS bucket and makes them available for use when running docker build. Once the files are copied, the Docker build step builds the dependency image. Notice that the output Docker image is named gcr.io/${PROJECT_ID}/pgi:latest, which places the image in Google Container Registry (GCR).
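A minimal sketch of what this cloudbuild.yaml might look like is shown below; the exact arguments, tarball naming, and any additional steps are assumptions.

steps:
# Copy the compiler and library tarballs from GCS into the build workspace
- name: 'gcr.io/cloud-builders/gsutil'
  args: ['cp', 'gs://self-fluids_dependencies/*.tar.gz', '.']
# Build the dependency image from the Dockerfile in this directory
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/${PROJECT_ID}/pgi:latest', '.']
# Push the built image to Google Container Registry after the build completes
images: ['gcr.io/${PROJECT_ID}/pgi:latest']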


With the Dockerfile and cloudbuild.yaml files defined, we use gcloud builds to build the PGI Docker container:

$ gcloud builds submit .

At this point, it’s worth pointing out that careful directory organization pays off when setting up this build infrastructure. We chose to create a subdirectory called cloudbuild/ that has two subdirectories, pgi/ and singularity/. The singularity/ directory contains a Dockerfile and cloudbuild.yaml for building the custom Singularity build step. The pgi/ directory contains a Dockerfile and cloudbuild.yaml for building the dependency image, along with self-fluids.def, a Singularity definition file for building our code.
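Concretely, the layout looks like this:

cloudbuild/
├── pgi/
│   ├── Dockerfile
│   ├── cloudbuild.yaml
│   └── self-fluids.def
└── singularity/
    ├── Dockerfile
    └── cloudbuild.yaml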


We opted for this directory structure with the idea that we’ll eventually be testing builds of self-fluids with other compilers and multiple MPI flavors. Down the road, there will be pgi-mvapich, pgi-openmpi, gcc, etc. In general, the directory structure follows the format cloudbuild/${_BUILD_BASE}/. Underneath this directory, there must be a Dockerfile and cloudbuild.yaml for building the dependency image and a self-fluids.def file for building self-fluids with the dependency image.


Building our application with Cloud Build

Now that we have a dependency image, we’re ready to build our HPC application. To build SELF-Fluids, we add a Singularity definition file under the cloudbuild/pgi/ subdirectory, called self-fluids.def. This definition file uses the docker bootstrap and starts from the dependency image created in the previous step.
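The header of self-fluids.def might look roughly like this; the project ID in the image path is a placeholder.

Bootstrap: docker
From: gcr.io/<project-id>/pgi:latest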


When working with Cloud Build, source code is copied into a directory, /workspace, on a virtual machine on GCP. Because of this, the Singularity container must copy in the /workspace directory in the %files section of the definition file.
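In the definition file, that can be as simple as the sketch below, which copies the host workspace to the same path inside the container (the destination path is our choice here, not a requirement):

%files
    /workspace /workspace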


The %post section contains the instructions for building self-fluids. In this case, self-fluids uses the configure, make, make install pipeline. During the configure stage, we build with GPU acceleration by using a custom configure flag called --enable-cuda. This flag is specific to the SELF-Fluids build system.
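A minimal %post sketch, assuming the source tree sits in /workspace as above and that the standard configure/make/make install sequence works unmodified:

%post
    # Build and install SELF-Fluids with GPU acceleration enabled
    cd /workspace
    ./configure --enable-cuda
    make
    make install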


Finally, the %test section defines a test that executes sfluid (the SELF-Fluids binary) with a provided set of example input files. To test the code, we will launch a Google Compute Engine instance with a GPU attached. To execute the test, we can run

$ singularity test self-fluids_${_BUILD_BASE}.sif

provided the host VM has Singularity and the NVIDIA drivers installed.
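For reference, the %test section might look something like the sketch below; the example input directory is hypothetical and depends on how the example files are packaged in the image.

%test
    # Run the sfluid binary on one of the provided example inputs (path is hypothetical)
    cd /workspace/examples/<example-case>
    sfluid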

With the Singularity definition file in place, I then created a cloudbuild file in the root directory of the repository. This cloudbuild file uses the custom Singularity build step to build the self-fluids image from the Singularity definition file.

Since the resulting image is not a Docker image, it cannot be stored in Google Container Registry. Instead, the Singularity image is treated as an artifact of the build. The Cloud Build API provides a method for saving artifacts to GCS buckets, so I created another GCS bucket, called gs://self-fluids-singularity, for saving the Singularity images. The images are stored in subdirectories according to the branch they are built from; the directory structure in the bucket is gs://self-fluids-singularity/builds/${BRANCH_NAME}.
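Putting those pieces together, the root cloudbuild file might look roughly like the sketch below. The name of the custom Singularity build step is an assumption based on how it was tagged when it was built from cloudbuild/singularity/.

steps:
# Build the Singularity image from the definition file using the custom Singularity build step
- name: 'gcr.io/${PROJECT_ID}/singularity'
  args: ['build', 'self-fluids_${_BUILD_BASE}.sif', 'cloudbuild/${_BUILD_BASE}/self-fluids.def']
substitutions:
  _BUILD_BASE: 'pgi'
# Save the resulting image as a build artifact in GCS, organized by branch
artifacts:
  objects:
    location: 'gs://self-fluids-singularity/builds/${BRANCH_NAME}'
    paths: ['self-fluids_${_BUILD_BASE}.sif']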

Now that we have the main cloudbuild file in place, we can build the singularity image of our code from the root directory of our repository.

$ gcloud builds submit .

Summary

With this setup in place, our team now has a reproducible means of building our HPC application. We can add value to this setup by integrating build triggers for automatic build testing. If you have code in Bitbucket, GitHub, or Google Source Repositories, you can quickly set up build triggers. If you are working with GitLab, an easy solution is to copy your repository to Google Source Repositories and set up GSR as a second, push-only remote, as shown below. From there, we defer to the Google Cloud documentation on setting up build triggers.
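The GSR mirroring can be as simple as adding the Cloud Source Repository as an extra remote and pushing your development branch to it; the project and repository names below are placeholders.

$ git remote add google https://source.developers.google.com/p/<project-id>/r/<repo-name>
$ git push google develop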


For self-fluids, I set up a build trigger to automatically build a Singularity image when a commit is made to the develop branch of the repository. When such a commit is made, the pgi Docker image is pulled from the container registry and used to build self-fluids from the latest source code on the develop branch. After a successful build, the Singularity image is stored in the self-fluids-singularity GCS bucket.


What’s next

Now that we’re at a place where our application builds automatically, we would like to set up automatic testing of our HPC application. In the next article in this series, CI/CD in the cloud for HPC applications, I’ll share the next steps for testing a GPU-accelerated HPC application on Google Cloud Platform.