Building / Running Cardinal on HPC Systems
Cardinal has several dependencies, so compiling it on HPC systems can be a daunting task. This page contains common resources and tips to streamline the process of getting Cardinal built and running on HPC systems.
Resources and Tips for Building
Almost all HPC systems use Lmod, a Lua-based environment management system. Whenever you load a module with Lmod, the binary location for that program or library is added to your system `PATH`; when you remove a module, its location is removed from your `PATH`. You'll use Lmod to load the modules you need to build Cardinal - some helpful commands include:
- `module list MOD_NAME`: shows any currently loaded modules whose names include `MOD_NAME`. If you don't specify a module name, it shows all currently loaded modules.
- `module spider MOD_NAME`: shows all available modules whose names include `MOD_NAME`. If you don't specify a module name, it will show every available module on the HPC (not recommended!).
- `module load MOD_NAME1 MOD_NAME2`: loads the modules named `MOD_NAME1` and `MOD_NAME2` (makes them available to your shell session).
- `module reset`: restores your shell session to the HPC system's default modules.
- `module purge`: unloads all currently loaded modules. Some modules are mandatory, and Lmod will refuse to unload them.
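As an example, a typical session for finding and loading a compiler toolchain might look like the following (the module names and versions are purely illustrative and will differ on your system):

```
module reset                                  # start from the system default modules
module spider gcc                             # see which GCC versions are available
module load gcc/12.2.0 openmpi cmake python   # load a candidate toolchain (example names)
module list                                   # confirm what is now loaded
```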
Some modules have dependencies and will fail to load if those dependencies have not been loaded first. Some modules may be a combination of other modules (such as `use.moose` and `moose-dev` at INL), and so may satisfy multiple requirements for building Cardinal. Before you start loading modules, it's best to consult the minimum requirements of MOOSE, NekRS, OpenMC, and DAGMC:
- MOOSE: https://mooseframework.inl.gov/getting_started/installation/hpc_install_moose.html
- NekRS: https://github.com/Nek5000/nekRS?tab=readme-ov-file#build-instructions
- OpenMC: https://docs.openmc.org/en/stable/usersguide/install.html#prerequisites
- DAGMC: https://svalinn.github.io/DAGMC/install/dependencies.html
Of the above requirements, MOOSE's are the strictest; they are reproduced below.
Minimum System Requirements
In general, the following is required for MOOSE-based development:
A POSIX-compliant Unix-like operating system. This includes any modern Linux-based operating system (e.g., Ubuntu, Fedora, Rocky), or a Macintosh machine running either of the last two macOS releases.
| Hardware | Information |
| --- | --- |
| CPU Architecture | x86_64, ARM (Apple Silicon) |
| Memory | 8 GB (16 GB for debug compilation) |
| Disk Space | 30 GB |

| Libraries | Version / Information |
| --- | --- |
| GCC | 8.5.0 - 12.2.1 |
| LLVM/Clang | 10.0.1 - 16.0.6 |
| Intel (ICC/ICX) | Not supported at this time |
| Python | 3.9 - 3.11 |
| Python Packages | `packaging`, `pyaml`, `jinja2` |
From here, you will need to find a set of modules that satisfy these minimum requirements. This tends to involve a substantial amount of trial and error and depends heavily on what each HPC system offers. It's best to consult the documentation for your specific HPC system to see which modules it recommends for the specific requirements of the MOOSE software stack. If your system does not provide the modules necessary to build Cardinal, you can contact the HPC's administrators, who may be able to install modules for you.
Do not use the MOOSE conda environment on HPC systems! You may be able to install the environment on a login node, but compute nodes often cannot access or activate the environment when you submit a job.
Be incredibly careful when loading Python modules on HPC systems, especially modules that contain a version of Anaconda. Anaconda often comes with its own libraries that conflict with the libraries that PETSc/libMesh/WASP/MOOSE build against, which causes linker errors at the end of the build. Python dependencies installed with pip (a Python package manager) are often not available on compute nodes, so the only packages that can safely be installed with pip (without risking runtime errors) are `packaging` and `pyaml`.
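If you do need to install those two packages yourself, a user-level pip install is usually sufficient (this assumes a Python module is already loaded):

```
# Install only the MOOSE build-time Python dependencies into your user site-packages
python -m pip install --user packaging pyaml
```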
Some HPC systems require that users also apply for specific file system allocations per project. If that is the case for your system, make sure that you `cd` into the project directory before you clone/compile Cardinal, as your home directory may not be available from a compute node. In this tutorial we assume that you can build/run Cardinal in your home directory, but the steps/tips apply to builds done on other file systems.
Once you've loaded a set of modules that you believe will compile Cardinal, you need to set up a collection of environment variables. Assuming your Cardinal installation is located in a folder called `projects` in your home/project directory, they are:
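A minimal sketch of what these exports might look like is shown below. The exact list depends on your Cardinal version and system, so treat this as illustrative and consult Cardinal's getting-started documentation for the authoritative set; the HDF5 path in particular is a placeholder.

```
# MPI compiler wrappers used when building PETSc, libMesh, NekRS, and OpenMC
export CC=mpicc
export CXX=mpicxx
export FC=mpif90

# Location where Cardinal installs NekRS (inside the Cardinal repository)
export NEKRS_HOME=$HOME/projects/cardinal/install

# If your HDF5 module does not set this for you, point OpenMC at the HDF5 install
export HDF5_ROOT=/path/to/your/hdf5    # placeholder path
```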
You can put these exports in your `.bashrc` to avoid needing to export these environment variables every time you spin up a new shell; however, this may lead to environment conflicts if you need to compile a different piece of software. Best practice dictates that you export the modules you need once while building, then reset/purge them when you're done. If you want to put module loading in your `.bashrc` on an HPC system with a shared file system, make sure you do so inside a guarding `if` statement to avoid the same modules getting loaded on a different HPC:
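For example, a guard keyed on the machine's hostname might look like the following (the hostname pattern and module names are placeholders for whatever your system uses):

```
# Only load these modules when logged into a specific HPC system
# (replace "mycluster" and the module names with your system's values)
if [[ "$(hostname -f)" == *mycluster* ]]; then
    module load use.moose moose-dev
fi
```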
Once these variables have been exported and you've loaded a set of modules that you believe will compile Cardinal, you can run MOOSE's diagnostics to make sure you haven't missed anything.
If critical errors are reported at this stage, it is likely because your current module set does not meet MOOSE's requirements. If you receive an error stating that `jinja2` is missing, or that you are building without a MOOSE conda environment, you can safely move on.
If you want to build NekRS with GPU support, you need GPU-enabled compilers (CUDA compilers, OpenCL compilers, etc.) on the login node. Some HPC systems only allow users to load those modules on nodes which contain GPUs; if that is the case for you, you'll need to build with a job script (see the next section). To enable a GPU build, set one of the following variables in Cardinal's makefile to `1` (depending on the specific GPUs on your system): `OCCA_CUDA_ENABLED`, `OCCA_HIP_ENABLED`, or `OCCA_OPENCL_ENABLED`. You'll also need to make sure you load modules with GPU compilers.
From here, you can run the commands below to build MOOSE's submodules. We recommend building with `nohup` or `screen` to avoid getting timed out while you're connected to a login node.
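A sketch of what this might look like, assuming MOOSE is checked out as a submodule under `contrib/moose` (as in a standard Cardinal clone) and that Cardinal lives in `~/projects/cardinal`:

```
# Build PETSc, libMesh, and WASP in sequence, keeping a log of each step.
# nohup keeps the build alive if your SSH session to the login node drops.
cd ~/projects/cardinal
nohup bash -c './contrib/moose/scripts/update_and_rebuild_petsc.sh   > petsc.log   2>&1 && \
               ./contrib/moose/scripts/update_and_rebuild_libmesh.sh > libmesh.log 2>&1 && \
               ./contrib/moose/scripts/update_and_rebuild_wasp.sh    > wasp.log    2>&1' &
```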
Building libMesh may fail due to a lack of libraries that are normally included by default on Linux machines. The most common culprit is `libxdr`, which is required for XDR binary file output in libMesh/MOOSE. The vast majority of the time you'll use Exodus binary output when running Cardinal, so XDR can be disabled by adding two flags to `update_and_rebuild_libmesh.sh`.
There are many similar flags like this that can disable parts of libMesh or ask libMesh to install dependencies it couldn't find itself. If you encounter persistent errors, ask on the MOOSE discussion page for help, and they may be able to recommend a flag for you to set. You can also see all of the available flags when building libMesh by running `./contrib/moose/libmesh/configure --help`.
If you didn't get any build errors when building PETSc/libMesh/WASP, you can build MOOSE, Cardinal, and all of Cardinal's dependencies by running the top-level `make` sketched below.
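A sketch of that command (the `-j` value is just an example; pick something appropriate for your login node):

```
# From the root of the Cardinal repository, with the modules and environment
# variables described above still in place
cd ~/projects/cardinal
make -j8
```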
Occasionally the MOOSE Solid Mechanics module will fail to build due to missing F77 compilers. If that is the case, you can either find `mpif77` compilers in a different module set or disable the Solid Mechanics module by setting `SOLID_MECHANICS := no` in Cardinal's makefile. The only MOOSE module you absolutely must have enabled is the `REACTOR` module (some of Cardinal's source files link against utilities in that module); otherwise, feel free to disable all others.
If you got an executable after following this guide (`cardinal-opt` or `cardinal-dbg`), congratulations! You have successfully built Cardinal on a new HPC system.
Building on a Compute Node
Some GPU systems will force you to build on a compute node to use GPU-specific compilers. This means you'll need to add all of the commands to a job script to be executed; an example of what this looks like can be found below:
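A sketch of such a build script for a PBS-based system is shown below. The job name, resource request, queue, allocation, and module names are all placeholders; substitute whatever your system requires.

```
#!/bin/bash
#PBS -N cardinal_build                 # job name
#PBS -l select=1:ncpus=32:ngpus=1      # one node; core/GPU counts are examples
#PBS -l walltime=04:00:00              # builds can take a few hours
#PBS -q build_queue                    # placeholder queue name
#PBS -A my_project                     # placeholder project/allocation name
#PBS -j oe                             # merge stdout and stderr

# Load the same modules you identified on the login node (names are placeholders)
module reset
module load gcc/12.2.0 openmpi cmake python cuda

# Environment variables described earlier on this page
export CC=mpicc CXX=mpicxx FC=mpif90
export NEKRS_HOME=$HOME/projects/cardinal/install

cd $HOME/projects/cardinal

# Dump each component's output to its own logfile so errors are easy to find
./contrib/moose/scripts/update_and_rebuild_petsc.sh   > petsc.log    2>&1
./contrib/moose/scripts/update_and_rebuild_libmesh.sh > libmesh.log  2>&1
./contrib/moose/scripts/update_and_rebuild_wasp.sh    > wasp.log     2>&1
make -j32                                             > cardinal.log 2>&1
```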
You'd then submit your build job script to the HPC queue (for PBS the command is `qsub job_script_name`). We dump the output of each component of the build to a different logfile; this allows us to follow the build in real time and makes it easier to parse for errors.
In addition to building from inside a job script, you can also build in interactive mode. This is equivalent to submitting a job with no script and an interactive flag, which will open a shell terminal on a compute node after waiting in the queue. For PBS, the command sketched below requests an interactive job with `PROC` cores in the compute queue for one hour. After getting your interactive shell, you can build Cardinal as if you were on a login node using the instructions in the previous section.
Running Cardinal on HPC Systems
The next step to getting Cardinal up and running on an HPC system is to write a job script to be submitted to the HPC queue. This will be quite similar to the job script listed in Building on a Compute Node - an annotated sample can be found below:
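A sketch of such a run script is shown below; the node counts, queue, allocation, module names, and input file name are placeholders, and the `mpirun` binding options echo the performance tips that follow.

```
#!/bin/bash
#PBS -N cardinal_run                        # job name
#PBS -l select=10:ncpus=128:mpiprocs=8      # 10 nodes, 128 hardware threads and 8 MPI ranks each (example)
#PBS -l walltime=02:00:00
#PBS -q compute                             # placeholder queue name
#PBS -A my_project                          # placeholder allocation name
#PBS -j oe

# Load the same modules used to build Cardinal (names are placeholders)
module reset
module load gcc/12.2.0 openmpi cmake python

export NEKRS_HOME=$HOME/projects/cardinal/install
CARDINAL_DIR=$HOME/projects/cardinal
input_file=openmc_nek.i                     # your Cardinal input file

cd $PBS_O_WORKDIR

# One MPI rank per NUMA node, with OpenMP threads filling each NUMA node
mpirun -np 80 --bind-to numa --map-by numa \
    $CARDINAL_DIR/cardinal-opt -i $input_file --num-threads=16
```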
Building Cardinal on an HPC system is often the main challenge; getting Cardinal to run is substantially less difficult. An auxiliary challenge is running your Cardinal simulations with optimal performance, which is highly dependent on both the Cardinal capabilities you're using (NekRS, OpenMC, or both) and the architecture of the HPC system. Here are some general tips for running Cardinal with reasonable performance:
- Use all of the cores on the nodes requested to avoid wasting CPU-hours.
- If running exclusively with OpenMC, use shared memory (OpenMP) parallelism as much as possible. Groups of cores on a node are bunched together into so-called NUMA nodes (not to be confused with the compute node itself), which all access the same memory space. NUMA nodes are also usually bound to physical communication hardware, so significant performance gains can be obtained by binding one MPI rank to each NUMA node (per compute node) and filling the rest of the hardware with OpenMP threads. As an example, for a 10 compute node job on an HPC system with 8 NUMA nodes and 128 hardware threads per compute node, you would use
  `mpirun -np 80 --bind-to numa --map-by numa $CARDINAL_DIR/cardinal-opt -i $input_file --num-threads=16`
- If running exclusively with NekRS, consult the scripts directory of the NekRS repository. It contains job scripts for several performance-optimized, GPU-enabled HPC systems which can serve as a template for writing your own. Note that you will need to swap the `nekrs` executables with `cardinal-opt` to use these.
- If running a coupled OpenMC-NekRS simulation, optimize performance for NekRS.
  - If NekRS is running on GPUs, you can get some OpenMC performance back by increasing the number of OpenMP threads. This is similar to binding by NUMA nodes, except NekRS requires that MPI ranks be bound such that each rank represents a GPU (a sketch of this case is given after this list).
  - If NekRS is running on CPUs, you must instead increase the number of MPI ranks, as NekRS doesn't support shared memory parallelism.
- Finally, if you find that your simulations are running out of memory, you may need to decrease the number of MPI ranks, as the memory consumption per node scales (roughly) linearly with the value of `-np` provided.
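As an illustration of the GPU case (the node, GPU, and core counts are hypothetical, and the exact GPU-binding mechanism depends on your MPI launcher and system):

```
# 10 nodes with 4 GPUs per node -> 40 MPI ranks (one per GPU), with the remaining
# cores given to OpenMC as OpenMP threads (e.g. 32 cores per node -> 8 threads/rank).
# GPU binding itself is usually handled by system-specific launcher flags or wrapper scripts.
mpirun -np 40 $CARDINAL_DIR/cardinal-opt -i $input_file --num-threads=8
```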