awesome-hpc

HPC suite

A collection of tools and resources for building high-performance computing systems

A collection of Awesome HPC software and tools

Awesome HPC / Provisioning

Grendel Bare metal provisioning system for HPC Linux clusters
xCAT xCAT is a toolkit for deployment and administration of clusters of all sizes
Warewulf Warewulf is a stateless and diskless container operating system provisioning system for large clusters of bare metal and/or virtual systems
Rocks A Linux distribution for developing Linux clusters
Cobbler Cobbler is a Linux installation server that allows for rapid setup of network installation environments
Base Command Manager Base Command Manager allows administrators to quickly build and manage heterogeneous clusters
Scyld ClusterWare Scyld ClusterWare is developed based on the continuing evolution of Beowulf clusters, first developed at NASA in the 1990s
BlueBanquise BlueBanquise is an open source cluster deployment and management stack built on Python and Ansible

Awesome HPC / Workload Managers

Slurm A free and open source job scheduler
LSF A job scheduler and workload management software developed by IBM
Moab Moab is a workload management and job scheduler
Torque Torque is a workload management and job scheduler
OpenLava OpenLava is a workload management and job scheduler
UGE/SGE Univa Grid Engine is a workload management engine for HPC
Volcano Volcano is a batch system built on Kubernetes
Maui Maui is a workload management and job scheduler
Kube Batch A batch scheduler for Kubernetes aimed at high-performance workloads, e.g. AI/ML, big data, HPC
OpenPBS OpenPBS® software optimizes job scheduling and workload management in high-performance computing (HPC) environments

Awesome HPC / Pipelines

Nextflow Data-driven computational pipelines
Cromwell Scientific workflow engine designed for simplicity & scalability
Pegasus A configurable system for mapping and executing scientific workflows over a wide range of computational infrastructure

Awesome HPC / Applications

Spack A flexible package manager that supports multiple versions, configurations, platforms, and compilers
EasyBuild EasyBuild - building software with ease

Awesome HPC / Compilers

NVIDIA NVIDIA HPC compiler suite for Fortran and C/C++ with OpenACC
Portland Group The Portland Group compilers were Fortran and C/C++ compilers, now integrated into the NVIDIA HPC SDK
Intel The Intel compiler suite offers many language compilers for use in the HPC space
Cray A suite of compilers designed and optimized to target the AMD Interlagos instruction set
GNU The GNU Compiler Collection is a suite of compilers targeting many languages
LLVM The LLVM project is a collection of modular compilers and toolchains

Awesome HPC / MPI

OpenMPI OpenMPI is an open source implementation of the MPI-3.1 standard
MPICH MPICH is a high-performance and widely portable implementation of the MPI-3.1 standard
MVAPICH MVAPICH is an open source implementation of the MPI-3.1 standard developed by Ohio State University
Intel-MPI Intel-MPI is Intel's MPI-3.1 implementation included in their compiler suite

Awesome HPC / Parallel Computing

ArrayFire A general purpose tensor library that simplifies the process of software development for parallel architectures
OpenMP OpenMP is an application programming interface that supports multi-platform shared-memory multiprocessing programming

Awesome HPC / Benchmarking

OSU Benchmarks A collection of benchmarking tools for MPI developed by Ohio State University
Intel MPI Benchmarks A set of benchmarks developed by Intel for use with their Intel MPI
HPCC Systems HPCC Systems (High Performance Computing Cluster) is an open source, massively parallel computing platform for big data processing and analytics
LINPACK LINPACK is a set of efficient Fortran subroutines for solving linear systems; benchmarks based on it are widely used in HPC
IOzone IOzone is a filesystem benchmark tool
IOR Interleaved or Random is a useful benchmarking tool for testing parallel filesystems
MDtest MDtest is an MPI-based application for evaluating the metadata performance of a file system
FIO Flexible I/O is an advanced disk benchmark that depends upon the kernel's AIO access library
elbencho A distributed storage benchmark for files, objects & blocks with support for GPUs

Awesome HPC / Miscellaneous

OpenOnDemand Open OnDemand helps computational researchers and students efficiently utilize remote computing resources by making them easy to access from any device
Open XDMoD Open XDMoD is an open source tool to facilitate the management of high performance computing resources
ColdFront ColdFront is an open source resource allocation system designed to provide a central portal for administration, reporting, and measuring scientific impact of HPC resources
Pavilion2 Pavilion is a Python 3 (3.6+) based framework for running and analyzing tests targeting HPC systems
ReFrame A powerful Python framework for writing and running portable regression tests and benchmarks for HPC systems
OLCF Test Harness The OLCF Test Harness (OTH) helps automate the testing of applications, tools, and other system software
GoSlmailer Goslmailer is a drop-in notification delivery solution for Slurm that can deliver to Slack, Mattermost, Teams, and more

Awesome HPC / Performance

TotalView TotalView is a debugging tool for HPC applications
TAU TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, Python
Valgrind Valgrind is a tool designed to profile programs and detect memory leaks
Paraver Paraver is a very flexible data browser that is part of the CEPBA-Tools toolkit
PAPI Performance Application Programming Interface (PAPI) is a performance analysis tool

Awesome HPC / Parallel Shells

pdsh pdsh runs terminal commands across multiple hosts in parallel
ClusterShell Scalable cluster administration Python framework

Awesome HPC / Containers

Apptainer Apptainer is an open source container system
Charliecloud Charliecloud provides user-defined software stacks (UDSS) for high-performance computing (HPC) centers
Docker Docker is a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers
uDocker A basic user tool to execute simple Docker containers in batch or interactive systems without root privileges
Shifter Shifter brings Linux containers to HPC
HPC Container Maker HPC Container Maker is an open source tool that makes it easier to generate container specification files
Sarus An OCI-compatible container engine for HPC
Singularity HPC Singularity Registry HPC (shpc) allows you to install containers as modules

Awesome HPC / Environment Management

Lmod Lmod: An Environment Module System based on Lua, Reads TCL Modules, Supports a Software Hierarchy
Environment Modules Environment Modules: provides dynamic modification of a user's environment
Anaconda Anaconda is a Python and R distribution for use in computational science
Mamba Mamba is a reimplementation of the conda package manager in C++

Awesome HPC / Visualization

VisIt VisIt - Visualization and Data Analysis for Mesh-based Scientific Data
ParaView ParaView is an open-source, multi-platform data analysis and visualization application based on the Visualization Toolkit (VTK)

Awesome HPC / Parallel Filesystems

GPFS GPFS is a high-performance clustered file system software developed by IBM
Quobyte A high performance filesystem
Ceph Ceph is a distributed object, block, and file storage platform
Weka A file system designed for HPC
Lustre/EXAScaler Lustre is an open-source, distributed parallel file system software platform designed for scalability, high performance, and high availability
BeeGFS BeeGFS is a hardware-independent POSIX parallel file system developed with a strong focus on performance and designed for ease of use, simple installation, and management
OrangeFS OrangeFS is a next generation parallel file system for Linux clusters
MooseFS Moose File System is an open-source, POSIX-compliant distributed file system developed by Core Technology

Awesome HPC / Programming Languages

Julia Julia is a high-level, high-performance dynamic language for technical computing
Futhark Futhark is a purely functional data-parallel programming language in the ML family
Chapel Chapel is a programming language designed for productive parallel computing at scale

Awesome HPC / Monitoring / Prometheus Based

Slurm Exporter Prometheus exporter for performance metrics from Slurm
Slurm Exporter Slurm Exporter for Prometheus using Rest API
Infiniband Exporter The InfiniBand exporter collects counters from InfiniBand switches and HCAs
Cgroup Exporter Produces metrics from cgroups
Cgroup Exporter A Prometheus exporter for cgroup-level metrics
GPFS Exporter The GPFS exporter collects metrics from the GPFS filesystem
Lustre Exporter Prometheus exporter for use with the Lustre parallel filesystem
DCGM Exporter NVIDIA GPU metrics exporter for Prometheus leveraging DCGM

Awesome HPC / Journals

The Journal of Supercomputing An International Journal of High-Performance Computer Design, Analysis, and Use

Awesome HPC / Podcasts

This Week in HPC Each week, Intersect360 Research CEO Addison Snell and HPCwire editor Tiffany Trader dissect the week's top HPC stories
Exascale Project ECP's Let's Talk Exascale podcast goes behind the scenes to chat with some of the people who are bringing a capable and sustainable exascale computing ecosystem to fruition
@HPCpodcast Join Shahin Khan and Doug Black as they discuss Supercomputing technologies and the applications, markets, and policies that shape them

Awesome HPC / Blogs

HPCWire Since 1987 covering the fastest computers in the world and the people who run them
InsideHPC insideHPC is a global publication recognized for its comprehensive and insightful coverage of the HPC-AI community, linking vendors, end-users and HPC strategists
The Next Platform Offers in-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds
The Register HPC The Register is a leading and trusted global online enterprise technology news publication, reaching roughly 40 million readers worldwide
HPC at Dell High-Performance Computing knowledge base articles from Dell

Awesome HPC / Conferences

PEARC Practice & Experience in Advanced Research Computing
Supercomputing (SC) The International Conference for High Performance Computing, Networking, Storage, and Analysis
International Supercomputing Conference (ISC) ISC High Performance, an international conference on high performance computing held annually in Germany
CCGrid IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing
IEEE-HPEC IEEE High Performance Embedded Computing
Hot Chips Semiconductor industry's leading conference on high-performance microprocessors and related circuits
Hot Interconnects IEEE conference on software architectures and implementations for interconnection networks of all scales
ESSA Workshop on Extreme-Scale Storage and Analysis
IEEE-IPDPS IEEE International Parallel & Distributed Processing Symposium
ESPM2 Workshop International Workshop on Extreme Scale Programming Models and Middleware
LCI Workshops The Linux Clusters Institute (LCI) provides education and advanced technical training in the deployment and use of computing clusters to the high performance computing community worldwide
HPC Carpentry Teaching basic skills for high-performance computing

Awesome HPC / Websites

Top500 The TOP500 project ranks and details the 500 most powerful non-distributed computer systems in the world

Awesome HPC / User Groups

MVAPICH The MUG conference provides an open forum for all attendees (users, system administrators, researchers, engineers, and students) to discuss and share their knowledge on using MVAPICH libraries
Slurm The annual Slurm user group meeting
