Papers

  • Dong H. Ahn, Ned Bass, Albert Chu, Jim Garlick, Mark Grondona, Stephen Herbein, Helgi I. Ingólfsson, Joseph Koning, Tapasya Patki, Thomas R.W. Scogland, Becky Springmeyer, Michela Taufer, “Flux: Overcoming Scheduling Challenges for Exascale Workflows,” Future Generation Computer Systems, Volume 110, 2020, Pages 202-213. [pre-print pdf] [journal link]

  • Stephen Herbein, David Domyancic, Paul Minner, Ignacio Laguna, Rafael Ferreira da Silva, Dong H. Ahn, “MCEM: Multi-Level Cooperative Exception Model for HPC Workflows,” 9th International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), Phoenix, AZ, June 2019. [paper] [slides]

  • Dong H. Ahn, Ned Bass, Albert Chu, Jim Garlick, Mark Grondona, Stephen Herbein, Joseph Koning, Tapasya Patki, Thomas R.W. Scogland, Becky Springmeyer, Michela Taufer, “Flux: Overcoming Scheduling Challenges for Exascale Workflows,” Workflows in Support of Large-Scale Science, in conjunction with the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC|18), Dallas, TX, November 2018. [pdf]

  • Samuel D. Pollard, Nikhil Jain, Stephen Herbein, Abhinav Bhatele, “Evaluation of an Interference-free Node Allocation Policy on Fat-tree Clusters,” International Conference for High Performance Computing, Networking, Storage, and Analysis, Dallas, TX, November 2018. [pdf]

  • Michael Wyatt, Stephen Herbein, Todd Gamblin, Adam Moody, Dong H. Ahn, Michela Taufer, “PRIONN: Predicting Runtime and IO using Neural Networks,” 47th International Conference on Parallel Processing, Eugene, OR, August 2018. [pdf]

  • Stephen Herbein, Dong H. Ahn, Don Lipari, Thomas R.W. Scogland, Marc Stearman, Mark Grondona, Jim Garlick, Becky Springmeyer, Michela Taufer, “Scalable I/O-Aware Job Scheduling for Burst Buffer Enabled HPC Clusters,” 25th International Symposium on High-Performance Parallel and Distributed Computing, Kyoto, Japan, June 2016. [pdf]

  • Dong H. Ahn, Jim Garlick, Mark Grondona, Don Lipari, Becky Springmeyer, Martin Schulz, “Flux: A Next-Generation Resource Management Framework for Large HPC Centers,” 10th International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, Minneapolis, MN, September 2014. [pdf]

  • Dong H. Ahn, Jim Garlick, Mark Grondona, Don Lipari, “Vision and Plan for a Next Generation Resource Manager,” LLNL DRAFT technical report, May 2013. [pdf]

Talks

  • Dong H. Ahn, Ned Bass, Al Chu, Jim Garlick, Mark Grondona, Stephen Herbein, Tapasya Patki, Tom Scogland, Becky Springmeyer, “Flux: Practical Job Scheduling,” Lawrence Livermore National Laboratory’s Computation’s Developer Day, Livermore, CA, August 2018. [pptx] [pdf]

Posters

  • Stephen Herbein, Tapasya Patki, Dong H. Ahn, Don Lipari, Tamara Dahlgren, David Domyancic, Michela Taufer, “Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads,” International Conference for High Performance Computing, Networking, Storage and Analysis, Denver, CO, November 2017. [poster] [abstract]

  • Stephen Herbein, Dong H. Ahn, Don Lipari, Thomas R.W. Scogland, Marc Stearman, Mark Grondona, Jim Garlick, Becky Springmeyer, Michela Taufer, “Scalable I/O-Aware Job Scheduling for Burst Buffer Enabled HPC Clusters,” Salishan Conference on High-Speed Computing, Gleneden Beach, OR, April 2016. [pdf]

  • Stephen Herbein, Dong H. Ahn, Don Lipari, Thomas R.W. Scogland, Kento Sato, Jim Garlick, Mark Grondona, Becky Springmeyer, Michela Taufer, “Exploring the Trade-off Space of Hierarchical Scheduling for Very Large HPC Centers,” International Conference for High Performance Computing, Networking, Storage and Analysis, Austin, TX, November 2015. [pdf]

Theses

  • Stephen Herbein, “Advanced Schedulers for Next-Generation HPC Systems,” Dissertation, Department of Computer & Information Sciences, University of Delaware, Newark, DE, August 2018.