flux-sched v0.8.0
Published: Dec 6, 2019 by flux-framework
Download from GitHub here.
Release Notes
Summary:
This version of flux-sched integrates our new graph-based scheduler with the new execution system in flux-core. To that end, it introduces the qmanager module, which enforces queuing and backfilling policies on batch jobs and serves as a conduit between our resource matching service and flux-core’s job-manager. It also adds a new resource match policy that treats the performance class of each compute node as a scheduling constraint, to handle the natural performance variation of modern microprocessors. Finally, this version lays some of the groundwork that will ultimately allow flux-sched to reconstruct its state from job-manager’s queue state for resiliency.
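In the general batch-scheduling literature, EASY backfilling reserves a start time only for the blocked job at the head of the queue, conservative backfilling reserves one for every blocked job, and a hybrid policy sits in between. The sketch below illustrates that distinction as a "reservation depth". It is not qmanager code: the function names and the `hybrid_depth` default are hypothetical, and it only assumes the three policy names listed in these notes.

```python
from typing import List, Optional


def reservation_depth(policy: str, hybrid_depth: int = 4) -> Optional[int]:
    """Return how many blocked jobs receive a start-time reservation.

    None means "no limit". The hybrid_depth default is an illustrative
    assumption, not a flux-sched setting.
    """
    name = policy.upper()
    if name == "EASY":
        return 1              # only the head of the queue is protected
    if name == "HYBRID":
        return hybrid_depth   # the first few blocked jobs are protected
    if name == "CONSERVATIVE":
        return None           # every blocked job is protected
    raise ValueError(f"unknown policy: {policy}")


def reserved_jobs(policy: str, blocked: List[str], hybrid_depth: int = 4) -> List[str]:
    """Return the blocked jobs (in queue order) that would hold reservations."""
    depth = reservation_depth(policy, hybrid_depth)
    return list(blocked) if depth is None else list(blocked[:depth])
```

Jobs outside the reserved set may still be backfilled, but only where doing so does not delay a job that holds a reservation. A smaller depth therefore tends to favor utilization, and a larger one favors predictable start times.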
New features
- qmanager: integrate with the new exec system (#481)
- qmanager: add hello/exception callback support (#493)
- qmanager: add EASY/HYBRID/CONSERVATIVE policies (#504)
- resource: RFC20 resource set specification version 1 support (#455)
- resource: add hwloc whitelist support (#467)
- resource: add set- and get-property support (#490, #513)
- resource: add support for checking a job’s satisfiability (#503)
- resource: support for variation-aware scheduler (#517)
- resource: add JGF reader support (#521)
- resource: resource graph metadata (by_path) optimization (#536)
- libjobspec: require command to be a list rather than a list or string (#549)
- resource: add resource update support (#543)
- add smart pointer support and misc. cleanup (#537)
- test: add test cases to support systems with disaggregated resources (#460)
- test: add test cases for AMD GPUs (#464)
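The variation-aware support listed above (#517) treats each node's performance class as a scheduling constraint. One way to picture such a constraint is to prefer allocations whose nodes span the narrowest range of classes, so that a job is not held back by its slowest node. The sketch below is an illustration under that assumption and is not the policy's actual implementation: `pick_nodes`, the node names, and the integer class labels are all hypothetical.

```python
from typing import Dict, List


def pick_nodes(perf_class: Dict[str, int], count: int) -> List[str]:
    """Pick `count` nodes whose performance classes span the smallest range."""
    if count <= 0 or count > len(perf_class):
        raise ValueError("request cannot be satisfied")

    # Sort by (class, name) so that ties break deterministically.
    ranked = sorted(perf_class, key=lambda node: (perf_class[node], node))

    best_start = 0
    best_spread = None
    # Slide a window of `count` nodes over the ranked list; within a sorted
    # window the spread is simply last class minus first class.
    for start in range(len(ranked) - count + 1):
        window = ranked[start:start + count]
        spread = perf_class[window[-1]] - perf_class[window[0]]
        if best_spread is None or spread < best_spread:
            best_start, best_spread = start, spread

    return ranked[best_start:best_start + count]
```

Because the nodes are sorted first, the narrowest contiguous window is also the narrowest possible subset, so a single pass is enough.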
Cleanup
- resource: tidy up JGF match writer support (#520)
Fixes
- resource/sched/simulator: update KVS API usage (#435)
- build: fix build issues for priority_mod_fair_tree.so (#434)
- api: sync flux-sched with removed interfaces in flux-core (#440)
- resource: update for kvs watch API removal (#442)
- api: remove use of the deprecated Python API (#450)
- resource: update to flux_respond_error() (#457)
- resource: bug fix for incorrectly handling implicit exclusivity (#502)
- libschedutil: remove vendored copy and use flux-core’s exported lib (#516)
- testsuite: update flux-sharness.sh to version from flux-core (#523)
- rc3: fix a bug in the RC3 dir definition within configure.ac (#525)
- resource: fix buffer overflow when handling slot type in a jobspec (#548)