Torque Resource Manager
Adaptive Computing offers a fully developed, supported version of TORQUE Resource Manager. Torque 7.0.0 was recently released with the following enhancements:
- adds support for numerous Ubuntu OS versions in addition to Red Hat 8 and SUSE 15
- adds MIG (Multi-Instance GPU) support
- contains over 100 improvements and bug fixes
- has been thoroughly tested (tens of thousands of tests on each supported OS version)
TORQUE can integrate with Moab®, Adaptive’s workload manager that intelligently places workloads and adapts resources to optimize application performance, increase system utilization, and achieve organizational objectives. It is customizable to each system’s specific situation, and provides control over batch jobs and distributed computing resources.
TORQUE incorporates significant advances in the areas of scalability, reliability, and functionality and is currently in use at tens of thousands of leading government, academic, and commercial sites throughout the world.
- Ease-of-Use Job Submission and Management: Simplify the workload submission process for end-users with an easy-to-use job submission portal, which includes features like application templates, script builders, job details, and web-based file management.
- Multiple Groups or Heterogeneous Hardware: Meet the needs of multiple groups and optimize resources in complex or heterogeneous environments, as well as guarantee SLAs and achieve business objectives.
- Modular Add-ons: Obtain additional controls through powerful add-on capabilities like portal-based job submission, accounting, grid and power management, high throughput submission, and more.
TORQUE is an industry-standard resource manager solution with higher adoption than any other resource management offering. It provides enhancements over other resource managers in the following areas:
- Additional failure conditions checked/handled
- Node health check script support
- Extended query interface providing the scheduler with additional and more accurate information
- Extended control interface allowing the scheduler increased control over job behavior and attributes
- Collection of statistics for completed jobs
- Significantly improved server to MOM communication model
- Ability to handle larger clusters with tens of thousands of nodes and jobs
- Ability to handle larger jobs that span hundreds of thousands of processors
- High responsiveness and reliability with multi-threading and TCP-based communication
- Extensive logging additions
- More human-readable logging (i.e., no more ‘error 15038 on command 42’)
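To make the node health check item above more concrete, here is a minimal sketch of what such a script might look like. It assumes the convention described in the TORQUE administrator documentation: the MOM runs a site-provided script (configured with `$node_check_script` and `$node_check_interval` in the MOM config file) and treats an output line beginning with `ERROR` as a failed check. The disk-space check, the 5% threshold, the checked path, and the function names are illustrative assumptions, not part of TORQUE.

```python
#!/usr/bin/env python3
"""Illustrative node health check: report low disk space to the MOM."""
import shutil

# Hypothetical site settings; choose values that suit the cluster.
MIN_FREE_FRACTION = 0.05
CHECK_PATH = "/"


def check_disk(path, min_free_fraction, usage_fn=shutil.disk_usage):
    """Return an error message if free space is below the threshold, else None."""
    usage = usage_fn(path)
    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        return f"low disk space on {path}: {free_fraction:.1%} free"
    return None


def main():
    problem = check_disk(CHECK_PATH, MIN_FREE_FRACTION)
    if problem is not None:
        # Assumed convention: a line starting with ERROR marks the check as failed.
        print(f"ERROR {problem}")


if __name__ == "__main__":
    main()
```

A healthy node prints nothing. The `usage_fn` parameter exists only so the check can be exercised without touching a real filesystem.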