Ph.D. Dissertation by Yonghong Yan: Scheduling Scientific Workflow Applications in Computational Grids

Abstract

A computational grid, built using grid computing technology, is a network of computing resources that work together as a single, uniform operating environment. It can be viewed as a virtual supercomputer designed for large-scale applications. One important characteristic of these applications is that they are no longer developed as monolithic, single-executable codes, but instead incorporate multiple dependent computational modules. Executing such an application involves running multiple modules concurrently and sequentially in a predefined order, with automatic and timely data transfer between modules. These applications are often referred to as scientific workflow applications.
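As a minimal illustration of the "predefined order" described above, a workflow can be modeled as a directed acyclic graph of modules and grouped into stages whose members have no unmet dependencies and could therefore run concurrently. The sketch below is not taken from the dissertation: the module names and the `execution_stages` helper are hypothetical, and it uses only Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def execution_stages(dependencies):
    """Group workflow modules into ordered stages.

    `dependencies` maps each module name to the set of modules it
    depends on. Modules within one stage have all their dependencies
    satisfied by earlier stages, so they could run concurrently.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


# Hypothetical four-module workflow (names are illustrative only).
workflow = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "analyze": {"simulate_a", "simulate_b"},
}

for number, stage in enumerate(execution_stages(workflow), start=1):
    print(number, stage)
```

Here the two simulation modules fall into the same stage because both depend only on `preprocess`. A real grid scheduler would additionally have to decide where and when each module runs, which this sketch does not attempt.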

A key issue in executing a scientific workflow application in a computational grid is how to map and schedule workflow modules onto multiple distributed resources, and how to handle module dependencies in a timely manner, so that the application delivers the performance users expect. The goal of this research is to develop a workflow system that addresses workflow scheduling in computational grid environments. In our work, we have developed a grid workflow description language that supports resource request specification, a capability lacking in current related efforts. We have defined an integrated workflow scheduling architecture that provides workflow execution planning, resource allocation, and execution coordination. Our workflow scheduler applies advanced scheduling techniques, such as planning, resource reservation, and performance prediction, in the resource allocation process. Simulation results show that our workflow scheduler reduces workflow execution time by about 20% on average under moderate to high resource load, compared to the scheduling policies used in most current workflow systems.

Complete copy: PDF (1.3M), PPT (6.9M)