COPYRIGHTED BY
Yonghong Yan
May 2007

SCHEDULING SCIENTIFIC WORKFLOW APPLICATIONS IN COMPUTATIONAL GRIDS

An Abstract of a Dissertation
Presented to
the Faculty of the Department of Computer Science
University of Houston

 

In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy

 

By
Yonghong Yan
May 2007


Abstract

A computational grid, built using grid computing technology, is a network of computing resources that work together as a single, uniform operating environment. It can be viewed as a virtual supercomputer designed for large-scale applications. One important characteristic of these applications is that they are no longer developed as monolithic, single-executable codes, but instead incorporate multiple dependent computational modules. Executing such an application involves running multiple modules concurrently and sequentially in a predefined order, with automatic and timely data transfer between modules. These applications are often referred to as scientific workflow applications.

An important issue in executing a scientific workflow application in computational grids is how to map and schedule workflow modules onto multiple distributed resources, and how to handle module dependencies in a timely manner, so as to deliver the performance users expect. The goal of this research is to develop a workflow system that addresses workflow scheduling in computational grid environments. In our work, we have developed a grid workflow description language that supports resource request specification, a capability lacking in current related efforts. We have defined an integrated workflow scheduling architecture that provides workflow execution planning, resource allocation, and execution coordination. Our workflow scheduler applies advanced scheduling techniques, such as planning, resource reservation, and performance prediction, in the resource allocation process. The simulation results show that our workflow scheduler reduces workflow execution time by about 20% on average under moderate to high resource load, compared to the scheduling policies used in most current workflow systems.

Contents
List of Figures
List of Tables
1 Introduction
 1.1 Workflow Applications in Computational Grid Environments
 1.2 Research Goals and Contributions
 1.3 Dissertation Organization
2 Grid Computing
 2.1 Background
 2.2 Grid Problems and Grid Computing
 2.3 Grid Architecture and Middlewares
 2.4 Resource Management in Grids
3 Scientific Workflow Applications
 3.1 Grid Scientific Workflow Applications
 3.2 Workflow Management Systems
4 Related Work and Motivation
 4.1 Related Work: Grid Workflow Description Languages
 4.2 Related Work: Workflow and Grid Application Scheduling
 4.3 Motivation: The GRACCE Framework
5 Grid Application Modeling and Description Language
 5.1 GAMDL Capabilities and Features
 5.2 GAMDL Entities and Core Concepts
 5.3 GAMDL Application and Workflow
 5.4 GAMDL’s Support for Resource Co-Allocations
 5.5 A GAMDL Example of a Workflow with Complex Control-Flow Logic
 5.6 Summary
6 Resource Allocation and Scheduling of Workflow Applications
 6.1 The GRACCE Workflow System Architecture
 6.2 Resource Allocation and Workflow Execution Planning
 6.3 Summary
7 Experiment and Performance Evaluation
 7.1 Simulation Environment Setup
 7.2 Performance Evaluation of Workflow Execution
 7.3 Simulation Results
8 Conclusion
 8.1 Future Work 
Bibliography

Committee and Acknowledgements


About this document:

This document was generated using TeX4ht (the LaTeX and TeX for Hypertext translator), version 20070717-2, from the original LaTeX source files with the command htlatex diss.tex "html,2,info" on Dec 19th 2007.