Scientific workflow applications in distributed and grid environments consist of multiple modules that are coupled together during execution to accomplish specific domain goals. Uninterrupted, coordinated execution of such an application requires both the mapping and scheduling of its modules onto grid resources and the timely, automatic processing of dependencies between coupled modules. Furthermore, users expect not only correct and uninterrupted execution of the workflow but also certain quality-of-service guarantees, most commonly performance. This is the topic of grid workflow scheduling, which has recently been an active and open research area.
This research studies the execution of scientific workflow applications in dynamic distributed and grid environments. The dissertation investigates the issues of workflow description and scheduling, and defines a full-featured workflow description language and an integrated architecture for a grid workflow system. As part of the architecture, an advanced scheduler is proposed that supports workflow planning, advance resource reservation, and performance prediction. Our simulation results show that, under high resource load, our workflow scheduler improves workflow execution performance by about 20% compared to the conventional, widely used workflow scheduling approach.
The workflow description language defined in this work addresses the lack of support for workflow resource allocation in current workflow description languages. With our language, users no longer need to manually specify resource multi-requests for the workflow tasks, nor resort to a separate resource specification language to describe task resource requests; both approaches have proved inflexible and inconvenient for end users. In addition, our language introduces a number of features that are not found in other languages and that are convenient for end users and workflow developers.
In future work, we will first refine and enhance our simulation environment with more features and capabilities, making it as close as possible to a real grid environment. On that basis, we will further study the behavior of our scheduler. In parallel, we will set up a real grid environment in which a widely used local scheduler, such as LSF, PBS, or SGE, is installed on the resources, and test our scheduler in this environment to study the performance improvement. As for the workflow scheduler itself, the current scheduling policy allocates a resource to a task so as to improve the performance of that task as much as possible, with the aim of improving overall workflow execution performance. We believe a more intelligent scheduler should also consider how a task's resource allocation affects the performance of its child and sibling tasks. We will study which algorithms can be used in such a scheduler and enhance our scheduler with this feature.
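The per-task allocation guideline described above can be illustrated with a minimal sketch. It is not the scheduler developed in this dissertation; it assumes a simple greedy rule in which each task, taken in topological order, is placed on the resource with the earliest predicted finish time. The function name greedy_allocate, the resource names, and the predicted runtimes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def greedy_allocate(
    order: List[str],
    parents: Dict[str, List[str]],
    runtime: Dict[str, Dict[str, float]],
) -> Tuple[Dict[str, str], float]:
    """Place each task on the resource where it is predicted to finish first.

    order   -- tasks in a topological order of the workflow graph
    parents -- parent tasks of each task (dependencies)
    runtime -- hypothetical predicted runtime of each task on each resource
    """
    # Every resource is assumed to be idle at time 0.
    free_at = {r: 0.0 for per_task in runtime.values() for r in per_task}
    finish: Dict[str, float] = {}
    placement: Dict[str, str] = {}
    for task in order:
        # A task becomes ready once all of its parents have finished.
        ready = max((finish[p] for p in parents.get(task, [])), default=0.0)
        # Greedy choice: only this task's own finish time is considered.
        best = min(
            runtime[task],
            key=lambda r: max(ready, free_at[r]) + runtime[task][r],
        )
        end = max(ready, free_at[best]) + runtime[task][best]
        placement[task] = best
        finish[task] = end
        free_at[best] = end
    return placement, max(finish.values(), default=0.0)


# Hypothetical diamond-shaped workflow: A feeds siblings B and C, which feed D.
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
runtime = {
    "A": {"fast": 2.0, "slow": 4.0},
    "B": {"fast": 3.0, "slow": 6.0},
    "C": {"fast": 3.0, "slow": 5.0},
    "D": {"fast": 1.0, "slow": 2.0},
}
placement, makespan = greedy_allocate(["A", "B", "C", "D"], parents, runtime)
print(placement, makespan)
```

In this sketch, placing B on the fast resource delays its sibling C, which is exactly the kind of side effect the greedy rule ignores. A child- and sibling-aware policy would replace the per-task key function with one that also weighs such effects.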