Tasking
5. Tasking#
Tasking constructs provide units of work to a thread for execution. Worksharing constructs do this, too (e.g. for, do, sections, and singles constructs); but the work units are tightly controlled by an iteration limit and limited scheduling, or a limited number of sections or single regions. Worksharing was designed with “ data parallel “ computing in mind. Tasking was designed for “ task parallel “ computing and often involves non-locality or irregularity in memory access.
The task construct can be used to execute work chunks: in a while loop; while traversing nodes in a list; at nodes in a tree graph; or in a normal loop (with a taskloop construct). Unlike the statically scheduled loop iterations of worksharing, a task is often enqueued, and then dequeued for execution by any of the threads of the team within a parallel region. The generation of tasks can be from a single generating thread (creating sibling tasks), or from multiple generators in a recursive graph tree traversals. A taskloop construct bundles iterations of an associated loop into tasks, and provides similar controls found in the task construct.
Sibling tasks are synchronized by the taskwait construct, and tasks and their descendent tasks can be synchronized by containing them in a taskgroup region. Ordered execution is accomplished by specifying dependences with a depend clause. Also, priorities can be specified as hints to the scheduler through a priority clause.
Various clauses can be used to manage and optimize task generation, as well as reduce the overhead of execution and to relinquish control of threads for work balance and forward progress.
Once a thread starts executing a task, it is the designated thread for executing the task to completion, even though it may leave the execution at a scheduling point and return later. The thread is tied to the task. Scheduling points can be introduced with the taskyield construct. With an untied clause any other thread is allowed to continue the task. An if clause with an expression that evaluates to false results in an undeferred task, which instructs the runtime to suspend the generating task until the undeferred task completes its execution. By including the data environment of the generating task into the generated task with the mergeable and final clauses, task generation overhead can be reduced.
A complete list of the tasking constructs and details of their clauses can be found in the Tasking Constructs chapter of the OpenMP Specifications, in the OpenMP Application Programming Interface section.