Parallel Processing & Distributed Systems - Chapter 9: Parallel Job Scheduling - Thoai Nam

Scheduling on UMA Multiprocessors

• Schedule: allocation of tasks to processors
• Dynamic scheduling
  – A single queue of ready processes
  – A physical processor accesses the queue to run the next process
  – The binding of processes to processors is not tight
• Static scheduling
  – Only one process per processor
  – Speedup can be predicted
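
The dynamic case maps naturally onto a single shared work queue. Below is a minimal Python sketch of this idea, with threads standing in for physical processors; the names (ready_queue, processor) and the use of callables as "processes" are illustrative assumptions, not part of the slides.

```python
import queue
import threading

ready_queue = queue.Queue()            # the single queue of ready processes

def processor(pid):
    # an idle "processor" repeatedly pulls the next ready process;
    # the binding of processes to processors is not tight
    while True:
        try:
            task = ready_queue.get_nowait()
        except queue.Empty:
            return                     # no ready work left
        task()

for i in range(8):                     # enqueue some ready "processes"
    ready_queue.put(lambda i=i: print(f"process {i} ran"))

workers = [threading.Thread(target=processor, args=(p,)) for p in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because every worker draws from the same queue, which process runs on which processor is decided only at run time; under static scheduling the assignment is fixed in advance, which is why its speedup can be predicted.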
  1. Parallel Job Scheduling – Thoai Nam
  2. Classes of scheduling
     • Static scheduling
       – An application is modeled as a directed acyclic graph (DAG)
       – The system is modeled as a set of homogeneous processors
       – Finding an optimal schedule is NP-complete
     • Scheduling in the runtime system
       – Multithreading: functions for thread creation, synchronization, and termination
       – Parallelizing compilers: parallelism extracted from the loops of sequential programs
     • Scheduling in the OS
       – Multiple programs must co-exist in the same system
     • Administrative scheduling
  3. Gantt chart
     • A Gantt chart indicates the time each task spends in execution, as well as the processor on which it executes
     (Figure: an example Gantt chart placing tasks T1–T7 on processors along a time axis from 1 to 9, shown next to the corresponding task graph.)
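
As a small illustration of what the chart encodes, the sketch below prints one as text. The schedule representation ({processor: [(task, start, duration), ...]}) and the helper name print_gantt are assumptions made for this example, not something from the slides.

```python
def print_gantt(schedule, makespan):
    """Render a schedule as a text Gantt chart: one row per processor,
    one column per time unit. `schedule` maps a processor index to a
    list of (task_name, start, duration) triples."""
    for proc in sorted(schedule):
        row = ["..."] * makespan
        for name, start, duration in schedule[proc]:
            for t in range(start, start + duration):
                row[t] = name
        print(f"P{proc}: " + " ".join(f"{cell:>3}" for cell in row))

# A made-up two-processor schedule in the spirit of the slide's figure:
print_gantt({1: [("T1", 0, 2), ("T4", 2, 2), ("T6", 4, 3)],
             2: [("T2", 0, 3), ("T5", 3, 3)]}, makespan=7)
```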
  4. Graham’s list scheduling algorithm
     • T = {T1, T2, …, Tn}: a set of tasks
     • μ : T → (0, ∞): a function that associates an execution time with each task
     • A partial order < on T
     • L: a list of the tasks in T
     • Whenever a processor has no work to do, it instantaneously removes from L the first ready task, that is, an unscheduled task whose predecessors under < have all completed execution. (Ties between processors are resolved in favor of the lower index.)
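
A compact way to see the rule in action is to simulate it. The Python sketch below is one possible rendering under assumed data structures (mu for the execution-time function μ, preds mapping each task to its set of predecessors under <); it is an illustration of the rule on this slide, not code from the course.

```python
import heapq

def graham_list_schedule(L, mu, preds, p):
    """Simulate Graham's list scheduling on p processors: whenever a
    processor is free, it removes the first ready task from L; ties go
    to the processor with the lower index.
    Returns {task: (processor, start_time)}."""
    time, done, placement = 0.0, set(), {}
    idle = list(range(p))              # idle processors, lowest index first
    running = []                       # min-heap of (finish_time, proc, task)
    waiting = list(L)                  # unscheduled tasks, in list order
    while waiting or running:
        # let each idle processor grab the first ready task in L
        while idle:
            ready = next((t for t in waiting if preds[t] <= done), None)
            if ready is None:
                break
            proc = idle.pop(0)         # lower index has priority
            waiting.remove(ready)
            placement[ready] = (proc, time)
            heapq.heappush(running, (time + mu[ready], proc, ready))
        # otherwise advance time to the next task completion
        time, proc, task = heapq.heappop(running)
        done.add(task)
        idle.append(proc)
        idle.sort()
    return placement

# Example: T3 must wait for T1; with p = 2, T1 and T2 start at once.
print(graham_list_schedule(L=["T1", "T2", "T3"],
                           mu={"T1": 2, "T2": 3, "T3": 1},
                           preds={"T1": set(), "T2": set(), "T3": {"T1"}},
                           p=2))
```

Ordering the heap by (finish time, processor index) also breaks simultaneous-completion ties toward the lower-indexed processor, matching the rule on the slide.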
  5. Graham’s list scheduling algorithm – Problem
     • L = {T1, T2, T3, T4, T5, T6, T7, T8, T9}
     (Figure: a weighted task graph on T1–T9 and the Gantt charts that list scheduling produces from it, illustrating that the quality of the resulting schedule depends on the list L.)
  6. Coffman-Graham’s scheduling algorithm (2)
     • Let T = {T1, T2, …, Tn} be a set of n unit-time tasks to be executed on p processors
     • If Ti < Tj, then task Ti is an immediate predecessor of task Tj, and Tj is an immediate successor of task Ti
     • Let S(Ti) denote the set of immediate successors of task Ti
     • Let α(Ti) be an integer label assigned to Ti
     • N(T) denotes the decreasing sequence of integers formed by ordering the set {α(T′) | T′ ∈ S(T)}
  7. Coffman-Graham’s scheduling algorithm – Example (1)
     (Figure: the example task graph on tasks T1–T9 used by the labeling steps that follow, together with the resulting schedule.)
  8. Coffman-Graham’s scheduling algorithm – Example (3)
     • i = 7: R = {T1, T6}, N(T1) = {6, 5, 4} and N(T6) = {3}
       Choose task T6 and assign 7 to α(T6)
     • i = 8: R = {T1, T2}, N(T1) = {6, 5, 4} and N(T2) = {7}
       Choose task T1 and assign 8 to α(T1)
     • i = 9: R = {T2}, N(T2) = {7}
       Choose task T2 and assign 9 to α(T2)
     • Step 3 of the algorithm: L = {T2, T1, T6, T3, T5, T4, T8, T7, T9}
     • Step 4 of the algorithm: the schedule is the result of applying Graham’s list-scheduling algorithm to the task graph T and the list L
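
The labeling rule, pick from R the task whose N(T) is lexicographically smallest, translates directly into code. The Python sketch below assumes tasks are given with their immediate-successor sets (succs); the names are illustrative. On the example above it makes the same choices: at i = 7 it picks T6, because {3} precedes {6, 5, 4} lexicographically.

```python
def coffman_graham_list(tasks, succs):
    """Steps 1-3 of Coffman-Graham for unit-time tasks: label the tasks,
    then list them by decreasing label. `succs` maps each task to its
    set of immediate successors S(Ti)."""
    alpha = {}                                   # the labels α(Ti)
    for i in range(1, len(tasks) + 1):
        # R: unlabeled tasks whose immediate successors are all labeled
        R = [t for t in tasks
             if t not in alpha and all(s in alpha for s in succs[t])]
        # N(T): decreasing sequence of successor labels; choose the task
        # whose N(T) is lexicographically smallest
        chosen = min(R, key=lambda t: sorted((alpha[s] for s in succs[t]),
                                             reverse=True))
        alpha[chosen] = i
    # Step 3: the list L orders tasks by decreasing label; step 4 feeds
    # L to Graham's list-scheduling algorithm
    return sorted(tasks, key=lambda t: -alpha[t])
```

Tasks with no successors have an empty N(T), which is lexicographically smallest of all, so exit tasks naturally receive the lowest labels first.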
  9. Current approaches
     • Global queue
     • Variable partitioning
     • Dynamic partitioning with two-level scheduling
     • Gang scheduling
  10. Variable partitioning
     • Processors are partitioned into disjoint sets, and each job runs only in its own partition
     • Parameters taken into account by the partitioning schemes:

         Scheme     User request   System load   Changes
         Fixed      no             no            no
         Variable   yes            no            no
         Adaptive   yes            yes           no
         Dynamic    yes            yes           yes

     • Distributed-memory machines: Intel and nCube hypercubes, IBM SP2, Intel Paragon, Cray T3D
     • Problems: fragmentation, and handling big jobs
  11. Gang scheduling
     • Problem: interactive response times require time slicing, but with a global queue threads are scheduled in an uncoordinated manner
     • Observation:
       – Coordinated scheduling is only needed if the job’s threads interact frequently
       – The rate of interaction can be used to drive the grouping of threads into gangs
     • Examples:
       – Co-scheduling
       – Family scheduling, which allows more threads than processors and uses a second level of internal time slicing
  12. Co-scheduling
     • Context switching is done between applications rather than between tasks of several applications
     • Solves the problem of preemption inside spinlock-controlled critical sections
     • Cache corruption?
  13. Scheduling in the NYU Ultracomputer
     • Tasks can be formed into groups
     • Tasks in a group can be scheduled in any of the following ways:
       – A task is scheduled or preempted in the normal manner
       – All the tasks in a group are scheduled or preempted simultaneously
       – Tasks in a group are never preempted
     • In addition, a task can prevent its own preemption irrespective of which of these three scheduling policies its group uses (see the sketch below)
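
A minimal way to picture these options is as a per-group policy with a per-task override, as in the hypothetical Python sketch below. All names are illustrative; this is not the Ultracomputer’s actual interface.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class GroupPolicy(Enum):
    NORMAL = auto()          # tasks scheduled/preempted individually
    SIMULTANEOUS = auto()    # all tasks scheduled/preempted together
    NO_PREEMPTION = auto()   # tasks in the group are never preempted

@dataclass
class Task:
    name: str
    preemption_disabled: bool = False   # per-task override

@dataclass
class Group:
    policy: GroupPolicy
    tasks: list = field(default_factory=list)

def may_preempt(task: Task, group: Group) -> bool:
    # a task can prevent its preemption irrespective of the group policy
    if task.preemption_disabled:
        return False
    return group.policy is not GroupPolicy.NO_PREEMPTION
```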
  14. Scheduling in the Mach OS
     • Threads are the unit of scheduling
     • Processor sets are disjoint
     • Each processor in a processor set is assigned a subset of threads for execution
       – Priority scheduling: each processor has a local queue (LQ), and the processor set shares the global queues GQ(0), …, GQ(31)
       (Figure: processors P0 … Pn, each with its own LQ, sharing the global queues GQ(0)–GQ(31).)
       – If LQ and GQ(0–31) are all empty, the processor executes a special idle thread until a thread becomes ready
       – Preemption occurs if an equal- or higher-priority ready thread is present
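
One plausible reading of this dispatch order as code is sketched below, assuming the local queue is consulted before the global queues and that GQ(0) is the highest priority; both are assumptions, since the slide does not fix them.

```python
def pick_next_thread(local_queue, global_queues, idle_thread):
    """Pick the next thread for a processor in a processor set.
    `local_queue` is the processor's LQ; `global_queues` is the list
    GQ(0)..GQ(31) shared by the processor set."""
    if local_queue:                    # threads bound to this processor
        return local_queue.pop(0)
    for gq in global_queues:           # assumed: GQ(0) is checked first
        if gq:
            return gq.pop(0)
    return idle_thread                 # all queues empty: run the idle thread
```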