Parallel Processing & Distributed Systems - Chapter 10: Parallel Paradigms & Programming Models - Thoai Nam

Outline
• Parallel programming paradigms
• Programmability issues
• Parallel programming models
– Implicit parallelism
– Explicit parallel models
– Other programming models

  1. Parallel Paradigms & Programming Models (Thoai Nam)
  2. Parallel Programming Paradigms
     • Parallel programming paradigms/models are the ways to
       – Design a parallel program
       – Structure the algorithm of a parallel program
       – Deploy/run the program on a parallel computer system
     • Commonly used algorithmic paradigms
       – Phase parallel
       – Synchronous and asynchronous iteration
       – Divide and conquer
       – Pipeline
       – Process farm
       – Work pool (a minimal sketch follows below)
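     A minimal work-pool sketch in C with POSIX threads, for illustration only: the pool size, task count, and the process_task body are hypothetical, not from the slides. Workers repeatedly claim the next task from a shared pool until it is empty.

       #include <pthread.h>
       #include <stdio.h>

       #define NUM_TASKS   16
       #define NUM_WORKERS 4

       static int next_task = 0;                 /* index of the next unclaimed task */
       static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

       /* Hypothetical task body: each task just reports itself. */
       static void process_task(int id) {
           printf("task %d done\n", id);
       }

       /* Each worker repeatedly grabs the next task from the shared pool. */
       static void *worker(void *arg) {
           for (;;) {
               pthread_mutex_lock(&pool_lock);
               int id = (next_task < NUM_TASKS) ? next_task++ : -1;
               pthread_mutex_unlock(&pool_lock);
               if (id < 0) break;                /* pool exhausted */
               process_task(id);
           }
           return NULL;
       }

       int main(void) {
           pthread_t w[NUM_WORKERS];
           for (int i = 0; i < NUM_WORKERS; i++)
               pthread_create(&w[i], NULL, worker, NULL);
           for (int i = 0; i < NUM_WORKERS; i++)
               pthread_join(w[i], NULL);
           return 0;
       }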
  3. Structuredness
     • A program is structured if it is composed of structured constructs, each of which has three properties:
       – It is a single-entry, single-exit construct
       – Different semantic entities are clearly identified
       – Related operations are enclosed in one construct
     • Structuredness depends mostly on
       – The programming language
       – The design of the program
  4. Portability
     • A program is portable across a set of computer systems if it can be transferred from one machine to another with little effort
     • Portability largely depends on
       – The language of the program
       – The target machine's architecture
     • Levels of portability (from least to most portable):
       1. Users must change the program's algorithm
       2. Users only have to change the source code
       3. Users only have to recompile and relink the program
       4. Users can use the executable directly
  5. Implicit Parallelism
     • The compiler and the run-time support system automatically exploit the parallelism in a sequential-like program written by the user
     • Ways to implement implicit parallelism:
       – Parallelizing compilers
       – User directions (see the directive sketch below)
       – Run-time parallelization
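     As an example of user directions, a single compiler directive can mark a loop as parallelizable; the sketch below uses an OpenMP pragma as one concrete instance (the slide itself does not name OpenMP).

       #include <stdio.h>

       int main(void) {
           double a[1000];
           /* The directive tells the compiler the iterations are independent;
              without it, the loop simply compiles and runs sequentially. */
           #pragma omp parallel for
           for (int i = 0; i < 1000; i++)
               a[i] = i * 2.0;
           printf("a[999] = %f\n", a[999]);
           return 0;
       }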
  6. Parallelizing Compiler (cont'd)
     • Data dependence (the second statement reads the X written by the first):
         X = X + 1
         Y = X + Y
     • Control dependence (whether Y is updated depends on the value of f(X)):
         if f(X) = 1 then Y = Y + Z
     • When dependencies do exist, transformation/optimizing techniques should be used
       – To eliminate those dependencies, or
       – To make the code parallelizable, if possible
       (the two loops in the sketch below contrast these cases)
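     To make the dependence concrete, here is a small C sketch contrasting a loop that a parallelizing compiler must keep sequential with one it may parallelize; the array names and the function wrapper are illustrative assumptions.

       /* Illustration only: hypothetical arrays a, b, c of length n. */
       void dependence_demo(double *a, const double *b, double *c, int n) {
           /* Loop-carried flow dependence: iteration i reads a[i-1], which
              iteration i-1 wrote, so the iterations must run in order. */
           for (int i = 1; i < n; i++)
               a[i] = a[i-1] + b[i];

           /* No cross-iteration dependence: each iteration touches only its
              own elements, so a parallelizing compiler may run them in parallel. */
           for (int i = 0; i < n; i++)
               c[i] = a[i] + b[i];
       }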
  7. Some Optimizing Techniques for Eliminating Data Dependencies (cont'd)
     • Reduction technique

       Sequential loop (cannot be executed in parallel, since computing Sum in the i-th iteration needs the value from the previous iteration):
         Do i = 1, N
           P: X(i) = ...
           Q: Sum = Sum + X(i)
         End Do

       Parallel loop (a parallel reduction function is used to avoid the data dependency):
         ParDo i = 1, N
           P: X(i) = ...
           Q: Sum = sum_reduce(X(i))
         End Do
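     One concrete realization of the reduction technique is OpenMP's reduction clause; the sketch below is an assumed mapping of the slide's sum_reduce, not its actual definition. Each thread accumulates a private partial sum, and the runtime combines the partial sums at the end.

       #include <stdio.h>
       #define N 100000

       int main(void) {
           static double x[N];
           double sum = 0.0;
           /* Each thread keeps a private copy of sum; OpenMP adds the
              copies together at loop exit, removing the Sum dependence. */
           #pragma omp parallel for reduction(+:sum)
           for (int i = 0; i < N; i++) {
               x[i] = i * 0.5;        /* stand-in for the slide's X(i) = ... */
               sum += x[i];
           }
           printf("sum = %f\n", sum);
           return 0;
       }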
  8. Run-Time Parallelization
     • Parallelization involves both the compiler and the run-time system
       – Additional constructs are used to decompose the sequential program into multiple tasks and to specify how each task will access data
       – The compiler and the run-time system recognize and exploit parallelism at both compile time and run time
     • Example: the Jade language (Stanford University)
       – More parallelism can be recognized
       – Irregular and dynamic parallelism is exploited automatically
  9. Explicit Programming Models
     • Data-parallel
     • Message-passing
     • Shared-variable
  10. Data-Parallel: An Example
     Example: a data-parallel program to compute the constant "pi"

       main() {
         double local[N], tmp[N], pi, w;
         long i, j, t, N = 100000;
       A:  w = 1.0/N;
       B:  forall (i = 0; i < N; i++) {
       P:    local[i] = (i + 0.5)*w;                  /* data-parallel operation */
       Q:    tmp[i] = 4.0/(1.0 + local[i]*local[i]);  /* data-parallel operation */
           }
       C:  pi = sum(tmp);                             /* reduction operation */
       D:  printf("pi is %f\n", pi*w);
       } /* end main */
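     Since forall and sum above are pseudocode, the following is a hedged, runnable C translation of the same pi computation using OpenMP; the pragma is a substitution for the slide's forall/sum notation.

       #include <stdio.h>
       #define N 100000

       int main(void) {
           double pi = 0.0, w = 1.0 / N;
           /* B/P/Q/C: the forall body and the sum() reduction collapse
              into one parallel loop with a reduction on pi. */
           #pragma omp parallel for reduction(+:pi)
           for (long i = 0; i < N; i++) {
               double x = (i + 0.5) * w;      /* P: midpoint of subinterval i */
               pi += 4.0 / (1.0 + x * x);     /* Q: integrand of 4/(1+x^2)    */
           }
           printf("pi is %f\n", pi * w);      /* D: scale by interval width   */
           return 0;
       }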
  11. Message-Passing Model (cont'd)
     • Explicit interactions
       – The programmer must resolve all interaction issues: data mapping, communication, synchronization, and aggregation
     • Explicit allocation
       – Both workload and data are explicitly allocated to the processes by the user
       (see the MPI sketch below)
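     The explicit style becomes visible in code: in this hedged MPI sketch of the same pi computation, the programmer maps iterations to processes by hand and aggregates the partial sums with an explicit MPI_Reduce call.

       #include <stdio.h>
       #include <mpi.h>
       #define N 100000

       int main(int argc, char **argv) {
           int rank, size;
           MPI_Init(&argc, &argv);
           MPI_Comm_rank(MPI_COMM_WORLD, &rank);
           MPI_Comm_size(MPI_COMM_WORLD, &size);

           /* Explicit allocation: the programmer maps iterations to
              processes (here, a cyclic distribution). */
           double w = 1.0 / N, local = 0.0;
           for (long i = rank; i < N; i += size) {
               double x = (i + 0.5) * w;
               local += 4.0 / (1.0 + x * x);
           }

           /* Explicit aggregation: partial sums are combined by message
              passing into the result held by rank 0. */
           double pi = 0.0;
           MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
           if (rank == 0) printf("pi is %f\n", pi * w);

           MPI_Finalize();
           return 0;
       }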
  12. Shared-Variable Model
     • Has a single address space
     • Uses a multithreaded, asynchronous execution model
     • Data reside in a single, shared address space and thus do not have to be explicitly allocated
     • Workload can be implicitly or explicitly allocated
     • Communication is implicit
       – Done through reading and writing shared variables
     • Synchronization is explicit
       (see the pthreads sketch below)
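     A minimal pthreads sketch of the model: the counter lives in the single shared address space, so communication happens implicitly through ordinary reads and writes, while the mutex supplies the explicit synchronization. Names and structure are illustrative assumptions.

       #include <pthread.h>
       #include <stdio.h>

       #define NUM_THREADS 4

       static long counter = 0;   /* shared variable: communication is implicit */
       static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

       static void *work(void *arg) {
           for (int i = 0; i < 100000; i++) {
               pthread_mutex_lock(&lock);     /* synchronization is explicit */
               counter++;
               pthread_mutex_unlock(&lock);
           }
           return NULL;
       }

       int main(void) {
           pthread_t t[NUM_THREADS];
           for (int i = 0; i < NUM_THREADS; i++)
               pthread_create(&t[i], NULL, work, NULL);
           for (int i = 0; i < NUM_THREADS; i++)
               pthread_join(t[i], NULL);
           printf("counter = %ld\n", counter);  /* 400000 */
           return 0;
       }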
  13. Comparison of Four Models

       Issues                        | Implicit   | Data-parallel          | Message-passing     | Shared-variable
       ------------------------------+------------+------------------------+---------------------+------------------------
       Platform-independent examples | Kap, Forge | Fortran 90, HPF, HPC++ | PVM, MPI            | X3H5
       Platform-dependent examples   | (none)     | CM C*                  | SP2 MPL, Paragon Nx | Cray Craft, SGI Power C

     Each model is also rated on the following issues: parallelism, allocation, communication, synchronization, aggregation, irregularity, termination, determinacy, correctness, generality, portability, and structuredness.
  14. Comparison of Four Models (cont'd)
     • Message-passing model
       – More flexible than the data-parallel model
       – Lacks support for the work-pool paradigm and for applications that need to manage a global data structure
       – Widely accepted
       – Exploits large-grain parallelism and can be executed on machines with a native shared-variable model (multiprocessors: DSMs, PVPs, SMPs)
     • Shared-variable model
       – No widely accepted standard, so programs have low portability
       – Programs are more difficult to debug than message-passing programs