Università degli Studi di Siena
Department of Information Engineering and Mathematics (DIISM)
Course of
High-Performance Computer Architecture 2024-2025
 
 
 FIRST PART SLIDES (THEORY) IN A SINGLE FILE

COURSE SCHEDULE (TENTATIVE): THE FOLLOWING DATES ARE INDICATIVE AND ASSUME REGULAR PROGRESS OF THE LESSONS. NOTE (SINCE THIS HAS BEEN ASKED): THE LESSON NUMBERING REFERS ONLY TO THE TOPIC; IT DOES NOT IMPLY THAT THE LESSONS WILL BE GIVEN IN ANY PARTICULAR ORDER.

NOTE: the actual schedule will be updated weekly.

BIBLIOGRAPHIC REFERENCES:

 LESSON #01 of 01-Oct-2024 (11:00-13:00)
Introduction, Evaluating Computers, Pipelining
(PART A)
PRESENTATION/SLIDES/VIDEO:
  • c225lez01-intro_pipe.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois: Chap. 1,3.3
  • (Hennessy-Patterson-4: 2.1,2.2)
  • (Hennessy-Patterson-5: 3.1,3.2)
  •  LESSON #01 of 02-Oct-2024 (14:00-16:00)
    Introduction, Evaluating Computers, Pipelining
    (PART B)
    PRESENTATION/SLIDES/VIDEO:
  • c225lez01-intro_pipe.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois: Chap. 1,3.3
  • (Hennessy-Patterson-4: 2.1,2.2)
  • (Hennessy-Patterson-5: 3.1,3.2)
  •  LESSON #02 of 02-Oct-2024 (16:00-17:00)
    Dynamic Instruction Scheduling
    BIBLIOGRAPHIC REFERENCES:
  • Dubois: Chap. 3.4,3.4.1
  • (Hennessy-Patterson-4 - 2.4,2.5)
  • (Hennessy-Patterson-5 - 3.4,3.5)
  • An Efficient Algorithm for Exploiting Multiple Arithmetic Units
  • tomasulo.c
  •  PRACTICING/LAB #01 of 02-Oct-2024 (17:00-18:00)
    Dynamic Scheduling exercise.
    PRESENTATION/SLIDES/VIDEO:
  • c225es01-tomasulo.pdf
  •  LESSON #03 of 08-Oct-2024 (11:00-13:00)
    Branch Prediction: speculation of branch condition and branch target, BPRED, BTB. Predictor types, Bimodal, BHSR, BHT, PHT, 2-level adaptive. Other predictors (gshare, gselect).
    PRESENTATION/SLIDES/VIDEO:
  • c225lez03-branch_prediction.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois - 3.3.4,3.4.3
  • (Hennessy-Patterson-4 - 2.3)
  • (Hennessy-Patterson-5 - 3.3)
  • Optional reading: [Yeh, Patt - 1992]
  • Optional reading: [Nair - 1995]
  • Optional reading: [Young - 1995]
  • Optional reading: [McFarling - 1993]
  •  PRACTICING/LAB #02 of 09-Oct-2024 (14:00-16:00)
    Introduction to Linux (PART A)
    PRESENTATION/SLIDES/VIDEO:
  • c225es02-linux_intro.pdf
  •  PRACTICING/LAB #02 of 09-Oct-2024 (16:00-18:00)
    Introduction to Linux (PART B)
    PRESENTATION/SLIDES/VIDEO:
  • c225es02-linux_intro.pdf
  •  LESSON #05 of 15-Oct-2024 (11:00-12:00)
    Introduction to Superscalar Processors: general scheme and Renaming.
    PRESENTATION/SLIDES/VIDEO:
  • c225lez05-superscalar1.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois 3.3.3,3.4.6
  • (Hennessy-Patterson-4 - 2.6-2.9)
  • (Hennessy-Patterson-5 - 3.6-3.10)
  •  LESSON #06 of 15-Oct-2024 (12:00-13:00)
    Superscalar execution example: Re-Order Buffer and Instruction Window. Case studies: MIPS, Alpha, AMD, Intel, ARM.
    PRESENTATION/SLIDES/VIDEO:
  • c225lez06-superscalar2.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois 3.4.4,3.4.5,3.4.7,3.4.8,3.4.9
  • detailed output of the example analyzed during the lesson.
  •  LESSON #08 of 16-Oct-2024 (14:00-16:00)
    Software methods to extract Instruction Level Parallelism.
  • Animated slide for the software pipelining example.
  • PRESENTATION/SLIDES/VIDEO:
  • c225lez08-software_ilp.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois 3.3.5,3.5,3.5.1-5
  • (Hennessy-Patterson-4 2.7-2.8)
  •  PRACTICING/LAB #03 of 16-Oct-2024 (16:00-18:00)
    Exercises on Dependencies, Superscalar, VLIW processors, Tomasulo.
    PRESENTATION/SLIDES/VIDEO:
  • c225es03-workingsheets.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 07-07-2009.
  • Exercise #2 of the exam of 07-07-2009.
  • Exercise #1 of the exam of 30-06-2008.
  • Exercise #1 of the exam of 20-01-2010.
  •  PRACTICING/LAB #04 of 22-Oct-2024 (11:00-12:00)
    Various exercises.
    PRESENTATION/SLIDES/VIDEO:
  • c225es04-workingsheets.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 04-02-2013.
  •  PRACTICING/LAB #04d of 22-Oct-2024 (12:00-13:00)
    Using the Superscalar simulator FREESS
    PRESENTATION/SLIDES/VIDEO:
  • c225es04d-freess.pdf
  • RESOURCES:
  • Educational Simulator FreeSs
  • Exercise #1 of the exam of 31-10-2018.
  • Output of the FreeSs Simulator for this exercise
  •  PRACTICING/LAB #05 of 23-Oct-2024 (14:00-16:00)
    Various exercises.
    PRESENTATION/SLIDES/VIDEO:
  • c225es05-workingsheets.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 05-07-2006.
  • Exercise #1 of the exam of 07-11-2014.
  •  PRACTICING/LAB #06 of 23-Oct-2024 (16:00-18:00)
    Various exercises.
    PRESENTATION/SLIDES/VIDEO:
  • c225es06-workingsheets.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 31-10-2017.
  • Exercise #1 of the exam of 30-10-2019.
  •  29-Oct-2024 - MIDTERM TEST (11:00-14:00)

     LESSON #11 of 30-Oct-2024 (14:00-16:00)
    Introduction to multiprocessor systems, Flynn's taxonomy, UMA, NUMA, COMA systems, programming models
    BIBLIOGRAPHIC REFERENCES:
  • Dubois 5.1,5.4
  • (see also Culler-Singh, Chap. 1)
  •  LESSON #12 of 30-Oct-2024 (16:00-18:00)
    Coherence Protocols: Write Update, Write Invalidate, Hybrid. Snoopy based protocols: the MESI and DRAGON protocols
    PRESENTATION/SLIDES/VIDEO:
  • c225lez12-multiprocessor_coherency.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois 5.4,5.5,7.3
  • (see also Culler-Singh, Chap. 5)
  •  LESSON #14 of 05-Nov-2024 (11:00-13:00)
    Memory Consistency Models: Sequential Consistency and Relaxed Consistency
    BIBLIOGRAPHIC REFERENCES:
  • Dubois 7.4,7.5,7.6,7.7
  • (see also Culler-Singh, Chap. 5.2, 5.5)
  • Optional reading (open-access): A Primer on Memory Consistency and Cache Coherence, Second Edition
  • Optional reading: Litmus Tests for checking Memory Models
  •  LESSON #60 of 06-Nov-2024 (14:00-15:00)
    Introduction to FPGAs
    PRESENTATION/SLIDES/VIDEO:
  • c225lez60-fpga1.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Optional reference: 1364-2001 - IEEE Standard Verilog Hardware Description Language
  • Optional reference: 1076-2019 - IEEE Standard for VHDL Language Reference Manual
  • Free book: L.H. Crockett et al. "The Zynq book"
  •  LESSON #63 of 06-Nov-2024 (15:00-16:00)
    High-level FPGA Programming
    PRESENTATION/SLIDES/VIDEO:
  • c225lez63-fpga4.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • https://www.khronos.org/sycl
  •  PRACTICING/LAB #09 of 06-Nov-2024 (16:00-18:00)
    Exercises on Coherency from past exams.
    PRESENTATION/SLIDES/VIDEO:
  • c225es09-coherency_exercize1.pdf
  • RESOURCES:
  • Exercise #3 of the exam of 07-07-2009 - (Coherence: Dragon vs. MESI).
  • Exercise #2 of the exam of 10-02-2016 - (Coherence - MSI vs MESI, 3 streams).
  • Exercise #1 of the exam of 16-01-2015 - (Competitive Protocol).
  •  PRACTICING/LAB #19 of 12-Nov-2024 (11:00-13:00)
    Various exercises from previous exams.
    PRESENTATION/SLIDES/VIDEO:
  • c225es19-multiprocessor_exercize3.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 20-01-2010 - (MSI protocol).
  • Spreadsheet for this exercise (20-01-2010)
  • Exercise #3 of the exam of 10-02-2016 - (Consistency and use of FENCE).
  • Exercise #1 of the exam of 31-01-2011 - (Coherence - Bit-Vector vs. Single-Sharer).
  •  PRACTICING/LAB #20 of 13-Nov-2024 (14:00-16:00)
    Various exercises from previous exams.
    PRESENTATION/SLIDES/VIDEO:
  • c225es20-coherency_exercize2.pdf
  • RESOURCES:
  • Exercise #1 of the exam of 21-12-2016 - (Coherence - Dragon patterns).
  • Exercise #1 of the exam of 19-12-2018 - (Coherence - Dragon, TAS best/worst).
  • Exercise #1 of the exam of 23-12-2020 - (Coherence - Competitive, TAS best/worst).
  •  PRACTICING/LAB #21 of 13-Nov-2024 (16:00-18:00)
    Various exercises from previous exams.
    PRESENTATION/SLIDES/VIDEO:
  • IN PREPARATION
  • RESOURCES:
  • Exercise #1 of the exam of 27-11-2023 - (Coherence - MOESI patterns).
  • Exercise #2 of the exam of 18-12-2015 - (Coherence and Interleaving - Jacobi).
  •  19-Nov-2024 - FINAL TEST (11:00-14:00)

     LESSON #21 of 20-Nov-2024 (14:00-16:00)
    Introduction to Parallel Programming
    BIBLIOGRAPHIC REFERENCES:
  • OpenCilk @ MIT
  • Programming in Cilk
  •  LESSON #22 of 20-Nov-2024 (16:00-17:00)
    Parallelism and Performance
    BIBLIOGRAPHIC REFERENCES:
  • Reading: He et al., The Cilkview Scalability Analyzer
  • Optional reading: Frigo et al., The implementation of the Cilk-5 multithreaded language
  •  PRACTICING/LAB #11A of 20-Nov-2024 (17:00-18:00)
    Experimenting with several programming models: Pthreads, OpenMP, TBB, Cilk
    PRESENTATION/SLIDES/VIDEO:
  • c225es11A-cilk_lab1.pdf
  •  PRACTICING/LAB #11B of 26-Nov-2024 (11:00-12:00)
    Methodology for carrying out performance measurements; discussion on projects; visit to Computer Architecture Lab
    PRESENTATION/SLIDES/VIDEO:
  • c225es11B-methodology.pdf
  • RESOURCES:
  • SAMPLE PROJECT
  •  PRACTICING/LAB #11C of 26-Nov-2024 (12:00-13:00)
    Experimenting with Cilk Tools.
    PRESENTATION/SLIDES/VIDEO:
  • c225es11C-cilk_lab2.pdf
  •  PRACTICING/LAB #61 of 27-Nov-2024 (14:00-16:00)
    Vitis Acceleration Application Flow
    PRESENTATION/SLIDES/VIDEO:
  • c225es61-handson_vitis.pdf
  •  LESSON #23 of 27-Nov-2024 (16:00-18:00)
    Introduction to CUDA parallel programming model (PART A)
    PRESENTATION/SLIDES/VIDEO:
  • c225lez23-cuda1.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • D. B. Kirk, W. W. Hwu, Programming Massively Parallel Processors, 3rd Edition, Morgan Kaufmann, 2017: Chap. 1, Chap. 2.
  •  LESSON #23 of 03-Dec-2024 (11:00-12:00)
    Introduction to CUDA parallel programming model (PART B)
    PRESENTATION/SLIDES/VIDEO:
  • c225lez23-cuda1.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • D. B. Kirk, W. W. Hwu, Programming Massively Parallel Processors, 3rd Edition, Morgan Kaufmann, 2017: Chap. 1, Chap. 2.
  •  LESSON #24 of 03-Dec-2024 (12:00-13:00)
    CUDA Threads, Atomics, and Memory (PART A)
    PRESENTATION/SLIDES/VIDEO:
  • c225lez24-cuda2.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • D. B. Kirk, W. W. Hwu, Programming Massively Parallel Processors, 3rd Edition, Morgan Kaufmann, 2017: Chap. 4, Chap. 5.
  •  LESSON #24 of 04-Dec-2024 (14:00-16:00)
    CUDA Threads, Atomics, and Memory (PART B)
    PRESENTATION/SLIDES/VIDEO:
  • c225lez24-cuda2.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • D. B. Kirk, W. W. Hwu, Programming Massively Parallel Processors, 3rd Edition, Morgan Kaufmann, 2017: Chap. 4, Chap. 5.
  •  PRACTICING/LAB #13 of 04-Dec-2024 (16:00-18:00)
    Overview of CUDA environment and simple examples.
    PRESENTATION/SLIDES/VIDEO:
  • c225es13-cuda_lab0.pdf
  • RESOURCES:
  • In the lab, download: PPF-LAB-V07.tgz initial examples
  • At home (you MUST have an NVIDIA card in your computer): docker pull robgiorgi/ppf-cuda-v07
  • CUDA CheatSheet
  • MATRIX-MULTIPLY CODE
  • (optional reference) CUDA Programming Guide
  • (optional reference) CUDA Runtime API
  •  LESSON #31 of 10-Dec-2024 (11:00-12:00)
    Clusters
    PRESENTATION/SLIDES/VIDEO:
  • c225lez31-clusters.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Dubois 5.2.2, 5.3
  •  LESSON #32 of 10-Dec-2024 (12:00-13:00)
    Introduction to MPI
    PRESENTATION/SLIDES/VIDEO:
  • c225lez32-mpi_introduction.pdf
  • BIBLIOGRAPHIC REFERENCES:
  • Reference site for OpenMPI: https://www.open-mpi.org/doc/
  • Reference tutorial: A. Lumsdaine et al., OpenMPI Tutorial
  •  PRACTICING/LAB #16 of 11-Dec-2024 (14:00-16:00)
    Using MPI
    PRESENTATION/SLIDES/VIDEO:
  • c225es16-mpi_lab.pdf
  • RESOURCES:
  • hello.c HELLOWORLD MPI CODE
  • testmpi.c TESTMPI CODE
  • mmxmpi2.c MATRIX-MULTIPLICATION MPI CODE
  •  PRACTICING/LAB #17 of 11-Dec-2024 (16:00-18:00)
    Study of specific parallel patterns in CUDA
    PRESENTATION/SLIDES/VIDEO:
  • c225es17-histogram_cuda.pdf
  • RESOURCES:
  • Exercise #2 of the exam of 19-12-2017 - (CUDA - Histogram).
  • CUDA - Histogram Example Code
  •  PRACTICING/LAB #18 of 17-Dec-2024 (11:00-13:00)
    Study of specific parallel patterns in Cilk/OpenMP/MPI
    PRESENTATION/SLIDES/VIDEO:
  • c225es18-histogram_cilk_openmp_mpi.pdf
  • RESOURCES:
  • Exercise #2 of the exam of 18-12-2019 - (Cilk/Histogram).
  • Exercise #2 of the exam of 19-12-2018 - (OpenMP/Histogram).
  • Exercise #2 of the exam of 23-12-2020 - (MPI/Histogram).
  • CILK - Histogram Example Code
  • OPENMP - Histogram Example Code
  • MPI - Histogram Example Code
  •  PRACTICING/LAB #29 of 18-Dec-2024 (14:00-16:00)
    Review of Course Projects
    PRESENTATION/SLIDES/VIDEO:
  • c225es29-ProjectReview.pdf
  • RESOURCES:
    See also the dedicated project page and list of themes.





    To view the above presentations in PDF format (.pdf), you can use Acrobat Reader, freely released by Adobe (Download Acrobat Reader).