|
|
|
THEME A01
EXPLORATORY STUDY OF INTEL oneAPI
Intel has recently released a unified, standards-based programming model called oneAPI.
Modern workload diversity calls for architectural diversity; no single architecture is best for every workload. A mix of scalar, vector, matrix, and spatial (SVMS) architectures deployed in CPU, GPU, AI, FPGA, and other accelerators is required to extract high performance.
Intel oneAPI products are meant to deliver the tools needed to deploy applications and solutions across SVMS architectures. Its set of complementary toolkits (a base kit and specialty add-ons) simplifies programming and helps developers improve efficiency and innovation.
Student task: using oneAPI, run a simple program such as matrix multiplication on 3 different platforms (multicore, GPU, FPGA);
collect performance results and write a very short report that comments on the work done.
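As a starting point for collecting performance results, the measurement side of the task can be sketched independently of oneAPI. The following is a minimal CPU-only sketch in plain Python, not oneAPI code: the helper names (`matmul`, `time_kernel`, `gflops`), the matrix size, and the number of repeats are all illustrative assumptions. The same pattern (repeat the run, take the median, convert to an operation rate) could be applied to each platform's kernel.

```python
import statistics
import time


def matmul(a, b):
    """Naive triple-loop product of two square matrices stored as lists of rows."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def time_kernel(kernel, a, b, repeats=5):
    """Run kernel(a, b) several times; return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(a, b)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def gflops(n, seconds):
    """An n x n matrix product performs 2 * n**3 floating-point operations."""
    return 2.0 * n ** 3 / seconds / 1e9


if __name__ == "__main__":
    n = 64  # illustrative size, chosen only to keep the run short
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    t = time_kernel(matmul, a, b)
    print(f"n={n}  median time={t:.4f} s  rate={gflops(n, t):.4f} GFLOP/s")
```

Reporting the median rather than a single run reduces the influence of warm-up effects and background load, which matters when the numbers from the three platforms are compared in the report.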
|
| THEME A02
FPGA PROJECT
Use one of the available FPGA projects to implement a Blocked Matrix Multiplication.
Set matrix size MS and block size BS (e.g., MS=64,128,256,512,1024,... and BS=8,16) and compare the execution time with a multicore execution.
We have several FPGA boards available (ZYBO, VIRTEX ML605, ZYNQ ZC-706, PARALLELA): choose one of them.
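The role of MS and BS can be illustrated with a software reference model. This is a minimal sketch in plain Python, assuming square MS x MS matrices and a block size BS that divides MS; the function names are hypothetical and the code says nothing about how the FPGA design itself should be structured. A model like this can serve as the golden reference against which the FPGA output is checked.

```python
def naive_matmul(a, b, ms):
    """Reference product of two ms x ms matrices (lists of rows)."""
    c = [[0.0] * ms for _ in range(ms)]
    for i in range(ms):
        for k in range(ms):
            aik = a[i][k]
            for j in range(ms):
                c[i][j] += aik * b[k][j]
    return c


def blocked_matmul(a, b, ms, bs):
    """Multiply two ms x ms matrices one bs x bs block at a time."""
    if bs <= 0 or ms % bs != 0:
        raise ValueError("this sketch assumes BS is positive and divides MS")
    c = [[0.0] * ms for _ in range(ms)]
    for ii in range(0, ms, bs):          # block row of C
        for jj in range(0, ms, bs):      # block column of C
            for kk in range(0, ms, bs):  # blocks along the shared dimension
                # Accumulate block A[ii, kk] times block B[kk, jj] into block C[ii, jj].
                for i in range(ii, ii + bs):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, kk + bs):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, jj + bs):
                            row_c[j] += aik * row_b[j]
    return c


if __name__ == "__main__":
    ms = 64
    # Small integer-valued entries keep every sum exact, so results can be compared with ==.
    a = [[float((3 * i + j) % 7) for j in range(ms)] for i in range(ms)]
    b = [[float((i + 2 * j) % 5) for j in range(ms)] for i in range(ms)]
    reference = naive_matmul(a, b, ms)
    for bs in (8, 16):
        print(f"MS={ms} BS={bs} matches reference: {blocked_matmul(a, b, ms, bs) == reference}")
```

Each bs x bs block is reused across a whole inner loop nest, which is the property that makes blocking attractive when the working set has to fit in a small on-chip memory.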
|
| THEME A03
IMPLEMENTING A NUMERICAL ALGORITHM for the MAXELER APP GALLERY (+ internship at Maxeler if successful)
Select a numerical problem (from the book indicated by Prof. Milutinovic) and implement a new app that is not yet present in the MAXELER app gallery (https://appgallery.maxeler.com/).
|
| THEME A04
FFT ON NVIDIA V100
Referring to Chapter 3 of the book [1], re-implement the FFT algorithm on the MAX2C board in our lab (lab223) and verify the results.
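One way to approach the verification step is to compare the accelerator output against a software reference. The sketch below is an assumption-laden illustration in plain Python: it uses a recursive radix-2 Cooley-Tukey FFT, which may differ from the variant presented in the book, and checks it against a direct DFT. The function names and the error threshold are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform, used as the ground truth."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def max_abs_error(u, v):
    """Largest element-wise distance between two equally long sequences."""
    return max(abs(p - q) for p, q in zip(u, v))


if __name__ == "__main__":
    signal = [float((7 * t) % 5) for t in range(16)]
    err = max_abs_error(fft(signal), dft(signal))
    print(f"max |FFT - DFT| = {err:.3e}")
```

In the project, the second argument of `max_abs_error` would be replaced by the values read back from the board, with a tolerance chosen to match the number format used in the hardware design.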
REFERENCES
[1] V. Milutinovic, J. Salom, N. Trifunovic, R. Giorgi, "Guide to DataFlow Supercomputing", Springer, Berlin, DE, Apr 2015, pp. 1-127
|
| THEME A05
EVALUATING CONSISTENCY THROUGH LITMUS TESTS (BONUS=2)
Using litmus tests (https://diy.inria.fr/), evaluate the memory consistency behavior of 3 different x86 machines and discuss the results.
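To make concrete what a litmus test checks, the sketch below enumerates the possible outcomes of the classic store-buffering (SB) test: thread 0 executes `x = 1; r0 = y` and thread 1 executes `y = 1; r1 = x`, with `x` and `y` initially 0. This is a toy model in plain Python, not the diy tools and not a run on real hardware; the function name and the event encoding are assumptions made for illustration. With `store_buffers=False` every store reaches memory immediately (sequential consistency); with `store_buffers=True` each store is split into an issue event and a later flush to memory, a simplified picture of the store buffers on x86.

```python
from itertools import permutations


def sb_outcomes(store_buffers):
    """Return the set of (r0, r1) results of the SB test under the chosen toy model."""
    events = []
    for t in (0, 1):
        events += [(t, "store"), (t, "load")]
        if store_buffers:
            events.append((t, "flush"))

    outcomes = set()
    for order in permutations(events):
        pos = {event: i for i, event in enumerate(order)}
        # Program order: each thread issues its store before its load.
        if any(pos[(t, "store")] > pos[(t, "load")] for t in (0, 1)):
            continue
        # A buffered store can only be flushed after it has been issued.
        if store_buffers and any(pos[(t, "store")] > pos[(t, "flush")] for t in (0, 1)):
            continue

        memory = {"x": 0, "y": 0}
        regs = [None, None]
        for t, kind in order:
            own, other = ("x", "y") if t == 0 else ("y", "x")
            if kind == "store" and not store_buffers:
                memory[own] = 1           # visible to the other thread at once
            elif kind == "flush":
                memory[own] = 1           # buffered store becomes visible here
            elif kind == "load":
                regs[t] = memory[other]   # the load targets the other variable
        outcomes.add(tuple(regs))
    return outcomes


if __name__ == "__main__":
    print("sequentially consistent:", sorted(sb_outcomes(False)))
    print("with store buffers:     ", sorted(sb_outcomes(True)))
```

In this model the outcome (0, 0) appears only when store buffers are enabled. Whether and how often a real machine exhibits it is exactly the kind of observation the project asks to collect and discuss.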
|
| THEME A06
CPU/GPU performance comparison
Choose a good algorithm for Machine Learning (NOT PCA or K-means, as they have already been assigned in previous years) and implement it on both CPUs and GPUs.
For inspiration, you can also use one of the algorithms that you can find here: https://appgallery.maxeler.com/
Analyze the performance and explain the performance differences with an appropriate numerical comparison.
|
| THEME A07
ANALYZE AND TEST THE CONTENT OF ONE CHAPTER OF THE KIRK/HWU BOOK
Referring to the book [1], analyze and test the content of one of the chapters from 10 to 17.
REFERENCES
[1] David B. Kirk and Wen-mei W. Hwu, "Programming Massively Parallel Processors: A Hands-on Approach", 3rd ed., Morgan Kaufmann (2019) ISBN 978-0-12-811986-0
|
| THEME A08
TURING-PI-2 PROJECT
The Turing Pi 2 platform was released in 2023: https://turingpi.com/product/turing-pi-2/
In our lab (lab223) we have 2 Turing Pi 2 boards, 6 CM4 (RaspberryPi-4 modules) and 2 NVIDIA Jetson Xavier NX GPU modules (you can mix them in the 4 Turing Pi slots as you prefer).
The project consists of bringing up the platform for the first time and testing it (see this video for example: https://www.youtube.com/watch?v=9Llchw14cDA )
|
|
|
|
|