Short Courses of SSCAD 2025
Keywords:
Large Language Models, LLMs, Computer Architecture, Prompt Engineering, Jupyter Notebook, Google Colab, Data Science, Machine Learning, High-Performance Computing, Advanced Computer Architectures, RISC-V, LLVM Compiler, Open MPI 5.0, Message Passing Interface, MPI, Modular Component Architecture, MCA, Valgrind, Memchecker, Parallel Applications
Synopsis
This edition of the Short Courses of SSCAD 2025 compiles the instructional material prepared by the authors of the three tutorials presented during the 26th Symposium on High-Performance Computing Systems (SSCAD), held from October 28 to 31, 2025, in Bonito, MS.
The first tutorial proposes a practical approach to using Large Language Models (LLMs) as a support tool for developing instructional materials and conducting research in Computer Architecture and High-Performance Computing. The chapter presents fundamental concepts of prompt engineering, details the functionality of the major interactive notebook environments used for code development and execution (Jupyter Notebook and Google Colab), and provides interactive examples of systems built with LLMs. The examples cover data science, machine learning, high-performance computing, and advanced, dedicated, and specialized computer architectures, as well as methods for evaluating, measuring, and predicting performance.
The second tutorial offers a detailed guide to adding new instructions to the RISC-V backend of the LLVM compiler infrastructure. The tutorial describes, step by step, the design of a small extension called “xmatrix,” which introduces 32 dedicated 512-bit matrix registers (each holding a 4 × 4 matrix of 32-bit elements) and a limited set of arithmetic and load/store operations.
Finally, the third tutorial discusses the mechanisms that simplify debugging and tuning in Open MPI 5.0, one of the most widely used MPI implementations for developing parallel applications in high-performance environments. The content begins with a brief introduction to MPI (Message Passing Interface), followed by a detailed explanation of the Modular Component Architecture (MCA), which provides developers with an extensive set of options for tuning and debugging MPI programs. The chapter concludes with practical aspects of the tuning process, including the use of Valgrind and Memchecker for debugging parallel applications.
We believe that this book will give students, researchers, professionals, and enthusiasts in Computer Architecture and High-Performance Computing consistent, in-depth access to the knowledge presented at SSCAD 2025, offering a solid, clear, and long-lasting resource for advancing their studies, including those who were unable to attend the event in person.
Chapters
1. Increased Productivity with LLMs, Prompt Engineering, Machine Learning, and GPUs
2. Implementing new RISC-V Instructions with the LLVM Compiler Infrastructure
3. Tuning and debugging applications in Open MPI 5.0

