Minicursos da XXVI ERAD-RS 2026

André Rauber Du Bois; Sandro da Silva Camargo; Gerson Geraldo H. Cavalheiro; Alexandro Baldassin; Lucas Mello Schnorr

doi:10.5753/sbc.19367.4

Authors

André Rauber Du Bois (ed)

UFPel

Sandro da Silva Camargo (ed)

Unipampa


Keywords:

Concurrent Programming, Shared Memory, Performance Analysis

Synopsis

The Short Course Book of the XXVI Escola Regional de Alto Desempenho da Região Sul (ERAD-RS) covers concurrency in computing systems and performance analysis in high-performance computing environments. The first chapter, "Concorrência sem Memória Compartilhada" (Concurrency without Shared Memory), presents alternatives to the traditional shared-memory model of concurrent programming, exploring coordination mechanisms based on communication and data exchange. The second chapter, "Análise de Desempenho de Sistemas Computacionais: Fundamentos e Metodologia Científica para HPC" (Performance Analysis of Computing Systems: Fundamentals and Scientific Methodology for HPC), discusses the role of performance analysis in developing and optimizing computing systems, especially High-Performance Computing (HPC) applications. Together, the two chapters present theoretical foundations, methodologies, and practices for computing systems and HPC, making the book relevant to students, researchers, and practitioners interested in concurrency, performance evaluation, and HPC.

Chapters

1. Concorrência sem Memória Compartilhada (Concurrency without Shared Memory)

Gerson Geraldo H. Cavalheiro, André Rauber Du Bois, Alexandro Baldassin

DOI: https://doi.org/10.5753/sbc.19367.4.1

2. Análise de Desempenho de Sistemas Computacionais: Fundamentos e Metodologia Científica para HPC (Performance Analysis of Computing Systems: Fundamentals and Scientific Methodology for HPC)

Lucas Mello Schnorr

DOI: https://doi.org/10.5753/sbc.19367.4.2




Details on the available publication format: Full Volume