SSCAD 2024 Short Courses

Authors

Arthur F. Lorenzon (ed)
UFRGS
Álvaro Luiz Fazenda (ed)
UNIFESP

Keywords:

Big Data, Machine Learning, Myriad Interface, HPCC Systems, Profiling, Scalability, Parallel Applications, Parallel Scalability Suite, Parallel Programming, MPI, OpenMP Offloading, Architectural Simulation, gem5, Quantum Computing, IBM/Qiskit, Program Analysis and Optimization

Synopsis

This edition of the SSCAD Short Course Book brings together six short courses presented at the XXV Simpósio em Sistemas Computacionais de Alto Desempenho (SSCAD), held from October 23 to 25, 2024, in São Carlos, SP. The first chapter covers essential concepts in processing and analyzing massive volumes of data, applying machine learning algorithms hands-on on the HPCC (High Performance Computing Cluster) platform. The second chapter introduces the Parallel Scalability Suite, which evaluates the behavior of parallel applications by profiling and visualizing their scalability. The third chapter presents hybrid parallel programming techniques that conform to the MPI standard and OpenMP Offloading, with emphasis on parallelism models for accelerators. The fourth chapter introduces fundamental concepts of architectural simulation with the gem5 simulator and shows how simulation enables designers to explore, verify, and optimize architectures by modeling their behavior and their interaction with key system components. In light of advances in quantum computing, the fifth chapter demonstrates how to develop algorithms for a quantum computing architecture using the IBM/Qiskit development kit. Finally, the sixth chapter reviews the definitions of performance analysis, the main techniques used, and some of the tools available for analyzing the performance of parallel applications.

Publication date

23 October 2024

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Details of the available publication format: Complete Volume

Complete Volume

ISBN-13 (15)

978-85-7669-610-0