Generalist vs Specialist Language Models

Seminars-2-2024 - This article is part of a series.

Part 1: The Dawn of an Immersive Internet: XR, Generative AI and the Road to 6G

Part 2: AGI Chips - The Next Frontier

Part 3: This Article

Part 4: From Intelligent Surfaces to Noise-Driven Communication: Innovative Technologies for 6G and Beyond

Part 5: An Overview of Evolutionary Multi-Objective Optimization

Part 6: Packet Trimming at the Edge for Low Latency in 6G Environments

Part 7: Scientific Machine Learning and Quantum Utility: A Near Future Perspective

Part 8: Cloud Storage Systems: Latency Characterization and Extensions

Part 9: Let the Plants do the Talking: Climate-Smart Agriculture by the Messages Received from Plants and Soil

Part 10: More is different: How the Science of Complex Systems can inspire Future Autonomous Networks

Abstract

We are experiencing a paradigm shift in artificial intelligence: with every new version of a large language model, tasks that once seemed far from solved are now being addressed. However, because of the vast amount of data and computation required, these general-purpose AIs are expensive to develop and serve. On the other hand, recent results show that specializing in domains and languages can improve prediction quality at lower training and inference costs. In this talk, we will examine empirical results that exemplify this tension between generalist models, which aim to learn universal knowledge, and specialist models, which focus on localized knowledge. Finally, we will discuss the future of artificial intelligence, particularly in Brazil.

Bio of Rodrigo Nogueira

Rodrigo Nogueira is the founder and CEO of Maritaca AI, a company that develops specialized LLMs in Brazil. He was a pioneer in the use of Transformers in search systems and is a co-author of the book “Pretrained Transformers for Text Ranking.” Rodrigo holds a Ph.D. in Computer Science from New York University (NYU), where he was mentored by Professor Kyunghyun Cho. Throughout his career, he has contributed to the fields of Information Retrieval and Natural Language Processing through models such as BERTimbau, doc2query, monoT5, and, more recently, the Sabiá 1, 2, and 3 models, which are LLMs specialized for Brazil.