A Tutorial on Small Language Models in the Era of Large Language Models: Architecture, Capabilities, and Trustworthiness (KDD'25)
Fali Wang1, Minhua Lin1, Yao Ma2, Hui Liu3, Qi He, Xianfeng Tang3, Jiliang Tang4, Jian Pei5, Suhang Wang1
1 The Pennsylvania State University
2 Rensselaer Polytechnic Institute
3 Amazon
4 Michigan State University
5 Duke University
Time: Aug 3, 2025 8:00 AM - 11:00 AM (EDT)
Location: [to-fill-later], Toronto, Canada
Abstract
Large language models (LLMs) based on the Transformer architecture are powerful but face challenges in deployment, inference latency, and costly fine-tuning. These limitations highlight the emerging potential of small language models (SLMs), which can either replace LLMs through innovative architectures and technologies or assist them as efficient proxy or reward models. Emerging architectures such as Mamba and xLSTM replace the Transformer's quadratic scaling of inference cost with context length with linear scaling. To maximize SLM performance, test-time compute scaling strategies narrow the gap with LLMs by allocating additional compute at inference. Beyond standalone use, SLMs can also assist LLMs via weak-to-strong learning, proxy tuning, and guard models, fostering secure and efficient LLM deployment. Finally, the trustworthiness of SLMs remains a critical yet underexplored research area. However, tutorials on cutting-edge SLM technologies are still scarce, which motivates this one.
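As a concrete illustration of one way SLMs can assist LLMs, the sketch below shows the logit arithmetic behind proxy tuning: a large base model's next-token logits are shifted by the difference between a small tuned expert and its untuned counterpart. This is a minimal sketch assuming the three checkpoints share a vocabulary; the model names, the greedy decoding, and the full-recompute loop are illustrative choices, not the tutorial's reference implementation.

```python
# Minimal proxy-tuning sketch (assumptions: shared vocabulary across checkpoints,
# illustrative model names, greedy decoding, no KV caching for brevity).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LARGE_NAME = "meta-llama/Llama-2-13b-hf"        # assumed large base model
EXPERT_NAME = "meta-llama/Llama-2-7b-chat-hf"   # assumed small tuned expert
ANTI_NAME = "meta-llama/Llama-2-7b-hf"          # assumed small untuned anti-expert

tok = AutoTokenizer.from_pretrained(LARGE_NAME)
large = AutoModelForCausalLM.from_pretrained(LARGE_NAME).eval()
expert = AutoModelForCausalLM.from_pretrained(EXPERT_NAME).eval()
anti = AutoModelForCausalLM.from_pretrained(ANTI_NAME).eval()

@torch.no_grad()
def proxy_tuned_generate(prompt: str, max_new_tokens: int = 64) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        # Next-token logits from each model on the (assumed shared) vocabulary.
        l_large = large(ids).logits[:, -1, :]
        l_expert = expert(ids).logits[:, -1, :]
        l_anti = anti(ids).logits[:, -1, :]
        # Proxy tuning: shift the large model's logits by the small models' tuning delta.
        steered = l_large + (l_expert - l_anti)
        next_id = steered.argmax(dim=-1, keepdim=True)  # greedy decoding for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)
```

The small pair thus acts as an efficient proxy for fine-tuning: only the 7B models need to be tuned, while the 13B model is used as-is at inference time.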
This tutorial covers recent progress in Small Language Models (SLMs) in the era of Large Language Models (LLMs), focusing on architecture, capabilities, and trustworthiness.
- LLM Foundations: Overview of recent LLM developments that support and inspire SLM design.
- SLM Architecture: Efficient architectures tailored for small-scale models, including Transformer variants and state-space models.
- From Weak to Strong: Techniques to boost SLM performance, including distillation, test-time scaling, retrieval augmentation, and agent collaboration (a minimal test-time scaling sketch follows this list).
- Trustworthiness: SLM robustness in adversarial settings, jailbreak resistance, fairness, and privacy.
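To give a flavor of the test-time scaling techniques covered in the weak-to-strong part, here is a minimal best-of-N sketch: the SLM spends extra inference compute by sampling several candidate answers, and an external scorer (for example, a reward model or a self-consistency vote) picks one. `generate_fn` and `score_fn` are hypothetical placeholders for illustration, not APIs from the tutorial.

```python
# Minimal best-of-N test-time scaling sketch; generate_fn and score_fn are
# assumed callables (e.g., an SLM sampler and a reward-model scorer).
from typing import Callable, List

def best_of_n(prompt: str,
              generate_fn: Callable[[str], str],
              score_fn: Callable[[str, str], float],
              n: int = 8) -> str:
    # Spend extra inference-time compute by drawing several candidates...
    candidates: List[str] = [generate_fn(prompt) for _ in range(n)]
    # ...and keep the answer the scorer prefers.
    return max(candidates, key=lambda ans: score_fn(prompt, ans))
```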
Schedule
[to-fill-later]
Slides
- Introduction [to-fill-the-link-later]
- Part I: SLM Architecture
- Part II: Weak to Strong Methods
- Part III: Trustworthiness
- Conclusion
Authors
Fali Wang, Ph.D. student, Informatics, PSU. His research focuses on the intersection of graphs and LLMs, and on small language models.
Website: https://FairyFali.github.io/
BibTeX
@article{wang2024comprehensive,
  title={A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness},
  author={Wang, Fali and Zhang, Zhiwei and Zhang, Xianren and Wu, Zongyu and Mo, Tzuhao and Lu, Qiuhao and Wang, Wanjing and Li, Rui and Xu, Junjie and Tang, Xianfeng and others},
  journal={arXiv preprint arXiv:2411.03350},
  year={2024}
}