
Towards Principled Post-Training of Large Language Models

Banghua Zhu (UC Berkeley)

Colloquium

Tuesday, April 9, 2024, 3:30 pm

Abstract

In the development of large language models (LLMs), post-training is a critical step that significantly improves model capabilities and aligns them with human preferences. In this talk, I will discuss the design principles behind post-training techniques that have led to the creation of strong, compact open models, including Starling-7B and NexusRaven-13B. Starling-7B is the best 7B chat model according to human evaluation in Chatbot Arena, outperforming models such as Llama2-70B-Chat and GPT-3.5-Turbo. NexusRaven-13B surpasses GPT-4 in function-calling capabilities.

Specifically, I will discuss existing issues with reinforcement learning from human feedback (RLHF), a pivotal technique for aligning large language models with human values. I will present improved RLHF algorithms informed by statistical decision theory, along with our high-quality open datasets for RLHF. By combining the enhanced RLHF algorithms with our own dataset, we created the Starling suite of models. Our techniques and the resulting models contribute to a better understanding of how to learn human preferences and align language models with human values.
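For readers unfamiliar with RLHF, the sketch below illustrates the standard reward-modeling step that the talk's improvements build on: fitting a reward model to pairwise human preferences with a Bradley-Terry loss. This is a minimal illustration of the conventional objective, not the speaker's improved algorithm; the function name and the toy scores are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def bradley_terry_loss(reward_chosen: torch.Tensor,
                       reward_rejected: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood of the Bradley-Terry preference model.

    Each entry pairs the reward-model score of a human-preferred response
    with the score of the dispreferred response to the same prompt.
    """
    # P(chosen preferred over rejected) = sigmoid(r_chosen - r_rejected);
    # minimizing the negative log of this probability trains the reward model.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with hypothetical scores; in practice these come from a reward
# model scoring two candidate responses to the same prompt.
r_chosen = torch.tensor([1.2, 0.3, 2.0])
r_rejected = torch.tensor([0.4, 0.5, 1.1])
print(bradley_terry_loss(r_chosen, r_rejected))
```

The fitted reward model then scores candidate responses during a policy-optimization stage (e.g., PPO) to steer the language model toward outputs humans prefer.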

Bio

Banghua is a final-year Ph.D. student at UC Berkeley, advised by Professors Michael I. Jordan and Jiantao Jiao. Banghua's research focuses on statistics and information theory, with applications in contract theory, noisy computing, robust statistics, reinforcement learning, large language models, and machine learning systems. He is a recipient of the David Sakrison Memorial Prize for outstanding Ph.D. research at Berkeley EECS.