The Reinforcement Learning for Large Language Models course from UCLA provides a comprehensive introduction to combining RL and LLMs. It begins with foundational concepts in Markov Decision Processes (MDPs), imitation learning, and value iteration, giving students the theoretical grounding for RL in sequential decision-making tasks. The course then moves to policy evaluation with deep networks and advanced policy gradient methods, including A3C (Asynchronous Advantage Actor-Critic), PPO (Proximal Policy Optimization), and GRPO (Group Relative Policy Optimization), teaching how to train agents efficiently and effectively.
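To make the value iteration idea concrete, here is a minimal sketch on a hypothetical two-state MDP (the states, rewards, and discount factor below are illustrative, not taken from the course). It repeatedly applies the Bellman optimality backup until the value function stops changing:

```python
import numpy as np

# Hypothetical toy MDP: 2 states, 2 actions, deterministic transitions.
# P[s, a] = next state reached from state s under action a.
# R[s, a] = immediate reward for taking action a in state s.
P = np.array([[0, 1],
              [1, 0]])
R = np.array([[0.0, 1.0],
              [0.0, 5.0]])
gamma = 0.9  # discount factor (assumed value for the example)

def value_iteration(P, R, gamma, tol=1e-8):
    """Iterate V(s) <- max_a [ R(s,a) + gamma * V(s') ] to convergence."""
    V = np.zeros(len(P))
    while True:
        Q = R + gamma * V[P]        # action values under the current V
        V_new = Q.max(axis=1)       # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

V = value_iteration(P, R, gamma)
```

Here state 1 offers the larger reward, so its optimal value exceeds state 0's; the fixed point can also be checked by hand against the Bellman equations.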
Students will also explore applications such as AlphaGo, test-time compute considerations, and expert iteration methods, highlighting real-world strategies for improving RL performance. The course then transitions to NLP foundations, covering language modeling, RNNs, and the evolution of transformers, including BERT, GPT-1, and modern sampling methods. Advanced topics include in-context learning and instruction fine-tuning, critical for adapting LLMs to specialized tasks using reinforcement learning techniques.
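Since the NLP portion covers modern sampling methods, a short sketch may help: the snippet below implements temperature-scaled top-k sampling over a tiny hypothetical vocabulary (the logits and hyperparameter values are made up for illustration).

```python
import numpy as np

# Hypothetical logits for a 5-token vocabulary (illustrative values only).
logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])

def sample_top_k(logits, k=3, temperature=0.8, rng=None):
    """Temperature-scaled top-k sampling: keep the k highest logits,
    renormalize them with a softmax, and draw one token index."""
    rng = rng or np.random.default_rng(0)
    scaled = logits / temperature            # sharpen (<1) or flatten (>1) the distribution
    top = np.argsort(scaled)[-k:]            # indices of the k largest logits
    probs = np.exp(scaled[top] - scaled[top].max())  # stable softmax over the top-k
    probs /= probs.sum()
    return int(rng.choice(top, p=probs))

token = sample_top_k(logits)
```

Lower temperatures concentrate probability on the highest-logit tokens, while top-k truncation prevents very unlikely tokens from ever being drawn; together they trade off diversity against coherence in generated text.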
Through lectures, coding examples, and hands-on exercises, students gain both theoretical and practical understanding, preparing them to experiment with RL-enhanced LLMs in research or applied AI settings. By course end, learners are equipped to implement and evaluate RL methods for LLM training and fine-tuning.