AI Stories

Fine-Tuning LLMs, Hugging Face & Open Source with Lewis Tunstall #49

Neil Leiser Season 3 Episode 15

Our guest today is Lewis Tunstall, LLM engineer and researcher at Hugging Face and co-author of the book "Natural Language Processing with Transformers".

In our conversation, we dive into topological machine learning and talk about giotto-tda, a high-performance Python library for topological machine learning that Lewis worked on. We then move on to LLMs and Transformers: we discuss the pros and cons of open-source vs. closed-source LLMs and explain the differences between encoder and decoder Transformer architectures. Finally, Lewis describes his day-to-day at Hugging Face and his current work on fine-tuning LLMs.

If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.

Link to Train in Data courses (use the code AISTORIES to get a 10% discount): https://www.trainindata.com/courses?affcode=1218302_5n7kraba

Natural Language Processing with Transformers book: https://www.oreilly.com/library/view/natural-language-processing/9781098136789/

Giotto-tda library: https://github.com/giotto-ai/giotto-tda

KTO alignment paper: https://arxiv.org/abs/2402.01306

Follow Lewis on LinkedIn: https://www.linkedin.com/in/lewis-tunstall/

Follow Neil on LinkedIn: https://www.linkedin.com/in/leiserneil/  

---

(00:00) - Intro

(03:00) - How Lewis Got into AI

(05:33) - From Kaggle Competitions to Data Science Job

(11:09) - Get an actual Data Science Job!

(15:18) - Deep Learning or Excel?

(19:14) - Topological Machine Learning

(28:44) - Open Source vs. Closed Source LLMs

(41:44) - Writing a Book on Transformers

(52:33) - Comparing BERT, Early Transformers, and GPT-4

(54:48) - Encoder and Decoder Architectures

(59:48) - Day-to-Day Work at Hugging Face

(01:09:06) - DPO and KTO

(01:12:58) - Stories and Career Advice