Hi Reddit, I posted my Build Your Own LLM workshop to YouTube teaching ML, LLM and math intuition [P]

Our take

Hello Reddit! Interested in understanding how Large Language Models actually work? I’ve released a workshop on YouTube, “Build Your Own LLM,” designed to demystify the process without requiring prior math or ML expertise. We cover everything from fundamental machine learning principles to transformer architectures, illustrated with practical code and Excel examples. Topics include tokenization, attention mechanisms, and pre-training; the goal is to grok each component. For more on fine-tuning techniques, check out our related article, “Best current methods for finetuning whisper.”

The growing number of accessible resources for understanding and building Large Language Models (LLMs) is a welcome trend, and Justin Angel's "Build Your Own LLM" workshop on YouTube is a good example. It’s encouraging to see topics like transformer architecture and reinforcement learning broken down in a way that prioritizes intuition over dense mathematical formalism, which helps move AI knowledge beyond highly specialized researchers and engineers. The Excel-based examples that illustrate the underlying math are a practical, relatable method for those without a strong machine learning background. This complements ongoing discussions about practical debugging in neural network training, as seen in "Data-centric debugging for teams training neural nets", and reflects a growing emphasis on tools and methods that open AI development to a wider audience. Providing both slides and self-paced exercises also makes the workshop useful for learners who prefer not to watch videos.

The workshop’s scope is notable: it runs from foundational concepts like perceptrons and activation functions to later-stage techniques like instruction tuning and reinforcement learning. The creator also states plainly what wasn’t covered (scaling), and has clearly aimed to provide a solid grounding in the core principles. That kind of end-to-end understanding matters as the field evolves rapidly. It’s also interesting to see the discussion around fine-tuning methodologies, a challenge many practitioners face. The community is actively exploring different approaches, as shown by the conversation around "Best current methods for finetuning whisper on domain specific vocabulary?", which reveals a desire for more efficient and targeted ways to adapt pre-trained models to specific tasks. The GPU coding section, which covers PyTorch, torch.compile(), fused kernels, CUDA, and Triton, underscores the hands-on nature of the workshop, which is what turns theoretical knowledge into working skills.

The shift towards making LLM development more approachable is a significant step forward. Previously, the entry barrier for contributing to or even understanding this area of AI was very high. This workshop, along with other initiatives, helps to lower that barrier for a new generation of AI practitioners. The community’s interest in adapter techniques, such as those discussed in "EMA on LoRA?", further illustrates a desire to reuse existing models and fine-tune them efficiently, an approach that fits the accessible, practical style of AI development this workshop promotes. Moving away from building everything from scratch is a pragmatic response to the compute and expertise required to train LLMs from the ground up.

Ultimately, Justin Angel's workshop represents a positive trend in AI education – one that emphasizes intuitive understanding, practical application, and accessibility. The ability to build and experiment with LLMs, even at a relatively basic level, democratizes access to this powerful technology and encourages broader participation in its development. It remains to be seen how this influx of new practitioners will shape the future of LLMs, but the current trajectory suggests a more diverse and inclusive landscape, where innovation is driven not just by large corporations, but also by a vibrant community of empowered individuals exploring the possibilities of AI. Will we see a rise in specialized, niche LLMs developed by individuals leveraging accessible tools like those showcased in this workshop, ultimately leading to a more tailored and responsive AI ecosystem?

Hi Reddit, I posted my Build Your Own LLM workshop to Youtube teaching ML, LLM and math intuition [P]

Hi internet friends, I recorded a workshop about building your own LLM without any math / ML prerequisites. It covers everything from machine learning fundamentals and deep neural networks to transformer architecture and pre/post-training.

The only prerequisite is being comfortable with learning through code & Excel examples.

  1. Sampling Large Language Models
  2. Reverse Engineering Large Language Models
  3. Perceptrons: wx+b
  4. Activation Functions: ReLU, GELU, SwiGLU
  5. GPU Coding: PyTorch, torch.compile(), fused kernels, CUDA, Triton
  6. MLPs/FFNs: Multi-input, Multi-Layer Perceptrons, Feed-Forward Networks
  7. Loss Functions: Residual errors, RMSE, Cross Entropy, Loss Landscapes
  8. Backpropagation: Training loops, Optimizers, Learning Rate, Batch Size
  9. Saving & Loading Models
  10. Initialization: Kaiming, Glorot
  11. Residuals: Addition, Scaling, Gated, Concatenation
  12. Normalization: Pre-norm vs. Post-norm, RMSNorm, BatchNorm, LayerNorm
  13. Regularization: Dropout, Gradient Clipping, Weight Decay
  14. SoftMax
  15. Tokenizers: By Character, By Word, BPE, SentencePiece
  16. Embeddings: Absolute vs. Learned, Sinusoidal vs. RoPE
  17. Attention: MHA, GQA, MQA, MLA
  18. Transformers
  19. Pre-training: Data Sources, Datasets, HTML Cleaning, Quality Filtering, Sharding
  20. Evaluation: Leaderboards, Benchmarks, Verifiers vs LLM-as-Judge
  21. Instruction Tuning: Alpaca & Other Formats, Self Instruct, Capabilities
  22. Reinforcement Learning: Policy Optimization, SimPO
  23. What We Didn't Cover: Scaling
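To give a flavor of the early items in the list, here is a minimal sketch of a single perceptron (wx+b, item 3) followed by a ReLU activation (item 4). It is not taken from the workshop materials; the function names and the example weights are made up for illustration, and it uses plain Python rather than PyTorch so it runs anywhere.

```python
def relu(z):
    # ReLU activation: pass positive values through, clamp negatives to zero.
    return max(0.0, z)


def perceptron(weights, inputs, bias):
    # Weighted sum of the inputs plus a bias (the "wx+b" part),
    # then a ReLU activation on the result.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)


# Hypothetical weights and bias, chosen only so the arithmetic is easy to follow.
weights = [0.5, -1.0]
bias = 0.25

out_pos = perceptron(weights, [2.0, 0.5], bias)  # 0.5*2.0 + (-1.0)*0.5 + 0.25 = 0.75
out_neg = perceptron(weights, [0.0, 1.0], bias)  # -1.0 + 0.25 = -0.75, clamped to 0.0

print(out_pos, out_neg)
```

The second call shows why the activation matters: without ReLU the unit would output -0.75, and with it the negative pre-activation is cut off at zero.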

Each section has slides teaching the concepts, followed by by-hand Excel exercises that develop intuition for the math, and then coding examples. The goal is to be able to grok all parts of modern LLM development.
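As an illustration of that "math by hand, then code" pattern, here is a small sketch of SoftMax and cross-entropy loss, two of the topics listed above. It is an assumption about what such an exercise could look like, not the workshop's own code; the logits are arbitrary example numbers, and each step maps onto something you could compute in a spreadsheet column.

```python
import math


def softmax(logits):
    # Subtract the max logit first for numerical stability;
    # this does not change the resulting probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    # Negative log of the probability assigned to the correct class.
    probs = softmax(logits)
    return -math.log(probs[target_index])


# Arbitrary example logits for a 3-token vocabulary (illustrative only).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(logits, 0)

print(probs, loss)
```

A useful sanity check when doing this by hand: if all logits are equal, every class gets probability 1/3, so the loss is ln(3) regardless of which class is correct.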

We did this workshop in person in San Francisco last month, and hopefully the spaciousness of watching online works for everyone. If you don't like watching videos, you can get the slides and exercises and work through them at your own pace.

submitted by /u/JustinAngel