Arxiv Dives - Self-Rewarding Language Models
The goal of this paper is to see if we can create a self-improving feedback loop to achieve “superhuman agents”.
Arxiv Dives - Direct Preference Optimization (DPO)
This paper provides a simple and stable alternative to RLHF for aligning Large Language Models with human preferences called "Direct Preference Optimization".
Arxiv Dives - Efficient Streaming Language Models with Attention Sinks
This paper introduces the concept of an Attention Sink which helps Large Language Models (LLMs) maintain the coherence of text
Arxiv Dives - How Mixture of Experts works with Mixtral 8x7B
Mixtral 8x7B is an open source mixture of experts large language model released by the team at Mistral.ai that
Arxiv Dives - LLaVA 🌋 an open source Large Multimodal Model (LMM)
What is LLaVA?
LLaVA is a Multi-Modal model that connects a Vision Encoder and an LLM for general purpose visual
Practical ML Dive - Building RAG from Open Source Pt 1
RAG was introduced by the Facebook AI Research (FAIR) team in May of 2020 as an end-to-end way to include
Arxiv Dives - How Mistral 7B works
What is Mistral 7B?
Mistral 7B is an open weights large language model by Mistral.ai that was built for
Practical ML Dive - How to train Mamba for Question Answering
What is Mamba 🐍?
There is a lot of hype about Mamba being a fast alternative to the Transformer architecture. The
Mamba: Linear-Time Sequence Modeling with Selective State Spaces - Arxiv Dives
What is Mamba 🐍?
Mamba at its core is a recurrent neural network architecture that outperforms Transformers with faster
Practical ML Dive - How to customize a Vision Transformer on your own data
Welcome to Practical ML Dives, a series spin off of Arxiv Dives.
In Arxiv Dives, we cover state of the