Arxiv Dives

Each week we dive deep into a topic in machine learning, data management, or general artificial intelligence research. These are notes from a live reading group we hold every Friday, captured for future reference. Join the community here: https://lu.ma/oxen

Apr

How RWKV-7 Goose Works 🪿 + Notes from the Author

In this special Arxiv Dive, we're joined by Eugene Cheah - author, lead in RWKV org, CEO of

Greg Schoeninger

Apr 15, 2025

17 min read

Mar

How Phi-4 Cracked Small Multimodality

Phi-4 extends the existing Phi model’s capabilities by adding vision and audio all in the same model. This means

Greg Schoeninger

Mar 25, 2025

8 min read

Feb

Why GRPO is Important and How it Works

Last week on Arxiv Dives we dug into research behind DeepSeek-R1, and uncovered that one of the techniques they use

Greg Schoeninger

Feb 11, 2025

12 min read

Feb

How DeepSeek R1, GRPO, and Previous DeepSeek Models Work

In January 2025, DeepSeek took a shot directly at OpenAI by releasing a suite of models that “Rival OpenAI’s

Greg Schoeninger

Feb 4, 2025

15 min read

Jan

No Hype DeepSeek-R1 Reading List

DeepSeek-R1 is a big step forward in the open model ecosystem for AI, with their latest model competing with OpenAI

Greg Schoeninger

Jan 29, 2025

27 min read

Jan

arXiv Dive: RAGAS - Retrieval Augmented Generation Assessment

RAGAS is an evaluation framework for Retrieval Augmented Generation (RAG), introduced in a paper released by Exploding Gradients, AMPLYFI, and CardiffNLP. RAGAS

Greg Schoeninger

Jan 21, 2025

13 min read

Dec

OpenCoder: The OPEN Cookbook For Top-Tier Code LLMs

Welcome to the last arXiv Dive of 2024! Every other week we have been diving into interesting research papers in

Greg Schoeninger

Dec 23, 2024

14 min read

Dec

LLaVA-CoT: Let Vision Language Models Reason Step-By-Step

When it comes to large language models, it is still the early innings. Many of them still hallucinate, fail to

Greg Schoeninger

Dec 9, 2024

12 min read

Nov

How Upcycling MoEs Beat Dense LLMs

In this Arxiv Dive, Nvidia researcher Ethan He presents his co-authored work, Upcycling LLMs in Mixture of Experts (MoE). He

Greg Schoeninger

Nov 18, 2024

1 min read

Nov

Thinking LLMs: General Instruction Following with Thought Generation

The release of OpenAI-O1 has motivated a lot of people to think deeply about…thoughts 💭. Thinking before you speak is

Greg Schoeninger

Nov 11, 2024

14 min read