Arxiv Dives

Each week we dive deep into a topic in machine learning, data management, or general artificial intelligence research. These are notes from a live reading group we hold every Friday, captured for future reference. Join the community here: https://lu.ma/oxen

Oct

The Prompt Report Part 2: Plan and Solve, Tree of Thought, and Decomposition Prompting

In the last blog, we went over prompting techniques 1-3 of The Prompt Report. This arXiv Dive, we were lucky

Greg Schoeninger

Oct 31, 2024

17 min read

Oct

The Prompt Report Part 1: A Systematic Survey of Prompting Techniques

For this blog we are switching it up a bit. In past Arxiv Dives, we have gone deep into the

Greg Schoeninger

Oct 9, 2024

12 min read

Sep

arXiv Dive: How Flux and Rectified Flow Transformers Work

Flux made quite a splash with its release on August 1st, 2024 as the new state-of-the-art generative

Greg Schoeninger

Sep 18, 2024

9 min read

Aug

arXiv Dive: How Meta Trained Llama 3.1

Llama 3.1 is a set of Open Weights Foundation models released by Meta, which marks the first time an

Greg Schoeninger

Aug 26, 2024

12 min read

Jul

ArXiv Dives: How ReFT works

ArXiv Dives is a series of live meetups that take place on Fridays with the Oxen.ai community. We believe

Greg Schoeninger

Jul 21, 2024

10 min read

Jun

ArXiv Dives: 💃 Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Modeling sequences with infinite context length is one of the dreams of Large Language Models. Some LLMs such as Transformers

Greg Schoeninger

Jun 26, 2024

5 min read

Jun

ArXiv Dives: Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

The ability to interpret and steer large language models is an important topic as they become more and more a

Greg Schoeninger

Jun 4, 2024

9 min read

May

ArXiv Dives: Efficient DiT Fine-Tuning with PixART for Text to Image Generation

Diffusion Transformers have been gaining a lot of steam since OpenAI's demo of Sora back in February. The

Greg Schoeninger

May 29, 2024

9 min read

May

ArXiv Dives: Evaluating LLMs for Code Completion with HumanEval

Large Language Models have shown very good ability to generalize within a distribution, and frontier models have shown incredible flexibility

Greg Schoeninger

May 17, 2024

15 min read

Apr

How to Train Diffusion for Text from Scratch

This is part two of a series on Diffusion for Text with Score Entropy Discrete Diffusion (SEDD) models. Today we

Greg Schoeninger

Apr 29, 2024

16 min read