Latest
Dec 26
The Best AI Data Version Control Tools [2025]
Data is often seen as static. It's common to just dump your data into S3 buckets in tarballs
6 min read
Dec 23
OpenCoder: The OPEN Cookbook For Top-Tier Code LLMs
Welcome to the last arXiv Dive of 2024! Every other week we have been diving into interesting research papers in
14 min read
Dec 09
LLaVA-CoT: Let Vision Language Models Reason Step-By-Step
When it comes to large language models, it is still the early innings. Many of them still hallucinate, fail to
12 min read
Nov 18
How Upcycling MoEs Beat Dense LLMs
In this arXiv Dive, Nvidia researcher Ethan He presents his co-authored work Upcycling LLMs in Mixture of Experts (MoE). He
1 min read
Nov 11
Thinking LLMs: General Instruction Following with Thought Generation
The release of OpenAI-O1 has motivated a lot of people to think deeply about…thoughts 💭. Thinking before you speak is
14 min read
Oct 31
The Prompt Report Part 2: Plan and Solve, Tree of Thought, and Decomposition Prompting
In the last blog, we went over prompting techniques 1-3 of The Prompt Report. This arXiv Dive, we were lucky
17 min read
Oct 09
The Prompt Report Part 1: A Systematic Survey of Prompting Techniques
For this blog, we are switching it up a bit. In past arXiv Dives, we have gone deep into the
12 min read
Sep 18
arXiv Dive: How Flux and Rectified Flow Transformers Work
Flux made quite a splash with its release on August 1st, 2024 as the new state of the art generative
9 min read
Sep 13
How Well Can Llama 3.1 8B Detect Political Spam? [4/4]
It only took about 11 minutes to fine-tune Llama 3.1 8B on our political spam synthetic dataset using ReFT.
3 min read
Sep 04
Fine-Tuning Llama 3.1 8B in Under 12 Minutes [3/4]
Meta has recently released Llama 3.1, including their 405 billion parameter model which is the most capable open model
3 min read