Oxen.ai

Latest

Dec 09
LLaVA-CoT: Let Vision Language Models Reason Step-By-Step

When it comes to large language models, it is still the early innings. Many of them still hallucinate, fail to
12 min read
Nov 18
How Upcycling MoEs Beat Dense LLMs

In this arXiv Dive, Nvidia researcher Ethan He presents his co-authored work Upcycling LLMs in Mixture of Experts (MoE). He
1 min read
Nov 11
Thinking LLMs: General Instruction Following with Thought Generation

The release of OpenAI-O1 has motivated a lot of people to think deeply about…thoughts 💭. Thinking before you speak is
14 min read
Oct 31
The Prompt Report Part 2: Plan and Solve, Tree of Thought, and Decomposition Prompting

In the last blog, we went over prompting techniques 1-3 of The Prompt Report. This arXiv Dive, we were lucky
17 min read
Oct 09
The Prompt Report Part 1: A Systematic Survey of Prompting Techniques

For this blog, we are switching it up a bit. In past arXiv Dives, we have gone deep into the
12 min read
Sep 18
arXiv Dive: How Flux and Rectified Flow Transformers Work

Flux made quite a splash with its release on August 1st, 2024 as the new state of the art generative
9 min read
Sep 13
How Well Can Llama 3.1 8B Detect Political Spam? [4/4]

It only took about 11 minutes to fine-tune Llama 3.1 8B on our political spam synthetic dataset using ReFT.
3 min read
Sep 04
Fine-Tuning Llama 3.1 8B in Under 12 Minutes [3/4]

Meta has recently released Llama 3.1, including their 405 billion parameter model which is the most capable open model
3 min read
Aug 26
arXiv Dive: How Meta Trained Llama 3.1

Llama 3.1 is a set of Open Weights Foundation models released by Meta, which marks the first time an
12 min read
Aug 22
How to De-duplicate and Clean Synthetic Data [2/4]

Synthetic data has shown promising results for training and fine-tuning large models, such as Llama 3.1 and the
6 min read