The Prompt Report Part 2: Plan and Solve, Tree of Thought, and Decomposition Prompting
In the last blog, we went over prompting techniques 1-3 of The Prompt Report. For this ArXiv Dive, we were lucky
The Prompt Report Part 1: A Systematic Survey of Prompting Techniques
For this blog we are switching it up a bit. In past ArXiv Dives, we have gone deep into the
ArXiv Dives: How ReFT works
ArXiv Dives is a series of live meetups that take place on Fridays with the Oxen.ai community. We believe
ArXiv Dives: Efficient DiT Fine-Tuning with PixART for Text to Image Generation
Diffusion Transformers have been gaining a lot of steam since OpenAI's demo of Sora back in March. The
ArXiv Dives: Evaluating LLMs for Code Completion with HumanEval
Large Language Models have shown very good ability to generalize within a distribution, and frontier models have shown incredible flexibility
How to Train Diffusion for Text from Scratch
This is part two of a series on Diffusion for Text with Score Entropy Discrete Diffusion (SEDD) models. Today we
ArXiv Dives: Text Diffusion with SEDD
Diffusion models have been popular for computer vision tasks. Recently models such as Sora show how you can apply Diffusion
ArXiv Dives: The Era of 1-bit LLMs, All Large Language Models are in 1.58 Bits
This paper presents BitNet b1.58 where every weight in a Transformer can be represented as a {-1, 0, 1}
ArXiv Dives: Evolutionary Optimization of Model Merging Recipes
Today, we’re diving into a fun paper by the team at Sakana.ai called “Evolutionary Optimization of Model Merging
ArXiv Dives: I-JEPA
Today, we’re diving into the I-JEPA paper. JEPA stands for Joint-Embedding Predictive Architecture and if you have been following