How to Fine-Tune a FLUX.1-dev LoRA with Code, Step by Step
FLUX.1-dev is one of the most popular open-weight models available today. Developed by Black Forest Labs, it has 12 billion parameters…
How to Fine-Tune PixArt to Generate a Consistent Character
Can we fine-tune a small diffusion transformer (DiT) to generate OpenAI-level images by distilling from OpenAI images?
How to Fine-Tune Qwen3 on Text2SQL to GPT-4o-Level Performance
Welcome to a new series from the Oxen.ai Herd called Fine-Tuning Fridays! Each week we will take an open…
Fine-Tuning Fridays
Welcome to a new series from the Oxen.ai Herd called Fine-Tuning Fridays! Each week we will take an open…
How RWKV-7 Goose Works 🪿 + Notes from the Author
In this special Arxiv Dive, we're joined by Eugene Cheah: author, lead of the RWKV org, and CEO of…
How Phi-4 Cracked Small Multimodality
Phi-4 extends the existing Phi model’s capabilities by adding vision and audio, all in the same model. This means…
Training a Rust 1.5B Coder LM with Reinforcement Learning (GRPO)
Group Relative Policy Optimization (GRPO) has proven to be a useful algorithm for training LLMs to reason and improve on…
Why GRPO is Important and How it Works
Last week on Arxiv Dives, we dug into the research behind DeepSeek-R1 and uncovered that one of the techniques they use…
🧠 GRPO VRAM Requirements For the GPU Poor
Since the release of DeepSeek-R1, Group Relative Policy Optimization (GRPO) has become the talk of the town for Reinforcement Learning…
How DeepSeek R1, GRPO, and Previous DeepSeek Models Work
In January 2025, DeepSeek took a shot directly at OpenAI by releasing a suite of models that “Rival OpenAI’s…”