How NOT to store unstructured machine learning datasets
Training data is typically the most valuable part of any machine learning project. As we converge on model architectures like
🧼 SUDS - A Guide to Structuring Unstructured Data
At Oxen.ai we value high quality datasets. We have many years of experience training and evaluating models, and have
Arxiv Dives - Vision Transformers (ViT)
With all of the hype around Transformers for natural language processing and text, the authors of this paper beg the
Reading List For Andrej Karpathy’s “Intro to Large Language Models” Video
Andrej Karpathy recently released an hour-long talk on “The busy person’s intro to large language models” that had
Arxiv Dives - A Mathematical Framework for Transformer Circuits - Part 2
Every Friday at Oxen.ai we host a paper club called "Arxiv Dives" to make us smarter Oxen
Arxiv Dives - A Mathematical Framework for Transformer Circuits - Part 1
Every Friday at Oxen.ai we host a paper club called "Arxiv Dives" to make us smarter Oxen
Data Version Control 101 with Oxen
This intro tutorial from Oxen.ai shows how Oxen can make versioning your data as easy as versioning your code.
Arxiv Dive Manifesto
Every Friday the team at Oxen.ai gets together and goes over research papers, blog posts, or books that help
Arxiv Dives - Attention Is All You Need
Every Friday at Oxen.ai we host a paper club called "Arxiv Dives" to make us smarter Oxen
Arxiv Dives - How LoRA fine-tuning works
Every Friday at Oxen.ai we host a paper club called "Arxiv Dives" to make us smarter Oxen