Aakash Nain
Archive
NVILA
Efficient Frontier Visual Language Models
Dec 13, 2024
Rotary Position Encoding
A figure among cyphers: Part-1
Dec 10, 2024
PaliGemma 2
A Family of Versatile VLMs for Transfer
Dec 9, 2024
Star Attention
Efficient LLM Inference over Long Sequences
Dec 2, 2024
AIMv2
Multimodal Autoregressive Pre-training of Large Vision Encoders
Nov 27, 2024
JanusFlow
Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
Nov 25, 2024
Cut Your Losses in Large-Vocabulary Language Models
Nov 20, 2024
The Super Weight in Large Language Models
Nov 13, 2024
Depth Pro
Nov 8, 2024
A Hitchhiker’s Guide to Scaling Law Estimation
Nov 4, 2024
OmniParser for Pure Vision Based GUI Agent
Oct 28, 2024
Normalized Transformer
Oct 23, 2024
What Matters for Model Merging at Scale?
Oct 15, 2024
Agent Workflow Memory
Sep 24, 2024