Aakash Kumar Nain

Archive

Reverse-Engineered Reasoning for Open-Ended Generation

Sep 12, 2025

On the Theoretical Limitations of Embedding-Based Retrieval

Sep 9, 2025

Kimi K2: Open Agentic Intelligence

Jul 25, 2025

Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check

Jul 7, 2025

L1

Mar 10, 2025

Matryoshka Quantization

Feb 14, 2025

Janus-Pro

Jan 28, 2025

DeepSeek-R1

Jan 21, 2025

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Jan 20, 2025

DeepSeek-VL2

Dec 17, 2024

Gaze-LLE

Dec 16, 2024

NVILA

Dec 13, 2024

Rotary Position Encoding

Dec 10, 2024

PaliGemma 2

Dec 9, 2024

Star Attention

Dec 2, 2024

AIMv2

Nov 27, 2024

JanusFlow

Nov 25, 2024

Cut Your Losses in Large-Vocabulary Language Models

Nov 20, 2024

The Super Weight in Large Language Models

Nov 13, 2024

Depth Pro

Nov 8, 2024

A Hitchhiker’s Guide to Scaling Law Estimation

Nov 4, 2024

OmniParser for Pure Vision Based GUI Agent

Oct 28, 2024

Normalized Transformer

Oct 23, 2024

What Matters for Model Merging at Scale?

Oct 15, 2024

Agent WorkFlow Memory

Sep 24, 2024