Blog Posts
ML-DL Concepts
Rotary Position Encoding
LLMs
position encoding
advanced
Paper Summaries
Reverse-Engineered Reasoning for Open-Ended Generation
papers
summary
research
retrieval
On the Theoretical Limitations of Embedding-Based Retrieval
papers
summary
research
retrieval
Kimi K2: Open Agentic Intelligence
papers
summary
research
llm
agents
Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
papers
summary
research
llm
scaling
L1
papers
summary
research
lrm
Matryoshka Quantization
papers
summary
research
quantization
llms
Janus-Pro
papers
summary
research
diffusion
DeepSeek-R1
papers
summary
research
LLMs
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
papers
summary
research
diffusion
DeepSeek-VL2
papers
summary
research
VLMs
Gaze-LLE
papers
vision
NVILA
papers
summary
research
VLMs
PaliGemma 2
papers
summary
research
VLMs
Star Attention
papers
summary
research
LLMs
AIMv2
papers
summary
research
MLLMs
JanusFlow
papers
summary
research
MLLMs
generation
Cut Your Losses in Large-Vocabulary Language Models
papers
summary
research
LLMs
The Super Weight in Large Language Models
papers
summary
research
LLMs
Depth Pro
papers
summary
research
vision
A Hitchhiker’s Guide to Scaling Law Estimation
papers
summary
research
transformers
scaling
OmniParser for Pure Vision Based GUI Agent
papers
summary
research
VLMs
MLLMs
Normalized Transformer
papers
summary
transformers
research
LLMs
What Matters for Model Merging at Scale?
papers
summary
transformers
research
LLMs
model_merging
Agent Workflow Memory
papers
summary
research
agents