From 0623ceb960f277479dd5721d08227c611a2375f0 Mon Sep 17 00:00:00 2001
From: Kye Gomez <98760976+kyegomez@users.noreply.github.com>
Date: Sun, 19 Apr 2026 22:01:10 -0400
Subject: [PATCH] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 3542f2c..4439d56 100644
--- a/README.md
+++ b/README.md
@@ -338,6 +338,7 @@ Theoretical analysis suggests 2-3x improvements in inference throughput. For a d
 - Reasoning with Latent Thoughts — On the Power of Looped Transformers: https://arxiv.org/abs/2502.17416
 - Training Large Language Models to Reason in a Continuous Latent Space: https://arxiv.org/abs/2412.06769
 - Relaxed Recursive Transformers — Effective Parameter Sharing with Layer-wise LoRA: https://arxiv.org/pdf/2410.20672
+- Mixture-of-Depths Attention: https://arxiv.org/abs/2603.15619
 ---