Attentive RAG

EigenAttention

Embedding-Conditioned External Attention Priors for Schema-Aware Context Routing


Abstract

Large language models (LLMs) struggle with long-range context awareness in complex codebases and organizational knowledge systems. Traditional retrieval-augmented generation (RAG) selects documents based on semantic similarity to a prompt, but it does not model attention structure. As a result, retrieval is reactive and often shallow: it retrieves what looks similar, not what is structurally relevant.
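For concreteness, a minimal sketch of this similarity-only baseline (NumPy; the function and variable names are illustrative, and embeddings are assumed to be precomputed):

```python
import numpy as np

def baseline_rag(prompt_emb, doc_embs, k=5):
    """Standard RAG retrieval: rank documents by cosine similarity to the prompt.

    prompt_emb : (d,)   embedding of the incoming prompt
    doc_embs   : (n, d) embeddings of the candidate documents
    """
    sims = doc_embs @ prompt_emb
    sims = sims / (np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(prompt_emb))
    return np.argsort(-sims)[:k]  # indices of the k most similar documents
```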

This paper proposes EigenAttention, an embedding-conditioned external attention routing mechanism that approximates transformer-style attention at the schema level rather than the document level. Instead of retrieving documents directly from a prompt embedding, the system retrieves an attention prior—a reusable ranking distribution over contextual entries—based on similarity between the prompt embedding and a set of learned attention indices.
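A minimal sketch of this routing step, assuming precomputed attention indices and per-index priors; `retrieve_attention_prior`, `index_embs`, and `priors` are hypothetical names used here for illustration, not part of any released implementation:

```python
import numpy as np

def retrieve_attention_prior(prompt_emb, index_embs, priors, temperature=1.0):
    """Route a prompt embedding to a mixture of stored attention priors.

    prompt_emb : (d,)   embedding of the incoming prompt
    index_embs : (k, d) learned attention indices, one per schema pattern
    priors     : (k, n) each row is a ranking distribution over n context entries
    """
    # Cosine similarity between the prompt and each attention index.
    sims = index_embs @ prompt_emb
    sims = sims / (np.linalg.norm(index_embs, axis=1) * np.linalg.norm(prompt_emb))

    # Numerically stable softmax over indices: how strongly each stored
    # attention pattern applies to this prompt.
    weights = np.exp((sims - sims.max()) / temperature)
    weights /= weights.sum()

    # The retrieved prior is a weighted mixture of the per-index distributions.
    return weights @ priors  # (n,) ranking distribution over context entries
```

Because the retrieved object is a distribution over context entries rather than a document list, the same prior can be reused across prompts that match the same schema pattern.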

The result is schema-aware contextualization: prompts retrieve patterns of relevance, not just similar documents. This more closely mirrors how humans recall structured knowledge, and it approximates how a transformer would allocate attention if its context window were effectively unbounded.


1. Introduction

Modern LLM workflows typically rely on:

  • Prompt engineering
  • Retrieval-Augmented Generation (RAG)