attention

17 Jan 2026 - 17 Jan 2026
    • Attention (machine learning) - Wikipedia
      • Attention-like mechanisms were introduced in the 1990s under names like multiplicative modules, sigma-pi units, and hypernetworks. Their flexibility comes from their role as "soft weights" that can change at runtime, in contrast to standard ("hard") weights, which stay fixed at runtime.
      • Hard weights come from training (the backward pass); soft weights are computed during use (the forward pass). Weird terminology, but OK. See the sketch below.
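      • A minimal sketch (my own illustration, not from the article) of how the two kinds of weights show up in scaled dot-product self-attention: the projection matrices are the hard weights, fixed after training, while the softmax-normalized attention scores are the soft weights, recomputed from the input on every forward pass. Dimensions and names are arbitrary for the example.
        ```python
        import numpy as np

        rng = np.random.default_rng(0)
        d = 8  # model / key dimension (illustrative)

        # Hard weights: learned projection matrices, frozen at inference time.
        W_q = rng.normal(size=(d, d))
        W_k = rng.normal(size=(d, d))
        W_v = rng.normal(size=(d, d))

        def attention(x):
            """Scaled dot-product self-attention over a sequence x of shape (n, d)."""
            q, k, v = x @ W_q, x @ W_k, x @ W_v
            scores = q @ k.T / np.sqrt(d)
            # Soft weights: depend on x, so they change with every new input.
            soft_weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            soft_weights /= soft_weights.sum(axis=-1, keepdims=True)
            return soft_weights @ v

        x = rng.normal(size=(5, d))  # a sequence of 5 token vectors
        out = attention(x)           # new x -> new soft weights; W_q, W_k, W_v unchanged
        ```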