Softmax Attention Mechanism

1. Query & Keys (Dot Product)

Query (Q): [1.0, 0.2] ("Apple")

  • Key 1 (K1): [0.9, 0.1] ("Fruit")
  • Key 2 (K2): [0.1, 0.9] ("Tech")
  • Key 3 (K3): [-0.5, 0.2] ("River")
2. Raw Scores -> Softmax -> Weights

The three raw scores (Q · K1, Q · K2, Q · K3) are passed through Softmax to produce Weights 1-3; the numeric readouts update live in the interactive demo.
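A minimal softmax implementation, applied to the raw scores from step 1 (a sketch in plain Python; the max-subtraction trick is a standard numerical-stability detail, not something the demo specifies):

```python
import math

# Softmax: turns raw similarity scores into positive weights that sum to 1.
def softmax(xs):
    m = max(xs)                            # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

weights = softmax([0.92, 0.28, -0.46])     # scores for "Apple" vs Fruit/Tech/River
# weights ≈ [0.56, 0.30, 0.14]: the highest weight goes to "Fruit".
```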

3. Weighted Sum of Values (Context)

  • Value 1: [1, 0, 0] (Red)
  • Value 2: [0, 1, 0] (Green)
  • Value 3: [0, 0, 1] (Blue)

Output (Context): the attention-weighted sum of the three values, shown live in the demo.
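The weighted sum in this step can be sketched as follows (plain Python; the weights below are the approximate softmax outputs for the "Apple" query, rounded for illustration):

```python
# Attention-weighted sum of the value vectors (the RGB "content").
def attend(weights, values):
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

values = [[1, 0, 0],   # Value 1 (Red)
          [0, 1, 0],   # Value 2 (Green)
          [0, 0, 1]]   # Value 3 (Blue)

context = attend([0.56, 0.30, 0.14], values)
# Because the values form an identity basis, the context equals the weights:
# mostly red, as expected for the "Apple" query.
```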

How It Works

This demo simulates how an AI "pays attention" to different words.

  • Query (Q): The word we are currently focusing on (e.g., "Apple").
  • Keys (K): The "labels" of other words in the database ("Fruit", "Tech", "River"). We calculate the Dot Product (similarity) between Q and each K.
  • Softmax: Converts raw similarity scores into probabilities (Weights) that sum to 1. High score = High attention.
  • Values (V): The actual content (represented here as colors). We calculate a Weighted Sum of Values based on the attention weights.
  • Result: If "Apple" is the query, it matches "Fruit" (Key 1) strongly. The output will be mostly Red (Value 1). If "Code" is the query, it matches "Tech" (Key 2), outputting Green.
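The full pipeline described in the bullets above can be sketched end to end (an illustrative plain-Python version using the demo's vectors, not the demo's actual implementation):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, keys, values):
    # 1. dot-product scores, 2. softmax weights, 3. weighted sum of values
    weights = softmax([dot(q, k) for k in keys])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

keys   = [[0.9, 0.1], [0.1, 0.9], [-0.5, 0.2]]   # "Fruit", "Tech", "River"
values = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]       # Red, Green, Blue

ctx = attention([1.0, 0.2], keys, values)        # query: "Apple"
# The first (red) component dominates, matching the demo's "Fruit" result.
```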