How It Works
RoPE encodes position by rotating the word vector in a 2D plane. The rotation angle is proportional to the position index.
Angle_m = m * θ
Angle_n = n * θ
Relative Angle = (m - n) * θ
Observe the following behaviors:
- Absolute Position: As you increase position m or n, the corresponding vector rotates counter-clockwise.
- Relative Position: The angle between the two vectors depends only on the difference (m - n).
- Translation Invariance: If you increase both m and n by the same amount (e.g., +5), both vectors rotate, but the angle between them stays exactly the same. This is why RoPE captures relative position so well!