dm.cs.tu-dortmund.de/mlbits/neural-nlp-positional-encoding/
Positional Encoding – Lecture Notes
log or power functions) [CFRR22]
RoPE still seems to be the favorite choice; e.g., Google Gemma 3 (2025) uses RoPE with an increased frequency
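
To illustrate what changing the frequency in RoPE could look like, here is a minimal pure-Python sketch. It assumes that "frequency" refers to the base constant in the usual RoPE angle formula, theta_i = position * base^(-2i/d); the notes do not spell this out. The names `rope_angles`, `apply_rope`, and `dot` are hypothetical helpers for this example, and the value 1e6 is only an illustrative larger base, not a setting taken from the notes.

```python
import math

def rope_angles(position, dim, base=10000.0):
    """Angles for one position: theta_i = position * base**(-2i/dim)."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]

def apply_rope(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[2i], vec[2i+1]) by its angle."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("dimension must be even")
    out = []
    for i, theta in enumerate(rope_angles(position, dim, base)):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.25]
    # The score depends only on the offset between positions (here: 2).
    print(dot(apply_rope(q, 5), apply_rope(k, 3)))
    print(dot(apply_rope(q, 12), apply_rope(k, 10)))
    # A larger base (illustrative value) slows the rotation of later pairs.
    print(rope_angles(100, 8, base=1e4)[-1], rope_angles(100, 8, base=1e6)[-1])
```

The first two printed scores agree up to floating-point error, which is the relative-position property of RoPE. With a larger base, the later dimension pairs rotate more slowly, so their angles take more positions to wrap around; this is one common motivation for raising the base when working with longer contexts.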
References
[CFRR22]
Chi, T.-C., Fan, T.-H., Ramadge, P.J. and …