$$ W_T = (1 - 2\mu\lambda)^{n} W_{T-n} - \mu \sum_{j=1}^{n} (1 - 2\mu\lambda)^{j-1} G_{T-j} $$
Parameters: B = 3, μ = 0.05, λ = 0.01
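As a minimal sketch, the unrolled expression for W_T can be checked numerically against a step-by-step update. This assumes the per-step rule is W_{t+1} = (1 - 2μλ) W_t − μ G_t, which is what the unrolled form implies but which is not stated explicitly here. The function names, the 4×4 matrix size, n = 20, and the random stand-in gradients are all hypothetical choices for illustration.

```python
import numpy as np

def sgd_weight_decay_step(W, G, mu, lam):
    """One assumed update step: W_{t+1} = (1 - 2*mu*lam) * W_t - mu * G_t."""
    return (1.0 - 2.0 * mu * lam) * W - mu * G

def unrolled(W_start, grads, mu, lam):
    """Unrolled form after n = len(grads) steps.

    grads[k] is the gradient at step T-n+k, so G_{T-j} = grads[n-j].
    """
    n = len(grads)
    decay = 1.0 - 2.0 * mu * lam
    W = decay**n * W_start
    for j in range(1, n + 1):
        W = W - mu * decay ** (j - 1) * grads[n - j]
    return W

rng = np.random.default_rng(0)
mu, lam, n = 0.05, 0.01, 20              # mu and lam as shown; n is hypothetical
W0 = rng.standard_normal((4, 4))         # plays the role of W_{T-n}
grads = [rng.standard_normal((4, 4)) for _ in range(n)]  # stand-in gradients

# Iterate the assumed one-step rule n times.
W_iter = W0
for G in grads:
    W_iter = sgd_weight_decay_step(W_iter, G, mu, lam)

# Evaluate the unrolled sum directly.
W_closed = unrolled(W0, grads, mu, lam)

# Decay weight applied to G_{T-j}, for j = 1..n (most recent first).
decay_weights = (1.0 - 2.0 * mu * lam) ** np.arange(n)

print(np.allclose(W_iter, W_closed))
print(decay_weights[:3])
```

With μ = 0.05 and λ = 0.01 the per-step factor is 1 − 2μλ = 0.999, so each older gradient is down-weighted only slightly relative to the next more recent one.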
Panels: weight matrix W_T; singular values; memory window (recent gradient updates, opacity = decay weight)