HSE SLT, lecture 13: proof of lazy training in wide networks
- Proof that parameter updates in wide networks have small norm
- Neural tangent kernel and its minimum eigenvalue
- Solving linear systems with gradient descent, the condition number, and the link to double descent
- Label-dependent bound on the distance, and optional tasks (see chapter 18 of the notes)

Course website: http://wiki.cs.hse.ru/Statistical_learning_theory_2025
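The third topic above (gradient descent on linear systems and the role of the condition number) can be illustrated with a minimal sketch. This is a hypothetical example, not taken from the lecture notes: gradient descent on the least-squares problem min_w ||Aw - b||^2 converges at a rate governed by the condition number kappa of the Hessian A^T A, contracting the error by roughly (kappa - 1)/(kappa + 1) per step with the optimal constant step size.

```python
import numpy as np

# Hypothetical illustration: gradient descent on min_w ||A w - b||^2.
# The convergence rate is set by the condition number of the Hessian A^T A.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)

H = A.T @ A                      # Hessian of the quadratic objective
eigs = np.linalg.eigvalsh(H)
mu, L = eigs[0], eigs[-1]        # smallest / largest eigenvalue
kappa = L / mu                   # condition number

w = np.zeros(10)
step = 2.0 / (mu + L)            # optimal constant step size for a quadratic
for _ in range(2000):
    w -= step * A.T @ (A @ w - b)   # gradient step on the least-squares loss

# compare against the closed-form least-squares solution
w_star = np.linalg.lstsq(A, b, rcond=None)[0]
print(kappa, np.linalg.norm(w - w_star))
```

In the NTK/lazy-training picture discussed in this lecture, the same rate analysis applies with the tangent kernel playing the role of A^T A, which is why its minimum eigenvalue matters for convergence.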