HSE SLT, lecture 13: proof of lazy training in wide networks

- Proof that parameter updates in wide networks have small norm
- Neural tangent kernel and its minimal eigenvalue
- Solving linear systems with gradient descent, condition number, link to double descent (see the sketch below)
- Label-dependent bound on the distance and optional tasks (see chapter 18 of the notes)

Course website: http://wiki.cs.hse.ru/Statistical_learning_theory_2025
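Not part of the lecture materials, but a minimal illustrative sketch of the second and third bullets: gradient descent on a linear least-squares problem, where the Gram (tangent-kernel) matrix, its minimal eigenvalue, and the resulting condition number control the convergence rate. The random-feature setup and all names (`Phi`, `K`, `eta`) are assumptions for illustration, not the construction used in the notes.

```python
# A minimal sketch (not from the lecture): gradient descent on a linear
# least-squares problem, illustrating how the condition number of the
# Gram matrix K (the tangent kernel in the lazy-training setting)
# controls the convergence rate.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                      # n samples, overparameterized dimension d
Phi = rng.normal(size=(n, d))      # stand-in for the feature map / Jacobian
y = rng.normal(size=n)             # labels
K = Phi @ Phi.T                    # Gram (tangent kernel) matrix, n x n

eigvals = np.linalg.eigvalsh(K)    # ascending order
lam_min, lam_max = eigvals[0], eigvals[-1]
kappa = lam_max / lam_min          # condition number
eta = 1.0 / lam_max                # step size that guarantees contraction

# Gradient descent on f(theta) = 0.5 * ||Phi @ theta - y||^2.
theta = np.zeros(d)
for t in range(2000):
    residual = Phi @ theta - y
    theta -= eta * Phi.T @ residual

print(f"condition number kappa = {kappa:.1f}")
print(f"final residual norm    = {np.linalg.norm(Phi @ theta - y):.2e}")
# The residual contracts as (1 - lam_min/lam_max)^t = (1 - 1/kappa)^t,
# so a small minimal eigenvalue (large kappa) means slow convergence.
```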
