Optimization methods. MSAI 2022
The course surveys state-of-the-art results and approaches to solving applied optimization problems. Despite the focus on applications, it covers the theoretical foundations needed to understand why and how the methods work.
Classes are held online twice a week and last an hour and a half each. The lecture session gives a brief theoretical introduction to the topic; in the interactive practical session, students solve problems on the topic on their own, with Q&A.
Program
Introductory session. 📝 Notes. 📼 Video
Week 1
🦄 Lecture | 🏛 Seminar |
---|---|
Brief recap of matrix calculus. 📄 presentation 📝 notes 📼 video | Examples of matrix and vector derivatives. 📼 video 🐍 code |
Idea of automatic differentiation. 📄 presentation 📝 notes 🐍 code 📼 video | Work with automatic differentiation libraries: jax, pytorch, autograd. 📼 video 🐍 code |
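As a quick taste of the week's material, here is a minimal sketch (not from the course notebooks) that checks an analytic matrix-calculus gradient against JAX's reverse-mode autodiff; the quadratic form and the numbers are made up for illustration.

```python
import jax
import jax.numpy as jnp

A = jnp.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric matrix
b = jnp.array([1.0, -1.0])

def f(x):
    # f(x) = 0.5 x^T A x - b^T x, whose gradient is A x - b for symmetric A
    return 0.5 * x @ A @ x - b @ x

grad_f = jax.grad(f)                       # reverse-mode automatic differentiation
x0 = jnp.array([0.5, 2.0])
print(grad_f(x0))                          # gradient from autodiff
print(A @ x0 - b)                          # analytic gradient from matrix calculus
```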
Week 2
🦄 Lecture | 🏛 Seminar |
---|---|
Markowitz portfolio theory. 📄 presentation 📝 notes 📼 video 🐍 code | Building a portfolio based on real-world data. 📼 video 🐍 code |
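A minimal Markowitz-style sketch, assuming CVXPY is available; the returns are synthetic, not the real-world data used in the seminar.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(250, 5))   # 250 days, 5 assets (synthetic)
mu = returns.mean(axis=0)                          # expected returns
Sigma = np.cov(returns, rowvar=False)              # covariance matrix

w = cp.Variable(5)                                 # portfolio weights
gamma = 5.0                                        # risk-aversion parameter
objective = cp.Maximize(mu @ w - gamma * cp.quad_form(w, Sigma))
constraints = [cp.sum(w) == 1, w >= 0]             # fully invested, long-only
cp.Problem(objective, constraints).solve()
print(w.value)                                     # optimal weights for this gamma
```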
Week 3
🦄 Lecture | 🏛 Seminar |
---|---|
Applications of linear programming. 📄 presentation 📝 notes 📼 video 🐍 code | LP application exercises: selecting TED talks as an LP, production planning. 📼 video 🐍 code |
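A toy production-planning LP, assuming SciPy; all numbers are invented for illustration and are unrelated to the seminar data.

```python
import numpy as np
from scipy.optimize import linprog

# Maximize profit 40*x1 + 30*x2  ->  minimize the negative
c = np.array([-40.0, -30.0])
# Resource constraints
A_ub = np.array([[1.0, 2.0],     # machine hours per unit of each product
                 [3.0, 1.0]])    # raw material per unit of each product
b_ub = np.array([40.0, 60.0])    # available machine hours and material

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)           # optimal production plan and profit
```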
Week 4
🦄 Lecture | 🏛 Seminar |
---|---|
Zero-order methods: simulated annealing, evolutionary algorithms, genetic algorithms. The idea of the Nelder-Mead algorithm. 🐍 code ML model hyperparameter search with nevergrad 🐍 code and optuna 🐍 code. 📄 presentation 📝 notes 📼 video | ML model hyperparameter search with optuna and keras. 📼 video 🐍 code |
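A minimal optuna sketch on a toy objective with a known minimum, standing in for the ML-model hyperparameter search from the class notebooks.

```python
import optuna

def objective(trial):
    # Two "hyperparameters" sampled by optuna's zero-order search
    x = trial.suggest_float("x", -10.0, 10.0)
    log_lr = trial.suggest_float("log_lr", -5.0, 0.0)
    # Toy loss with its minimum at x = 2, log_lr = -3
    return (x - 2.0) ** 2 + (log_lr + 3.0) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)          # should be close to {"x": 2, "log_lr": -3}
```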
Week 5
🦄 Lecture | 🏛 Seminar |
---|---|
Newton method. 🐍 code Quasi-Newton methods. 🐍 code 📄 presentation 📝 notes 📼 video | Implementation of the damped Newton method. Finding the analytic center of a set. Convergence study. Comparison with other methods. Benchmarking of quasi-Newton methods. 📼 video 🐍 code |
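A sketch of a damped Newton method with backtracking, applied to the analytic center of a box; a simplified stand-in for the seminar implementation, with made-up constants.

```python
import numpy as np

def damped_newton(f, grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton's method with a backtracking (Armijo) line search."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        step = np.linalg.solve(H, g)        # Newton direction H^{-1} g
        if g @ step < tol:                  # squared Newton decrement
            break
        t = 1.0
        # Backtrack until sufficient decrease; also rejects steps where f is nan/inf
        while not (f(x - t * step) <= f(x) - 0.25 * t * (g @ step)):
            t *= 0.5
        x = x - t * step
    return x

# Analytic center of the box -1 <= x_i <= 1 via the log-barrier
f = lambda x: -np.sum(np.log(1.0 - x**2))
grad = lambda x: 2.0 * x / (1.0 - x**2)
hess = lambda x: np.diag(2.0 * (1.0 + x**2) / (1.0 - x**2) ** 2)
print(damped_newton(f, grad, hess, np.array([0.5, -0.3])))   # -> approximately [0, 0]
```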
Week 6
🦄 Lecture | 🏛 Seminar |
---|---|
Stochastic gradient descent. Batches, epochs, schedulers. Nesterov and Polyak momentum. Accelerated gradient method. Adaptive stochastic methods: Adam, RMSProp, AdaDelta. 🐍 code 📄 presentation 📝 notes 📼 video | A convergence study of SGD. Hyperparameter tuning. Convergence studies of accelerated and adaptive methods in neural network training. 📼 video 🐍 code |
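A minimal PyTorch sketch showing SGD with Nesterov momentum and a step scheduler on a toy regression problem; the model, data, and constants are illustrative, not the seminar code.

```python
import torch

# Synthetic linear-regression data
X = torch.randn(256, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(256, 1)

model = torch.nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, nesterov=True)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=20, gamma=0.5)  # halve lr every 20 epochs
loss_fn = torch.nn.MSELoss()

for epoch in range(100):                 # full-batch "epochs" for brevity
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    sched.step()
print(float(loss))

# Swapping in an adaptive method is a one-line change, e.g.
# opt = torch.optim.Adam(model.parameters(), lr=1e-2)
```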
Week 7
🦄 Lecture |
---|
The loss landscape of a neural network. Neural network fine-tuning, a.k.a. transfer learning. Neural style transfer. 🐍 code Using GANs to learn a density distribution on the plane. Generating new Pokémon with deep neural networks. 🐍 code Visualizing the projection of a neural network's loss function onto a line and a plane. 🐍 code 📼 video |
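A sketch of the 1-D loss-landscape projection idea from the last item: evaluate the loss along a random direction in weight space around the current weights. The toy model and data are made up and are not the course notebooks.

```python
import torch
import matplotlib.pyplot as plt

# Toy classification data and a small network
X = torch.randn(512, 20)
y = (X[:, 0] > 0).long()
model = torch.nn.Sequential(torch.nn.Linear(20, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
loss_fn = torch.nn.CrossEntropyLoss()

theta0 = [p.detach().clone() for p in model.parameters()]   # current weights
direction = [torch.randn_like(p) for p in theta0]           # random direction in weight space

alphas, losses = torch.linspace(-1, 1, 51), []
for a in alphas:
    with torch.no_grad():
        for p, p0, d in zip(model.parameters(), theta0, direction):
            p.copy_(p0 + a * d)                             # move along the line theta0 + a * d
        losses.append(loss_fn(model(X), y).item())

plt.plot(alphas.numpy(), losses)
plt.xlabel("alpha")
plt.ylabel("loss")
plt.show()
```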