Universität Wien

390012 SE PhD-AW: Zero order stochastic optimization revisited (2022S)

Continuous assessment of course work
MIXED

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

Language: English

Lecturers

Classes

  • Wednesday 30.03. 15:00 - 16:30 Hörsaal 8, Oskar-Morgenstern-Platz 1, 1st floor
  • Thursday 31.03. 15:00 - 16:30 Hörsaal 6, Oskar-Morgenstern-Platz 1, 1st floor
  • Friday 01.04. 15:00 - 16:30 Hörsaal 5, Oskar-Morgenstern-Platz 1, ground floor
  • Monday 04.04. 15:00 - 16:30 Hörsaal 8, Oskar-Morgenstern-Platz 1, 1st floor
  • Tuesday 05.04. 15:00 - 16:30 Hörsaal 8, Oskar-Morgenstern-Platz 1, 1st floor
  • Monday 25.04. 11:30 - 13:00 Digital
  • Tuesday 26.04. 11:30 - 13:00 Digital
  • Wednesday 27.04. 11:30 - 13:00 Digital
  • Thursday 28.04. 11:30 - 13:00 Digital
  • Friday 29.04. 11:30 - 13:00 Digital

Information

Aims, contents and method of the course

We study the problem of finding the minimizer of a function by sequentially exploring its values under measurement noise. This problem is known as derivative-free or zero-order stochastic optimization. Classical results in the statistics literature, starting from the 1950s, focused on deriving consistent and asymptotically normal procedures and explored the rates of convergence when the number of queries tends to infinity while all other parameters stay fixed. In recent years, zero-order methods have re-emerged in the machine learning literature in the context of bandit problems, deep neural networks and other applications. In contrast to the earlier work in statistics, this line of research is interested in non-asymptotic guarantees on the performance of the algorithms and in understanding the explicit dependence of the bounds on the dimension and other parameters.
In this course, we provide an introduction to zero-order stochastic optimization from this new perspective and highlight its connection to nonparametric estimation. The main objects of study are approximations of the gradient descent algorithm in which the gradient is estimated by randomized procedures involving kernel smoothing. Assuming that the objective function is (strongly) convex and β-Hölder and that the noise is adversarial, we obtain non-asymptotic upper bounds both on the optimization error and on the cumulative regret of the algorithms. We also establish minimax lower bounds for any sequential search method, implying that the suggested algorithms are nearly optimal in a minimax sense in terms of sample complexity and the problem parameters. The approach is extended to several other settings, such as non-convex stochastic optimization, estimation of the minimum value of the function, and zero-order distributed stochastic optimization.
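To make the method concrete, here is a minimal sketch (in Python, assuming NumPy) of the kind of algorithm studied in the course: gradient descent in which the gradient is replaced by a two-point randomized estimate with kernel smoothing. The kernel K(r) = 3r, the step-size and bandwidth schedules, and all function names below are illustrative choices for the case β = 2, not the exact construction used in the course, and the projection onto the constraint set is omitted.

```python
import numpy as np

def zero_order_gd(f, x0, n_iter, eta, h, rng=None):
    """Gradient descent with a two-point kernel-smoothed zero-order
    gradient estimate (illustrative sketch, no projection step).

    f       noisy oracle: f(x) returns the objective value plus noise
    x0      starting point in R^d
    n_iter  number of iterations (two oracle queries per iteration)
    eta, h  callables giving the step size eta_t and perturbation radius h_t
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x0.size
    x = np.array(x0, dtype=float)
    for t in range(1, n_iter + 1):
        zeta = rng.standard_normal(d)
        zeta /= np.linalg.norm(zeta)      # random direction, uniform on the unit sphere
        r = rng.uniform(-1.0, 1.0)        # scalar smoothing variable
        kernel = 3.0 * r                  # kernel K(r) = 3r, adapted to smoothness beta = 2
        y_plus = f(x + h(t) * r * zeta)   # two noisy function evaluations
        y_minus = f(x - h(t) * r * zeta)
        g_hat = d / (2.0 * h(t)) * (y_plus - y_minus) * kernel * zeta
        x = x - eta(t) * g_hat            # projection onto the feasible set omitted
    return x

# Illustrative run on a noisy strongly convex quadratic in dimension 5
rng = np.random.default_rng(0)
objective = lambda x: float(np.sum((x - 1.0) ** 2)) + 0.1 * rng.standard_normal()
x_hat = zero_order_gd(objective, np.zeros(5), n_iter=50_000,
                      eta=lambda t: 1.0 / t, h=lambda t: t ** -0.25)
```

The two queries at x_t ± h_t r ζ_t yield an estimate whose expectation approximates the gradient of a smoothed version of the objective; the kernel factor reduces the bias of this surrogate when the objective has higher-order (β-Hölder) smoothness, which is what drives the dimension- and smoothness-dependent rates discussed above.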

Assessment and permitted materials

More information to be announced in the first session of the seminar

Minimum requirements and assessment criteria

More information to be announced in the first session of the seminar

Examination topics

More information to be announced in the first session of the seminar

Reading list

References:

Akhavan, A., Pontil, M., Tsybakov, A.B. (2020) Exploiting higher order smoothness in derivative-free optimization and continuous bandits. Proceedings of NeurIPS-2020. arXiv:2006.07862

Akhavan, A., Pontil, M., Tsybakov, A.B. (2021) Distributed zero-order optimization under adversarial noise. Proceedings of NeurIPS-2021. arXiv:2102.01121

Bach, F., Perchet, V. (2016) Highly-smooth zero-th order online optimization. Proc. 29th Annual Conference on Learning Theory, 1-27.

Polyak, B.T., Tsybakov, A.B. (1990) Optimal order of accuracy of search algorithms in stochastic optimization. Problems of Information Transmission, 26, 45-53.

Shamir, O. (2013) On the complexity of bandit and derivative-free stochastic convex optimization. J. Machine Learning Research, 30, 1-22.

Association in the course directory

Last modified: Th 11.05.2023 11:28