Topics covered in the course
The level of detail in the course is not uniform. Some topics are covered in depth, while others are only briefly mentioned. The following list provides a comprehensive overview of the topics discussed in the course.
Definitions
- Univariate, Multi-variate, real-valued, vector-valued functions
 - Convex, Non-Convex, Concave functions
 - Unimodal, Multimodal
 - Lipschitz Continuity
 - Strongly Convex
 - Smoothness
 - Positive Definitness
 
Derivatives
- Derivatives in Multiple Dimensions
 - Numerical Differentiation
 - Automatic Differentiation
 
Bracketing
Finding an Initial Bracket:
- Fibonacci Search
 - Golden Section Search
 - Quadratic Fit Search
 - Bisection Method
 
Descent Direction Iteration
Line Search:
- Exact Line Search
 - Approximate Line Search:
    
- Backtracking line search (Armijo line search)
 - Strong backtracking line search (bracketing + zoom)
 - Trust Region Methods
 
 - Termination Conditions
 
First-Order Methods:
- Gradient Descent
 - Conjugate Gradient
 - Momentum:
    
- Nesterov Momentum
 - Adagrad
 - RMSProp
 - Adadelta
 - Adam
 - Hypergradient Descent
 
 
Second-Order Methods:
- Newton’s Method
 - Secant Method
 - Quasi-Newton Methods:
    
- DFP
 - BFGS
 - L-BFGS
 
 
Direct Methods:
- Cyclic Coordinate Search
 - Powell’s Method
 - Hooke-Jeeves
 - Generalized Pattern Search
 - Nelder-Mead Simplex Method
 - (Divided Rectangles)
 
Beyond Local Optima
- Noisy Descent
 - Stochastic Gradient Descent
 - Mesh Adaptive Direct Search
 - Simulated Annealing
 - Cross-Entropy Method
 - Natural Evolution Strategies
 - Covariance Matrix Adaptation
 - Genetic Algorithms
 - Differential Evolution
 - Particle Swarm Optimization
 
Optimization for ML
ML tasks:
- empirical risk minimization
 
Convergence analysis of stochastic gradient
- convergence rate
 - definitions and results
 
Beyond stochastic gradient:
- noise reduction methods (dynamic sample size, gradient aggregation, iterated averaging)
 - second order methods
 
Constrained Optimization
- Duality
 - Penalty Methods
 - Interior Point Methods
 - Augmented Lagrange Method
 - Linear Programming
 - Simplex Method
 - Modeling in LP
 - Dual certificates
 
Sampling Methods
- Full Factorial
 - Random Sampling
 - Uniform Projection Plans
 - Stratified Sampling
 - Space-Filling Metrics
 - Space-Filling Subsets
 - Quasi-Random Sequences
 
Discrete Optimization
- Integer Linear Programming
    
- Branch and Bound
 - Cutting Planes
 
 - Dynamic Programming
 - (Constraint Programming and Backtracking)
 - Construction Heuristics and Local Search