By Daniel Molent, RMIT

The world in which we live is driven by optimisation. From public transport to the design of power networks and statistical learning techniques, optimisation is used in many spheres of our lives. Society relies on it as the underlying tool powering the technology and services we enjoy on a regular basis. Is such a powerful tool subject to limitations? Are there situations where optimisation fails to work, or works poorly? There are, and one such instance is its performance on non-smooth functions.

Many gradient-based descent methods are known to perform poorly on non-smooth functions. One method, however, the BFGS method, has been observed to perform well in comparison and can even converge when tuned appropriately. Although this was first observed over 25 years ago, until recently little research had been done into why the algorithm is successful. There are plenty of opportunities for research and development in this area, which I attempted to explore in my research project.

Optimisation algorithms attempt to find an input x such that f(x) is a minimum. A class of methods known as "gradient descent" methods approaches this problem by "stepping" down the function, using the gradient vector and the local shape of the function, until the minimum is found. This process is similar to how a person standing on a hill would find the bottom: a direction is chosen, a step is taken, and the process repeats until the bottom is reached.
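The stepping procedure described above can be sketched in a few lines. This is a minimal fixed-step gradient descent, not the method used in the project; the step size, tolerance, and test function are illustrative choices.

```python
import numpy as np

def gradient_descent(f, grad, x0, step=0.1, tol=1e-8, max_iter=1000):
    """Minimise f by repeatedly stepping against the gradient direction."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # gradient near zero: (local) minimum found
            break
        x = x - step * g  # take a step downhill
    return x

# minimise f(x, y) = x^2 + y^2, whose minimum is at the origin
x_min = gradient_descent(lambda x: x @ x, lambda x: 2 * x, [3.0, -4.0])
```

On a smooth bowl like this the iterates shrink geometrically towards the origin; it is precisely on non-smooth functions that this simple scheme starts to struggle.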

The method was tested on the non-smooth Rosenbrock function, which is known to be difficult for optimisation algorithms to minimise. The algorithm was found to zigzag along the valley of the function, following it closely towards the minimum. The rate of convergence slowed significantly as the iterates drew close, and the method back-tracked periodically. While the performance was not optimal, it still managed to locate the minimum.
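The exact variant used in the project is not specified here, but a commonly studied non-smooth Rosenbrock function is the Lewis–Overton variant sketched below (an assumption on my part); the absolute-value term creates a non-smooth valley along y = x², with the minimum at (1, 1). Running SciPy's BFGS on it, with gradients estimated numerically, shows the behaviour described above.

```python
import numpy as np
from scipy.optimize import minimize

def nonsmooth_rosenbrock(z):
    """Lewis–Overton non-smooth Rosenbrock: non-smooth along the valley y = x^2."""
    x, y = z
    return 8 * abs(x**2 - y) + (1 - x) ** 2

x0 = [-1.5, 2.0]
# BFGS with finite-difference gradients; it zigzags along the non-smooth valley
result = minimize(nonsmooth_rosenbrock, x0, method="BFGS")
```

BFGS typically makes substantial progress towards (1, 1) before its line search begins to fail near the non-smooth ridge, matching the slowdown observed in the project.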

Through the project, my supervisor Prof. Andrew Eberhard and I determined that the method attempted to follow the smoothness in the function described by UV-space theory, which suggests that non-smooth functions are partially smooth in the sense that there exist manifolds along which the function is smooth and directions across which it is not.
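A toy example makes the UV picture concrete. Assuming the simple model f(x, y) = |x| + (y − 1)², the function is smooth along the manifold x = 0 (movement within the manifold, the "U" directions) but kinked across it (the "V" direction), where the slope in x jumps from −1 to +1.

```python
import numpy as np

def f(x, y):
    # smooth in y, non-smooth in x across the manifold x = 0
    return abs(x) + (y - 1) ** 2

# restricted to the manifold x = 0, f reduces to the smooth function (y - 1)^2
ys = np.linspace(-1.0, 3.0, 5)
on_manifold = np.array([f(0.0, y) for y in ys])

# crossing the manifold, the one-sided slope in x is +1 on the right of 0
right_slope = (f(1e-6, 1.0) - f(0.0, 1.0)) / 1e-6
```

An iterate landing on such a manifold can follow it smoothly towards the minimum, which is the behaviour BFGS appeared to exploit.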

We attempted to modify the method to avoid following the non-smooth regions of the function too closely. An alternative way of choosing the step size, called a non-monotone line search, was trialled. This allowed the algorithm to take larger steps and escape the non-smooth region. When combined with a second modification, which in some instances projected the step direction onto the V space once it was located, the algorithm was able to correct the poor directions resulting from the line search modification. The final result was a method which followed the valley at a distance towards the minimum.
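The key idea of a non-monotone line search is that a candidate step only needs to beat the worst of the last few function values, rather than the current one, so the objective is allowed to rise temporarily and the iterate can jump clear of a non-smooth kink. The sketch below uses the classic Grippo-style backtracking rule; the exact variant and parameters used in the project are not given, so these are illustrative.

```python
import numpy as np

def nonmonotone_armijo(f, x, d, g, history, c=1e-4, shrink=0.5, memory=10):
    """Backtracking line search with a non-monotone Armijo acceptance rule.

    The step only has to improve on the worst of the last `memory` function
    values in `history`, so f may temporarily increase between iterates.
    """
    reference = max(history[-memory:])  # worst recent value, not the current one
    t = 1.0
    while f(x + t * d) > reference + c * t * g.dot(d):  # g.dot(d) < 0 for descent
        t *= shrink
        if t < 1e-12:  # give up: no acceptable step found
            break
    return t

# one step on f(x) = x^2 from x = 2 along the steepest-descent direction
f = lambda x: float(x @ x)
x, g = np.array([2.0]), np.array([4.0])
t = nonmonotone_armijo(f, x, -g, g.dot(-np.eye(1)).ravel() * 0 + g, [f(x)])
```

With a longer memory the reference value stays high for several iterations, which is what lets the modified method step away from the non-smooth valley instead of hugging it.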

While the new method was a significant improvement, many issues still exist, and the algorithm is far from perfect. Non-smooth optimisation remains an under-researched area of mathematics and there are many more developments to be made.


Daniel Molent was a recipient of a 2018/19 AMSI Vacation Research Scholarship.
