Abstract
Characterizing the structure of the manifold of low-lying energy states in neural networks is among
the most fundamental theoretical questions in machine learning.
In recent years, many empirical studies of the landscape of neural networks and constraint satisfaction
problems have shown that low-lying configurations are often found in complex connected structures,
where zero-energy paths between pairs of distant solutions can be constructed.
In this talk, I will discuss the geometrical organization and the connectivity properties of solutions in two
linear neural network models with binary and continuous weights, respectively: the "binary perceptron" and the
"negative perceptron". I will show that wide flat minima arise as complex extensive structures from the
coalescence of minima around "high-margin" (i.e., locally robust) configurations [1]. Moreover, I will introduce
a novel analytical method for characterizing the typical energy barriers between groups of configurations
sampled from the zero-temperature measure of the problem [2]. In the negative perceptron case, we find
that, despite the overall non-convexity of the space of solutions, below a critical fraction of constraints
$\alpha_\star$, the geodesic path between any solution and the robust solutions of the problem, located in
the interior of the solution space, remains strictly zero-energy. The value of $\alpha_\star$ where this simple
connectivity property breaks down is compatible with the point at which the dense core of solutions
fragments into multiple smaller pieces [3].
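To fix notation, here is a minimal sketch of the standard setup assumed for both models (the precise conventions of [1]-[3] may differ). A weight vector $w$ must satisfy $M = \alpha N$ random margin constraints built from patterns $\xi^\mu \in \mathbb{R}^N$:
$$ \frac{w \cdot \xi^{\mu}}{\sqrt{N}} \;\ge\; \kappa, \qquad \mu = 1, \dots, M = \alpha N, $$
with $w_i \in \{-1, +1\}$ and margin $\kappa \ge 0$ in the binary case, and $w \in \mathbb{R}^N$ on the sphere $\|w\|^2 = N$ with $\kappa < 0$ in the negative case. The energy of a configuration counts the violated constraints, so a zero-energy path is one that never leaves the solution space.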
References:
[1] C. Baldassi, C. Lauditi, E. M. Malatesta, G. Perugini, and R. Zecchina, Physical Review Letters 127, 278301
(2021).
[2] B. L. Annesi, C. Lauditi, C. Lucibello, E. M. Malatesta, G. Perugini, F. Pittorino, and L. Saglietti, In
preparation (2023).
[3] C. Baldassi, E. M. Malatesta, G. Perugini, and R. Zecchina, In preparation (2023).