This is Chris Sweet's
web presence, where I share my ideas on bringing the discipline of Statistical Mechanics to the chaotic world of AI. I will also share the challenges I face in software engineering: getting enough information to get something basic working. Usually, once you have a working setup the engineering becomes easy, but basic working examples are either scarce or the authors keep adding their new knowledge until the examples are no longer basic.
I came (relatively) late in life to academia, having spent many years building a series of tech startups. I started at Leicester University (home of genetic fingerprinting and the Attenborough brothers), obtaining a first class honours degree in Mathematics in July 2001 and later my Ph.D., supervised by Prof. Ben Leimkuhler, now at Edinburgh University. My thesis was entitled Hamiltonian Thermostatting Techniques for Molecular Dynamics Simulation; it was submitted on 11th May 2004 at the University of Leicester, with the viva voce held on 7th June 2004.
Jekyll and GitHub Pages are great because I can include Math(s) in the pages!
\[L_{CE}+L_{0}=\sum_{i=1}^{B} \sum_{j=1}^{c} q_{ij} \log \left(\frac{p_{ij}}{H}\right)\]

This is the Cross Entropy loss function for an experiment on using Class Activation Map (CAM) entropies as an estimate of the probability of drawing the image from the probability density function.
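As a concrete illustration, here is a minimal numpy sketch of how a loss of this shape might be assembled, assuming \(H\) is a per-image CAM entropy estimate; the function and variable names are illustrative, not the actual setup from the experiment.

```python
import numpy as np

def cam_entropy_loss(q, p, H):
    """Sketch of L_CE + L_0 = sum_ij q_ij * log(p_ij / H_i) from above.
    q: (B, c) target distributions, p: (B, c) predicted probabilities,
    H: (B,) per-image CAM entropy estimates (an assumed interpretation)."""
    eps = 1e-12                               # guard against log(0)
    return np.sum(q * (np.log(p + eps) - np.log(H[:, None] + eps)))

# Toy usage with random probabilities and a stand-in entropy estimate.
rng = np.random.default_rng(0)
B, c = 4, 10
p = rng.dirichlet(np.ones(c), size=B)         # predicted class probabilities
q = np.eye(c)[rng.integers(0, c, size=B)]     # one-hot targets
H = -np.sum(p * np.log(p), axis=1)            # stand-in per-image entropy
print(cam_entropy_loss(q, p, H))
```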
The University of Leicester was founded in 1921 and was awarded its Royal Charter in 1957. It was ranked number 15 in the UK by The Times guide and was named “University of the Year” at the Times Higher Education annual awards in 2008 (OK, I need to update that!). The techniques used in genetic fingerprinting were invented and developed at Leicester in 1984 by Sir Alec Jeffreys. The university also houses Europe’s biggest academic centre for space research, where space probes have been built, most notably the ill-fated Mars lander Beagle 2, built in collaboration with the Open University.
After completing my Ph.D. in Statistical Mechanics I moved to the University of Notre Dame to work with Jesus Izaguirre on using Normal Mode Analysis (NMA) techniques to accelerate numerical methods. The aim was to approximate the kinetics or thermodynamics of a biomolecule with a reduced model based on a normal mode decomposition of the dynamical space.
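To make the idea concrete, here is a minimal numpy sketch of the core NMA step, assuming a precomputed Hessian and atomic masses are available; it illustrates the decomposition itself, not the actual implementation from that work.

```python
import numpy as np

def lowest_modes(hessian, masses, k):
    """Diagonalize the mass-weighted Hessian and return the k
    lowest-frequency normal modes as a reduced basis."""
    inv_sqrt_m = 1.0 / np.sqrt(np.repeat(masses, 3))   # 3 dof per atom
    Hw = hessian * inv_sqrt_m[:, None] * inv_sqrt_m[None, :]
    eigvals, modes = np.linalg.eigh(Hw)                # ascending eigenvalues
    return eigvals[:k], modes[:, :k]

def reduced_coords(x, x0, modes):
    """Project the displacement from a reference structure x0 onto the
    reduced basis, giving coordinates for the slow (low-frequency) dynamics."""
    return modes.T @ (x - x0)
```

Propagating only the reduced coordinates is what buys the acceleration: the fast, high-frequency modes that force small time steps are left out of the model.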
After working with a student on training Neural Networks (NNs) with small training sets, I became interested in applying some of the ideas of Statistical Mechanics to the problem. The simplest way to do this is via the concept of Energy Based Models (EBMs), which are based on the idea that probability density functions (PDFs) that maximize entropy, for \(x \in \mathbb{R}^\mathrm{D}\), can be expressed as,
\[\begin{align} \label{eqn:probdense} p_{\theta}(x) = \frac{e^{-E_{\theta} (x)}}{Z_{\theta}}, \qquad Z_{\theta}=\int e^{-E_{\theta} (x)}dx, \end{align}\]

directly via a scalar function \(E_{\theta}(x)\) parameterized by \(\theta\), the deep neural network architecture and weights. By defining \(E_{\theta}(x)\) for a NN, \(p_{\theta}(x)\) allows us to learn the underlying data distribution by analyzing the observed data,
\[\begin{align} E_\theta(x) = -\log \sum_{y} e^{f_\theta(x)_{y}}, \end{align}\]

where \(f_\theta(x)_{y}\) is the \(y^{\mathrm{th}}\) logit.
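In code, the energy and the unnormalized density follow directly from the logits. A minimal numpy sketch, assuming `logits` holds \(f_\theta(x)\) for a single input \(x\):

```python
import numpy as np

def energy(logits):
    """E_theta(x) = -log sum_y exp(f_theta(x)_y), via a stable log-sum-exp."""
    m = logits.max()
    return -(m + np.log(np.exp(logits - m).sum()))

def unnormalized_density(logits):
    """exp(-E_theta(x)); Z_theta is intractable in general, so only
    relative densities between inputs are directly available."""
    return np.exp(-energy(logits))

# Example with logits from a hypothetical classifier f_theta.
logits = np.array([2.0, -1.0, 0.5])
print(energy(logits), unnormalized_density(logits))
```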
EBMs have become increasingly popular within computer vision in recent years, commonly being applied to various generative image modeling tasks. (I love Jekyll/MathJax!)
Most of my work concerns trying to learn the underlying PDF during training rather than by running MCMC simulations, which are expensive and tend to be unstable.
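For context, here is a sketch of the unadjusted Langevin step commonly used to run MCMC on EBMs; it shows where the cost and instability come from. The step size `eta` and the toy Gaussian target are illustrative assumptions.

```python
import numpy as np

def langevin_sample(grad_E, x0, steps=1000, eta=0.01, rng=None):
    """Unadjusted Langevin dynamics targeting p(x) proportional to exp(-E(x)):
    x <- x - (eta/2) * grad E(x) + sqrt(eta) * N(0, I).
    Each step costs one gradient of the energy, and a poorly chosen eta
    makes the chain diverge; hence the expense and instability noted above."""
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - 0.5 * eta * grad_E(x) + np.sqrt(eta) * rng.standard_normal(x.shape)
    return x

# Toy usage: sample from a standard Gaussian, E(x) = ||x||^2 / 2, grad E(x) = x.
print(langevin_sample(lambda x: x, np.zeros(2), steps=5000, eta=0.1))
```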
Thanks for reading!