For one of my earliest projects, I tried to use an object-oriented language to model a neural network. Below is a simple pseudocode example.

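class Neuron {
  List<Neuron> children;      // neurons this one feeds into
  double weight;              // multiplier applied to the incoming value
  double activation;          // most recent output
  double Fire(double input);  // compute the activation and pass it to the children
}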

This was a rather naive approach.  One of the biggest problems was performance.

To activate each node, the system would have to do the following:

  • Fetch a node from our array
  • Read its weight property
  • Multiply the weight by the input
  • Iterate through the children property and propagate the result to each child

That seems pretty straightforward until you think about all the memory lookups needed to navigate a hierarchy of artificial neurons, plus the extra overhead of managing each list of children. Also, walking a hierarchy like this child by child means we follow an entire branch to its end before we can finish a single layer.
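To make the traversal problem concrete, here is a minimal Python sketch of the object-per-neuron model. It assumes the simplest possible rule (activation is just weight times input, with no summing or activation function), and the `name` field and `visited` list are hypothetical additions that exist only to record the order in which neurons fire.

```python
class Neuron:
    """Object-per-neuron model, mirroring the pseudocode class above."""

    def __init__(self, name, weight):
        self.name = name          # hypothetical label, used only to trace visit order
        self.weight = weight
        self.activation = 0.0
        self.children = []

    def fire(self, value, visited):
        # Read the weight, multiply by the input, then recurse into each child.
        self.activation = self.weight * value
        visited.append(self.name)
        for child in self.children:
            child.fire(self.activation, visited)
        return self.activation


# A tiny hierarchy: root feeds a and b, and a feeds c.
root = Neuron("root", 0.5)
a, b = Neuron("a", 2.0), Neuron("b", 3.0)
c = Neuron("c", 4.0)
root.children = [a, b]
a.children = [c]

visited = []
root.fire(1.0, visited)
print(visited)  # ['root', 'a', 'c', 'b']
```

Notice that `c`, which sits one layer deeper, fires before `b` does. Every step also chases a reference to a separate object, which is where the extra lookups come from.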

Fortunately, if we take our heads out of the magical world of object-oriented design, we can find a much more efficient solution: matrices.

 

Storing our data in a matrix lets us do the same amount of work with far less bookkeeping.

We still have to do a single lookup to get our array of “neurons”, but after that, a simple for loop can calculate all of our outputs without any child-node lookups. On top of all that, there are many techniques for speeding up array processing. Many of the matrix operations used in machine learning can also be executed in parallel!
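As a rough illustration, the loop might look like the sketch below. It assumes one row of weights per neuron and one weight per input, so each output is a weighted sum; `layer_outputs` and the sample numbers are made up for the example.

```python
def layer_outputs(weights, inputs):
    """Compute one output per row of `weights` (one row per neuron)."""
    outputs = []
    for row in weights:
        total = 0.0
        for w, x in zip(row, inputs):
            total += w * x
        outputs.append(total)
    return outputs


# Three "neurons", each with two input weights.
weights = [
    [0.5, -1.0],
    [2.0, 0.25],
    [0.0, 1.0],
]
inputs = [1.0, 2.0]

print(layer_outputs(weights, inputs))  # [-1.5, 2.5, 2.0]
```

No row depends on any other row, which is what makes this kind of work easy to split across cores. This is also just a matrix-vector product, so an array library such as NumPy can do the whole thing in one call.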

The downside to all this is that we have to manage all of our resources manually. Sometimes, though, this is exactly what we want. We, as developers, can select which tradeoffs we want to make. We can choose to make our algorithms faster at the expense of flexibility, or more flexible at the expense of speed.

So, rather than keeping all of our artificial neuron information inside a nice neat object-oriented class, we’re going to change our perspective and put all of the weights for all of the neurons in a specific layer into a single array. Then we will put each of the other parameters we use into separate arrays. We keep track of each individual “neuron” by storing its parameters at the same offset in each of the different arrays.
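Here is a small sketch of that layout, reusing the two fields from the pseudocode class (weight and activation) and the same simplified weight-times-input rule. The function names `fire_layer` and `neuron` are just illustrative.

```python
# Parallel arrays for one layer: index i refers to the same neuron in each list.
weights = [0.5, 2.0, -1.0]
activations = [0.0, 0.0, 0.0]


def fire_layer(value):
    """Update every neuron's activation in one pass over the arrays."""
    for i in range(len(weights)):
        activations[i] = weights[i] * value


def neuron(i):
    """Gather the parameters of neuron i from the separate arrays."""
    return {"weight": weights[i], "activation": activations[i]}


fire_layer(2.0)
print(activations)  # [1.0, 4.0, -2.0]
print(neuron(1))    # {'weight': 2.0, 'activation': 4.0}
```

There is no Neuron object anywhere. A “neuron” is just an index, and it is up to us to keep the arrays the same length and in the same order.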

However, there is a catch here. The object-oriented model above allowed for a more “organic” network to be created. We could add connections between neurons on the fly and remove connections if desired.

The array-based method, however, produces a more “engineered” design.  Neurons are explicitly mapped to the neurons in the next layer.
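One way to picture the explicit mapping is a weight matrix whose shape is fixed up front, as in the sketch below. Treating a weight of 0.0 as “no connection” is an assumption made for this example, not something the array layout requires, and `connections` is a hypothetical helper.

```python
# weights[j][i] is the connection from neuron i in this layer
# to neuron j in the next layer; the shape is fixed up front.
weights = [
    [0.5, 1.0, 0.0],   # 0.0 stands in for "no connection" (an assumption)
    [0.0, 2.0, 1.5],
]


def connections(weights):
    """List (source, target) pairs whose weight is non-zero."""
    return [
        (i, j)
        for j, row in enumerate(weights)
        for i, w in enumerate(row)
        if w != 0.0
    ]


print(connections(weights))  # [(0, 0), (1, 0), (1, 1), (2, 1)]
```

Every possible connection between the two layers already has a slot in the matrix. We can change the value in a slot, but adding a brand new neuron means resizing the arrays rather than appending a child to a list.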

Next week I’ll go into some more detail about some of the math involved.  Don’t run screaming, it’s really not as bad as you might think.

I’ll also begin exploring the application of neural networks for image recognition.

 
