Artificial Neural Networks Exploration (Part 3): Fundamentals of Neural Networks

In previous posts, we discussed Regression and Support Vector Machines (SVM) as significant methods in Machine Learning, and touched upon SVM's resemblance to Regression Analysis. Notably, an offshoot of Logistic Regression exists, known as the Artificial Neural Network (ANN). This post looks at the fundamentals of ANNs and how they compare to Logistic Regression.

In the realm of machine learning, two popular models stand out for their unique approaches to binary classification problems – Logistic Regression and Artificial Neural Networks (ANNs).

Logistic Regression, a simple linear model, estimates the probability of a binary outcome by learning a set of weights applied directly to input features. It does this using a logistic (sigmoid) function to map linear combinations to probabilities. Structurally, it can be seen as a single-layer neural network without hidden layers, making it faster and requiring less data to perform well on problems with linear separability.
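As a concrete illustration, here is a minimal sketch of this mapping in Python with NumPy; the weights, bias, and data point are assumed example values, not learned ones.

    import numpy as np

    def sigmoid(z):
        # Logistic function: maps any real number into the interval (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def predict_proba(X, w, b):
        # Linear combination of the input features, passed through the sigmoid
        return sigmoid(X @ w + b)

    # Illustrative two-feature data point with assumed weights and bias
    X = np.array([[0.5, -1.2]])
    w = np.array([0.8, -0.3])
    b = 0.1
    print(predict_proba(X, w, b))   # probability of the positive class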

On the other hand, ANNs are multilayer, powerful models capable of capturing complex data patterns. They consist of multiple layers of interconnected nodes (neurons), including one or more hidden layers. ANNs learn patterns by adjusting many parameters (weights and biases) through iterative algorithms such as backpropagation combined with gradient descent. Due to their depth and complexity, they require more computational resources and longer training times.
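The sketch below, again using NumPy, shows how an input flows forward through one hidden layer to an output neuron; the layer sizes and randomly initialized weights are purely illustrative.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)

    # Illustrative architecture: 2 inputs -> 4 hidden neurons -> 1 output neuron
    W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

    def forward(x):
        # Each layer applies its weights and biases, then a non-linear activation
        h = sigmoid(W1 @ x + b1)        # hidden-layer activations
        y_hat = sigmoid(W2 @ h + b2)    # output probability
        return y_hat

    print(forward(np.array([0.5, -1.2])))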

The output layer of an ANN has a single neuron that outputs the probability of the input point belonging to the positive class. The backpropagation step begins by computing the error at the last layer of the network and uses the chain rule of partial derivatives to compute the change in the weights at each layer. The computed gradients are then used by an optimizer (e.g., Gradient Descent) to update the weights. Learning takes place across the layers, with the weights W^l at each layer adjusted so that the network's output resembles the true class label when the input is plugged into the ANN.
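To make the chain-rule step concrete, here is a compact sketch of a single backpropagation pass and Gradient Descent update for a small network with one hidden layer, assuming a sigmoid output and a cross-entropy error so that the output-layer error reduces to ŷ − y; the sizes, learning rate, and training example are illustrative.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    # 2 inputs -> 4 hidden neurons -> 1 output neuron (assumed sizes)
    W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
    lr = 0.1                     # learning rate (assumed)

    x = np.array([0.5, -1.2])    # one training example (assumed)
    y = 1.0                      # its true class label

    # Forward pass
    h = sigmoid(W1 @ x + b1)
    y_hat = sigmoid(W2 @ h + b2)

    # Backward pass: error at the last layer, then the chain rule layer by layer
    delta_out = y_hat - y                         # output-layer error (ŷ − y)
    dW2 = np.outer(delta_out, h)
    db2 = delta_out
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated to the hidden layer
    dW1 = np.outer(delta_hid, x)
    db1 = delta_hid

    # Gradient Descent update of the weights and biases
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1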

For non-linearly separable data, a larger ANN with multiple hidden layers is constructed. For instance, a network with two hidden layers, each containing five neurons, is trained on the training set to obtain a set of optimized weights for this new network.
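A minimal way to build such a network is sketched below with scikit-learn's MLPClassifier, assuming make_moons merely as a stand-in for a non-linearly separable training set; the hyperparameter choices are illustrative.

    from sklearn.datasets import make_moons
    from sklearn.neural_network import MLPClassifier

    # Stand-in for a non-linearly separable training set (assumed data)
    X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

    # Two hidden layers with five neurons each, as described above
    clf = MLPClassifier(hidden_layer_sizes=(5, 5), activation='logistic',
                        max_iter=2000, random_state=0)
    clf.fit(X, y)
    print(clf.predict_proba(X[:3]))   # class probabilities for the first few points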

The objective is to find the best values for the weights such that, when plugged into the network, it outputs the true class label for the input. The weights are arranged in the form of matrices, with the dimension of a weight matrix W^l determined by the number of neurons in layer l and the number of neurons in the previous layer (l − 1).
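The following snippet illustrates those shapes for an assumed architecture with 2 input features, two hidden layers of 5 neurons each, and 1 output neuron.

    import numpy as np

    layer_sizes = [2, 5, 5, 1]   # input, two hidden layers, output (illustrative)

    # The weight matrix W^l has shape (neurons in layer l, neurons in layer l-1)
    weights = [np.zeros((n_out, n_in))
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

    for l, W in enumerate(weights, start=1):
        print(f"W^{l} shape: {W.shape}")
    # W^1 shape: (5, 2), W^2 shape: (5, 5), W^3 shape: (1, 5)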

The neural network is used to classify a set of data points belonging to two classes (0/1). The objective function is minimized by driving the error between ŷ, the class probability predicted by the ANN, and y, the true class label, towards zero. A regularization term J_reg is added to the objective function to help the model avoid overfitting.
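A sketch of one possible form of this regularized objective is shown below, assuming binary cross-entropy as the error term and an L2 penalty as J_reg; the λ value and the example numbers are assumptions.

    import numpy as np

    def objective(y_hat, y, weights, lam=0.01):
        # Error term: binary cross-entropy between predicted probabilities and true labels
        eps = 1e-12
        bce = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
        # Regularization term J_reg: L2 penalty on all weight matrices (assumed form)
        j_reg = lam * sum(np.sum(W ** 2) for W in weights)
        return bce + j_reg

    y = np.array([1.0, 0.0, 1.0])          # assumed true labels
    y_hat = np.array([0.9, 0.2, 0.7])      # assumed predicted probabilities
    weights = [np.ones((5, 2)), np.ones((1, 5))]
    print(objective(y_hat, y, weights))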

Interpretability is one area where Logistic Regression outshines ANNs. The weights in Logistic Regression have direct, intuitive meanings reflecting feature influence, whereas ANN weights are less interpretable and often considered a “black box.”

In summary, Logistic Regression is a linear, simpler model with direct interpretability and fast training, suitable for linearly separable problems. ANNs, on the other hand, are multilayer, powerful models capable of capturing complex data patterns at the cost of interpretability and computational efficiency.
