Deep Learning uses neural networks with many layers (hence "deep") to analyze data and learn new patterns.
Deep learning is inspired by the structure and function of the human brain, and it achieves impressive performance on a wide range of tasks. It is used to build intelligent agents and to solve complex problems. But just as human brains can be unpredictable, so can deep learning: because these networks have so many layers, we often cannot fully explain why they produce a particular result.
Artificial Neural Networks (ANNs)
These are computing systems loosely modeled on the neural networks of animal brains. They consist of interconnected nodes (neurons) organized into layers. Each connection has an associated weight representing its strength. By adjusting these weights based on input data, the network learns to make decisions and predictions.
Layers
A deep learning network consists of three main types of layers:
- Input layer: Receives the initial input.
- Hidden layers: Intermediate layers that perform computations and extract features from the data.
- Output layer: Produces the network's final output.
| Concept | Description | Examples / Notes |
|---|---|---|
| Activation Functions | Introduce non-linearity to the network, deciding if a neuron should activate based on input. | Sigmoid: 0–1 range. ReLU: 0 for negative, input for positive. Tanh: -1 to 1 range. |
| Backpropagation | Algorithm to train networks by calculating gradients of the loss function w.r.t. weights and updating them to minimize loss. | Iterative process allows the network to learn and improve over time. |
| Loss Function | Measures error between predictions and actual targets; training aims to minimize it. | Regression: Mean Squared Error. Classification: Cross-Entropy Loss. |
| Optimizer | Updates network weights using gradients from backpropagation to minimize loss. | Popular ones: SGD, Adam, RMSprop. |
| Hyperparameters | Settings defined before training that control learning, like learning rate, number of layers, and neurons per layer. | Proper tuning is crucial for optimal performance. |
| Importance | These concepts are the building blocks of deep learning. | Understanding them is essential to construct, train, and use models effectively. |
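The activation functions in the table above are easy to see in code. A minimal sketch with NumPy (illustrative only, not part of the original notes):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Returns 0 for negative inputs, the input itself for positive inputs.
    return np.maximum(0.0, x)

def tanh(x):
    # Squashes any real input into the (-1, 1) range.
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # values between 0 and 1
print(relu(x))     # [0. 0. 2.]
print(tanh(x))     # values between -1 and 1
```

Notice that ReLU is the only one that does not saturate for large positive inputs, which is one reason it is a common default in hidden layers.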
Perceptrons
A perceptron is a building block of neural networks: a simplified model of a biological neuron capable of making basic decisions. It is built from:
- Input values (x1, x2, …, xn): Data points fed into the perceptron.
- Weights (w1, w2, …, wn): Each input value has a weight that determines its strength/importance.
- Summation function (∑): The weighted inputs are summed together.
- Bias (b): Allows the perceptron to activate even when all inputs are zero.
- Activation function (f): Introduces non-linearity into the perceptron.
- Output (y): The final output of the perceptron.
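The components above map directly onto a few lines of code. A minimal sketch, with hand-picked (not learned) weights chosen so the perceptron behaves like a logical AND gate:

```python
import numpy as np

def perceptron(x, w, b):
    # Summation: weighted inputs plus the bias.
    z = np.dot(w, x) + b
    # Step activation: fire (1) if the sum is non-negative, else 0.
    return 1 if z >= 0 else 0

# Illustrative weights/bias implementing AND: only (1, 1) pushes the sum
# above zero.
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x, dtype=float), w, b))
```

Training a perceptron means finding `w` and `b` automatically from labeled data instead of setting them by hand as done here.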
Neural Networks
Neural networks use multiple layers instead of a single-layer perceptron. They are also known as multi-layer perceptrons (MLPs).
Neuron
A neuron is the fundamental computational unit of a neural network. It receives inputs, processes them by applying a weighted sum and a bias, then passes the result through an activation function to produce an output. A perceptron uses a step function, whereas neurons in general can use various activation functions.
| Layer | Description | Key Points / Notes |
|---|---|---|
| Input Layer | Entry point for data; each neuron represents a feature. | Passes data to the first hidden layer. |
| Hidden Layers | Intermediate layers that perform computations and extract features. | Each neuron receives inputs from the previous layer, applies a weighted sum plus bias, then an activation function. Multiple hidden layers learn complex, non-linear relationships: early layers learn simple features; deeper layers combine them into complex representations. |
| Output Layer | Produces the network's final result. | Binary classification: 1 neuron. Multi-class classification: 1 neuron per class. |
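The table above describes one forward pass through an MLP. A sketch of that pass with NumPy, using random placeholder weights (a real network would learn these via backpropagation):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum + bias, then ReLU activation.
    h = np.maximum(0.0, W1 @ x + b1)
    # Output layer: one sigmoid neuron, suitable for binary classification.
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))

x = rng.normal(size=3)         # input layer: 3 features
W1 = rng.normal(size=(4, 3))   # 4 hidden neurons
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))   # 1 output neuron
b2 = np.zeros(1)
print(forward(x, W1, b1, W2, b2))  # a probability between 0 and 1
```

For a multi-class problem, the output layer would instead have one neuron per class, typically with a softmax activation.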
Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are specialized neural networks built specifically for processing grid-structured data like images. They're exceptionally good at recognizing spatial patterns and feature hierarchies, making them highly effective for image recognition, object detection, and image segmentation tasks.
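The core operation in a CNN is sliding a small filter (kernel) over the grid of pixels. A hand-rolled sketch of a single 2D convolution with valid padding and stride 1; the kernel here is an illustrative edge detector, not one learned by a real network:

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise multiply the patch by the kernel and sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((5, 5))
image[:, 2:] = 1.0               # right half bright, left half dark
kernel = np.array([[1.0, -1.0],  # responds to horizontal intensity change
                   [1.0, -1.0]])
print(conv2d(image, kernel))     # nonzero only along the vertical edge
```

In a trained CNN, the kernel values are learned weights, and many such filters are stacked in layers so that early filters detect simple patterns (edges) and deeper ones detect composite shapes.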
Recurrent Neural Networks
Recurrent Neural Networks (RNNs) are neural networks designed specifically for sequential data where order matters. Unlike regular neural networks that process data once, RNNs have a special structure that remembers previous inputs, creating a "memory" effect. This memory allows them to understand patterns and dependencies over time, making them ideal for natural language processing, speech recognition, and time series analysis tasks.
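The "memory" described above is just a hidden state vector that is updated at every time step by the same cell. A minimal vanilla-RNN sketch with random placeholder weights (a real RNN would learn them):

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn(sequence, Wx, Wh, b):
    h = np.zeros(Wh.shape[0])  # initial "memory" is empty
    for x in sequence:
        # New state mixes the current input with the previous state,
        # so information from earlier steps carries forward.
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h  # final state summarizes the whole sequence

seq = [rng.normal(size=2) for _ in range(5)]  # 5 time steps, 2 features each
Wx = rng.normal(size=(3, 2))  # input-to-hidden weights (3 hidden units)
Wh = rng.normal(size=(3, 3))  # hidden-to-hidden (recurrent) weights
b = np.zeros(3)
print(rnn(seq, Wx, Wh, b))    # 3-dimensional summary of the sequence
```

Reusing the same `Wx`, `Wh`, and `b` at every step is what lets an RNN handle sequences of any length with a fixed number of parameters.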