Deep Learning uses neural networks with many layers (hence "deep") to analyze data and learn new patterns.
Deep learning is inspired by the structure and function of the human brain, and it achieves impressive performance on a wide range of tasks. It is used to build intelligent agents and to solve complex problems. But just as human brains can be unpredictable, so can deep learning: because these networks have so many layers, we often cannot fully explain why they produce a particular result.
Artificial Neural Networks (ANNs)
These are computing systems loosely modeled on the neural networks of animal brains. They consist of interconnected nodes (neurons) organized into layers. Each connection has an associated weight representing its strength. By adjusting these weights based on input data, the network learns to make decisions and predictions.
Layers
A deep learning network consists of three main types of layers:
- Input layer: Receives the initial input.
- Hidden layers: Intermediate layers that perform computations and extract features from the data.
- Output layer: Produces the network's final output.
| Concept | Description | Examples / Notes |
|---|---|---|
| Activation Functions | Introduce non-linearity to the network, deciding if a neuron should activate based on input. | Sigmoid: 0–1 range. ReLU: 0 for negative, input for positive. Tanh: -1 to 1 range. |
| Backpropagation | Algorithm to train networks by calculating gradients of the loss function w.r.t. weights and updating them to minimize loss. | Iterative process allows the network to learn and improve over time. |
| Loss Function | Measures error between predictions and actual targets; training aims to minimize it. | Regression: Mean Squared Error. Classification: Cross-Entropy Loss. |
| Optimizer | Updates network weights using gradients from backpropagation to minimize loss. | Popular ones: SGD, Adam, RMSprop. |
| Hyperparameters | Settings defined before training that control learning, like learning rate, number of layers, and neurons per layer. | Proper tuning is crucial for optimal performance. |
| Importance | These concepts are the building blocks of deep learning. | Understanding them is essential to construct, train, and use models effectively. |
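The activation functions in the table above are easy to see in code. A minimal sketch with NumPy (illustrative only, not part of the original notes):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Returns 0 for negative inputs, the input itself for positive inputs.
    return np.maximum(0.0, x)

def tanh(x):
    # Squashes any real input into the (-1, 1) range.
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # values between 0 and 1
print(relu(x))     # [0. 0. 2.]
print(tanh(x))     # values between -1 and 1
```

Notice that ReLU is the only one that does not saturate for large positive inputs, which is one reason it is a common default in hidden layers.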
Perceptrons
A perceptron is a building block of neural networks: a simplified model of a biological neuron capable of making basic decisions. It is built from:
- Input values (x1, x2, …, xn): Data points fed into the perceptron.
- Weights (w1, w2, …, wn): Each input value has a weight that determines its strength/importance.
- Summation function (∑): The weighted inputs are summed together.
- Bias (b): Allows the perceptron to activate even when all inputs are zero.
- Activation function (f): Introduces non-linearity into the perceptron.
- Output (y): The final output of the perceptron.
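The components above map directly onto a few lines of code. A minimal sketch, with hand-picked (not learned) weights chosen so the perceptron behaves like a logical AND gate:

```python
import numpy as np

def perceptron(x, w, b):
    # Summation: weighted inputs plus the bias.
    z = np.dot(w, x) + b
    # Step activation: fire (1) if the sum is non-negative, else 0.
    return 1 if z >= 0 else 0

# Illustrative weights/bias implementing AND: only (1, 1) pushes the sum
# above zero.
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x, dtype=float), w, b))
```

Training a perceptron means finding `w` and `b` automatically from labeled data instead of setting them by hand as done here.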
Neural Networks
Neural networks use multiple layers instead of a single-layer perceptron. They are also known as multi-layer perceptrons (MLPs).
Neuron
A neuron is the fundamental computational unit of a neural network. It receives inputs, processes them by applying a weighted sum and a bias, then passes the result through an activation function to produce an output. A perceptron uses a step function, whereas neurons in general can use various activation functions.
| Layer | Description | Key Points / Notes |
|---|---|---|
| Input Layer | Entry point for data; each neuron represents a feature. | Passes data to the first hidden layer. |
| Hidden Layers | Intermediate layers that perform computations and extract features. | Each neuron receives inputs from the previous layer, applies a weighted sum plus bias, then an activation function. Multiple hidden layers learn complex, non-linear relationships: early layers learn simple features; deeper layers combine them into complex representations. |
| Output Layer | Produces the network's final result. | Binary classification: 1 neuron. Multi-class classification: 1 neuron per class. |
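The table above describes one forward pass through an MLP. A sketch of that pass with NumPy, using random placeholder weights (a real network would learn these via backpropagation):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum + bias, then ReLU activation.
    h = np.maximum(0.0, W1 @ x + b1)
    # Output layer: one sigmoid neuron, suitable for binary classification.
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))

x = rng.normal(size=3)         # input layer: 3 features
W1 = rng.normal(size=(4, 3))   # 4 hidden neurons
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))   # 1 output neuron
b2 = np.zeros(1)
print(forward(x, W1, b1, W2, b2))  # a probability between 0 and 1
```

For a multi-class problem, the output layer would instead have one neuron per class, typically with a softmax activation.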
Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are specialized neural networks built specifically for processing grid-structured data like images. They're exceptionally good at recognizing spatial patterns and feature hierarchies, making them highly effective for image recognition, object detection, and image segmentation tasks.
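The core operation in a CNN is sliding a small filter (kernel) over the grid of pixels. A hand-rolled sketch of a single 2D convolution with valid padding and stride 1; the kernel here is an illustrative edge detector, not one learned by a real network:

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise multiply the patch by the kernel and sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((5, 5))
image[:, 2:] = 1.0               # right half bright, left half dark
kernel = np.array([[1.0, -1.0],  # responds to horizontal intensity change
                   [1.0, -1.0]])
print(conv2d(image, kernel))     # nonzero only along the vertical edge
```

In a trained CNN, the kernel values are learned weights, and many such filters are stacked in layers so that early filters detect simple patterns (edges) and deeper ones detect composite shapes.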
Recurrent Neural Networks
Recurrent Neural Networks (RNNs) are neural networks designed specifically for sequential data where order matters. Unlike regular neural networks that process data once, RNNs have a special structure that remembers previous inputs, creating a "memory" effect. This memory allows them to understand patterns and dependencies over time, making them ideal for natural language processing, speech recognition, and time series analysis tasks.
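The "memory" described above is just a hidden state vector that is updated at every time step by the same cell. A minimal vanilla-RNN sketch with random placeholder weights (a real RNN would learn them):

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn(sequence, Wx, Wh, b):
    h = np.zeros(Wh.shape[0])  # initial "memory" is empty
    for x in sequence:
        # New state mixes the current input with the previous state,
        # so information from earlier steps carries forward.
        h = np.tanh(Wx @ x + Wh @ h + b)
    return h  # final state summarizes the whole sequence

seq = [rng.normal(size=2) for _ in range(5)]  # 5 time steps, 2 features each
Wx = rng.normal(size=(3, 2))  # input-to-hidden weights (3 hidden units)
Wh = rng.normal(size=(3, 3))  # hidden-to-hidden (recurrent) weights
b = np.zeros(3)
print(rnn(seq, Wx, Wh, b))    # 3-dimensional summary of the sequence
```

Reusing the same `Wx`, `Wh`, and `b` at every step is what lets an RNN handle sequences of any length with a fixed number of parameters.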