from keras.models import Sequential
model = Sequential()
History and evolution
import keras
print(keras.__version__)
Installing Keras & dependencies
pip install tensorflow keras
Backend support (TensorFlow, Theano, CNTK)
from keras import backend as K
print(K.backend())
Keras vs other frameworks
# Keras uses Sequential or Functional APIs for ease of use
model = Sequential()
Keras ecosystem overview
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Hello World with Keras
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(10, input_shape=(5,), activation='relu'))
Project structure in Keras
/project
/data
/models
train.py
utils.py
Community & documentation
# Visit https://keras.io for API references and tutorials
Use cases of Keras
# Keras is used in domains like healthcare, finance, and robotics
# Each artificial neuron: output = activation(weighted_sum + bias)
Perceptron model
output = 1 if (w1*x1 + w2*x2 + b) > 0 else 0
Feedforward neural networks
model = Sequential([
Dense(10, input_shape=(4,), activation='relu'),
Dense(1, activation='sigmoid')
])
Activation functions
from keras.layers import Activation
model.add(Dense(64))
model.add(Activation('relu'))
Loss functions
model.compile(loss='binary_crossentropy', optimizer='adam')
Optimizers overview
model.compile(optimizer='adam')
Forward and backward propagation
# Automatically handled in model.fit()
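A minimal sketch of what fit() does for one batch, using tf.GradientTape (model, x_batch, and y_batch are assumed to already exist):
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam()
with tf.GradientTape() as tape:
    y_pred = model(x_batch, training=True)                       # forward pass
    loss = tf.keras.losses.binary_crossentropy(y_batch, y_pred)  # compute loss
grads = tape.gradient(loss, model.trainable_weights)             # backward pass
optimizer.apply_gradients(zip(grads, model.trainable_weights))   # update weights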
Epochs and batches
model.fit(X, y, epochs=10, batch_size=32)
Overfitting and underfitting
from keras.layers import Dropout
model.add(Dropout(0.5))
Deep learning in real life
# Example: text sentiment classification or image detection
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(100,)))
Adding dense layers
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))
Configuring activation functions
model.add(Dense(1, activation='sigmoid'))
Compiling the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Model summary
model.summary()
Training the model
model.fit(X_train, y_train, epochs=10, batch_size=32)
Evaluating model accuracy
loss, acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {acc}")
Making predictions
predictions = model.predict(X_new)
Saving and loading models
model.save('my_model.h5')
loaded_model = keras.models.load_model('my_model.h5')
Use case: Basic classifier
model = Sequential([
Dense(128, activation='relu', input_shape=(784,)),
Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
from tensorflow.keras.layers import Normalization
normalizer = Normalization()
normalizer.adapt(data) # data is a NumPy array or tf.data dataset
Splitting train/test/validation
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
Encoding labels
from tensorflow.keras.utils import to_categorical
y_encoded = to_categorical(y)
Data generators
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rescale=1./255)
train_gen = datagen.flow_from_directory('train/', target_size=(64, 64))
Image preprocessing
img = tf.keras.utils.load_img("cat.jpg", target_size=(64,64))
img_array = tf.keras.utils.img_to_array(img)/255.0
Text preprocessing
from tensorflow.keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
Sequence padding
from tensorflow.keras.preprocessing.sequence import pad_sequences
padded = pad_sequences(sequences, padding='post', maxlen=100)
Handling missing values
import numpy as np
X = np.nan_to_num(X) # replaces NaN with 0
Batch processing
model.fit(X_train, y_train, batch_size=32, epochs=10)
Data augmentation
datagen = ImageDataGenerator(rotation_range=20, horizontal_flip=True)
augmented = datagen.flow(X_train, y_train)
from tensorflow.keras.activations import sigmoid
output = sigmoid(x)
Tanh
from tensorflow.keras.activations import tanh
output = tanh(x)
ReLU
from tensorflow.keras.layers import Activation
model.add(Dense(64))
model.add(Activation('relu'))
Leaky ReLU
from tensorflow.keras.layers import LeakyReLU
model.add(LeakyReLU(alpha=0.01))
ELU
from tensorflow.keras.layers import ELU
model.add(ELU(alpha=1.0))
Softmax
model.add(Dense(3, activation='softmax'))
Swish
from tensorflow.keras.activations import swish
output = swish(x)
Choosing the right activation
# No code: selection depends on the task
Custom activations
from tensorflow.keras.layers import Lambda
model.add(Lambda(lambda x: x**2))
Visualizing activations
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-10, 10, 100)
plt.plot(x, np.maximum(0, x)) # ReLU
plt.show()
model.compile(optimizer='adam', loss='mean_squared_error')
Binary Crossentropy
model.compile(optimizer='adam', loss='binary_crossentropy')
Categorical Crossentropy
model.compile(optimizer='adam', loss='categorical_crossentropy')
Sparse Categorical Crossentropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
Hinge loss
model.compile(optimizer='adam', loss='hinge')
Kullback-Leibler divergence
model.compile(optimizer='adam', loss='kullback_leibler_divergence')
Custom loss functions
def custom_loss(y_true, y_pred):
return tf.reduce_mean(tf.square(y_true - y_pred) + 0.1)
model.compile(optimizer='adam', loss=custom_loss)
Choosing loss by task
# Decision logic based on task type
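A rough rule-of-thumb mapping (a sketch, not exhaustive):
# Binary classification        -> 'binary_crossentropy'
# Multi-class, one-hot labels  -> 'categorical_crossentropy'
# Multi-class, integer labels  -> 'sparse_categorical_crossentropy'
# Regression                   -> 'mean_squared_error' or 'mean_absolute_error'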
Regularization penalties
from tensorflow.keras import regularizers
model.add(Dense(64, kernel_regularizer=regularizers.l2(0.01)))
Loss function behavior
import matplotlib.pyplot as plt
x = np.linspace(-1, 1, 100)
plt.plot(x, x**2) # MSE shape
plt.show()
# Basic Gradient Descent: theta = theta - learning_rate * gradient
Stochastic Gradient Descent (SGD)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
Momentum
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
Nesterov Accelerated Gradient
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
Adam
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
RMSprop
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
Adagrad
optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.01)
Nadam
optimizer = tf.keras.optimizers.Nadam(learning_rate=0.002)
Choosing the right optimizer
# Example: try different optimizers and compare performance
Learning rate scheduling
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
initial_learning_rate=0.01, decay_steps=10000, decay_rate=0.9)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)
accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision & Recall
precision = TP / (TP + FP)
recall = TP / (TP + FN)
F1 Score
f1_score = 2 * (precision * recall) / (precision + recall)
Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true, y_pred)
ROC-AUC
from sklearn.metrics import roc_auc_score
auc = roc_auc_score(y_true, y_scores)
Mean Absolute Error
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_true, y_pred)
R-squared
from sklearn.metrics import r2_score
r2 = r2_score(y_true, y_pred)
Custom metrics
def custom_metric(y_true, y_pred):
return tf.reduce_mean(tf.abs(y_true - y_pred))
Visualizing metrics
import matplotlib.pyplot as plt
plt.plot(history.history['accuracy'])
Metric callbacks
EarlyStopping(monitor='val_accuracy', patience=3)
model.save('model.h5') # HDF5 format
model.save('my_model/') # TensorFlow SavedModel format
Loading saved models
model = tf.keras.models.load_model('model.h5')
Model checkpoints
ModelCheckpoint(filepath='best_model.h5', save_best_only=True)
TensorFlow Lite for mobile
converter = tf.lite.TFLiteConverter.from_saved_model('my_model')
tflite_model = converter.convert()
TensorFlow.js for web
tensorflowjs_converter --input_format=tf_saved_model my_model/ web_model/
Exporting to ONNX
# Use tf2onnx:
python -m tf2onnx.convert --saved-model my_model --output model.onnx
Versioning models
# Save versioned directories: model/v1/, model/v2/
Using Pickle
import pickle
pickle.dump(model, open('model.pkl', 'wb'))
Inference API
@app.route('/predict', methods=['POST'])
def predict():
data = request.get_json()
prediction = model.predict(data)
return jsonify(prediction.tolist())
Deployment examples
# Docker example
FROM tensorflow/tensorflow:latest
COPY model/ /app/model/
from tensorflow.keras.callbacks import Callback
# Example of using callbacks in model training
model.fit(X, y, epochs=10, callbacks=[callback1, callback2])
EarlyStopping
from tensorflow.keras.callbacks import EarlyStopping
early_stop = EarlyStopping(monitor='val_loss', patience=3)
model.fit(X, y, validation_data=(X_val, y_val), callbacks=[early_stop])
ModelCheckpoint
from tensorflow.keras.callbacks import ModelCheckpoint
checkpoint = ModelCheckpoint('model.h5', save_best_only=True)
model.fit(X, y, validation_split=0.2, callbacks=[checkpoint])
LearningRateScheduler
from tensorflow.keras.callbacks import LearningRateScheduler
def lr_schedule(epoch): return 0.01 * (0.1 ** (epoch // 10))
lr_sched = LearningRateScheduler(lr_schedule)
model.fit(X, y, callbacks=[lr_sched])
ReduceLROnPlateau
from tensorflow.keras.callbacks import ReduceLROnPlateau
reduce_lr = ReduceLROnPlateau(monitor='val_loss', patience=2)
model.fit(X, y, validation_data=(X_val, y_val), callbacks=[reduce_lr])
TensorBoard callback
from tensorflow.keras.callbacks import TensorBoard
tensorboard = TensorBoard(log_dir='./logs')
model.fit(X, y, callbacks=[tensorboard])
CSVLogger
from tensorflow.keras.callbacks import CSVLogger
csv_logger = CSVLogger('training.log')
model.fit(X, y, callbacks=[csv_logger])
Custom callbacks
class MyCallback(Callback):
def on_epoch_end(self, epoch, logs=None):
print(f"Epoch {epoch} ended. Loss: {logs['loss']}")
model.fit(X, y, callbacks=[MyCallback()])
Callback chaining
callbacks = [early_stop, checkpoint, tensorboard]
model.fit(X, y, callbacks=callbacks)
Monitoring and logging
# All logs can be accessed via `logs` dictionary in custom callbacks
def on_epoch_end(self, epoch, logs=None): print(logs['val_accuracy'])
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
Convolution layers
model.add(Conv2D(64, (3, 3), activation='relu'))
Pooling layers
from tensorflow.keras.layers import MaxPooling2D
model.add(MaxPooling2D(pool_size=(2, 2)))
Flattening
from tensorflow.keras.layers import Flatten
model.add(Flatten())
Feature maps
# Feature maps are automatically generated by Conv2D
# Inspect via intermediate model or visualize with matplotlib
Dropout in CNNs
from tensorflow.keras.layers import Dropout
model.add(Dropout(0.5))
Image classification with CNNs
model.add(Dense(10, activation='softmax')) # for 10 classes
Transfer learning basics
from tensorflow.keras.applications import VGG16
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(64,64,3))
Regularization in CNNs
from tensorflow.keras.regularizers import l2
Conv2D(32, (3,3), activation='relu', kernel_regularizer=l2(0.01))
Real-world CNN use case
# Example: MRI tumor detection using CNN
# Architecture same, but trained on MRI image dataset
from tensorflow.keras.preprocessing.image import load_img
img = load_img('cat.jpg', target_size=(64, 64))
Rescaling pixels
from tensorflow.keras.preprocessing.image import img_to_array
img_array = img_to_array(img) / 255.0
ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rescale=1./255)
train_gen = datagen.flow_from_directory('train/', target_size=(64,64), class_mode='categorical')
Data augmentation
aug_datagen = ImageDataGenerator(rotation_range=20, zoom_range=0.2, horizontal_flip=True)
Building CNN for CIFAR-10
model = Sequential([
Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
MaxPooling2D(2,2),
Flatten(),
Dense(10, activation='softmax')
])
Fine-tuning pre-trained models
base_model.trainable = True # Unfreeze layers for fine-tuning
model.compile(optimizer='adam', loss='categorical_crossentropy')
Multi-class classification
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Batch normalization
from tensorflow.keras.layers import BatchNormalization
model.add(BatchNormalization())
Visualizing filters
# Use intermediate layer model to get filter outputs
from tensorflow.keras.models import Model
intermediate_model = Model(inputs=model.input, outputs=model.layers[1].output)
Model evaluation
model.evaluate(X_test, y_test)
from tensorflow.keras.applications import VGG16
base_model = VGG16(weights='imagenet', include_top=False)
VGG16 usage
model = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
ResNet integration
from tensorflow.keras.applications import ResNet50
base_model = ResNet50(weights='imagenet', include_top=False)
InceptionNet and Xception
from tensorflow.keras.applications import InceptionV3
base_model = InceptionV3(weights='imagenet', include_top=False)
Freezing and unfreezing layers
for layer in base_model.layers:
layer.trainable = False
Feature extraction
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(10, activation='softmax')(x)
Fine-tuning strategy
for layer in base_model.layers[-4:]:
layer.trainable = True
Using pre-trained weights
model = VGG16(weights="imagenet", include_top=False)
Transfer learning on custom data
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_data, epochs=10)
Model export
model.save("custom_transfer_model.h5")
from tensorflow.keras.layers import SimpleRNN
model.add(SimpleRNN(64, input_shape=(timesteps, features)))
Use cases
# Text classification, stock prediction, etc.
RNN layers in Keras
from tensorflow.keras.layers import SimpleRNN
rnn_layer = SimpleRNN(32)
Sequence modeling
model = Sequential()
model.add(SimpleRNN(64, return_sequences=True))
Vanishing gradient problem
# LSTM/GRU address this with gating mechanisms
Bidirectional RNNs
from tensorflow.keras.layers import Bidirectional
model.add(Bidirectional(SimpleRNN(64)))
Masking and padding
from tensorflow.keras.layers import Masking
model.add(Masking(mask_value=0.0, input_shape=(timesteps, features)))
Simple RNN model
model = Sequential([
SimpleRNN(64),
Dense(1, activation='sigmoid')
])
Forecasting with RNNs
# Input shape: (batch_size, timesteps, features)
model.fit(X_train, y_train, epochs=20)
Real-world applications
# Google Translate and Alexa use RNN-based models
from tensorflow.keras.layers import LSTM
model.add(LSTM(64))
GRU vs LSTM
from tensorflow.keras.layers import GRU
model.add(GRU(64))
Memory cells
# Automatically handled in LSTM layer, no manual memory cell needed
Time steps
model.add(LSTM(64, input_shape=(10, 8))) # 10 time steps, 8 features
Text classification
model = Sequential([
Embedding(input_dim=5000, output_dim=128),
LSTM(64),
Dense(1, activation='sigmoid')
])
Time series prediction
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=10)
Stacked LSTM
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32))
Dropout in LSTM
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2))
Bidirectional LSTM
from tensorflow.keras.layers import Bidirectional
model.add(Bidirectional(LSTM(64)))
Combining LSTM with CNN
model = Sequential([
Conv1D(64, 3, activation='relu'),
LSTM(64),
Dense(1, activation='sigmoid')
])
from tensorflow.keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(["This is a sentence"])
print(tokenizer.word_index)
Text vectorization
from tensorflow.keras.layers import TextVectorization
vectorizer = TextVectorization(max_tokens=1000)
vectorizer.adapt(["This is a sentence"])
print(vectorizer(["This is a sentence"]))
Word embeddings
from tensorflow.keras.layers import Embedding
embedding_layer = Embedding(input_dim=1000, output_dim=64)
Word2Vec integration
from gensim.models import Word2Vec
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1)
model.wv["word"]
Using pre-trained embeddings
embedding_matrix = ... # Load GloVe vectors into matrix
model.add(Embedding(input_dim=vocab_size, output_dim=100, weights=[embedding_matrix], trainable=False))
LSTM for text generation
from tensorflow.keras.layers import LSTM
model.add(LSTM(128))
Sentiment analysis
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5)
Attention mechanisms
# See Transformer section for self-attention code
Transformer basics
from transformers import TFAutoModel
transformer = TFAutoModel.from_pretrained("bert-base-uncased")
Text summarization
from transformers import pipeline
summarizer = pipeline("summarization")
summary = summarizer("Long article text here...")
# Example: Similar words have similar vectors
# embedding['king'] - embedding['man'] + embedding['woman'] ≈ embedding['queen']
One-hot vs embeddings
# One-hot example:
# [0, 0, 1, 0] → word3 (but no meaning or similarity info)
Keras Embedding layer
model.add(Embedding(input_dim=10000, output_dim=64, input_length=100))
Pre-trained GloVe usage
embeddings_index = {}
with open("glove.6B.100d.txt") as f:
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:], dtype='float32')
embeddings_index[word] = coefs
Fine-tuning embeddings
model.add(Embedding(input_dim=10000, output_dim=100, weights=[embedding_matrix], trainable=True))
Visualizing embeddings
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
tsne = TSNE(n_components=2)
reduced = tsne.fit_transform(embedding_matrix)
plt.scatter(reduced[:, 0], reduced[:, 1])
Embedding matrix
embedding_matrix = np.zeros((vocab_size, 100))
embedding_matrix[word_index['hello']] = embeddings_index['hello']
Handling unknown tokens
tokenizer = Tokenizer(oov_token="<OOV>")  # out-of-vocabulary placeholder token
Padding sequences
from tensorflow.keras.preprocessing.sequence import pad_sequences
padded = pad_sequences(sequences, padding='post', maxlen=100)
Use case: sentiment model
model = Sequential([
Embedding(vocab_size, 100, input_length=100),
LSTM(64),
Dense(1, activation='sigmoid')
])
# Attention formula:
# Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) * V
Self-attention
# Used in Transformer encoders to weigh all tokens against each other
Encoder-Decoder structure
# Example in Transformer models: BERT (encoder), GPT (decoder)
Scaled Dot-Product Attention
def scaled_dot_attention(Q, K, V):
d_k = tf.cast(tf.shape(K)[-1], tf.float32)
scores = tf.matmul(Q, K, transpose_b=True) / tf.math.sqrt(d_k)
weights = tf.nn.softmax(scores)
return tf.matmul(weights, V)
Positional encoding
# Sinusoidal or learned positional encodings are added to input embeddings
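A minimal sinusoidal positional-encoding sketch (seq_len and d_model values are illustrative):
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, np.newaxis]              # (seq_len, 1)
    i = np.arange(d_model)[np.newaxis, :]                # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    angles[:, 0::2] = np.sin(angles[:, 0::2])            # sine on even indices
    angles[:, 1::2] = np.cos(angles[:, 1::2])            # cosine on odd indices
    return angles

pe = positional_encoding(100, 64)  # added element-wise to the input embeddings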
BERT with Keras
from transformers import TFBertModel
bert = TFBertModel.from_pretrained("bert-base-uncased")
GPT architecture
from transformers import GPT2LMHeadModel
model = GPT2LMHeadModel.from_pretrained("gpt2")
Hugging Face Transformers
from transformers import pipeline
qa = pipeline("question-answering")
qa({"question": "Who is the CEO of OpenAI?", "context": "Sam Altman is the CEO."})
Transformer training
from transformers import Trainer, TrainingArguments
args = TrainingArguments(output_dir="./model", per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args, train_dataset=train_ds)
Applications of transformers
# Examples: BERT for classification, GPT for text gen, T5 for summarization
# A generative model tries to model P(data)
# E.g., generate images like handwritten digits from MNIST
Autoencoders
from keras.models import Model
from keras.layers import Input, Dense
input_img = Input(shape=(784,))
encoded = Dense(64, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(input_img, decoded)
Variational Autoencoders (VAE)
# VAE requires defining custom loss with KL divergence + reconstruction loss
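A hedged sketch of that combined loss (z_mean, z_log_var, and the reconstruction come from your own encoder/decoder):
import tensorflow as tf

def vae_loss(x, x_reconstructed, z_mean, z_log_var):
    reconstruction = tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(x, x_reconstructed))
    kl = -0.5 * tf.reduce_mean(
        1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
    return reconstruction + kl   # total loss = reconstruction + KL divergence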
GANs
# GAN: train generator to fool discriminator, discriminator to detect fakes
Building simple GAN
# generator = make_generator()
# discriminator = make_discriminator()
# Train them in alternating loops
Conditional GAN
# Input = [noise + label] for generator
Text generation
from keras.preprocessing.sequence import pad_sequences
# Train LSTM on character or word sequences to predict next word
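A minimal next-word model sketch (vocab_size and seq_len are placeholders for your corpus statistics):
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

vocab_size, seq_len = 5000, 20          # illustrative values
text_model = Sequential([
    Embedding(vocab_size, 64, input_length=seq_len),
    LSTM(128),
    Dense(vocab_size, activation='softmax'),   # probability of the next word
])
text_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')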
Image generation
# StyleGAN, DCGAN are popular for this task
DeepFakes
# Typically use encoder-decoder to encode face features and reconstruct on another face
Ethical considerations
# Important: use responsibly and be aware of consequences
from keras.models import Model
from keras.layers import Input, Dense
inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs, outputs)
Inputs and outputs
print(model.inputs)
print(model.outputs)
Multi-input models
input1 = Input(shape=(32,))
input2 = Input(shape=(64,))
merged = concatenate([input1, input2])
Multi-output models
output1 = Dense(1, name='output1')(merged)
output2 = Dense(1, name='output2')(merged)
model = Model(inputs=[input1, input2], outputs=[output1, output2])
Shared layers
shared_dense = Dense(64)
output1 = shared_dense(input1)
output2 = shared_dense(input2)
Residual connections
from keras.layers import Add
residual = Add()([input_tensor, x])
Model visualization
from keras.utils import plot_model
plot_model(model, to_file='model.png', show_shapes=True)
Model summary
model.summary()
Custom models
class MyModel(keras.Model):
def __init__(self):
super().__init__()
self.dense = Dense(10)
def call(self, inputs):
return self.dense(inputs)
Real-world case
# Complex models with branches are easier using Functional API
# Use when standard layers aren’t sufficient for your logic
Building custom Layer class
class MyLayer(keras.layers.Layer):
def __init__(self):
super().__init__()
def call(self, inputs):
return inputs * 2
Using custom functions
class MultiplyByTen(keras.layers.Layer):
def call(self, inputs):
return inputs * 10
Custom activation
from keras.layers import Activation, Lambda
def custom_relu(x):
return tf.maximum(0.1 * x, x)
model.add(Lambda(custom_relu))
Custom loss function
def custom_loss(y_true, y_pred):
return tf.reduce_mean(tf.square(y_pred - y_true))
model.compile(loss=custom_loss, optimizer='adam')
Custom metric
def custom_accuracy(y_true, y_pred):
return tf.reduce_mean(tf.cast(tf.equal(y_true, tf.round(y_pred)), tf.float32))
Subclassing Model
class MyModel(keras.Model):
def __init__(self):
super().__init__()
self.dense = Dense(10)
def call(self, inputs):
return self.dense(inputs)
Custom training loop
with tf.GradientTape() as tape:
y_pred = model(x)
loss = custom_loss(y_true, y_pred)
grads = tape.gradient(loss, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
Debugging custom models
tf.print("Shape:", tf.shape(x))
Examples
# Create a custom attention mechanism by subclassing Layer
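One possible sketch of such a layer (a simple learned-weights attention over time steps, not a specific published design):
import tensorflow as tf

class SimpleAttention(tf.keras.layers.Layer):
    def build(self, input_shape):
        # one score weight per feature dimension
        self.w = self.add_weight(shape=(input_shape[-1], 1),
                                 initializer='glorot_uniform', trainable=True)

    def call(self, inputs):                                   # (batch, timesteps, features)
        scores = tf.nn.softmax(tf.matmul(inputs, self.w), axis=1)
        return tf.reduce_sum(scores * inputs, axis=1)         # weighted sum over time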
encoded = encoder(input_data)
decoded = decoder(encoded)
autoencoder = Model(inputs=input_data, outputs=decoded)
Architecture overview
input_img = Input(shape=(784,))
encoded = Dense(32, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(input_img, decoded)
Encoder & decoder models
encoder = Model(input_img, encoded)
encoded_input = Input(shape=(32,))
decoder_layer = autoencoder.layers[-1]
decoder = Model(encoded_input, decoder_layer(encoded_input))
Denoising autoencoder
noisy_input = input_img + noise
autoencoder.fit(noisy_input, clean_img, epochs=50)
Sparse autoencoder
encoded = Dense(64, activation='relu', activity_regularizer=regularizers.l1(1e-5))(input_img)
Variational autoencoder
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)
z = z_mean + tf.exp(0.5 * z_log_var) * epsilon
Applications
# Apply encoder to extract compressed representations for clustering
encoded_imgs = encoder.predict(x_test)
Image compression
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256)
Anomaly detection
if reconstruction_error > threshold:
print("Anomaly detected")
Visualization
encoded_imgs = encoder.predict(x_test)
plt.scatter(encoded_imgs[:,0], encoded_imgs[:,1])
# Example time series data
series = [112, 118, 132, 129, 121, 135, ...]
Sliding window approach
X = [series[i:i+window] for i in range(len(series)-window)]
y = [series[i+window] for i in range(len(series)-window)]
Data reshaping
X = np.reshape(X, (X.shape[0], X.shape[1], 1))
LSTM for forecasting
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(window, 1)))
model.add(Dense(1))
Multi-step predictions
model.add(Dense(3))  # Predict the next 3 time steps
Normalization
scaler = MinMaxScaler()
scaled_series = scaler.fit_transform(series.reshape(-1,1))
Evaluation metrics
mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
Visualizing predictions
plt.plot(y_true, label='Actual')
plt.plot(y_pred, label='Predicted')
Combining with CNN
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(window, 1)))
model.add(LSTM(50))
Real-world project
# Predict future stock prices using LSTM with sliding window
# Output: [class, x_min, y_min, x_max, y_max]
CNN backbone
base_model = tf.keras.applications.ResNet50(include_top=False, input_shape=(224,224,3))
YOLO with Keras
# YOLO uses a custom loss combining bounding-box regression and classification
SSD overview
# SSD uses anchor boxes at multiple feature map levels
Bounding boxes
box = [x_min, y_min, x_max, y_max]
Anchor boxes
# Anchors = reference boxes for detection layers
Label encoding
# Label = [class, x_center, y_center, width, height]
Transfer learning for detection
model = tf.keras.Model(inputs=base_model.input, outputs=detection_head)
Evaluating mAP
# mAP is calculated from the precision-recall curve for each class
Real-world example
# Detect faces in webcam feed with YOLOv5 or Haar cascades
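Related to the mAP evaluation above, a minimal IoU helper (a sketch; boxes use the [x_min, y_min, x_max, y_max] format shown earlier):
def iou(box_a, box_b):
    # intersection rectangle
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou([0, 0, 10, 10], [5, 5, 15, 15]))  # 25 / 175 ≈ 0.14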
# Each pixel labeled with a class (e.g., 0=background, 1=object)
segmentation_mask = model.predict(image)
U-Net architecture
# Load U-Net from segmentation_models library
import segmentation_models as sm
model = sm.Unet('resnet34', input_shape=(128,128,3), classes=1, activation='sigmoid')
Data preparation
# Normalize images, resize masks
image = image / 255.0
mask = tf.image.resize(mask, (128, 128))
Mask generation
# Convert RGB mask to one-hot encoded class mask
mask = tf.cast(mask == class_id, tf.float32)
Dice loss
def dice_loss(y_true, y_pred):
intersection = tf.reduce_sum(y_true * y_pred)
return 1 - (2. * intersection + 1) / (tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) + 1)
IoU metric
iou = tf.keras.metrics.MeanIoU(num_classes=2)
iou.update_state(y_true, y_pred)
Augmentation techniques
# Albumentations is popular for segmentation
import albumentations as A
A.HorizontalFlip(p=0.5)
Post-processing
# Convert logits to binary mask
mask = (model.predict(image) > 0.5).astype("uint8")
Applications
# Example: identify tumors in medical images
Keras example
model.compile(optimizer='adam', loss=dice_loss, metrics=['accuracy'])
model.fit(train_dataset, validation_data=val_dataset, epochs=10)
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
classifier("Hugging Face is awesome!")
Tokenizers
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokens = tokenizer("Hello!", return_tensors="pt")
Importing models
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
Text classification
from transformers import Trainer
trainer = Trainer(model=model, train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
Question answering
qa = pipeline("question-answering")
qa(question="What is Hugging Face?", context="Hugging Face is an AI company.")
Text generation
gen = pipeline("text-generation", model="gpt2")
gen("Once upon a time", max_length=50)
Fine-tuning BERT
trainer.train() # After loading model, tokenizer, and datasets
Datasets module
from datasets import load_dataset
dataset = load_dataset("imdb")
Saving & exporting models
model.save_pretrained("my_bert_model")
tokenizer.save_pretrained("my_bert_model")
Use case demo
summarizer = pipeline("summarization")
summarizer("Your long document text...")
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
initial_learning_rate=1e-2, decay_steps=10000, decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
Weight initialization
Dense(64, kernel_initializer='he_uniform')
Gradient clipping
optimizer = tf.keras.optimizers.Adam(clipvalue=1.0)
Mixed precision training
from tensorflow.keras.mixed_precision import set_global_policy
set_global_policy('mixed_float16')
XLA compilation
@tf.function(jit_compile=True)
def train_step(inputs): ...
Multi-GPU training
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
model = create_model()
TPU support
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
Quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
Pruning
import tensorflow_model_optimization as tfmot
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(model)
Model distillation
# Train student on soft labels from teacher
# student.predict(x_train) ≈ teacher.predict(x_train)
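A hedged sketch of a distillation loss (the temperature value and the teacher/student logits are assumptions):
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, temperature=3.0):
    soft_teacher = tf.nn.softmax(teacher_logits / temperature)   # softened targets
    soft_student = tf.nn.softmax(student_logits / temperature)
    return tf.keras.losses.kullback_leibler_divergence(soft_teacher, soft_student)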
# Conceptual example: Explain model output importance
# Actual implementation uses libraries like SHAP or LIME
SHAP
import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
LIME
from lime import lime_tabular
explainer = lime_tabular.LimeTabularExplainer(X_train)
exp = explainer.explain_instance(X_test[0], model.predict)
exp.show_in_notebook()
Grad-CAM
# Keras Grad-CAM example available in tensorflow tutorials
Feature importance
import matplotlib.pyplot as plt
plt.bar(feature_names, model.feature_importances_)
plt.show()
Saliency maps
# Use guided backpropagation or integrated gradients for saliency maps
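A minimal gradient-saliency sketch with GradientTape (model and a preprocessed img_array from earlier are assumed):
import tensorflow as tf

image = tf.convert_to_tensor(img_array[None, ...])     # add batch dimension
with tf.GradientTape() as tape:
    tape.watch(image)
    preds = model(image)
    class_idx = int(tf.argmax(preds[0]))               # top predicted class
    score = preds[:, class_idx]
grads = tape.gradient(score, image)
saliency = tf.reduce_max(tf.abs(grads), axis=-1)[0]    # per-pixel importance map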
Keras callbacks for visualization
from tensorflow.keras.callbacks import TensorBoard
tensorboard = TensorBoard(log_dir='./logs')
model.fit(X_train, y_train, callbacks=[tensorboard])
Explainable AI (XAI) tools
# pip install shap lime eli5 interpret
Interpretability for stakeholders
# Use dashboards and visual reports to communicate insights
Tools comparison
# Choose tool based on model type: tree, deep learning, tabular
model.save("model_saved_model")
# or
model.save("model.h5")
REST API with Flask
from flask import Flask, request, jsonify
app = Flask(__name__)
@app.route('/predict', methods=['POST'])
def predict():
data = request.json
prediction = model.predict(data['input'])
return jsonify({'prediction': prediction.tolist()})
TensorFlow Serving
# Run TF Serving docker container
docker run -p 8501:8501 --mount type=bind,\
source=/models/model_name/,target=/models/model_name/ \
-e MODEL_NAME=model_name tensorflow/serving
TensorFlow Lite
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model("model_saved_model")
tflite_model = converter.convert()
TensorFlow.js
# Convert model for TF.js
tensorflowjs_converter --input_format=tf_saved_model model_saved_model/ web_model/
AWS deployment
# Example: Deploy using SageMaker SDK (Python)
import sagemaker
# Configure and deploy model
Dockerizing Keras model
# Sample Dockerfile snippet
FROM python:3.8
COPY model.h5 /app/
Monitoring deployments
# Use Prometheus, Grafana for monitoring endpoints
Scaling inference
# Kubernetes autoscaling example
kubectl autoscale deployment model-server --min=2 --max=10 --cpu-percent=80
CI/CD for ML
# Use GitHub Actions, Jenkins for ML workflow automation
# Example: Tokenization with NLTK
import nltk
tokens = nltk.word_tokenize("Hello, how can I help?")
Intent classification
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
# Train classifier to predict intents
Entity extraction
import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("Book a flight to New York tomorrow")
entities = [(ent.text, ent.label_) for ent in doc.ents]
Sequence-to-sequence
# Simple seq2seq architecture in Keras
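A minimal encoder-decoder sketch with the Functional API (vocabulary size and layer dimensions are illustrative):
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

enc_in = Input(shape=(None,))
enc_emb = Embedding(5000, 64)(enc_in)
_, state_h, state_c = LSTM(64, return_state=True)(enc_emb)        # keep encoder states

dec_in = Input(shape=(None,))
dec_emb = Embedding(5000, 64)(dec_in)
dec_out = LSTM(64, return_sequences=True)(dec_emb, initial_state=[state_h, state_c])
outputs = Dense(5000, activation='softmax')(dec_out)               # next-token distribution
seq2seq = Model([enc_in, dec_in], outputs)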
Attention-based responses
# Attention layers in Transformer models
Context management
# Store user session data or dialogue states
Using transformers
from transformers import pipeline
chatbot = pipeline("conversational")
Chatbot deployment
# Deploy using Flask or serverless functions
Rasa integration
# Rasa example: training and running a chatbot
rasa train
rasa run
Evaluation
# Human-in-the-loop and automated testing methods
# Simple GAN architecture sketch
# Generator creates fake samples, discriminator classifies real vs fake
Generator architecture
from tensorflow.keras.layers import Dense, Reshape
generator = Sequential([
Dense(128, activation='relu', input_shape=(100,)),
Dense(784, activation='sigmoid'),
Reshape((28,28,1))
])
Discriminator model
discriminator = Sequential([
Flatten(input_shape=(28,28,1)),
Dense(128, activation='relu'),
Dense(1, activation='sigmoid')
])
Training loop
# Pseudocode for GAN training
# for each epoch:
#     train the discriminator on real and fake samples
#     train the generator via the discriminator's feedback
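A hedged sketch of one such alternating step with GradientTape (the loss function, learning rates, and noise_dim are assumptions; generator and discriminator are the models defined above):
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, noise_dim=100):
    noise = tf.random.normal([tf.shape(real_images)[0], noise_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_out = discriminator(real_images, training=True)
        fake_out = discriminator(fake_images, training=True)
        d_loss = bce(tf.ones_like(real_out), real_out) + bce(tf.zeros_like(fake_out), fake_out)
        g_loss = bce(tf.ones_like(fake_out), fake_out)   # generator wants fakes judged real
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))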
Stability tips
# Example: use batch normalization in generator layers
from tensorflow.keras.layers import BatchNormalization
generator.add(BatchNormalization())
Conditional GANs
# Add label inputs concatenated with noise vector
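A sketch of the combined generator input (class count and embedding size are illustrative):
from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense

noise_in = Input(shape=(100,))
label_in = Input(shape=(1,))
label_emb = Flatten()(Embedding(10, 50)(label_in))     # 10 classes, 50-dim embedding
merged = Concatenate()([noise_in, label_emb])          # [noise + label] as one vector
x = Dense(128, activation='relu')(merged)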
DCGAN
# Use Conv2DTranspose in generator, Conv2D in discriminator
CycleGAN
# Used for style transfer like horses ↔ zebras
Pix2Pix
# Example: input edges → output photo
GAN applications
# Generate realistic images, augment data, create artworks
# Example: CNN for X-ray image classification
Fraud detection
# Use anomaly detection algorithms on transaction data
E-commerce recommendations
# Collaborative filtering or content-based recommendation systems
Stock prediction
# Time series LSTM model for price prediction
Social media sentiment
# Sentiment analysis with LSTM or transformers
Autonomous driving
# Object detection with YOLO or SSD networks
Language translation
# Transformer-based translation systems
Voice synthesis
# Tacotron or WaveNet models
Facial recognition
# FaceNet or similar deep learning architectures
AR/VR
# Real-time gesture recognition using CNNs
# Use pipelines to automate workflows
Experiment tracking
import mlflow
mlflow.start_run()
mlflow.log_param("lr", 0.01)
mlflow.log_metric("accuracy", 0.95)
mlflow.end_run()
Model versioning
# Save models with version IDs
model.save("model_v1.h5")
CI/CD pipelines
# Example: GitHub Actions or Jenkins pipeline scripts
MLflow usage
mlflow.start_run()
mlflow.log_param("batch_size", 64)
mlflow.sklearn.log_model(model, "model")
mlflow.end_run()
Monitoring models
# Use Prometheus or custom dashboards
Drift detection
# Statistical tests or dedicated libraries like Alibi Detect
Feature stores
# Feast or similar feature store tools
Real-time inference
# TensorFlow Serving or FastAPI endpoints
Best practices
# Automate testing, versioning, and monitoring
# Example: Checking class balance in Python
from collections import Counter
print(Counter(y_train))
# Use AIF360 library to compute fairness metrics
from aif360.metrics import ClassificationMetric
# Document model decisions and data sources for auditability
import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# TensorFlow Federated example to train without centralized data
# Differentially private training with TensorFlow Privacy's DP optimizer
from tensorflow_privacy.privacy.optimizers.dp_optimizer import DPGradientDescentOptimizer
# Automate model fairness and performance checks in CI/CD pipelines
# Maintain governance docs and approval workflows
# Use human review for flagged predictions before final action
# Include data processing agreements and consent management
# Monitor training logs for errors and warnings
# Apply gradient clipping in TensorFlow
optimizer = tf.keras.optimizers.Adam()
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = tf.keras.losses.binary_crossentropy(y, model(x, training=True))
    gradients = tape.gradient(loss, model.trainable_variables)
    clipped = [tf.clip_by_norm(g, 1.0) for g in gradients]  # cap each gradient's norm at 1.0
    optimizer.apply_gradients(zip(clipped, model.trainable_variables))
# Use ReLU activations to mitigate vanishing gradients
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dropout(0.5))
# Visualize samples and labels to check alignment
# Use train_test_split with stratification and shuffle
import numpy as np
import tensorflow as tf
np.random.seed(42)
tf.random.set_seed(42)
import mlflow
mlflow.log_param("lr", 0.001)
# Use pdb in Python scripts
import pdb; pdb.set_trace()
# TensorBoard profiling example
tensorboard --logdir=logs/profile
# Explore roles on job boards like LinkedIn, Glassdoor
# Use LeetCode and interview prep platforms
# Create repositories with clear READMEs and demos
git init
git add .
git commit -m "Initial commit"
git push origin main
# Examples: Google AI, IBM Data Science Professional Certificate
# Tailor resumes for each job application
# Practice behavioral and technical questions regularly
# Join platforms like Upwork or Freelancer
# Start a blog or YouTube channel on AI topics
# Use arXiv, Twitter, and conferences like NeurIPS