Machine Learning Interview Questions


Beginners To Experts




Machine Learning is a subset of Artificial Intelligence that allows systems to learn from data and improve performance without being explicitly programmed.

<!-- Example -->
<p>Machine Learning helps email systems to filter spam messages based on data patterns.</p>

The main types of Machine Learning are Supervised Learning, Unsupervised Learning, Semi-supervised Learning, and Reinforcement Learning.

<!-- Example -->
<ul>
  <li>Supervised Learning</li>
  <li>Unsupervised Learning</li>
  <li>Semi-supervised Learning</li>
  <li>Reinforcement Learning</li>
</ul>

Supervised Learning is a type of ML where the model is trained on labeled data.

<!-- Example -->
<p>Training a model to predict house prices using past housing data with known prices.</p>

Unsupervised Learning is a type of ML where the model is trained on unlabeled data to find hidden patterns.

<!-- Example -->
<p>Clustering customers based on purchasing behavior without knowing their buying intent.</p>
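
As a minimal sketch of unsupervised learning in code, the example below clusters made-up customer data with k-means (the feature values are purely illustrative):

# Example (illustrative sketch): k-means clustering with scikit-learn
from sklearn.cluster import KMeans
import numpy as np

# Hypothetical purchasing data: [items_per_month, average_spend]
X = np.array([[2, 20], [3, 25], [25, 300], [30, 340], [4, 22], [28, 310]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print(labels)  # cluster assignment for each customer, learned without labels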

Reinforcement Learning is a type of ML where an agent learns to make decisions by receiving rewards or penalties.

<!-- Example -->
<p>Teaching a robot to walk by rewarding successful steps and penalizing falls.</p>

AI is a broader concept where machines simulate human intelligence. ML is a subset of AI focused on learning from data.

<!-- Example -->
<p>AI includes ML, natural language processing, robotics, etc. ML specifically deals with data-driven learning.</p>

Overfitting is when a model learns the training data too well and performs poorly on new data.

<!-- Example -->
<p>A model predicting stock prices perfectly on past data but failing on new unseen data.</p>

Underfitting is when a model is too simple to learn the underlying pattern in the data.

<!-- Example -->
<p>Using a linear model for a complex dataset with curves and interactions.</p>

A training set is the portion of the dataset used to train a model.

<!-- Example -->
<p>70% of the data used for training a spam detection algorithm.</p>

A test set is used to evaluate the performance of a trained model on unseen data.

<!-- Example -->
<p>Using 30% of the total dataset for testing after training the model.</p>

Cross-validation is a technique to assess model performance by splitting the data into multiple subsets and training/testing on different combinations.

<!-- Example -->
<p>K-fold cross-validation divides the data into K parts and rotates training/testing on each fold.</p>

A feature is an individual measurable property or characteristic of a data point.

<!-- Example -->
<p>In a dataset about houses, features include square footage, number of rooms, and location.</p>

A label is the output or target value associated with each training example.

<!-- Example -->
<p>For a spam classifier, labels are "spam" or "not spam" tags for each email.</p>

A model is a function or algorithm that makes predictions based on input data.

<!-- Example -->
<p>A decision tree model classifies whether a loan should be approved based on applicant details.</p>

Linear regression is a supervised learning algorithm that models the relationship between input features and a continuous output.

<!-- Example -->
<p>Predicting house price based on size and number of rooms using a best-fit line.</p>
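
A minimal scikit-learn sketch of this idea, using made-up sizes and prices:

# Example (illustrative sketch): linear regression on hypothetical housing data
from sklearn.linear_model import LinearRegression
import numpy as np

X = np.array([[1000], [1500], [2000], [2500]])  # size in square feet (hypothetical)
y = np.array([200000, 280000, 360000, 440000])  # sale price (hypothetical)

model = LinearRegression()
model.fit(X, y)
print(model.predict([[1800]]))  # predicted price for an 1800 sq ft house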

Logistic regression is a classification algorithm used to predict the probability of a binary outcome.
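
A minimal scikit-learn sketch on a toy binary dataset, showing how logistic regression outputs class probabilities:

# Example (illustrative sketch): logistic regression probabilities
from sklearn.linear_model import LogisticRegression
import numpy as np

X = np.array([[1], [2], [3], [10], [11], [12]])  # toy feature values
y = np.array([0, 0, 0, 1, 1, 1])                 # binary labels

clf = LogisticRegression()
clf.fit(X, y)
print(clf.predict_proba([[6]]))  # probability of class 0 and class 1 for a new point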

Regression predicts continuous values, while classification predicts discrete labels or categories.

A decision tree is a flowchart-like tree structure where internal nodes represent tests on features, and leaf nodes represent output labels.
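
A small sketch of a decision tree classifier trained on the Iris dataset with scikit-learn:

# Example (illustrative sketch): decision tree on the Iris dataset
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(iris.data, iris.target)
print(tree.predict(iris.data[:5]))  # predicted classes for the first five samples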

A random forest is an ensemble method that combines multiple decision trees to improve performance and reduce overfitting.

Overfitting occurs when a model learns the noise in training data. Prevent it using techniques like cross-validation, regularization, and pruning.

Underfitting happens when a model is too simple to capture the data pattern. Fix it by increasing model complexity or training longer.

A confusion matrix is a table used to evaluate classification models by comparing predicted and actual values.

Precision is the ratio of true positives to total predicted positives; recall is the ratio of true positives to actual positives.

The F1-score is the harmonic mean of precision and recall. It balances both metrics, which is especially useful for imbalanced datasets.
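
A small sketch computing these metrics with scikit-learn on hand-made labels and predictions:

# Example (illustrative sketch): precision, recall, and F1 on toy labels
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall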

Bias refers to the error due to overly simplistic assumptions in the model. High bias can cause underfitting.

Variance refers to the model’s sensitivity to small fluctuations in training data. High variance can lead to overfitting.

The bias-variance tradeoff is the balance between underfitting (high bias) and overfitting (high variance).

Gradient descent is an optimization algorithm that updates model weights to minimize the loss function by moving in the direction of steepest descent.

The learning rate is a hyperparameter that controls the step size of gradient descent. If it is too high, training may overshoot the minimum; if it is too low, training may be very slow.
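
A minimal NumPy sketch of gradient descent for a one-parameter linear model, where lr is the learning rate (the data and settings are chosen purely for illustration):

# Example (illustrative sketch): gradient descent for y = w * x
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])  # underlying relationship: y = 2x

w, lr = 0.0, 0.01                        # initial weight and learning rate
for _ in range(1000):
    grad = np.mean(2 * (w * x - y) * x)  # derivative of mean squared error w.r.t. w
    w -= lr * grad                       # step in the direction of steepest descent
print(w)  # approaches 2.0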

Feature engineering is the process of selecting, modifying, or creating new features to improve model performance.

One-hot encoding converts categorical variables into binary vectors representing presence or absence of each category.
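
A short sketch using scikit-learn's OneHotEncoder on a made-up color column:

# Example (illustrative sketch): one-hot encoding a categorical feature
from sklearn.preprocessing import OneHotEncoder

colors = [['red'], ['green'], ['blue'], ['green']]
encoder = OneHotEncoder()
encoded = encoder.fit_transform(colors).toarray()
print(encoder.categories_)
print(encoded)  # one binary column per category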

Cross-validation is a technique to assess model generalization by splitting data into training and validation folds.

PCA is a dimensionality reduction technique that transforms correlated features into uncorrelated principal components.

Regularization adds a penalty term to the loss function to reduce model complexity and prevent overfitting.

L1 regularization adds an absolute-value penalty on the weights, encouraging sparsity; L2 adds a squared penalty, shrinking weights more evenly.

Early stopping halts training when validation performance stops improving to prevent overfitting.

A learning curve plots model performance against training size or epochs to diagnose bias or variance issues.
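
A short sketch using scikit-learn's learning_curve utility (it assumes a feature matrix X and labels y are already defined, as in the other examples):

# Example (illustrative sketch): computing a learning curve
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LogisticRegression
import numpy as np

train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5))
print(train_scores.mean(axis=1))  # average training accuracy at each training size
print(val_scores.mean(axis=1))    # average validation accuracy at each training size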

Normalization rescales features to a common scale, usually between 0 and 1, to improve training stability.

Standardization scales data to have zero mean and unit variance, which is useful for algorithms that assume normally distributed features.

SVM is a supervised algorithm that finds the optimal hyperplane to separate classes with maximum margin.

The kernel trick implicitly maps data into a higher-dimensional space to make it linearly separable, without explicitly computing the coordinates in that space.
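
A small sketch showing an RBF-kernel SVM on toy XOR-like points that no straight line can separate (the kernel and parameter values are arbitrary):

# Example (illustrative sketch): SVM with an RBF kernel on XOR-like data
from sklearn.svm import SVC
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # not linearly separable

clf = SVC(kernel='rbf', gamma=2.0, C=10.0)
clf.fit(X, y)
print(clf.predict(X))  # the RBF kernel lets the SVM capture this pattern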

Neural networks are interconnected layers of nodes designed to recognize patterns and model complex data.

Activation functions introduce non-linearity to neural networks, enabling them to learn complex functions.

Examples include sigmoid, ReLU, tanh, and softmax.
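
A tiny NumPy sketch of some common activation functions:

# Example (illustrative sketch): common activation functions in NumPy
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), relu(z), np.tanh(z), softmax(z))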

Backpropagation is an algorithm to update neural network weights by propagating the error backward using gradients.

Dropout randomly disables neurons during training to prevent overfitting and improve generalization.

Batch normalization normalizes layer inputs to stabilize and accelerate training.
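
A minimal Keras sketch (layer sizes and shapes are arbitrary) that places batch normalization between layers:

# Example (illustrative sketch): batch normalization in a Keras model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),
    BatchNormalization(),            # normalizes this layer's inputs during training
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy')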

CNNs use convolutional layers to extract spatial features, commonly used in image processing tasks.

RNNs have loops allowing information to persist, ideal for sequential data like text or time series.

LSTM is a type of RNN designed to capture long-term dependencies with gating mechanisms.
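
A minimal Keras sketch of an LSTM for sequence data (the sequence length and feature count are illustrative):

# Example (illustrative sketch): LSTM for sequences of 10 timesteps with 8 features each
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(32, input_shape=(10, 8)),  # 10 timesteps, 8 features per step
    Dense(1)                        # e.g., predict the next value in the series
])
model.compile(optimizer='adam', loss='mse')
model.summary()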

Overfitting occurs when a machine learning model learns the training data too well, including noise and outliers, leading to poor generalization on unseen data. Prevention techniques include:
- Using more training data
- Applying regularization (L1, L2)
- Pruning models (in decision trees)
- Early stopping during training
- Cross-validation to monitor model performance
- Simplifying the model architecture
Understanding and controlling overfitting is essential for building robust ML models.

k-Nearest Neighbors (k-NN) is a simple supervised ML algorithm that classifies a sample based on a majority vote of its k nearest neighbors. Below is a simple Python example using scikit-learn:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)

# Create k-NN classifier with k=3
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Predict on test data
predictions = knn.predict(X_test)

print(predictions)

This code trains a k-NN model on the Iris dataset and prints predictions for the test set.

The bias-variance tradeoff balances two sources of error that affect ML model performance:
- Bias: Error from overly simplistic assumptions, causing underfitting.
- Variance: Error from excessive sensitivity to training data noise, causing overfitting.
A good model minimizes total error by balancing bias and variance. Techniques like cross-validation help find this balance.
Understanding this tradeoff helps in selecting the right model complexity for your data.
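
As a small sketch of the tradeoff, compare a very shallow and an unrestricted decision tree on the same synthetic data; the depths and dataset are arbitrary, but the deeper tree typically shows a larger gap between training and test accuracy:

# Example (illustrative sketch): high-bias vs. high-variance decision trees
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

for depth in (1, None):  # depth 1: high bias (underfits); no limit: high variance (overfits)
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    print(depth, tree.score(X_train, y_train), tree.score(X_test, y_test))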

Data normalization rescales features to a fixed range (usually 0 to 1). MinMaxScaler achieves this by subtracting the minimum and dividing by the data range. Example using scikit-learn:

from sklearn.preprocessing import MinMaxScaler
import numpy as np

data = np.array([[10, 200], [15, 300], [20, 400]], dtype=float)

scaler = MinMaxScaler()
normalized_data = scaler.fit_transform(data)

print(normalized_data)

This code normalizes the 2D array column-wise between 0 and 1.

Classification predicts discrete labels or categories (e.g., spam or not spam).
Regression predicts continuous values (e.g., house price, temperature).
Both are supervised learning tasks but differ in output types and algorithms used.
For example, logistic regression is used for classification, while linear regression is used for regression problems.
Choosing the correct task depends on your prediction goal.

Model deployment is the process of integrating a trained machine learning model into a production environment where it can make predictions on real data.

# Example: Deploying a model using Flask
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)
model = joblib.load('model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    prediction = model.predict([data['features']])
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(debug=True)

Overfitting can be prevented using techniques like cross-validation, regularization, pruning (for decision trees), and using more data or simpler models.

# Example: Using L2 Regularization in Logistic Regression
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(penalty='l2', C=0.1)
model.fit(X_train, y_train)

Hyperparameter tuning is the process of choosing the optimal parameters for a learning algorithm to improve performance.

# Example: Grid Search for tuning SVM parameters
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
grid = GridSearchCV(SVC(), parameters)
grid.fit(X_train, y_train)
print(grid.best_params_)

Feature scaling ensures that each feature contributes equally to the model by putting them on a similar scale.

# Example: Using StandardScaler
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

Imbalanced datasets can be handled using techniques like oversampling the minority class, undersampling the majority class, or using specialized algorithms.

# Example: Using SMOTE for oversampling
from imblearn.over_sampling import SMOTE
sm = SMOTE(random_state=42)
X_res, y_res = sm.fit_resample(X, y)

Dimensionality reduction reduces the number of input variables in a dataset. It helps improve model performance and visualization.

# Example: PCA in sklearn
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

The curse of dimensionality refers to the exponential increase in data requirements as the number of features grows. It can lead to overfitting and computational inefficiency.

# Example: Comparing model performance before and after dimensionality reduction
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
model = RandomForestClassifier()
model.fit(X_train, y_train)
original_score = accuracy_score(y_test, model.predict(X_test))

pca = PCA(n_components=5)
X_pca = pca.fit_transform(X)
X_train_pca, X_test_pca, y_train, y_test = train_test_split(X_pca, y, test_size=0.3)
model.fit(X_train_pca, y_train)
pca_score = accuracy_score(y_test, model.predict(X_test_pca))
print("Original Score:", original_score)
print("PCA Score:", pca_score)

Ensemble methods combine multiple models to improve prediction accuracy. Popular methods include Bagging, Boosting, and Stacking.

# Example: Using Random Forest (Bagging technique)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Stacking is an ensemble learning technique that combines predictions from multiple base models using a meta-model.

# Example: Using StackingClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

base_learners = [ ('dt', DecisionTreeClassifier()), ('svc', SVC(probability=True)) ]
stack_model = StackingClassifier(estimators=base_learners, final_estimator=LogisticRegression())
stack_model.fit(X_train, y_train)
stack_preds = stack_model.predict(X_test)

AdaBoost is a boosting algorithm that adjusts the weights of incorrectly classified samples to focus more on hard examples in subsequent rounds.

# Example: AdaBoost with decision trees
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
model = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1), n_estimators=50)  # 'estimator' replaces the deprecated 'base_estimator' in recent scikit-learn versions
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Stacking is an ensemble learning method where multiple models are trained, and a meta-model learns to combine their predictions.

# Example: Stacking with two base models and a meta-model
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
estimators = [('svm', SVC()), ('tree', DecisionTreeClassifier())]
model = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Batch size is the number of training samples processed before the model is updated. It impacts model accuracy and training time.

# Example: Training with different batch sizes
model.fit(X_train, y_train, epochs=10, batch_size=32)
model.fit(X_train, y_train, epochs=10, batch_size=64)

Overfitting happens when the model learns the training data too well and fails to generalize to new data.

# Example: Overfitting visible in training vs. validation accuracy
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100)

Techniques to prevent overfitting in neural networks include dropout, regularization, early stopping, and using more training data.

# Example: Applying dropout
from tensorflow.keras.layers import Dropout
model.add(Dropout(0.5))

Dropout randomly disables neurons during training, preventing co-dependence and improving generalization.

# Example: Adding dropout to a layer
model.add(Dropout(0.25))

Early stopping halts training when performance on validation data starts degrading, preventing overfitting.

# Example: Using early stopping in Keras
from tensorflow.keras.callbacks import EarlyStopping
callback = EarlyStopping(monitor='val_loss', patience=3)
model.fit(X_train, y_train, validation_data=(X_val, y_val), callbacks=[callback])

L1 regularization adds the absolute value of weights to the loss, encouraging sparsity in the model.

# Example: Applying L1 regularization
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense
Dense(64, kernel_regularizer=regularizers.l1(0.01))

L2 regularization adds the square of the weights to the loss, preventing large weights and reducing overfitting.

# Example: Applying L2 regularization
Dense(64, kernel_regularizer=regularizers.l2(0.01))

CNN is a deep learning model primarily used for image processing, which uses convolutional layers to extract features.

# Example: Simple CNN with Keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
model.add(Conv2D(32, (3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))

A pooling layer reduces the spatial dimensions of feature maps and helps control overfitting.

# Example: MaxPooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

KNN classifies a data point based on how its neighbors are classified. It finds the 'k' closest points and assigns the label most common among them.

# Example: Using KNN in scikit-learn
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Regularization techniques like L1 (Lasso) and L2 (Ridge) help prevent overfitting by adding penalties to the loss function during training.

# Example: Ridge regression
from sklearn.linear_model import Ridge
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

The ROC curve plots the true positive rate against the false positive rate at various thresholds. AUC measures the area under this curve and summarizes overall model performance.

# Example: Plotting ROC curve
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
fpr, tpr, thresholds = roc_curve(y_test, model.predict_proba(X_test)[:,1])
roc_auc = auc(fpr, tpr)
plt.plot(fpr, tpr, label='AUC = %0.2f' % roc_auc)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()

Multicollinearity occurs when independent variables are highly correlated, which can distort the model. Solutions include removing variables or using PCA.

# Example: Using VIF to detect multicollinearity
from statsmodels.stats.outliers_influence import variance_inflation_factor
import pandas as pd
vif_data = pd.DataFrame()
vif_data["feature"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(len(X.columns))]
print(vif_data)

A confusion matrix summarizes the performance of a classification model using true positives, false positives, true negatives, and false negatives.

# Example: Confusion Matrix with scikit-learn
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, predictions)
print(cm)

Handling missing data is crucial. Techniques include removing rows, imputing with mean/median/mode, or using algorithms that support missing values.

# Example: Impute missing values with mean
from sklearn.impute import SimpleImputer
import numpy as np
imp = SimpleImputer(strategy='mean')
X_imputed = imp.fit_transform(X)

Regularization reduces overfitting by penalizing large coefficients in models. L1 (Lasso) and L2 (Ridge) are common forms.

# Example: Ridge regression
from sklearn.linear_model import Ridge
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)

Logistic regression models the probability of a binary outcome using a logistic function (sigmoid).

# Example: Logistic regression model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Cross-validation splits the data into k folds and trains the model on k-1 parts while testing on the remaining. It helps evaluate model performance reliably.

# Example: 5-fold cross-validation
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=5)
print("Average score:", scores.mean())

Principal Component Analysis (PCA) is a dimensionality reduction method that transforms features into a set of uncorrelated variables called principal components.

# Example: Applying PCA
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

The ROC (Receiver Operating Characteristic) curve plots TPR vs. FPR at various threshold settings, evaluating classification performance.

# Example: Plot ROC curve
from sklearn.metrics import roc_curve, auc
y_score = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

Batch learning trains on the full dataset at once, while online learning updates the model incrementally with each data point or batch.

# Example: Online learning with SGDClassifier
from sklearn.linear_model import SGDClassifier
import numpy as np
model = SGDClassifier()
for X_batch, y_batch in data_stream:  # data_stream: any iterable of (X, y) mini-batches
    model.partial_fit(X_batch, y_batch, classes=np.unique(y))

Anomaly detection identifies data points that deviate significantly from the norm. It's useful in fraud detection, health monitoring, etc.

# Example: Isolation Forest for anomaly detection
from sklearn.ensemble import IsolationForest
model = IsolationForest()
model.fit(X)
predictions = model.predict(X)

Stratified sampling splits data while preserving class proportions. It's vital in classification tasks with imbalanced datasets.

# Example: Stratified train-test split
from sklearn.model_selection import StratifiedShuffleSplit
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2)
for train_index, test_index in split.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

A confusion matrix summarizes the prediction results of a classification model, showing true positives, false positives, true negatives, and false negatives.

# Example: Confusion matrix with sklearn
from sklearn.metrics import confusion_matrix
predictions = model.predict(X_test)
cm = confusion_matrix(y_test, predictions)
print(cm)

Bagging (Bootstrap Aggregating) trains multiple models on different subsets of the training data and combines their results to improve performance and reduce variance.

# Example: Bagging classifier
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
model = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10)  # 'estimator' replaces 'base_estimator' in recent scikit-learn versions
model.fit(X_train, y_train)

Pre-pruning stops tree growth early based on criteria like maximum depth, while post-pruning removes branches after the tree has fully grown; both aim to avoid overfitting.

# Example: Set max depth for pre-pruning
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train, y_train)

A hyperparameter is a configuration set before training a model (e.g., learning rate, number of trees). It is tuned using techniques like GridSearchCV or RandomizedSearchCV.

# Example: Grid search for best parameters
from sklearn.model_selection import GridSearchCV
params = {'n_estimators': [50, 100]}
grid = GridSearchCV(estimator=model, param_grid=params, cv=3)
grid.fit(X_train, y_train)

An SVM finds the optimal hyperplane that best separates classes. It works well for high-dimensional data and uses kernel tricks for non-linearity.

# Example: Linear SVM
from sklearn.svm import SVC
model = SVC(kernel='linear')
model.fit(X_train, y_train)
      

Feature engineering is the process of selecting, transforming, or creating new input features to improve model performance.

# Example: Creating an interaction feature (assumes X is a pandas DataFrame)
X['new_feature'] = X['feature1'] * X['feature2']

Overfitting occurs when a model performs well on training data but poorly on unseen data. Prevent it by using regularization, pruning, cross-validation, or simplifying the model.

# Example: Use regularization in logistic regression
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(C=0.1)
model.fit(X_train, y_train)

K-fold cross-validation splits data into K subsets. The model is trained on K-1 parts and validated on the remaining one. This process is repeated K times to ensure robust evaluation.

# Example: 5-fold cross-validation
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())

Gradient descent comes in three types: Batch (uses whole dataset), Stochastic (uses one sample), and Mini-batch (uses small batches). Each has tradeoffs in speed and stability.

# Example: SGD in Scikit-learn
from sklearn.linear_model import SGDClassifier
model = SGDClassifier()
model.fit(X_train, y_train)

The ROC curve shows the trade-off between true positive and false positive rates. AUC (Area Under Curve) summarizes its performance. Closer to 1 means better classification.

# Example: Plot ROC curve
from sklearn.metrics import roc_curve, auc
y_scores = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_scores)
roc_auc = auc(fpr, tpr)
print(roc_auc)

Bagging (Bootstrap Aggregating) reduces variance by training multiple models on different subsets of data and averaging their predictions. It's commonly used in ensemble methods like Random Forest.

# Example: Bagging with Decision Tree
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
model = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=10)  # 'estimator' replaces 'base_estimator' in recent scikit-learn versions
model.fit(X_train, y_train)

Boosting is an ensemble technique that builds models sequentially. Each new model corrects errors from previous ones. It improves accuracy and reduces bias.

# Example: Gradient Boosting
from sklearn.ensemble import GradientBoostingClassifier
model = GradientBoostingClassifier(n_estimators=100)
model.fit(X_train, y_train)

Feature selection improves model performance by removing irrelevant or redundant data. It also reduces overfitting and training time.

# Example: SelectKBest feature selection
from sklearn.feature_selection import SelectKBest, f_classif
selector = SelectKBest(score_func=f_classif, k=5)
X_new = selector.fit_transform(X, y)

Dimensionality reduction reduces the number of input variables. PCA (Principal Component Analysis) is a common technique to project data to lower dimensions while preserving variance.

# Example: PCA
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

Model interpretability means understanding how a model makes decisions. It's essential in sensitive domains like healthcare and finance to ensure transparency and trust.

# Example: Using SHAP for interpretability
import shap
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)
shap.summary_plot(shap_values, X_test)

Regularization adds a penalty to the loss function, discouraging large coefficients. This helps the model generalize better and reduces overfitting.

# Example: L2 Regularization with Ridge Regression
from sklearn.linear_model import Ridge
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)

Cross-validation splits the data into multiple training and validation sets. This gives a better estimate of model performance and reduces the risk of overfitting.

# Example: 5-Fold Cross Validation
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())