About the Course
SYLLABUS
Module 1: Introduction to Data Science
Overview of Data Science: Definition, scope, and applications. Comparing Key Fields: Data Science vs. Machine Learning vs. Deep Learning vs. Artificial Intelligence. Types of Data Science Problems: Supervised vs. Unsupervised learning. Data Representation and Understanding Data Types.
Common Distance and Similarity Measures in Data Science: Euclidean Distance, Manhattan Distance, Chebyshev Distance, Cosine Similarity, Hamming Distance, Mahalanobis Distance.
Introduction to Feature Engineering. Feature Selection Techniques: Importance of feature selection and methods for identifying the features most relevant to model performance.
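As an illustration of the measures listed above, the following is a minimal sketch in plain Python, not course material. The function names are hypothetical and chosen for this example only. Mahalanobis distance is left out because it also needs the inverse covariance matrix of the data, which is easier to handle with a numerical library.

```python
import math


def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences (city-block distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute difference along any single coordinate.
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors; this is a similarity, not a distance.
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


p, q = [0.0, 0.0], [3.0, 4.0]
print(euclidean(p, q))   # 5.0
print(manhattan(p, q))   # 7.0
print(chebyshev(p, q))   # 4.0
print(hamming("karolin", "kathrin"))  # 3
```

The same pair of points gives three different distances, which is why the choice of metric matters for distance-based methods such as KNN.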
Module 2: Fundamentals of Learning from Data
Exploratory Data Analysis (EDA): Descriptive Statistics, Data Cleaning, Correlation Analysis, Data Normalisation, and Outlier Detection Techniques. Introduction to Learning Algorithms. The K-Nearest Neighbors (KNN) Algorithm: Concepts and Implementation. Evaluating Model Performance: Confusion Matrix, Accuracy, Precision, Recall, and F1-Score. The Bias-Variance Trade-off.
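The evaluation measures named above all derive from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name, the `positive` parameter, and the toy labels are assumptions made for this example, and returning 0.0 when a denominator is zero is one common convention rather than the only one.

```python
def classification_metrics(y_true, y_pred, positive=1):
    # Count the four confusion-matrix cells for a binary problem.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Assumed convention: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)  # of predicted positives, how many are correct
    recall = ratio(tp, tp + fn)     # of actual positives, how many were found
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels invented for this example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))
```

On these toy labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so accuracy is 0.8 while precision, recall, and F1 are each 0.75.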
Module 3: Advanced Supervised Learning Techniques
Linear Regression: Theory, assumptions, and evaluation. Logistic Regression: Understanding classification through probabilistic models. Polynomial Regression.
Regularization Techniques: L1 Regularization (Lasso), L2 Regularization (Ridge).
Decision Trees: Structure and working, Information Gain, Gini Index, Overfitting and pruning strategies. Ensemble Methods: Bagging and Random Forests (diversity through bootstrap aggregation); Boosting (AdaBoost and Gradient Boosting).
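The two split criteria named above, Gini Index and Information Gain, can be computed directly from class counts. The following is a minimal sketch with hypothetical function names and invented toy labels; it illustrates the standard formulas and is not tied to any particular library's tree implementation.

```python
import math
from collections import Counter


def gini(labels):
    # Gini impurity: 1 minus the sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    # Shannon entropy in bits: -sum(p * log2(p)) over the classes present.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    # Entropy of the parent node minus the size-weighted entropy of its children.
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy node with a 50/50 class mix, invented for this example.
parent = ["yes"] * 5 + ["no"] * 5
print(gini(parent))     # 0.5
print(entropy(parent))  # 1.0

# A split that separates the classes perfectly recovers all of the parent's entropy.
print(information_gain(parent, [["yes"] * 5, ["no"] * 5]))  # 1.0
```

A split that leaves both children as mixed as the parent has an information gain of zero, which is why a tree learner prefers splits that produce purer child nodes.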
Module 4: Unsupervised Learning and Dimensionality Reduction
Dimensionality Reduction: Principal Component Analysis (PCA): Concepts, eigenvalues, and explained variance.
K-Means Clustering: Objective and algorithm. Hierarchical Clustering: Linkage methods and dendrograms. DBSCAN clustering algorithm.
Module 5: Introduction to Data Visualization in Python
Basics of Python – Introduction to Matplotlib – Handling Data in Python – Basic Plotting – Line Plots – Area Plots – Histograms – Bar Charts – Pie Charts – Box Plots – Scatter Plots – Bubble Plots – Waffle Charts – Word Clouds – Seaborn and Regression Plots – Introduction to the Folium Library – Introduction to Dashboards with Plotly and Dash