Data Science for Engineers (102905/AD531M)

RajagiriTech

About the Course

SYLLABUS

Module 1: Introduction to Data Science
Overview of Data Science: Definition, scope, and applications. Comparing Key Fields: Data Science vs. Machine Learning vs. Deep Learning vs. Artificial Intelligence. Types of Data Science Problems: Supervised vs. Unsupervised learning. Data Representation and Understanding Data Types.
Famous Distance Metrics in Data Science: Euclidean Distance, Manhattan Distance, Chebyshev Distance, Cosine Similarity, Hamming Distance, Mahalanobis Distance.
Introduction to Feature Engineering: Feature selection techniques, their importance, and methods to identify the features relevant to model performance.

Module 2: Fundamentals of Learning from Data
Exploratory Data Analysis (EDA): Descriptive Statistics, Data Cleaning, Correlation Analysis, Data Normalisation, and Outlier Detection Techniques. Introduction to Learning Algorithms. The K-Nearest Neighbors (KNN) Algorithm: Concepts and Implementation. Evaluating Model Performance: Confusion Matrix, Accuracy, Precision, Recall, and F1-Score. The Bias-Variance Trade-off.

Module 3: Advanced Supervised Learning Techniques
Linear Regression: Theory, assumptions, and evaluation. Logistic Regression: Understanding classification through probabilistic models. Polynomial Regression.
Regularization Techniques: L1 Regularization (Lasso), L2 Regularization (Ridge).
Decision Trees: Structure and working, Information Gain, Gini Index, Overfitting and pruning strategies. Ensemble Methods: Bagging and Random Forests (diversity through bootstrap aggregation); Boosting: AdaBoost and Gradient Boosting.

Module 4: Unsupervised Learning and Dimensionality Reduction
Dimensionality Reduction: Principal Component Analysis (PCA): Concepts, eigenvalues, and variance explanation.
K-Means Clustering: Objective and algorithm. Hierarchical Clustering: Linkage methods and dendrograms. DBSCAN clustering algorithm.

Module 5: Introduction to Data Visualization in Python
Basics of Python – Introduction to Matplotlib – Handling Data in Python – Basic Plotting – Line Plots – Area Plots – Histograms – Bar Charts – Pie Charts – Box Plots – Scatter Plots – Bubble Plots – Waffle Charts – Word Clouds – Seaborn and Regression Plots – Introduction to the folium library – Introduction to Dashboards with Plotly and Dash.

Requirements

Basic knowledge of Probability Theory, Linear Algebra, Statistics, and Python Programming.

Course Staff

Dr. Jithin Mathews

Assistant Professor, Department of Artificial Intelligence and Data Science, Rajagiri School of Engineering & Technology.

Qualification: B. Tech, M. Tech, and PhD in CSE

Experience: One year in Industry and 4 years in academia.

Areas of Interest: Machine Learning, Social Network Analysis

Ms. Aiswarya Mohan

Assistant Professor, Department of Artificial Intelligence and Data Science, Rajagiri School of Engineering & Technology.

Qualification: B. Tech in CSE, M. Tech in CSE

Experience: One year in Industry and 4.5 years in academia.

Areas of Interest: Deep Learning, Artificial Intelligence, Security in Image Processing

Binu A