ONLINE WORKSHOP SERIES ON ESSENTIALS OF DATA SCIENCE
The first step toward becoming industry-ready for top data science job roles.
Modules
Module | Workshop dates | Application Deadline |
---|---|---|
1. Introduction to Exploratory Data Analysis Using R | 16th, 21st, 28th, 30th May 2024 | 14th May 2024 |
2. Data Management | 4th, 6th, 11th, 13th June 2024 | 3rd May 2024 |
3. Introduction to Statistical Models | 18th, 20th, 25th, 27th June 2024 | 17th June 2024 |
4. Introduction to Machine Learning | 2nd, 4th, 9th, 11th, 16th July 2024 | 1st July 2024 |
5. Data Visualization and Big Data | 18th, 23rd, 25th, 30th July 2024 | 17th July 2024 |
6. Python for Data Science | 1st, 6th, 8th, 13th August 2024 | 31st July 2024 |
Lecture Panel
- Dr. SD Viswakula
- Dr. RV Jayatillake
- Dr. GP Lakraj
- Dr. AA Sunethra
- Dr. D. Wickramarachchi
- Mr. Oshada Seneweera
- Dr. DSP Tissera
- Dr. KAD Deshani
Course Fees
LKR 10,000/= per module
LKR 54,000/= for all six modules (10% off)
Payment Details
>> Direct bank deposit
Bank: Commercial Bank of Ceylon PLC
Branch: Reid Avenue
Account name: Colombo Science and Technology Cell
Account No: 1116016487
(Payment can be made at any Commercial Bank branch.)
Please write “EDS_2024” in the description section of the bank payment slip or online transfer.
Please upload a scanned copy of the bank slip or the e-receipt in the registration form as proof of payment.
HOW TO APPLY?
Registration form
Essentials of Data Science – Registration form -2024 – Google Forms
Course Details
Module 1: Introduction to Exploratory Data Analysis Using R
Objectives
On successful completion of this module, students should be able to:
- Write basic R code for simple data analysis.
- Define, calculate, and interpret summary statistics and basic graphs to explore data.
- Perform hypothesis tests for a mean, a proportion, two population means, and two proportions, as well as the chi-square test of association and tests of correlation.
Content
- Introduction to programming in R: RStudio interface, syntax, entering data and accessing spreadsheet data, writing functions, installing and using libraries.
- Data types.
- Exploring data with summary statistics using R: Mean, median, variance, concept of bias, outliers, and missing values.
- Exploring data with basic statistical graphs using R: scatter plots, histograms, pie charts, bar charts, multiple bar charts, box plots, etc.
- Introduction to distributions: Binomial, Normal
- Population vs Sample and sampling techniques
- Hypothesis testing
Evaluation
Module 1 will be evaluated through a mini quiz.
Module 2: Data Management
Objectives
On successful completion of this module, students should be able to:
- Prepare a dataset that can be used for analysis using R
- Define and apply basic concepts in database management
- Use SQL and NoSQL to perform simple queries to extract data.
Content
- Data munging with R
- Introduction to database concepts
- Introduction to SQL
- Introduction to NoSQL
Evaluation
Module 2 will be evaluated through a mini quiz.
Module 3: Introduction to Statistical Models
Objectives
On successful completion of this module, students should be able to explain the components of basic statistical models, and to identify and apply suitable models to analyze data.
Content
- Introduction to statistical modeling
- Linear regression model
- Logistic regression model
- Clustering
- Dimension reduction (PCA, FA)
- Time series
Evaluation
Module 3 will be evaluated through a mini quiz.
Module 4: Introduction to Machine Learning
Objectives
On successful completion of this module, students should be able to apply basic machine learning algorithms to analyze data.
Content
- Comparison of statistical models and machine learning algorithms
- Comparison of supervised learning vs unsupervised learning
- Cross-validation methods
- Unsupervised learning – Clustering
- Supervised learning – Classification and value prediction: Random forest
- Supervised learning – Neural networks
- Rule-based analysis: Apriori, market basket analysis
Evaluation
Module 4 will be evaluated through a mini quiz.
Module 5: Special Topics: Data Visualization and Big Data
Objectives
On successful completion of this module, students should be able to apply data visualization concepts to describe data, define big data, and discuss basic concepts in big data analytics.
Content
- Data visualization process, do’s and don’ts, annotations, and plots (bubble plots, tree maps, etc.) using Power BI
- Big data and its applications/case studies
- Cloud computing and the Internet of Things
- Distributed file systems and computing: Hadoop technologies
Evaluation
Module 5 will be evaluated through a mini quiz.
Module 6: Python for Data Science
Objectives
On successful completion of this module, students should be able to use the Python programming language for basic data science applications.
Content
- Introduction to Python and Jupyter Notebook
- Python Basics – Types, Expressions and Variables, String Operations
- Python Programming Fundamentals – Conditions and Branching, Loops, Functions, Objects and Classes
- Python Data Structures – Lists, Tuples, Sets, and Dictionaries
- Data analysis, manipulation, and visualization in Python – NumPy, pandas, Matplotlib, seaborn
- Machine learning and deep learning libraries – scikit-learn, TensorFlow, Keras
Evaluation
Module 6 will be evaluated through a mini quiz.
Participation Certificate Awarding Criteria
A certificate will be issued upon successful completion of the course, which requires at least 75% attendance and a score of at least 50% on the evaluation.
Capstone Project
Participants who successfully complete all six modules are eligible for a capstone project, in which they will be given a dataset to analyze against specific objectives. Participants who meet the minimum requirements (detailed instructions and requirements will be provided) will be awarded a specialization certificate in Essentials of Data Science.