ONLINE WORKSHOP SERIES ON ESSENTIALS OF DATA SCIENCE
The first step toward becoming industry-ready for top data science job roles.
Modules
Module | Workshop dates | Application Deadline |
---|---|---|

Course Fees
LKR 10,000/= per module
LKR 54,000/= for all six modules (10% off)
Payment Details
>> Direct bank deposit
Bank: People’s Bank
Branch: Thimbirigasyaya Branch
Account name: University of Colombo
Account No: 314052800001
(Payment can be made at any People’s Bank branch.)
>> Online payment
Click the “Pay Online” tab on the University of Colombo website. Detailed instructions for online payments are found here
>> Email your proof of payment
HOW TO APPLY?
Course Details
Module 1: Introduction to Exploratory Data Analysis using R
Objectives
On successful completion of the course, the student should be able to:
- Write basic code in R for simple data analysis.
- Define, calculate, and interpret summary statistics, and use basic graphs to explore data.
- Perform hypothesis tests: mean, proportion, two population means and proportions, chi-square test of association, and correlation.
Content
- Introduction to programming in R: RStudio interface, syntax, entering data and accessing spreadsheet data, writing functions, installing and using libraries.
- Data types.
- Exploring data with summary statistics using R: Mean, median, variance, concept of bias, outliers, and missing values.
- Exploring data with basic statistical graphs using R: scatter plots, histograms, pie charts, bar charts, multiple bar charts, box plots, etc.
- Introduction to distributions: Binomial, Normal
- Population vs Sample and sampling techniques
- Hypothesis testing
Evaluation
Module 1 will be evaluated based on a mini quiz.
Module 2: Data management
Objectives
On successful completion of the course, the student should be able to:
- Prepare a dataset that can be used for analysis using R.
- Define and apply basic concepts in database management.
- Use SQL and NoSQL to perform simple queries to extract data.
Content
- Data munging with R
- Introduction to database concepts
- Introduction to SQL
- Introduction to NoSQL
Evaluation
Module 2 will be evaluated based on a mini quiz.
Module 3: Introduction to Statistical models
Objectives
On successful completion of the course, the student should be able to explain the components of basic statistical models, and to identify and apply suitable models to analyze data.
Content
- Introduction to statistical modeling
- Linear regression model
- Logistic regression model
- Clustering
- Dimension reduction (PCA, FA)
- Time series
Evaluation
Module 3 will be evaluated based on a mini quiz.
Module 4: Introduction to Machine Learning
Objectives
On successful completion of the course, the student should be able to apply basic machine learning algorithms to analyze data.
Content
- Comparison of statistical models and machine learning algorithms
- Comparison of supervised learning vs unsupervised learning
- Cross-validation methods
- Unsupervised learning – Clustering
- Supervised learning – Classification and value prediction: Random forest
- Supervised learning – Neural networks
- Rule-based analysis: Apriori, market basket analysis
Evaluation
Module 4 will be evaluated based on a mini quiz.
Module 5: Special Topics: Data Visualization and Big Data
Objectives
On successful completion of the course, the student should be able to apply data visualization concepts to describe data, define big data, and discuss basic concepts in big data analytics.
Content
- Data visualization process, do’s and don’ts, annotations, and plots (bubble plots, tree maps, etc.) using Power BI
- Big data and its applications/case studies
- Cloud computing and internet of things
- Distributed file systems and computing: Hadoop technologies
Evaluation
Module 5 will be evaluated based on a mini quiz.
Module 6: Python for Data Science
Objectives
On successful completion of the course, the student should be able to use the Python programming language for basic data science applications.
Content
- Introduction to Python and Jupyter Notebook
- Python Basics – Types, Expressions and Variables, String Operations
- Python Programming Fundamentals – Conditions and Branching, Loops, Functions, Objects and Classes
- Python Data Structures – Lists, Tuples, Sets, and Dictionaries
- Data analysis, manipulation, and visualization in Python – NumPy, pandas, Matplotlib, seaborn
- Machine learning and deep learning libraries – scikit-learn, TensorFlow, Keras
Evaluation
Module 6 will be evaluated based on a mini quiz.
Participation Certificate Awarding Criteria
A certificate will be issued upon successful completion of the course with at least 75% attendance and a score of at least 50% on the evaluation.
Capstone Project
Participants who successfully complete all six modules are eligible for a capstone project, in which they will be given a dataset to analyze in order to meet specific objectives. Participants who meet the minimum requirements (detailed instructions and requirements will be provided) will be awarded a specialization certificate in Essentials of Data Science.