
ONLINE WORKSHOP SERIES ON ESSENTIALS OF DATA SCIENCE

The first step to becoming industry-ready for top data science job roles.

Modules

Module | Workshop dates | Application deadline
1. Introduction to Exploratory Data Analysis Using R | 10th, 12th, 17th and 24th Jan 2023 | 3rd Jan 2023
2. Data Management | 31st Jan, 2nd, 7th and 9th Feb 2023 | 30th Jan 2023
3. Introduction to Statistical Models | 14th, 16th, 21st and 23rd Feb 2023 | 13th Feb 2023
4. Introduction to Machine Learning | 28th Feb, 2nd, 28th and 30th Mar 2023 | 27th Feb 2023
5. Data Visualization and Big Data | 20th, 25th, 27th April and 2nd May 2023 | 19th April 2023
6. Python for Data Science | 4th, 9th, 11th and 16th May 2023 | 3rd May 2023
Capstone Project | 18th May 2023 | 17th May 2023

Course Fees

LKR 10,000/= per module

LKR 54,000/= for all six modules (10% off)
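
The fee structure above can be checked with a short calculation. The sketch below is illustrative only: the helper name `course_fee` is hypothetical, and it assumes the 10% discount applies only when all six modules are registered together, with any smaller number of modules charged at the per-module rate.

```python
MODULE_FEE_LKR = 10_000        # per-module fee stated above
TOTAL_MODULES = 6              # number of modules in the series
BUNDLE_DISCOUNT_PERCENT = 10   # discount for registering for all six modules


def course_fee(modules_registered: int) -> int:
    """Return the fee in LKR for a given number of modules (hypothetical helper)."""
    if not 1 <= modules_registered <= TOTAL_MODULES:
        raise ValueError("modules_registered must be between 1 and 6")
    full_price = MODULE_FEE_LKR * modules_registered
    if modules_registered == TOTAL_MODULES:
        # Assumption: the discount applies only to the full six-module bundle.
        # Integer arithmetic avoids floating-point rounding.
        return full_price * (100 - BUNDLE_DISCOUNT_PERCENT) // 100
    return full_price


print(course_fee(1))  # fee for a single module
print(course_fee(6))  # discounted fee for all six modules
```

Six modules at the full rate would cost LKR 60,000, so the 10% discount gives the LKR 54,000 quoted above.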

Payment Details

>> Direct Bank Deposit

Bank: People’s Bank
Branch: Thimbirigasyaya Branch
Account name: University of Colombo
Account No: 314052800001

(Payment can be made at any People’s Bank branch.)

>> Online Payment

Click the “Pay Online” tab on the University of Colombo website. Detailed instructions for online payments are found here

>> Email Your Proof of Payment

Please email your proof of payment to cds@stat.cmb.ac.lk, mentioning the module you are registering for.

HOW TO APPLY?

Registration form: here

For more information, please contact 

Course Details

Module 1: Introduction to Exploratory Data Analysis Using R

Objectives

On successful completion of the course, students should be able to:

  • Write basic code in R for simple data analysis.
  • Define, calculate and interpret summary statistics and basic graphs to explore data.
  • Perform hypothesis tests: for a mean, a proportion, two population means and proportions, the chi-square test of association, and correlation.

Content

  • Introduction to programming in R: RStudio interface, syntax, entering data and accessing spreadsheet data, writing functions, installing and using libraries.
  • Data types.
  • Exploring data with summary statistics using R: Mean, median, variance, concept of bias, outliers, and missing values.
  • Exploring data with basic statistical graphs using R: Scatter plot, histogram, pie chart, bar charts, multiple bar charts, box plots, etc.
  • Introduction to distributions: Binomial, Normal
  • Population vs Sample and sampling techniques
  • Hypothesis testing 

Evaluation

Module 1 will be evaluated through a mini quiz.

Module 2: Data Management

Objectives

On successful completion of the course, students should be able to:

  • Prepare a dataset that can be used for analysis using R.
  • Define and apply basic concepts in database management.
  • Use SQL and NoSQL to perform simple queries to extract data.

Content

  • Data munging with R
  • Introduction to database concepts
  • Introduction to SQL
  • Introduction to NoSQL

Evaluation

Module 2 will be evaluated through a mini quiz.

Module 3: Introduction to Statistical Models

Objectives

On successful completion of the course, students should be able to explain the components of basic statistical models and to identify and apply suitable models to analyze data.

Content

  • Introduction to statistical modeling
  • Linear regression model
  • Logistic regression model
  • Clustering
  • Dimension reduction (PCA, FA)
  • Time series

Evaluation

Module 3 will be evaluated through a mini quiz.

Module 4: Introduction to Machine Learning

Objectives

On successful completion of the course, students should be able to apply basic machine learning algorithms to analyze data.

Content

  • Comparison of statistical models and machine learning algorithms
  • Comparison of supervised learning vs unsupervised learning
  • Cross-validation methods
  • Unsupervised learning – Clustering
  • Supervised learning – Classification and value prediction: Random forest
  • Supervised learning – Neural networks
  • Rule based analysis: Apriori, market basket analysis

Evaluation

Module 4 will be evaluated through a mini quiz.

Module 5: Special Topics: Data Visualization and Big Data

Objectives

On successful completion of the course, students should be able to apply data visualization concepts to describe data, define big data, and discuss basic concepts in big data analytics.

Content

  • Data visualization process, dos and don’ts, annotations, and plots (bubble plots, tree maps, etc.) using Power BI
  • Big data and its applications/case studies
  • Cloud computing and internet of things
  • Distributed file systems and computing: Hadoop technologies

Evaluation

Module 5 will be evaluated through a mini quiz.

Module 6: Python for Data Science

Objectives

On successful completion of the course, students should be able to use the Python programming language for basic data science applications.

Content

  • Introduction to Python and Jupyter Notebook
  • Python Basics – Types, Expressions and Variables, String Operations
  • Python Programming Fundamentals – Conditions and Branching, Loops, Functions, Objects and Classes
  • Python Data Structures – Lists, Tuples, Sets, and Dictionaries
  • Data analysis, manipulation, and visualization in Python – NumPy, pandas, Matplotlib, seaborn
  • Machine learning and deep learning libraries – scikit-learn, TensorFlow, Keras

Evaluation

Module 6 will be evaluated through a mini quiz.

Participation Certificate Awarding Criteria

A certificate will be issued upon successful completion of the course, which requires at least 75% attendance and a score of at least 50% on the evaluation.
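
The two criteria can be expressed as a simple check. This is a minimal sketch, not the organizers' actual procedure: the function name `earns_certificate` is hypothetical, attendance is assumed to be counted per workshop session, and the score is assumed to be the mini quiz result expressed as a percentage. Whether the criteria are applied per module or across the whole series is not stated here.

```python
MIN_ATTENDANCE_PERCENT = 75  # minimum attendance stated in the criteria
MIN_SCORE_PERCENT = 50       # minimum evaluation score stated in the criteria


def earns_certificate(sessions_attended: int, sessions_held: int,
                      score_percent: float) -> bool:
    """Check both criteria (hypothetical helper; inputs are assumptions)."""
    if sessions_held <= 0:
        raise ValueError("sessions_held must be positive")
    if not 0 <= sessions_attended <= sessions_held:
        raise ValueError("sessions_attended must be between 0 and sessions_held")
    # Compare attended/held against 75/100 without floating-point division.
    attendance_ok = sessions_attended * 100 >= MIN_ATTENDANCE_PERCENT * sessions_held
    score_ok = score_percent >= MIN_SCORE_PERCENT
    return attendance_ok and score_ok


# Each module in the schedule lists four workshop dates, so under these
# assumptions attending three of the four sessions just meets the threshold.
print(earns_certificate(3, 4, 50))
print(earns_certificate(2, 4, 90))
```

Both thresholds are inclusive, matching the "≥" in the criteria.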

Capstone Project

Participants who successfully complete all six modules are eligible for a capstone project, in which they will be given a dataset to analyze against specific objectives. Participants who meet the minimum requirements (detailed instructions and requirements will be provided) will be awarded a specialization certificate in Essentials of Data Science.