Information Management and Statistical Data Analysis using Python

What Will I Learn?

This comprehensive course guides you through using Python to analyze big data, create visualizations, and apply machine learning algorithms. It is designed for both beginners with basic programming experience and experienced developers looking to move into data science and big data analysis.

Fee In Different Currencies
RWF 70,000 Or USD 0 Or EURO 0
Enroll Now

Indicative Content

    • Basic statistical terms and concepts

      • Introduction to statistical concepts

      • Descriptive Statistics

      • Inferential statistics

    • Research Design

      • The role and purpose of research design

      • Types of research designs

      • The research process

      • Which method to choose?

      • Exercise: Identifying a project of choice and developing a research design

    • Survey Planning, Implementation and Completion

      • Types of surveys

      • The survey process

      • Survey design

      • Methods of survey sampling

      • Determining the sample size

      • Planning a survey

      • Conducting the survey

      • After the survey

      • Exercise: Planning for a survey based on the research design selected

    • Introduction to Python

      • Course Intro

      • Setup

      • Installation Setup and Overview

      • IDEs and Course Resources

      • iPython/Jupyter Notebook Overview

    • Learning NumPy

      • Intro to NumPy

      • Creating arrays

      • Using arrays and scalars

      • Indexing Arrays

      • Array Transposition

      • Universal Array Functions

      • Array Processing

      • Array Input and Output

    • Intro to Pandas

      • Data Frames

      • Index objects

      • Reindex

      • Drop Entry

      • Selecting Entries

      • Data Alignment

      • Rank and Sort

      • Summary Statistics

      • Missing Data

      • Index Hierarchy

    • Working with Data

      • Reading and Writing Text Files

      • JSON with Python

      • HTML with Python

      • Microsoft Excel files with Python

      • Merge and Merge on Index

      • Concatenate and Combining Data Frames

      • Reshaping, Pivoting and Duplicates in Data Frames

      • Mapping, Replace, Rename Index, Binning, Outliers and Permutation

      • Group by on Data Frames

      • Group by on Dict and Series

      • Splitting Applying and Combining

      • Cross Tabulation

    • Big Data and Spark with Python

      • Welcome to the Big Data Section!

      • Big Data Overview

      • Spark Overview

      • Local Spark Set-Up

      • AWS Account Set-Up

      • Quick Note on AWS Security

      • EC2 Instance Set-Up

      • SSH with Mac or Linux

      • PySpark Setup

      • Lambda Expressions Review

      • Introduction to Spark and Python

      • RDD Transformations and Actions

    • Data Visualization

      • Installing Seaborn

      • Histograms

      • Kernel Density Estimate Plots

      • Combining Plot Styles

      • Box and Violin Plots

      • Regression Plots

      • Heat maps and Clustered Matrices

    • Data Analysis

      • Linear Regression

      • Support Vector Machines

      • Decision Trees and Random Forests

      • Natural Language Processing

      • Discrete Uniform Distribution

      • Continuous Uniform Distribution

      • Binomial Distribution

      • Poisson Distribution

      • Normal Distribution

      • Sampling Techniques

      • T-Distribution

      • Hypothesis Testing and Confidence Intervals

      • Chi Square Test and Distribution

    • Report writing for surveys, data dissemination, demand and use

      • Writing a report from survey data

      • Communication and dissemination strategy

      • Context of Decision Making

      • Improving data use in decision making

      • Culture Change and Change Management

      • Preparing a survey report, a communication and dissemination plan, and a demand and use strategy

      • Presentations and joint action planning

Objectives

  • Research Design

  • Python for Data Science and Machine Learning

  • Spark for Big Data Analysis

  • Implement Machine Learning Algorithms

  • NumPy for Numerical Data

  • Pandas for Data Analysis

  • Matplotlib for Python Plotting

  • Seaborn for statistical plots

  • Interactive dynamic visualizations

  • SciKit-Learn for Machine Learning Tasks

  • K-Means Clustering, Logistic Regression and Linear Regression

  • Random Forest and Decision Trees

  • Natural Language Processing and Spam Filters

  • Neural Networks

  • Support Vector Machines

  • Research report writing

Course Features

  • Lectures 0
  • Duration 90 Days
  • Certificate Yes
