This comprehensive course is your guide to using Python to analyze big data, create clear visualizations, and apply machine learning algorithms. It is designed both for beginners with basic programming experience and for experienced developers looking to move into data science and big data analysis.
Introduction to statistical concepts
Descriptive Statistics
Inferential statistics
The role and purpose of research design
Types of research designs
The research process
Which method to choose?
Exercise: Identify a project of your choice and develop a research design
Types of surveys
The survey process
Survey design
Methods of survey sampling
Determining the Sample Size
Planning a survey
Conducting the survey
After the survey
Exercise: Planning for a survey based on the research design selected
Course Intro
Setup
Installation Setup and Overview
IDEs and Course Resources
IPython/Jupyter Notebook Overview
Intro to NumPy
Creating arrays
Using arrays and scalars
Indexing Arrays
Array Transposition
Universal Array Functions
Array Processing
Array Input and Output
Data Frames
Index objects
Reindex
Drop Entry
Selecting Entries
Data Alignment
Rank and Sort
Summary Statistics
Missing Data
Index Hierarchy
Reading and Writing Text Files
JSON with Python
HTML with Python
Microsoft Excel files with Python
Merge and Merge on Index
Concatenating and Combining Data Frames
Reshaping, Pivoting and Duplicates in Data Frames
Mapping, Replace, Rename Index, Binning, Outliers and Permutation
Group by on Data Frames
Group by on Dict and Series
Splitting Applying and Combining
Cross Tabulation
Welcome to the Big Data Section!
Big Data Overview
Spark Overview
Local Spark Set-Up
AWS Account Set-Up
Quick Note on AWS Security
EC2 Instance Set-Up
SSH with Mac or Linux
PySpark Setup
Lambda Expressions Review
Introduction to Spark and Python
RDD Transformations and Actions
Installing Seaborn
Histograms
Kernel Density Estimate Plots
Combining Plot Styles
Box and Violin Plots
Regression Plots
Heat maps and Clustered Matrices
Linear Regression
Support Vector Machines
Decision Trees and Random Forests
Natural Language Processing
Discrete Uniform Distribution
Continuous Uniform Distribution
Binomial Distribution
Poisson Distribution
Normal Distribution
Sampling Techniques
T-Distribution
Hypothesis Testing and Confidence Intervals
Chi Square Test and Distribution
Writing a report from survey data
Communication and dissemination strategy
Context of Decision Making
Improving data use in decision making
Culture Change and Change Management
Preparing a report for the survey, a communication and dissemination plan and a demand and use strategy.
Presentations and joint action planning
Research Design
Python for Data Science and Machine Learning
Spark for Big Data Analysis
Implement Machine Learning Algorithms
NumPy for Numerical Data
Pandas for Data Analysis
Matplotlib for Python Plotting
Seaborn for statistical plots
Interactive dynamic visualizations
scikit-learn for Machine Learning Tasks
K-Means Clustering, Logistic Regression and Linear Regression
Random Forest and Decision Trees
Natural Language Processing and Spam Filters
Neural Networks
Support Vector Machines
Research report writing