Data Processing and Manipulation

Master Python data processing techniques including missing value handling, outlier detection, scaling, and data warehouse concepts.

This course cannot be purchased separately. To access graded assignments and earn a certificate, you need to enroll in the full Data Wrangling with Python Specialization. You can audit this course for free, which gives you access to the course materials and lectures so you can learn at your own pace without any financial commitment.

Instructors:

Di Wu

English

This course includes

26 Hours

of self-paced video lessons

Intermediate Level

Completion Certificate

awarded on course completion

Free course


What you'll learn

Master techniques for handling missing values and outlier detection

Learn data reduction methods through sampling and dimension reduction

Apply scaling and discretization techniques for data preprocessing

Understand data warehouse concepts and multidimensional analysis

Create and manipulate pivot tables and data cubes
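As a rough illustration of the last two items, the sketch below builds a pivot table and a small data cube with Pandas. The `sales` table, its column names, and its values are invented for this example and are not taken from the course materials; the roll-up and slice steps are one common way to express cube operations with a grouped Series, not necessarily the approach the course teaches.

```python
import pandas as pd

# Hypothetical sales records; every name and number here is made up for illustration.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "East", "West"],
    "product": ["pen", "ink", "pen", "ink", "pen", "pen"],
    "quarter": ["Q1", "Q1", "Q1", "Q2", "Q2", "Q2"],
    "revenue": [100, 50, 80, 40, 120, 90],
})

# Pivot table: regions as rows, quarters as columns, total revenue in each cell.
pivot = sales.pivot_table(
    index="region", columns="quarter", values="revenue",
    aggfunc="sum", fill_value=0,
)

# A small "cube": total revenue for each (region, product, quarter) combination.
cube = sales.groupby(["region", "product", "quarter"])["revenue"].sum()

# Roll-up: remove the product dimension by summing over it.
rollup = cube.groupby(level=["region", "quarter"]).sum()

# Slice: fix one dimension (quarter == "Q1") and keep the other two.
q1_slice = cube.xs("Q1", level="quarter")

print(pivot)
print(rollup)
print(q1_slice)
```

The roll-up holds the same totals as the pivot table, just in long form, which shows that a pivot table is one two-dimensional view of the cube.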

Skills you'll gain

Data Preprocessing

Missing Value Analysis

Outlier Detection

Data Scaling

Data Warehousing

Pivot Tables

Data Cube Operations

Pandas

Dimension Reduction

Data Discretization

This course includes:

1.5 hours of prerecorded video

5 quizzes, 1 assignment

Access on Mobile, Tablet, Desktop

Full-time access

Shareable certificate

Get a Completion Certificate

Share your certificate with prospective employers and your professional network on LinkedIn.

Created by

University of Colorado Boulder

Provided by

Coursera

Top companies offer this course to their employees

Top companies provide this course to build their employees' skills so that they can handle complex projects and contribute to organizational success.

There are 4 modules in this course

This comprehensive course focuses on essential data processing and manipulation techniques using Python. Students learn to handle missing values, detect outliers, perform data reduction through sampling and dimensionality reduction, apply scaling and discretization methods, and work with data warehouse concepts. The curriculum covers practical applications using Pandas for data transformation, multidimensional analysis using data cubes, and creating pivot tables for complex data exploration.
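To give a feel for the preprocessing steps named above, here is a minimal Pandas sketch. It assumes a toy table with invented column names and values, and it picks one simple option for each step (median imputation, the 1.5 × IQR rule, min-max scaling, equal-width binning). These are common choices, not necessarily the ones the course uses.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are invented for illustration.
df = pd.DataFrame({
    "city": ["A", "A", "B", "B", "C", "C"],
    "temp": [10.0, 12.0, None, 11.0, 13.0, 100.0],
})

# Missing values: fill gaps with the column median (one of several possible strategies).
df["temp"] = df["temp"].fillna(df["temp"].median())

# Outlier detection: flag values more than 1.5 * IQR beyond the quartiles.
q1, q3 = df["temp"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = (df["temp"] < q1 - 1.5 * iqr) | (df["temp"] > q3 + 1.5 * iqr)

# Scaling: min-max scale the non-outlier values to the range [0, 1].
clean = df.loc[~df["is_outlier"]].copy()
lo, hi = clean["temp"].min(), clean["temp"].max()
clean["temp_scaled"] = (clean["temp"] - lo) / (hi - lo)

# Discretization: split the scaled values into three equal-width bins.
clean["temp_bin"] = pd.cut(
    clean["temp_scaled"], bins=3, labels=["low", "mid", "high"]
)

print(df)
print(clean)
```

In this toy data only the reading of 100.0 is flagged as an outlier. It is dropped before scaling because min-max scaling is sensitive to extreme values.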

Missing Values and Outliers

Module 1 · 7 Hours to complete

Data Reduction

Module 2 · 6 Hours to complete

Scaling and Discretization

Module 3 · 6 Hours to complete

Data Warehouse

Module 4 · 7 Hours to complete


Instructor

Di Wu

4.4 rating

93 Reviews

41,403 Students

18 Courses

Teaching Assistant Professor

Dr. Di Wu is a Teaching Assistant Professor at the University of Colorado Boulder, specializing in data science and computer science. His primary research interests include temporal databases, the semantic web, knowledge representation, and data science, with a focus on extending the Resource Description Framework (RDF) for temporal dimensions. Before joining CU Boulder, he taught various courses such as algorithms and data structures, programming languages, and database management. Dr. Wu aims to develop an inclusive and engaging pedagogy in data science education over the next five years, emphasizing experiential learning in both in-person and online formats. He is involved in teaching courses related to data science and programming, including specializations in Python programming for data scientists.
