Master machine learning techniques for big data analysis with hands-on experience in KNIME and Spark.
This comprehensive course provides an overview of machine learning techniques to explore, analyze, and leverage data. You'll learn tools and algorithms to create machine learning models that learn from data, and to scale those models to big data problems. Throughout the course, you'll master the complete machine learning process, from data exploration and preparation to building and evaluating models. The hands-on approach allows you to apply practical techniques using open-source tools like KNIME and Spark. You'll explore machine learning algorithms, including classification methods such as k-Nearest Neighbors, Decision Trees, and Naïve Bayes, along with regression, clustering, and association analysis. By the end of the course, you'll be able to design data-leveraging approaches, prepare data for modeling, identify appropriate machine learning techniques for different problems, construct models using open-source tools, and analyze big data problems using scalable algorithms on Spark.
4.6 (2,485 ratings)
75,120 already enrolled
English
پښتو, বাংলা, اردو, 4 more
What you'll learn
Design approaches to leverage data using the machine learning process
Apply techniques to explore and prepare data for modeling
Identify appropriate machine learning techniques for different problems
Construct models that learn from data using open source tools
Analyze big data problems using scalable algorithms on Spark
Evaluate machine learning models using appropriate metrics
This course includes:
4.1 hours of pre-recorded video
11 assignments
Access on Mobile, Tablet, Desktop
Batch access
Shareable certificate
Closed captions
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.
There are 7 modules in this course
This course provides a comprehensive introduction to machine learning with big data, focusing on both theoretical concepts and practical applications. Students learn the complete machine learning process from data exploration and preparation to model building and evaluation. The curriculum covers various machine learning techniques including classification algorithms (k-Nearest Neighbors, Decision Trees, Naïve Bayes), regression, cluster analysis, and association analysis. Hands-on components are emphasized throughout the course, with practical implementations using KNIME for visual analytics and Apache Spark for scalable machine learning. Students gain experience working with real-world datasets, addressing common data quality issues, and evaluating model performance through appropriate metrics.
Module 1 · Welcome · 34 minutes to complete
Module 2 · Introduction to Machine Learning with Big Data · 3 hours to complete
Module 3 · Data Exploration · 2 hours to complete
Module 4 · Data Preparation · 2 hours to complete
Module 5 · Classification · 3 hours to complete
Module 6 · Evaluation of Machine Learning Models · 3 hours to complete
Module 7 · Regression, Cluster Analysis, and Association Analysis · 3 hours to complete
Instructors
Distinguished Data Science Leader and Scientific Workflow Pioneer
Dr. Ilkay Altintas serves as Chief Data Science Officer at the San Diego Supercomputer Center (SDSC) at UC San Diego, where she has established herself as a leading innovator in scientific workflows and data science since 2001. After earning her Ph.D. from the University of Amsterdam focusing on workflow-driven collaborative science, she founded the Workflows for Data Science Center of Excellence and has led numerous cross-disciplinary projects funded by NSF, DOE, NIH, and the Moore Foundation. Her contributions include co-initiating the open-source Kepler Scientific Workflow System and developing the Biomedical Big Data Training Collaborative platform. Her research impact spans scientific workflows, provenance, distributed computing, and software modeling, earning her the SDSC Pi Person of the Year award in 2014 and the IEEE TCSC Award for Excellence in Scalable Computing for Early Career Researchers in 2015. As Division Director for Cyberinfrastructure Research, Education, and Development, she oversees numerous computational data science initiatives while serving as a founding faculty fellow at the Halıcıoğlu Data Science Institute and maintaining active research collaborations across multiple scientific domains.
Pioneer in Applied Machine Learning and Data Analytics
Mai H. Nguyen serves as the Lead for Data Analytics at the San Diego Supercomputer Center (SDSC) and Associate Director for AI of the WIFIRE Lab at UC San Diego. Her expertise spans multiple domains, combining advanced machine learning techniques with distributed computing to tackle large-scale data challenges. After earning both her M.S. and Ph.D. in Computer Science from UCSD with a focus on machine learning, she has built a career bridging academic research with practical applications. Her research portfolio includes diverse applications such as remote sensing, medical image analysis, biomedical text analytics, wildfire mitigation, spacecraft autonomy, and speech recognition.