Apply your big data knowledge through a hands-on project covering data integration, mining, Spark programming, and analysis to showcase your expertise.
This capstone course offers students the opportunity to apply their comprehensive big data knowledge in a practical setting. Students can choose from various projects focusing on data integration, data mining, Spark programming, or data analysis. The course requires submitting a detailed report and code for review, allowing participants to create a portfolio-worthy project that demonstrates their job readiness in the big data field. This hands-on experience bridges theoretical knowledge with practical application.
English
What you'll learn
Apply big data technologies to solve real-world problems
Develop a comprehensive project showcasing data analysis skills
Create professional-grade technical documentation
Implement data integration and mining techniques
Demonstrate proficiency in Spark programming
Build a portfolio-worthy project for career advancement
This course includes:
Pre-recorded video
Graded assignments and exams
Access on Mobile, Tablet, Desktop
Limited access
Shareable certificate
Closed captions
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.
Module Description
The capstone project course provides students with a practical opportunity to apply their big data technology knowledge. Students select and complete a medium-scale project from various options in data integration, mining, Spark programming, or analysis. The course emphasizes independent work and requires students to submit both code and a comprehensive report for evaluation. This project-based approach allows students to demonstrate their practical skills and readiness for professional big data roles.
Instructors
A Distinguished Scholar in Mathematical Imaging and Data Sciences
Jianfeng Cai is a Professor in the Department of Mathematics at The Hong Kong University of Science and Technology (HKUST), where he is a leading expert in the mathematical foundations of imaging and data sciences. After earning his Bachelor's degree from Fudan University and his PhD from the Chinese University of Hong Kong in 2007, he worked at the National University of Singapore, UCLA, and the University of Iowa before joining HKUST in 2015. His research focuses on the theoretical and algorithmic foundations of problems related to information, data, and signals, with particular emphasis on efficient representation, sensing, and analysis of high-dimensional data. A single paper of his on matrix completion algorithms has garnered over 6,500 citations, and his research has found applications in medical imaging, compressed sensing, signal processing, and machine learning. Named a highly cited researcher in mathematics by Clarivate Analytics in 2017 and 2018, he has contributed pioneering work in image restoration, matrix completion, and blind motion deblurring. Beyond his research, he supervises numerous PhD students and maintains collaborations across multiple disciplines and institutions. His office is in Room 3438 at HKUST.
A Distinguished Scholar in Database Systems and Big Data Computing
Ke Yi is a Professor in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology, where he also directs the MSc Program in Big Data Technology. He received his BS from Tsinghua University in 2001 and his PhD from Duke University in 2006, and has since established himself as a leading expert in database theory, parallel computing, and data stream algorithms. His awards include two SIGMOD Best Paper Awards (2016, 2022), a PODS Test-of-Time Award (2022), a SIGMOD Best Demonstration Award (2015), and a Google Faculty Research Award (2010). An ACM Distinguished Member, he has made significant contributions to database systems and algorithms, particularly in data security, privacy, and distributed computing. His teaching has been recognized with multiple Best Teaching Awards for his course on Big Data Computing. Beyond his academic work, Yi maintains active research collaborations with industry partners including Alibaba, Huawei, Microsoft, and Google, serves as an associate editor for journals in the field, and regularly chairs major conferences. His research spans theoretical computer science and practical database systems, with particular emphasis on designing algorithms that offer both theoretical guarantees and practical effectiveness.