This course is part of Large-Scale Database Systems.
This course explores advanced concepts in distributed database systems, focusing on three interconnected areas: reliability mechanisms, cloud computing frameworks, and machine learning applications. Students will develop a deep understanding of transaction management principles, including ACID properties, concurrency control methods, and deadlock detection and prevention techniques. The reliability component covers recovery protocols such as ARIES and the commit protocols that keep data consistent across system failures. The course then moves to cloud computing with an in-depth look at the Hadoop ecosystem, the MapReduce programming model, and the Accumulo database architecture, providing practical skills for large-scale data processing. Finally, students learn to implement machine learning techniques for data analysis, including collaborative filtering, clustering, and classification algorithms, using Mahout and Accumulo. This integrated approach enables learners to design and maintain robust distributed database systems that draw on cloud infrastructure and machine learning capabilities.
Language: English
What you'll learn
Implement transaction management with ACID properties in distributed systems
Apply concurrency control methods including two-phase locking and timestamp ordering
Implement deadlock detection and prevention techniques
Utilize recovery protocols like ARIES to maintain data consistency during failures
Develop efficient commit protocols for distributed transactions
Understand data warehousing principles and Accumulo architecture
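To give a flavor of the deadlock detection topic listed above, here is a minimal sketch in Python, assuming the common textbook formulation in which transactions form a wait-for graph and a deadlock corresponds to a cycle in that graph. The function name `find_deadlock` and the transaction labels are hypothetical and chosen for illustration; this is not taken from the course materials.

```python
from typing import Dict, List, Optional, Set

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored


def find_deadlock(wait_for: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle in the wait-for graph, or None if there is no deadlock.

    wait_for[t] is the set of transactions that t is waiting on
    (hypothetical representation, assumed for this sketch).
    """
    # Register every transaction, including ones that only appear as blockers.
    color: Dict[str, int] = {}
    for txn, blockers in wait_for.items():
        color.setdefault(txn, WHITE)
        for blocker in blockers:
            color.setdefault(blocker, WHITE)

    path: List[str] = []  # transactions on the current DFS path, in order

    def visit(txn: str) -> Optional[List[str]]:
        color[txn] = GRAY
        path.append(txn)
        for nxt in sorted(wait_for.get(txn, ())):
            if color[nxt] == GRAY:
                # Back edge: nxt is already on the path, so we found a cycle.
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle is not None:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(color):
        if color[txn] == WHITE:
            cycle = visit(txn)
            if cycle is not None:
                return cycle
    return None


# T1 waits on T2, T2 waits on T3, T3 waits on T1: a three-way deadlock.
print(find_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))
```

A detector of this kind would typically pick one transaction from the returned cycle as a victim and abort it; the victim selection policy is left out of this sketch.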
This course includes:
2.2 hours of prerecorded video
8 assignments
Access on mobile, tablet, and desktop
Full-time access
Shareable certificate
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.
There are 4 modules in this course
This advanced course integrates three critical aspects of modern distributed database systems: transaction management and reliability, cloud computing platforms, and machine learning applications. Through four modules, students first explore the fundamentals of transaction management in distributed environments, focusing on ACID properties, concurrency control algorithms such as two-phase locking and timestamp ordering, and deadlock management techniques. The second module covers essential reliability protocols, including the ARIES recovery algorithm and the commit protocols that keep data consistent across system failures. This module also introduces data warehousing concepts and the Accumulo architecture with its cell-level security mechanisms. The third module explores cloud computing with the Hadoop ecosystem, emphasizing MapReduce programming for large-scale data processing. The final module integrates machine learning applications, teaching students to implement collaborative filtering, clustering, and classification algorithms using tools such as Mahout within distributed environments. Throughout the course, theoretical concepts are reinforced through practical assignments and self-reflective readings.
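As a rough illustration of the MapReduce programming model mentioned above, the following Python sketch runs the three conceptual phases (map, shuffle, reduce) of a word count in a single process. It assumes nothing about the Hadoop API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real Hadoop job would distribute these phases across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (single process, for illustration)."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "big systems"]))
```

The point of the split is that each `map_phase` call depends only on its own document and each `reduce_phase` call only on its own key, which is what lets a framework run them in parallel on different machines.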
Module 1 · Course Introduction · 10 minutes to complete
Module 2 · Transaction Management & Concurrency Control · 6 hours to complete
Module 3 · Reliability Protocols, Data Warehousing, and Accumulo Architecture · 7 hours to complete
Module 4 · Cloud Computing, Hadoop Ecosystem, and Machine Learning Applications · 6 hours to complete
Fee Structure
This course cannot be purchased individually. To enroll and earn a certificate, you need to purchase the complete Professional Certificate Course, Large-Scale Database Systems. See that program's page for enrollment and the detailed fee structure.
Instructor
Expert in Distributed Database Systems and Large-Scale Computing
David Silberberg is a distinguished instructor at Johns Hopkins University, specializing in large-scale database systems and distributed computing. He holds a Ph.D. in Computer Science from the University of Maryland and both Master's and Bachelor's degrees in Computer Science from the Massachusetts Institute of Technology. Dr. Silberberg serves as a Principal Professional Staff member at the Johns Hopkins University Applied Physics Laboratory (APL) and is the Research Director of the Johns Hopkins Institute for Assured Autonomy. Dr. Silberberg's expertise spans various areas of computer science, including AI and machine learning algorithms, graph analytics, distributed and large-scale architectures, intelligent access to distributed and heterogeneous database systems, and semantic graph query languages. He is the instructor for the "Large-Scale Database Systems Specialization" on Coursera, which covers advanced topics in distributed database systems, cloud computing, data reliability, and machine learning for large-scale data solutions.