Data-intensive systems are designed to store, process, and analyze large volumes of data that cannot be handled by a single computer, typically using distributed architectures.
Big data algorithms process massive datasets that cannot fit into the memory or storage of a single computer. They rely on distributed computing models such as MapReduce and its successors, which decompose operations such as sorting, aggregation, joins, and matrix multiplication into parallel subtasks executed across multiple machines.
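As a minimal sketch of this pattern, the classic word-count aggregation in PySpark maps each record to key-value pairs and reduces by key in parallel across the cluster; the input and output paths here are illustrative:

```python
from pyspark.sql import SparkSession

# Sketch of a MapReduce-style aggregation in PySpark.
# The HDFS paths are illustrative; any large text dataset works.
spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()
sc = spark.sparkContext

counts = (
    sc.textFile("hdfs:///data/corpus/*.txt")   # input is split across workers
      .flatMap(lambda line: line.split())      # map: emit one token per word
      .map(lambda word: (word, 1))             # map: key-value pairs (word, 1)
      .reduceByKey(lambda a, b: a + b)         # reduce: sum counts per key in parallel
)
counts.saveAsTextFile("hdfs:///out/wordcounts")  # each partition writes its own shard
```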
The course takes a code-first approach, pairing each algorithmic concept with hands-on implementations using modern big data libraries and platforms, including PySpark, Spark SQL, and stream-processing frameworks. It equips students with a practical toolbox for designing and implementing scalable algorithms for problems such as large-scale sorting, aggregation, and matrix multiplication on datasets too large for any single machine's memory.
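For example, distributed matrix multiplication, one of the problems named above, can be expressed with Spark's `BlockMatrix`, which partitions each operand into blocks and multiplies them across the cluster. The toy 4x4 matrices below stand in for operands that would not fit on one machine:

```python
from pyspark.sql import SparkSession
from pyspark.mllib.linalg import Matrices
from pyspark.mllib.linalg.distributed import BlockMatrix

spark = SparkSession.builder.appName("blockmatmul-sketch").getOrCreate()
sc = spark.sparkContext

# A 4x4 matrix stored as a 2x2 grid of dense 2x2 blocks (values column-major).
blocks = sc.parallelize([
    ((0, 0), Matrices.dense(2, 2, [1, 2, 3, 4])),
    ((0, 1), Matrices.dense(2, 2, [5, 6, 7, 8])),
    ((1, 0), Matrices.dense(2, 2, [9, 10, 11, 12])),
    ((1, 1), Matrices.dense(2, 2, [13, 14, 15, 16])),
])
A = BlockMatrix(blocks, rowsPerBlock=2, colsPerBlock=2)
B = A.transpose()  # reuse A's blocks as a second operand

# Block-wise multiply: each output block is a sum of per-block
# products, computed in parallel across the cluster.
C = A.multiply(B)
print(C.toLocalMatrix())  # collect locally only for inspecting toy data
```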
The course syllabus is designed to enable students to begin their projects while learning the material. As the course continues, they will enrich their projects with the concepts they acquire. Each team will give several in-class presentations for discussion and feedback.
As everyday tasks are increasingly automated by AI and mature libraries, professional developers are expected to innovate and integrate solutions quickly. Reflecting this shift, course projects emphasize exploring new use cases by thoughtfully combining multiple learned components in original ways.
The list below presents the complete set of subjects; individual course instances may vary depending on the course format, students’ backgrounds, and class dynamics.