Building with Embodied Vision Models: Hands-on Science
EmbVision Course Page
Dr. Alexander (Sasha) Apartsin
Image processing and computer vision address tasks ranging from low-level enhancement, filtering, and feature extraction to high-level object recognition and scene understanding. Modern systems combine traditional techniques with deep learning models trained on large image datasets to achieve robust performance.
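To make the distinction concrete, the sketch below (an illustrative example, not taken from the course materials) applies a low-level Gaussian filter and then extracts classic ORB features with OpenCV; the image path is a placeholder.

import cv2

# Load a sample image in grayscale ("input.jpg" is a placeholder path).
image = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)

# Low-level enhancement: smooth the image with a small Gaussian kernel.
smoothed = cv2.GaussianBlur(image, (5, 5), sigmaX=1.0)

# Classic feature extraction: detect ORB keypoints and compute descriptors.
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(smoothed, None)
print(f"Detected {len(keypoints)} keypoints")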
Embodied AI refers to artificial intelligence systems that both perceive and act within an environment, combining visual understanding with physical or simulated embodiment. Such models integrate perception, reasoning, and motor control to enable agents to learn from and interact meaningfully with their surroundings.
The course takes a code-first approach, pairing every concept with extensive hands-on examples built on modern libraries, including PyTorch, torchvision, OpenCV, and timm. It equips students with a practical toolbox of ideas and tools for rapidly building vision-based models and applications.
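As a taste of that code-first style, here is a minimal, hedged sketch that loads a pretrained backbone with timm and classifies a single image using the matching preprocessing pipeline; the model name and image path are illustrative choices, not course requirements.

import timm
import torch
from PIL import Image

# Load a pretrained classifier ("resnet18" is just an illustrative choice).
model = timm.create_model("resnet18", pretrained=True)
model.eval()

# Build the preprocessing pipeline that matches the pretrained weights.
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=False)

image = Image.open("example.jpg").convert("RGB")  # placeholder image path
batch = transform(image).unsqueeze(0)             # shape: (1, 3, H, W)

with torch.no_grad():
    logits = model(batch)
print("Predicted class index:", logits.argmax(dim=1).item())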
The course syllabus is designed to enable students to begin their projects while learning the material. As the course continues, they will enrich their projects with the concepts they acquire. Each team will give several in-class presentations for discussion and feedback.
As standard tasks are increasingly handled by AI and mature libraries, expectations of professional developers shift toward innovation and rapid integration. Accordingly, a key requirement for student projects is to tackle new use cases by generating unique data and training or fine-tuning task-specific vision models.
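For illustration, a typical starting point for such a project might look like the hedged sketch below: take a pretrained backbone, replace its classification head, and fine-tune it on a small custom dataset. The directory layout, class count, and hyperparameters are assumptions made for the example, not project requirements.

import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets

# Pretrained backbone with a fresh head sized for 5 custom classes (assumed).
model = timm.create_model("resnet18", pretrained=True, num_classes=5)

# Preprocessing and augmentation matching the pretrained weights.
config = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**config, is_training=True)

# Expects a layout like my_dataset/train/<class_name>/*.jpg (placeholder path).
dataset = datasets.ImageFolder("my_dataset/train", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a few epochs, only to illustrate the training loop
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")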
The list below presents the complete set of subjects; individual course instances may vary depending on the course format, students’ backgrounds, and class dynamics.