Coordinator: Prof. Mauro Barni

Bridging Human And Machine Vision

 

Prof.
Stefano Melacci
University of Siena - Dipartimento di Ingegneria dell'Informazione e Scienze Matematiche
Dario Zanca
Department AIBE FAU Erlangen-Nuremberg, Germany
Course Type
Type B
Calendar
May 19-23, h 9-13
Room
Aula 103
Program
Brief abstract
The course Bridging Human and Machine Vision explores the convergence of human visual perception and machine vision. The field draws on technical disciplines (computer vision and artificial intelligence) and empirical disciplines (cognitive science, neuroscience, and psychology). The course is designed to give students a thorough understanding of how human visual processing can inform and improve machine vision algorithms, and how machine vision models can in turn inform the study of human vision.

Syllabus

Introduction to the Human Visual System
◦ Overview of the human visual system (HVS): The eye, retina, and visual pathways.
◦ Basic concepts in visual perception
◦ Introduction to Human Visual Attention
  • Visual Attention and its Role: Selective attention and its neural underpinnings.
  • Types of Visual Attention: Bottom-up vs. top-down attention mechanisms.
  • Human attention modeling: Feature Integration Theory.

Computational Models of Human Attention
◦ Saliency prediction
  • Classical Models of Attention: Itti’s model, GBVS.
  • Supervised approaches: Learning human attention from gaze data.
◦ Scanpath prediction
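
As a rough illustration of the bottom-up saliency idea listed above, the following sketch computes a single-channel center-surround contrast map. It is not Itti's model or GBVS: those combine several feature channels and scales, whereas this toy version uses one intensity channel and two box filters. The function names, radii, and toy image are assumptions made for the example.

```python
def box_blur(img, radius):
    """Mean filter with a (2*radius+1) square window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def center_surround_saliency(img, center_radius=1, surround_radius=4):
    """Absolute difference between a fine and a coarse local mean, scaled to [0, 1]."""
    center = box_blur(img, center_radius)
    surround = box_blur(img, surround_radius)
    h, w = len(img), len(img[0])
    sal = [[abs(center[y][x] - surround[y][x]) for x in range(w)]
           for y in range(h)]
    peak = max(max(row) for row in sal)
    if peak > 0:  # avoid dividing by zero on a uniform image
        sal = [[v / peak for v in row] for row in sal]
    return sal


# Hypothetical toy image: dark background with one bright pixel in the middle.
image = [[0.0] * 15 for _ in range(15)]
image[7][7] = 1.0
saliency = center_surround_saliency(image)
```

In this toy case the map peaks around the bright pixel and is zero in the far corners, which is the qualitative behavior a bottom-up model is expected to show for an isolated high-contrast item.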

Introduction to Artificial Vision Systems
◦ Fundamentals of Computer Vision: Image processing, feature extraction, and object recognition.
◦ Convolutional Neural Networks (CNNs): Architecture, training, and applications.
◦ Vision Transformers (ViT): Architecture, training, and applications.

Human-Inspired Vision Models
◦ CNNs vs. visual processing in the human brain
◦ Biologically-Inspired Architectures
  • Human-attention-enhanced models
  • Foveated models
  • V1-like models

Robustness in Vision Models: real-world and worst-case (i.e., adversarial) distribution shifts.
◦ Evaluating Model Robustness: Metrics and benchmarks for assessing robustness in vision systems.
◦ Metamers: The concept of deep learning metamers.

Are Deep Learning Models Good Models of Human Vision?
◦ Behavioral alignment: analysing deep learning systems as decision makers.
◦ Representational alignment: correlating deep learning activations and brain data.
◦ Future Directions: Opportunities for improving models and bridging gaps between human and artificial vision.
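
One common way to correlate deep learning activations with brain data is representational similarity analysis (RSA). The sketch below is a minimal version under stated assumptions: the data are synthetic, the helper names are invented for the example, and Pearson correlation is used throughout for simplicity (rank correlations such as Spearman are also frequently used to compare dissimilarity matrices).

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def rdm(responses):
    """Upper triangle of the representational dissimilarity matrix.

    responses: one row per stimulus, one column per unit or voxel.
    Dissimilarity between two stimuli is 1 - Pearson r of their rows.
    """
    n = len(responses)
    return [1.0 - pearson(responses[i], responses[j])
            for i in range(n) for j in range(i + 1, n)]


def rsa_score(model_acts, brain_data):
    """Correlate the two dissimilarity structures (higher = more aligned)."""
    return pearson(rdm(model_acts), rdm(brain_data))


# Hypothetical toy data: 4 stimuli, 3 model units.
model_acts = [
    [1.0, 0.0, 2.0],
    [0.9, 0.1, 2.1],
    [0.0, 3.0, 1.0],
    [0.2, 2.8, 0.7],
]
# Synthetic "brain" responses: an affine transform of the model activations,
# so the dissimilarity structure is preserved by construction.
brain_data = [[2.0 * v + 1.0 for v in row] for row in model_acts]

score = rsa_score(model_acts, brain_data)
```

Because the synthetic brain data preserve the pairwise structure of the model activations, the score here is close to 1. With real recordings the two matrices would come from different measurement spaces, which is why the comparison is made on dissimilarities rather than on raw responses.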





 


Dip. Ingegneria dell'Informazione e Scienze Matematiche - Via Roma, 56 53100 SIENA - Italy