Sapienza - University of Rome

DIET - Department of Information Engineering, Electronics and Telecommunications


Course Info: 

Tecniche Audiovisive (Audiovisual Techniques)

Prof. Aurelio Uncini   (info: aurelio _._ uncini _AT_ uniroma1 _._ it)

Master's degree (Laurea Magistrale): Communications Engineering, Computer Engineering, Electronics Engineering


The course can be continued with a thesis (also in companies, or in collaboration with research centers at foreign universities).
For information, contact the teacher or his collaborators:
Dr. Simone Scardapane, Dr. Michele Scarpiniti, Dr. Danilo Comminiello.


Procedures for the final examination:
The examination consists of the development of a project (homework project) on a specific topic related to the program.
The project concludes with a final presentation, which counts as the oral test.




Course beginning:

Sept. 29, 2020
Duration: 13 weeks.


Lesson timetable


17:00 - 19:00



12:00 - 14:00




Objectives of the Course

The student acquires basic and specialized knowledge of the discipline. In particular, the student will be able to define and implement, with regard to the processing aspects, complex systems for capturing, manipulating and generating audio signals in different operating environments.

Main topics

Part 1:  Modern Audio-Video Scenarios

  1. Immersive audio-video communication
  2. Virtual, augmented and mixed reality
  3. Modern media


Part 2: Machine Learning for Audio, Image and Video Analysis

  1. Machine Learning
  2. Bayesian Theory of Decision
  3. Clustering Methods
  4. Supervised Neural Networks
  5. RNN, CNN, GAN
  6. ……


Part 3: Multimedia Content Analysis

  1. Content-Based Video Description
  2. Content-Based Audio Description
  3. Gesture Recognition
  4. Action Recognition


Part 4: Sentiment and Emotion Analysis

  1. From Video
  2. From Audio
  3. From Text (e.g. Twitter, WhatsApp, ...)





  • F. Camastra, A. Vinciarelli, “Machine Learning for Audio, Image and Video Analysis”, Springer, 2015.
  • IEEE Signal Processing Magazine, “Special Issue on Semantic Retrieval of Multimedia”, March 2006.
  • Aurelio Uncini, “Audio Digitale”, McGraw-Hill, ISBN: 88 386 6317-3, 2006.
  • Z. Wu, T. Yao, Y. Fu, Y.-G. Jiang, “Deep Learning for Video Classification and Captioning”, arXiv:1609.06782v2 [cs.CV], 22 Feb 2018.
  • Aurelio Uncini, “Introduction to Neural Networks and Deep Learning”, Ed. 2020 (free PDF available for the students).