Marco Tagliasacchi's research

My research activities currently focus on the processing of multimedia data (e.g. audio, image and video signals), backed by the principles of information theory and machine learning. On the one hand, information theory enables a compact, non-redundant representation of data; on the other hand, machine learning algorithms are adopted to extract content-based semantics from multimedia data. Recently, I have also started two further research lines: one in bioinformatics, targeting data mining and knowledge discovery in databases containing functional annotations of genes and gene products, and one in search computing, addressing cost-driven aggregation of rankings from heterogeneous search services.

Image and video processing

The activities in the area of image and video processing are carried out in cooperation with the Image and Sound Processing Group (ISPG) at the Dipartimento di Elettronica e Informazione – Politecnico di Milano, led by Prof. Tubaro.

Authentication and tampering detection [A.9., A.7., C.59., C.54., C.47., C.42.]

With the widespread diffusion of multimedia content, protecting its authenticity and integrity against undesired manipulations has become an increasingly important research theme. The research activities are organized in two areas. On the one hand, we have been investigating the problem of tampering detection and identification for audio-visual data based on hashing techniques. On the other hand, we have been investigating the problem of reconstructing the past history of visual content, exploiting the “footprints” left by acquisition and coding systems.

Video quality assessment [A.10., A.8., C.57., C.55., C.50., C.48., C.46., C.45.]

Video data accounts for a large share of the traffic on the Internet. Therefore, there is a strong demand for automatic mechanisms able to evaluate the playout quality of video sequences at the clients. This is especially important when video streams are transmitted over best-effort networks and the quality of service cannot be guaranteed. We have investigated the problem of video quality assessment both in a no-reference scenario (i.e. the original video sequence is not available) and in a reduced-reference scenario (i.e. a small auxiliary stream accompanies the main video stream). We have developed objective quality metrics that are well correlated with the perceptual quality of impaired video. We have also made available to the research community the data collected during an extensive subjective evaluation campaign.

Video analysis [A.11., C.52.]

In a conventional imaging setting, smart cameras operate by transmitting the video content to a base station, where the video sequence is first decoded and then processed to extract meaningful information (e.g. motion detection, object tracking, etc.). Supported by recent findings in the area of compressive sensing, i.e. that signals can be reconstructed from a limited number of random measurements, we are investigating a new paradigm to perform video analysis without the need to reconstruct the video sequence beforehand. Although this may seem counterintuitive, it is relevant when the imaging system directly acquires random measurements of the scene, an approach that has been shown to reduce the cost of cameras operating at non-optical wavelengths (e.g. infrared, gamma rays, etc.). We have also extended this framework to target privacy-enabled coding of video data.
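
To give a feel for why analysis on random measurements is plausible, the following minimal sketch detects a change in a one-dimensional "scene" directly from its random projections, without reconstructing it. It is only an illustration under stated assumptions: the Gaussian sensing matrix, the signal sizes and the background-difference detector are hypothetical choices, not the method used in the cited publications. The sketch relies on the well-known fact that random projections approximately preserve distances between signals.

```python
import math
import random


def random_matrix(m, n, seed=0):
    """Gaussian sensing matrix with m rows (measurements) and n columns (pixels).

    Entries are scaled by 1/sqrt(m) so that distances are preserved on average.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compute the random measurements y = phi * x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


# Hypothetical sizes: 256 "pixels" observed through only 64 measurements.
n, m = 256, 64
phi = random_matrix(m, n)

background = [0.0] * n
frame_static = list(background)   # nothing changed in the scene
frame_moving = list(background)   # an object covers pixels 100..119
for i in range(100, 120):
    frame_moving[i] = 1.0

# All comparisons happen in the measurement domain.
y_background = measure(phi, background)
d_static = distance(measure(phi, frame_static), y_background)
d_moving = distance(measure(phi, frame_moving), y_background)

# Pixel-domain distance, computed here only to check the approximation.
d_true = distance(frame_moving, background)

print("static frame, measurement-domain distance:", d_static)
print("moving frame, measurement-domain distance:", d_moving)
print("moving frame, pixel-domain distance:      ", d_true)
```

A simple threshold on the measurement-domain distance would flag the second frame as containing motion, even though the scene was never reconstructed.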

Audio processing

The research activities on audio processing are carried out at the premises of the Sound and Music Computing lab (Como Campus – Politecnico di Milano), where I coordinate, together with Prof. Sarti, a research group of five people, including Ph.D. students and research assistants.

Self-calibration of acoustic cameras [C.56., C.53.]

Working with multiple microphone arrays requires knowing the relative position of each array in 3D space. By exploiting concepts from the computer vision literature, we have defined the notion of acoustic camera, and addressed the problem of self-calibrating multiple acoustic cameras while minimizing the amount of data exchanged between cameras. Both far-field and near-field conditions have been addressed.

Search Computing

Search computing is a new multi-disciplinary science which will provide the abstractions, foundations, methods, and tools required to answer multi-domain queries. The activities in the area of search computing are carried out in cooperation with the database group at the Dipartimento di Elettronica e Informazione – Politecnico di Milano, led by Prof. Ceri.

Rank aggregation for search computing [B.4., C.51., C.58.]

When the result of a search query stems from the aggregation of results produced by multiple search services, it is important to find the optimal access plan to fetch data from the individual services. As part of the research on search computing, we have been investigating the problem of joining heterogeneous services to answer complex queries while taking into account service access costs.
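
As a rough illustration of cost-aware rank aggregation, the sketch below applies the classic threshold algorithm for top-k aggregation to two ranked services and tracks the cost of the accesses it makes. This is a standard textbook technique used here as a stand-in, not the access-plan optimization developed in the cited publications. The service contents, the per-access costs and the function name `threshold_top_k` are hypothetical, and for simplicity only sorted accesses are charged.

```python
def threshold_top_k(services, costs, k):
    """Top-k aggregation by summed score, stopping as early as possible.

    services: list of dicts mapping item -> score (one dict per service).
    costs:    hypothetical cost of one sorted access to each service.
    Returns (top_k_items_with_scores, total_cost_of_sorted_accesses).
    """
    # Each service returns its items in decreasing order of score.
    ranked = [sorted(s.items(), key=lambda kv: kv[1], reverse=True)
              for s in services]
    totals = {}   # aggregate score of every item seen so far
    spent = 0.0   # accumulated cost of sorted accesses
    top = []
    for depth in range(max(len(r) for r in ranked)):
        threshold = 0.0  # best aggregate score any unseen item could reach
        for r, cost in zip(ranked, costs):
            if depth >= len(r):
                continue
            item, score = r[depth]
            spent += cost
            threshold += score
            if item not in totals:
                # Look the item up in every service (not charged in this sketch).
                totals[item] = sum(s.get(item, 0.0) for s in services)
        top = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]
        # Stop once the k-th best seen item beats anything still unseen.
        if len(top) == k and top[-1][1] >= threshold:
            break
    return top, spent


# Two hypothetical services scoring the same four items.
service_a = {"x": 0.9, "y": 0.8, "z": 0.3, "w": 0.1}
service_b = {"y": 0.9, "x": 0.6, "w": 0.5, "z": 0.2}

best, spent = threshold_top_k([service_a, service_b], costs=[1.0, 3.0], k=1)
full_scan_cost = 4 * (1.0 + 3.0)
print(best, spent, full_scan_cost)
```

In this toy case the best item is found after reading only two entries from each service, so the cost spent is lower than that of scanning both rankings completely.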

Bioinformatics

The activities in the area of bioinformatics are carried out in cooperation with the database group at the Dipartimento di Elettronica e Informazione – Politecnico di Milano, led by Prof. Ceri.

Data analysis algorithms in gene annotation databases [B.3., C.49., C.60.]

Gene annotation databases are widely used as public repositories of biological knowledge. Genes and gene products are annotated with terms taken from unstructured controlled vocabularies or semantically structured ontologies (e.g. the Gene Ontology). We are developing a system to integrate the available data sources that provide functional annotations of genes and gene products. In this context, we have developed novel algorithms for automatically predicting new annotations based on the functional similarity between Gene Ontology terms.

 

Past Research Activities

Video Processing

Distributed video coding [A.5., A.3., A.1., C.44., C.34., C.33., C.28., C.26., C.24., C.23., C.22., C.21., C.17., C.16., C.15., C.14., C.13., C.12., C.10., C.9., C.8., C.7., C.6.]

Distributed video coding is a recent coding paradigm that enables a flexible distribution of the computational complexity between encoder and decoder, by moving part of the motion estimation task to the decoder. The research has focused on several aspects related to distributed video coding: improving the coding efficiency of state-of-the-art coding architectures; removing some issues that prevented such coding architectures from being applied in practical scenarios; and studying the rate-distortion performance of distributed video coding and comparing it with that of conventional motion-compensated predictive codecs. The research activities have also addressed how to exploit distributed video coding to enhance robustness to packet losses.

Non-normative tools for video coding [A.6., A.4., B.2., B.1., C.38., C.36., C.35., C.32., C.18., C.11., C.2.]

In order to ensure interoperability, video coding standards define only the syntax of the bitstream and how to perform decoding. Several components are not specified by the standards, including motion estimation, rate allocation, rate control, error concealment, etc. The research has focused on non-normative tools for the state-of-the-art H.264/AVC video coding standard, with particular emphasis on error resilience and rate control.

Scalable video coding [A.2., C.5., C.4., C.3.]

When video content is distributed over heterogeneous networks and devices, it is desirable to adapt the bitstream to the characteristics of the receiving device. Scalable video coding enables bitstream adaptation without the need for transcoding, i.e. partial decoding followed by re-encoding. The bitstream corresponding to the desired frame rate, spatial resolution and quality can be readily extracted from the original bitstream. The research has focused on wavelet-based scalable video coding techniques, loosely extending the ideas of JPEG2000 to video signals, and it has led to several contributions to MPEG, which was involved in the standardization of a scalable video codec.

Audio Processing

Acoustic source localization and tracking [C.41., C.40., C.39., C.30., C.29., C.27., C.25., C.20., C.19.]

The information about the type of acoustic event can be augmented with the location of the source, obtained through space-time processing of signals collected with microphone arrays. We have been working on the problem of acoustic source localization and tracking, especially when more than one source is active at the same time.
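
As a minimal sketch of the kind of space-time processing involved, the example below estimates the time difference of arrival of a single source at two microphones by cross-correlation and converts it into a far-field direction of arrival. It is an illustration only: the sample rate, microphone spacing, noise-like source signal and the function name `cross_correlation_lag` are assumptions, and the single-source, two-microphone setting is far simpler than the multi-source scenarios addressed in the cited publications.

```python
import math
import random


def cross_correlation_lag(a, b, max_lag):
    """Return the lag (in samples) at which signal b best aligns with signal a."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, v in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += v * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical setup: a noise-like source reaching the second microphone
# 7 samples after the first one.
rng = random.Random(1)
source = [rng.gauss(0.0, 1.0) for _ in range(400)]
true_delay = 7
mic1 = source
mic2 = [0.0] * true_delay + source[:-true_delay]

estimated_delay = cross_correlation_lag(mic1, mic2, max_lag=20)

# Far-field model: delay = spacing * sin(angle) / speed_of_sound.
sample_rate = 48000.0    # Hz (assumed)
mic_spacing = 0.1        # metres (assumed)
speed_of_sound = 343.0   # m/s
tau = estimated_delay / sample_rate
angle_deg = math.degrees(math.asin(speed_of_sound * tau / mic_spacing))

print("estimated delay (samples):", estimated_delay)
print("direction of arrival (degrees):", angle_deg)
```

With more than two microphones, several such pairwise delays can be combined to estimate the source position rather than just its direction.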

Audio classification [C.31., C.27.]

The goal of this research line is to detect the onset of anomalous events (e.g. gunshots, screams, etc.) in audio streams collected by environmental microphones. The research is proceeding towards modelling the temporal evolution of acoustic features extracted from the audio streams, in order to detect acts of aggression in public spaces for security applications.