Signal-denoising-in-the-wild

MSc Data Science thesis repository

Audio denoising in the wild

A Master's degree thesis project

☍   Overview

For years, technological innovation and the large-scale adoption of new tools have opened up opportunities for research and development in Artificial Intelligence. Voice assistants, and the tasks associated with them, are one of the many applications of these technologies. By their nature, they are often used in hostile, highly noisy environments such as urban settings. This problem motivates investigating an end-to-end approach in which a Deep Learning model performs the denoising task and feeds directly into a second model that performs speaker classification. This thesis assesses that potential through a structured, multi-phase process designed to produce precise measurements that can be compared with previous and subsequent work.

☍   Data

The data folder contains all the scp (script) files, which list the paths to the audio files on my machines.
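
Because the scp files point to locations on the author's machines, anyone reusing them will probably need to rewrite the paths first. The following is a minimal sketch of how that could be done, assuming the files follow the usual Kaldi-style layout of one `<utterance-id> <path>` entry per line. That layout is not confirmed by this repository, and the helper names `read_scp` and `remap_root` are hypothetical.

```python
from pathlib import Path


def read_scp(scp_path):
    """Parse an scp file into a dict {utterance_id: audio_path}.

    Assumes (not confirmed by the repository) one entry per line:
    an utterance id, whitespace, then the path to the audio file.
    """
    entries = {}
    for line in Path(scp_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split only once so that paths containing spaces stay intact.
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"Malformed scp line: {line!r}")
        utt_id, audio_path = parts
        entries[utt_id] = audio_path
    return entries


def remap_root(entries, old_root, new_root):
    """Return a copy of entries with old_root replaced by new_root.

    Paths that do not start with old_root are left unchanged.
    """
    remapped = {}
    for utt_id, audio_path in entries.items():
        if audio_path.startswith(old_root):
            audio_path = new_root + audio_path[len(old_root):]
        remapped[utt_id] = audio_path
    return remapped
```

A call such as `remap_root(read_scp("data/train.scp"), "/old/root", "/my/data")` would then give paths valid on the local machine. The file name and both roots here are placeholders, not paths taken from the repository.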

☍   Code

The code is split across the pyscripts, SpeakerRecognition and plots folders. These contain, respectively, the code for the initial training of the WaveNet models, the code for the fine-tuning phase, and the code that creates the visualizations. The file kalditorch.yml lists all the packages needed to reproduce the system. The log folder contains all the checkpoints created during model training.

☍   Report

The text of the thesis in pdf format is available here.

☍   Presentation

The presentation slides of the thesis work are available here.

☍   About me

⊜   Fabrizio D’Intinosante