Deep Learning for Speech and Language
2nd Winter School at Universitat Politècnica de Catalunya (2018)
Language and speech technologies are rapidly evolving thanks to the current advances in artificial intelligence. The convergence of large-scale datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Applications such as machine translation or speech recognition can be tackled from a neural perspective with novel architectures that combine convolutional and/or recurrent models with attention. This winter school reviews the state of the art in deep learning for speech and language and introduces the programming skills and techniques required to train these systems.
Instructors
Teaching Assistants
Guest Speakers
Lectures
Lectures & Slides (40%) - Room: D5-010
The course is divided into modules of half an hour covering the following topics:
- 24/01 14:00 D1L1 (XG) Welcome
- 24/01 14:30 D1L2 (XG) Deep Learning - PDF
- 24/01 15:00 D1L3 (XG) CNN vs RNN vs Attention - PDF
- 24/01 15:30 D1L4 (AB) Word Embeddings - PDF
- 25/01 14:00 D2L1 (MRC) Language Model - PDF
- 25/01 15:00 D2L2 (MRC) Neural Machine Translation (NMT) - PDF Video
- 25/01 16:00 D2L3 (MRC) Seq2seq Natural Language Processing (NLP) - PDF Video
- 25/01 16:30 D2L4 (XG) Language and Vision - PDF Video
- 26/01 14:00 D3L1 (JA) Automatic Speech Recognition (ASR) - PDF
- 26/01 15:30 D3L2 (JH) Speaker Identification - PDF
- 29/01 14:00 D4L1 (AB) Text to Speech - PDF
- 29/01 15:00 D4L2 (SP) Speech to Speech - PDF
- 29/01 16:30 D4L3 (XG) Audio and Vision - PDF Video
- 30/01 14:15 D5L1 (Guest) Carlos Segura (Telefónica Research) - event
- 30/01 15:00 D5L2 (Guest) Jordi Pons (UPF) - event - PDF
- 30/01 16:00 D5P Project presentations from students
Project (60%) - Room: D5-010
Students worked in teams to develop a machine learning research project, which was presented orally on the final day of the course. The project reports and slides are available at the following links:
- Carlos Arenas, Itziar Sagastiberri and Mireia Gartzia, “Toxic Comment Classification Challenge” [Slides]
- Pau Batlle, Miguel Cidrás, Esteve Tarragó, “Neural Approaches to Text Normalization” [Slides]
- Johan Bender and Miquel Llobert, “Speech Commands Challenge” [Slides]
- Juan Carlos Morales, Guillem París, Somayeh Jafari Dinani, “Deep Learning for Speech Commands” [Slides]
Practical
Practical details
- Study Program: Master MET at ETSETB TelecomBCN, Universitat Politècnica de Catalunya.
- Course code and official guide: 230362 - DLSL
- ECTS credits: 2.5 ECTS
- Teaching language: English
- Semester: Autumn 2017
- Class Schedule: 24, 25, 26, 29 & 30 January 2018 (2pm-5pm Lectures, 5pm-6pm Lab)
- Capacity: 20 MSc students
- Location: Campus Nord UPC, Module D5, Room 010
Registration
This Winter School requires prior knowledge of basic deep learning techniques, which will not be covered. Please follow the instructions that match your profile:
If you have no previous experience with deep learning:
Sign up for the BSc Winter School on Introduction to Deep Learning, which is taught almost simultaneously in a morning schedule. Once you have signed up for it, you can follow the instructions below that apply to your profile.
If you have taken a previous edition of DLAI, DLCV or DLSL, or have previous experience with deep learning:
- Master students at ETSETB: Registration is available from the ETSETB academic office. There is an extraordinary registration period between 11 and 14 January 2018.
- Master students at FIB: Contact the FIB academic office before 14 January 2018. They will collect all applications and submit them to ETSETB for approval.
- Bachelor students at ETSETB or FIB: You may audit the course, with no official certification. If interested, fill in this form before 14 June 2018.
- Industry members and any other profile: You must apply for admission to the course and cover 100% of the cost of the ECTS credits, without the support of public funds. This corresponds to 143,08 € per ECTS credit (Summer 2016). If you are interested in this option, please contact the ETSETB TelecomBCN academic office.
Previous
Previous editions
- Deep Learning for Computer Vision. UPC TelecomBCN 2016, 2017.
- Deep Learning for Speech and Language. UPC TelecomBCN 2017, 2018.
- Deep Learning for Artificial Intelligence. UPC TelecomBCN 2017.
- Deep Learning for Multimedia. Insight Dublin City University 2017.
- Amaia Salvador and Santiago Pascual. “Hands on Keras and TensorFlow”. Persontyle 2017.














