University of Potsdam, MSc CogSys, Summer 2019

PM 1: Computational Semantics with Pictures

Instructor Prof. Dr. David Schlangen
Office Hours Tuesday, 5-6pm
Class Hours Thursday, 10-12
Class Room  
Class Website

Course Description

In this class we will look at the area of “language and vision” research, taking a (computational-)linguistic, and in particular, semantic perspective. Treating images as models that do (or do not) make sentences true (or generally provide referents for expressions), we will look at what we can learn from language/vision datasets and models about natural language semantics.
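To make the "images as models" idea concrete, here is a minimal sketch. It assumes an image has already been annotated as a list of objects with a category and a colour. The `Obj` class, the example annotation, and the helper functions `referents` and `is_true` are hypothetical illustrations, not part of any course material or of the sempix framework.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Obj:
    """One annotated object in an image (hypothetical annotation format)."""
    category: str
    color: str


# Hypothetical annotation of a single image: the "model" in the semantic sense.
image = [Obj("ball", "red"), Obj("cube", "blue"), Obj("ball", "green")]


def referents(model: List[Obj], category: str, color: Optional[str] = None) -> List[Obj]:
    """Return all objects in the model that satisfy the description."""
    return [
        o for o in model
        if o.category == category and (color is None or o.color == color)
    ]


def is_true(model: List[Obj], category: str, color: Optional[str] = None) -> bool:
    """Evaluate 'there is a <color> <category>' against the model."""
    return len(referents(model, category, color)) > 0


# "There is a red ball" holds in this image; "there is a red cube" does not.
red_ball = is_true(image, "ball", "red")
red_cube = is_true(image, "cube", "red")
# "the ball" has two candidate referents, so it is ambiguous in this image.
balls = referents(image, "ball")
```

In this toy setting, truth reduces to a non-empty set of referents; real images only provide such annotations through datasets or learned models, which is where the tasks discussed in class come in.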

In a first introductory block, we will look at some of the available datasets and do some linguistic analyses on them. We will look at some standard (and some not-so-standard) L&V tasks (such as captioning, referring expression generation / interpretation, visual question answering, visual dialogue, agreement games) and models for them.

In the second, more practically oriented part, students will decide on projects to do (in teams of up to three). Class hours will then turn into more of a clinic, where ideas for projects and problems with the practical implementation are discussed.

In a final block, the teams will present their (preliminary) results, which they will then proceed to write up as the final part of their “Portfolioleistung”.

This class assumes that you are able to read, understand, and implement current NLP papers. Hence it is advisable to have completed BM1, and ideally also an introduction to machine learning / deep learning.

Course Objectives

After completing this course, you will

  • have a good idea of what the current state of the art in “Language & Vision” research is
  • have encountered some current NLP models in this area in depth
  • have learned how to define a (reasonably sized) project
  • have learned how to work in a team towards joint goals
  • have learned how to present the results of your work

Background Readings

The papers that will be discussed are, or will be, listed on the course website. As a reference when working through the papers, here is some potentially useful material:

  • Yoav Goldberg, Neural Network Methods for Natural Language Processing, Morgan & Claypool, 2017 (Available in the library; I will try to get them to buy the e-book version.)
  • The Stanford “CS224n: Natural Language Processing with Deep Learning” class materials
  • The Stanford “CS231n: Convolutional Neural Networks for Visual Recognition” class materials

The sempix framework for working with language & vision data can be found in a separate repository.

Course Policy

This seminar belongs to module PM-1, which is worth a whopping 12 credit points. The contact hours (15 meetings * 2 hours) will be the smallest part of it; you have 330 hours of “Selbstlernzeit” (self-study time). Depending on how you want to spread that out over the semester, that can be up to 20 hours per week. Plan your schedules accordingly!
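The workload figures above can be checked with a short calculation. This is only a sketch: the value of 30 hours per credit point is an assumption (a common convention, not stated in this syllabus), and the number of weeks over which you spread the work is up to you.

```python
CREDIT_POINTS = 12
HOURS_PER_CREDIT_POINT = 30  # assumption: not stated in the syllabus
MEETINGS = 15
HOURS_PER_MEETING = 2

total_hours = CREDIT_POINTS * HOURS_PER_CREDIT_POINT  # overall workload
contact_hours = MEETINGS * HOURS_PER_MEETING          # time spent in class
self_study_hours = total_hours - contact_hours        # "Selbstlernzeit"


def weekly_self_study(weeks: float) -> float:
    """Average self-study hours per week when spread over `weeks` weeks."""
    return self_study_hours / weeks
```

For example, `weekly_self_study(22)` gives 15.0 hours per week if the work is spread over a hypothetical 22 weeks; compressing it into fewer weeks raises the weekly load accordingly.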

Grading Policy

The exam is a “Portfolioprüfung”, combining the grade for the class presentations (20 mins) and the grade for the project report (of around 20 pages, per group member) with equal weights. (The report should contain a statement on how work was distributed between group members, signed by all.)

The class presentations are split into one presentation of an existing model in the first part of the class and the presentation of the group topic and work in progress at the end.

Attendance Policy

This course relies on the active participation of the students taking it.

E-mail Policy

I will try to be responsive in answering emails. But first try Piazza. (Possibly anonymously, if you are worried that your question might sound silly.)

If it is something administrative, email me.