ESSLLI 2008
Freie und Hansestadt Hamburg
August 4-15, 2008

 

Abbreviations

LaCo: Language & Computation
LaLo: Language & Logic
LoCo: Logic & Computation
F: foundational
I: introductory
A: advanced
W: workshop

For more information about the lecture halls and seminar rooms, see our lecture room page. The names listed under "Technical Assistance" are student volunteers who will act as contact persons for technical questions from lecturers and workshop speakers during the course or workshop.

Statistical machine translation

This course will provide a thorough introduction to statistical machine translation. The focus of the course will be on using software tools and corpora to create translation models. There will be short lectures introducing the topic, and tutorials which walk students through the process of creating a translation system for a language pair.

OUTLINE

* Overview of data-driven machine translation
  - Introduction to parallel corpora as a resource for MT (with pointers to publicly available corpora)
  - Rough intro to example-based MT
  - General idea of statistical MT, how p(f|e) is calculated from word alignments
* Producing word alignments from parallel corpora
  - Use of the Giza++ software package for creating alignments
  - How to evaluate alignment quality using AER
  - Looking into increasing performance from larger data sets
* Extracting phrasal translations from word alignments, and decoding
  - Introduction to software packages for enumerating phrases
  - Introduction to decoding (which is the search for the most probable translation)
* Overview of evaluation metrics for machine translation quality
  - Guidelines for human evaluation
  - Automatic evaluation metrics including Bleu
  - If time: Learning curves, Reranking n-best lists
* Intro to syntax-based translation models
  - synchronous grammars
  - hierarchical phrase models
  - decoding for tree-based models
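As a rough illustration of two ideas named in the outline, the sketch below estimates p(f|e) by relative frequency from word-aligned sentence pairs and computes the alignment error rate (AER) of a hypothesised alignment against sure and possible gold links. It is not taken from the course materials and is not how Giza++ works internally; the function names and the toy German-English data are assumptions made purely for illustration.

```python
from collections import Counter


def lexical_translation_probs(aligned_corpus):
    """Relative-frequency estimate of p(f|e) from word-aligned sentence pairs.

    aligned_corpus: iterable of (f_words, e_words, links), where links is a
    set of (f_index, e_index) pairs. Hypothetical helper, for illustration.
    """
    pair_counts = Counter()
    e_counts = Counter()
    for f_words, e_words, links in aligned_corpus:
        for i, j in links:
            pair_counts[(f_words[i], e_words[j])] += 1
            e_counts[e_words[j]] += 1
    # p(f|e) = count(f aligned to e) / count(e appears in any link)
    return {(f, e): c / e_counts[e] for (f, e), c in pair_counts.items()}


def alignment_error_rate(hypothesis, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), where S is a subset of P."""
    possible = possible | sure  # enforce that sure links are also possible
    denominator = len(hypothesis) + len(sure)
    if denominator == 0:
        return 0.0  # nothing to align and nothing proposed
    matched = len(hypothesis & sure) + len(hypothesis & possible)
    return 1.0 - matched / denominator


# Toy word-aligned corpus (invented data): foreign side, English side, links.
corpus = [
    (["das", "haus"], ["the", "house"], {(0, 0), (1, 1)}),
    (["das", "buch"], ["the", "book"], {(0, 0), (1, 1)}),
    (["die", "frau"], ["the", "woman"], {(0, 0), (1, 1)}),
]
probs = lexical_translation_probs(corpus)

# One hypothesised alignment scored against sure and possible gold links.
aer = alignment_error_rate(
    hypothesis={(0, 0), (1, 1), (2, 1)},
    sure={(0, 0), (1, 1)},
    possible={(2, 2)},
)
```

In this toy corpus "the" is linked to "das" twice and to "die" once, so the estimate for p(das|the) is 2/3. A real system would typically learn the alignments themselves with an iterative procedure before any such counting, which this sketch takes as given.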

Contact e-mail: esslli2008@science.uva.nl