19 January: DCLRS -- Jan 21 -- Begona Villada



+---------------------------------------------------------+
|    Dublin Computational Linguistics Research Seminar    |
|                     DCLRS 2004/2005                     |
|                       DCU TCD UCD                       |
+---------------------------------------------------------+


speaker: Begona Villada, Groningen

venue: Davis Lecture Theatre (Arts Building Room 2043)
Trinity College, University of Dublin
time: 4:00-6:00, Friday, January 21, 2005.

title:

Computational issues of multi-word expressions: automatic
identification and modifiability

abstract:

Multi-word expressions exhibit idiosyncrasies at various linguistic
levels. Common idiosyncrasies include lexical restrictions, defective
morphology, rigid syntax and non-compositional semantics. To handle
their idiosyncratic behavior, multi-word expressions are often treated
as phrasal lexical items fully specified in the lexicon. However, some
of these expressions allow certain productive morphological processes
and they show limited flexibility in syntax. For this reason, the
degree of lexicalization varies across expressions. Prior to the
formalization of multi-word expressions in a lexicalist grammar, two
problems deserve attention: (1) the identification of those
expressions that qualify as multi-word expressions and (2) the
establishment of their potential for variation and modification.

In this talk, I first characterize multi-word expressions and describe
data-driven methods to extract them from linguistically annotated
corpora. Hybrid models are presented and evaluated on the task of
acquiring Dutch support verb constructions. Second, I introduce a
semi-automatic corpus-based
method to establish the variation and modification potential of
required arguments within multi-word expressions. On the basis of the
extracted evidence, a grammar writer decides on the lexical
representation of a multi-word expression, especially the necessary
lexical and morpho-syntactic constraints. The talk ends with some
conclusions and directions for future research.
