18 April: 2 talks tomorrow: M. Rooth on speech datasets, D. Abusch on

Dublin Computational Linguistics Research Seminar: Index of April 2013 | Dublin Computational Linguistics Research Seminar - Index of year: 2013 | Full index


Mats Rooth and Dorit Abusch will speak tomorrow (Fri, Apr 19) at the LCR
(O'Reilly): Mats at noon, and Dorit at 4pm.


Mats Rooth (noon)
Title: Harvesting Speech Datasets for Linguistic Research on the Web

Spoken language is growing explosively on the web. Though generic spoken
language search is hardly available, some sites index spoken language
using speech recognition, and it is possible to run off-the-shelf speech
recognition in a laboratory setting. This talk will present methodology
for creating single-phrase web datasets, which consist of multiple
utterances of a single short word sequence. A workflow created at the
Cornell Computational Linguistics Lab and the McGill Prosody Lab consists
of web harvest of underlying data, identification of true tokens and
transcription in a web database interface, phoneme alignment using an HMM
aligner, and subsequent acoustic and linguistic analysis. The procedure
results in large, diverse, and naturalistic datasets that make it possible
to re-examine issues such as the acoustic form and contextual conditioning
of prosody.

Tea, coffee and sandwiches will be available after the seminar (13:00) in the
O'Reilly foyer.

School Seminar webpage: https://www.scss.tcd.ie/SchoolSeminar/


Dorit Abusch (4pm)
Title: Anaphoric relations in sequential and conflated pictures

Pictorial representation, and in particular sequential art (i.e. comics),
can function as a counterpoint for natural language semantics and
pragmatics, where information is conveyed in a very different way, but
where nevertheless phenomena recur that are familiar from natural
language. A pure case is provided by sequential art without speech
bubbles, thought bubbles, or captions, what I call "silent comics". This
presentation will focus on issues of co-reference.

A basic operation in understanding a comic is parsing the pictures into
parts denoting individuals, and identifying these individuals across
frames. Starting with a denotational conception of the information
content of an individual picture as being a set of possible situations,
I suggest adding discourse referents and identity predications among
them to produce a DRT-like representation. Discourse referents are
modeled as areas of
pictures. The representation is comparable both to indexing and
discourse representation in linguistic theory and to perceptual indexing
in vision. It comes out in the semantic model that discourse referents for
individuals depicted in different pictures are existentially quantified,
and co-indexing is purely pragmatic. This is compared to the situation in
languages such as Chinese without definiteness marking. It is usually
assumed in theoretical accounts of such languages that a definiteness
feature is present syntactically, or is added in the syntax-semantics
interface. Finally I look at continuous and conflated narrative, where
one area of a picture depicts an object or individual in two or more
temporally separate but spatially overlapping scenes. I argue that this
provides represented coindexing, and look at examples from Indian art.


www.scss.tcd.ie/disciplines/intelligent_systems/clg/clg_web/DCLRS
