17 February: DCLRS -- Dietmar Janetzko, "Quantitative Data Provided by Search Engines?" (16:00, Friday, February 20)
Seminar Announcement:
Dublin Computational Linguistics Research Seminar
DCLRS 2008/2009
DCU DIT TCD UCD
Venue: Jonathan Swift Lecture Theatre (Arts Building 2041a)
Trinity College Dublin
Time: 16:00, Friday, February 20, 2009
Title:
Quantitative Data Provided by Search Engines?
Studies on the Data Quality of Search Engine Count Estimates
Speaker:
Dr. Dietmar Janetzko
National College of Ireland
Dublin
Abstract:
Count estimates ("hits") provided by Web search engines have received
much attention as a yardstick to measure a variety of phenomena of
interest as diverse as, e.g., language statistics, popularity of
authors, or similarity between words. Common to these activities is
the intention to use Web search engines not only for search but for ad
hoc measurement. Using search engine count estimates (SECEs) in this
way means that a phenomenon of interest, e.g., the popularity of an
author, is conceived of as a measurand, and SECEs are taken to be its
quantitative measures. However, the data quality of SECEs has not yet
been studied systematically, and concerns have been raised about the
use of this kind of data. In this talk, studies are presented that
analysed the data quality of SECEs in terms of classical goodness
criteria (objectivity, reliability, and validity). The
results obtained indicate that, with the exception of some types of
Boolean queries (disjunction, negation), objectivity as well as
test-retest reliability and parallel-test reliability of SECEs are good
for most types of browsers and search engines. The findings are
discussed in the light of previous objections. Perspectives for new
measurement approaches that use the WWW as a resource for data
(Internet Resonance Diagnostics) are delineated.
Winter Schedule:
January 16 John Tait (Sunderland)
January 23 Gerhard Jaeger (Bielefeld)
January 30 Pat Healey (Queen Mary)
February 6 Sebastian Moeller
February 13 Alfredo Maldonado Guerra (Microsoft Dublin)
February 20 Dietmar Janetzko (NCI)
February 27 Steve Pulman (Oxford)
March 6 Tim Fernando (TCD)
March 13 Andreas Vlachos (Cambridge)
Spring Schedule:
April 3 Tomaz Erjavec (Institute Josef Stefan)
April 10 Public Holiday
April 17 Josef Van Genabith (DCU)
April 24 Frank Keller (Edinburgh)
May 1 held
May 8 Brian Murphy (Trento)
May 15 Kees van Deemter (Aberdeen)
May 22 Elisabeth Andre (Augsburg)