8 February: fyi -- Arabic news analysis, Italy




The European Commission's Joint Research Centre (JRC) in Ispra, Northern
Italy, is looking for two native Arabic speakers with basic IT skills to
adapt its public news aggregation and analysis web portals to Arabic.
Applicants should be available for a minimum of three months, preferably longer.



One position is an internship; the other is a contract position with an
external IT service provider. Both people would be based at the offices of
the JRC.



Location: Ispra, on Lago Maggiore in Italy, 60 km west of Milan;

Host: European Commission - Joint Research Centre (JRC)

Starting date: April 2008 or later;

Duration: 3 to 12 months;

Position 1: traineeship / internship / stage / Praktikum / tirocinio;

Remuneration 1: 963 Euro per month + travel allowance;

Position 2: contractor;

Remuneration 2: ca. 100 Euro per working day, after taxes;

Working language: English;

Activity: Web Technology, Language Technology; many other subject
areas

URL: http://langtech.jrc.it/,
http://emm.jrc.it/overview.html,
http://www.jrc.it/;

Deadline: To be filled as soon as possible.

Contact: Erik.Van-der-Goot@jrc.it



The JRC has developed and runs several public news aggregation and
analysis web portals (see http://emm.jrc.it/overview.html) and provides a
number of services to a wide range of international customers. Arabic is one
of the 35 languages currently covered, but there is no Arabic user interface
yet, and the tools need further tuning for the language. Tasks include:



- Translate interface menus;

- Translate, optimise and test Boolean search expressions for text
classification;

- Identify more Arabic language news sources;

- Help write the XSLT conversion programs that extract the news
texts from the raw web pages;

- Provide linguistic resources for information extraction programs
(persons, organisations, locations, quotations, relations, events)
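
As a rough illustration of the second task, the sketch below shows how a
Boolean search expression might assign a category to a news text. The rule
format, the function names and the Arabic terms are assumptions made for
this example only; the syntax actually used by the JRC portals is not
described in this posting.

```python
import re

# Hypothetical category rule: every "all" term and at least one "any" term
# must occur in the text, and no "none" term may occur.
FLOOD_RULE = {
    "all": ["فيضان"],           # "flood"
    "any": ["أمطار", "سيول"],    # "rains", "torrents"
    "none": ["رياضة"],          # "sport"
}

def tokens(text):
    """Split a text into a set of word tokens (\\w covers Arabic letters)."""
    return set(re.findall(r"\w+", text))

def matches(rule, text):
    """Return True if the text satisfies the Boolean rule."""
    words = tokens(text)
    return (all(term in words for term in rule["all"])
            and any(term in words for term in rule["any"])
            and not any(term in words for term in rule["none"]))
```

Exact token matching like this would miss forms carrying the Arabic definite
article or other attached prefixes, which is one reason such expressions need
to be optimised and tested by a native speaker.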



Applicants must have the following qualifications:



- Required: Arabic native speaker competence;

- Required: good knowledge of written and spoken English;

- Required: sensitivity to language and knowledge of regional
differences;

- Required: Basic IT skills, XML;

- Beneficial: further IT skills, web technology, HTML, XSLT, Java,
Perl, Oracle, etc.;

- Beneficial: knowledge of further natural languages;



The JRC's news aggregation and analysis applications add value to the world
of written media:



- Unbiased reporting by aggregating news from multiple sources in
many countries;

- Transparency: users see the viewpoints of the others, even across
languages;

- Live information: updated every ten minutes;

- Multilingual: between 19 and 35 languages are covered, depending on the
application;

- Cross-lingual information access;

- Aggregation of information from multiple documents and from many
languages.



For more information on traineeships, cost of living, location, etc., see
http://langtech.jrc.it/WorkatJRC.html.







Ralf Steinberger (Ralf.Steinberger@jrc.it)

European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology
Applications: http://emm.jrc.it/overview.html
The science behind them: http://langtech.jrc.it

JRC-Acquis Multilingual Parallel Corpus (Version 3)

* Freely available for research purposes.

* 22 languages: Bulgarian, Czech, Danish, German, Greek, English,
Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian,
Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.

* Altogether over 1 billion words.

* Sentence alignment for 231 language pairs.

* For more information and download, see
http://langtech.jrc.it/JRC-Acquis.html.




DGT-Translation Memory

* Freely available for research purposes.

* Aligned translation units for 231 language pairs.

* Alignment manually verified.

* For more information and download, see
http://langtech.jrc.it/DGT-TM.html.




The JRC's Language Technology group specialises in the development of highly
multilingual text analysis tools and in cross-lingual applications. Many
applications are accessible online, e.g.:

* NewsExplorer: multilingual news aggregation and analysis (19 languages);
lets users navigate the news over time and across languages; trend analysis;
collects information about people from the news; social network detection.

* NewsBrief: breaking news detection and display of the very latest thematic
news from around the world; email alerting (22+ languages).

* MedISys Medical Information System: latest health-related news from around
the world according to themes and diseases (22+ languages).

* EMM-Labs: latest developments; social networks; live people-in-the-news;
country and theme fact sheets; maps showing violent events world-wide.
