27 June: fyi -- digital archives, NL
The Institute for Dutch Lexicology has a vacancy for an experienced
Language Technologist. Your primary task will be to help develop
language technology to improve the accessibility of historical documents
for IMPACT. Work within the scope of other INL projects will also be
among your responsibilities.
/IMPACT/ is a new European research project in which the INL is
participating and which started on 1 January 2008. It is an 'Integrated
Project' in which several national libraries and research institutes as
well as two commercial partners are working together. The main purpose
of IMPACT is to significantly improve the accessibility of historical
documents.
In order to achieve this, IMPACT has set itself the following tasks:
1. Current OCR software is not suitable for mass digitisation of
historical documents. Within the project, OCR software will be
developed that will significantly improve the accuracy of
state-of-the-art systems, allowing for the first time reliable
full-text mass digitisation of historical documents.
2. Information in historical documents is not easily accessed by
modern users because of the historical language barrier. Within
the project, historical lexica and linguistic processing tools
will be developed that will enable enriched indexing to provide
access to historical material with contemporary query techniques.
Tasks
Your IMPACT-related tasks will concern the development of a toolbox for
the building and deployment of historical lexica. Both tools and lexica
will be used for the enhancement of OCR results and for better retrieval
in historical text material; recognition of Named Entities plays an
important role in this. You will be working on both the implementation
and the design of algorithms. Other tasks will be related to data
processing and tools for data processing.
Profile
- relevant academic background in computational linguistics,
computer science or applied mathematics
- demonstrable knowledge of and experience with the development
and implementation of machine learning, statistical and other
computational-linguistic algorithms
- demonstrable experience with software development. Sound
knowledge of C and C++ is required.
- ability to work under pressure, as part of a team that must
achieve good results within a short period of time
Preferably you have:
- experience with Named Entity Processing and knowledge of OCR
techniques
- a PhD or other research experience
- knowlegde of and experience with historical text material.
Offer
An INL contract for two years. The salary scale indicated for this job
is -- dependent upon various factors -- either 10 or 11, with a maximum
of EUR 4.270,- gross per month on the basis of a 38-hour week. In
addition you will be entitled to 42 days holiday per annum plus a
holiday allowance, according to the CAO Onderzoekinstellingen
(the collective labour agreement for research institutes).
Questions and applications
If you have any further questions, please contact Katrien Depuydt
(Taalbank), INL, Postbus 9515, 2300 RA, Leiden. Ph: +31 (0)71 527 2479,
email: depuydt@inl.nl. See also www.inl.nl and www.impact-project.eu.
Applications may be sent to Dr. Jeannine Beeken (managing director),
INL, Postbus 9515, 2300 RA, Leiden. Email: secretariaat@inl.nl.
*Closing date:* 30-06-2008
--
Katrien Depuydt
Instituut voor Nederlandse Lexicologie
(Institute for Dutch Lexicology)
Taalbank
(Language Database Dept.)
Postbus 9515
NL-2300 RA Leiden
tel.: +31 71 5272479
mail: depuydt@inl.nl