10 May: fyi -- corpus linguistics, California
Index of May 2004 | Index of year: 2004 | Full index
Position Available: Programmer / Digital Curator
The Rosetta Project: ALL Language Archive is looking for a
Jack-or-Jill of all digital trades to be the digital backbone of our
effort to create an online database of all documented human languages.
http://www.rosettaproject.org
We are looking for someone with a combination of programming and
linguistic skills: someone who has a broad range of tools in their
quiver and can adapt to a wide range of circumstances, both
planned and unplanned.
Much of the work will involve Perl scripting and command-line work to
harvest and parse legacy databases and load their contents into Rosetta
MySQL tables. There will also be an unending array of mass
file-processing tasks, hardware and software battles, sysadmin
issues, and the usual character-encoding and transcription problems
that arise when working with the wide range of materials from the
very different sources we engage daily.
Our database is currently over 30,000 text pages deep, providing basic
descriptive material for around 1,500 languages across the following
categories: demographic description, maps, orthography, phonology, grammar,
sociolinguistics, core vocabulary lists, numbering system, main
parallel text, glossed vernacular text, audio files, and
miscellaneous other items. These materials are available online
together with a variety of tools that support collaborative
contribution, review, and correction of all texts in the database.
Rosetta provides both content and tools for language
researchers, educators and learners. The person for this job also
needs to be interested in, and qualified for, developing such
tools and environments.
Our goal in the next 4 years is to expand the current Rosetta database
to provide the above descriptive categories for all human languages
with some meaningful form of documentation: somewhere between 3,000
and 4,000 of the 7,000 on the planet (SIL count). We recently
received a $1,000,000 NSF grant from the National Science Digital
Library program to move us towards this goal.
We obviously have a very large digital curation task on our hands,
and we need someone with unusual ambition, cleverness and flexibility
to engage and solve the vast array of challenges we have encountered
and will encounter during this process.
Absolutely necessary tools and knowledge for the job:
- Perl
- Unix
- FreeBSD
- MySQL
- Python
- Zope
- Windows sysadmin
- clarity on Unicode
- ability to read IPA
If you are interested, please send an email explaining your
interest and abilities, along with a resume. No applications via
snail mail.
The position will start as a three-month contract and may
grow into a full-time salaried position if the fit is right.
Thank you,
Jim Mason
Director, Rosetta Project
Address for Applications:
Attn: Director Jim Mason
jimmason@longnow.org
San Francisco, CA 94710
United States of America
Applications are due by 01-May-2004
Contact Information:
Jim Mason
Email: jimmason@longnow.org
Website: http://www.rosettaproject.org