The 12th LRB Meeting


The 12th meeting of the Language Resource Board, i.e. the group of National Anchor Points within the ELRC project, took place on November 9th at the Panorama Hotel in Prague, during the Czech Republic's presidency of the Council of the EU. The meeting was held in hybrid mode, with 55 participants on site and 22 online.

After welcome addresses by Philippe Gelin, Head of Sector Multilingualism at the Data Directorate of DG Connect, and Simon Ostermann, ELRC coordinator and Senior Researcher at the German Research Center for Artificial Intelligence (DFKI), Thierry Declerck, a Senior Consultant at DFKI, presented news on the ELRC project. Highlights of recent years include a near doubling of the number of language resources collected by ELRC since 2020, new data crawling activities, two successful initiatives to collect Ukrainian language data, and increased ELRC activity and reach on social media.

Subsequently, Stefania Racioppa (Researcher at DFKI) presented the new ELRC White Paper. The fourth edition of the White Paper looks at language data use in public administrations (PAs) and in small and medium-sized enterprises (SMEs) across Europe. It also includes updated LT country profiles for all CEF-affiliated countries, with findings based both on the ELRC country workshops and on surveys conducted specifically for the White Paper. Highlights included findings on the use of machine translation (MT) tools, which indicate that SMEs in particular use MT tools more frequently than PAs, and which underline how much the importance of language data has increased in Europe over the last three years.

In a second block, Philippe Gelin, together with Dhafer Labib and Monica Pretti, Programme Officers at Unit G.3 of the European Commission, presented news on the newly planned Language Data Space (LDS). The project will continue some of the efforts and tasks carried out in ELRC and in the ELG project (European Language Grid). Following an overview of the new LDS and its vision, the presentations turned to the LDS structure, with details on the individual tasks to be carried out and on the establishment of the new Center of Excellence for Language Technologies (CELT). The third presentation focused on findings from the LDS workshop series, in which associations representing different business sectors gave input on how they use language data and what they would expect from the LDS. Lastly, Philippe Gelin moderated a discussion round on the new LDS, in which questions from the National Anchor Points (NAPs) and other participants on the budget and organisation of the data space were answered.

In the third block, Philippe Gelin talked about the governance of the new LDS and the role that it is expected to play within the European Member States, followed by another discussion round. The last block consisted of three presentations by consortium members on the ELRC language technology specifications:

  1. Tom Vanallemeersch, Language AI Advisor at CrossLang, presented the ELRC Multilingual Fake News Processing Tool.
  2. The ELRC WordPress Website Translation Tool, developed by Tilde, was presented by Andrejs Vasiļjevs, CEO of Tilde.
  3. Arne Defauw, Machine Learning Engineer at CrossLang, gave a talk on the estimation of document complexity, quality and readability.

All tools are or will be publicly available. Download links are given in the respective presentations, which are linked on the ELRC event site.

The meeting was closed by Simon Ostermann in the late afternoon and was followed by an informal get-together of representatives of the European Commission, the ELRC consortium and the National Anchor Points.