Frequently Asked Questions – and their Answers

What is going to happen to the data we provide?

Provision of data: How? Why? What are our data used for?

The data will go to the EC (DG Translation) to support the improvement of the machine translation system MT@EC.

Why should we (public institutions) actually provide data?

Supporting your own language is supporting Europe, and vice versa. Only with your help and your language resources can CEF AT be adapted to your needs. Within the CEF programme, CEF AT is available free of charge to public administrations in all EU member states and CEF-affiliated countries (Iceland and Norway). In return for your data, you receive a better service.

We (public institutions) don’t have any data for you! We work only paper-based. We outsource our translations.

If translations are outsourced, you should ask for the translated data to be delivered together with the translation memories. Make sure to negotiate the delivery of the translation memories with the language service provider in advance.

We cannot just share our data with you – they are confidential!

Most data held by the public sector is public data. Administrations provide various types of information online to citizens (e.g. news, legal texts, official communications, interviews, brochures and background information). This information may also be available in a foreign language. In Germany, for instance, all information on the website of the national government is provided in German, English and French.

How can I upload my data to the repository?

You can upload data to the ELRC repository in three simple steps:

1. Register (new user) or log in (returning user)
2. Provide a basic description of the language resource (title, short description, language(s))
3. Upload the .zip file

For further instructions, please read the Walkthrough for Contributors and/or contact the helpdesk.
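Step 3 expects a single .zip file. As a minimal sketch of how a folder of language resources could be bundled before upload, the following uses only the Python standard library. The function name `package_resource` and the folder and file names in the usage note are hypothetical; the repository itself does not prescribe this procedure.

```python
import zipfile
from pathlib import Path


def package_resource(source_dir: str, zip_path: str) -> list[str]:
    """Bundle every file under source_dir into one .zip archive.

    Returns the archive member names in the order they were written.
    """
    source = Path(source_dir)
    target = Path(zip_path).resolve()
    archived = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if not path.is_file():
                continue  # directories are implied by the member names
            if path.resolve() == target:
                continue  # do not pack the archive into itself
            name = path.relative_to(source).as_posix()
            archive.write(path, arcname=name)
            archived.append(name)
    return archived
```

For example, `package_resource("ministry_translations", "ministry_translations.zip")` would produce one archive that can then be uploaded in step 3.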

What is MT@EC? What is CEF AT?

CEF AT, MT@EC and translation needs in the public administration

MT@EC is the EC's current machine translation system, in use since 26 June 2013. It is an online service with a web user interface in 24 languages for human users, and it can also be used as a web service in machine-to-machine scenarios. It guarantees the confidentiality of data by using a highly secured protocol (sTESTA) coupled with the European Commission Authentication Service (ECAS). Any Member State administration can use it free of charge.

CEF AT (the Automated Translation platform of the Connecting Europe Facility, CEF), part of CEF Digital, provides automatic translation services with the goal of making any digital service accessible to every EU citizen in his or her own language. European public online services such as Europeana, the Open Data Portal and the Online Dispute Resolution Platform should benefit from CEF AT.

How can we access MT@EC?

MT@EC can be used by any Member State administration.

It can be accessed as follows:

  • Staff working for EU institutions or agencies can use MT@EC with their ECAS account credentials.
  • Staff working for a public administration in an EU country, Norway or Iceland should follow these steps:

    • Sign up for a personal ECAS account and password (using only your professional email address).
    • Send an email to DGT-MT@ec.europa.eu requesting an MT@EC account.
    • Specify your position and the public administrative body you work for.
    • Don't forget to include your full signature.
    • DGT creates your MT@EC account and notifies you.

Further information available here.

Why would we need MT@EC? We have human translators!

MT@EC can make the translation process substantially more productive and efficient. EC translators are responsible for translating content into all official EU languages. In total, more than 7,000 translators working for DG Translation and the EU institutions translated more than 2.3 million pages in 2014.

MT@EC is used daily for French, Spanish, Portuguese and Italian to produce initial translations, which are then post-edited very efficiently. For other languages (e.g. German), the quality of the output is still too low.

In the last year, however, significant progress has been achieved through domain-specific engines. For domain-specific reports and texts, the quality of MT@EC's output is acceptable. In other cases, the tool can rapidly scan long texts in a foreign language and point out passages to be translated by humans.

Overall, the translation quality is directly related to the availability of good quality data in the language: if the data for MT is good, then the MT system will be good.

Why should we support MT@EC / CEF AT – we can have our own national solution?

Typically, national solutions target a particular range of topics. The scope of MT@EC, by contrast, is broader and more comprehensive. By supporting MT@EC, participants can expect to have access to a broader service.

Machine translation is directly opposed to our national policy that young people should learn foreign languages.

Not necessarily. Machine translation can actually provide a good basis for learning languages.

Initially, it can be used to bridge the gap for people who cannot speak a particular language until they acquire initial language skills.

For instance, at university level, machine translation can be used to provide automatic, simultaneous translations of lectures for foreign students who have not yet mastered the language.

Machine translation will never work for our languages (e.g. Estonian, Finnish, Hungarian and other morphologically rich languages).

Processing certain languages with current MT technologies is more difficult because of, for example, their rich morphology or their free constituent order. MT experts are working on new MT solutions based on neural networks that are better suited to these languages. Moreover, the European Commission funds several actions (see e.g.) to investigate MT solutions for languages which currently receive only sub-optimal MT support.
However, regardless of the methodology, huge amounts of parallel resources are needed for the implementation of the systems, since these systems rely on machine learning. The need for data is specifically addressed in the workshops.

Why should I care about translations and get hold of/keep corresponding language data?

Managing and harvesting language data - why and how?

Whether you translate your material internally or outsource it, re-using language data from previous translations can make your process more cost-effective and improve the quality of the output.

How should I manage my data and why? We don’t have any infrastructures or resources (especially small translation services)!

In the public sector, translation management is very diverse: it ranges from paper-based to digitized workflows with term lists and translation memory storage.
From an organizational point of view, much benefit can arise even from small changes in dealing with language data. Suggested actions can be taken without major effort, including:

  • Analyse all phases of data development
  • Based on this, create a “data management plan” (DMP), even a very basic one:

    • Which data is important?
    • Where is it stored?
    • Can it be further processed?
  • Document all relevant data
  • If possible, use the web as additional publication channel and reap the benefits of linked data (see http://www.w3.org/DesignIssues/LinkedData.html)
  • See also the presentation “Best practice for the future: Capitalize on your valuable data”
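
A very basic data management plan can be as simple as a table that answers the three questions above for each data set. The sketch below is one possible way to keep such an inventory as a CSV file. The class `DataAsset`, its field names and the two helper functions are illustrative assumptions, not a prescribed format.

```python
import csv
from dataclasses import asdict, dataclass, fields


@dataclass
class DataAsset:
    """One row of a basic data management plan (hypothetical layout)."""
    name: str
    important: bool     # Which data is important?
    location: str       # Where is it stored?
    processable: bool   # Can it be further processed?


def write_plan(assets: list[DataAsset], path: str) -> None:
    """Save the inventory as a CSV file with one row per data asset."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(DataAsset)])
        writer.writeheader()
        for asset in assets:
            writer.writerow(asdict(asset))


def candidates_for_reuse(assets: list[DataAsset]) -> list[str]:
    """Names of assets that are both important and open to further processing."""
    return [a.name for a in assets if a.important and a.processable]
```

Even a small service could fill in such a table by hand; the code only shows that the plan needs no more structure than these four columns.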