Elhuyar – Bilingual speech recognition (ASR) and machine translation (MT) for creating translated subtitles and minutes of municipal plenary sessions

Elhuyar 

Sector: Services

Business Case

In recent years, video minutes published on the web have become the only official record of municipal plenary sessions, because they save the cost of manually producing transcriptions or minutes and translating them into the two official languages of the Basque Autonomous Community. However, this poses several problems. Video records are not accessible to deaf people. Because many plenary sessions are bilingual, the lack of translation means that citizens’ language rights are not respected and that monolingual people cannot understand what is said in the other language. Finally, the future survival of digital formats is not assured: unlike paper records, video records will certainly not remain accessible in the medium or long term, let alone over hundreds of years.

Objectives

Use bilingual Automatic Speech Recognition (ASR) technology for the (semi-)automatic creation of subtitles and minutes, and automatic translation technology to translate those subtitles and minutes.
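
As an illustration of the subtitle-creation objective, the following minimal sketch turns timed transcript segments into SubRip (SRT) subtitle text, optionally passing each line through a translation step. It is not Elhuyar's implementation: the names `format_timestamp` and `to_srt`, the `(start, end, text)` cue layout and the `translate` callable are assumptions made for the example, and a real system would plug in its own ASR output and MT engine.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical cue layout: (start_seconds, end_seconds, recognised_text)
Cue = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: Iterable[Cue],
           translate: Optional[Callable[[str], str]] = None) -> str:
    """Build SRT text from cues; if `translate` is given, apply it to each line.

    `translate` stands in for a machine translation call (an assumption here).
    """
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        line = translate(text) if translate else text
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{line}\n"
        )
    # SRT separates consecutive cues with one blank line.
    return "\n".join(blocks)
```

Calling `to_srt(cues)` would give the subtitles in the original language, and calling it again with a `translate` function would give the translated version with identical timing.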

Use case

Bilingual transcription and subtitling templates have been created for municipal plenary sessions. Speaker detection technology has also been developed: once the contributions of each councillor or official have been identified in one plenary session, that speaker is detected automatically in subsequent sessions. This makes it possible to obtain statistics on the use of each language by councillors, parties, etc. The technology has been integrated into the content manager/publisher of Abao, a company that provides, among other things, plenary session recording and video recording services, so that the platform can now also generate subtitles and minutes in both the original and translated languages.
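
The language-use statistics mentioned above can be sketched as follows, assuming each recognised segment already carries a speaker label, a party and a language tag. The `Segment` fields, the `language_share` function and the language codes are hypothetical names chosen for this example, not part of the deployed system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Segment:
    """One stretch of speech attributed to a single speaker (hypothetical schema)."""
    speaker: str
    party: str
    language: str  # e.g. "eu" for Basque, "es" for Spanish
    start: float   # seconds
    end: float     # seconds


def language_share(segments: Iterable[Segment],
                   key: Callable[[Segment], str]) -> Dict[str, Dict[str, float]]:
    """Return, per group, the fraction of speaking time spent in each language.

    `key` selects the grouping, for example by speaker or by party.
    """
    seconds = defaultdict(lambda: defaultdict(float))
    for seg in segments:
        seconds[key(seg)][seg.language] += seg.end - seg.start

    shares = {}
    for group, by_language in seconds.items():
        total = sum(by_language.values())
        # Skip groups with no measurable speaking time to avoid dividing by zero.
        if total > 0:
            shares[group] = {lang: t / total for lang, t in by_language.items()}
    return shares
```

For instance, `language_share(segments, key=lambda s: s.party)` would aggregate by party, while `key=lambda s: s.speaker` would aggregate by councillor. Measuring by speaking time rather than by number of segments is a design choice of this sketch.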

Infrastructure

Cloud

Technology

Machine learning or deep learning; text mining; voice recognition

Data

Several hundred hours of transcribed audio recordings. Public corpus of audio recordings with speaker identification.

Resources

Researchers who specialise in NLP, particularly in speech recognition and machine translation. Server infrastructure to host the recognition systems developed (unless an on-premises installation is required). Developers of APIs for calling the recognition and machine translation systems remotely. Front-end developers to integrate the recognition, machine translation and manual correction systems.

Difficulties and learning

Difficulty in transcribing bilingual content. Difficulty in transcribing local terminology and toponymy. Training with locality-specific datasets.

KPIs (business impact and metrics of the model)

Implementation in Abao. Increased accessibility of content in terms of formats and official languages. Increased durability of the minutes of plenary sessions.

Funding

Applied Artificial Intelligence

Collaborators, Partners

Abao
