Automatic Simultaneous Translation


Automatic simultaneous translation is a subfield of computer science and artificial intelligence (AI) that aims to perform simultaneous interpretation of speech by machine.^[1] For humans as well as for machines, translation of spoken language, i.e. interpretation, can be performed consecutively or simultaneously.^[2]^[3] Consecutive translation is done one sentence at a time, with the source-language speaker and the translator (in the target language) taking turns speaking. In simultaneous interpretation, the interpretation from the source into the target language has to be performed in parallel, while the original speaker continues speaking in the source language. Simultaneous translation involves a high cognitive load.^[4]

Description

Simultaneous speech translation systems are built either as a pipeline of separate components or as end-to-end systems.^[5]

Pipelined Systems

Pipelined simultaneous translation systems combine three areas of artificial intelligence:^[6]^[7]

Automatic Speech Recognition (ASR), which transcribes the speech in the source language into source-language text.^[8]
Machine Translation (MT), which translates the recognised source-language text into text in the target language.^[9]
Output (speech or text) generation, where the output in the target language has to be presented in a form that a listener can understand and follow along with. This includes the insertion of punctuation^[10] and the removal of disfluencies.^[11]^[12]
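
The three stages listed above can be sketched as a chain of Python generators, so that each stage processes tokens as soon as the previous stage emits them. This is only a toy illustration under loud assumptions: the "audio" is already a list of words, the `<pause>` marker, the filler list, and the word-for-word lexicon are hypothetical stand-ins for real ASR, MT, and output-generation models, and placing disfluency removal in the last stage simply follows the order of the list above.

```python
from typing import Iterable, Iterator, List

PAUSE = "<pause>"              # hypothetical marker for a silence in the audio
FILLERS = {"uh", "um"}         # hypothetical list of disfluencies to drop
TOY_LEXICON = {                # toy English-to-German lexicon, not a real MT model
    "good": "guten",
    "morning": "Morgen",
    "welcome": "willkommen",
}


def recognize(audio_chunks: Iterable[str]) -> Iterator[str]:
    """Stand-in for ASR: each 'audio chunk' is assumed to already be one word."""
    for chunk in audio_chunks:
        yield chunk.strip().lower()


def translate(tokens: Iterable[str]) -> Iterator[str]:
    """Stand-in for MT: word-for-word lookup; unknown tokens pass through."""
    for token in tokens:
        yield TOY_LEXICON.get(token, token)


def _finish(words: List[str]) -> str:
    """Capitalise the first letter and add sentence-final punctuation."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


def generate_output(tokens: Iterable[str]) -> Iterator[str]:
    """Output generation: drop fillers and close a sentence at each pause."""
    sentence: List[str] = []
    for token in tokens:
        if token == PAUSE:
            if sentence:
                yield _finish(sentence)
                sentence = []
        elif token not in FILLERS:
            sentence.append(token)
    if sentence:               # flush whatever is left when the input ends
        yield _finish(sentence)


def run_pipeline(audio_chunks: Iterable[str]) -> Iterator[str]:
    """Chain the three stages: ASR, then MT, then output generation."""
    return generate_output(translate(recognize(audio_chunks)))


if __name__ == "__main__":
    stream = ["good", "uh", "morning", PAUSE, "um", "welcome", PAUSE]
    for line in run_pipeline(stream):
        print(line)
```

Because every stage is a generator, a finished sentence is emitted as soon as its pause arrives rather than after the whole input has been read, which loosely mirrors the low-delay requirement of simultaneous translation.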

End-to-End Systems

End-to-end simultaneous translation systems use one large artificial neural network to directly transform the incoming audio signal into the sequence of output words in the target language.^[13]^[14]^[15]

History

The first speech translation systems were published in the early 1990s as consecutive translation dialog systems.^[7] These early systems generally also assumed a restricted domain in which the system would be used.

The first demonstration of a real-time, open-domain, online simultaneous translation system was given at a joint press conference at Carnegie Mellon University, Pittsburgh, PA, USA, and Karlsruhe Institute of Technology, Germany, in 2005.^[16]^[17]^[18]^[19]^[20]^[21] The system shown to the public translated a lecture simultaneously with little delay, using statistical components. Output was shown as subtitles on a screen under or next to the lecturer's presentation slides. Alternative output modes, via targeted audio speakers and heads-up display goggles, were also shown in the demonstration.

Following this feasibility demonstration, a production service was introduced at Karlsruhe University in Germany.^[22] The service was the result of the EU-funded project EU-BRIDGE, in which simultaneous translation technologies were turned into cloud-based services.^[23] EU-BRIDGE also tested the Karlsruhe Lecture Translator at the European Parliament.^[24] At the European Parliament's "Innovation Days 2018", the Karlsruhe Lecture Translator provided automatic conference interpretation. Similar demonstrations of simultaneous speech translation were later performed by Microsoft at a conference in China (2016)^[25] and by Baidu in 2018,^[26] indicating a growing need for and interest in bringing translingual interpreting capabilities to modern software products.

References

[1] Mima, Hideki; Iida, Hitoshi; Furuse, Osamu (1998). "Simultaneous interpretation utilizing example-based incremental transfer" (PDF). 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2. pp. 855–861.
[2] "Consecutive and Simultaneous Interpretering". www.conference-interpreters.ca. Archived from the original on 2016-10-22. Retrieved 2017-09-29.
[3] Gaiba, Francesca (1998). The Origins of Simultaneous Interpretation: The Nuremberg Trial. Ottawa: University of Ottawa Press. ISBN 978-0776604572.
[4] Mizuno, Akira (2017). "Simultaneous interpreting and cognitive constraints". Bull. Coll. Lit. 58: 1–28.
[5] Anastasopoulos, Antonios; Bojar, Ondrej; Bremerman, Jacob; Cattoni, Roldano; Elbayad, Maha; Federico, Marcello; Ma, Xutai; Nakamura, Satoshi; Negri, Matteo; Niehues, Jan; Pino, Juan; Salesky, Elizabeth; Stüker, Sebastian; Sudoh, Katsuhito; Turchi, Marco; Waibel, Alexander; Wang, Changhan; Wiesner, Matthew (2021). "Findings of the IWSLT 2021 Evaluation Campaign" (PDF). Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). pp. 1–29. doi:10.18653/v1/2021.iwslt-1.1.
[6] Fügen, Christian; Waibel, Alex; Kolss, Muntsin (2007). "Simultaneous translation of lectures and speeches" (PDF). Machine Translation, Vol. 21, No. 4, MTSN. pp. 209–252. doi:10.1007/s10590-008-9047-0.
[7] Seligman, Marc; Waibel, Alex; Joscelyne, Andrew, eds. (16 March 2017). TAUS Speech-to-Speech Translation Technology Report.
[8] Nguyen, Thai-Son; Stüker, Sebastian; Waibel, Alex (2021). "Super-Human Performance in Online Low-Latency Recognition of Conversational Speech". Interspeech 2021.
[9] Niehues, Jan; Pham, Ngoc-Quan; Ha, Thanh-Le; Sperber, Matthias; Waibel, Alex (2018). "Low-Latency Neural Speech Translation" (PDF). 2018 Annual Conference of the International Speech Communication Association (INTERSPEECH). pp. 1293–1297.
[10] Cho, Eunah; Niehues, Jan; Waibel, Alex (2012). "Segmentation and Punctuation Prediction in Speech Language Translation Using a Monolingual Translation System" (PDF). 2012 International Workshop for Spoken Language Translation (IWSLT).
[11] Honal, Matthias; Schultz, Tanja. "Correction of Disfluencies in Spontaneous Speech using a Noisy-Channel" (PDF). 2003 European Conference on Speech Communication and Technology (EUROSPEECH). pp. 2781–2784.
[12] Cho, Eunah; Niehues, Jan; Waibel, Alex (2014). "Machine Translation of Multi-party Meetings: Segmentation and Disfluency Removal Strategies" (PDF). 2014 International Workshop on Spoken Language Translation (IWSLT). pp. 176–183.
[13] Ren, Yi; Liu, Jinglin; Tan, Xu; Zhang, Chen; Qin, Tao; Zhao, Zhou; Liu, Tie-Yan (2020). "SimulSpeech: End-to-End Simultaneous Speech to Text Translation" (PDF). Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. pp. 3787–3796. doi:10.18653/v1/2020.acl-main.350.
[14] Ma, Xutai; Pino, Juan; Koehn, Philipp (2020). "Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)". arXiv:2011.02048v1 [cs.CL].
[15] Nguyen, H.; Estève, Y.; Besacier, L. (2021). "An Empirical Study of End-To-End Simultaneous Speech Translation Decoding Strategies". Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 7528–7532. doi:10.1109/ICASSP39728.2021.9414276.
[16] "Demo of breakthroughs in cross lingual communication and speech-to-speech translation" (video) (Press release). Pittsburgh, PA, USA: Carnegie Mellon University. 27 October 2005.
[17] "Breaking the Language Barrier with InterAct". Pittsburgh, PA, USA: KDKA News. 27 October 2005.
[18] "Welcome to the Future: Effective Communication". CNN. 2005.
[19] "Computer speech translator closer" (PDF). Shanghai Daily Company. 28 October 2005.
[20] Bails, Jennifer (28 October 2005). "No longer lost in translation" (PDF). Pittsburgh, PA, USA.
[21] Spice, Byron (28 October 2005). "Let's talk! The computer can translate". Pittsburgh, PA, USA.
[22] "German university to stream subtitled lectures". Deutsche Welle. 25 June 2012.
[23] Stüker, Sebastian; Ney, Hermann; Federico, Marcello; Tescari, Alessandro; Simpson, Matt; Rödder, Margit; Koehn, Philipp; Steinbiss, Volker (31 January 2015). EU-BRIDGE Final Report (PDF) (Report).
[24] EP-conference - New Technologies and Education for Multilingualism going Global (video). Brussels, Belgium. 18–19 October 2021.
[25] "Microsoft Translator brings end-to-end speech translation to everyone with the world's first Speech Translation API". Microsoft Translator Blog. Microsoft. 30 March 2016. Retrieved 28 August 2021.
[26] "Baidu Research Announces Breakthrough in Simultaneous Translation". Baidu Research. 24 October 2018. Retrieved 28 August 2021.

This article "Automatic Simultaneous Translation" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Automatic Simultaneous Translation. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
