Automatic subtitling: A new paradigm / Karakanta, Alina. - (2022 Nov 11), pp. 1-308. [10.15168/11572_356701]
Automatic subtitling: A new paradigm
Karakanta, Alina
2022-11-11
Abstract
Audiovisual Translation (AVT) is a field where Machine Translation (MT) has long found limited success mainly due to the multimodal nature of the source and the formal requirements of the target text. Subtitling is the predominant AVT type, quickly and easily providing access to the vast amounts of audiovisual content becoming available daily. Automation in subtitling has so far focused on MT systems which translate source language subtitles, already transcribed and timed by humans. With recent developments in speech translation (ST), the time is ripe for extended automation in subtitling, with end-to-end solutions for obtaining target language subtitles directly from the source speech. In this thesis, we address the key steps for accomplishing the new paradigm of automatic subtitling: data, models and evaluation. First, we address the lack of representative data by compiling MuST-Cinema, a speech-to-subtitles corpus. Segmenter models trained on MuST-Cinema accurately split sentences into subtitles, and enable automatic data augmentation techniques. Having representative data at hand, we move to developing direct ST models for three scenarios: offline subtitling, dual subtitling, live subtitling. Lastly, we propose methods for evaluating subtitle-specific aspects, such as metrics for subtitle segmentation, a product- and process-based exploration of the effect of spotting changes in the subtitle post-editing process, and finally, a comprehensive survey on subtitlers' user experience and views on automatic subtitling. Our findings show the potential of speech technologies for extending automation in subtitling to provide multilingual access to information and communication.

File | Size | Format |
---|---|---|
Karakanta_PhDThesis_bib.pdf (open access; Description: PhD Thesis; Type: Doctoral Thesis; License: Creative Commons) | 6.68 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.