Audio Translator and Captions Generator are both missing necessary Segmentation Options when exporting to captions (SRT or VTT).

The minimum required options are:

- Max number of lines per segment
- Max number of words per line (in the same segment)
- Max duration per segment (seconds)
- Max characters per segment
- Sentence-aware segmentation (if enabled, the start of a new sentence always begins a new segment)

Free transcription tools on the net already have all this, so... ¯\_(ツ)_/¯
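To make the request concrete, here is a rough sketch of how these five options could interact, assuming the transcription step already produces word-level timestamps. Everything in it (`SegmentOptions`, `segment`, `to_srt`, the field names, the default values) is hypothetical and made up for illustration; it is not the existing API of either tool, and sentence ends are detected naively by trailing punctuation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed input shape: (text, start_seconds, end_seconds) per word.
Word = Tuple[str, float, float]


@dataclass
class SegmentOptions:
    # Hypothetical option names and defaults, mirroring the list above.
    max_lines: int = 2
    max_words_per_line: int = 7
    max_duration: float = 5.0
    max_chars: int = 84
    sentence_aware: bool = True


def segment(words: List[Word], opts: SegmentOptions) -> List[Dict]:
    """Greedily group timed words into caption segments."""
    segments: List[Dict] = []
    current: List[Word] = []
    max_words = opts.max_lines * opts.max_words_per_line

    def flush() -> None:
        if not current:
            return
        texts = [w[0] for w in current]
        n = opts.max_words_per_line
        # Split the segment's words into lines of at most n words each.
        lines = [" ".join(texts[i:i + n]) for i in range(0, len(texts), n)]
        segments.append(
            {"start": current[0][1], "end": current[-1][2], "lines": lines}
        )
        current.clear()

    for word in words:
        if current:
            candidate = current + [word]
            chars = len(" ".join(w[0] for w in candidate))
            duration = word[2] - current[0][1]
            # Naive sentence detection: previous word ends with . ? or !
            ends_sentence = current[-1][0].endswith((".", "?", "!"))
            if (
                len(candidate) > max_words
                or chars > opts.max_chars
                or duration > opts.max_duration
                or (opts.sentence_aware and ends_sentence)
            ):
                flush()  # adding this word would break a limit
        current.append(word)
    flush()
    return segments


def to_srt(segments: List[Dict]) -> str:
    """Render segments as SRT text (HH:MM:SS,mmm timestamps)."""
    def ts(t: float) -> str:
        ms = int(round(t * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    blocks = [
        f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n" + "\n".join(seg["lines"])
        for i, seg in enumerate(segments, 1)
    ]
    return "\n\n".join(blocks) + "\n"
```

Calling `to_srt(segment(words, SegmentOptions(max_words_per_line=5)))` would then produce an SRT string. A single word that by itself exceeds a limit still gets its own segment in this sketch, and a real implementation would probably want smarter line balancing and sentence detection.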