[Audio Translator] Suggestion for Enhancement
planned
Lucho
Audio Translator and Captions Generator both lack essential segmentation options when exporting captions (SRT or VTT). The minimum required options are:
- Max number of lines per segment
- Max number of words per line (in same segment)
- Max Duration Per Segment (Seconds)
- Max Characters Per Segment
- Sentence-Aware Segmentation (If enabled, the start of a new sentence will always begin a new segment)
Free transcription tools on the net already have all this, so..... ¯\_(ツ)_/¯
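To make the request concrete, here is a rough sketch of how the five options above could interact. It assumes the transcription step already yields word-level timestamps; the names `Word`, `Options`, `segment`, `to_lines` and `to_srt` are made up for illustration and are not part of any 1min.AI API.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Options:
    # Hypothetical option names mirroring the list in the request.
    max_lines: int = 2
    max_words_per_line: int = 7
    max_duration: float = 5.0
    max_chars: int = 84
    sentence_aware: bool = True


def segment(words, opt):
    """Group words into segments, closing one as soon as a limit would be exceeded."""
    segments, current = [], []
    for w in words:
        if current:
            text_len = len(" ".join(x.text for x in current)) + 1 + len(w.text)
            too_many_words = len(current) + 1 > opt.max_lines * opt.max_words_per_line
            too_long = w.end - current[0].start > opt.max_duration
            too_wide = text_len > opt.max_chars
            # Naive sentence detection: the previous word ended with ., ? or !
            new_sentence = opt.sentence_aware and current[-1].text.endswith((".", "?", "!"))
            if too_many_words or too_long or too_wide or new_sentence:
                segments.append(current)
                current = []
        current.append(w)
    if current:
        segments.append(current)
    return segments


def to_lines(seg, opt):
    """Split one segment into lines of at most max_words_per_line words."""
    texts = [w.text for w in seg]
    n = opt.max_words_per_line
    return [" ".join(texts[i:i + n]) for i in range(0, len(texts), n)]


def timestamp(t):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(words, opt):
    """Render numbered SRT blocks from word-level timestamps."""
    blocks = []
    for i, seg in enumerate(segment(words, opt), 1):
        header = f"{i}\n{timestamp(seg[0].start)} --> {timestamp(seg[-1].end)}"
        blocks.append(header + "\n" + "\n".join(to_lines(seg, opt)))
    return "\n\n".join(blocks) + "\n"
```

Because a segment never holds more than `max_lines * max_words_per_line` words, splitting it by `max_words_per_line` cannot produce more than `max_lines` lines. A real implementation would need better sentence detection than checking the trailing punctuation.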
1min.AI
planned
Lucho
1min.AI, thanks for considering this request. Truly appreciated.
I'd also like to point out that speaker diarization/identification using OpenAI Whisper would be an excellent complementary option, opening up many useful possibilities. To mention a few:
- Assign a color tag to each speaker in captions, for instance in VTT: <c.yellow>sentence of speaker 2</c>...<c.white>...<c.cyan>
- List the participants in meeting minutes and note who is in charge of each outlined task
- Note who is the host and who is the guest in a podcast
- Export answers by speaker into a form or table
- Transcription stats (time per speaker, clarity of speech, language used, etc.)
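As a rough illustration of the color-tag and per-speaker stats ideas, here is a small sketch. It assumes diarization has already produced `(speaker, start, end, text)` tuples; the function names and the palette are invented for this example, and how the class names are rendered depends on the player.

```python
from itertools import cycle

# Hypothetical palette; colors are handed out in order of first appearance.
PALETTE = ["white", "yellow", "cyan", "lime", "magenta"]


def vtt_timestamp(t):
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02}.{ms:03}"


def escape(text):
    """Escape the characters that are special inside WebVTT cue text."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def to_vtt(turns):
    """Render one cue per speaker turn, wrapped in a <c.COLOR> class span."""
    colors = {}
    palette = cycle(PALETTE)
    cues = ["WEBVTT"]
    for speaker, start, end, text in turns:
        if speaker not in colors:
            colors[speaker] = next(palette)
        timing = f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}"
        cues.append(f"{timing}\n<c.{colors[speaker]}>{escape(text)}</c>")
    return "\n\n".join(cues) + "\n"


def talk_time(turns):
    """Total speaking time in seconds for each speaker."""
    totals = {}
    for speaker, start, end, _ in turns:
        totals[speaker] = totals.get(speaker, 0.0) + (end - start)
    return totals
```

The same list of turns feeds both functions, so once speaker labels exist, colored captions and stats such as time per speaker come almost for free.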
Hopefully Speaker Diarization can be added soon to the Audio Translator and Captions Generator tools.
1min.AI
Merged in a post:
Audio File translator update.
Rick Martinez
Can we add WhisperX and Whisper JAX? Both are superior to Whisper, and I find them extremely helpful for translations, especially in terms of accuracy and handling longer files.
1min.AI
Merged in a post:
Target Language selection for Audio translator Tool
Lucho
Audio Translator is missing its most essential option: selecting the target language for translation.