[AI for Audio] Suggestion for Enhancement
planned
Lucho
Audio Translator and Captions Generator are both missing necessary Segmentation Options when exporting to captions (SRT or VTT). The minimum required options are:
- Max number of lines per segment
- Max number of words per line (in same segment)
- Max Duration Per Segment (Seconds)
- Max Characters Per Segment
- Sentence-Aware Segmentation (If enabled, the start of a new sentence will always begin a new segment)
Free transcription tools on the net already have all this, so..... ¯\_(ツ)_/¯
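To make the request concrete, here is a rough sketch of how the options listed above could drive segmentation. It assumes the transcription step returns word-level timestamps; `Word`, `Options`, `segment` and `to_srt` are hypothetical names for illustration, not anything from the 1min.AI tools.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Options:
    # Hypothetical defaults, chosen only for illustration.
    max_lines: int = 2
    max_words_per_line: int = 7
    max_duration: float = 5.0
    max_chars: int = 84
    sentence_aware: bool = True


def segment(words: List[Word], opt: Options) -> List[List[Word]]:
    """Greedily group words, starting a new segment when any limit would be exceeded."""
    segments: List[List[Word]] = []
    current: List[Word] = []
    for w in words:
        if current:
            candidate = current + [w]
            too_many_words = len(candidate) > opt.max_lines * opt.max_words_per_line
            too_long = w.end - current[0].start > opt.max_duration
            too_many_chars = len(" ".join(x.text for x in candidate)) > opt.max_chars
            # Sentence-aware: the previous word closed a sentence, so start fresh.
            new_sentence = opt.sentence_aware and current[-1].text.endswith((".", "?", "!"))
            if too_many_words or too_long or too_many_chars or new_sentence:
                segments.append(current)
                current = []
        current.append(w)
    if current:
        segments.append(current)
    return segments


def fmt_time(t: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def to_srt(segments: List[List[Word]], opt: Options) -> str:
    """Render segments as SRT, wrapping each segment at max_words_per_line."""
    blocks = []
    for i, seg in enumerate(segments, 1):
        texts = [w.text for w in seg]
        lines = [" ".join(texts[j:j + opt.max_words_per_line])
                 for j in range(0, len(texts), opt.max_words_per_line)]
        header = f"{i}\n{fmt_time(seg[0].start)} --> {fmt_time(seg[-1].end)}"
        blocks.append(header + "\n" + "\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```

A single word that breaks a limit on its own (for example one longer than the character cap) still gets its own segment here; a real implementation would need to decide how to handle that case.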
1min.AI
Merged in a post:
[Text to Speech] More Voices for Google AI
jereme
Google Text to Speech has new voices, such as English (US) Journey O, and Journey voice types for English (UK), English (India), etc., as shown here:
It would be great if the voice list updated automatically every time new voices are added.
1min.AI
planned
Lucho
1min.AI Thanks for considering this request. Truly appreciated.
I'd also point out that Speaker Diarization/Identification is an excellent complement that enables many useful features, to mention some:
- Optionally Include Speaker Names in Subtitles
00:00:01.285 --> 00:00:03.804
John: Hello, I'm the Speaker1
- Set VTT Subtitle Styling (bold, italic, color tags) for Voice Difference
<c.white><i>Voice Over</i></c>
<b>Hi, I'm Speaker1</b>
- Write down participants in meeting minutes and state who is in charge of each outlined task
- Write down who is host and guest in a podcast
- Export answers by speaker to table or form
- Transcription stats (time per speaker, clarity of speech, language used)
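As a rough illustration of the first and last items, the sketch below turns diarized speaker turns into VTT cues with speaker names and sums the time per speaker. It assumes diarization yields a list of turns with a speaker label and timestamps; `Turn`, `to_vtt` and `speaking_time` are hypothetical names, not part of any existing tool. The `<v Speaker>` voice tag is standard WebVTT, and speaker labels are assumed not to contain `&`, `<` or `>`.

```python
import html
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def vtt_time(t: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02}:{m:02}:{s:02}.{ms:03}"


def to_vtt(turns: List[Turn], include_names: bool = True) -> str:
    """Render one cue per turn, tagging each with a WebVTT voice span."""
    out = ["WEBVTT", ""]
    for t in turns:
        out.append(f"{vtt_time(t.start)} --> {vtt_time(t.end)}")
        label = f"{t.speaker}: " if include_names else ""
        # Escape &, < and > so the spoken text cannot be read as cue markup.
        text = html.escape(t.text, quote=False)
        out.append(f"<v {t.speaker}>{label}{text}</v>")
        out.append("")
    return "\n".join(out)


def speaking_time(turns: List[Turn]) -> Dict[str, float]:
    """Total seconds spoken per speaker."""
    totals: Dict[str, float] = defaultdict(float)
    for t in turns:
        totals[t.speaker] += t.end - t.start
    return dict(totals)
```

Because each cue carries a voice tag, a player stylesheet could style speakers differently (for example `::cue(v[voice="John"])`) without hard-coding bold or italic tags in the cue text.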
Hopefully Speaker Diarization can be added soon to the Audio Translator and Captions Generator tools.
1min.AI
Merged in a post:
Audio File translator update.
Rick Martinez
Can we add WhisperX and Whisper JAX? Both are superior to Whisper, and I find them extremely helpful for translations, especially for accuracy and longer files.
1min.AI
Merged in a post:
Target Language selection for Audio translator Tool
Lucho
Audio Translator is missing its most essential option: selecting the target language for translation.