AI speech-to-text is poor and generates wrong timestamps

Emma-Eriksson wrote on 5/7/2025, 9:28 AM

How come when I use speech-to-text, the AI analyses it very poorly in Swedish?

And how come the generated subtitle ends up in the wrong location (even though I generated it from Source Media, not the timeline)?

The end result is often that the subtitle starts way too early and then maybe stops when we stop talking in the video.

Br,

Emma

Comments

RogerS wrote on 5/7/2025, 10:31 AM

Could be a limitation of the underlying Microsoft Azure service. The online Swedish language dataset maybe isn't as big as the ones for English or other languages it does a better job on.

joelsonforte.br wrote on 5/9/2025, 6:11 AM

Hi @Emma-Eriksson

Try Auto Captions for Vegas. I did a small test on a video in Swedish and it worked well. See the result below.

Download the trial version, which is fully functional and allows you to use 50 sessions. If you like it, just get in touch with me.

https://www.vegascreativesoftware.info/us/forum/auto-captions-for-vegas--147836/

Emma-Eriksson wrote on 5/12/2025, 6:02 AM

The results are kind of the same as Vegas Pro. Not good enough. But thanks anyway.

RogerS wrote on 5/12/2025, 6:31 AM

Probably Swedish engineers need to do more to develop language models. You could try Subtitle Edit and see if any of the transcription engines handle it better. My guess is not.

joelsonforte.br wrote on 5/12/2025, 7:28 PM

"The results are kind of the same as Vegas Pro. Not good enough. But thanks anyway."

@Emma-Eriksson

The quality of the transcription results depends on the model used. Larger models produce better results. If you have a good GPU, use the Large-V2 or Large-V3 models. In the video, I always use the Small model just for demonstration purposes.

If you want subtitles that are more in sync with better timestamps, I’ll give you a tip: use the Word By Word option in Auto Captions for Vegas. This option generates a timestamp for each word in the audio instead of entire phrases, which results in much higher accuracy. After that, just edit the .srt file using Subtitle Edit to set the number of characters and lines you want, and import it back into the Vegas timeline.

At least for me, this has always worked really well. Here's a short screen recording I made showing the step-by-step process. Good luck with your projects!
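
For anyone who would rather script the regrouping step than do it by hand in Subtitle Edit, here is a minimal sketch in Python. It assumes the Word By Word export is a standard .srt file with one word per cue; that format is an assumption, not something confirmed in this thread. The function names (`parse_srt`, `regroup`, `format_srt`) and the defaults (42 characters per line, 2 lines, 700 ms pause) are hypothetical choices for illustration, not settings taken from Auto Captions for Vegas or Subtitle Edit.

```python
import re
import textwrap
from dataclasses import dataclass

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    start: int  # milliseconds
    end: int    # milliseconds
    text: str


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp such as 00:01:02,345 to milliseconds."""
    h, m, s, ms = map(int, TIME_RE.search(stamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_stamp(ms: int) -> str:
    """Convert milliseconds back to an SRT timestamp."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def parse_srt(text: str) -> list[Cue]:
    """Parse SRT text into cues (assumed here: one word per cue)."""
    cues = []
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.strip().splitlines()
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue  # skip blocks without a timing line
        start, end = lines[idx].split("-->")
        cues.append(Cue(to_ms(start), to_ms(end), " ".join(lines[idx + 1:]).strip()))
    return cues


def wrap_lines(words: list[Cue], max_chars: int) -> list[str]:
    """Greedily wrap the words of one subtitle into lines of at most max_chars."""
    joined = " ".join(w.text for w in words)
    return textwrap.wrap(joined, width=max_chars,
                         break_long_words=False, break_on_hyphens=False)


def regroup(words: list[Cue], max_chars: int = 42, max_lines: int = 2,
            max_gap_ms: int = 700) -> list[Cue]:
    """Merge word cues into subtitles, starting a new one after a long pause
    or when the text would no longer fit in max_lines lines."""
    groups: list[list[Cue]] = []
    for word in words:
        if not word.text:
            continue
        if groups:
            last = groups[-1]
            gap = word.start - last[-1].end
            fits = len(wrap_lines(last + [word], max_chars)) <= max_lines
            if gap <= max_gap_ms and fits:
                last.append(word)
                continue
        groups.append([word])
    # Each subtitle keeps the start of its first word and the end of its last.
    return [Cue(g[0].start, g[-1].end, "\n".join(wrap_lines(g, max_chars)))
            for g in groups]


def format_srt(cues: list[Cue]) -> str:
    """Serialize cues back to SRT text with fresh numbering."""
    blocks = [f"{i}\n{to_stamp(c.start)} --> {to_stamp(c.end)}\n{c.text}"
              for i, c in enumerate(cues, 1)]
    return "\n\n".join(blocks) + "\n"


SAMPLE = """1
00:00:01,000 --> 00:00:01,300
Hej

2
00:00:01,300 --> 00:00:01,700
och

3
00:00:01,700 --> 00:00:02,400
välkomna

4
00:00:04,000 --> 00:00:04,500
Idag
"""

if __name__ == "__main__":
    print(format_srt(regroup(parse_srt(SAMPLE), max_chars=20)))
```

Because every subtitle takes its start time from its first word and its end time from its last word, the per-word timing is preserved at the subtitle boundaries. To use it on a real file, read the exported .srt with `open(path, encoding="utf-8")`, pass the text through `parse_srt` and `regroup`, and write the `format_srt` result to a new file before importing it into the timeline.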

Emma-Eriksson wrote on 5/14/2025, 7:22 AM

@joelsonforte.br Okay, I see the potential. What's the pricing?
Can you make a one-time payment or is it monthly? :)

joelsonforte.br wrote on 5/14/2025, 8:28 AM

@Emma-Eriksson

All the information you need is in this link: https://www.vegascreativesoftware.info/us/forum/auto-captions-for-vegas--147836/

Just send an email to joelsonforte.br@gmail.com confirming that you want to purchase.

The price is only 40 dollars. One-time payment via PayPal. Once the payment is confirmed, I will send the serial number. The contact email is also there.