AI speech-to-text is poor and generates wrong timestamps

Emma-Eriksson wrote on 5/7/2025, 9:28 AM

How come when I use speech-to-text, the AI analyses Swedish very poorly?

And how come the generated subtitle ends up in the wrong location (even though I generated it from Source Media, not the timeline)?

The end result is often that the subtitle starts way too early and then maybe stops when we stop talking in the video.

Br,

Emma

Comments

RogerS wrote on 5/7/2025, 10:31 AM

Could be a limitation of the underlying Microsoft Azure service: the online Swedish language dataset maybe isn't as big as the ones for English or other languages it does a better job on.

Emma-Eriksson wrote on 5/12/2025, 6:02 AM

The results are kinda the same as Vegas Pro. Not good enough. But thanks anyway.

RogerS wrote on 5/12/2025, 6:31 AM

Probably Swedish engineers need to do more to develop language models. You could try Subtitle Edit and see if any of the transcription engines handle it better. My guess is not.

Emma-Eriksson wrote on 5/14/2025, 7:22 AM

@Former user Okay, I see the potential. What's the pricing?
Can you make a one-time payment or is it monthly? :)