Text to Speech

Gazza131UK wrote on 1/28/2026, 3:54 AM

I have to say, after looking at the text-to-speech model in VP23, I am very disappointed by how robotic and unreal some of the voices sound. I mean, it's not even like it's new technology anymore. I think you guys should either remove it or up your game and do some rework on it.

Sorry to be negative, but I just wanted to be honest.

Comments

Dexcon wrote on 1/28/2026, 4:38 AM

I think you guys should either remove it or up your game and do some rework on it.

Text-to-Speech (and vice versa) is, I believe, sourced from Microsoft (possibly MS Azure) and, yes, some of the voices are quite robotic, but some, depending on the language/accent, offer a choice of emotions (e.g. happy, sad) for the dialogue being read.

Earlier today, I watched a YT video on the history of 20th century movie lots in California. It clearly used a US-accented AI text-to-speech read, and it was so bad that I stopped watching halfway through. It wasn't helped by the creator not allowing 'breath' room or a pause of any kind between sentences, paragraphs or chapters.

I understand that there are 3rd-party AI text-to-speech providers that have been suggested for Premiere Pro and DaVinci Resolve users, as neither of those NLEs natively features text-to-speech (and vice versa), unless that has recently changed with Premiere Pro. But then you'd need to check the source of the voices for those 3rd-party plugins to make sure that it's not also MS.

Cameras: Sony FDR-AX100E; GoPro Hero 11 Black Creator Edition; Samsung S23 Ultra smart phone

Installed: Vegas Pro 13, 15, 16, 17, 18, 19, 20, 21, 22 & 23, HitFilm Pro 2021.3, DaVinci Resolve Studio 20.3, BCC 2026, Mocha Pro 2026, NBFX TotalFX 7, Neat NR 6, DVD Architect 6.0, MAGIX Travel Maps, Sound Forge Pro 16, SpectraLayers Pro 12, iZotope RX11 Advanced and many other iZ plugins, Vegasaur 4.0

Windows 11 25H2

Dell Alienware Aurora 11:

10th Gen Intel i9 10900KF - 10 cores (20 threads) - 3.7 to 5.3 GHz

NVIDIA GeForce RTX 2080 SUPER 8GB GDDR6 - liquid cooled

64GB RAM - Dual Channel HyperX FURY DDR4 XMP at 3200MHz

C drive: 2TB Samsung 990 PCIe 4.0 NVMe M.2 PCIe SSD

D: drive: 4TB Samsung 870 SATA SSD (used for media for editing current projects)

E: drive: 2TB Samsung 870 SATA SSD

F: drive: 6TB WD 7200 rpm Black HDD 3.5"

Dell Ultrasharp 32" 4K Color Calibrated Monitor

 

LAPTOP:

Dell Inspiron 5310 EVO 13.3"

i5-11320H CPU

C Drive: 1TB Corsair Gen4 NVMe M.2 2230 SSD (upgraded from the original 500 GB SSD)

Monitor is 2560 x 1600 @ 60 Hz