Some good news: I have been looking into an alternative to "Vegas Pro Speech to Text". Although it is a very fine feature and easy to use, it has some drawbacks. First of all, it is a Vegas 365-only feature (I hope this may change), so people without a subscription cannot use it. Like all AI-based tools, it depends on the model, and results may vary. It also lacks a way to tune for quality and language, and I suspect it favors English.
So here is an alternative: OpenAI Whisper.
I have created a simple Vegas script that calls Whisper and converts speech to text. Just place the cursor over an event on the timeline, and the script will create result files containing the text. In a future version I can extend this to create subtitles on the timeline from these result files. Feel free to add this yourself, or to add more of Whisper's capabilities, such as the quality, language and translation options. Refer to the document on Whisper linked further down in this post.
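For anyone curious what "calling Whisper" looks like outside of Vegas, here is a minimal Python sketch of the same idea. It is not the Vegas script itself, only an illustration. It assumes the openai-whisper command-line tool is installed and on the PATH; the function names and the default model "small" are my own choices for the example.

```python
import subprocess


def build_whisper_command(audio_path, output_dir, model="small",
                          language=None, task="transcribe"):
    """Build the argument list for the whisper command-line tool.

    Passing a list (instead of one long string) means paths that contain
    spaces do not need any extra quoting.
    """
    cmd = [
        "whisper",
        str(audio_path),
        "--model", model,              # e.g. tiny, base, small, medium, large
        "--task", task,                # "transcribe", or "translate" (to English)
        "--output_format", "srt",      # write a subtitle file
        "--output_dir", str(output_dir),
    ]
    if language:
        # Without --language, whisper tries to detect the language itself.
        cmd += ["--language", language]
    return cmd


def transcribe(audio_path, output_dir, **options):
    """Run whisper and raise an error if it exits with a failure code."""
    subprocess.run(build_whisper_command(audio_path, output_dir, **options),
                   check=True)
```

A call such as transcribe("C:/media/my clip.wav", "C:/media", model="medium", language="nl") would then write the .srt result file into the given folder. The paths here are only examples.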
Latest Update 18/12/2023:
"Whisper Speech To Text v7":
- adds backward compatibility for scripting in Vegas 14, 15 and 16 (the old "Sony" Vegas UI plugin naming). Note: only tested in Vegas 21, not in 14, 15 or 16.
Here is the link to the latest Vegas script called "Whisper Speech To Text v7":
Older scripts:
"Whisper Speech To Text v6":
- sets the timeline timecode format to "Time" to prevent subtitle discrepancies on the timeline, and resets it to the user's original preference after the subtitle insert
Here is the link to the Vegas script called "Whisper Speech To Text v6":
"Whisper STT RAW v2" (variant):
- same script as v6, except that it keeps the original sentence layout of the .srt file "as is", without automatically adding a newline (like a word wrap) after 9 words.
Here is the link to the latest Vegas script called "Whisper STT RAW v2":
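To show the difference between the normal script and the RAW variant, here is a small Python sketch of a "newline after 9 words" wrap. This is only an illustration of the idea, not the code from the Vegas script; the function name is made up for the example, and it assumes each subtitle text is handled as one string.

```python
def wrap_after_words(text, max_words=9):
    """Insert a newline after every max_words words.

    Note: this re-flows the whole text, so line breaks that were already
    in the subtitle are not preserved.
    """
    words = text.split()
    lines = [" ".join(words[i:i + max_words])
             for i in range(0, len(words), max_words)]
    return "\n".join(lines)
```

The RAW variant corresponds to skipping this step and using the subtitle text exactly as Whisper wrote it.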
====================================================================================
The only caveat is that it takes quite a bit of effort to get Whisper installed. It depends on Python, Git, FFmpeg, etc., and on setting environment variables, so you need to install a bunch of supporting software before you can use Whisper. But it is doable. For this purpose I have put together a document on how to install and use Whisper (and the programs it depends on). It has all the links to get you up and running.
Here is the link to the document on OpenAI Whisper:
https://www.dropbox.com/s/dh62ripb58xth86/AI%20whisper.docx?dl=0
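After the installation, a quick check like the one below can tell you whether the supporting programs can be found through the PATH environment variable. This is a minimal sketch, not part of the Vegas script. The list of command names is an assumption: on some systems Python is started as "python3" or "py" instead of "python", so adjust the names to your setup.

```python
import shutil

# Command names are assumptions; change them to match your installation.
REQUIRED_TOOLS = ("python", "git", "ffmpeg", "whisper")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]
```

If find_missing_tools() returns an empty list, all four commands were found. Any name it does return points to a program that is either not installed or not added to the PATH.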
==================================================================================
Update history
Update 18/12/2023: "Whisper Speech To Text v7":
- adds backward compatibility for scripting in Vegas 14, 15 and 16 (the old "Sony" Vegas UI plugin naming). Note: only tested in Vegas 21, not in 14, 15 or 16.
Update 14/12/2023: "Whisper Speech To Text v6" & "Whisper STT RAW v2":
- sets the timeline timecode format to "Time" to prevent subtitle discrepancies on the timeline, and resets it to the user's original preference after the subtitle insert
Update 13/12/2023: "Whisper Speech To Text v5":
- Compatibility update: fixes the subtitle insert failure some users had because Whisper does not always name its result file the same way:
- for some users, Whisper saves filename + media type extension + .srt
- for others, Whisper saves filename + .srt (without the media type extension .wav, .mp4, etc.)
On request, here is a variant of the "Whisper Speech To Text v5" script called "Whisper STT RAW v1":
- same script as v5, except that it keeps the original sentence layout of the .srt file "as is", without automatically adding a newline (like a word wrap) after 9 words.
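The two file naming cases from the v5 compatibility update can be handled by simply looking for both names. Here is a minimal Python sketch of that idea. It is not the code from the Vegas script, and the function name is made up for the example.

```python
from pathlib import Path


def find_srt(media_path, output_dir):
    """Return the .srt file Whisper wrote for media_path, or None.

    Both naming styles are tried:
      clip.wav -> clip.wav.srt   (filename + media type extension + .srt)
      clip.wav -> clip.srt       (filename + .srt)
    """
    media = Path(media_path)
    out = Path(output_dir)
    candidates = [
        out / (media.name + ".srt"),
        out / (media.stem + ".srt"),
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

If both files happen to exist, this sketch prefers the name that includes the media type extension. That order is an arbitrary choice for the example.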
Update 11/12/2023: "Whisper Speech To Text v4":
- Small update (but also a big improvement and bug fix for some users) to support speech to text when the audio media is not located on the same drive as the Vegas project.
Update 10/12/2023: "Whisper Speech To Text v3":
- Very small update to support audio filenames with spaces in their names
Update 29/11/2022: "Whisper Speech To Text v2":
- Major update: I made a new improved script with UI to select different transcode model options, a translate option, and a UI option to import the subtitles from the generated files to a new track
11 November 2022: Original script "Whisper Speech To Text":
Have fun!