Speech to Text via Whisper openAI

Comments

Former user wrote on 3/4/2023, 9:03 PM

Subtitle Edit is ready to go for these implementations, so whenever we get a fixed or functional new one it can be added.

It's great software. I hope that if they make the change to GPU processing for whisper, it doesn't get the same problems as the GPU whisper versions.

Maybe run it by RX10 first, separate out the person you need, and use whisper next.

Great idea! 👍

RogerS wrote on 3/5/2023, 1:22 AM

For a non-Python Whisper that runs on CPU or GPU, you can grab it here: https://github.com/Purfview/whisper-standalone-win

It's not working in SubtitleEdit at the moment but works from the command prompt (run cmd as admin). It doesn't seem to have repeated lines.

Save it somewhere, put ffmpeg.exe in the same folder as whisper.exe, and change the command prompt to that folder ("cd C:\Whisper\", for example). Try this as a template (you can change the language, file location and model type):

whisper.exe --device cuda --language en --model "base" "C:\Videos\video name.mp4"
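
The same template can also be assembled from a script. Below is a minimal Python sketch, not part of the standalone download: the function name and its defaults are made up for illustration, and only the flags that appear in the template above (--device, --language, --model) are used. Passing the arguments as a list means the space in the file name needs no manual quoting.

```python
import subprocess

def build_whisper_command(media_path, language="en", model="base",
                          device="cuda", exe="whisper.exe"):
    """Return the argument list for the command-line template above.

    Hypothetical helper: only the flags from the template are used.
    """
    return [exe, "--device", device, "--language", language,
            "--model", model, media_path]

args = build_whisper_command(r"C:\Videos\video name.mp4")

# list2cmdline renders the list as a Windows-style command line,
# adding quotes around the argument that contains a space.
print(subprocess.list2cmdline(args))

# To actually run it, start the script from the folder that holds whisper.exe:
# subprocess.run(args, check=True)
```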

Former user wrote on 3/14/2023, 1:53 AM

This is the whisper variant I'm currently using: https://github.com/Dadangdut33/Speech-Translate/releases/tag/1.1.0

It seems pretty good. In this example it created subtitles for a 3-minute video using the large model in 1 minute (RTX 3080). It gets things almost perfect until about the 2-minute mark, where the timing starts to go off. I thought others interested in translation could use this as a barometer of sorts, and even download this video and compare the app version they're using. Russian has historically been difficult for whisper to handle well. If the whisper version you're using does a better job, let us know.

This uses GPU, maybe only Nvidia. It has no integration with any NLE.

Former user wrote on 3/27/2023, 11:31 PM

I tried the new version of StoryToolKit (Nvidia GPU only) by downloading the video in my last message and re-translating it. Top subs are the new translation. https://github.com/octimot/StoryToolkitAI/releases/tag/v0.17.16

It doesn't have the same timing problems seen with Speech-Translate, but on the negative side its formatting is not as good: instead of using multiple shorter sentences, it seems to prefer forming paragraphs. Neither option is perfect, but StoryToolKit in standalone mode (for Vegas users) is possibly the better choice; you just need to break up the subtitle paragraphs manually where required.

If your whisper translator does a better job, please share

wwaag wrote on 9/5/2023, 1:43 PM

Just wrote a new Batch WhisperAI Speech to Text tool and created a new thread. Here's the link https://www.vegascreativesoftware.info/us/forum/happyotter-batchwhisperai-speech-to-text--142423/

AKA the HappyOtter at https://tools4vegas.com/. System 1: Intel i7-8700k with HD 630 graphics plus an Nvidia RTX4070 graphics card. System 2: Intel i7-3770k with HD 4000 graphics plus an AMD RX550 graphics card. System 3: Laptop. Dell Inspiron Plus 16. Intel i7-11800H, Intel Graphics. Current cameras include Panasonic FZ2500, GoPro Hero11 and Hero8 Black plus a myriad of smartPhone, pocket cameras, video cameras and film cameras going back to the original Nikon S.

bitman wrote on 12/10/2023, 9:09 AM

@Former user I have adapted the script to support spaces in the audio filenames. It is in version 3, which you can download from the start page of this post. It is just a one-line change, so you could also adapt the v2 script yourself (around line 148):

sw.WriteLine("whisper " + myFile + modelOption); //temp remove for speed testing rest of APP

Change it to wrap the file name in escaped quotes (+ "\""), so that a space in the name no longer cuts the argument short:

sw.WriteLine("whisper " + "\"" + myFile + "\"" + modelOption); //temp remove for speed testing rest of APP
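
Why the quotes matter can be shown outside Vegas. The following is a small Python sketch (not the script's C#; the helper name is hypothetical, and model_option is assumed to start with a space, as the concatenation in the script suggests). It builds the command text with and without quotes and splits it the way a shell would. shlex follows POSIX rules rather than those of cmd.exe, so treat it only as an illustration of the effect.

```python
import shlex

def whisper_line(my_file, model_option, quote=True):
    """Build the command text the script writes out (hypothetical helper)."""
    name = '"' + my_file + '"' if quote else my_file
    return "whisper " + name + model_option

unquoted = whisper_line("my holiday clip.wav", " --model base", quote=False)
quoted = whisper_line("my holiday clip.wav", " --model base")

# Without quotes the file name falls apart into three separate arguments.
print(shlex.split(unquoted))

# With quotes it stays a single argument.
print(shlex.split(quoted))
```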

 

APPS: VIDEO: VP 365 suite (VP 22 build 250) VP 21 build 315, VP 365 20, VP 19 post (latest build -651), (uninstalled VP 12,13,14,15,16 Suite,17, VP18 post), Vegasaur, a lot of NEWBLUE plugins, Mercalli 6.0, Respeedr, Vasco Da Gamma 17 HDpro XXL, Boris Continuum 2025, Davinci Resolve Studio 18, SOUND: RX 10 advanced Audio Editor, Sound Forge Pro 18, Spectral Layers Pro 10, Audacity, FOTO: Zoner studio X, DXO photolab (8), Luminar, Topaz...

  • OS: Windows 11 Pro 64, version 24H2 (since October 2024)
  • CPU: i9-13900K with Air Cooler: Noctua NH-D15 G2 HBC
  • RAM: DDR5 Corsair 64GB (5600-40 Vengeance)
  • Graphics card: Gigabyte GeForce RTX 5090 Aorus Xtreme WF AIO 32GB
  • Monitor: LG UltraGear 45GX950A 44.5" WUHD 5K2K OLED monitor (21:9), Resolution: 5120x2160, 165 Hz
  • C-drive: Corsair MP600 PRO XT NVMe SSD 4TB (PCIe Gen. 4)
  • Video drives: Samsung NVMe SSD 2TB (980 pro and 970 EVO plus) each 2TB
  • Mass Data storage & Backup: WD gold 6TB + WD Yellow 4TB
  • MOBO: Gigabyte Z690 AORUS MASTER
  • PSU: Corsair HX1500i, Case: Fractal Design Define 7 (PCGH edition)
  • Misc.: Logitech G915, Evoluent Vertical Mouse, shuttlePROv2

 

 

bitman wrote on 12/11/2023, 7:30 AM

@Former user 

Version 4 should fix your issues! See post start.

Latest Update 11/12/2023:

I made a small update (but also a big improvement and bug fix for some users, @Joelson) to support speech to text when the audio media is not on the same drive as the Vegas project.

By the way, having the media and the Vegas project together on a drive other than C: did work in the previous versions (I tested this, hence some confusion), but apparently not when the Vegas project itself was on a different drive than the media...

Last changed by bitman on 12/11/2023, 7:31 AM, changed a total of 1 times.

bitman wrote on 12/11/2023, 2:29 PM

@Former user Strange, but filename + file type extension + .srt extension is the naming the script expects. On my PC, whisper generates exactly that, the script uses it, and it just works...

Last changed by bitman on 12/11/2023, 2:29 PM, changed a total of 1 times.

bitman wrote on 12/11/2023, 3:19 PM

I will have a look tomorrow for a specific solution for you if possible, it is getting late in Belgium's timezone!

bitman wrote on 12/12/2023, 7:47 AM

@Former user I have version v5 of the script ready. It should work for both our machines:

  • for setups where whisper saves filename + media type extension + .srt
  • for setups where whisper saves filename + .srt (without the media type extension .wav, .mp4, etc.)

Not sure why whisper behaves differently; maybe you have another version or a different install of all the stuff that is needed to make whisper work.

Anyway, the solution was to copy the .srt file saved without the media type extension to an .srt file named with the media type extension, so the rest of the script works. This is done only when the expected file does not exist: the script strips .srt and reconstructs the full path + media type + .srt.
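
The fallback described above can be sketched in a few lines. This is a Python illustration of the idea only, not the v5 script itself; the helper name is hypothetical and the file names are examples.

```python
import shutil
import tempfile
from pathlib import Path

def ensure_srt_with_media_extension(media_path):
    """Return the path 'name.ext.srt', copying it from 'name.srt' if needed.

    Hypothetical helper mirroring the fallback described above.
    """
    media = Path(media_path)
    expected = media.with_name(media.name + ".srt")  # e.g. clip.wav.srt
    fallback = media.with_suffix(".srt")             # e.g. clip.srt
    # Copy only when the expected file is missing and the short-named one exists.
    if not expected.exists() and fallback.exists():
        shutil.copyfile(fallback, expected)
    return expected

# Small demonstration in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    media = Path(folder) / "clip.wav"
    (Path(folder) / "clip.srt").write_text("1\n00:00:00,000 --> 00:00:01,000\nHello\n")
    result = ensure_srt_with_media_extension(media)
    print(result.name, result.exists())
```

An existing file with the media type extension is left untouched, so setups where whisper already saves the longer name are not affected.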

Last changed by bitman on 12/12/2023, 7:57 AM, changed a total of 1 times.

bitman wrote on 12/12/2023, 9:08 AM

@Former user You owe me a beer!

Around line 649 in the v5 script (if you open it with the free Notepad++), you will see the following:

            if (spaces == 9)  //seems optimal for ENGLISH

You can replace 9 with a higher number; this allows more spaces in a line of text (a crude way of measuring sentence length) before a newline is inserted.
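
The surrounding script code is not shown in this thread, so the following is only a guess at the idea, written as a small Python sketch with a hypothetical function name: count spaces while walking through the text and turn every Nth space into a newline.

```python
def wrap_by_spaces(text, max_spaces=9):
    """Replace every max_spaces-th space with a newline (crude word wrap)."""
    out = []
    spaces = 0
    for ch in text:
        if ch == " ":
            spaces += 1
            if spaces == max_spaces:
                out.append("\n")  # break the line here instead of writing the space
                spaces = 0
                continue
        out.append(ch)
    return "".join(out)

line = "the quick brown fox jumps over the lazy dog and then runs away again"
print(wrap_by_spaces(line))                 # breaks after every 9th space
print(wrap_by_spaces(line, max_spaces=12))  # a higher number gives longer lines
```

Raising the threshold gives longer subtitle lines; lowering it gives shorter ones.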

Last changed by bitman on 12/12/2023, 9:09 AM, changed a total of 1 times.

jetdv wrote on 12/12/2023, 9:26 PM

@Former user, One thing you probably need to do is make sure your timeline timecode format matches the SRT file (i.e. Time). See if that makes a difference. If it does, the script can change to that format and then change back to the current format at the end as we did with the other scripts you were working with.

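public RulerFormat OrgRulerFormat; // field that remembers the user's ruler format

// Before inserting the subtitles: save the current format, then switch to Time
OrgRulerFormat = myVegas.Project.Ruler.Format;
myVegas.Project.Ruler.Format = RulerFormat.Time;
myVegas.UpdateUI();

// After the subtitles are inserted: restore the original format
myVegas.Project.Ruler.Format = OrgRulerFormat;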
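
For context, cue times in an SRT file are written as hours:minutes:seconds,milliseconds, which is why a time-based ruler format is the natural match. Here is a minimal Python sketch (illustrative only, not part of either script; the function names are hypothetical) that converts such cue times to milliseconds:

```python
import re

# SRT uses a comma before the milliseconds; a period is accepted here as leniency.
_SRT_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")

def srt_time_to_ms(stamp):
    """Convert 'HH:MM:SS,mmm' to a number of milliseconds."""
    match = _SRT_TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError("not an SRT time: " + repr(stamp))
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis

def parse_cue_times(line):
    """Split a cue line 'start --> end' into two millisecond values."""
    start, end = line.split("-->")
    return srt_time_to_ms(start), srt_time_to_ms(end)

print(parse_cue_times("00:00:01,000 --> 00:00:04,250"))
```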

 

bitman wrote on 12/13/2023, 8:21 AM

@Former user I have added a new script, "Whisper STT RAW v1" (see beginning of post). This is basically the same as the v5 script, but it omits the word-wrap optimization after 9 words and as such keeps the original srt layout.

jetdv wrote on 12/13/2023, 9:17 PM

Here are the changes needed to switch it to "Time" and then back to whatever it was:

https://www.vegascreativesoftware.info/us/forum/speech-to-text-via-whisper-openai--137928/?page=3#ca900398

 

bitman wrote on 12/14/2023, 4:39 AM

@Former user @jetdv Good find, I never realized that one of your issues was related to a specific user setting of the timeline timecode. Mine was set to yet another variant, "Time & Frames"; this one, however, pretty much behaved like "Time" for the scripts. To avoid issues and to improve the script, I have added @jetdv's code to set the timeline timecode to "Time" when adding subtitles and to restore the user preference after the insert.

See the start of this post for script version v6 (and the RAW variant v2). I have removed the links to the older scripts to clean up a bit.

zzzzzz9125 wrote on 12/14/2023, 10:28 AM

Hey, I've found two problems.

1. In Vegas Pro 16 and earlier, the script can't generate Titles & Text events properly, because the generator's GUID changed starting with version 17. Just change {Svfx:com.vegascreativesoftware:titlesandtext} in your script to {Svfx:com.sonycreativesoftware:titlesandtext}, so that it can be used in 16 and earlier without affecting the functionality in newer versions.

2. When I click the Balanced button, it can generate the .srt file (with other files) normally, but when I click Draft(fast), nothing is generated in the folder. I don't know what's going on.

Last changed by zzzzzz9125 on 12/14/2023, 10:28 AM, changed a total of 1 times.

Using VEGAS Pro 22 build 250 & VEGAS Pro 21 build 208.

Information about my PC:
Brand Name: HP VICTUS Laptop
System: Windows 11.0 (64-bit) 10.00.22631
CPU: 12th Gen Intel(R) Core(TM) i7-12700H
GPU: NVIDIA GeForce RTX 3050 Laptop GPU
GPU Driver: NVIDIA Studio Driver 560.70

bitman wrote on 12/14/2023, 11:56 AM

@zzzzzz9125 Thanks, useful information for those who are on Vegas Pro 16 (or even older versions). Feel free to change the script for yourselves, but I am not going to do it, for the simple reason that I can no longer test it on version 16. I upgrade every year and keep at most 1 or 2 prior versions after the update; the oldest one usually gets the axe when I upgrade!

jetdv wrote on 12/15/2023, 9:14 AM

@bitman, for my scripts that need to work in "14 and newer", I did this:

            if (myVegas.Version.Contains("14") || myVegas.Version.Contains("15") || myVegas.Version.Contains("16"))
            {
                genUID = "{Svfx:com.sonycreativesoftware:titlesandtext}"; //Sony Titles & Text
            }
            else
            {
                genUID = "{Svfx:com.vegascreativesoftware:titlesandtext}"; //Magix Titles & Text
            }
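
One thing to watch with a substring test: if the version text also carries a build number, the digits of the build can match as well. The Python sketch below illustrates the point and an alternative that looks only at the major version. It assumes the version text looks like "Version 21.0 (Build 315)", which is an assumption about the format rather than something stated in this thread, and the function names are hypothetical.

```python
import re

SONY_UID = "{Svfx:com.sonycreativesoftware:titlesandtext}"    # Vegas 16 and earlier
MAGIX_UID = "{Svfx:com.vegascreativesoftware:titlesandtext}"  # Vegas 17 and later

def major_version(version_text):
    """Return the first number in the version text, or None if there is none."""
    match = re.search(r"\d+", version_text)
    return int(match.group()) if match else None

def titles_and_text_uid(version_text):
    """Pick the generator UID from the major version only."""
    major = major_version(version_text)
    if major is not None and major <= 16:
        return SONY_UID
    return MAGIX_UID

# Assumed version text; the build number happens to contain "15".
sample = "Version 21.0 (Build 315)"
print("15" in sample)               # a plain substring test matches the build digits
print(titles_and_text_uid(sample))  # the major-version check picks the Magix UID
```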

 

bitman wrote on 12/18/2023, 2:51 AM

@jetdv Thanks, I added the code in v7

Update 18/12/2023: "Whisper Speech To Text v7"

  • add backward compatibility for scripting in Vegas 14, 15 and 16 (the old "Sony" Vegas UI plugin naming). Note: only tested in Vegas 21, not in 14, 15 or 16.

@zzzzzz9125 Version v7 is there for you to try; I could not test it on these older non-Magix Vegas versions!

bvideo wrote on 12/19/2023, 3:18 PM

Fantastic!

pierre-k wrote on 12/20/2023, 4:36 AM

I know that if I want to export subtitles from Vegas to srt or sub, I have to use Vegasaur.

Has anything changed for the better over the years? How do you export these subtitles?

bitman wrote on 1/15/2024, 9:22 AM

@Former user I do not really use speech to text (I do use a lot of text to speech). The whisper thing was just a fun project I wanted to try out, so I do not recall trying any audio longer than a few minutes. So it may have always been an issue, I am not sure. Anyway, I think it is always safer to cut stuff (video or audio) into smaller chunks, which are more manageable and easier on your system to process.

On the other hand, there may be some internal "safety" timing in the Vegas scripting engine that I do not know of, which may time out if the script engine idles too long whilst whisper is busy processing. If that is the case, the script may need extra code to keep itself busy!

Maybe @jetdv can shed some light on this (whether scripts can time out when idling).

Last changed by bitman on 1/15/2024, 9:23 AM, changed a total of 1 times.

jetdv wrote on 1/15/2024, 10:41 AM

The script itself should have no issue. However, newer versions of Vegas have started including something that checks for a lack of response so Vegas can shut down properly if it gets stuck, so that might be coming into play, depending on the version used.

Kilo wrote on 5/17/2024, 9:48 PM

Hello! I'm hoping you can do me a huge favor. I followed your tutorial on how to do auto subtitles in Vegas Pro without the 365 service, and after many hours of trial and error (my fault) I got it to work. Now, unfortunately, I am seeing another problem which I never expected! I make vertical videos, and the subtitles I create are only 1-3 words long, not entire sentences like Whisper AI outputs. I've seen some people online suggesting changing the source code, but I'm not sure I can do that, since I downloaded it from the console like your tutorial showed. Do you have any suggestions on how I should fix this problem? Thanks a lot for any help you can give me. I know your time's valuable, so I truly appreciate it.