AutoPod-like scripts

Gastor wrote on 11/30/2023, 7:20 PM

As someone who edits podcasts, the tools created by AutoPod, mainly their Multi-cam Editor, save me a ton of time. But they require me to work in Adobe Premiere to edit the podcast I'm working on.

I was wondering if there are any scripts that do what their extensions do, but for Vegas?

1) Their multicam editor cuts between multiple speakers and changes which camera is shown based on who is speaking.

2) Their jumpcut editor cuts out silent parts of a conversation.

3) Their social media editor can reframe clips to different sizes and use keyframes to keep things centered, as well as add in a logo/music for your brand towards the end of a clip.


Is anything like this out there for Vegas Pro? If not, would any of this be possible to script for a newbie script-creator?


jetdv wrote on 12/1/2023, 8:44 AM

@Gastor, I know you sent me an e-mail related to this too so I'll just answer here. Personally, I have not really thought about this. There are definitely some hurdles that would have to be crossed.

How do you know which person is speaking? (I'm not sure how you're determining that in Premiere unless their speech-to-text can separate that.) If each person's speaking was on a separate track, then it would be fairly easy to switch cameras between them. Basically, use track 1 if there's an event on the first person's audio track and use track 2 for events on the second person's audio track - or something of that nature. More information would be needed here, though.
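
To make the "one audio track per speaker" idea concrete, here is a rough Python sketch of the selection logic only. It is not VEGAS script code (VEGAS scripts are written against its .NET API), and the names `camera_cuts` and `speaker_events` are made up for illustration. It assumes each speaker's audio track has already been reduced to a list of (start, end) events in seconds, and that camera N belongs to speaker N.

```python
from typing import List, Tuple

Event = Tuple[float, float]        # (start_seconds, end_seconds) of one audio event
Cut = Tuple[float, float, int]     # (start_seconds, end_seconds, camera_index)


def camera_cuts(speaker_events: List[List[Event]]) -> List[Cut]:
    """Turn per-speaker audio events into a single list of camera cuts.

    speaker_events[i] holds the events on speaker i's audio track, and
    camera i is assumed to be the camera pointed at speaker i.
    """
    # One candidate cut per audio event, ordered by start time.
    cuts = sorted(
        (start, end, camera)
        for camera, events in enumerate(speaker_events)
        for start, end in events
    )

    trimmed: List[Cut] = []
    for start, end, camera in cuts:
        # If the previous cut runs past this one's start, end it here so the
        # newer speaker takes over. (Simplification: the earlier speaker does
        # not resume after the interruption.)
        if trimmed and trimmed[-1][1] > start:
            prev_start, _, prev_camera = trimmed[-1]
            trimmed[-1] = (prev_start, start, prev_camera)
        trimmed.append((start, end, camera))

    # Drop cuts that were trimmed down to zero length.
    return [cut for cut in trimmed if cut[1] > cut[0]]


if __name__ == "__main__":
    events = [
        [(0.0, 5.0), (12.0, 15.0)],  # speaker 0 / camera 0
        [(4.0, 10.0)],               # speaker 1 / camera 1
    ]
    for cut in camera_cuts(events):
        print(cut)
```

Gaps where nobody has an event (10 to 12 seconds in the example) are simply left uncovered, which is where a wide shot could go.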

Cutting out "silent" parts is doable, as I have an entire series of tutorials that show how that can be done. It uses the render "loudness" log, so a method that searches for speech and finds the sections where no one is talking might work better.
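
As a rough illustration of the threshold idea, here is a small Python sketch. It does not parse the actual VEGAS loudness log; it assumes the readings have already been loaded as (time in seconds, loudness in dB) pairs, and `find_silences`, `threshold_db` and `min_duration` are hypothetical names and defaults that would need tuning per recording.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # (time_seconds, loudness_db)
Span = Tuple[float, float]     # (start_seconds, end_seconds)


def find_silences(samples: List[Sample],
                  threshold_db: float = -40.0,
                  min_duration: float = 0.5) -> List[Span]:
    """Return the spans where loudness stays below threshold_db
    for at least min_duration seconds."""
    silences: List[Span] = []
    start = None       # start time of the quiet run in progress, if any
    last_time = None   # time of the most recent sample seen

    for time, level in samples:
        if level < threshold_db:
            if start is None:
                start = time            # a quiet run begins here
        else:
            # A loud sample ends the quiet run; keep it if long enough.
            if start is not None and time - start >= min_duration:
                silences.append((start, time))
            start = None
        last_time = time

    # Handle a quiet run that lasts until the final sample.
    if start is not None and last_time - start >= min_duration:
        silences.append((start, last_time))
    return silences


if __name__ == "__main__":
    readings = [(0.0, -20.0), (0.5, -60.0), (1.0, -65.0),
                (1.5, -18.0), (2.0, -70.0), (2.2, -15.0)]
    print(find_silences(readings))
```

The spans it returns are the sections a script could then cut out of the timeline; the short dip at 2.0 seconds is ignored because it is shorter than `min_duration`.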

As for reframing clips, you could just use Pan/Crop, Track Motion, the "Crop" effect and the "Picture in Picture" effect to do that so not sure why you'd need a separate "tool". As for adding a logo, that would also be no issue but there would simply need to be a definition of what is to be added where.

Gastor wrote on 12/2/2023, 10:58 PM

Thanks for getting back to me, @jetdv. In terms of switching to who is speaking - you're right - it does so by having each camera angle paired to a vocal track. You enter who is on which camera and which mic is picking up their volume. It works with up to 10 speakers and 10 cameras. So in my setup, I use 2 audio tracks and 3 cameras: 1 camera and audio track for each speaker on the pod, and a third camera as a wide shot capturing both. It then does a good job of understanding who to cut to based on the audio. The settings can be adjusted to cut to the wide shot more or less, depending on taste. I've seen it cut an hour-long podcast, featuring 3 speakers and 4 cameras, within 4 minutes, and require minimal adjustments to its cuts on my part. This gives me a great rough edit that frees up a ton of my time, and lets me focus on other aspects of the video (layout, graphics popups etc.) that really enhance the production value of the pod. If you're into audio, it basically functions as a sidechain would in music, ducking a bassline when the kick drum hits. In this case, it "ducks" one video feed for another, based on which audio track features dialogue.
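
To illustrate the kind of logic I mean (this is my own guess at it in Python, not how AutoPod actually works), here is a sketch for a setup like mine. It assumes the audio has already been measured as one loudness value in dB per mic for each fixed-length window; `pick_camera`, `build_segments`, the -35 dB threshold and the wide camera index are all hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float, int]  # (start_seconds, end_seconds, camera_index)


def pick_camera(levels: Sequence[float], threshold_db: float,
                wide_camera: int) -> int:
    """Pick the camera for one window of per-mic loudness levels.

    Mic i is assumed to be paired with camera i. If exactly one mic is
    above the threshold, show that speaker; otherwise (nobody talking,
    or several people talking at once) fall back to the wide shot.
    """
    active = [i for i, level in enumerate(levels) if level >= threshold_db]
    if len(active) == 1:
        return active[0]
    return wide_camera


def build_segments(windows: List[Sequence[float]], window_seconds: float,
                   threshold_db: float = -35.0,
                   wide_camera: int = 2) -> List[Segment]:
    """Merge consecutive windows that use the same camera into segments."""
    segments: List[Segment] = []
    for index, levels in enumerate(windows):
        camera = pick_camera(levels, threshold_db, wide_camera)
        start = index * window_seconds
        end = start + window_seconds
        if segments and segments[-1][2] == camera:
            # Same camera as the previous window: extend that segment.
            segments[-1] = (segments[-1][0], end, camera)
        else:
            segments.append((start, end, camera))
    return segments


if __name__ == "__main__":
    # Two mics (speakers 0 and 1), one-second windows, camera 2 is the wide.
    levels = [(-20, -60), (-22, -58), (-60, -18), (-19, -21), (-70, -70)]
    for segment in build_segments(levels, window_seconds=1.0):
        print(segment)
```

Raising or lowering the threshold changes how often it falls back to the wide shot, which is roughly the kind of setting I was describing.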


Considering Vegas has script capabilities and audio plugins, I feel like this should be something that can be created for it. I also think Vegas is focusing on smaller creatives, and I think tools that would help podcast editors fall into that, so I'm hopeful someone sees the value of what AutoPod is doing.

jetdv wrote on 12/3/2023, 7:36 AM

@Gastor I don't know of any tools in VEGAS that split the audio per person. If you can get the split audio on the timeline in VEGAS, a script can certainly adjust the video to match (even with settings for a third camera). I don't know if the "whisper" audio translation tools used by some on this forum are capable of distinguishing between speakers and labeling/separating them.

Take a look at this tutorial:

I believe the person requesting this was actually using Premiere to split the audio and pick the different speakers. Then this script would take, for example, all of "Speaker 1" and move those blocks to a new track. Once "Speaker 1" and "Speaker 2" are on different tracks, it's fairly trivial to select which camera to use. Instead of "ducking" cameras, what I typically do is add a new track and add the camera snippets to that new track. In fact, you can even add the other camera angles as "takes", making it easy to go to any section and press "T" to switch between the angles.
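
The "move each speaker's blocks to their own track" step boils down to grouping by label. Here is a minimal Python sketch of just that grouping; it is not the script from the tutorial and does not touch the VEGAS API. `split_by_speaker` is a made-up name, and the blocks are assumed to arrive as (start, end, speaker label) tuples.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Block = Tuple[float, float, str]   # (start_seconds, end_seconds, speaker_label)
Span = Tuple[float, float]         # (start_seconds, end_seconds)


def split_by_speaker(blocks: List[Block]) -> Dict[str, List[Span]]:
    """Group labelled audio blocks into one list of spans per speaker,
    as if each speaker's blocks had been moved to a separate track."""
    tracks: Dict[str, List[Span]] = defaultdict(list)
    for start, end, speaker in blocks:
        tracks[speaker].append((start, end))
    return dict(tracks)


if __name__ == "__main__":
    labelled = [
        (0.0, 4.0, "Speaker 1"),
        (4.0, 9.0, "Speaker 2"),
        (9.0, 12.0, "Speaker 1"),
    ]
    for speaker, spans in split_by_speaker(labelled).items():
        print(speaker, spans)
```

Each list of spans then tells you where that speaker's camera should be placed on the new video track.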

Gastor wrote on 12/4/2023, 8:53 AM

You're correct. (Sorry if I wasn't explaining it clearly earlier.) They do require you to use different audio tracks, with each speaker's/camera's audio being captured separately. That way, matching the camera to the speaker is easy, as the audio track that each camera angle is linked to only has the dialogue from one speaker.

This promo video shows the edit, and the speed of the cuts is a real-time example of how fast it goes through the podcast. It's incredible. I'd love to see a version of this for Vegas.

The tutorial you shared is impressive. While it's not what's needed to create what AutoPod is doing, it does give me hope that it's possible. Thanks for sharing it. It opens up a few other ideas.


jetdv wrote on 12/4/2023, 9:06 AM

@Gastor, would you give me a sample on the VEGAS timeline so I can see what the original source would look like in order to generate the final output?

I'd need the original video/audio files as well for testing.

Gastor wrote on 12/4/2023, 9:40 AM

Yes I can, @jetdv. I'll have some stuff ready for you today.

Gastor wrote on 12/5/2023, 9:58 AM

Let me know if you got my DM with the link, @jetdv. I'm excited to hear your thoughts on what's possible.

jetdv wrote on 12/5/2023, 1:16 PM

@Gastor, I did receive it and have downloaded the files (haven't had a chance to look at them yet), but it might be next week before I can really start to look at this. It's a busy season... Thanks for the files, as they will definitely help as I start looking into this.

Gastor wrote on 12/31/2023, 2:20 PM

btw @jetdv, it looks like someone is working on something similar for Resolve and Final Cut.

This one is technically a standalone app that would render a final project file that could then be loaded into Resolve or Final Cut. So it's not exactly the same as AutoPod, which is a plugin for Premiere.

Would love to see what you come up with for Vegas.

jetdv wrote on 12/31/2023, 2:43 PM

@Gastor Cool. I wonder if it's the same person? Sounds like they're mainly Mac editors.

Gastor wrote on 2/13/2024, 1:22 PM

I spoke with Mo from podmate this past week. He's made some great progress. He's building something separately from the AutoPod team. I got to demo the current beta version of his podmate, which does the primary thing that AutoPod does - it uses the vocal tracks from each video track to switch between active speakers - and he's also figured out how to occasionally do a split screen with both speakers as an establishing shot. This is awesome for podcast editors. Unlike AutoPod, his won't be a plugin or a script; instead, it's a separate app that spits out a new XML file that can be loaded in Resolve or Vegas. Would love to see what you're able to come up with, @jetdv.

jetdv wrote on 2/13/2024, 1:28 PM

@Gastor, I seriously haven't been able to even begin looking at this. If you need help importing the XML, I'm sure I could help with that as it wouldn't take nearly as long. But to do everything required for a full script, that would certainly take a lot of time that I just don't have right now. If your person wanted to talk to me about possibly automating the access to his product from a script in VEGAS, that might also be a possibility.