Needs Auto lip-sync Animation Script

chihirobelmo wrote on 12/22/2015, 10:06 PM
Hi,

I have an idea for an automatic lip-sync script, but I'm wondering whether the Vegas Pro scripting API allows it.
My idea is as follows:

1. Draw a human face without a mouth, then save it as a PNG.
2. Draw 3 or more pictures of a mouth (e.g. closed, half-open, and full-open), then save them as transparent PNGs.
3. Open Vegas Pro and start a new project. Add face.png to "trackA", add voice.wav to "trackB", then create "trackC" above "trackA".

4. Run the script. It reads "trackB" from start to end and measures the current gain every 1/8 or 1/12 of a second.
5. While step 4 is running, the script also adds to "trackC" whichever mouth picture matches the current gain.

(Example: the script first adds the closed-mouth picture as an event.
It extends that event until the current gain rises above a threshold.
It then adds the half-open picture and extends that event until the gain either falls back below the threshold, in which case the script adds a closed-mouth picture again, or reaches a second threshold, in which case it adds the full-open picture.)
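
The thresholding in steps 4 and 5 could be sketched roughly like this. It is plain Python rather than a Vegas script, it does not touch the Vegas Pro API at all, and the function name and the two threshold values are made up for illustration. It only turns a list of per-window levels into (start, length, picture) entries that would then have to be placed on "trackC" by some other means.

```python
def mouth_events(levels, step, half_open=0.1, full_open=0.4):
    """Turn per-window levels into (start, length, picture) events.

    levels    -- one loudness value per window, e.g. 0.0 to 1.0
    step      -- window length in seconds (1/8 or 1/12 in the idea above)
    half_open -- hypothetical threshold for the half-open picture
    full_open -- hypothetical threshold for the full-open picture
    """
    events = []
    for i, level in enumerate(levels):
        # Pick the picture that fits the level of this window.
        if level >= full_open:
            picture = "full-open"
        elif level >= half_open:
            picture = "half-open"
        else:
            picture = "closed"

        if events and events[-1][2] == picture:
            # Same picture as the previous window: extend that event.
            start, length, _ = events[-1]
            events[-1] = (start, length + step, picture)
        else:
            # Level crossed a threshold: start a new event here.
            events.append((i * step, step, picture))
    return events


if __name__ == "__main__":
    for event in mouth_events([0.0, 0.05, 0.2, 0.5, 0.5, 0.2, 0.0], 0.125):
        print(event)
```

Each new event starts exactly where the previous one ends, so the mouth track has no gaps. The thresholds would need tuning by ear for a given voice recording.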

I looked through VegasScriptAPI.htm, but I only found a property that gets or sets the volume gain of an audio track. It doesn't seem able to report the level at each moment in time.

I really don't have any experience with C# (I have only used MATLAB), but if there is a way I'd like to try, and any help would be greatly appreciated.

Comments

JohnnyRoy wrote on 12/23/2015, 8:28 AM
You would need access to the audio stream and the Vegas Pro Script API doesn't give you this. The best you can do is open the audio file yourself and read it and determine what to do from there. You would definitely need to learn C# and be skilled in processing audio files.
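
As a rough idea of what "open the audio file yourself and read it" involves, here is a minimal sketch. It is in Python rather than C#, uses only the standard library, and assumes the voice track is an uncompressed 16-bit PCM WAV file. The function name and the default of 8 windows per second are just illustrative choices, not anything from the Vegas API.

```python
import array
import math
import sys
import wave


def window_levels(path, windows_per_second=8):
    """Return one RMS level (0.0 to 1.0) per window of a 16-bit PCM WAV file."""
    levels = []
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames_per_window = wav.getframerate() // windows_per_second
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break  # end of file
            # Signed 16-bit samples; stereo channels stay interleaved,
            # which is fine for an overall loudness estimate.
            samples = array.array("h", raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV sample data is little-endian
            mean_square = sum(s * s for s in samples) / len(samples)
            levels.append(math.sqrt(mean_square) / 32768.0)
    return levels
```

Calling window_levels("voice.wav") would give one number per 1/8 second, which can then be compared against thresholds. Compressed formats such as MP3 would need a decoder first.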

Personally, I would just buy a copy of CrazyTalk and use that.

~jr

chihirobelmo wrote on 12/27/2015, 6:23 AM
Thank you, JohnnyRoy.

It seems that using CrazyTalk, as you suggested, is a much easier way to do lip-sync.

I also found that BorisFX has an audio-driven visualizer, so there might be a way to access the audio stream, but writing a video FX plug-in must be much harder and would take many hours.