[I]"But how do you sandwich the text between two people or things, as seen here in the fast food bit?"[/I]
You just create a mask that "cuts out" the text wherever it crosses something you want it to appear to go behind. Vegas has all the tools you need to do this, although it is easier in dedicated compositing applications.
Once you have resolved the motion track for your text and layered it in Vegas, you can use luminance as the mask. In this shot the shirt is whitish, so key on that brightness and let it cut the text out. The dark bits would work the same way. You don't have to physically mask each frame, because the colour itself is the mask.
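If it helps to see the idea outside of Vegas, here is a rough sketch of a luminance mask in Python with NumPy. This is not how Vegas does it internally, just the same principle: the names `luma` and `composite_text_behind` are made up for illustration, the 0.8 threshold is an arbitrary guess you would tune per shot, and it assumes RGB frames stored as floats between 0 and 1.

```python
import numpy as np

def luma(frame):
    """Approximate brightness (Rec. 601 weights) of an RGB float image in [0, 1]."""
    return frame @ np.array([0.299, 0.587, 0.114])

def composite_text_behind(frame, text_rgb, text_alpha, threshold=0.8):
    """Lay text over a frame, hiding it wherever the frame is brighter than threshold.

    frame, text_rgb: arrays of shape (H, W, 3); text_alpha: array of shape (H, W).
    The threshold is an assumed value, not something taken from Vegas.
    """
    foreground = luma(frame) > threshold   # True on the bright "shirt" pixels
    alpha = text_alpha * ~foreground       # knock the text out under those pixels
    alpha = alpha[..., None]               # broadcast alpha over the RGB channels
    return alpha * text_rgb + (1.0 - alpha) * frame
```

Run per frame, the bright pixels always stay on top of the text, which is why no hand-drawn mask is needed. To key on the dark bits instead, flip the comparison to `<` and pick a low threshold.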