There is no catch-all recipe for this, or even anything close to one.
You will have to address each event independently, identify what is wrong with it, and devise a plan of attack based on that. Plan on expending more effort than you did on the video if you expect professional (or even tolerable) results.
At the very least, you might try turning on the 'normalize' switch for each event (you should be able to multi-select before doing so). This will match the peak levels of all of the events, and a slight crossfade between events (if possible) will disguise any jumps. If your sound events are REALLY all over the place, you probably won't be able to avoid doing some hands-on work on each.
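For anyone curious what those two steps amount to under the hood, here is a rough Python sketch, not what your editor actually runs. It assumes each event is just a list of float samples between -1.0 and 1.0, and the names `peak_normalize` and `crossfade` are made up for illustration. Note that it matches peaks only, which is not the same thing as matching perceived loudness.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent clip: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def crossfade(a, b, overlap):
    """Join clip a to clip b, blending linearly over `overlap` samples."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return list(a) + list(b)
    head = list(a[:len(a) - overlap])
    tail = list(b[overlap:])
    blended = []
    for i in range(overlap):
        t = (i + 1) / (overlap + 1)  # weight shifts from clip a toward clip b
        blended.append(a[len(a) - overlap + i] * (1 - t) + b[i] * t)
    return head + blended + tail


# Two hypothetical events recorded at very different levels.
quiet = [0.1, -0.2, 0.15, -0.05]
loud = [0.8, -0.9, 0.7, -0.6]

# Bring both to the same peak, then overlap them by two samples.
events = [peak_normalize(clip, target_peak=0.9) for clip in (quiet, loud)]
joined = crossfade(events[0], events[1], overlap=2)
```

The linear blend is only the simplest choice; real editors usually offer several fade curves, so treat this as an illustration of the idea rather than a match for any particular tool.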
Probably best to try a couple of different things and see what works best. I'm usually satisfied with silence in the gaps; just fade in and out gently to reduce the distraction factor.
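The gentle fade in and out is simple enough to sketch as well. Again this is only an illustration in Python, assuming an event is a list of float samples; `fade_edges` is a hypothetical helper name, and the straight-line ramp is my assumption rather than what any given editor uses.

```python
def fade_edges(samples, fade_len):
    """Linearly ramp the first and last fade_len samples toward silence."""
    n = len(samples)
    fade_len = min(fade_len, n // 2)  # keep the two ramps from overlapping
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len  # 0.0 at the very edge, rising toward 1.0
        out[i] *= gain
        out[n - 1 - i] *= gain
    return out


# A constant-level clip makes the ramp easy to see.
faded = fade_edges([1.0] * 6, fade_len=2)
```

A longer `fade_len` gives a gentler, less noticeable transition into and out of the silent gap.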
In the end, all of this is pretty subjective stuff.