Can't answer "why", but can offer a view: the color grading system acts as though you are creating a custom LUT, made from all those parameters you can adjust, and including other LUTs, all arithmetically composed into a single LUT, which by its nature cannot vary over time. It may have some performance benefits over traditional keyframed effects, but if you want keyframes you need to fall back to the traditional effects.
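To make the "everything gets composed into a single LUT" idea concrete, here is a minimal sketch in Python. It is a simplification under stated assumptions: it uses a 1D table on a single normalized channel, whereas real grading LUTs are typically 3D, and the names `bake_lut`, `apply_lut`, `exposure` and `gamma` are made up for illustration. It is not how VEGAS implements the panel.

```python
def bake_lut(stages, size=256):
    """Compose a chain of adjustment functions into one lookup table."""
    lut = []
    for i in range(size):
        v = i / (size - 1)              # normalized input level in [0, 1]
        for stage in stages:
            v = stage(v)                # run every adjustment once, at bake time
        lut.append(min(1.0, max(0.0, v)))  # clamp the final result
    return lut


def apply_lut(lut, v):
    """Nearest-entry lookup for a normalized value in [0, 1]."""
    return lut[round(v * (len(lut) - 1))]


# Hypothetical adjustments standing in for panel parameters.
exposure = lambda v: v * 1.2
gamma = lambda v: v ** (1 / 2.2)

lut = bake_lut([exposure, gamma])
graded = apply_lut(lut, 0.5)
```

Once `lut` is baked, the individual parameters are gone: every frame just does a table lookup, which is why the result is cheap to apply but has nothing left to keyframe.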
There are a couple of ways to automate LUTs. In the LUT FX you can animate the strength parameter. The only way I know to do it with the CGP right now is to duplicate the clip and put a different grade on each copy, then inversely crossfade them or use composite envelopes. I've tried that and it adds significant processing overhead. Exporting the LUTs manually and throwing them into the LUT FX with inverse strength animation seemed to play more smoothly and render quicker, but it is tedious to update. If they implemented strength animation on each CGP/LUT keyframe, that could be the best of all worlds.
Sure, you can do LUT strength easily, but you'd presumably want to keyframe the components of the LUT (say just white balance or exposure as it changes in the shot), not the result, which is literally everything, including log to Rec 709 conversions if any.
Keyframing CGP could end up with the equivalent of hundreds or thousands of different LUTs over the course of a video and switching between them from frame to frame.
The only way I can imagine it working is if each element of the CGP created its own LUT, and those LUTs were keyframed independently (so you don't need a new LUT every frame), but the processing overhead might be high.
I always do it like Howard mentioned... I grade an event, then split it, and adjust the grade on the right side of the split to where I want it to end up... then crossfade the split, and it will fade from one set of color grading panel settings to the next. You can adjust the length of the fade to suit your needs from there. That's easier than keyframing, IMO, and you can adjust the fade's curve to change how it eases in and out. Each split you add to the event essentially becomes a keyframe that stores its own set of CGP settings.
I suppose a big problem with having the usual Vegas keyframes for parameters in the CG panel is that each parameter in a keyframe can have a different curve associated with it. So it would not work to just statically create a LUT at each keyframe and dynamically crossfade the strength from each LUT to the next while rendering or playing. That's because some parameters would need to change at different rates from others.
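A small numeric sketch of that mismatch, in Python. Everything here is an assumption for illustration: a toy grade with just `gain` and `gamma`, a linear curve on gain, and a slow-start (`t * t`) curve on gamma. It only shows that blending two baked results is not the same as animating the parameters separately.

```python
def lerp(a, b, t):
    return a + (b - a) * t


def grade(v, gain, gamma):
    """Toy grade: gain, clamp to [0, 1], then a gamma power."""
    return min(1.0, max(0.0, v * gain)) ** gamma


start = {"gain": 1.0, "gamma": 1.0}   # settings at the first keyframe
end = {"gain": 1.5, "gamma": 0.5}     # settings at the next keyframe


def per_param(v, t):
    """Each parameter follows its own curve between the keyframes."""
    gain = lerp(start["gain"], end["gain"], t)          # linear curve
    gamma = lerp(start["gamma"], end["gamma"], t * t)   # slow-start curve
    return grade(v, gain, gamma)


def lut_blend(v, t):
    """Crossfade the two statically baked results by strength t."""
    return lerp(grade(v, **start), grade(v, **end), t)


midpoint_gap = abs(per_param(0.4, 0.5) - lut_blend(0.4, 0.5))
```

The two approaches agree exactly at the keyframes (`t = 0` and `t = 1`) but `midpoint_gap` is clearly non-zero in between, so a plain strength crossfade cannot reproduce independent per-parameter curves.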
But the techniques described by @fr0sty and @Howard-Vigorita don't depend on different curves for each parameter, so maybe that is a useful compromise for some. I mean when each frame is rendered, the LUT that is applied could be dynamically created as an interpolation of strength between the previous and the next LUT (that is, the LUTs statically computed at those keyframes). (I think that would be more efficient than the brute force of dynamically creating a LUT at each frame from the interpolation of every CG parameter.) I don't know if sequential interpolation from one LUT to the next would be efficient or not, but it sounds like it would be more efficient than a crossfade between two events with their own LUTs applied.
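Here is a rough Python sketch of that sequential interpolation, assuming each keyframe stores a pre-baked LUT as a flat list of floats and the blend between neighbours is linear. The names `blend_luts` and `lut_at_frame` are hypothetical, and nothing here reflects how VEGAS actually works internally.

```python
from bisect import bisect_right


def blend_luts(lut_a, lut_b, t):
    """Entry-by-entry linear blend of two equally sized LUTs."""
    return [a + (b - a) * t for a, b in zip(lut_a, lut_b)]


def lut_at_frame(keyframes, frame):
    """keyframes: list of (frame_number, lut) pairs sorted by frame_number."""
    frames = [f for f, _ in keyframes]
    i = bisect_right(frames, frame)
    if i == 0:                       # before the first keyframe: hold it
        return list(keyframes[0][1])
    if i == len(keyframes):          # at or after the last keyframe: hold it
        return list(keyframes[-1][1])
    f0, lut0 = keyframes[i - 1]
    f1, lut1 = keyframes[i]
    t = (frame - f0) / (f1 - f0)     # 0 at the previous keyframe, 1 at the next
    return blend_luts(lut0, lut1, t)


# Two tiny 3-entry LUTs standing in for grades baked at frames 0 and 10.
keys = [(0, [0.0, 0.5, 1.0]), (10, [0.0, 0.7, 1.0])]
mid_lut = lut_at_frame(keys, 5)
```

Per frame this costs one pass over the table plus one lookup per pixel, instead of grading two full events and then mixing their pixels, which is where the hoped-for saving over the crossfade approach would come from.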