I called them sync-points because I don't know a better word for them, and MP3 calls its frame boundaries that. I'm NOT talking about I-frames, even though in the worst case the two will share the same starting point. A sync-point would simply add a header with enough information to start CABAC decoding from there. Such headers are only a few bytes, and I would expect encoders to be capable of adding them regularly, either every x frames or every x KB of (un-CABACed) data.

Quote:
By "sync-points" I guess you mean "I-frames". These might be every 0.5~1 seconds or perhaps even further apart. If you want to try that with HD data then you'd better have a lot of RAM.
For the RAM requirement: assume the worst case is one second between sync-points and you have a 40 Mbps stream; that's 5 MB worth of compressed data. It's going to expand after un-CABACing, but it's worth noting that the result is still heavily compressed, and a few orders of magnitude smaller than the uncompressed frames. CABAC is still just entropy coding like Huffman, which isn't very efficient in most cases, and even in the best case it achieves less than 1:8. So the RAM requirement is not insignificant, but doable IMHO.
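
To put rough numbers on that, here is a back-of-the-envelope sketch of the arithmetic. The function name and the `expansion` parameter are made up for illustration; the factor of 8 is just the "less than 1:8" figure from above used as a pessimistic upper bound, not a measured value.

```python
def sync_buffer_bytes(bitrate_mbps, interval_s, expansion=1.0):
    """Rough size of the data between two sync-points, in bytes.

    bitrate_mbps: stream bitrate in megabits per second
    interval_s:   assumed worst-case distance between sync-points, in seconds
    expansion:    assumed growth factor after un-CABACing (1.0 = still compressed)
    """
    compressed = bitrate_mbps * 1_000_000 / 8 * interval_s  # bits -> bytes
    return compressed * expansion


# 40 Mbps stream, one second between sync-points, still CABAC-compressed
print(f"{sync_buffer_bytes(40, 1.0) / 1e6:.1f} MB")       # 5.0 MB

# same data after un-CABACing, assuming the pessimistic 8x expansion bound
print(f"{sync_buffer_bytes(40, 1.0, 8.0) / 1e6:.1f} MB")  # 40.0 MB
```

So even with the pessimistic bound the buffer stays in the tens of megabytes for one second of a 40 Mbps stream.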