They just tacked it on a regular SDR controller
Are you sure? That doesn't seem possible, considering DRDRAM doesn't deal with RAS/CAS addressing etc.; from what I understand, that's handled locally on the die of each RAM chip.
adding extra latency serializing-deserializing data and commands
No way to avoid that with a serial-like bus...
on top of that they weren't able to take advantage of some of the more advanced things in RDRAM (like many more open pages).
From what I read at the time, they DID do that, but had to limit the number of open pages due to DRDRAM power draw. With each chip able to dissipate as much as 4W, a full RIMM would have destroyed itself without active cooling. And who can count on that being generally available in PCs?
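To put rough numbers on the power argument, here is a quick back-of-the-envelope sketch in Python. The 4W figure is the worst case mentioned above; the chip counts per RIMM and the 10W passive-cooling budget are purely hypothetical values I picked for illustration, not datasheet numbers.

```python
# Worst-case per-chip dissipation quoted in the post (an upper bound, not a typical value).
MAX_WATTS_PER_CHIP = 4.0


def worst_case_rimm_watts(chip_count: int,
                          watts_per_chip: float = MAX_WATTS_PER_CHIP) -> float:
    """Power drawn if every chip on the module ran at its worst case at once."""
    return chip_count * watts_per_chip


def max_chips_at_full_power(budget_watts: float,
                            watts_per_chip: float = MAX_WATTS_PER_CHIP) -> int:
    """How many chips could run at worst case within a given power budget."""
    return int(budget_watts // watts_per_chip)


if __name__ == "__main__":
    # Hypothetical module sizes, for illustration only.
    for chips in (4, 8, 16):
        print(f"{chips} chips: {worst_case_rimm_watts(chips):.0f} W worst case")

    # Hypothetical budget for a passively cooled module.
    print("Chips at full power within 10 W:", max_chips_at_full_power(10.0))
```

Whatever the real budget was, the shape of the result is the same: only a small fraction of the chips can be fully active at once, which fits with a cap on open pages.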
at the same time Alpha EV7 had <100ns (but with the memory controller on die) with the same RDRAM memory chips.
While I don't doubt EV7 would have been faster at accessing memory, I can imagine that was due to reasons other than just plain Intel incompetence...
Just for starters, EV7 was a newer and more evolved product than the i850 chipset, aimed at a higher-end piece of the market. It likely had more I/O buffers and other advanced features, simply because the R&D budget was bigger and the price point of the end result was (a lot!) higher. i850 was a consumer chipset, and very cost sensitive!
EV7 was also HIGHLY multichanneled (8, as I recall, which allows some pretty advanced interleaving), and likely only allowed one RIMM per channel, while i850 had only 2 channels with 2 RIMMs apiece. That means twice as long a signal path and twice the max number of devices per channel. And both the device count and the bus length affect latency, from what I understand.
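For anyone wondering what interleaving buys you, here is a minimal Python sketch. It assumes the simplest possible scheme, where consecutive 64-byte lines rotate across channels; that mapping and the line size are my own assumptions for illustration, not the actual EV7 or i850 address decoding.

```python
def channel_for(addr: int, channels: int, line_bytes: int = 64) -> int:
    """Pick a channel by rotating consecutive lines across channels (assumed scheme)."""
    return (addr // line_bytes) % channels


def spread(start: int, n_lines: int, channels: int, line_bytes: int = 64) -> list[int]:
    """Count how many of n_lines consecutive lines land on each channel."""
    counts = [0] * channels
    for i in range(n_lines):
        counts[channel_for(start + i * line_bytes, channels, line_bytes)] += 1
    return counts


if __name__ == "__main__":
    # 16 consecutive lines over 8 channels versus 2 channels.
    print("8 channels:", spread(0, 16, 8))
    print("2 channels:", spread(0, 16, 2))
```

With 8 channels each one only has to service 2 of the 16 lines, while with 2 channels each one gets 8, so a burst of sequential accesses queues up far less per channel in the wider design.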