You can check "The Image Processing Handbook, 5th Edition" by Russ, or any one of numerous neurological studies - but it is a measured scientific fact that the resolving power of the human eye is about 1 arc minute of viewing angle. Physical constraints in the way our eyes are built make this true. Note that this is not my opinion. You can go look up the studies yourself if you want - this is a pretty well researched fact.
At 2 feet - which is far closer than most people sit to their monitors - that is a pixel density of 184 pixels per inch (PPI). An 8k monitor would have a pixel density of over 400 PPI. So most (if not all) human beings with 20/20 vision would see no improvement from an 8k monitor. I won't claim nobody would - but out of ~8 billion people in the world there might be 1 or 2 given the tolerances on how these were measured.
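A rough sketch of that arithmetic, assuming one pixel per arc minute, a flat panel viewed head-on, and "8k" meaning 7680x4320. The function names and the diagonal sizes are illustrative assumptions, not taken from the Russ reference. Under these assumptions the threshold at 24 inches comes out nearer 143 PPI, so the 184 figure presumably rests on a somewhat different acuity or distance assumption, and an 8k panel only passes 400 PPI at diagonals of roughly 22 inches or less.

```python
import math

def ppi_at_acuity(distance_in, arcmin_per_pixel=1.0):
    """Pixel density at which one pixel subtends `arcmin_per_pixel`
    when viewed from `distance_in` inches (assumed head-on viewing)."""
    pixel_pitch_in = distance_in * math.tan(math.radians(arcmin_per_pixel / 60))
    return 1 / pixel_pitch_in

def panel_ppi(h_px, v_px, diagonal_in):
    """Pixel density of a panel from its resolution and diagonal size."""
    return math.hypot(h_px, v_px) / diagonal_in

if __name__ == "__main__":
    print(f"1 arcmin at 24 in: {ppi_at_acuity(24):.0f} PPI")
    # Hypothetical 8k (7680x4320) panels at a few common diagonals.
    for diagonal in (22, 27, 32):
        print(f'8k at {diagonal}": {panel_ppi(7680, 4320, diagonal):.0f} PPI')
```

Either way the conclusion in the post holds for a static image: at any of these sizes an 8k panel is well past the one-arc-minute threshold at 2 feet.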
That brings us to a question of economics. While stranger things have happened, I find it highly unlikely that monitor manufacturers will ever build a monitor so far out of spec with what the human eye can actually see. It costs them a lot more for what ends up being a marketing point on a slide. People won't see the difference between two monitors sitting side by side (because they physically can't), so they will buy the cheaper monitor.
The only way I can see 8k monitors being built is the "Monster Cable" effect. Something where people buy it not because it is actually better, but because it is expensive and has fancy marketing attached. I just plain do not see the margins in that for monitor manufacturers. Basically, I wouldn't hold your breath.
"Check your algorithms in motion" (Who said that? I forget; it was in some slides about AA algorithms.)
One arc minute is, per your references, adequate for a photo frame showing your grandkids and doing an alpha transition between (already AA'ed by the camera) photos every minute. It might not be so adequate for the typical street light in a racing game coming closer: it goes from 1 pixel wide to 2 pixels wide, back to 1 pixel wide in the distance, then alternates between 2 and 3 pixels wide, and so on, as it drifts to one side and closer. You will notice the change in width, in motion, because you are, as you just quoted, sensitive to 1 arc minute. The relative differences are huge in the 1-2 pixel range, and not so noticeable in the 40-41 pixel range.
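To put numbers on "huge" versus "not so noticeable", here is a tiny sketch of the relative change when an object gains one pixel of width. The helper name is made up for illustration.

```python
def relative_change(width_px, delta_px=1):
    """Fractional change in apparent width when an object that is
    `width_px` pixels wide grows by `delta_px` pixels."""
    return delta_px / width_px

if __name__ == "__main__":
    for width in (1, 2, 40):
        pct = relative_change(width) * 100
        print(f"{width} -> {width + 1} px: {pct:.1f}% wider")
```

Going from 1 to 2 pixels doubles the width (100%), while going from 40 to 41 pixels is a 2.5% change.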
AA helps with that, of course. So you can have GPU AA, but also "monitor AA" from the higher pixel density. I will speculate and say that with 4xMSAA, as an approximation, you will see a 1-1.25 pixel "width strobe" at what was before the 1-2 pixel range. Or at worst, a 0.25-0.50 pixel, gentle, alpha blended vibration. "Monitor AA" will apply a reduction factor to those values, and if your GPU can handle the multiplied resolution, why not use both?
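A very simplified 1D sketch of that speculation, not a model of real hardware: it assumes evenly spaced samples along one axis (actual 4xMSAA uses a 2D sample pattern), a hypothetical object 1.4 pixels wide, and sub-pixel offsets swept in steps of 0.01. All names are illustrative. It reports the smallest and largest total coverage, in pixels, as the object drifts.

```python
def lit_width(x, w, samples_per_pixel, n_pixels=8):
    """Total coverage, in pixels, of the interval [x, x + w) when each
    pixel is point-sampled at `samples_per_pixel` evenly spaced positions."""
    n = samples_per_pixel
    inside = 0
    for s in range(n_pixels * n):
        pos = (s + 0.5) / n  # sample position in pixel units
        if x <= pos < x + w:
            inside += 1
    return inside / n

def width_range(w, samples_per_pixel, steps=100):
    """Min and max total coverage as the object slides across one pixel."""
    widths = [lit_width(2 + k / steps, w, samples_per_pixel)
              for k in range(steps)]
    return min(widths), max(widths)

if __name__ == "__main__":
    print("no AA:     ", width_range(1.4, 1))
    print("4 samples: ", width_range(1.4, 4))
```

In this toy model the apparent width swings between 1 and 2 pixels with one sample per pixel, and between 1.25 and 1.5 pixels with four, so the strobe shrinks from a full pixel to a quarter pixel. A denser panel would then shrink the angular size of that quarter pixel further.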
IMHO 8k sounds a bit excessive today, but I'd love to have 3x 4k monitors with added 4x MSAA.