There are a ton of latency-correcting protocols built into Bluetooth (and into the audio systems in operating systems), so, for example, if you switch between wired and Bluetooth headphones whilst watching video in something like VLC, VLC will adjust the A/V sync automatically - assuming the devices and OS are all providing this information.
Yeah, this is the right distinction. For casually consuming content (e.g. watching a prerecorded video) the absolute latency doesn't really matter, so long as it stays very consistent. Even if the round-trip latency is a whopping 1000 ms, so long as the whole stack is aware of the situation, it will simply delay the video output by the same 1000 ms and -- to your eyes and ears -- it all lines up just fine.
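To make the "just delay the video" idea concrete, here's a rough sketch in Python. The function name and the latency numbers are made up for illustration; a real player like VLC gets the latency figure from the OS/audio device and does a lot more than this.

```python
def video_present_times_ms(frame_pts_ms, audio_latency_ms):
    """Shift each video frame's presentation time by the audio path's latency.

    frame_pts_ms:     presentation timestamps from the media file, in ms
    audio_latency_ms: latency reported by the audio output (assumed known)
    """
    return [pts + audio_latency_ms for pts in frame_pts_ms]


# Hypothetical numbers: three frames of a 25 fps video.
frames = [0, 40, 80]

wired = video_present_times_ms(frames, audio_latency_ms=5)
bluetooth = video_present_times_ms(frames, audio_latency_ms=200)

print(wired)      # [5, 45, 85]
print(bluetooth)  # [200, 240, 280]
```

In both cases each frame hits the screen at the moment its matching audio actually comes out of the headphones. The only thing that changes is how far behind "real time" the whole thing runs, which you never notice with prerecorded content.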
For a purely audio conversation, the rule of thumb in the enterprise VoIP space is to keep round-trip latency below 50 ms for a "natural" conversation. Anything more and people start to perceive the prolonged pauses, and you start talking on top of each other. The difference here is that, so long as absolute latency stays below 50 ms, the jitter/deviation can vary pretty heavily without anyone truly noticing. The exception is compression artifacts: when the audio has been stretched to cover a gap, the stack sometimes catches up by simply splicing the sound together, which can come across as occasional choppiness (e.g. "hey, I think you broke up a tiny bit...").
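As a rough illustration of "latency budget matters, jitter much less so", here's a small Python sketch that takes some round-trip samples and checks them against the 50 ms rule of thumb. The function name, the sample values, and the way jitter is computed (mean absolute difference between consecutive samples) are my own assumptions, not how any particular VoIP stack does it.

```python
from statistics import mean


def summarize_rtt(samples_ms, budget_ms=50.0):
    """Summarize round-trip samples against a latency budget.

    Needs at least two samples. Jitter here is the mean absolute
    difference between consecutive samples (an illustrative choice).
    """
    diffs = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    worst = max(samples_ms)
    return {
        "worst_ms": worst,
        "jitter_ms": mean(diffs),
        "within_budget": worst < budget_ms,
    }


# Hypothetical call A: bounces around a lot, but never exceeds the budget.
print(summarize_rtt([10, 40, 15, 45, 20]))

# Hypothetical call B: rock steady, but always over the budget.
print(summarize_rtt([60, 62, 61, 63]))
```

Call A has far more jitter than call B, but by the rule of thumb above it's the one that would still feel like a natural conversation.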
For an interactive video/audio experience, you basically have to dip into the neurology of the human visual and auditory systems (and the parts of the brain behind them) to work out how big the lag can be -- and the individual matters a lot here. Shifty gave a good example: in a shooter, the muzzle flash being out of step with the sound of the gunfire would be a good giveaway. However, every human is wired in their own way, and some people will notice it more than others.
And of course we must consider input lag in all of these equations. Imagine a scenario where your inputs feel out of sync with the visual and auditory cues -- or, even worse, where all three are perceptibly out of sync with each other. It would make the game feel disconnected at best, perhaps even antagonistic to the human player at worst.