("Bit masking" ...) Any hardware that supports paletted textures
If you have hardware support for 4-bit palettes and two textures that each need only four indices, you can pack both into a single 4-bit texture and use two palettes. In each palette, you duplicate the entries so that the bits the current texture doesn't use have no effect on the color that is looked up. This doesn't require dependent reads at all.
Let's put the two index bits for texture 0 into the two low bits of the texture and let's call the palette entries C00, C01, C02, C03.
Likewise put texture 1's two-bit indices into the high bits and let's name the colors C10, C11, C12, C13.
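The packing step described above could look roughly like this. It is only a sketch: `pack_texel` and `pack_texture` are hypothetical helper names, and the textures are assumed to be flat lists of 2-bit indices of equal length.

```python
def pack_texel(idx0, idx1):
    """Pack two 2-bit indices into one 4-bit texel (hypothetical helper).

    idx0 is texture 0's index and goes into the two low bits,
    idx1 is texture 1's index and goes into the two high bits.
    """
    if not (0 <= idx0 <= 3 and 0 <= idx1 <= 3):
        raise ValueError("each index must fit in two bits")
    return (idx1 << 2) | idx0


def pack_texture(tex0, tex1):
    """Pack two equally sized lists of 2-bit indices into one list of 4-bit texels."""
    if len(tex0) != len(tex1):
        raise ValueError("both textures must have the same number of texels")
    return [pack_texel(a, b) for a, b in zip(tex0, tex1)]


# Example: four texels from each texture.
packed = pack_texture([0, 1, 2, 3], [3, 2, 1, 0])
print(packed)  # [12, 9, 6, 3]
```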
When you want to sample texture 0, you make this 4-bit palette current:
C00, C01, C02, C03
C00, C01, C02, C03
C00, C01, C02, C03
C00, C01, C02, C03
I.e. all rows are equal. The two higher bits don't affect the color that's pulled from the palette.
When you want to sample texture 1, you switch to this palette:
C10, C10, C10, C10
C11, C11, C11, C11
C12, C12, C12, C12
C13, C13, C13, C13
All columns are equal now. With this palette, the two lower bits are irrelevant.
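Both 16-entry palettes can be generated from the four colors of each texture. The following is a minimal sketch under the same layout (texture 0 in the low bits, texture 1 in the high bits); the function names are made up for illustration and the colors are just placeholder strings.

```python
def palette_for_texture0(colors0):
    """16-entry palette where only the two low bits of the texel select the color."""
    return [colors0[i & 0b11] for i in range(16)]


def palette_for_texture1(colors1):
    """16-entry palette where only the two high bits of the texel select the color."""
    return [colors1[i >> 2] for i in range(16)]


colors0 = ["C00", "C01", "C02", "C03"]
colors1 = ["C10", "C11", "C12", "C13"]

pal0 = palette_for_texture0(colors0)
pal1 = palette_for_texture1(colors1)

# A texel holding index 1 for texture 0 and index 2 for texture 1.
texel = (2 << 2) | 1

print(pal0[texel])  # C01
print(pal1[texel])  # C12
```

Printing `pal0` and `pal1` four entries per line reproduces the two tables above: identical rows for texture 0, identical columns for texture 1.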