AMD announces SSE5 instructions

I don't know how I feel about this. On the one hand, the new instructions look legitimately useful, as opposed to most of SSE4, and if it's single-cycle MAD on a CPU, that's obviously kind of huge. On the other, we're going to have processors with completely disparate feature sets again. I don't think that's a good thing, since you're just going to get code compiled for the lowest common denominator except when Intel or AMD pays somebody some money.
 

It is true this will add some sort of burden for the developers, but can't a single executable have support for different instruction sets, activating different code paths on the fly? (This would obviously make the executables bigger, no?)
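A rough sketch of what "different code paths on the fly" could look like in C. The names `sum_baseline`, `sum_extended`, `cpu_has_extension` and `select_sum` are made up for illustration, and the "extended" path is only a plain-C stand-in for code that would really use newer instructions. The probe assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available; elsewhere it simply reports "no". And yes, carrying both paths makes the binary bigger.

```c
#include <stddef.h>

typedef float (*sum_fn)(const float *, size_t);

/* Baseline path: plain code that the lowest common denominator can run. */
float sum_baseline(const float *v, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

/* Stand-in for a path compiled for a newer instruction set.
   Hypothetical: a real build would use intrinsics or a separate
   translation unit compiled with the matching target flags. */
float sum_extended(const float *v, size_t n) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {          /* four "SIMD lanes" at a time */
        lane[0] += v[i];
        lane[1] += v[i + 1];
        lane[2] += v[i + 2];
        lane[3] += v[i + 3];
    }
    float total = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i)                    /* leftover elements */
        total += v[i];
    return total;
}

/* Feature probe. Assumption: GCC/Clang on x86; otherwise report "absent". */
int cpu_has_extension(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
#else
    return 0;
#endif
}

/* Pick the code path once, at run time. */
sum_fn select_sum(void) {
    return cpu_has_extension() ? sum_extended : sum_baseline;
}
```

The caller would do `sum_fn sum = select_sum();` once at startup and then call `sum(data, count)` everywhere, so the per-call cost is one indirect call.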
 
It also means that developers would need to spend time doing this.

 
I dunno. If these are similar enough to the instruction sets used in GPUs, then wouldn't the shader compilation and optimization work carry over to the CPU world?
 
So do we know if SSE5 is a superset of SSE4?

I had actually assumed Intel owned SSE and AMD just had to follow; I didn't realise they could create new versions whenever they want. What's stopping Intel from creating a different version of SSE5?

Scary stuff. But as long as each company makes every new version a superset of the previous one, we shouldn't have any trouble. It may even lead to a "feature war", which could bring some nice stuff to the table.
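To make the "superset" condition concrete, here is a small sketch that treats an instruction-set level as a bitmask. The `FEAT_*` bits and the example sets in the usage note are invented for illustration and are not a claim about what any real CPU reports.

```c
#include <stdint.h>

/* Hypothetical feature bits, one per extension. */
enum {
    FEAT_SSE3  = 1 << 0,
    FEAT_SSSE3 = 1 << 1,
    FEAT_SSE41 = 1 << 2,
    FEAT_SSE42 = 1 << 3,
    FEAT_SSE5  = 1 << 4
};

/* Returns 1 when every feature in `older` is also present in `newer`. */
int is_superset(uint32_t newer, uint32_t older) {
    return (newer & older) == older;
}
```

With made-up sets such as `vendor_a = FEAT_SSE3 | FEAT_SSSE3 | FEAT_SSE41 | FEAT_SSE42` and `vendor_b = FEAT_SSE3 | FEAT_SSE5`, `is_superset` fails in both directions. That is the fork scenario: code built for either set cannot assume it runs on the other.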
 
In other news:
Intel strikes back with "SSE 5.1 -- drives your X-Fi surround setup obsolete!" :LOL:

Anyway, the presence of FMAC (fused multiply-accumulate) capability would open a wide door to el-cheapo, fast software solutions for the mainstream.
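For anyone wondering what "fused" buys besides speed: the product is not rounded before the add, so there is one rounding instead of two. The sketch below imitates that for `float` inputs by carrying the product in `double`, where it fits exactly. The function names are made up, and this is only an approximation of a true fused operation (the final conversion can still round a second time in rare cases). C99's `fmaf` in `<math.h>` is the standard library route.

```c
/* Unfused multiply-add: the product is rounded to float, then the sum is
   rounded again. The volatile store forces that intermediate rounding. */
float mad_unfused(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

/* Fused-style multiply-add (illustrative emulation, not a real FMA):
   the product of two floats is exact in double (24 + 24 <= 53 bits),
   so nothing is lost before the add. */
float mad_fused_style(float a, float b, float c) {
    return (float)((double)a * (double)b + (double)c);
}
```

Try `a = 1 + 1/8192`, `b = 1 - 1/8192`, `c = -1`. The exact product is `1 - 2^-26`, which rounds to `1.0f` as a float, so the unfused version returns 0 while the fused-style version returns `-2^-26`.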
 
It is odd that AMD threw those instructions from SSE4 into SSE5; the opcodes and the rest seem to be the same.

One small difference I found in AMD's description of SSE5 ROUNDPS and Intel's SSE4 ROUNDPS is AMD's support of unaligned access.

The packed rounding instructions for SSE4 seem to require 16-byte alignment, while AMD has already added unaligned access support with a flag bit for Barcelona onwards.
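For context, this is roughly the check that software has to do when a packed memory operand must sit on a 16-byte boundary. The helper names are made up, and `floats_until_aligned16` assumes the pointer is at least 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* How many floats to process one at a time before p reaches a 16-byte
   boundary. Assumes p is already aligned to sizeof(float). */
size_t floats_until_aligned16(const float *p) {
    size_t misalign = (size_t)((uintptr_t)p & 15u);  /* bytes past boundary */
    return ((16u - misalign) & 15u) / sizeof(float);
}
```

Code that cannot rely on unaligned access typically peels off `floats_until_aligned16(p)` elements with scalar code, then runs the packed loop on the aligned remainder. Hardware support for unaligned operands makes that peel unnecessary.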

I don't know why this would be significant enough to promote the instruction to SSE5, as most other alignment exceptions for the rest of SSEx are dealt with without making them special additions in another SSE extension.

It is a little different though, since Intel's documentation has alignment checking for the scalar rounding instructions, but makes no mention of it for packed (doc omission?).
AMD's variant does.

Perhaps putting those ops into SSE5 saves on having a bit flag for the enabling of alignment checking/unaligned access for those instructions.

The going assumption was that AMD would implement the remainder of SSE4 in later cores, though that doesn't mean anything. I can hardly imagine that they'd skip compatibility entirely, though the cross-licensing agreement between them could be up for renegotiation in SSE5's time frame.

edit:
Also, another opcode byte.
I guess we know why AMD's been pushing for larger fetch lengths. Their future instructions are going to bloat further.
It does get them out of the 2 source limitation, though.
 
Do AMD even get to decide whether they can call it SSE5? I thought Intel controlled SSE naming.
 
I don't know.

If there is control, it's not by copyright or trademark.
Intel's whitepapers on SSE are pretty rigorous in putting trademark and copyright symbols on every reference to Intel or Core, but nothing on SSE or Streaming SIMD Extensions.

Maybe it was an informal agreement between them?
 
Reading this article makes me think fusion. If you're planning on doing fast shading on the CPU core or giving access to an embedded GPU directly in x86 code, then you need some extra instructions, and these look like they would be the beginning of those.
 
If this is the case, we could sit down and determine which set of instructions present in the x86 ISA with SSE5 would map to the operations that can be done by the ALU units of R600.

It would give us an idea of how integrated the two will be in the Bulldozer and maybe Bulldozer+1 timeframe.

Viewing the prospect of going through the semantics and gotchas of two different specifications to find matching instruction behavior makes me kind of lazy, though.
 
SSE5 "MAD" requires that either operand 1 or 3 is the destination. I think it can only use a register as an operand, i.e. no literals allowed.

So it seems to me that you'd need an abstraction if you wanted a system that decided at run time which processor the code would run on, etc.

Phil Hester talked about having a "driver" for CPUs in the future...

http://forum.beyond3d.com/showthread.php?t=43841

Jawed
 
Thinking about it, SSE4 launches in a few weeks and after that Intel will be looking towards Nehalem. Is Nehalem likely to add further instructions to SSE?

If so, what's stopping Intel from calling that SSE6 and getting out of the gate with 6 before AMD launches 5?

This all sounds pretty silly to me. AMD really should have supported SSE4 in full.

I note SSE4 incorporates a dot product instruction as well. Can anyone tell me what advantages this can bring to games?
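As a rough illustration of where dot products turn up in game code: lighting, projections and plane tests all boil down to the 4-component multiply-and-sum below, which a packed dot-product instruction does in one operation instead of a multiply followed by a few shuffles and adds. The function names are made up and this is plain scalar C, not the instruction itself.

```c
/* 4-component dot product: four multiplies and three adds in scalar code. */
float dot4(const float a[4], const float b[4]) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Typical use: the diffuse lighting term max(N . L, 0) for a surface
   normal n and a light direction l (both assumed normalized). */
float diffuse_term(const float n[4], const float l[4]) {
    float d = dot4(n, l);
    return d < 0.0f ? 0.0f : d;
}
```

A renderer or physics step evaluates something like this per vertex, per light or per collision test, so shaving instructions off it adds up quickly.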
 

AFAIK the next core scheduled to add significantly to ISA extensions is Sandy Bridge, formerly known as Gesher, hence the previous internal codename "Gesher New Instructions". However, Nehalem is supposed to add 7 new instructions in the form of SSE 4.2.
 

Cool, thanks for the clarification. So according to that, Penryn doesn't actually implement the full SSE4 instruction set, as it's missing 7 instructions. These will be added in Nehalem, which will be the first processor to feature all 54 instructions of SSE4.

Sandy Bridge would probably have introduced Intel's own "SSE5", but Intel will now probably call it SSE6. Given that Sandy Bridge isn't due until about 2010, it seems AMD will get a little time out of SSE5, but not much. The question is whether SSE6 will be a superset of SSE5...
 
I think Larrabee and AMD's "streaming" ambitions using GPU tech mean that the fork is inevitable.

AMD has talked about CPUs having drivers, in the same way GPUs do. Once you've got a driver-based model for a CPU, does the forking actually matter?...

Jawed
 

I suppose we can ask Transmeta how well that worked.

The driver strategy doesn't make SSE5 any more likely to be adopted.
What does a driver do if there are vendor-specific extensions?

I'm thinking it either faults, wraps the command to an equivalent sequence (this gets dangerous), or requires a fallback path in lieu of a crash.

That is no different from how things are done right now through microcode.
If the driver model works for SSE5, microcode should have gotten 3DNow! adopted.

The bigger company ignored the extension, and the vast majority of the market ignored it as well.

If Fusion somehow makes SSE5 palatable in the distant future, Intel will just migrate Larrabee's extensions to SSE6 or 7.

What will AMD do?
Assuming the cross-licensing agreement persists that far ahead in the future, it will probably wind up supporting Larrabee's extensions as well.
That leaves SSE5 hanging out with 3DNow!

edit:

By the way, I will beat Phil Hester with an NV30 if they try to pass off the kind of crap GPU drivers get away with as acceptable for a CPU interface.
 