N64 RDP/RSP

It was a shared memory system so I don't really know how you'd measure it in any useful way. It's just one of those things where you have what you have, and all you can do is minimise the number of times you hit external memory. :/

I'm sure the RSP/RDP did better out of it than the CPU just because they used DMA to move chunks of memory.
 
How different is the rendering tech of N64 compared to where current video cards are? Was N64 pretty forward looking? It sounds like the RSP is almost a vertex/pixel shader mechanism with its programmability. Actually, the way you can do almost everything on that RSP makes it sound like where we are going with PCIe's bidirectional speedy-ness and the unified shaders on video cards.

The emulator developers have done pretty well with getting games running, but there are visual bugs in basically every game. Makes me think that the tech is different enough to be a problem. The RSP doing audio seems to be a MAJOR problem for emulation too....
 
swaaye said:
How different is the rendering tech of N64 compared to where current video cards are? Was N64 pretty forward looking? It sounds like the RSP is almost a vertex/pixel shader mechanism with its programmability. Actually, the way you can do almost everything on that RSP makes it sound like where we are going with PCIe's bidirectional speedy-ness and the unified shaders on video cards.

The emulator developers have done pretty well with getting games running, but there are visual bugs in basically every game. Makes me think that the tech is different enough to be a problem. The RSP doing audio seems to be a MAJOR problem for emulation too....

I'm not sure that the N64 was particularly "forwards looking", more sideways looking.

In the same way that, say, a TNT or Voodoo 1 was derived from SGI tech, so was the N64's RDP, but with somewhat different priorities. Yes, the RSP was programmable, but I don't think that decision was made to supply "vertex shader" like functionality. It was just the easiest way at the time to provide transform and lighting hardware.

The hardware does support several features that are hard to emulate, but they are just different rather than better. The destination blender is more complex than on most DX hardware, and you can do some nasty things with the "trilinear" hardware that would be difficult to simulate on a PC.

Since it was a shared memory architecture (and not a well designed one), interoperability between the graphics and CPU pipelines is very simple. But that's more a function of its low-cost design than something that was a design requirement when it was built.
 
swaaye said:
How different is the rendering tech of N64 compared to where current video cards are? Was N64 pretty forward looking? It sounds like the RSP is almost a vertex/pixel shader mechanism with its programmability. Actually, the way you can do almost everything on that RSP makes it sound like where we are going with PCIe's bidirectional speedy-ness and the unified shaders on video cards.

The emulator developers have done pretty well with getting games running, but there are visual bugs in basically every game. Makes me think that the tech is different enough to be a problem. The RSP doing audio seems to be a MAJOR problem for emulation too....

The problem with N64 emulators right now is that they don't really emulate the RDP. The RDP emulation is done at a high level and does not emulate the opcodes directly. Basically, each video plugin has to simulate what each and every different ucode is trying to do. As a result, most games have problems ranging from minor geometry/sound errors to not working at all, as with Rogue Squadron, Indiana Jones, and WDC (who would've thought that ;)).
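To make the high-level approach concrete, here is a minimal sketch of a plugin that keeps one command table per microcode instead of running the microcode itself. The microcode names (`ucode_a`, `ucode_b`), the command numbers, and the handler names are all hypothetical and chosen only for illustration; they are not taken from any real plugin or microcode.

```python
# Hypothetical per-microcode command tables. A high-level plugin needs one
# table per microcode, because the same command byte can mean something
# different from one microcode to the next.
HLE_TABLES = {
    "ucode_a": {0x01: "load_vertices", 0x02: "draw_triangle"},
    "ucode_b": {0x01: "draw_triangle", 0x02: "load_vertices"},
}


def make_hle_renderer(ucode_name):
    """Return a function that interprets a display list for one known microcode."""
    table = HLE_TABLES[ucode_name]  # KeyError here means "microcode not supported"

    def run(display_list):
        # Translate each command byte into the action the plugin believes it means.
        return [table.get(cmd, "unknown") for cmd in display_list]

    return run
```

A game that ships a microcode missing from the table cannot be handled at all in this scheme, and a wrong guess about a command's meaning shows up as a rendering bug. A low-level emulator would instead execute the microcode's own instructions.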

Most devs say that lack of documentation for the RDP is why it isn't emulated well. The worst part is that there doesn't seem to be any real interest in N64 emulation anymore since the big name games work for the most part. It will probably stay a hackish area of emulation until MAME covers N64 hardware for the arcade games that used it (Mickey's Magical Tetris and the Seta Aleck64 games). No one else seems to really care about accurate emulation of it.
 
Urian said:
Really? Why my old GF2 GTS had 32MB of DDR-SDRAM at 200Mhz if the graphic card was released in the year 2000?
Aye, you're right. :) GF2 GTS was released in summer 2000, and did come with DDR; in fact the second release of the GF1 had DDR in Q1 2000. Graphics cards have always been first to consume exotic new memory types.
 
Sorry to bring this old thread back to life, but I have to settle this burning question about the RCP's capabilities. My understanding of the RCP is cobbled together from a number of sources, but only recently have I discovered info that conflicts with my long-held understanding that the core of the RSP was a 2-way FP32 vector unit. That understanding, which I never explicitly confirmed, was based on official press material stating that the RCP (which I read as the RSP portion) was capable of 100+ MFLOPS.

The "N64 Introductory Manual" however, states that the vector unit of the RSP is based on "eight pieces of 16-bit product-sum operation mechanisms" and further states "The RSP uses the 32-bit fixed-point vertex calculations to perform these transformations." Then ERP mentions that it's based on 8-bit integer vector ops, so now I'm really confused.

I suppose I was naive in assuming the RSP, even if it were fp-based, would be capable of single-cycle FMADD throughput, given the huge disparity between its clock and triangle rate. So based on all this conflicting information, my questions are:

a) What is the internal organization of the RSP's vector unit? 8xINT8, 8xINT16, 2xFP32? Is it fixed width?
b) How are vertices presented to the RSP? 16- or 32-bit fixed-point (per component)? 32-bit floating-point? Or does this depend on the microcode?
c) Typically how many cycles does it take to transform a single vertex (with standard microcode, let's say)?
d) In a typical N64 game, are vertices ever represented with FP32 precision for use by the regular FPU?
e) If the RSP really is a fixed-point architecture, where does the official 100 MFLOPS figure come from?

f) Slightly OT, but is it correct to say that RSP uCode is more like normal MIPS code that the RSP runs in its own address space, and not uCode in the traditional sense as a sequence of opcodes that map directly to the inputs of the RSP's functional units? This would explain how vertices could more easily be represented at a different precision than the RSP's native vector width.

I don't expect answers to all of these questions, but hopefully I can get at least some insight.
 
http://www.moogle-tech.com/n64_jul07.html
RCP - Reality Central Processor. MIPS chip model R4300. Clocked at approximately 94MHz, with 16kbytes of instruction cache and 8kbytes of data cache (though the listed cache sizes are questionable). Integrates standard COP0 (MMU) and COP1 (FPU) modules.
[...]
RSP - Reality Signal Processor. Depending on how you look at it, it's either a standard MIPS chip on testosterone or a standard MIPS chip on an antiandrogen. It was designed specifically to handle display lists and audio lists. Its status is partially accessible by the RCP as COP2. The RSP has no MMU or FPU; COP0 access maps to the RDP (4)'s status and command buffer, COP1 does not exist, and COP2 maps to a custom vector instruction set designed specifically for the Nintendo 64, though it is rumored that the instruction set is the predecessor to the VICE multimedia extensions found in the Silicon Graphics O2 workstation. The RSP's vector extensions give the RSP an additional 32 vector registers; each register is 128 bits wide, made up of eight 16-bit fixed-point values that are accessible in various combinations based on bitfields present in the vector-specific instructions. The vector registers can also be accessed manually via MFC2 and MTC2 instructions. In addition, there are eight 64-bit-wide accumulator registers that are accessible only through individual vector-related instructions. To call the RSP's memory architecture bizarre would be an understatement; its instruction cache (8kbytes) and data cache (8kbytes) have been repurposed into the only memory accessible to the RSP. Furthermore, the RSP uses Harvard architecture, limiting data fetches to DMEM and instruction fetches to IMEM. Both IMEM and DMEM are mapped into the RCP's memory space; both the RCP and RSP may initiate DMA transfers to and from RDRAM, and the RCP can manually access the RSP's memory area, though manual access is subject to caching. In both this respect and the additional vector instructions, the RSP can be likened to one of the Playstation 2's VU1 or one of the Playstation 3's SPUs. I could provide more information, but this entry is already quite long, so if any further information is desired, drop me a line at my above-listed locations.
That's from some guy working on a low level N64 emulator for MESS. At the moment, his code doesn't seem to be in MESS, so you can't just download the source and look at it.

So, it looks like the RSP is fixed point, but the main CPU has an FPU.
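For anyone who wants to picture the register layout described in that quote, here is a rough sketch of one 128-bit vector register treated as eight signed 16-bit lanes. The function names are made up for illustration, and the multiply-accumulate is only a generic lane-wise model using Python's unbounded integers; it is not a claim about the exact behaviour of any RSP instruction or the true accumulator width.

```python
import struct

LANES = 8  # eight 16-bit elements per 128-bit register, as in the quoted description


def pack_vreg(lanes):
    """Pack eight signed 16-bit lane values into 16 bytes (big-endian, like the N64)."""
    if len(lanes) != LANES:
        raise ValueError("a vector register holds exactly eight lanes")
    return struct.pack(">8h", *lanes)


def unpack_vreg(raw):
    """Unpack 16 bytes back into eight signed 16-bit lane values."""
    return list(struct.unpack(">8h", raw))


def lane_mac(acc, a, b):
    """Generic lane-wise multiply-accumulate: acc[i] + a[i] * b[i] for each lane.

    The accumulators are plain Python ints here, so they never overflow;
    real hardware would have a fixed accumulator width.
    """
    return [acc[i] + a[i] * b[i] for i in range(LANES)]
```

Eight lanes of 16 bits each is 128 bits, i.e. 16 bytes, which is what `pack_vreg` produces.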
 
Thanks for the info Tapam. I hadn't come across that site in my searching, but I will look into it some more. That's another source to mention an 8xINT16 configuration, so I feel comfortable abandoning my notion that it was an FP architecture. That still leaves ERP's comment about it being based on 8-bit ops (though this source does say the registers are accessible in various combinations) and the 100 MFLOPS figure, though I'm beginning to question just how "official" that statement was.

Speaking of emulators, I had actually started looking at Project 64's source, specifically the RSP interpreter, but the general lack of commenting makes grinding through the code a real chore. It looks like I just might have to do that though to answer some of my other questions.
 
Could have been 16 bits, I just remember having to deal with carry bits to do almost any calculation, and I remember triangle setup being a tremendous pain in the ass.
 
Thanks ERP. If you had to deal with carries (in microcode, right?), then that lends a lot of credence to the idea that vertices were represented as 32-bit fixed-point and operated on by a series of operations with 16-bit MACs in the RSP. Then again, Playstation got away with 16-bit precision, but N64's polys were rock solid, so barring any further information, this will be my new understanding of N64's vertex pipeline :smile:.
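As a sanity check on that idea, here is a small sketch of how a 16.16 fixed-point multiply can be built out of nothing but 16-bit by 16-bit partial products, with the carries between halves handled by hand. This is generic fixed-point arithmetic, not a transcription of any actual microcode, and the helper names are my own.

```python
def to_fixed(x):
    """Convert a float to signed 16.16 fixed point."""
    return int(round(x * 65536))


def split(v):
    """Split a signed 16.16 value into a signed integer half and an unsigned fraction half."""
    return v >> 16, v & 0xFFFF


def mul_16_16(a, b):
    """Multiply two 16.16 values using only 16x16 partial products.

    With a = ah * 2^16 + al and b = bh * 2^16 + bl:
      a * b = ah*bh * 2^32 + (ah*bl + al*bh) * 2^16 + al*bl
    Shifting right by 16 to return to 16.16 gives the sum below. The upper
    half of al*bl is the carry from the fraction*fraction term into the result.
    """
    ah, al = split(a)
    bh, bl = split(b)
    return ((ah * bh) << 16) + ah * bl + al * bh + ((al * bl) >> 16)
```

For example, 1.5 times -2.25 comes out as -3.375, and the result matches a plain wide multiply followed by a 16-bit shift.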
 
Source vertex format is irrelevant; to the RSP, it's just a chunk of memory that's moved in via DMA, and the uCode can deal with it any way it wants. I think we probably used 16 bits for coordinates, but it's been a long time.
The real cost was the triangle setup rather than the transforms; it involved multiple divides, none of which were cheap in any sense of the word.
The RDP triangle format was a mix of formats, I couldn't recount it off the top of my head, but I'd be very surprised if any of the coordinates were in excess of 16 bits. Some of the divisors might have been.

PS1 didn't have to deal with perspective for triangle setup, and had no subpixel accuracy.
Most of the stuff on N64 could be dealt with as 16 bit, but not everything.
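To show where the divides in triangle setup come from, here is a minimal sketch of the edge-slope part of a generic scanline setup, with results in 16.16 fixed point. It is not the RDP's actual triangle format (real setup also needs per-attribute gradients and perspective terms, each adding more divides), and the function and key names are assumptions made for this example.

```python
FRAC_BITS = 16  # slopes are returned in 16.16 fixed point


def edge_slope(x0, y0, x1, y1):
    """Change in x per scanline along one edge; costs one divide per edge."""
    dy = y1 - y0
    if dy == 0:
        return 0  # horizontal edge: never stepped along, so no slope is needed
    return ((x1 - x0) << FRAC_BITS) // dy


def setup_edges(v0, v1, v2):
    """Sort the three (x, y) vertices by y and compute the three edge slopes."""
    lo, mid, hi = sorted((v0, v1, v2), key=lambda v: v[1])
    return {
        "major": edge_slope(*lo, *hi),   # long edge spanning the full height
        "upper": edge_slope(*lo, *mid),  # top vertex to middle vertex
        "lower": edge_slope(*mid, *hi),  # middle vertex to bottom vertex
    }
```

Even this stripped-down version needs three divides per triangle before a single pixel is drawn.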

The biggest difference between the original, somewhat optimistically named Fast3D and the later, slightly less optimistically named Fast3D2 microcodes was the relative accuracy of the calculations. There were other uCodes Nintendo provided, though I don't know of any games that used them. The RacerX/Stunt Racer uCodes were very loosely based on a Nintendo-supplied ZSort uCode, largely because that allowed us to build the game and keep it running while the uCode was being written. If I were building it from scratch with hindsight, I'd have done it somewhat differently.
 
Source vertex format is irrelevant; to the RSP, it's just a chunk of memory that's moved in via DMA, and the uCode can deal with it any way it wants.

Well yes, if you can create your own uCode, but I was referring to the vertex format expected by the standard uCode. Either way, if you recall only using a 16-bit format, that's probably what most, if not all, games used. But then, if you had to deal with carry bits, maybe you were right about the RSP being based on 8-bit ops. Was the RCP a 64-bit or 128-bit design? I've seen both thrown around on the web. Oh well, sometimes I don't even know why I'm so interested in obsolete tech :p.
 
The RDP triangle format was a mix of formats, I couldn't recount it off the top of my head.

I believe it's explained in Appendix A of patent no. US6331856 which you can easily google up. Actually this patent lists every RDP hardware command (the same goes for US6239810). Make sure to grab pdf versions, if interested.
 
ERP,

I hear some popping and clicking in the Top Gear Rally soundtrack. Some sort of latency/buffer challenges?
http://www.youtube.com/watch?v=LgLpRL_doP4&fmt=18
This was captured off of a N64.

TGR does all the audio processing on the CPU and supports a relatively limited number of channels (6 if I remember correctly), so it could have been a number of things. The most likely culprit is samples getting cut off with significant amplitude.

Thinking about it though, it also uses a trick whereby it uses the MMU to remap the same memory page to two different virtual addresses, in order to effectively loop the buffer but reduce the number of times you have to check for the loop condition. So it's possible there is some weird interaction where sometimes the flush to physical memory does the wrong thing.
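The double-mapping trick is easier to see in a toy model. The sketch below only simulates the idea in software: one small "physical page" is visible at two back-to-back virtual ranges, so a run of samples that crosses the end of the buffer can be read with a plain incrementing address. The page size, class name, and helper are all made up for illustration; on real hardware the translation step is done by the MMU at no per-sample cost.

```python
PAGE = 8  # tiny "page" size, purely for illustration


class MirroredBuffer:
    """One physical page visible at virtual ranges [0, PAGE) and [PAGE, 2 * PAGE)."""

    def __init__(self):
        self.phys = [0] * PAGE

    def _translate(self, vaddr):
        # Stand-in for the MMU: both virtual ranges resolve to the same storage.
        if not 0 <= vaddr < 2 * PAGE:
            raise IndexError("virtual address outside both mappings")
        return vaddr - PAGE if vaddr >= PAGE else vaddr

    def write(self, vaddr, value):
        self.phys[self._translate(vaddr)] = value

    def read(self, vaddr):
        return self.phys[self._translate(vaddr)]


def read_run(buf, start, count):
    """Read `count` samples from `start` with no wrap test inside the loop.

    Valid as long as start < PAGE and count <= PAGE, because the second
    mapping covers any overrun past the end of the first.
    """
    return [buf.read(start + i) for i in range(count)]
```

The loop condition only has to be checked once per run (to bring the start address back into the first mapping) rather than once per sample.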

Could also have been me mishandling the audio DMA, or a race condition between the audio DMA and the CPU buffer fill. Or just plain old clipping on the audio data.
 
Wow, great find angrylion! That patent answers just about every question I had about N64 architecture. I can at last verify that the RSP is based on eight 16-bit ALUs. It also mentions:
In this example, the transformed x, y, z, and w, values corresponding to the vertex are stored in double precision format, with the integer parts first followed by the fractional parts...
Prior to that it describes the double precision format as 16.16. Again, that's not to say that much precision was always required, but it sounds likely that that vertex format was widely used.
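Here is how I read that layout, as a small sketch: each 16.16 value is split into a 16-bit integer word and a 16-bit fraction word, and all the integer words are stored before all the fraction words. The function names are my own, and the exact ordering within each group is an assumption based only on the sentence quoted above.

```python
def split_16_16(values):
    """Lay out signed 16.16 values as all integer words, then all fraction words."""
    ints = [(v >> 16) & 0xFFFF for v in values]   # upper 16 bits, two's complement
    fracs = [v & 0xFFFF for v in values]          # lower 16 bits
    return ints + fracs


def join_16_16(words):
    """Rebuild signed 16.16 values from the split layout."""
    n = len(words) // 2
    out = []
    for hi, lo in zip(words[:n], words[n:]):
        v = (hi << 16) | lo
        if v & 0x80000000:      # restore the sign of the 32-bit value
            v -= 1 << 32
        out.append(v)
    return out
```

With x, y, z, w as the four values, the result is eight 16-bit words, which would fit exactly in one eight-lane vector register.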
 
Hello ERP,

I feel desperate.

I have been sending you emails for nearly a year now without success.

Could you please check and get back to me?

Thanks a lot

Olivieryuyu
 
The Nintendo 64 should have had 8-10 MB of lower-latency RAM and a 16 KB texture cache, especially by 1996, but unfortunately the spec was kept at what was to be used in 1995.
The Nintendo 64's 4 KB texture cache was more than enough for its textures, given that the PlayStation had a smaller 2 KB texture cache. The reason the Nintendo 64 had "blurry" graphics at times was actually the limited storage of its cartridges. The PlayStation, on the other hand, was disc-based, which allowed over 650 MB of data and made up for its smaller 2 KB texture cache. If the Nintendo 64 had been disc-based, it would have had far more room for data, which would have made its textures look impressively better than the PlayStation's. Overall, though, most Nintendo 64 games had better-looking textures because of its anti-aliasing feature. SGI and MIPS did an incredible job with the technology they had back in 1994 in designing the Nintendo 64 hardware around cartridges.
 