Adding full random memory of a CPU

Discussion in 'General 3D Technology' started by Reverend, Dec 20, 2004.

  1. Dio

    Dio
    Veteran

    Joined:
    Jul 1, 2002
    Messages:
    1,758
    Likes Received:
    8
    Location:
    UK
    Actually, I would disagree a little with that. VPUs are no worse for random accesses than CPUs are, but the definition of 'random' is different for the two.

    There are fundamental limits here. If accesses are truly, utterly random, it doesn't matter how much cache you add or how much effort you spend on 'intelligent' banking: you'll end up on average with one page break every N memory accesses, where N is your number of banks (that is, assuming that the 'cache' is much smaller than the memory size). Your effective burst lengths will be limited to your 'struct size' (if you see what I mean) and/or you may also incur an overfetch cost if you do not make use of all the data in a cache line. These are impossible to get around. (Usually cache lines and burst lengths are tuned to be reasonably similar, so overfetch and bank switches are the key costs.)
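
    To put a rough number on the overfetch cost, here is a minimal sketch. It assumes each random access reads one struct that starts at the beginning of a cache line; the function name and the 64-byte line size are illustrative choices, not figures from any particular CPU or VPU.

```python
import math

def fetch_efficiency(struct_bytes: int, line_bytes: int) -> float:
    """Fraction of fetched bytes actually used when one struct is read
    from a random, line-aligned address (hypothetical helper)."""
    # Whole cache lines are fetched, even if only part of the last one is needed.
    lines_touched = math.ceil(struct_bytes / line_bytes)
    return struct_bytes / (lines_touched * line_bytes)

# Assumed 64-byte cache line; smaller structs waste more of each fetch.
for struct_bytes in (4, 16, 32, 64):
    eff = fetch_efficiency(struct_bytes, line_bytes=64)
    print(f"{struct_bytes:2d}-byte struct, 64-byte line: {eff:.0%} of fetched data used")
```

    Under these assumptions the wasted bandwidth depends only on the ratio of struct size to line size, which is why neither a bigger cache nor cleverer banking recovers it.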

    In order to improve things, you choose to optimise looking for coherence in your input data and optimise for certain classes of pseudorandom access pattern.

    At the moment, VPUs and CPUs assume different kinds of coherence. CPUs use LRU caches, which one could view as assuming temporal coherence. VPUs make less use of these and more of latency compensation 'caches', prefetching, FIFOs and arranging data in memory to avoid bank switches - basically, we assume spatial coherence. Of course, there is some overlap: we have some LRU caches, and CPUs are now doing things like automatic prefetch in order to get spatial coherence on large arrays.
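
    The difference between the two kinds of coherence can be illustrated with a toy LRU cache model. This is only a sketch: the fully associative cache, the sizes and the two access patterns are all assumptions chosen for illustration, and real hardware is considerably more involved.

```python
from collections import OrderedDict

def lru_hit_rate(addresses, num_lines, line_size=1):
    """Hit rate of a fully associative LRU cache holding `num_lines` lines
    of `line_size` addresses each (toy model, not real hardware)."""
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_size
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict least recently used line
    return hits / total if total else 0.0

# Temporal coherence: a small working set revisited many times.
looping = [i % 8 for i in range(800)]
# Spatial coherence only: one pass over a large array, nothing is revisited.
streaming = list(range(800))

print("looping,   1-address lines:", lru_hit_rate(looping, num_lines=16))
print("streaming, 1-address lines:", lru_hit_rate(streaming, num_lines=16))
print("streaming, 8-address lines:", lru_hit_rate(streaming, num_lines=16, line_size=8))
```

    In this model, LRU retention only pays off for the looping pattern. The streaming pattern gets its hits purely from fetching neighbouring data in the same line, which is the spatial case that prefetching and memory layout are aimed at.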

    Neither has any optimisation for overfetch other than assuming that a particular bit of data will all get used before it's discarded (and, given the cache line / burst length relation, it seems unlikely that there is much in this area, at least while we use SDRAM-type technology).

    Coherence optimisations are fundamental cost/performance tradeoffs. It would not be hard to add large LRU caches in a VPU, but it will cost area. That cost would be expended if there was a convincing business model behind the decision, but there already is a highly expensive, high performance, general purpose CPU in the device... I mean, what else is it going to do if the VPU ends up doing everything?
     
  2. andypski

    Regular

    Joined:
    May 20, 2002
    Messages:
    584
    Likes Received:
    28
    Location:
    Santa Clara
    Interesting - my question would be, do these large LRU caches support the maximum throughput of the VPU device per-clock, in terms of reads and writes, and if so then does it have an effect on the cost of the silicon against, say, the equivalent sized CPU cache? To support full-speed, fully random access on current hardware you could potentially need to read and write multiple pixels in a single clock to unrelated memory locations in the cache. Of course, if the cache is allowed to throttle the performance of the overall system this would seem to be less of a problem, but it introduces a potential bottleneck.

    This also seems a rather different case to the CPU cache situation, where typically you would not expect to generate a large number of writes and reads to unrelated locations in the space of a single clock, or even a small number of clocks. This is why my instinct tells me that it's not as simple as slamming a large cache on the back end of a VPU.
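
    One way to picture the per-clock throughput concern is a banked cache where each bank can serve a single request per clock. The sketch below assumes banks are interleaved on the low address bits and that all requests in a batch are issued in the same clock; both are illustrative assumptions, not a description of any actual VPU cache.

```python
from collections import Counter

def clocks_to_service(addresses, num_banks):
    """Clocks needed to serve a batch of simultaneous requests when each
    bank handles one request per clock (hypothetical model)."""
    if not addresses:
        return 0
    # Assumed mapping: bank index is the address modulo the bank count.
    per_bank = Counter(addr % num_banks for addr in addresses)
    # The busiest bank determines how long the whole batch takes.
    return max(per_bank.values())

# Four consecutive pixels spread across four banks: no conflicts.
print(clocks_to_service([100, 101, 102, 103], num_banks=4))
# Four unrelated addresses that happen to share a bank: fully serialised.
print(clocks_to_service([0, 4, 8, 12], num_banks=4))
```

    Consecutive addresses never collide in this model, while unrelated addresses can pile up on one bank and stall the batch, so a cache sized for random access either needs more ports or throttles the pipeline.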

    My immediate concern would be - do you end up in one or more of these cases:

    - The VPU system goes relatively slowly with random access anyway, so it wasn't worth the extra area for the cache, and a CPU ends up just as quick for the algorithm in question.
    - The cache slows down the maximum throughput of the device when rendering simple pixels to consecutive memory locations, and thus impacts the basic performance of the device when it's working on its most standard function.
    - The cache can't support the full speed transactions of the VPU per-clock and thus is bypassed in some way for simple rendering, effectively becoming a big chunk of wasted area in this case (ie. when rendering typical non random-access applications).

    Maybe I'm wrong here and I'm inventing issues where none exist - I must admit to not having thought the problem through in too much detail.
     
  3. Reverend

    Banned

    Joined:
    Jan 31, 2002
    Messages:
    3,266
    Likes Received:
    24
    andypski, don't worry, I don't take these things too personally (this -- 3D, games -- is still a hobby for me). Eric, Jeff and Chris got Christmas greetings from me (sorry, forgot about you! :oops: ) recently. I do however tend to speak what's on my mind without thinking I should be polite.

    What irks me about some of the postings by your fellow ATI'ers is that they feel the need to be defensive (the "I feel we design to a sweetspot" or "We believe we did the right engineering decision" or a sarcastic "Anyway, I'm glad you think R300 is a 'Fine' part." when I paid one of your hardware parts a compliment deemed not high enough).

    I talk about 3D, using facts and available hardware. I provide my complaints (and, less frequently, compliments... complaints get things going more, compliments make you complacent, look at NVIDIA :) ). Your fellow ATI'ers should address my (usually valid) complaints and not defend your company. If my wishlist isn't what you think is feasible, let us know by way of facts and your company's design and business policies. Not by defending your company's existing products. I know it's a natural thing to do, but I don't like it.

    Not that you or your fellow ATI'ers would care what I like and don't like, of course.
     
  4. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    What ATI employee was being defensive here? Alright so it's pretty obvious you mean me, but I don't get WTF you're talking about.

    I added a simple opinion on the topic, without trying to defend or even relate to any current, former or future hardware or ATI business, competitors or market situation. It was a comment entirely on the technical merits of such functionality. In what way does IHV employees shutting up raise the standard of a technical discussion? Do you have a personal problem with me or what?
     
  5. Reverend

    Banned

    Joined:
    Jan 31, 2002
    Messages:
    3,266
    Likes Received:
    24
    Wasn't talking about you or your comment in this thread.
     
  6. [maven]

    Regular

    Joined:
    Apr 3, 2003
    Messages:
    645
    Likes Received:
    16
    Location:
    DE
    Rev, I think you're getting a bit out of line...

    But then don't expect anyone else to stay polite when you're telling them (indirectly) to STFU.

    I think you're mistaking simple, factual explanations for defensiveness.

    Disagree.
     
  7. Reverend

    Banned

    Joined:
    Jan 31, 2002
    Messages:
    3,266
    Likes Received:
    24
    Hey, that's fine with me!

    Oh, I think the ATI'ers provided good explanations for why they disagree with some of the things I wish for. For example, everyone knows my arguments for FP32. The ATI'ers have provided good explanations why this wasn't good for their current (and perhaps near-future) products for various valid reasons. But they spoil it (for me, and me only) with persistent "last comments" (that have nothing to do with the technical or business decisions) by defending their company. Now, they're allowed to do that here of course but I tend to dislike company-defending comments in a technical discussion. For example, I really enjoyed the debate I had with sireric in that DX9-vs-IEEE32-regarding-FP32 thread... until he threw in his marketing and defend-my-product/company comments (although I should blame myself for using the R300 as the basis for my comments). Again, they can continue to defend their company and/or products here but it's not something I care for. I treasure their knowledge (and would like to thank them for providing valuable insights into some of the things that happen at ATI), I just don't like marketing pieces. However, just as they're allowed to do so, I am allowed to voice my (very personal) dislike for such practice.

    That's fine and okay (although I believe every single argument I've made for FP32 -- which is usually the topic of arguments between me and the ATI'ers -- is factually correct). That's what makes discussions lively.

    And to address your first comment last :

    I believe I am although I do not feel I need to treat IHVs participating here any differently nor with greater respect than Tom, Dick and Harry, and that's why I'll STFU myself about this as it's just my opinion. Later on, I'll PM the ATI'ers to clarify my position.

    Now, let's get back to being on-topic or this thread will be locked.
     
  8. Dio

    Dio
    Veteran

    Joined:
    Jul 1, 2002
    Messages:
    1,758
    Likes Received:
    8
    Location:
    UK
    I didn't think it through in that kind of detail either :D

    I'd have said ordering and coherence might be a bigger issue than the second of those. The first I already covered. The third is simply a cost/benefit issue: if there are scenarios where it helps significantly, then the cost might be worthwhile even if it isn't worth it for 'rendering' cases.
     
  9. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,079
    Likes Received:
    648
    Location:
    O Canada!
    What marketing pieces?
     
  10. nelg

    Veteran

    Joined:
    Jan 26, 2003
    Messages:
    1,557
    Likes Received:
    42
    Location:
    Toronto
    You know how those ATI guys bedazzle us with technical mumbo jumbo and then slip in "Act now and we will include free shipping. Quantities are limited, operators are standing by." Sleazy bastards. :lol:
     
  11. Reverend

    Banned

    Joined:
    Jan 31, 2002
    Messages:
    3,266
    Likes Received:
    24
    I agree they're not immediately obvious.
     
