Haswell vs Kaveri

Discussion in 'Architecture and Products' started by AnarchX, Feb 8, 2012.

  1. Raqia

    Raqia Regular

    Well, let's say you want to grep the results of your sort (which I think makes much more sense to do on a CPU) to find patterns, or present them to an Excel range for a client. Now you have to pass that whole context back into the CPU's memory space, right? I'd be thrilled if this were a fast and abstractable process on Kaveri.
     
  2. RecessionCone

    RecessionCone Regular Subscriber

    =)
    It's already fast and abstractable. I haven't had to deal with this problem since I started using Thrust years ago.


    #include <thrust/device_vector.h>
    #include <thrust/host_vector.h>
    #include <thrust/sort.h>
    #include <iostream>

    //your CPU data (T is a placeholder for your element type)
    T* data; int len;

    //allocate space and transfer data
    thrust::device_vector<T> gpu_data(data, data+len);

    //sort it
    thrust::sort(gpu_data.begin(), gpu_data.end());

    //look at an element (transfers data for you)
    T el = gpu_data[10];
    std::cout << el << std::endl;

    //transfer the whole vector back
    thrust::host_vector<T> sorted_data = gpu_data;

    Abstraction over memory spaces isn't too bad...
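
Once the sorted data is back in host memory, the CPU-side step Raqia asked about (looking for things in the sorted results) is ordinary C++. A minimal sketch, assuming the results have been copied into a plain std::vector; count_in_range is a hypothetical helper for illustration, not part of Thrust:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Thrust API): counts how many elements of an
// already-sorted host-side vector fall inside the closed range [lo, hi].
// `sorted` stands in for the data copied back from the GPU.
template <typename T>
std::size_t count_in_range(const std::vector<T>& sorted, const T& lo, const T& hi) {
    // Precondition: the data was sorted before it was copied back.
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    if (hi < lo) return 0;  // empty range
    // Binary searches: first element >= lo, then first element > hi.
    auto first = std::lower_bound(sorted.begin(), sorted.end(), lo);
    auto last = std::upper_bound(first, sorted.end(), hi);
    return static_cast<std::size_t>(last - first);
}
```

With the Thrust snippet above, something like std::vector<T> host(sorted_data.begin(), sorted_data.end()) should produce the input for this helper. Both lookups are binary searches, so the CPU only touches a handful of elements after the transfer.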
     
  3. Raqia

    Raqia Regular

    Thanks, that's downright clean, I'll have to try it! Still, if you want speed, the whole paradigm of GPU + CPU interoperation for now seems limited to running long chunks of math-dense code, like massive linear algebra or merge sorts, entirely on the GPU until it finishes, and only then accessing the data, instead of freely interleaving work between the two types of processors.

    Anyway, it sounds like the memory allocation issue is addressed with Kaveri, so after some initial overhead, maybe we'll be able to do all this w/o specialized memory-abstracting libraries and just pass raw pointers around to GPU threads (or waves, warps, whatever they call them) running our functions.

    It'd be nice to have new CPU instructions in the future that use the GPU hardware directly from a synchronous CPU thread, as the extra-wide SIMD unit it is, and let the OS or a more sophisticated GPU scheduler manage resources. (Another possibility is reserving one or two privileged units on the GPU side for this purpose, with full cache coherence logic for those select units, etc.)
     
  4. Paran

    Paran Regular


    Suddenly Intel found a market. Broadwell-K gets GT3e according to this: http://www.cpu-world.com/news_2013/...socket_1150_CPUs_to_feature_GT3_graphics.html
     
  5. Chabi

    Chabi Newcomer

  6. entity279

    entity279 Veteran Subscriber

    2.0? Does this hint at a Google Chrome-style versioning system, or what?
     
  7. Kaotik

    Kaotik Drunk Member Legend

    Source?
     
  8. no-X

    no-X Veteran

  9. Chabi

    Chabi Newcomer

  10. moozoo

    moozoo Newcomer

    Dedicated SSD PCIe = SATA Express?
     
  11. pjbliverpool

    pjbliverpool B3D Scallywag Legend

    Cool, looks like this might actually be a decent CPU. Just a shame about the lack of GPU and memory oomph.
     
  12. Chabi

    Chabi Newcomer

    Direct-access PCIe lanes to the SSD
    (I think)
     
  13. kalelovil

    kalelovil Regular

     
  14. DSC

    DSC Banned

  15. Kaotik

    Kaotik Drunk Member Legend

    Anyone know if the "2.0" refers to "Kaveri 2.0" or "APU 2.0" now that it has all the HSA stuff?
     
  16. Alexko

    Alexko Veteran Subscriber

    There have been mentions of Kaveri 2.0 before, e.g. on the LinkedIn profiles of a few AMD employees. I think the original Kaveri was scrapped and replaced by what's about to be released, hence the delay and the introduction of Richland.
     
  17. Chabi

    Chabi Newcomer

  18. Alexko

    Alexko Veteran Subscriber

    The article says that AMD has yet to decide what the Turbo frequency will be, but that seems hard to believe so close to launch.
     
  19. fellix

    fellix Veteran

    Is the Mandelbrot FPU subtest x87 coded? Looks like the legacy stack remains untouched.
     
  20. Raqia

    Raqia Regular

    Looks like the CEO is slashing costs and ensuring execution w/ this next round of parts. The bulk process is probably cheaper and TSMC can probably deliver better volume too, but it seems like clock speed is down; hopefully their turbo boost is working more selectively now. We have DDR3 instead of GDDR5, we're keeping sockets yet again, and there's no enthusiast part. The enthusiast in me wants them to liberate the Steamroller B core, but maybe I should buy the company's stock for some consolation when it eventually turns around.
     