Recent content by sparsh

  1. Survey paper on Deep Learning on GPUs

    The rise of deep learning (DL) has been fuelled by improvements in accelerators. The GPU remains the most widely used accelerator for DL applications. We present a survey of architecture and system-level techniques for optimizing DL applications on GPUs. We review 75+ techniques...
  2. Survey on Deep Learning on NVIDIA's Jetson Platform

    Designing hardware accelerators for neural network (NN) applications involves walking a tightrope between the constraints of low power, high accuracy and throughput. NVIDIA's Jetson is a promising platform for embedded machine learning which seeks to achieve a balance between the above...
  3. Survey paper on mobile web browsing

    Mobile web browsing (MWB) can well be described as the confluence of two major revolutions: the mobile (smartphone) revolution and the internet revolution. Mobile web traffic has now surpassed desktop web traffic and has become the primary means for service providers to reach out to the billions of...
  4. Survey on FPGA-based Accelerators for CNNs

    CNNs (convolutional neural networks) have recently been applied successfully to a wide range of cognitive challenges. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy-efficiency, computing capabilities and...
  5. Survey on security techniques for GPU

    The graphics processing unit (GPU), although a powerful performance booster, also has many security vulnerabilities. Because of these, the GPU can act as a safe haven for stealthy malware and become the weakest ‘link’ in the security ‘chain’. We present a survey of GPU vulnerabilities shown by researchers...
  6. An Open-source tool for modeling 2D/3D, SLC/MLC memories

    We have just released version 2 of DESTINY, which can model:
    * (2D/3D) SRAM and eDRAM
    * (2D/3D, SLC/MLC) STT-RAM, ReRAM and PCM
    * (2D, SLC/MLC) SOT-RAM, Flash, DWM

    SLC/MLC = single/multi-level cell, DWM = domain wall memory (aka racetrack memory), SOT-RAM = spin orbit torque RAM, STT-RAM = spin...
  7. Survey of Cache Partitioning Techniques for Multicore Processors

    Multi/many-core processors are vital for HPC. With increasing core counts, cache management has become extremely important. This survey reviews 90 papers on cache partitioning. It discusses different types of cache partitioning techniques in various contexts and also their integration with...
  8. A Survey of Techniques for Architecting TLBs

    The translation lookaside buffer (TLB) caches virtual-to-physical address translation information and is used in systems ranging from embedded devices to high-end servers. Since the TLB is accessed very frequently and a TLB miss is extremely costly, prudent management of the TLB is important. We present a...
  9. A Survey Paper on Techniques for Managing Register File in CPU

    Available here. Accepted in Concurrency and Computation: Practice and Experience, 2016. Abstract: The processor register file (RF) is an important microarchitectural component used for storing operands and results of instructions. The design and operation of the RF have a crucial impact on the performance...
  10. A Survey On Cache Bypassing Techniques for CPUs, GPUs and CPU-GPU systems

    Available at https://www.academia.edu/24842555/A_Survey_of_Cache_Bypassing_Techniques (accepted in JLPEA 2016, reviews ~90 papers). Part of the abstract: With increasing core counts, the cache demand of modern processors has also increased. However, due to strict area/power budgets and presence...
  11. A Survey of Prefetching Techniques for Processor Caches

    A Survey of Recent Prefetching Techniques for Processor Caches. Accepted in ACM Computing Surveys, 2016. Part of the abstract: As the trends of process scaling make the memory system an even more crucial bottleneck, the importance of latency-hiding techniques such as prefetching grows further. However...
  12. A Survey of Techniques for Managing GPU Register File

    A Survey of Techniques for Architecting and Managing GPU Register File. Accepted in IEEE TPDS, 2016. Part of the abstract: To support their massively multithreaded architecture, GPUs use a very large register file (RF), which has a capacity higher than even the L1 and L2 caches. In total contrast...
  13. A Survey Of Techniques for Approximate Computing and Storage

    A Survey Of Techniques for Approximate Computing. Accepted in ACM Computing Surveys, 2016; reviews ~85 papers. Covers:
    * Approximate computing in CPU, GPU and FPGA and various processor components (e.g. cache, main memory, secondary storage)
    * Approximate storage in SRAM, DRAM/eDRAM, non-volatile...
  14. Survey papers on Power Management in Embedded systems

    1. A Survey Of Architectural Techniques for Near-Threshold Computing. Accepted in ACM J. on Emerging Technologies in Computing Systems, 2015. Part of abstract: Low-voltage computing, and specifically near-threshold voltage computing (NTC), which involves operating the transistor very close to and...
  15. A Survey on Techniques for Managing Process Variation

    A Survey Of Architectural Techniques for Managing Process Variation. Accepted in ACM Computing Surveys, 2016. Part of the abstract: Process variation (deviation in parameters from their nominal specifications) threatens to slow down and even pause technological scaling and mitigation of it is...