ISCA 2014 Conference Paper Roundup

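I'm about to enter the field of Computer Architecture, so starting in the second half of this year I plan to download and read through the papers from the top international conferences in this area.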

Starting with ISCA 2014:

ISCA, short for International Symposium on Computer Architecture, is a top-tier conference in the computer architecture field, co-sponsored by ACM and IEEE.


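By the way, it really pays off that the university has bought access to IEEE publications: I can log in to the digital library and download the papers directly. Sitting on a gold mine like this feels great!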

Session 1: Machines and Prototypes

  • [238] Unifying on-chip and inter-node switching within the Anton 2 network
    In short: one network design handles both on-chip and inter-node switching in Anton 2.

  • [195] A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services

  • [232] SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence

Session 2A: Resilience

  • [200] Avoiding Core’s DUE & SDC via Acoustic Wave Detectors and Tailored Error Containment and Recovery

  • [221] MemGuard: A Low Cost and Energy Efficient Design to Support and Enhance Memory System Reliability

  • [212] GangES: Gang Error Simulation for Hardware Resiliency Evaluation

  • [227] Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading

Session 2B: Design Space Exploration

  • [198] ArchRanker: A Ranking Approach to Design Space Exploration

  • [196] Aladdin: A Pre-RTL, Power-Performance Accelerator Simulator Enabling Large Design Space Exploration of Customized Architectures

  • [236] SynFull: Synthetic Traffic Models Capturing Cache Coherent Behaviour

  • [218] Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor

Session 3A: Caches

  • [203] The Direct-to-Data (D2D) Cache: Navigating the Cache Hierarchy with a Single Lookup

  • [231] SC^2: A Statistical Compression Cache Scheme

  • [204] The Dirty-Block Index

  • [214] Going Vertical in Memory Management: Handling Multiplicity by Multi-policy

Session 3B: GPUs and Parallelism

  • [209] Fine-grain Task Aggregation and Coordination on GPUs

  • [208] Enabling Preemptive Multiprogramming on GPUs

  • [234] Single-Graph Multiple Flows: Energy Efficient Design Alternative for GPGPUs

  • [215] HELIX-RC: An Architecture-Compiler Co-Design for Automatic Parallelization of Irregular Programs

Session 4: Emerging Technologies

  • [206] Efficient Digital Neurons for Large Scale Cortical Architectures

  • [197] An Examination of the Architecture and System-level Tradeoffs of Employing Steep Slope Devices in 3D CMPs

  • [233] STAG: Spintronic-Tape Architecture for GPGPU Cache Hierarchies

Session 5A: NVRAM

  • [222] Memory Persistency

  • [228] Reducing Access Latency of MLC PCMs through Line Striping

  • [216] HIOS: A Host Interface I/O Scheduler for Solid State Disks

Session 5B: Datacenters and Cloud

  • [237] Towards Energy Proportionality for Large-Scale Latency-Critical Workloads

  • [235] SleepScale: Runtime Joint Speed Scaling and Sleep States Management for Power Efficient Data Centers

  • [224] Optimizing Virtual Machine Consolidation Performance on NUMA Server Architecture for Cloud Workloads

Session 6A: DRAM

  • [230] Row-Buffer Decoupling: A Case for Low-Latency DRAM Microarchitecture

  • [217] Half-DRAM: a High-bandwidth and Low-power DRAM Architecture from the Rethinking of Fine-grained Activation

  • [210] Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors

Session 6B: Circuits and Architecture

  • [199] Architecture Implications of Pads as a Scarce Resource

  • [220] Increasing Off-Chip Bandwidth in Multi-Core Processors with Switchable Pins

  • [194] A Low Power and Reliable Charge Pump Design for Phase Change Memories

Session 7A: Coherence and Replay

  • [211] Fractal++: Closing the Performance Gap between Fractal and Conventional Coherence

  • [223] OmniOrder: Directory-Based Conflict Serialization of Transactions

  • [225] Pacifier: Record and Replay for Relaxed-Consistency Multiprocessors with Distributed Directory Protocol

  • [229] Replay Debugging: Leveraging Record and Replay for Program Debugging

Session 7B: Security/OOO Processors

  • [201] The CHERI capability model: Revisiting RISC in an age of risk

  • [202] CODOMs: Protecting Software with Code-centric Memory Domains

  • [205] EOLE: Paving the Way for an Effective Implementation of Value Prediction

  • [219] Improving the Energy Efficiency of Big Cores

Session 8: Accelerators

  • [213] General-Purpose Code Acceleration with Limited-Precision Analog Computation

  • [226] Race Logic: A Hardware Acceleration for Dynamic Programming Algorithms

  • [207] Eliminating Redundant Fragment Shader Executions on a Mobile GPU via Hardware Memoization

  • [239] WebCore: Architectural Support for Mobile Web Browsing