Posts by Collection

portfolio

publications

Boosting Deep Neural Network Efficiency with Dual-Module Inference

ICML, 2020

  • Develop a lightweight auxiliary “little” module with random projection and weight quantization that probes the layerwise output sparsity of a neural network (NN) to accelerate NN inference;
  • The proposed scheme can be easily applied to various types of neural networks, such as CNNs and LSTMs (e.g., on ResNet-18, it outperforms state-of-the-art solutions with much higher FLOPs reduction, memory savings, and model accuracy);
  • The proposed scheme can also be applied to various tasks, such as object detection (SSD: Single Shot MultiBox Detector).
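As a rough illustration of the idea summarized above, the sketch below pairs a full-precision "big" layer with a cheap "little" predictor for a single ReLU layer. It is a minimal sketch under stated assumptions, not the paper's implementation: the little weights are derived here by projecting the big weights (the summary does not say how they are obtained), and every name, the 4-bit default, and the zero threshold are hypothetical.

```python
import random


def relu(v):
    return v if v > 0.0 else 0.0


def matvec(W, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_projection(k, d, rng):
    """k x d matrix with entries +-1/sqrt(k), so E[P^T P] is the identity."""
    s = 1.0 / k ** 0.5
    return [[s if rng.random() < 0.5 else -s for _ in range(d)] for _ in range(k)]


def quantize(W, bits=4):
    """Uniform symmetric quantization of weights (bit width is an assumption)."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for row in W for w in row) or 1.0
    scale = max_abs / levels
    return [[round(w / scale) * scale for w in row] for row in W]


def build_little(W, P, bits=4):
    """Assumed construction: little weights = quantize(W @ P^T)."""
    projected = [[sum(w * p for w, p in zip(row, prow)) for prow in P] for row in W]
    return quantize(projected, bits)


def dual_module_layer(W, W_little, P, x, threshold=0.0):
    """Predict which ReLU outputs are nonzero with the little module,
    then run the big module only for those rows."""
    approx = matvec(W_little, matvec(P, x))  # cheap estimate of W @ x
    mask = [a > threshold for a in approx]   # True means "predicted nonzero"
    out = [
        relu(sum(w * xi for w, xi in zip(row, x))) if keep else 0.0
        for row, keep in zip(W, mask)
    ]
    return out, mask


rng = random.Random(0)
d, k, n_out = 32, 8, 16
W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_out)]
P = random_projection(k, d, rng)
W_little = build_little(W, P)
x = [rng.uniform(-1, 1) for _ in range(d)]
out, mask = dual_module_layer(W, W_little, P, x)
print(f"big-module rows computed: {sum(mask)} of {n_out}")
```

Rows the little module predicts as zero are skipped entirely, which is where the FLOPs saving would come from; a lower `threshold` skips fewer rows and loses less accuracy.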

Download here

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

OSDI, 2021

  • Propose and implement an efficient runtime system for accelerating GNNs on GPUs;
  • The proposed system can effectively leverage the input-level information of GNNs for guiding system-level optimizations on GPUs;
  • Rigorous experiments and comparisons with existing GNN frameworks, such as DGL, demonstrate the effectiveness of our system.

Download here

DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions

IPDPS, 2021

  • Propose and implement the first optimized design for exploring depthwise separable convolutions on CNNs;
  • At the algorithm level, we incorporate a novel sliding-channel convolution (SCC), featuring filter-channel overlapping, to balance accuracy against the reduction in computation and memory cost;
  • At the implementation level, we build an optimized GPU implementation tailored for SCC by leveraging several key techniques, such as input-centric backward propagation and channel-cyclic optimization;
  • Integrate SCC into the existing PyTorch framework as a new type of convolution operator;
  • The project is now open source on GitHub.
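To make the overlapping-channel idea above concrete, here is a small sketch of how a sliding-channel layer might assign input channels to filters at a single spatial location. It is only one reading of the summary: the window width, the stride between filters, and the wrap-around at the channel boundary are illustrative assumptions, and the function names are hypothetical rather than taken from the project's code.

```python
def scc_channel_windows(in_channels, out_channels, width, stride):
    """For each filter, list the input channels it reads.

    Adjacent filters overlap whenever stride < width; indices wrap
    around modulo in_channels (assumed cyclic behaviour).
    """
    return [
        [(f * stride + j) % in_channels for j in range(width)]
        for f in range(out_channels)
    ]


def scc_pointwise(pixel, weights, windows):
    """Apply a 1x1 sliding-channel convolution to one pixel.

    pixel:   list of in_channels activations at one spatial location
    weights: weights[f] holds `width` weights for filter f
    windows: output of scc_channel_windows
    """
    return [
        sum(w * pixel[c] for w, c in zip(weights[f], window))
        for f, window in enumerate(windows)
    ]


windows = scc_channel_windows(in_channels=8, out_channels=4, width=4, stride=2)
pixel = [1, 2, 3, 4, 5, 6, 7, 8]
weights = [[1, 1, 1, 1] for _ in range(4)]
print(windows)
print(scc_pointwise(pixel, weights, windows))
```

In this toy configuration each filter reads 4 of the 8 input channels, so the layer uses 16 multiply-accumulates per pixel instead of the 32 a dense 1x1 convolution would need, while the overlap lets neighbouring filters share information.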

Download here

talks

teaching