Note: Detailed descriptions of the workshops and panel are available here.
Start Time |
Title |
|
|
Sunday February 25 |
|
|
Workshops 9am – 2:30pm |
|
9:00-10:25 |
FPGA-based Accelerated Cloud Computing with AWS EC2 F1 and SDAccel (Slides) |
High-Speed FPGA Packet Processing using the new P4 Programming Language (Slides - pptx) (Slides - pdf) Stephen Ibanez (2) |
10:30-11:50 |
Doing Research on FPGAs in the Data Center |
|
12:00 |
||
13:00-14:30 |
Training of Quantized Neural Networks (Slides) Thomas Preusser (Xilinx) - San Carlos 1-4 |
|
14:30- |
Optimizing Quantized Neural Networks on FPGAs (Slides) Robert Green (ASIC Design Services) - San Carlos 1-4 |
|
15:00 |
Coffee Break |
|
|
Special Session: Deep Learning |
|
15:15 |
CausaLearn: Automated Scalable Framework for Streaming-based Causal Bayesian Learning using FPGAs (Slides) |
|
Bita Darvish Rouhani; Mohammad Ghasemzadeh; Farinaz Koushanfar |
||
15:40 |
C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs (Slides) |
|
Shuo Wang (1); Zhe Li (2); Caiwen Ding (2); Bo Yuan (3); Qinru Qiu (2); Yanzhi Wang (2); Yun (Eric) Liang (1) |
||
16:05 |
Coffee Break |
|
16:20 |
DeltaRNN: A Power-efficient Recurrent Neural Network Accelerator (Slides) |
|
Chang Gao (1); Daniel Neil (2); Enea Ceolini (1); Shih-Chii Liu (1); and Tobi Delbruck (1) |
||
16:45 |
A Lightweight YOLOv2: A Binarized CNN with A Parallel Support Vector Regression for an FPGA (Slides) |
|
Hiroki Nakahara; Haruyoshi Yonekawa; Tomoya Fujii; and Shimpei Sato |
||
17:10 |
Adjourn |
|
19:00 |
Reception – Marriott Ferrantes Bay View (10th Floor) |
|
|
Monday February 26 |
|
8:00 |
Continental Breakfast – San Carlos Foyer |
|
8:45 |
Opening Remarks – San Carlos 2-4 |
|
Jason Anderson (U Toronto); Kia Bazargan (UMN) |
||
|
Session 1: Architecture |
|
9:00 |
Architecture and Circuit Design of An All-Spintronic FPGA Device (Slides) |
|
Stephen Williams; Mingjie Lin |
||
9:25 |
Liquid Silicon: A Data-Centric Reconfigurable Architecture enabled by RRAM Technology (Slides) |
|
Yue Zha and Jing Li |
||
9:50 |
Improving FPGA Performance with a S44 LUT Structure (Slides) (short paper) |
|
Wenyi Feng (1); Jonathan Greene (1); Alan Mishchenko (2) |
||
9:55 |
Poster Session 1 and Break – San Carlos 1 & San Carlos Foyer |
|
|
Session 2: CAD - San Carlos 2-4 Session Chair: Sabya Das, Xilinx |
|
11:00 |
ParaDRo: A Parallel Deterministic Router Based on Spatial Partitioning and Scheduling (Slides) |
|
Chin Hau Hoo (1); and Akash Kumar (2) |
||
11:25 |
Routing Magic: Performing Computations Using Routing Networks and Voting Logic on Unary Encoded Data (Slides) |
|
Soheil Mohajer; Zhiheng Wang; Kia Bazargan |
||
11:50 |
A Full-System VM-HDL Co-Simulation Framework for Servers with PCIe-Connected FPGAs (Slides) |
|
Shenghsun Cho; Mrunal Patel; Han Chen; Michael Ferdman; Peter Milder |
||
12:15 |
Lunch – See ticket for location |
|
|
Session 3: Deep Learning - San Carlos 2-4 Session Chair: Peter Cheung, Imperial College |
|
14:00 |
Towards a Uniform Template-based Architecture for Accelerating 2D and 3D CNNs on FPGA (Slides) |
|
Junzhong Shen; You Huang; Zelong Wang; Yuran Qiao; Mei Wen; Chunyuan Zhang |
||
14:25 |
A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform - A Deep Learning Case Study (Slides) |
|
Duncan Moss (1); Srivatsan Krishnan (2); Eriko Nurvitadhi (2); Piotr Ratuszniak (2); Chris Johnson (2); Jaewoong Sim (2); Asit Mishra (2); Debbie Marr (2); Suchit Subhaschandra (2); Philip Leong (1) |
||
14:50 |
A Framework for Generating High Throughput CNN Implementations on FPGAs (Slides) (Best Paper Nominee) |
|
Hanqing Zeng; Ren Chen; Chi Zhang; Viktor Prasanna |
||
15:15 |
Poster Session 2 and Break – San Carlos 1 & San Carlos Foyer |
|
|
Session 4: High Level Synthesis 1 - San Carlos 2-4 Session Chair: Stephen Neuendorffer, Xilinx |
|
16:00 |
Dynamically Scheduled High-level Synthesis (Slides) (Best Paper Nominee) |
|
Lana Josipovic; Radhika Ghosal; Paolo Ienne |
||
16:25 |
A Scalable Approach to Exact Resource-Constrained Scheduling Based on a Joint SDC and SAT Formulation (Slides) (Best Paper Nominee) |
|
Steve Dai; Gai Liu; Zhiru Zhang |
||
16:50 |
P4-compatible High-level Synthesis of Low Latency 100 Gb/s Streaming Packet Parsers in FPGAs (Slides) (short paper) |
|
Jeferson Santiago da Silva; François-Raymond Boyer; J.M. Pierre Langlois |
||
18:30 |
Banquet - San Carlos 2-4 |
|
19:30 |
Panel: The Computational Battle for Deep Learning Slides: Debbie Marr (Intel), Jeff Johnson (Facebook), Kees Vissers (Xilinx), Eric Chung (Microsoft), Song Han (Stanford/MIT) |
|
|
Tuesday February 27 |
|
|
Session 5: Applications 1 - San Carlos 2-4 Session Chair: John Lockwood, Algo-Logic Systems |
|
9:00 |
Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL (Slides) |
|
Hamid Reza Zohouri; Artur Podobas; Satoshi Matsuoka |
||
9:25 |
A HOG-based real-time and multi-scale Pedestrian Detector Demonstration System on FPGA (Slides) |
|
Jan Dürre; Dario Paradzik; Holger Blume |
||
9:50 |
Scalable Window Generation for the Intel Broadwell+Arria 10 and High-Bandwidth FPGA Systems |
|
Greg Stitt; Abhay Gupta; Madison Emas; David Wilson; Austin Baylis |
||
10:15 |
High-performance QR Decomposition for FPGAs (Slides) (short paper) |
|
Martin Langhammer; Bogdan Pasca |
||
10:20 |
Poster Session 3 and Break – San Carlos 1 & San Carlos Foyer |
|
|
Session 6: High Level Synthesis 2 - San Carlos 2-4 Session Chair: George Constantinides, Imperial College |
|
11:00 |
ADAM: Automated Design Analysis and Merging for Speeding up FPGA Development (Slides) |
|
Ho-Cheung Ng; Shuanglong Liu; Wayne Luk |
||
11:25 |
Graph-Theoretically Optimal Memory Banking for Stencil-Based Computing Kernels (Slides) |
|
Juan Escobedo; Mingjie Lin |
||
11:50 |
Architecture Exploration for HLS-Oriented FPGA Debug Overlays (Slides) |
|
Al-Shahna Jamal (1); Jeffrey Goeders (2); Steve Wilton (1) |
||
12:15 |
Lunch – Marriott Ferrantes Bay View (10th Floor) |
|
|
Session 7: Circuits and Computation Engines - San Carlos 2-4 Session Chair: Nachiket Kapre, University of Waterloo |
|
13:45 |
Memory-Efficient Fast Fourier Transform on Streaming Data by Fusing Permutations (Slides) |
|
François Serre; Markus Püschel |
||
14:10 |
Degree-aware Hybrid Graph Traversal on FPGA-HMC Platform (Slides) |
|
Jialiang Zhang; Jing Li |
||
14:35 |
Accelerating Graph Analytics By Co-Optimizing Storage and Access on an FPGA-HMC Platform (Slides) |
|
Soroosh Khoram; Jialiang Zhang; Maxwell Strange; Jing Li |
||
15:00 |
Coffee Break - San Carlos Foyer |
|
|
Session 8: Applications 2 - San Carlos 2-4 Session Chair: Lesley Shannon, Simon Fraser University |
|
15:15 |
Configurable FPGA Packet Parser for Terabit Networks with Guaranteed Wire-Speed Throughput (Slides) |
|
Jakub Cabal (1); Pavel Benáček (1); Lukáš Kekely (1); Michal Kekely (2); Viktor Puš (2); Jan Kořenek (3) |
||
15:40 |
FASTCF: FPGA-based Accelerator for Stochastic-Gradient-Descent-based Collaborative Filtering (Slides) (Best Paper Award Recipient) |
|
Shijie Zhou (1); Rajgopal Kannan (2); Yu Min (1); Viktor Prasanna (1) |
||
16:05 |
Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs (Slides) |
|
Yuan Zhou (1); Udit Gupta (2); Steve Dai (1); Ritchie Zhao (1); Nitish Srivastava (1); Hanchen Jin (1); Joseph Featherston (1); Yi-Hsiang Lai (1); Gai Liu (1); Gustavo Angarita Velasquez (1); Wenping Wang (1); Zhiru Zhang (1) |
||
16:30 |
FPGA Fastfood - A High Speed Systolic Implementation of a Large Scale Online Kernel Method (Slides) (short paper) |
|
Sean Fox; David Boland; Philip Leong |
||
16:55 |
Closing Remarks, Best Paper Award |
|
|
Poster Session 1 – San Carlos 1 |
|
|
Optimizations of Sequence Alignment on FPGA: A Case Study of Extended Sequence Alignment |
|
|
Zheming Jin; Kazutomo Yoshii |
|
|
Automatic Optimising CNN with Depthwise Separable Convolution on FPGA |
|
|
Ruizhe Zhao; Xinyu Niu; Wayne Luk |
|
|
Continuous Skyline Computation Accelerator with Parallelizing Dominance Relation Calculations |
|
|
Kenichi Koizumi; Kei Hiraki; Mary Inaba |
|
|
Fast-Track: Exploiting Fast FPGA wiring for implementing NoC shortcuts |
|
|
Nachiket Kapre (1); Tushar Krishna (2) |
|
|
An Optimal Microarchitecture for Stencil Computation with Data Reuse and Fine-Grained Parallelism |
|
|
Yuze Chi; Peipei Zhou; Jason Cong |
|
|
A FPGA friendly approximate computing framework with hybrid Neural networks |
|
|
Haiyue Song; Xiang Song; Tianjian Li; Naifeng Jing; Xiaoyao Liang; Li Jiang |
|
|
In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC |
|
|
Eriko Nurvitadhi; Jeff Cook; Asit Mishra; Debbie Marr; Kevin Nealis; Philip Colangelo; Andrew Ling; Davor Capalija; Utku Aydonat; Sergey Shumarayev; Aravind Dasu |
|
|
Evaluation of OpenCL Performance-oriented Optimizations for Streaming Kernels on the FPGA |
|
|
Zheming Jin |
|
|
K-Flow: A Programming and Scheduling Framework to Optimize Dataflow Execution on CPU-FPGA Platforms |
|
|
Jason Cong (1); Zhenman Fang (1,2); Yao Hu (3); Di Wu (3) |
|
|
FPGA-based LSTM Acceleration for Real-Time EEG Signal Processing |
|
|
Zhe Chen; Andrew Howe; Hugh T. Blair; Jason Cong |
|
|
Understanding Performance Differences of FPGAs and GPUs |
|
|
Jason Cong (1); Zhenman Fang (1,2); Michael Lo (1); Hanrui Wang (1,3); Jingxian Xu (1); Shaochong Zhang (1) |
|
|
Poster Session 2 – San Carlos 1 |
|
|
Software/Hardware co-design for multichannel scheduling in IEEE 802.11p MLME |
|
|
Nan Ding (1); Wei Zhang (2); Yanhua Ma (1); Zhenguo Gao (1) |
|
|
Solving Satisfiability Problem on Quantum Annealer: A Lesson from FPGA CAD Tools |
|
|
Juexiao Su; Lei He |
|
|
Domino: An Asynchronous and Energy-efficient Accelerator for Graph Processing |
|
|
Chongchong Xu; Chao Wang; Yiwei Zhang; Lei Gong; Xi Li; Xuehai Zhou |
|
|
Towards Serial-Equivalent Parallel Routing for FPGAs |
|
|
Minghua Shen (1); Wentai Zhang (2); Nong Xiao (1); Guojie Luo (2) |
|
|
Performance Comparison of Multiple Approaches of Status Register for Medium Density Memory Suitable for Implementation of a Lossless Compression Dictionary |
|
|
Matěj Bartík (2); Tomáš Beneš (1); Sven Ubik (2); Pavel Kubalík (1) |
|
|
BoxPlacer: Force Directed-Based Timing-Driven Placement for Large-Scale FPGAs |
|
|
Minghua Shen (1); Jiaxi Zhang (2); Nong Xiao (1); Guojie Luo (2) |
|
|
DATuner: An Extensible Distributed Autotuning Framework for FPGA Design and Design Automation |
|
|
Gai Liu (1); Ecenur Ustun (1); Shaojie Xiang (1); Chang Xu (2); Guojie Luo (3); Zhiru Zhang (1) |
|
|
Mapping Large-Scale DNNs on Asymmetric FPGAs |
|
|
Wentai Zhang (1); Jiaxi Zhang (1); Minghua Shen (2); Nong Xiao (2); Guojie Luo (1) |
|
|
Software-Defined FPGA-Based Accelerator for Deep Convolutional Neural Networks |
|
|
Yankang Du; Qinrang Liu; Shuai Wei; Chen Gao |
|
|
Design of an MTJ-Based Nonvolatile LUT Circuit with a Data-Update Minimized Shift Operation for an Ultra-Low-Power FPGA |
|
|
Daisuke Suzuki; Takahiro Hanyu |
|
|
High-Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms |
|
|
Weikang Qiao (1); Jieqiong Du (1); Zhenman Fang (1,2); Libo Wang (1); Michael Lo (1); Jason Cong (1); Mau-Chung Frank Chang (1) |
|
|
Poster Session 3 – San Carlos 1 |
|
|
HexCell: a Hexagonal Cell for Evolvable Systolic Arrays on FPGAs |
|
|
Fady Hussein; Luka Daoud; Nader Rafla |
|
|
Label based Feature Analysis and Target Detection with Imager-driven Processing Mode for Ultrafast-Imager |
|
|
Xiaoyu Yu (1); Dong Ye (2) (1 Tencent; 2 Harbin Inst of Tech) |
|
|
A Low-Power Deconvolutional Accelerator for Convolutional Neural Network Based Segmentation on FPGA |
|
|
Shuanglong Liu (1); Xinyu Niu (2); Wayne Luk (1) |
|
|
FPGAs in the Datacenters: the Case of Parallel Hybrid Super Scalar String Sample Sort (pHS^5) |
|
|
Mikhail Asiatici (1); Damian Maiorano (2); Paolo Ienne (1) |
|
|
SIFT keypoint Descriptor Matching Algorithm: A Fully Pipelined Accelerator on FPGA |
|
|
Luka Daoud; Muhammad Kamran Latif; Nader Rafla |
|
|
FGC: A Toolflow for Generating and Configuring Custom FPGAs |
|
|
Oluseyi Ayorinde (1); He Qi (2); Benton Calhoun (2) |
|
|
Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs |
|
|
Philip Colangelo (1); Nasibeh Nasiri (1); Eriko Nurvitadhi (1); Asit Mishra (1); Martin Margala (2); Kevin Nealis (1) |
|
|
LEOSoC: An Open-Source Cross-Platform Embedded Linux Library for Managing Hardware Accelerators in Heterogeneous System-on-Chips |
|
|
Andrea Guerrieri (1); Sahand Kashani-Akhavan (1); Mikhail Asiatici (1); Pasquale Lombardi (2); Bilel Belhadj (2); Paolo Ienne (1) |
|
|
A Self-adaptation Method of Fitting Convolutional Neural Network into FPGA |
|
|
Ning Mao (1); Zhihong Huang (1); Xing Wei (1); He Zhao (1); Xinkai Di (1); Le Yu (2); Haigang Yang (1) |