Embedded Deep Learning and Advanced Computation

Abstract

In this project, several novel techniques were developed for accelerating deep learning computation on embedded devices with highly constrained computing resources. These techniques include: (1) using variable-precision block floating point with stochastic rounding, (2) employing term quantization, which quantizes floating-point numbers into power-of-two terms, (3) extending pre-trained language models with domain-specific vocabulary, (4) minimizing memory access with schedules using constant-bandwidth blocks, (5) applying full-stack optimization in the co-design of algorithms, models, and architectures, (6) splitting neural networks for wearable computing, (7) designing algorithms for detecting out-of-distribution inputs to DNNs, (8) packing sparse DNNs for efficient systolic array implementations, (9) designing memory-on-logic architectures and systolic building blocks for 3D-IC implementations of DNNs, and (10) leveraging bit-level sparsity in in-memory computing. These methods complement each other and are applicable to all resource-constrained deep learning accelerators.

Document Details

Document Type: Technical Report
Publication Date: Feb 01, 2023
Accession Number: AD1193214

Entities

People

H. T. Kung

Organizations

Harvard University
