Embedded Deep Learning and Advanced Computation

Abstract

In this project, several novel techniques were developed for accelerating deep learning computation on embedded devices with highly constrained computing resources. These techniques include: (1) using variable-precision block floating point with stochastic rounding, (2) employing term quantization, which quantizes floating-point numbers into power-of-two terms, (3) extending pre-trained language models with domain-specific vocabulary, (4) minimizing memory access with schedules that use constant-bandwidth blocks, (5) applying full-stack optimization in the co-design of algorithms, models, and architectures, (6) splitting neural networks for wearable computing, (7) designing algorithms for detecting out-of-distribution inputs to DNNs, (8) packing sparse DNNs for efficient systolic array implementations, (9) designing memory-on-logic architectures and systolic building blocks for 3D-IC implementations of DNNs, and (10) leveraging bit-level sparsity in in-memory computing. These methods complement each other and are applicable to all resource-constrained deep learning accelerators.
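As an illustration of techniques (1) and (2) only, the following is a minimal sketch and not the report's implementation. It assumes that stochastic rounding means rounding up with probability equal to the fractional part, and that a power-of-two term decomposition keeps the largest set bits of a non-negative integer. The function names `stochastic_round` and `power_of_two_terms` and the `max_terms` budget are hypothetical and do not come from the report.

```python
import math
import random
from typing import List


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x to a neighboring integer, rounding up with probability
    equal to its fractional part, so the result is unbiased on average."""
    lower = math.floor(x)
    frac = x - lower
    return lower + (1 if rng.random() < frac else 0)


def power_of_two_terms(value: int, max_terms: int) -> List[int]:
    """Return up to `max_terms` of the largest power-of-two terms in the
    binary expansion of a non-negative integer (illustrative assumption)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    terms = []
    bit = value.bit_length() - 1
    # Scan from the most significant bit down, keeping set bits only.
    while bit >= 0 and len(terms) < max_terms:
        if value & (1 << bit):
            terms.append(1 << bit)
        bit -= 1
    return terms


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_round(2.25, rng) for _ in range(10000)]
    print("mean of stochastic roundings of 2.25:", sum(samples) / len(samples))
    print("23 kept to two terms:", power_of_two_terms(23, 2))
```

With a budget of two terms, 23 (binary 10111) is approximated by 16 + 4 = 20; a larger budget trades more computation for a smaller approximation error.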

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2023
Accession Number
AD1193214

Entities

People

  • H. T. Kung

Organizations

  • Harvard University

Tags

Communities of Interest

  • Advanced Electronics
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Air Force
  • Air Force Research Laboratories
  • Algorithms
  • Application-Specific Integrated Circuits
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Central Processing Units
  • Computer Architecture
  • Computer Languages
  • Computer Programming
  • Computer Vision
  • Computers
  • Computing System Architectures
  • Convolutional Neural Networks
  • Deep Learning
  • Field Programmable Gate Arrays
  • High Performance Computing
  • Machine Learning
  • Network Architecture
  • Neural Networks
  • Pattern Recognition
  • United States

Fields of Study

  • Computer science

Readers

  • Approximation Theory
  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks