On the Regularization Properties of Structured Dropout

Abstract

Dropout and its extensions (e.g., DropBlock and DropConnect) are popular heuristics for training neural networks that have been shown to improve generalization in practice. However, a theoretical understanding of their optimization and regularization properties remains elusive. Recent work shows that, for single-hidden-layer linear networks, Dropout is a stochastic gradient descent method for minimizing a regularized loss, and that the regularizer induces solutions that are low-rank and balanced. In this work we show that, for single-hidden-layer linear networks, DropBlock induces spectral k-support norm regularization and promotes solutions that are low-rank and have factors with equal norm. We also show that the global minimizer for DropBlock can be computed in closed form, and that DropConnect is equivalent to Dropout. We then show that some of these results extend to a general class of Dropout strategies and, under some assumptions, to deep non-linear networks when Dropout is applied to the last layer. We verify our theoretical claims and assumptions experimentally with commonly used network architectures.
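
The statement that Dropout minimizes a regularized loss can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the report. It assumes a squared loss, a network of the form U diag(z) V^T X with independent Bernoulli(theta) masks z scaled by 1/theta, and the names U, V, X, Y and theta, all of which are hypothetical notation introduced here. Under those assumptions the expected Dropout loss should equal the plain squared loss plus (1 - theta)/theta times the sum over hidden units of ||u_i||^2 ||v_i^T X||^2. The code enumerates every mask exactly rather than sampling.

```python
import itertools
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def sq_fro(A):
    """Squared Frobenius norm."""
    return sum(x * x for row in A for x in row)


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def dropout_expected_loss(U, V, X, Y, theta):
    """Exact expectation of ||Y - (1/theta) U diag(z) V^T X||_F^2 over all masks z."""
    hidden = matmul(transpose(V), X)  # one row per hidden unit
    total = 0.0
    for z in itertools.product([0, 1], repeat=len(hidden)):
        prob = 1.0
        for zi in z:
            prob *= theta if zi else 1.0 - theta
        # Drop (zero out) or rescale each hidden unit according to the mask.
        masked = [[zi / theta * h for h in row] for zi, row in zip(z, hidden)]
        total += prob * sq_fro(sub(Y, matmul(U, masked)))
    return total


def regularized_loss(U, V, X, Y, theta):
    """Deterministic loss plus the assumed Dropout-induced regularizer."""
    hidden = matmul(transpose(V), X)
    fit = sq_fro(sub(Y, matmul(U, hidden)))
    columns_of_U = transpose(U)
    reg = sum(
        sum(u * u for u in u_i) * sum(h * h for h in h_i)
        for u_i, h_i in zip(columns_of_U, hidden)
    )
    return fit + (1.0 - theta) / theta * reg


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (assumed): 2 outputs, 3 inputs, 3 hidden units, 4 samples.
rng = random.Random(0)
U = rand_matrix(rng, 2, 3)
V = rand_matrix(rng, 3, 3)
X = rand_matrix(rng, 3, 4)
Y = rand_matrix(rng, 2, 4)
theta = 0.7

lhs = dropout_expected_loss(U, V, X, Y, theta)
rhs = regularized_loss(U, V, X, Y, theta)
print(abs(lhs - rhs))  # difference between the two quantities
```

Exact enumeration is practical only for a handful of hidden units, since the number of masks grows as 2^d; it is used here so that the comparison does not depend on sampling noise.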

Document Details

Document Type
Technical Report
Publication Date
Jun 14, 2020
Accession Number
AD1152541

Entities

People

  • Ambar Pal
  • Benjamin D. Haeffele
  • Connor Lane
  • Rene Vidal

Organizations

  • Johns Hopkins University

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Bernoulli Distribution
  • Computer Languages
  • Computing System Architectures
  • Data Mining
  • Data Science
  • Information Processing
  • Information Science
  • Information Systems
  • Iterations
  • Literature
  • Machine Learning
  • Network Architecture
  • Neural Networks
  • Optimization
  • Probability
  • Probability Distributions
  • Random Variables
  • Theorems

Fields of Study

  • Computer science

Readers

  • Linear Algebra
  • Neural Network Machine Learning
  • Operations Research

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks