A Theoretically Principled Framework for Learning by Pruning

Abstract

Stochastic gradient descent (SGD) is the algorithmic cornerstone of modern machine learning, as it can effectively train deep networks of previously unthinkable parameter scales to state-of-the-art accuracy. However, there have been several concerns with regard to its robustness, parallelism, and latency, and serious doubts about its biological plausibility. As it presently stands, all alternatives have fallen short of even approaching the performance of SGD. However, a recent line of experimental work offers a fascinating glimpse into what could be a vastly different, combinatorial approach to learning: pruning random networks. These studies show that even at initialization, and in the complete absence of weight training, one can find subnetworks of the initial random model that achieve near state-of-the-art prediction accuracy across many machine learning tasks.

The experimental evidence on the existence of highly accurate subnetworks within random models is fascinating. However, it remains unclear whether this is a universal phenomenon. It is also entirely unknown what the learning and computational limits of pruning-only algorithms are, and what their implications may be. Addressing these challenges will offer strong cues for re-imagining learning algorithms and the design of predictive models.

The grand endeavor of this ONR project is to develop a theoretically principled framework to analyze the learning-by-pruning phenomenon. Our goal is to uncover its learning and algorithmic implications, while understanding its advantages and limitations in comparison to training-based learning. Our project follows a concrete research plan focused around three thrusts.

T1: A General Framework for Achievability Results, where we identify settings under which learning by pruning is information-theoretically possible.
T2: Towards Principled Algorithms for Learning by Pruning, where we develop algorithms from first principles, provide provable guarantees, and examine how the architectures to be pruned can be tailored to the task at hand.
T3: Implications of Learning by Pruning, where we study the generalization, geometric, algorithmic, and hardware implications and benefits of pruning, and compare it with backprop-based training, with the goal of offering a roadmap to guide algorithmic choices.

If successful, our proposed program will be transformational in the way that we train models, and is envisioned to create a new interdisciplinary field of deep learning through combinatorial pruning algorithms. The proposed research has the potential to significantly impact the theory and design of future machine learning systems, for both civilian and naval applications, as it will address several deployment challenges related to scalability, hardware suitability, and provable operational guarantees.

Approved for Public Release
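
As a toy illustration of what "learning by pruning" means in the abstract above, the sketch below freezes a handful of random weights and searches only over binary keep/drop masks, with no weight updates at all. It is not the project's method: the single-target setup, the brute-force search, and the names `best_mask`, `frozen_weights`, and `target` are all assumptions made for illustration.

```python
import itertools
import random


def best_mask(frozen_weights, target):
    """Exhaustively search binary masks over frozen weights.

    Returns (mask, pruned_sum, abs_error) for the mask whose kept
    weights sum closest to `target`. Weights are never modified;
    the only thing "learned" is which ones to keep.
    """
    best = None
    for mask in itertools.product((0, 1), repeat=len(frozen_weights)):
        pruned_sum = sum(m * w for m, w in zip(mask, frozen_weights))
        error = abs(pruned_sum - target)
        if best is None or error < best[2]:
            best = (mask, pruned_sum, error)
    return best


# Hypothetical setup: 12 random weights drawn once and then frozen.
rng = random.Random(0)
frozen_weights = [rng.uniform(-1.0, 1.0) for _ in range(12)]
target = 0.37  # stand-in for a weight that training would otherwise fit

mask, pruned_sum, error = best_mask(frozen_weights, target)
print("kept", sum(mask), "of", len(mask), "weights")
print("pruned sum:", round(pruned_sum, 4), "error:", round(error, 6))
```

The exhaustive search visits all 2^12 masks, which is only feasible at toy scale. That combinatorial blow-up is one concrete reason the computational limits of pruning-only algorithms are an open question.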

Document Details

Document Type: DoD Grant Award
Publication Date: Aug 20, 2021
Source ID: N000142112806

Entities

People

Dimitrios Papailiopoulos

Organizations

Office of Naval Research
United States Navy
University of Wisconsin System
