The Effect of Profile Choice and Profile Gathering Methods on Profile-Driven Optimization Systems

Abstract

Profile-driven optimization can substantially improve the quality of code produced by a compiler or link-time optimizer. In this work, we analyze several important aspects of profile-driven optimization. We examine its effectiveness in two commercial-quality optimizers (Digital's GEM compiler and the link-time optimizer 'alto'). We determine how much variability in profile-driven optimization performance results from choosing different training profiles, and how much optimization benefit results from choosing more 'accurate' profiles (that is, profiles that better predict the way that a program is actually run). We examine low-overhead profiling methods such as static estimation (estimating profiles using static heuristics) and statistical sampling (gathering profiles by sampling only a small number of basic block executions). We analyze some profile-driven optimization results in great detail, and present a methodology for attributing the effects of profile-driven optimization to the profile data associated with individual functions. Our results show that profile-driven optimization is effective on average, but unreliable for any individual benchmark. For most benchmarks, using more accurate profiles is only weakly connected to improved profile-driven optimization performance. However, low-overhead profiling techniques substantially degrade the reliability and average performance of profile-driven optimization, often to the point of rendering the entire process useless. Our analysis also shows that the effects of profile-driven optimization are highly concentrated: whether profile data improves or worsens the performance of optimized code, the vast majority of the effect can often be attributed to the profile data associated with just a few functions.

Document Details

Document Type: Technical Report
Publication Date: Oct 01, 2003
Accession Number: ADA461168

Entities

People

  • Geoff Langdale

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Combinatorial Analysis
  • Computational Science
  • Computers
  • Data Mining
  • Data Science
  • Estimators
  • Experimental Design
  • Factorial Design
  • Information Science
  • Instruction Set Architecture
  • Measurement
  • Network Science
  • Operating Systems
  • Probability
  • Statistical Algorithms
  • Statistical Analysis
  • Statistical Sampling

Fields of Study

  • Computer science
  • Physics

Readers

  • Computational Modeling and Simulation
  • Operations Research