Knowledge and Reasoning for Drastic Program Improvement: Melding Formal and Statistical Approaches
Abstract
Increasingly sophisticated software is being developed for Naval systems. Developing such software increasingly relies on programming the application logic on top of existing components built as libraries. This leads to a dilemma arising from two conflicting realities. On the one hand, a myriad of software libraries have come into existence and grown rapidly in the past 10 to 15 years, from the prominent NumPy, SciPy, pandas, etc. for scientific computation and data analysis in the rapidly growing user community of the Python language, to libraries of all kinds for other languages. They are available as open source for many important applications and are used by everyday programmers in all these application domains, especially for Python due to its ease of use. On the other hand, developing efficient programs correctly depends critically on knowledge and smart use of these libraries, yet sufficient knowledge is possessed only by true experts who fully understand the relevant functions in the relevant libraries and their proper use in the context of the application logic. Even experts cannot keep up with the growth of such libraries, which are also constantly upgraded for different underlying software versions and hardware architectures. Improper uses can easily lead to drastic slowdowns in the application software. Solving this dilemma requires learning deep knowledge about programming and programs, and using that knowledge to reason about correct and efficient use of libraries. However, the state of the art is extremely limited. While such knowledge could be programmed into a program optimizer, as is done currently, this approach is impractical and does not scale, because the optimizer must be maintained as the library knowledge to be captured grows rapidly. A knowledge base about programming and programs is necessary to capture this knowledge declaratively.
Due to the deep and intractable nature of powerful, high-level, dynamic programming languages like Python, acquiring new knowledge of programs and programming automatically through either formal reasoning or statistical learning alone is far from sufficient. Simply using one approach for certain tasks and the other approach for other tasks is also inadequate. Each challenging task inherently requires combined use of both. A knowledge base about programs and programming will also be critically needed well beyond program optimization. This is because great applications are widely expected to be built with artificial intelligence (AI), yet AI itself consists entirely of computer programs, which must be reasoned about, manipulated, and improved. This project proposes a general knowledge base framework for programming and programs, and the use of this framework in drastic program improvement. The framework acquires knowledge of programs and programming from two sources: formal descriptions of language semantics and implementations, and statistical information about programs and program executions. The framework represents knowledge about programming and programs in two categories: equivalence relations and cost measures. Both categories include both formal and statistical knowledge. The framework then uses the knowledge base to improve programs based on the learned equivalence relations and the desired cost measures and trade-offs. This includes integrated use of formal and statistical knowledge to provide guarantees on the improvement. We will build a Python implementation of this framework, apply it to program improvement for Python, and evaluate it on real applications.
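The two knowledge categories named above, equivalence relations and cost measures, can be illustrated with a small Python sketch. This is not the project's framework: `KnowledgeBase`, `count_common_list`, `count_common_set`, `sample_inputs`, and `measured_cost` are hypothetical names introduced only for illustration, and the sketch uses the standard library alone. The equivalence check here gives only statistical evidence (agreement on random samples); a formal proof would be needed for a guarantee.

```python
import random
import timeit
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def count_common_list(a, b):
    # Membership test on a list scans b linearly for every element of a.
    return sum(1 for x in a if x in b)


def count_common_set(a, b):
    # Same result for hashable elements such as ints; set lookups are
    # constant time on average.
    b_set = set(b)
    return sum(1 for x in a if x in b_set)


@dataclass
class KnowledgeBase:
    """Toy knowledge base (hypothetical): variants believed equivalent per task."""
    variants: Dict[str, List[Callable]] = field(default_factory=dict)

    def add_equivalent(self, task, fn):
        self.variants.setdefault(task, []).append(fn)

    def check_equivalence(self, task, inputs):
        # Statistical evidence only: all variants agree on the sampled inputs.
        fns = self.variants[task]
        return all(fn(*args) == fns[0](*args) for args in inputs for fn in fns)

    def cheapest(self, task, cost):
        # Pick the variant that minimizes the supplied cost measure.
        return min(self.variants[task], key=cost)


def sample_inputs(seed=0, trials=20, size=50):
    # Reproducible random pairs of integer lists.
    rng = random.Random(seed)
    return [
        ([rng.randrange(100) for _ in range(size)],
         [rng.randrange(100) for _ in range(size)])
        for _ in range(trials)
    ]


def measured_cost(fn, size=2000):
    # Empirical cost measure: wall-clock time on one assumed workload.
    a = list(range(size))
    b = list(range(size // 2, size + size // 2))
    return timeit.timeit(lambda: fn(a, b), number=3)


if __name__ == "__main__":
    kb = KnowledgeBase()
    kb.add_equivalent("count_common", count_common_list)
    kb.add_equivalent("count_common", count_common_set)
    if kb.check_equivalence("count_common", sample_inputs()):
        best = kb.cheapest("count_common", measured_cost)
        print("selected variant:", best.__name__)
```

Keeping the cost measure as a parameter of `cheapest` mirrors the idea of choosing among equivalent programs under a desired cost measure: the same equivalence knowledge can be reused with a timing-based cost, a memory-based cost, or an analytically derived one.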
Document Details
- Document Type: DoD Grant Award
- Publication Date: Dec 16, 2019
- Source ID: N000142012751
Entities
People
- Scott Stoller
Organizations
- Office of Naval Research
- Research Foundation for the State University of New York
- United States Navy