Learning SAS's Perl Regular Expression Matching the Easy Way: By Doing

Abstract

Perl Regular Expression (PRX) functions were added in SAS 9. The key to using them is learning how PRX metacharacters behave within a PRX match, and the more you practice PRX matching, the more proficient you become. Ideally, practice would involve a method for repeating the matching process so that you can see cause and effect between the input (modifications to the Perl regular expression, the source string, or both) and the output (the match results). Just as important as the practice itself is a method for logging your "practice trail" in a file, both for future reference and expansion and to avoid reinventing the wheel. But how do you do this without the file becoming bloated, cluttered, and unmanageable? The answer is to let it become bloated and cluttered. Enter the regex_learning_tool, which consists of a SAS Enterprise Guide project and an Excel file containing your practice trail in the form of match records, each with a different Perl regular expression and/or source string. The regex_learning_tool allows both beginner and expert to practice PRX matching efficiently by selecting and processing only the match records of interest, based on a search for either PRX metacharacters contained in the Perl regular expression or character strings contained in a match description field. The current Excel file contains over 400 match records demonstrating the use of almost all PRX metacharacters. This paper explains how to use the regex_learning_tool so that you can practice PRX matching, retain what you learn, and not have to worry about how bloated, cluttered, and unmanageable your practice trail becomes. The regex_learning_tool project was written using SAS Enterprise Guide 5.1 and SAS 9.3 on a Windows 7 Enterprise operating system. A copy of the SAS Enterprise Guide project, the Excel file, and other tool-related material can be found at sascommunity.org under Papers and Presentations.
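The select-then-match workflow described in the abstract can be illustrated with a minimal sketch. This is not the regex_learning_tool itself: it uses Python's re module as a stand-in for the SAS PRX functions (the two regex dialects are similar but not identical), and the record layout, the sample records, and the function names select_records and run_matches are all hypothetical. The position is reported 1-based, with 0 for no match, to mirror how a match position is conventionally reported in SAS.

```python
import re

# Hypothetical practice trail: each record pairs a pattern with a source
# string and a free-text description (a stand-in for rows of the Excel file).
MATCH_RECORDS = [
    {"regex": r"\d+", "source": "Report ADA616591",
     "description": "one or more digits"},
    {"regex": r"^SAS", "source": "SAS 9.3",
     "description": "anchor at start of string"},
    {"regex": r"c[aeiou]t", "source": "a cat sat",
     "description": "character class of vowels"},
]


def select_records(records, metachar=None, text=None):
    """Keep records whose regex contains `metachar` and whose description
    contains `text` (case-insensitive). A filter left as None is skipped."""
    selected = []
    for rec in records:
        if metachar is not None and metachar not in rec["regex"]:
            continue
        if text is not None and text.lower() not in rec["description"].lower():
            continue
        selected.append(rec)
    return selected


def run_matches(records):
    """Return (regex, source, position) per record, where position is the
    1-based start of the first match, or 0 if the pattern does not match."""
    results = []
    for rec in records:
        m = re.search(rec["regex"], rec["source"])
        results.append((rec["regex"], rec["source"], m.start() + 1 if m else 0))
    return results


if __name__ == "__main__":
    # Practice only the records that use the "^" metacharacter.
    for regex, source, pos in run_matches(select_records(MATCH_RECORDS, metachar="^")):
        print(f"{regex!r} on {source!r}: position {pos}")
```

Because selection is separate from matching, the trail can keep growing without slowing practice down: only the records that pass the filter are ever processed.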

Document Details

Document Type
Technical Report
Publication Date
Jan 12, 2015
Accession Number
ADA616591

Entities

People

  • Paul Genovesi

Organizations

  • United States Air Force School of Aerospace Medicine

Tags

Communities of Interest

  • Space

DTIC Thesaurus Topics

  • Aerospace Medicine
  • Department Of Defense
  • Electronic Mail
  • Information Operations
  • Instructions
  • Learning
  • Materials
  • Military Medicine
  • Operating Systems
  • Personality
  • Trademarks
  • Truncation

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design.
  • Database Systems and Applications
  • Organizational Process Management (OPM).