CHAPTER 2:
Quality-Improvement Objectives

The major goal of the Ada Analyzer is to support software quality-improvement efforts by locating areas of the code that could potentially be improved and by providing the information necessary to decide whether to make a change. This chapter defines a set of specific quality-improvement objectives and describes how to achieve them with the Ada Analyzer. The following information is provided for each major objective:
- Detailed descriptions of possible subobjectives;
- A list of Ada Analyzer commands that can be used to address the analysis objective;
- The output that can be expected from each command; and
- How to interpret and use the output to make progress against the objective.
The first section of this chapter provides a general overview of all objectives. The next section describes several different analysis methods and tips on how the Ada Analyzer can be best configured for these methods. All remaining sections describe specific quality-improvement objectives. These sections are not necessarily intended to be read from start to finish. Instead, users with specific quality-improvement goals should go to the section describing that goal and follow the guidelines described there. A summary of all objectives and the commands supporting those objectives appears in "Quality-Improvement Objectives and Commands to Use" on page 278.
Overview

Software quality can be measured in many dimensions. Perhaps the highest priority is that a program should function correctly. A program's overall quality is also measured by its efficiency, portability, maintainability, readability, consistency, and adherence to development standards. Even correctness is elusive, since bugs always remain in software, especially in untested paths and input scenarios. Realistically, correctness is only a measure of the consistency between the requirements specification, the behavior of the software, and the user documentation, all of which are written at vastly different points in the development cycle.
Software quality can be evaluated to determine its current rating against a set of "goodness" or "badness" criteria. This rating is often reduced to a set of numbers that, when they fall within a particular range, indicate high, medium, or low quality. This may provide a sense of relative quality, but most rating systems do not suggest how the quality of the software can be improved.
A second approach, that taken by the Ada Analyzer, is to analyze software by locating areas of the code that can affect the quality dimensions cited above. These areas of probable impact are then brought to the attention of the user for further inspection. The user determines whether the code segment is appropriate based on an evaluation of several competing criteria and the full context in which the construct appears.
The Ada Analyzer generates information that supports interactive analysis by project members and quality-assurance personnel. It does not attempt to determine whether a construct is good or bad but to highlight and illuminate code sections that should be evaluated further. The focus is on identifying potential changes that could improve software quality. The user selects the changes to be made after all available information and overall context are considered.
With these goals in mind, Ada Analyzer commands have been organized around a set of quality-improvement objectives. This chapter fully defines each objective and describes how the Ada Analyzer can be used to achieve that objective. These guidelines can be used as the basis for a detailed analysis plan that is specially tailored to project needs.
The following analysis objectives are currently supported by the Ada Analyzer. They are discussed in more depth in subsequent sections of this chapter:
- Program design and structure: The Ada Analyzer can illuminate how a software system is designed and structured. It locates key program structures and offers condensed information about their use throughout the program. Analysis includes packaging, data typing and object declaration, subprogram interfaces, and dependency relationships of all items. In addition to providing a better understanding of a program's structure, this analysis can often suggest restructuring opportunities, identify redundancy and obsolete items, and locate inconsistencies across the software.
- Software content: The Ada Analyzer locates and counts construct usage at various levels of detail from the very high level to highly precise information on low-level constructs. This can be used to profile software or to quickly locate all instances of a specific construct that has been identified as having a negative impact on the software. A library of metrics is also available to characterize software over a period of time. Metrics can be collected when project milestones are completed and then compared to analyze trends and estimate project-completion status.
- Readability: The Ada Analyzer locates programming constructs that affect local readability, are inconsistent, or, because of their complexity, make programs harder to understand and maintain.
- Portability and compatibility: The Ada Analyzer locates constructs that generally are not portable or are implementation-dependent. This analysis can be used to improve the general portability of software units or to avoid problems when the software is intended for another target compilation system.
- Reusability: The Ada Analyzer supports identification and analysis of Ada units, especially generics, that have reuse potential. It also identifies construct usage that reduces the likelihood that a unit can be reused.
- Conformance to programming standards: The Ada Analyzer locates items that violate programming standards. A few generally accepted criteria are included with this release. These criteria can easily be expanded to check project-specific standards.
- Code correctness: The Ada Analyzer locates areas in the code that have a higher likelihood of error, bringing the possibility for error to the attention of the developer or tester.
- Code efficiency: The Ada Analyzer locates constructs that generally have a high impact on code size and/or run-time speed. This, coupled with a good understanding of the cross-compilation tools, can rapidly identify opportunities for improved efficiency.
- Reduction of compilation time: The Ada Analyzer can be used to identify the sections of the software that increase compilation time. Reducing compilation time can have a dramatic impact on the number of testing cycles that can be completed before a release must be made.
- Support of testing: Ada Analyzer commands can be used to support the testing process through identification of branch points and the input and output of data in subprograms.
It is important to note that many of these objectives overlap. Some overlap in positive ways. Improving standards conformance, for example, can make a program more consistent, increase its readability, and even improve efficiency when certain prohibited constructs are removed. Some efficiency optimizations, on the other hand, can have negative effects on readability, portability, and reuse. In the following sections, each objective is described somewhat narrowly, with descriptive text outlining how to achieve the stated objective. Competing objectives are sometimes mentioned, but details are often left to the user to evaluate. The Ada Analyzer is intended to provide information to the user, leaving final judgment to the user's understanding of all relevant objectives.
Command descriptions in this chapter are intentionally brief. Although references are made to the output content, actual sample output is not included here. A complete description of all commands along with sample output is provided in Chapter 3, "Command Descriptions."
Analysis Methods

Several basic methods of Ada software analysis are supported by the Ada Analyzer. Online analysis is usually interactive, with users scanning hypertable output and traversing to the actual code when more information or context is required. A more formal quality-validation process is supported by commands that report coding-standard violations, compiler-compatibility problems, and/or potential programming errors. These commands typically are used by developers before a release is made with the objective of reducing or eliminating the number of reported violations. Hard-copy forms of the hypertables can be used to support a code-review process or as general reference information for developers and project managers. This section describes each of these methods and gives hints on how to use Ada Analyzer commands most effectively within these domains.
Online Analysis
Most Ada Analyzer commands generate hypertable reports. Hypertables are essentially read-only editors that can be used interactively to accelerate analysis of large amounts of code. Hypertables display condensed forms of information, with each row of the table containing a separate instance of an Ada construct that matches the analysis criteria. Table columns contain attributes of each construct and help to characterize details or provide the context in which each construct appears. Some columns are connected back to the construct or attribute and can be reached by clicking on the [Visit] button. In this way, tables can be scanned for a quick overview of the contents, with traversal used to gather more information when necessary. If the user decides that a change is necessary, it can be made immediately in the code under inspection since traversal arrives at the exact point in the code where the construct is located.
A specific sort order has been chosen as the default for all Ada Analyzer commands. This order is always left to right by column. (Note that the column order for each command can be seen in Chapter 3, "Command Descriptions.") It is possible to re-sort a hypertable after it has been generated. Given that only two sort options are available when generating hypertables and that many tables have more than two attribute columns, another order may be better suited to the analysis objective at hand. If the sort order in the existing table is not appropriate, the Sort option on the Format menu can be used to reorder the information and store it in a new preview object. A description of this capability appears on page 16.
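The default left-to-right ordering is simply a lexicographic sort on the table's columns, so re-sorting amounts to choosing a different column sequence as the sort key. The following Python fragment is an illustrative sketch only; the row layout and column names are hypothetical, not actual Ada Analyzer output:

```python
def sort_rows(rows, column_order):
    """Sort hypertable rows lexicographically by the given column indexes."""
    return sorted(rows, key=lambda row: tuple(row[i] for i in column_order))

# Hypothetical rows: (unit, construct kind, line number)
rows = [
    ("pkg_b", "task", 10),
    ("pkg_a", "generic", 40),
    ("pkg_a", "exception", 12),
]

default = sort_rows(rows, [0, 1, 2])   # the left-to-right default order
by_kind = sort_rows(rows, [1, 0, 2])   # re-sorted with the kind column first
```

Choosing a different index sequence corresponds to selecting a different primary sort column with the Sort option.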
Code Reviews
Code reviews should be an essential part of any large-scale software-development process. In some studies, extensive use of peer review uncovered more programming errors than testing and debugging combined. Code reviews, when used at selective points in the development process, can provide a large return on the time invested. Early code reviews can ensure that a consistent design and structure are used throughout the program. This can reduce implementation costs and avoid the need to restructure in later development phases. Code reviews held after implementation can identify programming errors, ensure a consistent use of the Ada language and the application's meta-language interfaces, and improve the execution efficiency of the code.
Code reviews consist of four phases:
- Examination: Tool-assisted review of the code and collection of problem areas for discussion. Hard-copy listings are generally used to annotate issues found during review.
- Discussion: A meeting is held to discuss the issues located by all reviewers. Developers are also present to provide their implementation strategy and perspective on proposed changes. Action items (changes to be made) are written as agreements are reached on the modifications that should be made.
- Correction: Developers make the changes requested in the list of action items.
- Verification: A final review ensures that all changes are made and that those changes do not introduce new problems.
The Ada Analyzer can be used to support the code-review process in several ways. Command output can be used by reviewers before the review meeting to prepare their analysis and comments. Reviewers usually are not the authors of the code that they are reviewing and thus are likely to be unfamiliar with its details before the review. Structural analysis can be used to locate key constructs and increase the reviewers' overall understanding of the code and its design. Other commands can be used to reduce the time required to find potential problems or areas in the code that deserve more attention. Commands analyzing code correctness, portability, and readability can be used to identify areas for improvement. If general structural problems or inconsistencies are found through visual inspection, the Ada Analyzer can assist in finding all occurrences of this pattern within the code. Custom locator commands can be written and integrated into the standards-conformance checking for constructs not already located by the Ada Analyzer (see the section titled "Adding New Compatibility and Standards-Conformance Rules" on Page 252).
Although a code review could be held online, most reviews are held around a table, with discussion centering on hard copies of the code that reviewers have annotated. The effectiveness of hard-copy output used during reviews can be improved in two ways:
- A program listing of all code under review can be generated with the Generate_Listing command. Output can be generated with line numbers, unit-name headers, page numbers, and a complete table of contents. Various other options are available for controlling the precise format of the output.
- Two analysis switch options, Include_Line_Numbers and Include_Unit_Names, should be set to True before executing an analysis command. This provides cross-referencing into the program listing for quick reference at the review meeting.
The Ada Analyzer can also help with the correction and verification process. If items to be corrected are located in hypertables, the developer can use the built-in traversal to quickly locate the areas of the code requiring correction. Part of the verification process can also be performed by re-executing the Ada Analyzer commands to rapidly verify that corrected problems are no longer present.
Validation of Coding Standards
Most software-development projects have a set of coding standards for developers to follow. Coding standards generally consist of a set of language constructs that are prohibited from use by developers. These standards may be motivated by any combination of style, consistency, or efficiency considerations. Use of host-based development of software for embedded targets may also involve compatibility issues between the two compilation systems used in that process.
The difficulty with such standards is that they often contain a long, complicated list of restrictions that are difficult to remember and follow during day-to-day development. Coding-standards analysis attempts to verify that software does not contain any standards violations or compatibility problems. The existence of entries in a hypertable report indicates that the code under analysis has violated one of these standards. Using the traversal built into hypertables, the user can move to each reported violation and make the necessary changes to correct the problem.
The interactive form of coding-standard validation can be used on a daily basis during software development to ensure that the standards are adhered to from the beginning. This is much more efficient than fixing problems after initial development is complete. Continuous checking by developers helps to ensure standards conformance among the entire development team.
The results of batch analysis can be included as part of the code-review process. Some projects require that standards be verified more often, however. Certain standards may be verified before each release of the software, for example. Compatibility standards should be verified before each transfer of the software to the target compilation system to avoid having to correct and retransmit units that will not compile.
Generation of Project Information
Some Ada Analyzer commands generate useful reference information. Commands in this group are most often of the form Display_*, although the counting commands and some Locate_* commands also belong in this group. Information about Ada units and their interdependencies is one example of information that can be useful to all project members. Location of all key programming constructs such as tasks, generics, exceptions, and subprograms is another. Cross-referenced lists of declarations and their dependencies are a third example. Finally, some of the counting information can be useful at the management level to estimate project-completion milestones.
Although traversal is also built into the hypertable reports containing reference information, the hard-copy form can provide a good cross-reference and/or data dictionary of the program. Setting the Analysis_Switches option Display_Subsystem_Names to True is recommended in this case to fully specify the location of any reported construct.
Analyzing Program Structure and Content

This section discusses analysis of a program's overall structure and its content. These objectives might be pursued most often when the user is unfamiliar with the code and requires an overview of its content. Maintenance projects or any effort that requires some form of reverse engineering would likely employ these techniques as well. The specific objectives in this section are:
- Locating key constructs
- Unit partitioning and dependencies
- Subsystem partitioning and dependencies
- Dynamic analysis
- Counting
- Construct location
- Metrics collection
Locating Key Constructs
Detailed Objectives
This objective focuses on the basic structure of an Ada program and the location of key constructs within that program. Library units and their with dependencies form the organizational skeleton of any Ada program. All other program entities are embedded within this structure.
Key program structures include:
- Declarations: Declarative items define the entities that can be referenced in a program. Type declarations, specifically, form a semantic network from which data objects are declared. Subprograms form the set of operations that are applied to those objects. An overview of all declarations, especially when separated by kind, provides one key dimension in understanding any program.
- Exceptions: Exceptions often form the core of any error-handling strategy in an Ada program. Locating them is central to understanding this aspect of interprocedural communication.
- Tasks: Use of tasking in an Ada program often defines its approach for scheduling separate processes. Tasking can also be used to synchronize access to shared resources. Locating all tasks and understanding their interactions is another key component of Ada program analysis.
- Generics: Generic units often are used to implement key abstractions. They can also be used when dependency inversion requires that low-level functions "look upward" in the dependency hierarchy. A detailed analysis of generics can provide insight into the design of any program.
Applicable Commands and Output Interpretation
- Display_Expanded_Type_Structure: Displays (recursively) all components of record and array types. The type referenced by an access type can also be expanded on request. Each component is displayed with its type and size in bytes (if computable). This display can be used to visualize the size and structure of large, complicated composite types.
- Display_Unit_Declarations: Displays all unit declarations in a hierarchical format similar to the code itself. Comments, blank lines, and other extraneous details are not included. This provides a high-level representation of code that is easy to scan for overall structure.
- Display_With_Closure: Locates all library units and their families (including subunits and child units) and displays those units in either an indented hierarchy or a sorted list. In the hierarchical format, withed units are expanded inline and indented one level. When displayed as a flat list, units can be sorted in either alphabetical or dependency order. In addition, the with clauses for each library unit are listed with a reference to where the withed library unit and its family are likewise expanded. This command allows users to see package partitioning, subunit partitioning, specific dependencies of each unit, and the global dependency hierarchy.
- Locate_Class_Hierarchies: Locates all tagged types and the closure of the types that are derived from them. Both explicit and inherited methods for each class can be displayed. This report can be used to visualize the structure of the class library of a system.
- Locate_Exceptions: Displays for each exception declaration all:
- Raises of that exception
- Handlers for that exception
The subprogram or package in which the raise or handler appears is included in the output. Anonymous raises are also identified. With this output, an analysis of all interactions for each exception can be performed.
- Locate_Generics: Displays all generic declarations and their instantiations. The first three (limited only by space) formal parameter kinds are included for each generic declaration. Instantiation entries include the actual parameter used for each formal parameter in the instantiation.
- Locate_Named_Declarations: Displays all declarations sorted by name and separated by kind into multiple tables. Boolean parameters are available to select which declarations are of most interest.
- Locate_Similar_Record_Structures: Locates record types that have an identical or similar set of component types. Two record types, each with three integer components, would be reported as identical. Component names are ignored. Records with one extra component, or one component that has a different type, are reported as having one difference. This information can be used to recognize redundant types or opportunities to combine similar records into a single discriminated type.
- Locate_Tasking: Displays all tasks and protected objects, all entries and protected subprograms, and the accepts and calls for each entry. The output is sorted to focus on entry calls and accept statements for each entry to highlight rendezvous possibilities.
- Locate_Type_Declarations: Displays all type declarations separated by kind and provides relevant attributes for each type. Numeric types, for example, list the bounds and accuracy constraints. Array types list the number of indexes and the kind and name of the component type of the array. This information can be used to better understand the network of type declarations from which all program objects are declared.
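The similarity comparison described for Locate_Similar_Record_Structures can be pictured as comparing the multisets of component types while ignoring component names. The following Python fragment is one illustrative reading of that rule, not the tool's actual algorithm:

```python
from collections import Counter

def record_difference(components_a, components_b):
    """Count component-type differences between two record types.

    Component names are ignored; the component types of each record are
    compared as multisets. One extra component, or one component of a
    different type, counts as one difference."""
    a, b = Counter(components_a), Counter(components_b)
    extra_a = sum((a - b).values())   # types appearing only in the first record
    extra_b = sum((b - a).values())   # types appearing only in the second record
    return max(extra_a, extra_b)

# Two records with three integer components each: reported as identical.
identical = record_difference(
    ["integer", "integer", "integer"],
    ["integer", "integer", "integer"])

# One component has a different type: reported as one difference.
one_diff = record_difference(
    ["integer", "integer", "integer"],
    ["integer", "integer", "float"])
```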
Unit Partitioning and Dependencies
Detailed Objectives
This objective focuses on the partitioning of Ada units within the program and the dependencies among those units.
Applicable Commands and Output Interpretation
- Display_Unit_Relationships: Similar to the Display_With_Closure command, but computes only the first-level withing relationships of the units specified and not the closure. Cross-references are also not computed. This information can be used to understand the relationships of units in just one subsystem, for example.
- Locate_External_Dependencies: Locates dependencies that one set of Ada units has on another set. Dependencies are listed from a unit in the referencing set to a declaration in the referenced set.
- Display_With_Closure: Locates all library units and their families (including subunits and child units) and displays those units in either an indented hierarchy or a sorted list. In the hierarchical format, withed units are expanded inline and indented one level. When displayed as a flat list, units can be sorted in either alphabetical or dependency order. In addition, the with clauses for each library unit are listed with a reference to where the withed library unit and its family are likewise expanded. This command allows the user to see package partitioning, subunit partitioning, specific dependencies of each unit, and the global dependency hierarchy.
Note: The List_Closure_To_Ascii_File option creates a list of dependent units and writes their pathnames to a file.
- Locate_Subunit_Candidates: Locates all subprograms that are larger than some threshold size, thus making them candidates to become subunits. Existing subunits and their sizes are also displayed in a separate table.
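The dependency order offered by Display_With_Closure corresponds to a topological sort of the with-clause graph: every library unit appears after the units it withs. The sketch below illustrates that ordering in Python; the unit names and the graph itself are hypothetical:

```python
def dependency_order(withs):
    """Order library units so that each unit appears after the units it
    withs (a depth-first topological sort of the with-clause graph)."""
    order, visited = [], set()

    def visit(unit):
        if unit in visited:
            return
        visited.add(unit)
        for dep in withs.get(unit, []):
            visit(dep)          # emit dependencies before the unit itself
        order.append(unit)

    for unit in sorted(withs):  # deterministic starting order
        visit(unit)
    return order

# Hypothetical with clauses: Main withs Io and Stack; Stack withs Io.
withs = {"Main": ["Io", "Stack"], "Stack": ["Io"], "Io": []}
order = dependency_order(withs)   # Io before Stack before Main
```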
Subsystem Partitioning and Dependencies
Detailed Objectives
This objective focuses on the organization of Ada units within Rational Subsystems and subdirectories. Subsystems are an integral part of program design and development on Rational Apex. Understanding the partitioning of units into subsystems, the import relationships between subsystems, and the unit-level dependencies across subsystems is another critical component in understanding an Ada system's organization and design.
Applicable Commands and Output Interpretation
- Display_Inter_Subsystem_References: Displays, for each subsystem in an activity, the units in other subsystems that it references. It also displays the reverse relationship; that is, for each exported unit in a subsystem, a list of units that reference it from other subsystems is displayed. This provides a convenient cross-reference listing of all external dependencies.
- Display_Subsystem_Import_Closure: Displays the import relationships between subsystems contained in a configuration. All subsystem entries are displayed in an indented hierarchy, with higher-level subsystems appearing at the top of the display. This command provides a subsystem-level view of dependencies, whereas the Display_Inter_Subsystem_References command provides a unit-level view.
Dynamic Analysis
Detailed Objectives
This objective focuses on the dynamic structure of a program; that is, the relationship of program entities during execution. Key components of this analysis are:
- Calling relationships: The order and depth of subprogram calling sequences offers a different perspective of a program's structure. Use of recursion and co-routine implementations may also be of interest.
- Elaboration: Ada defines an initial execution sequence called elaboration, where static variables are initialized, tasks are made ready to run, and certain other dynamic preparations are made. Locating the points in the program where these elaboration actions take place is an important part of execution analysis.
- Branch points: The points in a program where decisions are made and different logic is executed depending on the outcome of those decisions are typically called branch points. Locating these decision points and determining the effect of each branch (for example, reading input variables and writing output variables) can improve understanding of program execution. Although not in the traditional format, this output can be seen as similar to a flowchart.
- Set and use of objects: The reading and writing of data is one of the most important activities in a program. Knowing where each variable is updated and read, especially when it is globally visible, can be an important contribution to the user's understanding of a program.
- Task rendezvousing: The interaction of tasks through entry calls and accept statements is another key aspect of the dynamic behavior of a program.
- Propagation of exceptions: The propagation of exceptions from any subprogram is very important in understanding the control flow of a program.
Applicable Commands and Output Interpretation
- Display_Call_Tree: Displays the calling hierarchy of any subprogram in an indented hierarchy. The output identifies what kind of subprogram is called (procedure or function) and whether the call is made using a rename. It also traverses through calls to generic formal subprograms into the actual subprogram used in the appropriate instantiation. Full, inline expansion of all calls is possible, or references to previously expanded calls can be selected with the parameter options of this command.
- Locate_Calling_Relationships: Displays both the calls that a particular subprogram makes to other subprograms and the reverse relationship, the other subprograms that call a particular subprogram.
- Locate_Calling_Thread_Sets_And_Uses: Displays all static variables that are set and used along the calling thread of a specified subprogram or task. This list of referenced variables can be compared with lists from other calling threads to locate variables that may need synchronization in a multi-tasking context. It may also be used during debugging to locate where specific variables are being updated.
- Display_Subprogram_Branch_Points: Displays each possible path in a subprogram and the branch points that are encountered along that path. Data values required to select the correct branch and variable references along the path are included in the displayed attributes.
- Locate_Elaboration_Impacts: Displays all aspects of a program that are part of or affect elaboration. This includes:
- Package begin blocks: Attributes include the number of statements in the block and whether variables in external packages are updated.
- Task declarations and object declarations containing tasks: Any declaration in which a task elaborates.
- Initialization of static objects: Attributes include an indication of the kind of initial value (a function result, aggregate, and so on).
- Use of pragma Elaborate.
- Locate_Objects_Set_And_Used: Locates all variables and parameters and displays how often each is read (used) and updated (set). Each set and use is reachable through traversal.
- Locate_Recursive_Subprograms: Locates all recursive subprograms. In this context, recursion means that a subprogram directly calls itself or that it calls a subprogram that eventually calls back to the original subprogram. The depth of the call chain to search is controlled by a parameter.
- Locate_Rep_Specs: Locates two important execution-sizing parameters that are specified with size clauses:
- Task storage size
- Collection size for access types
This information can provide some indication of where memory resources are allocated in the program.
- Locate_Tasking: Displays all tasks and protected objects, all entries and protected subprograms, and the accepts and calls for each entry. The output is sorted to focus on entry calls and accept statements to highlight task interactions. Entries without accepts are noted explicitly in the table output. The lack of entry calls can be determined by scanning the table for entry declarations with no calls.
- Locate_Subprograms_Propagating_Exceptions: Locates all subprograms and the exceptions that are propagated from them.
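The notion of recursion used by Locate_Recursive_Subprograms (a subprogram that calls back to itself directly or through a chain of calls, searched to a bounded depth) can be sketched as a depth-limited walk of a call graph. This Python fragment is illustrative only, with a hypothetical call graph:

```python
def is_recursive(calls, subprogram, max_depth):
    """Return True if subprogram can call back to itself within max_depth
    calls, covering both direct and indirect recursion."""

    def search(current, depth):
        if depth > max_depth:          # bound the length of the call chain
            return False
        for callee in calls.get(current, []):
            if callee == subprogram:   # chain returns to the original subprogram
                return True
            if search(callee, depth + 1):
                return True
        return False

    return search(subprogram, 1)

# Hypothetical call graph with an indirect cycle: Parse -> Term -> Factor -> Parse
calls = {"Parse": ["Term"], "Term": ["Factor"], "Factor": ["Parse"]}
```

With a search depth of 3, Parse is reported as recursive; with a depth of 2, the cycle is too long to be detected, which mirrors the effect of the command's depth parameter.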
Counting
Detailed Objectives
Counting Ada constructs or lines tells little about the semantics or structure of a program, but it can provide a profile of software as it grows over time. A measurement of source lines of code (SLOC) can be used to estimate several attendant development costs, for example, testing, documentation, and maintenance. Comparison of SLOC at intermediate release points in the development process can also help to gauge progress toward completion.
Applicable Commands and Output Interpretation
- Compare_Unit_Construct_Counts: Counts each type of Ada construct within a set of Ada units and reports these counts in two forms:
- Separate totals for each unit
- Comparison of all unit totals with each other
Summary totals and unit averages are also included. The command Display_Unit_Construct_Counts lists construct counts for each individual unit.
- Count_Lines_Of_Code: Counts all lines (carriage returns) in a set of Ada units and characterizes each line as one of the following:
- A blank line
- A comment line
- A line that ends with a semicolon
- Other (none of the above)
- Actual Ada lines (total lines – comment lines – blank lines)
With these totals, the user can measure SLOC in whatever way best suits the project requirements.
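The classification above amounts to a simple per-line test. The following Python sketch illustrates one reading of those rules; the actual command's handling of lines that mix code with trailing comments may differ:

```python
def classify_line(line):
    """Classify one source line into the categories described above."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("--"):      # Ada comment lines begin with --
        return "comment"
    if stripped.endswith(";"):
        return "semicolon"
    return "other"

def count_lines(source):
    counts = {"blank": 0, "comment": 0, "semicolon": 0, "other": 0}
    lines = source.splitlines()
    for line in lines:
        counts[classify_line(line)] += 1
    # Actual Ada lines = total lines - comment lines - blank lines
    counts["ada"] = len(lines) - counts["comment"] - counts["blank"]
    return counts

source = """procedure Hello is
-- say hello
begin
   null;
end Hello;
"""
```

For the five-line sample above, this yields one comment line, two semicolon lines, two other lines, and four actual Ada lines.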
Construct Location
Detailed Objectives
Construct location often becomes important after a problem or opportunity for improvement is found through other means. Then the objective becomes locating all constructs that have the same pattern as the problem construct. An example might be a construct that is inefficiently implemented by the compiler. If this were the concatenation operator, for example, the Locate_Operators command could be used to find all usages of that operator. When more precise filtering is required, a new rule can be written and easily integrated into the Locate_Coding_Violations command. (See the section titled "Adding New Compatibility and Standards-Conformance Rules" on page 252 for details on performing this customization.) The following list contains the standard Ada Analyzer commands that provide general construct location.
Applicable Commands and Output Interpretation
- Display_Unit_Construct_Counts: Lists the number of constructs that appear in a unit. All possible Ada constructs are listed.
- Display_Unit_Declarations: Displays all unit declarations in a hierarchical format similar to the code itself. Comments, blank lines, and other extraneous details are not included. This provides a high-level representation of code that is easy to scan for overall structure.
- Locate_Annotations: Locates all annotations (structured comments) and collects the content of each annotation if specified.
- Locate_Attributes: Locates all attribute usages and displays each item attributed and its kind.
- Locate_Constants: Locates all constants separated by type and provides the constant's static value if it is computable.
- Locate_Elements_Containing_Text: Locates all elements that contain some specified text string and sorts them by kind.
- Locate_Expressions: Locates all expressions and collects appropriate attributes of each.
- Locate_Named_Declarations: Locates all declarations separated by kind and provides each declaration's name and appropriate attributes.
- Locate_Operators: Locates all operator usage and provides the type and expression of each operand.
- Locate_Pragmas: Locates all pragma usage and any parameters to those pragmas.
- Locate_Rep_Specs: Locates all representation specifications and collects appropriate attributes of each.
- Locate_Statements: Locates all statements separated by kind and collects relevant attributes of each.
- Locate_Type_Declarations: Locates all type declarations and collects appropriate attributes of each.
For more details on these commands, see their detailed descriptions in Chapter 3, "Command Descriptions."
Metrics Collection
Detailed Objectives
Metrics collection is concerned with quantifying the contents of software with a set of measures or metrics. The significance of a single number that rates a software unit in some dimension is often minimal. It is difficult to interpret the meaning of such isolated numbers or to determine what action to take based on such numbers. If the metric for a particular unit exceeds some threshold, then that unit may warrant further investigation to determine why it has too much or too little of a particular trait. In addition, when metrics collected at different points in the development cycle are compared with one another, the results can be used to recognize trends or to judge project-completion status.
Applicable Commands and Output Interpretation
- Collect_Metrics: Collects a set of enabled metrics on all compilation units or program units located in a specified set of Ada units. Metrics are enabled in the file metrics_collection_status in the metrics_collection subdirectory of the current configuration policy. In this file, metrics are enabled, placed in named hypertables, and given weighting factors if a weighted-average computation is desired. (See "Metrics-Collection Files" on page 34 for more information.)
- Compare_Metrics: Opens up to five previously collected metric reports and generates a comparison of each metric contained in those reports. Differences are computed, showing the history of a particular metric over time.
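The differencing that such a comparison performs can be sketched as follows. The report shape, a name-to-value mapping per release, is a hypothetical simplification of the Ada Analyzer's report files:

```python
# Hypothetical sketch: each report maps metric names to values; the
# comparison records each metric's history and successive differences.

def compare_metric_reports(reports):
    history = {}
    for report in reports:
        for name, value in report.items():
            history.setdefault(name, []).append(value)
    return {name: {"values": values,
                   "deltas": [b - a for a, b in zip(values, values[1:])]}
            for name, values in history.items()}

release_1 = {"sloc": 1200, "max_nesting": 4}
release_2 = {"sloc": 1500, "max_nesting": 6}
trend = compare_metric_reports([release_1, release_2])
```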
Analyzing Readability
This section discusses objectives for improving the readability, understandability, and thus maintainability of Ada software. Specific analysis objectives include:
- Name selection
- Use of use clauses
- Comment correctness
- Program complexity
- Unit/subprogram partitioning
Name Selection
Detailed Objectives
One of the key contributing factors to program readability is the selection of good names. Whether or not a name is good is very difficult to determine automatically, but user-assisted analysis can be employed effectively. Many projects have conventions, using names that imply what an entity is or what it does. Examples of this are the exclusive use of nouns for object declarations, verbs for subprograms, *_Generic for all generic units, and Is_* for all predicate functions. If such a system is used, it should be used consistently to avoid confusion.
Consistency is also critical when renaming is used. If a rename is selected to shorten a long package name, that same name should be used whenever a rename is required.
Names that are too short can be cryptic and offer too little information about the item they represent. Names that are too long or have too many segments can be cumbersome to use and affect formatting negatively. Misspelled words can be very annoying during the development phase when references are created and when trying to read and understand a program. Certain words may be prohibited altogether for reasons of portability or consistency.
This objective focuses on analysis of the name space as defined within an Ada program.
Applicable Commands and Output Interpretation
- Locate_Name_Anomalies: Detects and displays the following naming characteristics:
- Names that are shorter than some threshold number of characters
- Names that are longer than some threshold number of characters
- Names that contain more segments than some threshold value
- Names that contain misspelled segments
- Names that contain prohibited words
Two files configure this command: one contains abbreviated names that should not be considered misspelled, and a second contains a list of prohibited words.
- Locate_Named_Declarations: Collects all declared names separated by kind of declaration. Names are sorted alphabetically and, where necessary, have an additional kind attribute. This allows the user to scan all names for any violations of the established conventions or for those that are otherwise inappropriate.
- Locate_Renames: Locates all renames and sorts them alphabetically by the name of the renamed entity. This display quickly shows the user when inconsistent renames have been used.
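The threshold-based portion of these checks can be sketched as below. The thresholds and the prohibited-word list are illustrative assumptions standing in for the command's configuration files:

```python
# Hypothetical sketch of name-anomaly detection: length, segment count,
# and prohibited words are checked against configurable thresholds.

def name_anomalies(name, min_len=3, max_len=30, max_segments=4,
                   prohibited=("temp", "data")):
    segments = name.lower().split("_")
    anomalies = []
    if len(name) < min_len:
        anomalies.append("too short")
    if len(name) > max_len:
        anomalies.append("too long")
    if len(segments) > max_segments:
        anomalies.append("too many segments")
    if any(seg in prohibited for seg in segments):
        anomalies.append("prohibited word")
    return anomalies
```

A misspelling check against a dictionary would be layered on top of the same segment split.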
Use of Use Clauses
Detailed Objectives
The use of use clauses is a very controversial issue. On the positive side, they conveniently provide direct visibility to operators that would otherwise have to be renamed or fully qualified. When they are used to avoid qualification of other names, however, they can obscure where an entity is declared and thus reduce code readability.
Applicable Commands and Output Interpretation
- Locate_Use_Clauses: Locates all use clauses and determines the context in which they are placed. This context can be an entire unit or a limited scope such as a declare block or subprogram.
Comment Correctness
Detailed Objectives
When used correctly, comments can greatly enhance the understandability of the program. Although it is as yet impossible to truly "read" comments and check that their content is correct and consistent with the code they reference, comments can be checked for misspellings and the use of prohibited words.
Applicable Commands and Output Interpretation
- Locate_Annotations: Locates all specified annotations (structured comments) and their content. It will also locate required annotations that are missing and prohibited annotations that are present.
- Locate_Elements_Containing_Text: Locates all comments that contain some specified text. This command locates comments containing specified text that should be replaced or modified.
- Locate_Misspellings: Locates all misspelled and prohibited words that appear in comments. A list of acceptable abbreviations and any prohibited words are kept in files. These can be used to limit and/or configure the output generated by this command.
Program Complexity
Detailed Objectives
Complexity is always the enemy of readability and understandability. This analysis objective focuses on locating various forms of program complexity, including:
- Complex expressions: Expression complexity can be measured by its nesting depth, number of operators, number of function calls, and number of variables referenced.
- Complex generic constructions: Complexity in this context includes nested generics, generics instantiated with other generic items, and generics instantiated in a dynamic scope.
- Statement nesting: A measure of the depth to which statements are nested within a subprogram or task.
- Branching depth: A measure of the nesting of branching (if, case, select, and so on) statements within a subprogram or task.
- Number of separate execution paths in a subprogram: A measure of the number of different effects that a subprogram can have, depending on which sequence of branches is taken.
- Number of subprogram parameters and variable references: A measure of the number of inputs and outputs of a subprogram.
- Number of calls to external subprograms: A measure of the number of other operations that need to be "touched" to complete execution of a subprogram.
- Number of return statements: Functions must have at least one return statement in their body. Multiple return statements within functions, or any return statements in procedures, increase the complexity of those subprograms.
- Unused "with" clauses and declarations: Unused items can be very confusing. Complete reusable abstractions may have subprograms that are not used by a particular application and others may be intended for future development requirements. In general, unused constructs should be removed.
- Redundant type declarations: Parallel development often results in redundant type declarations that have the same meaning but separate implementations. This may be appropriate in early development phases, but these types should be collapsed and condensed to avoid confusion as the software matures.
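Two of these measures, execution-path count and branching depth, can be sketched over a toy control-flow description (a simplifying assumption; the Ada Analyzer works from real Ada syntax). A "seq" node executes its children in order, and a "branch" node chooses one alternative, as the arms of an if, case, or select do:

```python
def path_count(node):
    kind, children = node
    if not children:
        return 1
    counts = [path_count(child) for child in children]
    if kind == "seq":
        # Sequential branch points multiply the number of paths
        result = 1
        for count in counts:
            result *= count
        return result
    return sum(counts)  # "branch": exactly one alternative is taken

def branch_depth(node):
    kind, children = node
    depth = max((branch_depth(child) for child in children), default=0)
    return depth + (1 if kind == "branch" else 0)

# An if with a nested if in one arm, followed by a three-arm case
body = ("seq", [
    ("branch", [("seq", []), ("branch", [("seq", []), ("seq", [])])]),
    ("branch", [("seq", []), ("seq", []), ("seq", [])]),
])
```

The multiplicative growth of `path_count` is why even modest subprograms can have many separate execution paths.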
Applicable Commands and Output Interpretation
- Display_Unit_Construct_Counts: Lists the number of constructs that appear in a unit. All possible Ada constructs are listed.
- Collect_Metrics: Collects a set of enabled metrics on all compilation units or program units located in a specified set of Ada units. Metrics tend to count the number of occurrences of a construct and thus measure complexity.
- Locate_Complex_Expressions: Locates all expressions that have greater depth, have more operators and/or function calls, and reference more variables than a set of threshold parameters allow.
- Locate_Generic_Complexities: Locates complex generic constructions, including nested generics, generics instantiated with other generic items, and generics instantiated in a dynamic scope.
- Locate_Subprogram_Complexities: Computes statement nesting, branching depth, the number of execution paths in a subprogram, and size in lines. This command also computes the number of external calls to other subprograms, the number of return statements, the number of parameters, and the number of variable references within a subprogram or task.
- Locate_Type_Declarations: Locates all type declarations separated by kind. Default sorting will place redundant type specifications in sequential table entries.
- Locate_Unused_Declarations: Locates all unused declarations, including unused record components.
- Locate_Unused_With_Clauses: Locates all with clauses that are unused or are used only to compile a use or rename clause.
Unit/Subprogram Partitioning
Detailed Objectives
Large library units are generally more difficult to read and understand than smaller ones. Subunits are one method of breaking large units into more manageable pieces. Subprograms that are very large may also attempt to implement too much functionality and might better be broken into smaller subfunctions.
Another aspect of subprogram design is the trade-off between parameters and direct object reference. Many factors, including optimization, safety, and program readability, must be considered when designing object references within a subprogram. This analysis objective focuses on information that can help the user make these decisions.
Applicable Commands and Output Interpretation
- Locate_Subprogram_References: Locates all references to objects (including parameters) within a subprogram. The output displays what kind of object is referenced, in what context it is declared, and whether the subprogram sets, uses, or both sets and uses each variable.
- Locate_Subunit_Candidates: Locates all subprograms or package bodies that are bigger than some threshold size. The size of each body can be compared with the size of its parent unit to determine whether the subprogram should be a subunit. Existing subunits and their sizes are displayed in a separate table.
Analyzing Portability and Reusability
This section describes objectives to determine when and whether software is portable and/or reusable. It also focuses on locating places where reusability and portability can be improved. Specific analysis objectives include:
- Target-dependent constructs (compiler compatibility)
- Host-development dependencies
- Reusable units
Target-Dependent Constructs (Compiler Compatibility)
Detailed Objectives
This objective focuses on locating program entities that are target-dependent and thus potentially nonportable. Ada defines many of its language constructs as implementation-dependent, which means that each compiler implementation can make its own precise definition. The definition (bounds and precision) of numeric types in the target compiler's package Standard is one example. Supported pragmas, unchecked programming, and available attributes are others.
Incompatibilities generally exist between the Rational compilation system used for host development and the compiler used for target-code generation. Development on Rational Apex can lead to subtle dependencies that do not transfer well to the target. The Rational compiler does not completely support representation specifications or use of System.Address, for example. Because they are not fully supported, they also are not fully checked for correctness.
Some target compilers reserve certain words and prohibit them from use in application programs. No words are reserved by the Rational compiler. This can lead to conflict when an attempt is made to compile these units with the target compiler.
Applicable Commands and Output Interpretation
- Locate_Attributes: Locates all attribute usage, including those that may not be supported by the target compilation system.
- Locate_Compatibility_Problems: Locates the following potential compatibility problems:
- Use of predefined types: Use of nonportable numeric types in package Standard.
- Use of unknown pragmas: Use of pragmas not recognized by the target compiler. In many cases, the pragma simply may have been misspelled. The list of pragmas recognized by the target compiler is defined in a configuration file. All pragma usage is compared with this list.
- Use of reserved names: Use of names reserved by the target compiler. The list of reserved names is defined in a configuration file.
- Unchecked_Conversion size mismatch: Mismatch of type sizes provided to the Unchecked_Conversion function.
- Use of non-predefined attributes.
- Locate_Pragmas: Locates all pragmas, some of which may be implementation-dependent. The pragma Priority is a likely example.
- Locate_Rep_Specs: Locates all representation specifications and displays their relevant attributes.
- Locate_System_Address_Usage: Locates all usage of the System.Address type, including object declarations and use of the 'ADDRESS attribute.
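The list-driven checks for unknown pragmas and reserved names can be sketched as a simple comparison. The recognized and reserved word lists below are illustrative assumptions standing in for the configuration files:

```python
# Hypothetical stand-ins for the configuration files described above
RECOGNIZED_PRAGMAS = {"inline", "pack", "priority"}
RESERVED_BY_TARGET = {"interrupt", "monitor"}

def compatibility_problems(pragmas_used, names_declared):
    problems = []
    for pragma in pragmas_used:
        # Unrecognized pragmas are often simply misspelled
        if pragma.lower() not in RECOGNIZED_PRAGMAS:
            problems.append(("unknown pragma", pragma))
    for name in names_declared:
        if name.lower() in RESERVED_BY_TARGET:
            problems.append(("reserved name", name))
    return problems
```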
Host-Development Dependencies
Detailed Objectives
The use of a more powerful host to develop software targeted for execution on another processor often provides a more productive environment for software development. Some targets are not capable of executing a compiler or other development tools. Use of such hosts, however, can lead to dependencies on the host environment that do not transfer well to the target compilation system. The Rational compiler allows prompts, for example. Code containing prompts will not parse correctly when submitted to any other target compiler. Dependencies on Rational Apex interfaces can also occur, especially when test code that may need these dependencies is mixed with application software.
Applicable Commands and Output Interpretation
- Locate_Compatibility_Problems: Locates the following potential compatibility problems:
- Use of external interfaces: Use of Ada interfaces outside project-development libraries.
- Use of prompts: Existence of Rational Apex prompts.
- Display_Inter_Subsystem_References: Locates all references to other subsystem units, including subsystems that are not in the target closure and units that appear as utility interfaces in Rational Apex.
Reusable Units
Detailed Objectives
This objective focuses on identifying units that are candidates for reusability and providing information for improving reusability. Although the determination of real reusability is beyond the scope of static analysis tools, indicators such as generic units and the use of private types can identify units with higher probability for reuse. In addition, the following considerations apply:
- Subprograms that operate only on their parameters and not on external variables may be good candidates for general utilities.
- Reusable units should have a limited dependency closure. If they depend on too many other units, the entire closure may be difficult to use in other contexts.
- Named numbers should be declared and referenced instead of literals sprinkled throughout the program.
- Units that have any host/target dependencies (see previous objective) generally will not be reusable.
- Packages containing static variables must either support synchronized access to these variables or state that they are not generally usable in multitasking applications.
Applicable Commands and Output Interpretation
- Display_With_Closure: Displays the compilation closure of any unit, presenting all dependent units in an indented hierarchy. Withed units are expanded inline and indented one level. Complete inline expansion is possible, or embedded references to previous expansions (when duplicates are found) can be selected.
- Locate_Expressions: Locates the use of literals within a unit. Literals generally should be replaced with references to named numbers or type attributes.
- Locate_Generics: Locates all generics and their instantiations. Instantiations can be investigated to determine whether the generic formals should be expanded or contracted. When all instantiations have the same actual parameter for a formal, perhaps that formal should be removed and the actual directly referenced or called.
- Locate_Generic_Formal_Dependencies: Determines which parts of a generic unit actually depend on the formals. Sections that have no dependencies can be moved outside the generic, reducing the amount of code expansion when multiple instantiations are present.
- Locate_Packages_With_State: Locates packages that contain static variables. Access to these variables may require synchronization in multitasking applications.
- Locate_Subprogram_References: Locates all references to objects (including parameters) within a subprogram. The output displays what kind of object is referenced and whether the subprogram sets, uses, or both sets and uses the variable. Subprograms with many external references generally will not be portable.
- Locate_Type_Declarations: Locates all type declarations, especially private and tagged type declarations. The information-hiding aspects of private types generally improve a unit's abstraction and thus its potential for reuse. Tagged types indicate the specification of classes that can be extended.
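The dependency-closure traversal behind Display_With_Closure can be sketched as a depth-first walk. The unit-to-withed-units mapping is a hypothetical stand-in for the program library:

```python
def with_closure(unit, withs):
    """Return withed units in first-encountered order, duplicates skipped."""
    closure = []
    def visit(current):
        for dependency in withs.get(current, []):
            if dependency not in closure:
                closure.append(dependency)
                visit(dependency)
    visit(unit)
    return closure

withs = {
    "Main": ["Text_Io", "Queue_Generic"],
    "Queue_Generic": ["Storage"],
    "Storage": ["Text_Io"],     # duplicate dependency, reported once
}
```

A large closure relative to the unit's own size is the warning sign discussed above: the unit may be hard to reuse in other contexts.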
Checking for Programming Errors
This section describes methods for locating potential programming errors through both static and dynamic (interpretive) analysis techniques. Static analysis locates constructs that, simply due to their existence, pose a high risk of error. In dynamic analysis, a simple interpreter can walk through the execution paths of a program and recognize problems specific to one or more combinations of code segments and branch decisions. In both cases, the constructs located should not always be considered as errors but as high-risk constructs that should be investigated further. Specific areas of analysis include:
- Object sets and uses
- Subprogram execution problems
- Misspellings
- Use of error-prone constructs
- Representation specifications
- Static constraint violations
- Use of System.Address
- Inconsistencies
Object Sets and Uses
Detailed Objectives
The use of uninitialized variables is one of the most difficult programming errors to locate. The problem is compounded by the fact that this condition may occur only when a specific path of the program is executed. A traditional approach to finding these errors has been the development of coverage tests, that is, tests that execute each branch combination of the program in the hope of "touching" all paths and thus each error. Construction of such tests is very expensive and does not always result in identification of the uninitialized variable. Interpretive walks through the program can find these errors without actually having to execute the program.
Problems in the same genre include out parameters that are not updated along a particular path or variables that are set but never used. The objective of the following commands is to locate such problems.
Applicable Commands and Output Interpretation
- Display_Set_Use_Problems: Traverses the execution paths in a subprogram or set of subprograms and identifies the following potential problems:
- Variables that are used before they are set along a particular path
- Out parameters that are not set along a particular path
- Locate_Data_Synchronization_Points: Locates variables that are set and used along multiple calling threads. If a variable is set or set and used in multiple, parallel calling threads, it may need synchronization to ensure protected access.
- Locate_Objects_Set_And_Used: Locates all object declarations and collects all sets and uses of that object. One immediate problem that can be seen in this output is a variable that has no sets and/or no uses. The lack of variable use may be more an efficiency problem than a programming error. However, it can also point to some flaw in the algorithm or its implementation.
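A path-sensitive set/use check of the kind described above can be sketched over a toy statement representation (an assumption for brevity). Each path through an if is walked with its own copy of the initialized-variable set, and a variable is safe after the branch only if every path set it:

```python
def walk(stmts, init):
    """Return (variables possibly used before set, variables set on all paths)."""
    problems = set()
    init = set(init)
    for stmt in stmts:
        if stmt[0] == "set":
            init.add(stmt[1])
        elif stmt[0] == "use":
            if stmt[1] not in init:
                problems.add(stmt[1])
        else:  # ("if", then_path, else_path): walk each path separately
            _, then_path, else_path = stmt
            p1, i1 = walk(then_path, init)
            p2, i2 = walk(else_path, init)
            problems |= p1 | p2
            init = i1 & i2  # safe only if set on every path
    return problems, init

body = [
    ("if", [("set", "X")], []),  # X set on only one path
    ("use", "X"),                # potential use before set
    ("set", "Y"),
    ("use", "Y"),                # fine
]
```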
Subprogram Execution Problems
Detailed Objectives
Additional execution problems that can be found with interpretive analysis techniques are:
- Functions that have no return statement or raise statement along a particular path.
- The raising of nonvisible exceptions that will propagate out of a subprogram along a particular path.
- Function exception handlers that contain neither a return nor a raise statement.
- The presence of potentially blocking operations (LRM 9.5.1 (8)) in protected operations or in subprograms called by protected operations.
Furthermore, recursion can be a very powerful mechanism for implementing certain algorithms. It can have a detrimental impact, however, on programs with limited stack space. Therefore it is important to know where recursion is used within a program.
Applicable Commands and Output Interpretation
- Display_Subprogram_Execution_Problems: Locates functions that do not have a return or raise statement on all possible execution paths. This command also locates exceptions that propagate out of subprograms without being declared in a visible scope. If the exception is not visible, some caller will likely be unable to handle it specifically, since it cannot reference the exception by name. Although a when others => handler is an option, it is undesirable because it cannot differentiate which exception was raised.
- Locate_Potential_Programming_Errors: Locates several potential programming errors. In this context, this command locates exception handlers in functions that contain neither a raise nor a return statement. All potential paths within the handler are checked. Potentially blocking operations within protected operations can also be located along with the identification of infinite recursion.
- Locate_Recursive_Subprograms: Locates all recursive subprograms. In this context, recursion means that a subprogram calls itself directly or that it calls a subprogram that eventually calls the original subprogram. The depth of the call chain searched is controlled by a parameter.
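The bounded call-chain search can be sketched as follows, assuming a simple caller-to-callees mapping in place of the real call graph:

```python
def is_recursive(subprogram, calls, max_depth=5):
    """True if subprogram can reach itself within max_depth calls."""
    def search(current, depth):
        if depth > max_depth:       # bound the chain, as the parameter does
            return False
        for callee in calls.get(current, []):
            if callee == subprogram or search(callee, depth + 1):
                return True
        return False
    return search(subprogram, 1)

# A -> B -> C -> A is indirect recursion; D calls itself directly
calls = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["D"]}
```

Note that the depth bound also means a sufficiently long recursive chain can go undetected, so the parameter trades analysis time against completeness.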
Misspellings
Detailed Objectives
Although misspelled words are not truly programming errors, they must at least be considered documentation errors. Standard spelling checkers often report errors incorrectly because of the extra program syntax and name formatting (for example, underscore characters). Editor-based spelling checkers can be sufficient for single units but cumbersome when checking all units in a release. The command in this section checks spelling of three program constructs within a set of Ada units:
- Comments
- String literals
- Declarative items (each segment of the declared name is checked)
The standard Rational and user-specific dictionaries are used to determine whether a word is misspelled. The user may add words to the user-specific dictionary to configure the checking. A file can also be updated to contain abbreviated words that, although not in the dictionary, should not be reported as errors. A second file is used to define any words that, although spelled correctly, are prohibited from use and should always be reported.
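The name-aware tokenization that makes such checking practical can be sketched as below. The dictionary, abbreviation, and prohibited-word lists are illustrative assumptions:

```python
# Hypothetical word lists standing in for the dictionaries and files
DICTIONARY = {"queue", "length", "buffer", "index"}
ABBREVIATIONS = {"len", "idx"}        # accepted though not in the dictionary
PROHIBITED = {"hack"}

def check_word(segment):
    word = segment.lower()
    if word in PROHIBITED:
        return "prohibited"
    if word in DICTIONARY or word in ABBREVIATIONS:
        return "ok"
    return "misspelled"

def check_name(name):
    # Each underscore-separated segment is checked on its own
    return {segment: check_word(segment) for segment in name.split("_")}
```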
Applicable Commands and Output Interpretation
- Locate_Misspellings: Locates misspellings in comments, string literals, and declarations. This command can be configured to allow specified abbreviations and disallow any prohibited words.
Use of Error-Prone Constructs
Detailed Objectives
The objective of the following set of commands is to locate constructs that have a high probability for error. The following potential problems can be located:
- Subprogram calls using default values.
- Division where the divisor has a potential zero value.
- Use of unsafe relational operators with real values (the equality operator, for example, may return False even though the values are close enough to be considered equal).
- Use of static, nonattribute values for loop ranges. It is better to use a subtype indication or 'FIRST, 'LAST, or 'RANGE attributes.
- Use of static, nonattribute values for slice ranges.
- Operator rename clauses that rename a different operator.
- Exception handlers that check for Numeric_Error but not Constraint_Error in the same handler block.
- Function exception handlers that contain neither a return nor a raise statement.
- Use of others clauses.
- Use of aliasing.
In many cases, no error may exist currently, but the potential for future error is high as modifications are made. Complex programming constructs have an inherently high probability for error. These areas should be scrutinized by the user and possibly broken into smaller, less complicated pieces.
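The real-valued comparison pitfall in the list above can be demonstrated directly; a tolerance-based comparison, with a project-chosen epsilon, is the usual remedy:

```python
# Values that are mathematically equal may differ after floating-point
# arithmetic, so exact equality on real values is unsafe.

def close_enough(a, b, epsilon=1.0e-9):
    return abs(a - b) <= epsilon

tenth_summed = 0.1 + 0.1 + 0.1   # not exactly 0.3 in binary floating point
exactly_equal = (tenth_summed == 0.3)
```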
Applicable Commands and Output Interpretation
- Locate_Aliasing: Locates all aliased variables and constants and the use of the 'ACCESS attribute with those objects. Aliasing provides two separate ways to reference a variable's value. When used with care, it can be very powerful, but it can also lead to unexpected results when updates are made through aliased access values.
- Locate_Complex_Expressions: Locates all expressions that have greater depth, have more operators and/or function calls, and reference more variables than a set of threshold parameters allow.
- Locate_Others_Clauses: Locates all others clauses used in case statements, exception handlers, or aggregates.
- Locate_Potential_Programming_Errors: Locates several potential programming errors listed above.
- Locate_Subprogram_Complexities: Computes statement nesting, branching depth, the number of execution paths in a subprogram, and its size in lines. This command also computes the number of external calls to other subprograms, the number of return statements, the number of parameters, and the number of variable references within a subprogram or task.
Representation Specifications
Detailed Objectives
Representation specifications are not fully checked for consistency by the Rational host compiler. The objective of the following command is to locate the resulting inconsistencies:
- Length clauses that are too small to contain values of the type represented.
- Values in representation specifications that must be static but are not.
- Values in the enumeration representation specification that are not in ascending order.
- Components of record representations that are specified more than once.
- Record components that are not specified in the representation clauses. Since this is allowed, it will be recognized as a warning.
- Components whose size is an even number of bytes (characters and integers, for example) but that are not byte-aligned in the representation specification. This will lead to inefficiencies in the manipulation of this component. Note that byte size is defined in the configuration file Type_Sizing_File.
- Bit ranges in the representation specification for a component that overlap with bit ranges from a previous component.
- The number of bits allocated for a specific record component that is too small for that component's size.
Note: All size computations are based on configuration parameters stored in the Type_Sizing_File described on page 40.
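Two of these checks, undersized components and overlapping bit ranges, can be sketched over a list of (name, first bit, last bit, bits needed) tuples. The component layout below is an illustrative assumption:

```python
def rep_spec_problems(components):
    """Flag components with too few bits or overlapping bit ranges."""
    problems = []
    placed = []
    for name, first, last, needed in components:
        if last - first + 1 < needed:
            problems.append(("too small", name))
        for other, (f, l) in placed:
            if first <= l and f <= last:      # bit ranges intersect
                problems.append(("overlaps", name, other))
        placed.append((name, (first, last)))
    return problems

components = [
    ("Flag",   0, 0, 1),
    ("Count",  1, 4, 8),   # 4 bits allocated, 8 needed
    ("Status", 4, 7, 4),   # overlaps Count at bit 4
]
```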
Applicable Commands and Output Interpretation
- Locate_Rep_Spec_Inconsistencies: Locates inconsistencies in representation specifications as described above.
Static Constraint Violations
Detailed Objectives
The Rational host compilation system checks most static constraint violations but does not yet support checking of:
- Real type constraints
- Complex expressions (containing operators) that result in static values
When the appropriate switches are not enabled, it is also possible that no checking is performed. Since these switches have other effects as well, checking may be turned off for parts of the code, resulting in unchecked constraints.
Applicable Commands and Output Interpretation
- Locate_Static_Constraint_Violations: Locates all assignments of static values that violate the upper or lower bound of the object into which the assignment is attempted. The output tables display both the static value (or size for strings and arrays) and the bounds of the container.
Use of System.Address
Detailed Objectives
Use of direct memory addressing in high-level programs is an extremely error-prone activity that Ada discourages. Access types, for example, are not addresses but abstract pointers (handles) to dynamically created objects. The operations on access types are limited to assignment and dereferencing, prohibiting the pointer itself from being manipulated to point at some other part of memory.
Direct access to memory addresses was made available for applications that require it to implement low-level interfaces. Address implementations typically are integer-based, allowing their manipulation with numeric operators. This allows access to parts of memory that may be inappropriate (that is, containing instructions, read-only data, or data values that reside outside the original address bounds of the structure being accessed). Because the numeric manipulation of addresses is very error-prone, the use of values of type System.Address should be monitored carefully to ensure correct use.
One particular error that can be located with the command below is the use of the addresses of dynamic variables (that is, subprogram parameters and local variables). Once the address is received through the 'ADDRESS attribute, it must be used immediately and not stored in a variable for later use. The problem is that this address may no longer be valid once the program continues execution. There is no guarantee that the stack will remain in the same state as when the address was taken.
One other potential problem is use of 'ADDRESS on constant values. Constants may have been folded in by the compiler or optimizer and may not reside at a valid memory address.
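The dangling-address problem can be sketched in a few lines (the unit and variable names here are hypothetical):

```ada
with System;

procedure Address_Example is
   Saved : System.Address;   --  address stored for later use: suspect

   procedure Inner is
      Local : Integer := 0;
   begin
      --  Local'Address is valid only while Inner is executing.
      Saved := Local'Address;
   end Inner;

begin
   Inner;
   --  Inner's stack frame no longer exists at this point; any use
   --  of Saved now refers to memory that may have been reused.
end Address_Example;
```

Locate_System_Address_Usage reports both the declaration of Saved and the use of the 'ADDRESS attribute in a fragment like this one.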
Applicable Commands and Output Interpretation
- Locate_System_Address_Usage: Locates all usage of System.Address, including object declarations and use of the 'ADDRESS attribute.
Inconsistencies
Detailed Objectives
When some construct in Ada is declared, it is expected that other parts of the program will use it: subprograms should be called, generics should be instantiated at least once, and variables should be both set and used. Exceptions and task entries have somewhat more complicated usage patterns that should also be present. User-defined exceptions should be both raised and handled; an exception without a raise or a handler is a likely error. Task entries must be both called and accepted; both are required for a rendezvous, and the lack of either is an error.
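A short hypothetical fragment shows the kinds of inconsistency described above:

```ada
package Device is
   Not_Ready  : exception;   --  raised and handled elsewhere: consistent
   Never_Used : exception;   --  no raise and no handler: likely an error

   task Controller is
      entry Start;        --  called and accepted: a complete rendezvous
      entry Shut_Down;    --  accepted in the body but never called: suspect
   end Controller;
end Device;
```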
Applicable Commands and Output Interpretation
- Locate_Exceptions: Displays for each program exception declaration all:
- Raises of that exception
- Handlers for that exception
Table entries are sorted by exception name so that missing handlers or raise statements can be seen clearly. It is also easy to locate raises of predefined exceptions. Although not an error, this is often considered bad practice.
- Locate_Tasking: Displays all tasks and protected objects, all entries and protected operations, and the accepts and calls for each entry. The output is sorted to focus on entry calls and accept statements to highlight task interactions. Entries without accepts are noted explicitly in the table output. The lack of entry calls can be determined by scanning the table for entry declarations with no calls.
Checking Standards Conformance
Programming standards are almost always project-specific. The Ada Analyzer provides a framework for incorporation of project-specific checks into an existing command that performs all traversal, sorting, and hypertable management. Custom validation checks can be developed quickly for local needs and incorporated into this framework. Most naming standards can be analyzed with existing commands. This section discusses the following specific objectives:
- Formal coding standards
- Naming standards
Formal Coding Standards
Detailed Objectives
This objective focuses on checking units against coding standards that restrict the full use of Ada constructs within a project. Rules in this category may have several motivations:
- Preferred stylistic considerations
- Program safety/correctness
- Target-compiler limitations/optimizations
- Requirements-driven restrictions
Applicable Commands and Output Interpretation
- Locate_Coding_Violations: The base product release locates a large number of constructs whose use is generally considered bad programming practice. In addition, this command provides a framework into which new checks can be added that match local requirements. All checks are developed from a simple template and connected to the traversal harness by adding arms to a few case statements. This process is fully explained in the section titled "Adding New Compatibility and Standards-Conformance Rules" on page 252. It is also possible to have this customization performed by Little Tree Consulting.
- Locate_Coding_Violations_Interactively: This command will interactively check for and highlight coding violations within a single Ada unit. Although all coding violations in the library can be enabled for interactive checking, some may impact performance due to the amount of processing required (see "Locate_Coding_Violations_Interactively" on page 130 for more information).
- Locate_Obsolescent_Ada83_Features: This command will locate the use of all features specified in Annex J, "Obsolescent Features", of the Ada 95 language definition. The Ada 95 LRM defines several largely redundant features to maintain compatibility with Ada 83 programs. These may be removed from the language definition in the future.
- Locate_Specific_Coding_Violations: This command is a variant of the Locate_Coding_Violations command. This command allows standards-conformance checks to be selected in the Boolean options section of the command dialog box.
- Locate_Ada95_Coding_Violations: Like the Locate_Specific_Coding_Violations command above, this command allows standards-conformance checks (Ada 95 checks only) to be selected in the Boolean options section of the command dialog box.
Rule libraries are available to check Ada 95 compatibility (provided free in the base release), the Software Productivity Consortium's guidelines on Ada quality and style, and Little Tree Consulting's software-quality guidelines. A list of all checks appears on page 299.
Naming Standards
Detailed Objectives
Projects usually choose to define standards for name selection. Naming standards often specify that the names of all declarations of a particular kind have the same form. (For example, the exclusive use of nouns for object declarations, verbs for subprograms, *_Generic for all generic units, and Is_* for all predicate functions.) Other standards require that names have an appropriate meaning and be usable. Although no static analysis tool can make this determination, names that are too short, too long, too complicated, or misspelled may be a good starting point for this evaluation. This objective focuses on locating all name declarations in a program and presenting their characteristics to the user so that a determination of "appropriateness" can be made.
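For example, declarations following the conventions mentioned above might look like this (the names are hypothetical):

```ada
generic
   type Element is private;
package Queue_Generic is                    --  *_Generic for generic units
   Max_Depth : constant Natural := 100;     --  noun for an object
   procedure Insert (Item : in Element);    --  verb for a subprogram
   function Is_Empty return Boolean;        --  Is_* for a predicate function
end Queue_Generic;
```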
Applicable Commands and Output Interpretation
- Locate_Name_Anomalies: Detects and displays:
- Names that are shorter than some threshold number of characters
- Names that are longer than some threshold number of characters
- Names that contain more segments than some threshold number
- Names that contain misspelled segments
- Names that contain prohibited words
Two files, one containing abbreviated names that should not be considered as misspelled, and a second containing a list of prohibited words, are used to configure this analysis.
- Locate_Named_Declarations: Collects all declared names separated by kind of declaration. Names are sorted alphabetically and where necessary have an additional kind attribute. This allows the user to scan all names for any violations of the established conventions.
- Locate_Coding_Violations: Rule LTC 517 will check to see that declared names follow a set of naming conventions specified in the rule_enforcement/naming_conventions file. A description of possible naming conventions enforceable with this file appears on page 37.
Reducing Compilation Time
The amount of time required to recompile a program can greatly affect the time required to complete a project. Changes to an Ada program mean that at least some part of it must be recompiled. Changes to specifications can force many units to be recompiled. As a project moves into later phases of the development cycle, more emphasis is placed on testing and repairing errors found through testing. The amount of time that it takes to make a change and get the program ready to be retested is directly dependent on the recompilation time required. Since thousands, or even hundreds of thousands, of compilations may occur during the lifetime of a project, keeping compilation time to a minimum should be a priority. The following specific objectives are addressed:
- Unused constructs
- Dependency reduction
- Redundancy
- Use of use clauses
Unused Constructs
Detailed Objectives
When a construct such as a declaration is unused, the compiler must check it for correctness, place its name in the symbol table, and generate code. None of this work adds any value to the execution of the program. Reusable packages may contain subprograms that are not used by a particular application but are present for a complete abstraction. Other items may be intended for future development requirements but are unused at this point in the program's development. Thus it may not always be appropriate to remove unused code even though its presence increases compilation time, can affect performance negatively, and generally is confusing.
Applicable Commands and Output Interpretation
- Locate_Unused_Declarations: Locates all unused declarations, including unused record components. Usage is computed within the closure of a single program specified by a configuration.
Note: It is important that the specified configuration have entries for all subsystems in the program closure. If a subsystem is omitted, declarations can be reported as unused when they are not.
Dependency Reduction
Detailed Objectives
Unused with clauses can have a major impact on recompilation time. When a unit specification is recompiled, the units that transitively depend on that unit must be recompiled. If a unit withs another unit but does not actually use anything within that unit, it may be recompiled unnecessarily. If the withing unit is also a unit specification, its recompilation may trigger other unnecessary recompilation in the transitive dependency closure. In other cases, a with clause is necessary but can be moved from the specification to the body of the unit. This will reduce the recompilation requirement to the body and any subunits and avoid the transitive impact entirely.
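As a sketch, moving a with clause from a specification to a body might look like this (Logger is a hypothetical unit, shown before and after the change):

```ada
--  Before: the specification withs Text_Io, so units that depend on
--  Logger fall inside the transitive recompilation closure of Text_Io.
with Text_Io;
package Logger is
   procedure Put_Line (Message : String);
end Logger;

--  After: only the body depends on Text_Io; recompilation stops at
--  the body and its subunits.
package Logger is
   procedure Put_Line (Message : String);
end Logger;

with Text_Io;
package body Logger is
   procedure Put_Line (Message : String) is
   begin
      Text_Io.Put_Line (Message);
   end Put_Line;
end Logger;
```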
Applicable Commands and Output Interpretation
- Locate_Unused_With_Clauses: Locates all with clauses that are unused or are used only to compile use or rename clauses. It also identifies with clauses that could be moved to the body and those that are redundant (that is, appear more than once).
Redundancy
Detailed Objectives
Like unused items, redundant items must be compiled, even though they do not have a positive impact on the program. The benefit of removing redundancy is perhaps greater for optimization and readability objectives (see the sections that discuss those objectives), but it is included here for completeness.
Applicable Commands and Output Interpretation
- Locate_Constants: Locates all constants, separated by kind and sorted by value. Redundant values can be seen as subsequent entries in the tables. Redundant names (that is, separate constants with the same name) are also listed in a separate table.
- Locate_Operators: Locates all operator uses and separates them by kind. By looking at the operand expressions for each operator, the user often can locate redundant computations.
- Locate_Similar_Record_Structures: Locates record types that have an identical or similar set of component types. Two record types, each with three integer components, would be reported as identical. Component names are ignored.
- Locate_Type_Declarations: Locates all type declarations and separates them by kind. Redundant declarations (that is, types with the same bounds, access components, and so on) can be seen as subsequent entries in the tables.
Use of Use Clauses
Detailed Objectives
The effect of a use clause is to make the entire name space of a withed package directly visible within the context of that use clause. This can simplify references, especially for operators, but it increases the amount of work that the compiler has to do to resolve any reference, since more names must be searched. Removing use clauses or reducing the context in which they apply can improve compilation speed.
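One way to reduce the context of a use clause is to move it from the unit level into the smallest scope that needs it, as in this sketch:

```ada
with Text_Io;
package body Report is
   procedure Print is
      use Text_Io;            --  visibility limited to Print
   begin
      Put_Line ("report");    --  resolved through the local use clause
   end Print;
   --  Elsewhere in the body, references must use the expanded name
   --  Text_Io.Put_Line, and the compiler searches a smaller name space.
end Report;
```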
Applicable Commands and Output Interpretation
- Locate_Use_Clauses: Locates all use clauses and indicates the context in which they appear.
Optimizing Software
Software optimization is not a high-priority task for all phases of the lifecycle, but every project arrives at the point where, although the software "works", it is too slow or too big. This section describes several techniques for improving the efficiency and/or performance of Ada programs:
- Inlining
- Object references
- Object size
- Generics
- Redundancy
- Expensive constructs
- Operator selection
- Loop nesting
- Object initialization
- Use of Text_Io
- Compiler dependencies
Inlining
Detailed Objectives
Object-oriented software defines a set of abstract objects and operations on those objects. These operations typically are subprograms that must be called to initiate some operation on the object. Objects are often composed of lower-level objects, and their operations are implemented, at least in part, by calling lower-level subprograms. This can lead to a large number of subprogram calls, each of which must add and then remove its context from the stack. The ability to inline subprograms thus becomes a critical optimization, allowing the layered structure of the software to exist without the high procedure-call overhead that is often associated with this approach.
One aspect of analyzing where to apply inlining is understanding the calling structure of the program. This is best accomplished with the Display_Call_Tree command, which provides an indented hierarchy of possible calling sequences within a program.
The decision to inline must balance the benefit of eliminating the calling overhead with the resulting code expansion. It also may not be possible to inline some subprograms because they contain exception handlers or tasks, or because they are recursive. The following subprogram attributes are collected to support the decision of whether to inline a subprogram:
- Number of statements in the subprogram body (impacts code expansion)
- Number of local declarations (impacts code expansion)
- Number of call sites (impacts code expansion)
- Whether the subprogram is recursive (inlining not possible)
- Whether the subprogram contains exception handlers (inlining potentially not possible)
- Whether the subprogram contains task elaboration (inlining not possible)
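A typical high-probability candidate, then, is a small leaf subprogram with none of the disqualifying constructs, marked with pragma Inline (the package below is hypothetical):

```ada
package Vectors is
   type Vector is array (1 .. 3) of Float;

   --  Few statements, few local declarations, no exception handlers,
   --  tasks, or recursion: a good inline candidate.
   function Dot (A, B : Vector) return Float;
   pragma Inline (Dot);
end Vectors;
```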
Applicable Commands and Output Interpretation
- Display_Call_Tree: Displays the calling hierarchy of any subprogram in an indented hierarchy. The output identifies what kind of subprogram is called (procedure or function) and whether a call is made using a rename. This command also traverses through calls to generic formal subprograms to the actual subprogram provided in the appropriate instantiation. Full expansion of all calls is possible, or references to previously expanded calls can be selected with the parameters to this command.
- Locate_Inline_Candidates: Locates all subprograms and computes the number of statements, local declarations, and call sites, whether the subprogram is recursive, and whether it contains task elaboration or exception handlers. Threshold parameters can be used to limit the number of reported subprograms to those that are high-probability candidates. The actual number of call sites does not include all calls. It includes only those that appear within the closure of a program specified by a configuration or import closure. Finally, all subprograms that are already inlined (have a pragma Inline) are collected in a separate table with all of the above attributes. This can be used to identify modified subprograms whose characteristics no longer make it beneficial to inline.
Object References
Detailed Objectives
This objective focuses on object references within a subprogram. Generally, parameter references are more efficient since some number of them can be kept in registers. Many targets also have two addressing modes, one for stack-relative addressing and one for direct-memory references. Thus, references to global variables may be more expensive than parameter or local variable references. Finally, object renaming can be used once to compute one or more address offsets into nested structures, allowing subsequent indexing or selection to be made relative to this base address.
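For example, a rename can establish a base address into a nested structure once, with later component references made relative to it (the types and names here are hypothetical):

```ada
procedure Reset_Engine (Fleet : in out Fleet_Type; I : Positive) is
   --  The offset to this nested component is computed once, here.
   Engine : Engine_Type renames Fleet.Ships (I).Engines (1);
begin
   Engine.Temperature := 0.0;   --  relative to the precomputed base
   Engine.Pressure    := 0.0;
end Reset_Engine;
```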
Applicable Commands and Output Interpretation
- Locate_Renames: Locates all renames and sorts them alphabetically by the name of the renamed entity. Parameters are available to select only object renames.
- Locate_Subprogram_References: Locates all references to objects (including parameters) within a subprogram. The output displays what kind of object is referenced (that is, local variable, global variable, parameter, and so on) and whether the subprogram sets, uses, or both sets and uses the object.
Object Size
Detailed Objectives
The first objective focuses on objects that require a large amount of storage. Since the referencing of components within large objects may have an impact on the addressing required for access, it can be useful to find all objects that are greater than some threshold size. Some compilation systems even have a limit on the maximum size of an object.
A second objective focuses on analysis of object packing. There is a trade-off between the space saved by packing objects and the increased time needed to extract and process packed components. When space is the highest priority, the trade-off may be acceptable; the Ada Analyzer can help make this determination by estimating the difference between packed and nonpacked objects. Three options in the type_sizing_file can be used to support this analysis: the Always_Pack_Booleans_Option, the Always_Pack_Enumeration_Types_Option, and the Always_Pack_Integer_Types_Option. If these options are True, size computation will use The_Minimum_Packed_Boolean_Size, The_Minimum_Packed_Enumeration_Size, and The_Minimum_Packed_Integer_Size, respectively, when calculating the size of objects or components of these types. The Dont_Straddle_Word_Boundaries_Option can be used to prevent part of a packed component from appearing in one word and another part in the subsequent word. If this option is set to True, filler size will be added to prevent this straddling.
Applicable Commands and Output Interpretation
- Locate_Objects_By_Size: Locates all object declarations and computes the number of bits that they would occupy on the target. Output can be reported in bits or rounded up to the nearest byte or word count. Representation specifications and size clauses are respected. In some cases, complicated record constructs with alignment and dope-vector considerations may yield only approximate results. Named types can be given a precise size value by updating a configuration file. (See the discussion of the type_sizing configuration file on page 42 for more information.)
Generics
Detailed Objectives
This objective focuses on optimizing the use of generics. On almost all targets, generics are expanded inline when they are instantiated. This can lead to a large amount of memory usage when multiple instantiations are present. One optimization is to make sure that all items in the generic actually depend on the generic formals. If segments of the generic are not dependent, they can be moved to a nongeneric part, removing this code from code that is macro-expanded at each instantiation.
Some compilers may also be less efficient when implementing complicated generic constructions such as nested generics, generics that are instantiated with other generics, and generics that are instantiated in a dynamic scope. The location and analysis of these constructs can lead to optimizations as well.
Applicable Commands and Output Interpretation
- Locate_Generic_Complexities: Locates complex generic constructions, including nested generics, generics instantiated with other generic items, and generics instantiated in a dynamic scope.
- Locate_Generic_Formal_Dependencies: Determines which parts of a generic unit do not depend on the formals. Sections that have no dependencies can be moved outside the generic, reducing the amount of code expansion when multiple instantiations are present.
Redundancy
Detailed Objectives
Redundant constructs cause additional code to be generated and included in the load image. In some cases, the compiler can recognize redundant constructs and remove (or "fold") them from the code. Redundancy also increases confusion and compilation time, which are valid reasons in themselves to remove it from the code.
Redundant constants should be eliminated and defined once in a global scope. It is perhaps better to have one constant for pi declared globally than for several code sections to define their own. Two accuracy levels may be required for some algorithms, but their values, once set, are unlikely to change. A global placement therefore will not have potential recompilation impacts.
Redundant implementation of computational algorithms can occur rather naturally when requirements for different functional units are presented separately and then implemented by different developers. Locating these redundant implementations and placing them in one utilities package can save code size and reduce maintenance time.
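A minimal sketch of the global-constant approach (the package name is hypothetical):

```ada
--  Sections that previously declared their own Pi can with this
--  package instead of duplicating the value.
package Math_Constants is
   Pi     : constant := 3.14159_26535_89793;
   Two_Pi : constant := 2.0 * Pi;
end Math_Constants;
```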
Applicable Commands and Output Interpretation
- Locate_Constants: Locates all constants separated by kind. Redundant values can be seen as subsequent entries in the tables. Redundant names (that is, separate constants with the same name) are also listed in a separate table.
- Locate_Operators: Locates all operator usage. By looking at the operand expressions for each kind of operator, the user can often locate redundant computations.
- Locate_Similar_Record_Structures: Locates record types that have an identical or similar set of component types. Two record types, each with three integer components, would be reported as identical. Component names are ignored.
Expensive Constructs
Detailed Objectives
This objective focuses on Ada constructs that are generally considered expensive. Although each has a set of semantics that may be advantageous for simple implementation, there may be a trade-off with increased execution time. The commands in this section provide information for evaluating this trade-off. The following Ada constructs are of most interest:
- Type declarations that are deeply nested: Nested types may require multiple address calculations to extract single elements.
- Unconstrained types: Unconstrained arrays and discriminant records typically have more overhead than their constrained counterparts. Depending on requirements, space or time trade-offs can be made by switching from one form to the other.
- Type declarations that have representation specifications: Representation specifications may require extra processing (shifting and masking) to extract individual fields. Representation specifications are often necessary to interface with external hardware or I/O channels but should not be used for internal processing because of this access overhead. One-time conversion to or from internal data (unspecified types) may be an appropriate optimization.
- Type declarations that are packed with pragma Pack: Packing data often requires the extra extraction processing associated with representation specifications.
- Slices: Slices may require complicated address calculation and extra copying to manipulate.
- Aggregates, especially with "others" clauses: Addressing computations for each value inserted from an aggregate into a composite object may not be as fast as direct assignment of individual values or assignment of ranges of values inside a loop.
- Record types with default initialization: Default initialization can be a very useful way to ensure that all objects of a certain type have an initial value. It can be expensive, however. Compilers typically generate elaboration code to perform these initializations. Object declarations of these types could also be given an explicit initialization value. In some other cases, it may not be necessary to perform the initialization at all. Since this extra code and elaboration time is somewhat hidden, it may be worth investigating whether the default initialization is really needed.
Not all of these constructs may be suspect on all targets. Analysis can focus on those that affect performance given the specific characteristics of the target compiler.
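The hidden cost of default initialization, for instance, can be seen in a sketch like this:

```ada
type Buffer is record
   Length : Natural := 0;                         --  default initialization
   Data   : String (1 .. 256) := (others => ' ');
end record;

--  Each declaration below executes the elaboration code for the
--  defaults, even though Scratch may be completely overwritten
--  before it is ever read.
Scratch : Buffer;
```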
Applicable Commands and Output Interpretation
- Display_Expanded_Type_Structure: Displays (recursively) all components of record and array types. The type referenced by an access type can also be expanded on request. Each component is displayed with its type and size in bytes (if computable). This display can be used to visualize the size and structure of large, complicated composite types.
- Locate_Default_Initialization: Locates record types with default initialization and collects all object declarations of these types.
- Locate_Expensive_Types: Locates all type declarations that are nested beyond a given threshold level, types with representation specifications, packed types, and unconstrained types.
- Locate_Expressions: Locates all expressions and separates them by kind. Location of slices and aggregates can be selected with Boolean parameters.
- Locate_Others_Clauses: Locates all others clauses used in case statements, exception handlers, or aggregates.
Operator Selection
Detailed Objectives
Some operators are more expensive than others. Multiplication is typically much less expensive than division, for example. When scaling a value, it is often better to multiply by the precomputed inverse of the value than to perform division. String concatenation is so expensive on some machines that it often must be avoided entirely.
Short-circuit Boolean operators can be more efficient while maintaining semantic equivalence of the conditional expression. These operations allow the evaluation of the second part of an expression to be skipped when the first part completely determines the outcome of the Boolean expression. This can save time when the second expression is complex, requiring many instructions to evaluate.
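A sketch of the difference, assuming Table, I, Key, and Found are already declared:

```ada
--  "and" evaluates both operands, so Table (I) is indexed even
--  when I is outside Table'Range.
Found := I in Table'Range and Table (I) = Key;

--  "and then" skips the right operand whenever the left is False,
--  saving the evaluation and avoiding the out-of-range index.
Found := I in Table'Range and then Table (I) = Key;
```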
Applicable Commands and Output Interpretation
- Locate_Operators: Locates all operator usage and provides the type and expressions for each operand.
- Locate_Short_Circuit_Opportunities: Locates all occurrences of the "or", "and", "or else", and "and then" operators. Right and left operand expressions are included in the output.
Loop Nesting
Detailed Objectives
Loop statements execute the statements within them many times. As the number of iterations increases, the need to optimize the loop contents becomes more important. Nested loops have a multiplier effect with the contents of the inner loop executed M*N times. Identifying such "hot spots" can help the user focus on places where optimization efforts may have the most impact.
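The multiplier effect is easy to see in a sketch (Grid and Process are hypothetical):

```ada
for I in 1 .. M loop
   for J in 1 .. N loop
      Process (Grid (I, J));   --  executed M * N times: any saving
                               --  here is multiplied accordingly
   end loop;
end loop;
```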
Applicable Commands and Output Interpretation
- Locate_Loop_Nesting: Locates all loops that have a static number of iterations greater than a given threshold value or loops that are nested deeper than a given threshold depth. This analysis is extended across procedure calls. That is, if a subprogram is called within a loop and that subprogram also contains a loop or calls (transitively) another subprogram that contains a loop, then that is counted as one additional level of loop nesting.
Object Initialization
Detailed Objectives
As discussed in the section on programming errors, an uninitialized variable can be a very difficult error to locate. One strategy for avoiding uninitialized variables is to ensure that all variable declarations have an explicit initial value. This may avoid uninitialized variables, but it can impact performance when the variable is also set by one of the following means before it is ever used:
- Implicitly initialized through default initialization of record types;
- Immediate assignment; or
- Used as an out parameter in a procedure call.
In these cases, the variable is set twice before it is used. Such cases can usually be eliminated by removing the explicit initial value.
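A minimal sketch of the pattern (Read_Sensor and Record_Value are hypothetical):

```ada
procedure Sample is
   Count : Integer := 0;     --  explicit initial value
begin
   Count := Read_Sensor;     --  set again before any use: the
                             --  initializer above is dead work
   Record_Value (Count);
end Sample;
```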
Applicable Commands and Output Interpretation
- Display_Set_Twice_Before_Use: Traverses a subprogram or set of subprograms and identifies local variables that are set twice before they are ever used.
Use of Text_Io
Detailed Objectives
The use of package Text_Io in programs intended to execute on embedded targets can sometimes be a problem. This package is often very large and can add a significant amount of space to the load image of a program. The use of Text_Io in the early phases of a program's development on the host can greatly help some forms of testing. As the program moves to the target, however, it might be necessary to replace all Text_Io calls with calls to a lower-level set of I/O services.
Applicable Commands and Output Interpretation
- Locate_Compatibility_Problems: Locates all references to package Text_Io in addition to other potential compatibility problems. A file is used to enable or disable checking of each rule. Modifying this file allows checking only for the use of Text_Io, not for all compatibility checks.
Compiler Dependencies
Detailed Objectives
All compilers have constructs that are not implemented as efficiently as other alternative constructs. When these constructs are known (sometimes the compiler documentation provides such a list under "Programming Guidelines"), several Locate_* commands can be used to find all instances of those constructs. The commands Locate_Statements, Locate_Expressions, Locate_Attributes, and Locate_Named_Declarations are likely the most useful for this objective. (See the section titled "Construct Location" for more details on the use of these commands.)
Miscellaneous Objectives
This section describes a few miscellaneous objectives that do not necessarily fit into any of the major categories above. It includes:
- Unit-testing support
- Documentation support
- Elaboration problems
- Null statements
- Hard-copy listings
Unit-Testing Support
Detailed Objectives
Testing can be a very time-consuming and expensive process. Several forms of testing are, of course, possible during a project. Especially during the early phases of the development cycle, testing is often informal. Programs are executed by the user many times, perhaps under control of a debugger, and repairs are made immediately as errors are found. In later phases, formal test programs are often developed. Their advantage is that the same testing can be repeated many times to ensure that minor changes to other parts of the program do not affect the correct functioning of the tested software. This is typically called regression testing.
Formal unit tests typically set some initial conditions (input), execute the unit (a subprogram), and then compare the result (outputs) against expected values. Inputs include all variables and parameters that the program uses during execution. Outputs include all variables and parameters that the subprogram sets. The setting of inputs and outputs depends on the actual path that execution takes through the subprogram. This objective focuses on identifying each possible path through a subprogram and computing the objects that are set and used along each path. This information can be used to help create unit tests for a subprogram. Coverage tests (that is, a set of tests that execute all paths in a program) can also be developed from this information.
Applicable Commands and Output Interpretation
- Display_Subprogram_Branch_Points: Locates all subprogram paths and displays the value of the conditional expression at each branch point necessary to follow that path. The variables that are set and used along each path are also listed.
- Locate_Subprogram_References: Locates all references to objects (including parameters) within a subprogram. The output displays what kind of object is referenced and whether the subprogram sets, uses, or both sets and uses the variable. This can be used to determine what inputs are necessary and the effect of those inputs on other data objects.
Documentation Support
Detailed Objectives
In many cases, large segments of detailed documentation can be generated from the software itself. Ada was designed to support self-documenting code. Although that goal cannot be completely realized, the software does contain large amounts of useful information that can be extracted and delivered as documentation to the customer. The output of commands that collect information about software content and interrelationships can be included in, or delivered separately as, system documentation. Two examples, Locate_Objects_Set_And_Used and Locate_Subprograms_Propagating_Exceptions, are listed below.
Comment annotations can increase the information content associated with specific constructs. This information can be extracted and cross-referenced to the Ada units in which they appear or to the specific construct to which they are attached.
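For example, an annotation might take a form such as the following. The `--|` prefix and field names here are purely illustrative; the annotation conventions recognized at a given site are configurable.

```ada
--  Hypothetical annotated declaration; the --| comments form a
--  structured annotation attached to the declaration above them.
procedure Transmit (Packet : in Buffer_Type);
--| Purpose : Sends one packet over the active channel.
--| Raises  : Channel_Down when the link is not open.
```

Extracting such annotations and cross-referencing them to their declarations yields documentation that stays physically close to the code it describes.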
Applicable Commands and Output Interpretation
- Display_Unit_Declarations: Displays all unit declarations in a hierarchical format similar to the code itself. Comments, blank lines, and other extraneous details are not included. This provides a high-level representation of code that is easy to scan for overall structure.
- Locate_Annotations: Locates all annotations (structured comments) and collects the content of each annotation if specified.
- Locate_Objects_Set_And_Used: Locates all object declarations and collects all sets and uses of that object.
- Locate_Subprograms_Propagating_Exceptions: Locates all subprograms and the exceptions that are propagated from them. Exception propagation cannot be expressed with Ada constructs, and annotations documenting it can easily become inconsistent with the code, so deriving this information directly from the source is more reliable.
Elaboration Problems
Detailed Objectives
Ada defines an initial phase of program execution called elaboration, during which static variables are initialized, tasks are made ready to run, and subprogram and generic bodies are checked to ensure that they have been elaborated before they are called or instantiated. This phase occurs before the first statement of the main procedure is executed. Local variables are also elaborated during execution, upon entry to each subprogram.
Occasionally, a program will not execute because of some problem during elaboration. Some debuggers may help to illuminate the problem, but sometimes the program simply hangs with no message reported. The Locate_Elaboration_Impacts command can be used to locate all places in the code that impact elaboration. One frequent problem is the use of a function as an initial-value expression before the body of that function has been elaborated. These instances can be found by examining all static objects that are initialized with function calls.
Analysis is also available to determine if a unit is preelaborable or pure as specified by section 10.2.1 of the Ada 95 LRM.
Applicable Commands and Output Interpretation
- Locate_Elaboration_Impacts: Displays all aspects of a program that are part of or impact elaboration. This includes package begin blocks, task declarations and object declarations containing tasks, initialization of static objects, and use of pragma Elaborate.
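The function-call initialization problem described above can be sketched as follows; the package Defaults and function Initial_Size are hypothetical names introduced for this example.

```ada
with Defaults;
pragma Elaborate (Defaults);  --  forces the body of Defaults to be
                              --  elaborated before this specification
package Config is
   --  Without the pragma above, this call could raise Program_Error
   --  during elaboration if the body containing Initial_Size has not
   --  yet been elaborated.
   Table_Size : constant Natural := Defaults.Initial_Size;
end Config;
```

Locate_Elaboration_Impacts would report both the static initialization with a function call and the use of pragma Elaborate.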
Null Statements
Detailed Objectives
Null statements may not seem important since they are a simple concept and generate no code. The problem is that null statements are often used as placeholders when the developer wants to defer complete implementation for some reason. Ada requires at least one statement in many places, and a null statement is easily supplied to produce a unit that will compile and execute. This process is often called stubbing. Prototyping and stepwise implementation strategies often use this technique. In later phases of development, it is expected that all null statements have been replaced with implementation code and that any remaining ones truly mean that the program should do nothing for the case at hand. The existence of null statements in code can therefore indicate unimplemented requirements.
In addition, null statements that do not appear alone in statement lists are superfluous and should be removed.
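A typical stub might look like this; the procedure name and its purpose are invented for illustration.

```ada
--  Stub: Ada requires at least one statement in the body, so a null
--  statement lets the unit compile and run before it is implemented.
procedure Save_State (File_Name : in String) is
begin
   null;  --  placeholder: real implementation deferred
end Save_State;
```

Locate_Statements would report this null statement as appearing alone in its statement list, flagging it as a possible unimplemented requirement rather than a superfluous statement.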
Applicable Commands and Output Interpretation
- Locate_Statements: Locates all null statements (selectable by parameter) and indicates whether the null statement appears alone or with other statements.
Hard-Copy Listings
Detailed Objectives
The Ada Analyzer is intended primarily for interactive use. The traversal built into all hyper-tables greatly accelerates the process of locating areas where improvement is possible. Hard copy of command output does not contain the traversal feature but has the advantage that it can be taken into review meetings or distributed as general reference material.
Applicable Commands and Output Interpretation
- Generate_Listing: Generates a combined listing of several Ada units in FrameMaker MIF format. Unit images can be generated with line numbers, unit-name headers, page numbers, and a complete table of contents. Various other options are available for controlling the precise format of the output. The resulting output can be opened by FrameMaker or FrameViewer and then printed.
Rational Software Corporation
http://www.rational.com support@rational.com techpubs@rational.com
Copyright © 1993-2000, Rational Software Corporation. All rights reserved.