
Introduction

This product section focuses on analyzing the software design that has been developed from the requirements (software, system, and/or interface). This topic describes some of the methods and techniques Software Assurance and Software Safety personnel may use to evaluate the quality of the architecture and design elements that were developed.

The software design process begins with a good understanding of the requirements, the system architecture, and the system design. The architectural design begins with the development of a basic architecture and a high-level preliminary design. The architectural design is then expanded into a low-level detailed design. By the time the detailed design is complete, software engineering should be able to implement it into the code of the desired software system or application.

Since the design will be what primarily guides the code implementation, it is particularly important to ensure that the architecture and design are correct, safe, secure, complete, and understandable, and that they capture the intent of the requirements. The detailed design captures the low-level component-based approach to implementing the software requirements, including the requirements associated with fault management, security, and safety. When the detailed design is complete, the analysis of the requirements traceability should show the relationship between the detailed software design components and the software requirements, providing evidence that all of the requirements are accounted for. Thus, an important part of ensuring the final system is correct, safe, and secure is making sure the design accurately represents all the requirements.

The information in this topic is divided into several tabs as follows:

  • Tab 1 – Introduction
  • Tab 2 – Software Design Analysis Guidance – provides general guidance for doing software design analysis
  • Tab 3 – Safety Analysis During Design – provides additional guidance when safety-critical software is involved, with analysis emphasis on safety features
  • Tab 4 – Analysis Reporting Content – provides guidance on the analysis report product content
  • Tab 5 – Resources for this topic

The following is a list of the applicable SWE requirements that relate to the generation of the software design analysis product:


SWE-034
  NPR 7150.2 Requirement: The project manager shall define and document the acceptance criteria for the software.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  1. Confirm software acceptance criteria are defined and assess the criteria based on guidance in the NASA Software Engineering Handbook, NASA-HDBK-2203.

SWE-134
  NPR 7150.2 Requirement: If a project has safety-critical software or mission-critical software, the project manager shall implement the following items in the software:
  a. The software is initialized, at first start and restarts, to a known safe state.
  b. The software safely transitions between all predefined known states.
  c. Termination performed by software of functions is performed to a known safe state.
  d. Operator overrides of software functions require at least two independent actions by an operator.
  e. Software rejects commands received out of sequence when execution of those commands out of sequence can cause a hazard.
  f. The software detects inadvertent memory modification and recovers to a known safe state.
  g. The software performs integrity checks on inputs and outputs to/from the software system.
  h. The software performs prerequisite checks prior to the execution of safety-critical software commands.
  i. No single software event or action is allowed to initiate an identified hazard.
  j. The software responds to an off-nominal condition within the time needed to prevent a hazardous event.
  k. The software provides error handling.
  l. The software can place the system into a safe state.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  6. Analyze the software design to ensure:
  a. Use of partitioning or isolation methods in the design and code,
  b. That the design logically isolates the safety-critical design elements and data from those that are non-safety-critical.

SWE-057
  NPR 7150.2 Requirement: The project manager shall transform the requirements for the software into a recorded software architecture.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  1. Assess that the software architecture addresses or contains the software structure, qualities, interfaces, and external/internal components.
  2. Analyze the software architecture to assess whether software safety and mission assurance requirements are met.

SWE-143
  NPR 7150.2 Requirement: The project manager shall perform a software architecture review on the following categories of projects:
  a. Category 1 Projects as defined in NPR 7120.5 (SWEREF-082).
  b. Category 2 Projects as defined in NPR 7120.5 that have Class A or Class B payload risk classification per NPR 8705.4 (SWEREF-048).
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  1. Assess the results of or participate in software architecture review activities held by the project.

SWE-058
  NPR 7150.2 Requirement: The project manager shall develop, record, and maintain a software design based on the software architectural design that describes the lower-level units so that they can be coded, compiled, and tested.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  1. Assess the software design against the hardware and software requirements, and identify any gaps.
  2. Assess the software design to verify that the design is consistent with the software architectural design concepts and that the software design describes the lower-level units to be coded, compiled, and tested.
  3. Assess that the design does not introduce undesirable behaviors or unnecessary capabilities.
  4. Confirm that the software design implements all of the required safety-critical functions and requirements.
  5. Perform a software assurance design analysis.

SWE-080
  NPR 7150.2 Requirement: The project manager shall track and evaluate changes to software products.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  1. Analyze proposed software and hardware changes to software products for impacts, particularly to safety and security.

SWE-081
  NPR 7150.2 Requirement: The project manager shall identify the software configuration items (e.g., software records, code, data, tools, models, scripts) and their versions to be controlled for the project.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  2. Assess that the software safety-critical items are configuration managed, including hazard reports and safety analysis.

SWE-203
  NPR 7150.2 Requirement: The project manager shall implement mandatory assessments of reported non-conformances for all COTS, GOTS, MOTS, OSS, or reused software components.
  NASA-STD-8739.8 Software Assurance and Software Safety Tasks:
  2. Assess the impact of non-conformances on the safety, quality, and reliability of the project software.


2. Software Design Analysis Guidance

In software design, software requirements are transformed into the architectural design with a software architecture and a high-level preliminary design followed by the more specific detailed software design. The architecture establishes the interfaces, overall layout/structure, and data flow of the software. The high-level preliminary design identifies the specific individual components (e.g., files, functions, subroutines, classes, modules) for each software program/application along with a description of what that piece does. In addition, it should include items such as the inputs, outputs, units, and data types along with databases and interfaces (e.g., hardware, operator/user, software program/applications, system and subsystems). 

The detailed design takes the high-level components, files, functions, subroutines, classes, etc. and breaks them down to the point where they become pseudo-code with variable names and associated descriptions identified and the logic flow stubbed out. As project budgets tighten, more and more software organizations are embedding the detailed design in the source code and extracting it with tools like Javadoc and Doxygen. (Note: This is not an endorsement of these tools.) So, Software Assurance and Software Safety personnel should be aware they may receive the detailed design documentation in a less traditional manner. For small software systems, the architectural and detailed design may be combined.

The design addresses the software architectural design and software detailed design.  The objective of doing design analysis is to ensure that the design: 

  • is a correct, accurate, and complete transformation of the software requirements that will meet the operational needs under nominal and off-nominal conditions,
  • is safe,
  • is secure with known weaknesses and vulnerabilities mitigated,
  • introduces no unintended features, and
  • does not result in unacceptable operational risk.

The design should also be created considering portability, performance, and maintainability so future changes can be made quickly without the need for significant redesign.

There are several design techniques described below that help with the analysis of the design. Each of these may be used by Software Assurance and Software Safety personnel to help ensure a more robust design. Additionally, these personnel should be aware of the Topic section – Software Design Principles – that addresses specific aspects of the design.

Tab 3 (Safety Analysis During Design) contains a more extensive list of analysis techniques that may be used by the Software Safety personnel.

Software Assurance and Software Safety tasks in NASA-STD-8739.8 (SWEREF-278) that relate to design analysis are found in SWE-052, SWE-058, SWE-060, SWE-087, SWE-134, and SWE-157.

2.1 Use of Checklists and Known Best Practices

As part of the design analysis, Software Assurance and Software Safety personnel review the design to ensure that general design best practices have been implemented (see below). The use of the SADESIGN Checklist (see below) is important when evaluating the software design as it highlights many best practices. There are other aids in this Handbook that may be used for evaluating the design. They are the Programming Checklists Topic: 6.1 - Design for Safety Checklist and the Software Design Principles Topic: Software Design Principles. This information should be considered during the analysis for both safety critical software and non-safety critical software. Teams may decide to formulate some of this information into a checklist that is applicable to their project.

General Design Best Practices:

Some general design best practices to consider are:

  • Break the design into smaller chunks. Don’t try to design it all at once.
  • Keep the design simple.
  • Keep the design modular so it will be easier to test and maintain.
  • Keep boundaries, interfaces, and constraints in mind.
  • Strive for maximum cohesion and minimum coupling. (Cohesion groups together the things that make sense; coupling is the relative dependence between the modules)
  • Use abstraction to increase the reusability of modules. (Abstraction is the reduction of a body of data to a simplified representation of the whole.)
  • Consider how the users will use and interact with the system. Keep the user interface design user friendly.
  • Include error handling in the designs.
  • Don’t duplicate sections of code – if the sections of code need to be used repeatedly, put them into a function, a package, or subroutine that can be called.
  • Prototype new approaches or designs for difficult requirements.
  • Peer review designs, particularly interfaces, data flows, and logic flows.
  • Use design practices such as documentation review, pseudo code, process diagrams, and logic diagrams to aid in evaluating the design.


Additional guidance and some key design practices may also be found in SWE-058, tab 7.



SADESIGN Checklist:

    1. Has the software design been developed at a low enough level for coding?
    2. Is the design complete and does it cover all the approved requirements?
    3. Have complex algorithms been correctly derived, do they provide the needed behavior under off-nominal and assumed conditions, and is the derivation approach known and understood to support future maintenance?
    4. Has the design been examined to ensure that it does not introduce any undesirable behaviors or any capabilities not in the requirements?
    5. Have all requirements sources been considered when developing the design (e.g., system requirements, interface  requirements, databases, etc.)?
    6. Have the interfaces with COTS, MOTS, GOTS, and Open Source been designed (e.g., APIs, .dlls)?
    7. Have all internal and external software interfaces been designed for all (in-scope) interfaces with hardware, user, operator, software, and other systems and are they detailed enough to enable the development of software components that implement the interfaces?
    8. Are all safety features in the design (e.g., mitigations, controls, barriers, must-work requirements, must-not-work requirements)?
    9. Does the design provide the dependability/reliability and fault tolerance/Fault Detection, Isolation, and Recovery (FDIR) required by the software, and is the design capable of controlling identified hazards? Does the design create any hazardous conditions?
    10. Does the design adequately address the identified security requirements both for the software and security risks, including the integration with external components as well as information and data utilized, stored, and transmitted through the software?   
    11. Does the design prevent, control, or mitigate any identified security threats, weaknesses and vulnerabilities? Are any unmitigated weaknesses and vulnerabilities documented as risks and addressed as part of the software and software operations?
    12. Have operational scenarios been considered in the design (for example, use of multiple individual programs to obtain one particular result may not be operationally efficient or reasonable; transfers of data from one program to another should be electronic, etc.)?
    13. Have users/operators been consulted during design to identify any potential operational issues?
    14. Maintainability: Has maintainability been considered? Is the design modular? Is the design easily extensible? Is it designed to allow for the addition of new capabilities and functionality?
    15. Portability: Has portability been considered? Are environmental variables used? Can the software be moved to other environments quickly?
    16. Is the design easy to understand?
    17. Is the design unnecessarily complicated?
    18. Is the design adequately documented for usability and maintainability?
    19. Does the design address error handling?
    20. Has software performance been considered during design? Has the software design been optimized for efficiency to reduce system load, run-time length/speed, etc.?
    21. Has the level of coupling (interactivity between modules) been kept to a minimum?
    22. Has software planned for reuse and OTS software in the system been examined to determine if it meets the requirements and performs appropriately within the required limits for this system? Has the software been evaluated for security vulnerabilities and weaknesses?
    23. Does this software introduce any undesirable capabilities or behaviors?
    24. Has the software design been peer reviewed?
    25. Are components referenced by more than one application, file, module, components, functions, subroutines, classes, etc. stored in a common area such as a library, class, or package?

2.2 Use of peer reviews or inspections

Design items designated in the software management/development plans are peer reviewed or inspected. Some of the items to look for during these meetings are:

  1. Assess the software design against the hardware and identify any gaps.
  2. Assess the software design against the system requirements and design and identify any gaps.
  3. Confirm that the detailed design is consistent with the architectural design and describes the program’s or application’s components at a low enough level for coding.
  4. Confirm the design does not contain undesirable functionality.
  5. Confirm the safety-related requirements (e.g., SWE-134) have been taken into account for safety-critical software.
  6. Confirm the design addresses possible unauthorized access, vulnerabilities, and weaknesses.

2.3 Review of Traceability Matrices

Review the traces from requirements to design and design to requirements to ensure all requirements are completely accounted for. As the project moves into implementation, the bi-directional traceability matrices between design and code should also be checked.
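
As a minimal sketch of the bi-directional check described above, the following Python example assumes the traces have been exported as simple (requirement ID, design element ID) pairs. The function name `find_trace_gaps` and the identifiers such as `SRS-001` and `DES-A` are hypothetical illustrations, not part of any NASA tool or standard; a real project would pull this data from its own requirements management system.

```python
def find_trace_gaps(requirements, design_elements, traces):
    """Report gaps in a requirements-to-design trace (illustrative only).

    requirements    -- iterable of requirement IDs (hypothetical format)
    design_elements -- iterable of design element IDs (hypothetical format)
    traces          -- iterable of (requirement_id, design_element_id) pairs
    """
    req_set = set(requirements)
    design_set = set(design_elements)
    traces = list(traces)
    traced_reqs = {req for req, _ in traces}
    traced_design = {elem for _, elem in traces}

    # Forward direction: every requirement should reach a design element.
    untraced_requirements = sorted(req_set - traced_reqs)
    # Backward direction: a design element with no requirement may be an
    # unintended capability and deserves a closer look.
    orphan_design_elements = sorted(design_set - traced_design)
    # Traces that reference IDs missing from either baseline are findings too.
    dangling_traces = sorted(
        (req, elem) for req, elem in traces
        if req not in req_set or elem not in design_set
    )
    return untraced_requirements, orphan_design_elements, dangling_traces


# Hypothetical example data.
example_requirements = ["SRS-001", "SRS-002", "SRS-003"]
example_design = ["DES-A", "DES-B", "DES-C"]
example_traces = [
    ("SRS-001", "DES-A"),
    ("SRS-002", "DES-A"),
    ("SRS-002", "DES-B"),
    ("SRS-009", "DES-B"),  # refers to a requirement that is not in the baseline
]

untraced, orphans, dangling = find_trace_gaps(
    example_requirements, example_design, example_traces
)
print("Requirements with no design element:", untraced)
print("Design elements with no requirement:", orphans)
print("Traces referencing unknown IDs:", dangling)
```

Each non-empty result would normally be recorded as a finding in the problem/issue tracking system and discussed with the development team.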

2.4 Software Architecture Review Board (SARB) Analysis - applies to NASA projects only

The Software Architecture Review Board (SARB) is a NASA-wide board that engages with flight projects in the formative stages of the software architecture. The objectives of the SARB are to manage and/or reduce flight software complexity through better software architecture and help improve mission software reliability and save costs. NASA projects that meet certain criteria (for example, large projects, ones with safety critical concerns, projects destined for considerable reuse, etc.) may request the SARB to do a review and assessment of their architecture. For more guidance on the focus areas of the SARB, see the SWE-143 – Tab 3 in this Handbook. For more information on the SARB or to request a review, please visit the SARB site on the NASA Engineering Network (NEN).

2.5 Problem/Issue Tracking System

Per SWE-088 – Task 2, all analysis non-conformances, findings, defects, issues, concerns, and observations are documented in a problem/issue tracking system and tracked to closure. These items are communicated to the software development personnel and possible solutions discussed. The level of risk associated with the finding/issue should be reflected in the priority given in the tracking system. The analysis performed by Software Assurance and Software Safety may be reported in one combined report, if desired.


3. Safety Analysis During Design

The Safety Design Analysis is a portion of the overall Software Safety Analysis that is performed on all safety-critical software, as defined in NASA-STD-8739.8 (SWEREF-278). A full Software Safety Analysis encompasses all the aspects of the development life cycle (requirements, design, implementation, and test) for safety-critical software and focuses on the safety features (safety requirements, controls, mitigations, fault identification, isolation and recovery, etc.). During the Design phase, Software Safety personnel analyze the design to ensure that it will not adversely impact the safety of the system/software. This tab discusses the Software Safety Analysis activities during design.

3.1 Review Software Design Analysis Information

To begin the Safety Design Analysis, the Software Safety and SA personnel should collaborate on the activities on Tab 2 – Software Design Analysis Guidance. However, Software Safety personnel should perform an independent analysis to become familiar with the design. Both teams should review each other’s Software Design Analysis results to ensure that all safety aspects have been adequately considered and addressed in the software design. In addition to the techniques and activities on Tab 2, it may be useful to use any of the following information for analyzing safety-critical software:

  1. Topic 6.1 - Design for Safety Checklist, found in this Handbook under the “Programming Checklists” Tab of Topics, and
  2. Topic 9.01 - Software Design Principles, found in this Handbook under the “Software Design Principles” Tab of Topics.

After reviewing the analysis work done to date and the applicable checklists, examine the various operational scenarios (nominal and off-nominal) for what could go wrong with the mission or if the software (or hardware) fails. (This scenario information may be in a preliminary Hazard Report; however, some of the scenarios may not have been identified yet and are a product of this exercise.) Review the Software Design to see if the mishaps or failures are accounted for. It may be necessary to reverse engineer the scenarios to ensure that the software design accounts for them and has the proper hooks in place to deal with any faults or failures.

3.2 Design Peer Reviews or Walk-throughs

Peer reviews or walk-throughs for safety-critical components are recommended techniques to aid in identifying software design problems or issues in safety-critical components early. These meetings allow problems and issues to be revealed and worked prior to design rollout at Milestone Reviews (e.g., Preliminary, Critical). Software Safety personnel participate in these meetings to monitor and analyze the safety aspects of the software design including any changes, and to continue updating their hazard analysis (see Software Safety and Hazard Analysis product).

One of the most important aspects of a design for safety critical software is to design for minimum risk. “Minimum risk” includes the hazard risks (including loss of life, mission, and space assets), security risks, design choice risks, human errors, and other types of risk such as programmatic, cost, schedule, etc. When possible, the design should eliminate or mitigate identified hazards and risks or reduce the associated risk through design (e.g., redundancy, isolating safety critical software). Listed below are some ways to mitigate or reduce risks through software design. This list may be used by meeting attendees to help evaluate the design with respect to safety and risk considerations.


   Safety Considerations during Design Peer Reviews/Walk-throughs:

             Does the design:
      • Reduce the complexity of the software and interfaces?
      • Favor user safety over user friendliness?
      • Support testability during development and integration, including ways that the internals of a component can be adequately tested to verify that they are working properly?
      • Give more design “resources” (such as time and effort) to the higher-risk aspects such as hazard controls?
      • Include separation of commands, functions, files, and ports (e.g., vehicle commanding, hardware, user input)?
      • Include design for Shutdown/Recovery/Safing?
      • Plan for monitoring of system/software/hardware performance and detection (for faults, malfunctions, exceeding limits, etc.)?
      • Isolate the components containing safety-critical requirements as much as possible?
      • Minimize interface interactions between safety-critical components?
      • Document the positions and functions of safety-critical components in the design hierarchy?
      • Document how each safety-critical component can be traced back to the original safety requirements and how the requirements are implemented?
      • Specify safety-related design and implementation constraints such as returning the spacecraft to a safe recoverable state after a failure or shutting off a pressure valve before tank limits are exceeded?
      • Document execution control, interrupt characteristics, initialization, synchronization, and control of the components? For high-risk systems, interrupts should be avoided since they may interfere with software safety controls. Any interrupts used should be priority-based.
      • Specify any error handling/detection or recovery schemes for safety-critical components?
      • Consider hazardous operations scenarios which may require additional software constraints such as executing commanding operations in a two-step process (arm and fire)?
      • Fully consider safing and recovery actions for real-world conditions and the corresponding time to criticality? Automatic safing is often required if the time to criticality is shorter than the realistic human operator response time, or if there is no human in the loop. This may be performed by either hardware or software or a combination depending on the best system design to achieve safing.
    • Follow a strategy for handling faults and failures? A consistent strategy for handling faults and failures should be used. Some of the techniques that may be used in fault management are:
      • Prevent Fault Propagation: To prevent fault propagation (cascading of a software error from one component to another), safety-critical components must be fully independent of non-safety-critical components and be able to detect an error and not pass it along.
      • Shadowing: A higher level process emulates lower level processes to predict expected performance and decides if failures have occurred in the lower processes. The higher level process implements appropriate redundancy switching when it detects a discrepancy.
      • Built-in Test: Fault/Failure Detection, Isolation and Recovery (FDIR) can be based on self-test such as Built-in-Test (BIT) of lower tier processors where the lower level units test themselves and report their status to the higher processor. The higher level processor switches out units reporting a failed or bad status.
      • Majority voting: Some redundancy schemes are based on this technique. It is especially useful when the criteria for diagnosing failures are complicated (e.g., when an unsafe condition is defined by exceeding an analog value rather than simply a binary value). An odd number of parallel units is required to achieve majority voting.
      • Fault Containment Regions: Establish a Fault Containment Region (FCR) to prevent fault propagation such as from non-safety critical software to safety-critical components; from one redundant software unit to another, or from one safety-critical component to another. Techniques such as firewalling or “come from” checks should be used to provide sufficient isolation of FCRs to prevent hazardous fault propagation. FCRs can be best partitioned or firewalled by hardware. A typical method of obtaining independence between FCRs is to host them on different and independent hardware processors.
      • Redundant architecture: In redundant architecture, there are two versions of the operational code which do not need to operate identically unless required. The primary version is a high performance version with all required functionality and performance requirements implemented. If problems occur with the primary version, control will be passed to the secondary version (called a safety kernel) during failover. Depending on the requirements, the secondary version may have the same or reduced functionality.
      • Recovery blocks: These use multiple software versions to find and recover from faults. Output from a block is checked against an acceptance test. If it fails, then another version computes the output and the process continues. Each successive version is more reliable but less efficient. If the last block fails, the program must determine some way to fail safe.
      • Self-checks: This is a type of dynamic fault detection. Self-checks can include replication (copies must be identical if the data is to be considered correct), reasonableness (is the data reasonable, based on other data in the system), and structural (are components manipulating complex data correctly).
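
Of the fault management techniques listed above, majority voting lends itself to a short illustration. The Python sketch below assumes three redundant channels that each report an analog reading (for example, a tank pressure) and that the unsafe condition is "reading exceeds a limit". The function names, the limit, and the readings are hypothetical values chosen for illustration; they are not taken from any flight system.

```python
def majority_vote_exceeds(readings, limit):
    """Return True if a strict majority of redundant channels exceed the limit.

    An odd number of channels is required so that the vote cannot tie.
    """
    if len(readings) % 2 == 0:
        raise ValueError("majority voting needs an odd number of channels")
    votes_unsafe = sum(1 for value in readings if value > limit)
    return votes_unsafe > len(readings) // 2


def disagreeing_channels(readings, limit):
    """Return the indexes of channels whose vote differs from the majority."""
    majority = majority_vote_exceeds(readings, limit)
    return [i for i, value in enumerate(readings) if (value > limit) != majority]


# Hypothetical readings from three redundant pressure channels.
PRESSURE_LIMIT = 100.0
channel_readings = [101.0, 99.5, 102.3]

unsafe = majority_vote_exceeds(channel_readings, PRESSURE_LIMIT)
suspects = disagreeing_channels(channel_readings, PRESSURE_LIMIT)
print("Unsafe condition declared:", unsafe)
print("Channels disagreeing with the majority:", suspects)
```

Reporting the disagreeing channels alongside the vote gives the fault management design a candidate unit to switch out, which is one way the voting result could feed redundancy management.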

Does the software design:

      • Consider any potential issues with the use of COTS, Open Source, reused or inherited code?
      • Consider sampling rate selection for noise levels and expected variations of control system and physical parameters?
      • Identify and document tests and/or verification methods for each safety-critical design feature?
      • Consider maintainability in the design? For example: anticipate potential changes in the software, use a modular design, object-oriented design, uniform conventions, and naming conventions, use coding standards that support safety practices, use documentation standards, common tool sets.

Some additional safety-specific design considerations are:

    • Are the design and its safety features appropriately flowed from the requirements and the evolving hazard analyses?
    • Has the design been reviewed to ensure that the software design’s correct implementation of safety controls or processes does not compromise other system/software safety features or the functionality of the software?
    • Have additional system hazards, causes, or contributions discovered during the software design analysis been documented in the required system safety documentation (e.g., Safety Data Package and/or Hazard Reports)?
    • Have the controls, mitigations, inhibits, and safety design features to be incorporated into the design been approved by the Safety Review team?
    • Are any needed or identified safety conditions, constraints, parameters, trigger points, boundary conditions, environments, and other software circumstances for safe operation, in the appropriate modes and states? Are they all flowed from the software requirements and incorporated into the design?
    • Does the design maintain the system in a safe state during all modes of operation or can it transition to a safe state when and if necessary? Can the system recover from the safe state?
    • Are the partitioning or isolation methods used in the design effective at logically isolating the safety-critical design elements from those that are non-safety-critical? This is particularly important with the incorporation of COTS or integration of legacy, heritage, and reuse software. Any software that can write or provide data to safety-critical software will also be considered safety-critical unless isolation is built in, and then the isolation design is considered safety-critical.
    • Does the software design include appropriate fault or failure tolerance as planned?
    • If heritage code is being used, is there a clear understanding of the design and constraints associated with any fault management in the heritage code? Are they appropriate for the current system being developed?
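
The isolation consideration in the list above (software that provides data to safety-critical software is itself treated as safety-critical unless isolation is built in) can be explored with a small data-flow sketch. The Python example below is an illustration under stated assumptions: the design is represented as (producer, consumer) pairs, isolated flows are listed in a dictionary that names the guarding mechanism, and all component and mechanism names are hypothetical.

```python
from collections import deque


def propagate_criticality(initial_critical, data_flows, isolation=None):
    """Work backward along data flows to find implied safety-critical items.

    initial_critical -- components already designated safety-critical
    data_flows       -- iterable of (producer, consumer) pairs
    isolation        -- dict mapping (producer, consumer) to the name of the
                        isolation mechanism guarding that flow (assumed input)
    Returns (critical_components, critical_isolation_mechanisms).
    """
    isolation = isolation or {}
    producers_of = {}
    for producer, consumer in data_flows:
        producers_of.setdefault(consumer, []).append(producer)

    critical = set(initial_critical)
    critical_guards = set()
    queue = deque(critical)
    while queue:
        consumer = queue.popleft()
        for producer in producers_of.get(consumer, []):
            guard = isolation.get((producer, consumer))
            if guard is not None:
                # The flow is isolated, so the producer is not pulled in,
                # but the isolation design itself becomes safety-critical.
                critical_guards.add(guard)
            elif producer not in critical:
                critical.add(producer)
                queue.append(producer)
    return critical, critical_guards


# Hypothetical design fragment.
flows = [
    ("sensor_driver", "valve_controller"),
    ("command_parser", "valve_controller"),
    ("ground_ui", "command_parser"),
    ("valve_controller", "logger"),
]
guards = {("ground_ui", "command_parser"): "command_validation_gate"}

components, mechanisms = propagate_criticality({"valve_controller"}, flows, guards)
print("Treat as safety-critical:", sorted(components))
print("Safety-critical isolation mechanisms:", sorted(mechanisms))
```

In this hypothetical fragment, `logger` only consumes data from the controller and `ground_ui` sits behind an isolation mechanism, so neither is pulled into the safety-critical set.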

3.3 Other Types of Design Analysis

There are other types of analyses that may be useful during design but require more time and effort to perform. The Safety Team should consider them and choose those they feel would provide the most value, depending on the areas where risk is highest in the design. Some of these design analysis methods are:

  1. Acceptable Level of Safety: Once the design is fairly mature, a design safety analysis may be done to determine whether an acceptable level of safety will be attained by the designed system. This analysis involves analyzing the design of the safety components to ensure that all safety requirements are specified correctly. Check to assure the requirements are updated once the design has determined exactly what safety features will be included in the system/software. Review the design looking for places and conditions that lead to unacceptable hazards. Consider the credible faults or failures that could occur and evaluate their effects on the designed system. Does the designed system produce the desired result with respect to the hazards? Think about what the system will do for all the “what if” cases and trace through how the system would respond. Does it respond in a safe manner?
  2. Prototyping or simulating: Prototyping or simulating parts of the design may show where the software can fail. In addition, this can demonstrate whether the software can meet the constraints it might have, such as response time or data conversion speed. This could also be used to gather operator input on the user interface. If the prototypes show that a requirement cannot be met, the requirement must be modified or the design revised.
  3. Independence Analysis: To perform this analysis, map the safety-critical functions to the software components, and then map the software components to the hardware hosts and FCRs. All the input and output of each safety-critical component should be inspected.  Consider global or shared variables, as well as the parameters directly passed.  Consider “side effects” that may be included when a component is run. The goal is to verify there is separation between safety-critical and non-safety critical functions.
  4. Design Logic Analysis: Logic analysis examines the safety-critical areas of a software component by analyzing each function performed by that component.  If it responds to or has the potential to violate one of the safety requirements, it should be considered critical and undergo logic analysis. Design Logic Analysis (DLA) evaluates the equations, algorithms, and control logic in the software design of these safety critical components. A technique for performing design logic analysis is to compare design descriptions and logic flows and then note the discrepancies. This is the most rigorous type of analysis and may be performed using Formal Methods. Formal Methods are the use of mathematical modelling for the specification, development, and verification of systems in both software and electronic hardware. The formal methods are used to ensure these systems are developed without error. Less formal DLA involves reviewing a relatively small quantity of critical software products (e.g., PDL, prototype code) and manually tracing the logic. Safety-critical logic can include failure detection and diagnosis, redundancy management, variable alarm limits, and command inhibit logical preconditions.
  5. Design Data Analysis: Data analysis ensures that the structure and intended use of data will not violate a safety requirement by comparing the description to the use of each data item in the design logic. The Design Data Analysis evaluates the description and intended use of each data item in the software design. Interrupts and their effect on data must receive special attention in safety-critical areas. Analysis should verify that interrupts and interrupt handling routines do not alter critical data items used by other routines. The integrity of each data item should be evaluated with respect to its environment and host. Shared memory and dynamic memory allocation can affect data integrity. Data items should also be protected from being overwritten by unauthorized applications.
  6. Design Interface Analysis: The Design Interface Analysis verifies the proper design of a software component's interfaces with other components of the software, system, or even hardware. This analysis will verify that the software component's interfaces, especially the control and data linkages, have been properly designed. Interface requirements specifications (which may be part of the requirements or design documents, or a separate document) are the sources against which the interfaces are evaluated. Interface characteristics to be addressed should include inter-process communication methods, data encoding, error checking (e.g., data entry validity, value/range, type checks), and synchronization.

    The analysis should consider the validity and effectiveness of checksums, cyclic redundancy checks (CRCs), and error correcting code. CRC is a type of error-detecting code used in digital networks and storage devices to detect unintentional changes to raw data. Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents. When the data is retrieved, the calculation is repeated and if the check values do not match, the data is corrupt and corrective action can be taken.

    The sophistication of error checking or correction that is implemented should be appropriate for the predicted bit error rate of the interface.  An overall system error rate should be defined and budgeted to each interface.
  7. Design Traceability Analysis: This analysis ensures that each safety critical software requirement is included in the design. Tracing the safety requirements throughout the design (and eventually into the source code and test cases) is vital to making sure that no requirements are lost, that safety is “designed in”, that extra care is taken during the coding phase, and that all safety requirements are tested. A safety requirement traceability matrix is one way to implement this analysis.   
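
The CRC check described under Design Interface Analysis can be sketched as follows. This is a minimal illustration, assuming Python and the CRC-32 provided by its standard `zlib` module; the helper names `attach_crc` and `verify_crc` and the 4-byte big-endian trailer are hypothetical choices made for this example, not a format prescribed by this Handbook.

```python
import zlib

def attach_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 check value to the payload (hypothetical framing)."""
    check = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + check.to_bytes(4, "big")

def verify_crc(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored value."""
    if len(frame) < 4:
        return False  # too short to contain a check value
    payload, stored = frame[:-4], int.from_bytes(frame[-4:], "big")
    return (zlib.crc32(payload) & 0xFFFFFFFF) == stored

# Example data is illustrative only.
frame = attach_crc(b"valve=closed")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip a single bit
```

A receiver would call `verify_crc` on each frame and take corrective action (for example, request retransmission) when it returns `False`. Whether a CRC of this strength is sufficient depends on the predicted bit error rate of the interface.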

3.4 Problem/Issue Tracking System

Per SWE-088 – Task 2, all analysis non-conformances, findings, defects, issues, concerns, and observations are documented in a problem/issue tracking system and tracked to closure. These items are communicated to the software development personnel and possible solutions discussed. The level of risk associated with the finding/issue should be reflected in the priority given in the tracking system. The analysis performed by Software Assurance and Software Safety may be reported in one combined report, if desired.

2. Software Design Analysis Guidance

In software design, software requirements are transformed into the software architecture and then into a detailed software design for each software component.  The software design also includes databases and system interfaces (e.g., hardware, operator/user, software components, and subsystems).  The design addresses software architectural design and software detailed design.  The objective of doing design analysis is to ensure that: 

  • the design is a correct, accurate, and complete transformation of the software requirements that will meet the operational needs under nominal and off-nominal conditions,
  • the design introduces no unintended features, and
  • design choices do not result in unacceptable operational risk.

The design should also be created with modifiability and maintainability in mind so that future changes can be made quickly without significant redesign.

There are several design techniques described below that can help with the analysis of the design. Each of these may be used by Software Assurance and Software Safety personnel to help ensure a more robust design.

Tab 3 contains a more extensive list of analysis techniques that can be used by the Software Safety personnel.

Software Assurance and Software Safety tasks in NASA-STD-8739.8 that relate to design analysis are found in SWE-052, SWE-058, SWE-060, SWE-087, SWE-134, and SWE-157.

2.1 Use of Checklists

Consider the checklist below, from SADESIGN, when evaluating the software design. Another checklist that can be used for safety-critical software is found in this Handbook, under the Programming Checklists Topic: 6.1 - Design for Safety Checklist.

4. Analysis Reporting Content

Documenting and Reporting of Analysis Results.

When the design is analyzed, the Software Design Analysis work product is generated to document results capturing the findings and corrective actions that need to be addressed to improve the overall design. It should include a detailed report of the design analysis results. Analysis results should also be reported in a high-level summary and conveyed as part of weekly or monthly SA Status Reports. The high-level summary should provide an overall evaluation of the analysis, any issues/concerns, and any associated risks. If a time-critical issue is uncovered, it should be reported to management immediately so that the affected organization may begin addressing it at once.

When a project has safety-critical software, analysis results should be shared with the Software Safety personnel. The results of analysis conducted by Software Assurance personnel and those done by Software Safety personnel may be combined into one analysis report, if desired.

4.1 High-Level Analysis Content for SA Status Report

New or updated design analysis results since the last SA Status Report or project management meeting should be reported to project management and the rest of the Software Assurance team. When a project has safety-critical software, any analysis done by Software Assurance should be shared with the Software Safety personnel.

When reporting the results of an analysis in an SA Status Report, the following defines the minimum recommended contents:

  • Identification of what was analyzed: Mission/Project/Application
  • Period/Timeframe/Phase analysis performed during
  • Summary of analysis techniques used
  • Overall assessment of design, based on analysis
  • Major findings and associated risk
  • Current status of findings: open/closed; projection for closure timeframe

4.2 Detailed Content for Analysis Product:

The Software Design Analysis product captures and documents all the detailed results of the analysis and descriptions of the techniques/methods used. The analysis techniques/methods that produced the most useful results should be highlighted for future use. The Software Design Analysis product is placed under configuration management and delivered to the project management team as the Software Assurance record for the activity. When a project has safety-critical software, this product should be shared with the Software Safety personnel.

When reporting the detailed results of the software design analysis, the following defines the minimum recommended content

SADESIGN Checklist:

    1. Has the software design been developed at a low enough level for coding?
    2. Is the design complete and does it cover all the approved requirements?
    3. Have complex algorithms been correctly derived, do they provide the needed behavior under off-nominal conditions and assumed conditions, and is the derivation approach known and understood to support future maintenance?
    4. Does the design avoid introducing any undesirable behaviors or any capabilities not in the requirements?
    5. Have all requirements sources been considered when developing the design (for example, think about interface control requirements, databases, etc.)?
    6. Have the interfaces with COTS, MOTS, GOTS, and Open Source been designed?
    7. Have all internal and external software interfaces been designed for all (in-scope) interfaces with hardware, user, operator, software, and other systems and are they detailed enough to enable the development of software components that implement the interfaces?
    8. Are all safety features in the design (mitigations, controls, barriers, must-work requirements, must-not-work requirements)?
    9. Does the design provide the dependability and fault tolerance required by the system, and is the design capable of controlling identified hazards?  Does the design create any hazardous conditions?
    10. Does the design adequately address the identified security requirements and security risks for the system, including the integration with external components as well as information and data utilized, stored, and transmitted through the system?
    11. Does the design prevent, control, or mitigate any identified security threats and vulnerabilities? Are any unmitigated threats and vulnerabilities documented and addressed as part of the system and software operations?
    12. Have operational scenarios been considered in the design (for example, use of multiple individual programs to obtain one particular result may not be operationally efficient or reasonable; transfers of data from one program to another should be electronic, etc.)?
    13. Have users/operators been consulted during design to identify any potential operational issues?
    14. Maintainability: Has maintainability been considered? Is the design modular? Can additions and changes be made quickly?
    15. Is the design easy to understand?
    16. Is the design unnecessarily complicated?
    17. Is the design adequately documented for usability and maintainability?
    18. Has system performance been considered during design?
    19. Has the level of coupling (interactivity between modules) been kept to a minimum?
    20. Has software planned for reuse and OTS software in the system been examined to see that it meets the requirements and performs appropriately within the required limits for this system?
    21. Does this software introduce any undesirable capabilities or behaviors?
    22. Has the software design been peer reviewed?

2.2 Use of peer reviews or inspections

Design items designated in the software development plans should be peer reviewed or inspected. Some of the items to look for during these meetings are:

    1. Assess the design against the hardware and identify any gaps.
    2. Confirm that the detailed design is consistent with the architecture design and describes the units at a low enough level for coding.
    3. Confirm the design does not contain undesirable functionality.
    4. Confirm the requirements in SWE-134 have been taken into account for safety-critical software.
    5. Confirm the design addresses any possible unauthorized access.

2.3 Review of Traceability

Review the traces from requirements to design and design to requirements and ensure they are complete. As the project moves into implementation, the bi-directional trace matrices between design and code should also be checked.
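
A completeness check of the bi-directional trace can be sketched as below. This is only an illustration, assuming the trace matrix can be exported as a mapping from requirement IDs to the design elements that implement them; the function name `trace_gaps` and all IDs shown are hypothetical.

```python
def trace_gaps(req_to_design, design_ids):
    """Return (requirements with no design element, design elements with no requirement)."""
    # Forward direction: every requirement should trace to at least one design element.
    untraced_reqs = sorted(r for r, elems in req_to_design.items() if not elems)
    # Backward direction: every design element should trace back to some requirement.
    traced_design = set().union(*req_to_design.values())
    orphan_design = sorted(set(design_ids) - traced_design)
    return untraced_reqs, orphan_design

# Hypothetical trace matrix and design element list.
matrix = {
    "SRS-001": {"DES-A"},
    "SRS-002": set(),
    "SRS-003": {"DES-A", "DES-B"},
}
design = {"DES-A", "DES-B", "DES-C"}
missing, orphans = trace_gaps(matrix, design)
```

Here `missing` lists requirements not yet covered by the design, and `orphans` lists design elements that may represent unrequested functionality; both would be candidates for findings.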

2.4 Analysis by Software Architecture Review Board (SARB) - applies to NASA projects only

The SARB is a NASA-wide board that engages with flight projects in the formative stages of software architecture. The objectives of the SARB are to manage and/or reduce flight software complexity through better software architecture, help improve mission software reliability, and save costs. NASA projects that meet certain criteria (for example, large projects, ones with safety-critical concerns, projects destined for considerable reuse, etc.) may request the SARB to do a review and assessment of their architecture.

2.5 Reporting of Results

Any design analysis done in the interim between status reports or prior to milestone reviews should be reported on to management and the rest of the team. When a project has safety-critical software, any analysis done by Software Assurance should be shared with the Software Safety personnel. The results reporting should include:

  • Identification of what was analyzed: Mission/Project/Application
  • Person or group doing analysis
  • Period/Timeframe/Phase analysis performed during
  • Documents used in analysis (e.g., requirements version, etc.)
  • Description or identification of analysis techniques used
  • Overall assessment of design, based on analysis
  • Major findings and associated risk
  • Current status of findings: open/closed; projection for closure timeframe

2.6 Problem/Issue Tracking System

Findings, issues, and concerns from all the different software and safety design analyses performed should be documented in a problem/issue tracking system and tracked to closure. These items should be communicated to the software development personnel and possible solutions discussed.  The analysis done by Software Assurance and Software Safety can be reported in one combined report if desired.

3. Safety Design Analysis

3.1 Review Software Design Analysis

There are many considerations for analyzing the design with respect to safety. Most of the design analysis that is used for non-safety projects is still applicable for safety-critical software. So, to begin with, the Software Safety personnel should either review or ensure that the Software Assurance personnel have reviewed the set of items listed in Tab 2 - Software Design Analysis Guidance. The first of these is the SADESIGN checklist (previously in Topic 7.18). Another checklist that can be used for safety-critical software is found in this Handbook, under the Programming Checklists Topic: 6.1 - Design for Safety Checklist.

3.2 Design peer reviews or design walkthroughs

Design peer reviews or design walkthroughs are recommended for safety-critical components to identify design problems or other issues. One of the most important aspects of a software design for safety-critical software is to design for minimum risk. “Minimum risk” includes the hazard risk, the risk of software defects, the risk of human operator errors, and other types of risk such as programmatic, cost, and schedule risk. When possible, eliminate identified hazards and risks or reduce the associated risk through design. Some of the ways risk can be reduced through design are listed below. This list can be used by attendees of design peer reviews or walkthroughs to help evaluate the design with respect to safety and risk considerations.

   Safety Considerations during Design Peer Reviews/Walk-throughs:

    • Reduce the complexity of the software and interfaces.
    • Design for user-safety instead of user-friendly.

    • Design for testability during development and integration.

    • Give more design “resources” (such as time, effort) to the higher risk aspects such as hazard controls.

    • Include separation of commands, functions, files, and ports.

    • Include design for Shutdown/Recovery/Safing.

    • Plan for monitoring and detection.

    • Isolate the components containing safety-critical requirements as much as possible.

    • Interfaces between safety-critical components should be designed for minimum interaction.

    • Document the positions and functions of safety critical components in the design hierarchy.

    • Document how each safety-critical component can be traced back to the original safety requirements and how the requirements are implemented.

    • Specify safety-related design and implementation constraints.

    • Document execution control, interrupt characteristics, initialization, synchronization, and control of the components. For high risk systems, interrupts should be avoided since they may interfere with software safety controls. Any interrupts used should be priority-based.

    • Specify any error detection or recovery schemes for safety-critical components.

    • Consider hazardous operations scenarios.

    • The design of safing and recovery actions should fully consider the real-world conditions and the corresponding time to criticality. Automatic safing is often required if the time to criticality is shorter than the realistic human operator response time, or if there is no human in the loop. This can be performed by either hardware or software or a combination depending on the best system design to achieve safing.

    • Select a strategy for handling faults and failures. Some of the techniques that can be used in fault management are below:

      • To prevent fault propagation (cascading of a software error from one component to another), safety-critical components must be fully independent of non-safety-critical components and must be able to detect an error and not pass it along.
      • Shadowing: A higher level process emulates lower level processes to predict expected performance and decides if failures have occurred in the lower processes. The higher level process implements appropriate redundancy switching when it detects a discrepancy.
      • Built-in Test: Fault/Failure Detection, Isolation and Recovery (FDIR) can be based on self-test (BIT) of lower tier processors where the lower level units test themselves and report their status to the higher processor. The higher processor switches out units reporting a failed or bad status.
      • Majority voting: Some redundancy schemes are based on majority voting. This technique is especially useful when the criteria for diagnosing failures are complicated (e.g., when an unsafe condition is defined by exceeding an analog value rather than simply a binary value). An odd number of parallel units is required to achieve majority voting.
      • Fault Containment Regions: Establish a Fault Containment Region (FCR) to prevent fault propagation, such as from non-critical software to safety-critical components, from one redundant software unit to another, or from one safety-critical component to another. Techniques such as firewalling or “come from” checks should be used to provide sufficient isolation of FCRs to prevent hazardous fault propagation. FCRs can be best partitioned or firewalled by hardware. A typical method of obtaining independence between FCRs is to host them on different and independent hardware processors.
      • Redundant architecture: In redundant architecture, there are two versions of the operational code which do not need to operate identically. The primary version is a high-performance version with all required functionality and performance requirements. If problems occur with this version, the other version (called a safety kernel) will be given control. This version may have the same functionality, or it may have a more limited scope.
      • Recovery blocks: These use multiple software versions to find and recover from faults. Outputs from a block will be checked against an acceptance test. If it fails, then another version computes the output and the process continues. Each successive version is more reliable but less efficient. If the last block fails, the program must determine some way to fail safe.
      • Self-checks: This is a type of dynamic fault detection. Self-checks can include replication (copies must be identical if the data is to be considered correct), reasonableness (is the data reasonable, based on other data in the system), and structural (are components manipulating complex data correctly).
    • Consider any potential issues with the use of COTS, Open Source, reused or inherited code.
    • Select sampling rates with consideration for noise levels and expected variations of control system and physical parameters.
    • Identify test and/or verification methods for each safety-critical design feature.
    • Design for testability. Include ways that the internals of a component can be adequately tested to verify that they are working properly.
    • Consider maintainability in the design (For example: anticipate potential changes in the software, use a modular design, object-oriented design, uniform conventions, and naming conventions, use coding standards that support safety practices, use documentation standards, common tool sets)
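
The majority voting technique listed among the fault management strategies above can be sketched as follows. This is a minimal illustration, assuming each redundant unit reports a single comparable value (for example, its own "limit exceeded" determination); the function name `majority_vote` is hypothetical, and a real voter would also need to address timing, stale data, and unit health.

```python
from collections import Counter

def majority_vote(readings):
    """Return the value reported by a strict majority of units, or None if there is none."""
    if len(readings) % 2 == 0:
        # An even (or empty) set of units can tie, so it is rejected here.
        raise ValueError("majority voting needs an odd number of units")
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) // 2 else None

# Three hypothetical units each report whether the unsafe limit was exceeded.
decision = majority_vote([True, True, False])
```

When no value has a majority, the sketch returns `None` so that the caller can fall back to a safing action rather than act on an ambiguous result.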

A few more safety-specific design considerations are below:

  • Are the design and its safety features appropriately flowed from the requirements and the evolving hazard analyses?
  • Has the design been reviewed to ensure that the software design’s correct implementation of safety controls or processes does not compromise other system safety features or the functionality of the software?
  • Have additional system hazards, causes, or contributions discovered during the software design analysis been documented in the required system safety documentation (e.g., Safety Data Package and/or Hazard Reports)?
  • Have Safety reviews approved the controls, mitigations, inhibits, and safety design features to be incorporated into the design?
  • Are any needed or identified safety conditions, constraints, parameters, trigger points, boundary conditions, environments, and other software circumstances for safe operation in the appropriate modes and states all flowed from the software requirements and incorporated into the design?
  • Does the design maintain the system in a safe state during all modes of operation or can it transition to a safe state when and if necessary?
  • Are the partitioning or isolation methods used in the design to logically isolate the safety-critical design elements from the non-safety-critical ones effective?  This is particularly important with the incorporation of COTS or integration of legacy, heritage, and reuse software.  Any software that can write or provide data to safety-critical software will also be considered safety critical unless there is isolation built in, in which case the isolation design is considered safety critical.
  • Is appropriate fault and/or failure tolerance incorporated into the software design as designated?
  • If heritage code is being used, is there a clear understanding of the design and constraints associated with any fault management in the heritage code? Are they appropriate for the current system being developed?

3.3 Other types of design analysis can be done to analyze particular aspects of the design.

All of these design analyses would be useful to perform, but they require more time and effort so the safety team should choose those they feel would provide the most value, depending on the areas where risk is highest in the design. Some of the other available design analysis methods are below:

a. Acceptable Level of Safety: Once the design is fairly mature, a design safety analysis can be done to determine whether an acceptable level of safety will be attained by the designed system. This analysis involves analyzing the design of the safety components to ensure that all the safety requirements are specified correctly. The requirements may need to be updated once the design has determined exactly what safety features will be included in the system. Then review the design looking for the places and conditions that lead to unacceptable hazards. Consider the credible faults or failures that could occur and evaluate their effects on the designed system. Does the designed system produce the desired result with respect to the hazards?

b. Prototyping or simulating: Prototyping or simulating parts of the design may show where the software can fail. In addition, this can demonstrate whether the software can meet the constraints it might have, such as response time or data conversion speed. Prototypes can also be used to obtain operator input on the user interface. If the prototypes show that a requirement cannot be met, the requirement must be modified as appropriate or the design may need to be revised.

c.  Independence Analysis: To perform this analysis, map the safety-critical functions to the software components, and then map the software components to the hardware hosts and FCRs. All the input and output of each safety-critical component should be inspected.  Consider global or shared variables, as well as the directly passed parameters.  Consider “side effects” that may be included when a component is run. 

d. Design Logic Analysis: The Design Logic Analysis (DLA) evaluates the equations, algorithms, and control logic of the software design. Logic analysis examines the safety-critical areas of a software component.  A technique for identifying safety-critical areas is to examine each function performed by the software component.  If it responds to or has the potential to violate one of the safety requirements, it should be considered critical and undergo logic analysis.  A technique for performing logic analysis is to compare design descriptions and logic flows and note discrepancies. This most rigorous type of analysis can also be done using Formal Methods. Less formal DLA involves a human inspector reviewing a relatively small quantity of critical software products (e.g., PDL, prototype code) and manually tracing the logic. Safety-critical logic to be inspected can include failure detection and diagnosis, redundancy management, variable alarm limits, and command inhibit logical preconditions.

e. Design Data Analysis: The Design Data Analysis evaluates the description and intended use of each data item in the software design. Data analysis ensures that the structure and intended use of data will not violate a safety requirement.  A technique used in performing design data analysis is to compare the description to the use of each data item in the design logic.                       
Interrupts and their effect on data must receive special attention in safety-critical areas.  Analysis should verify that interrupts and interrupt handling routines do not alter critical data items used by other routines.
The integrity of each data item should be evaluated with respect to its environment and host.  Shared memory and dynamic memory allocation can affect data integrity.  Data items should also be protected from being overwritten by unauthorized applications.

f. Design Interface Analysis: The Design Interface Analysis verifies the proper design of a software component's interfaces with other components of the system. The interfaces can be with other software components, with hardware, or with human operators.  This analysis will verify that the software component's interfaces, especially the control and data linkages, have been properly designed.  Interface requirements specifications (which may be part of the requirements or design documents, or a separate document) are the sources against which the interfaces are evaluated.

Interface characteristics to be addressed should include inter-process communication methods, data encoding, error checking and synchronization.

The analysis should consider the validity and effectiveness of checksums, CRCs, and error correcting code.  The sophistication of error checking or correction that is implemented should be appropriate for the predicted bit error rate of the interface.  An overall system error rate should be defined and budgeted to each interface.

g. Design Traceability Analysis: This analysis ensures that each safety-critical software requirement is included in the design. Tracing the safety requirements throughout the design (and eventually into the source code and test cases) is vital to making sure that no requirements are lost, that safety is “designed in”, that extra care is taken during the coding phase, and that all safety requirements are tested. A safety requirement traceability matrix is one way to implement this analysis.   
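
The mapping step of the Independence Analysis described above (functions to components, components to FCRs) lends itself to a simple automated check. The sketch below is one possible way to flag FCRs that host both safety-critical and non-safety-critical components; the function name `mixed_criticality_fcrs` and all function, component, and FCR names are hypothetical, and the check does not replace inspection of inputs, outputs, shared variables, and side effects.

```python
def mixed_criticality_fcrs(function_to_components, critical_functions, component_to_fcr):
    """Map each FCR that mixes criticality levels to its non-safety-critical components."""
    # A component is treated as safety critical if it implements any critical function.
    critical_components = set()
    for func in critical_functions:
        critical_components |= function_to_components.get(func, set())
    # Group components by the FCR that hosts them.
    fcr_members = {}
    for comp, fcr in component_to_fcr.items():
        fcr_members.setdefault(fcr, set()).add(comp)
    # Report FCRs containing both critical and non-critical components.
    return {
        fcr: sorted(members - critical_components)
        for fcr, members in fcr_members.items()
        if members & critical_components and members - critical_components
    }

# Hypothetical mappings for illustration.
functions = {"abort_sequence": {"cmd_handler", "valve_ctrl"}, "telemetry_display": {"ui"}}
hosts = {"cmd_handler": "FCR-1", "valve_ctrl": "FCR-1", "ui": "FCR-2", "logger": "FCR-1"}
shared = mixed_criticality_fcrs(functions, {"abort_sequence"}, hosts)
```

In this example `logger` shares FCR-1 with the safety-critical components, so it would either need to be isolated or be treated as safety critical itself.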

3.4 Documenting and Reporting of Results of the Design Analysis:

Any design analysis done in the interim between status reports or prior to milestone reviews should be reported on to management and the rest of the team. When a project has safety-critical software, any analysis done by Software Assurance should be shared with the Software Safety personnel. The results reporting should include:

  • Identification of what was analyzed: Mission/Project/Application
  • Person(s) or group performing the analysis
  • Period/Timeframe/Phase analysis performed during
  • Documents used in analysis (e.g., versions of the system and software requirements, interfaces document, architectural and detailed design)
  • Description or identification of analysis techniques used
  • Overall assessment of design, based on analysis
  • Major findings and associated risk
  • Current status of findings: open/closed; projection for closure timeframe

3.5 Problem/Issue Tracking System

Findings, issues, and concerns from all the different safety design analyses performed should be documented in a problem/issue tracking system.

  • These items should be communicated to the software development personnel and possible solutions discussed.
  • A high-level summary along with an overall assessment of the design should be communicated to the project management.
  • All items should be addressed and tracked to closure.

The results of the analysis done by Software Assurance personnel and that done by Software Safety personnel can be reported in one combined report if desired.

The detailed reporting should also include:

  • An evaluation of the analysis techniques used
  • Overall assessment of design, based on analysis results
  • Major findings and associated risk: the type of analysis where the finding, issue, or concern was discovered, the problem found, and an assessment of the amount of risk involved with the finding
  • Minor findings
  • Current status of findings: open/closed; projection for closure timeframe
    • Include counts for those discovered by SA and Software Safety
    • Include overall counts from the Project’s problem/issue tracking system.

5. Resources

5.1 References

refstable-topic
Include Page
SITE:MC SADESIGN
SITE:MC SADESIGN

5.2 Tools

Include Page
Tools Table Statement
Tools Table Statement