- 1. The Requirement
- 2. Rationale
- 3. Guidance
- 4. Small Projects
- 5. Resources
- 6. Lessons Learned
- 7. Software Assurance
1. Requirements
5.2.1 The project manager shall record, analyze, plan, track, control, and communicate all of the software risks and mitigation plans.
1.1 Notes
Project managers should be aware of any risks that remain after mitigations have been completed or after a risk has been accepted.
1.2 History
1.3 Applicability Across Classes
Class A B C D E F Applicable?
Key: - Applicable | - Not Applicable
2. Rationale
The purpose of risk management is to identify potential problems before they occur so that risk-handling activities can be planned and invoked as needed across the life of the product or project. Risk handling activities are intended to mitigate adverse impacts on achieving the project's objectives.
1.1 Background
"1.1.1 Generically, risk management is a set of activities aimed at understanding, communicating, and managing risk to the achievement of objectives. Risk management operates continuously in an activity, proactively risk-informing the selection of decision alternatives and then managing the risks associated with the implementation of the selected alternative." 009
Identification and management of risks provide a basis for systematically examining changing situations over time to uncover and correct circumstances that impact the ability of the project to meet its objectives.
3. Guidance
During the past ten years, the importance and complexity of software have grown enormously. With this change has come an increasing awareness of the substantial risks inherent in software development and of the ineffectiveness of the usual methods of dealing with risk. Software development organizations need to manage a list of software-related risks throughout the software development life cycle, even if the project office does not recognize or accept the software risks at the project level. The requirement is for the software organization to recognize that all software development carries some level of risk. Each discipline of a project development team is to maintain a list of potential risk items for its development activities. The software risk process follows the NPR 8000.4 process to the extent possible. The most important thing is that software organizations maintain and address risks throughout the software development process.
Software risks also factor in the software cost estimation process (see SWE-015 - Cost Estimation).
This diagram from NASA/SP-2007-6105, NASA Systems Engineering Handbook, 273 provides an overview of a risk management process:
Guidance for each of the risk management steps is provided below. In addition to the guidance found in this Handbook, NASA users should consult Center Process Asset Libraries (PALs) for Center-specific guidance and resources related to continuous risk management.
SPAN - Software Processes Across NASA
SPAN contains links to Center managed Process Asset Libraries. Consult these Process Asset Libraries (PALs) for Center-specific guidance including processes, forms, checklists, training, and templates related to Software Development. See SPAN in the Software Engineering Community of NEN. Available to NASA only. https://nen.nasa.gov/web/software/wiki 197
Risk management activities begin during the project concept phase and continue through project retirement. Larger projects may have project leads with responsibility for risk management, but all project team members need to look for and bring risks to management’s attention.
3.1 Identify Software Risks
When identifying software risks, consider the following insights and suggestions:
- "Identify risks before they become problems. Communication is at the center of the Risk Management paradigm (see NPR 8000.4, Agency Risk Management Procedures and Guidelines). Brainstorming is often used to identify project risks. People from varying backgrounds and points of view see different risks. A diverse team, skilled in communication, will usually find better solutions to the problems." 276
- Use a checklist to avoid "missing" risks that have been identified in previous projects. Topic 7.19 - Software Risk Management Checklists includes several software risk checklists. See also Topic 8.24 - Software Assurance Risk
- Add new risks to existing risk checklists for future projects.
- Review lessons learned from past projects.
- Include software assurance and/or software safety personnel on change control boards (CCBs) as roles specifically assigned to identify risks.
- Use a proactive and continual process to identify risks.
- Assess the project to identify risks whenever there is a significant change in project circumstances.
- Include cost, performance, and schedule risks as well as technical or skill risks.
- Risks may be identified through the use of IV&V Surveillance actions. See also Topic 8.06 - IV&V Surveillance.
See also SWE-151 - Cost Estimate Conditions.
3.2 Capture Identified Risks
Once the team identifies the initial risk set, each risk is recorded with the specific information needed to manage it.
When recording software risks, capture the following information:
- Risk identifier - A unique identification number assigned to each risk item; used to track a risk item from identification until project end
- Priority - A priority assigned to each risk item; used to establish when risks need to have actions taken, and how long risks may need to be watched or tracked before they are no longer a concern or can be closed; generally derived from results of analysis activities
- Probability - Likelihood the situation or circumstance will occur; generally part of the analysis activities
- Impact - Magnitude of the impact given that the event occurs; generally part of the analysis activities
- Exposure grade - Numerical value expressed as the product of the impact and probability; used for tracking and reporting purposes
- Time frame - Time frame of occurrence; gives the risk item a time context; determined by when the risk item will occur, relative to the risk reporting period; used in conjunction with the Exposure grade to determine Priority
  - Immediate - within 30 days
  - Near-Term - > 30 days and ≤ 90 days
  - Far-Term - > 90 days
- Risk statement - Statement of risk; the objective is to arrive at a concise description of the risk, which can be understood and acted upon; components and description of a statement of risk are:
  - Condition: a single phrase or sentence that briefly describes the key circumstances, situations, etc., causing concern, doubt, anxiety, or uncertainty
  - Consequence: a single phrase or sentence that describes the key, possible negative outcome(s) of the current condition.
Note: The minimum statement of risk is the condition. It is desirable to capture the originator’s assessment of the possible consequences of the risk to assure that it is given suitable weight during analysis; however, the explicit statement of consequence is not required, is often omitted, and can be subsequently added at the planning step.
- Assigned To - Individual or group responsible for tracking and reporting on the risk until closure
- Mitigation - The mitigation eliminates or reduces the risk by:
  - Reducing the impact (by some degree or to zero)
  - Reducing the probability (to a lower probability or zero)
  - Shifting the timeframe (i.e., when action must be taken)
  Note: Recognize that mitigation may also introduce new risks to the software project.
- Source - Assists in tracking identification
- Date opened - A beginning date that is a permanent date indicating when risk is first identified
- Planned date closed - A planned closure date indicating when the risk is expected to be closed, based upon the software project lead’s experience; used to plan resources and track progress
- Date updated - Each risk shall be reported during regular status meetings; this date reflects the last time an action was taken against a risk
- Date closed - A closure date indicating when the software project lead deemed a risk satisfactorily closed
- Closure rationale - Reason for closure; documents the rationale upon which the decision to close the risk was based.
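The fields above can be sketched as a simple record type. This is only an illustrative layout, not a prescribed NASA schema: the class name `RiskRecord`, the helper `classify_time_frame`, the 1 to 5 integer scales, and the `days_until_occurrence` field are assumptions made for the example, as is the reading of Near-Term as ending at 90 days. Priority is left out because it is derived from analysis results.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


def classify_time_frame(days_until_occurrence: int) -> str:
    """Map days until expected occurrence to the time-frame bands listed above."""
    if days_until_occurrence <= 30:
        return "Immediate"
    if days_until_occurrence <= 90:   # assumed upper bound for Near-Term
        return "Near-Term"
    return "Far-Term"


@dataclass
class RiskRecord:
    # Hypothetical field names; one attribute per recorded item.
    risk_id: str
    condition: str                     # minimum statement of risk
    probability: int                   # assumed scale: 1 (low) to 5 (inevitable)
    impact: int                        # assumed scale: 1 (minimal) to 5 (unacceptable)
    days_until_occurrence: int
    assigned_to: str
    source: str
    date_opened: date
    consequence: Optional[str] = None  # may be added later, at the planning step
    mitigation: Optional[str] = None
    planned_date_closed: Optional[date] = None
    date_updated: Optional[date] = None
    date_closed: Optional[date] = None
    closure_rationale: Optional[str] = None

    @property
    def exposure(self) -> int:
        """Exposure grade: product of impact and probability."""
        return self.impact * self.probability

    @property
    def time_frame(self) -> str:
        return classify_time_frame(self.days_until_occurrence)


risk = RiskRecord(
    risk_id="SW-001",
    condition="Flight software team has no prior experience with the target RTOS",
    probability=3,
    impact=4,
    days_until_occurrence=45,
    assigned_to="Software Lead",
    source="Brainstorming session",
    date_opened=date(2024, 1, 15),
)
print(risk.exposure, risk.time_frame)
```

Keeping exposure and time frame as derived properties, rather than stored fields, avoids the two values drifting out of date when probability or impact is revised.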
3.3 Analyze Software Risks
Once the team identifies and records the initial set of risks, an analysis needs to be performed to determine the likelihood (probability) and the severity of the consequences of each risk.
When performing this analysis, many risk management guidebooks suggest including the following:
- Scenarios in which the risk could occur.
- Likelihood of occurrence.
- Consequences.
Keep in mind that "a rare but severe risk contributor may warrant a response different from that warranted by a frequent, less severe contributor, even though both have the same expected consequences." 273
One analysis method is a probabilistic risk assessment (PRA). Per NASA/SP-2007-6105, NASA Systems Engineering Handbook, "PRA is a scenario-based risk assessment technique that quantifies the likelihoods of various possible undesired scenarios and their consequences, as well as the uncertainties in the likelihoods and consequences... For additional information on probabilistic risk assessments, refer to (NPR 8705.5A, Technical Probabilistic Risk Assessment (PRA) Procedures Guide for Safety and Mission Success for NASA Programs and Projects)." (Editor's Note: NPR 8705.3 has been updated to NPR 8705.5A in this quotation.) 346
Another recommendation is to model the scenarios and use those models to assess the consequences and determine the likelihood of a risk occurring. See also Topic 7.15 - Relationship Between NPR 7150.2 and NASA-STD-7009.
One possible set of likelihood (or probability) classifications:
- Inevitable occurrence (5).
  - Cannot prevent this event; no alternate approaches or processes are available.
- Very high probability of occurrence (4).
  - Cannot prevent this event with the current approach, but a different approach or process might prevent it.
- High probability of occurrence (3).
  - May prevent this event, but additional actions will be required.
- Medium probability of occurrence (2).
  - The current process is usually sufficient to prevent this type of event.
- Low probability of occurrence (1).
  - The current process is sufficient to prevent this type of event.
One possible set of impact (or consequences) classifications:
(Note: The term technical includes everything that is not cost or schedule, e.g., safety, operations, programmatic.)
- Unacceptable (5).
  - Technical - Unacceptable, no alternatives exist.
  - Schedule - Can’t achieve key team or major project milestones.
  - Cost - Team budget increase > 15%.
- Major (4).
  - Technical - Major reduction, but workarounds are available.
  - Schedule - Key team milestone slip > 1 month, or project critical path impacted.
  - Cost - Team budget increase > 10%.
- Medium (3).
  - Technical - Moderate reduction, but workarounds available.
  - Schedule - Key team milestone slip <= 1 month.
  - Cost - Team budget increase > 5% or other teams impacted.
- Minor (2).
  - Technical - Moderate reduction, the same approach retained.
  - Schedule - Additional activities required, able to meet need dates.
  - Cost - Team budget increase < 5%.
- Minimal (1).
  - Technical - Minimum reduction, the same approach retained.
  - Schedule - Additional activities required, able to meet need dates.
  - Cost - Team budget increase < 2%.
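As a rough illustration, the likelihood and impact scales above can be encoded as enumerations, with the cost thresholds turned into a classification function. The names `Likelihood`, `Impact`, `cost_impact`, and `overall_impact` are hypothetical. Two choices here are assumptions rather than handbook rules: an increase of exactly 5%, 10%, or 15% falls into the lower band, and the overall impact of a risk is taken as the worst of its technical, schedule, and cost ratings.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4
    INEVITABLE = 5


class Impact(IntEnum):
    MINIMAL = 1
    MINOR = 2
    MEDIUM = 3
    MAJOR = 4
    UNACCEPTABLE = 5


def cost_impact(budget_increase_pct: float) -> Impact:
    """Classify a team budget increase (in percent) using the cost bands above."""
    if budget_increase_pct > 15:
        return Impact.UNACCEPTABLE
    if budget_increase_pct > 10:
        return Impact.MAJOR
    if budget_increase_pct > 5:
        return Impact.MEDIUM
    if budget_increase_pct < 2:
        return Impact.MINIMAL
    return Impact.MINOR   # 2% up to and including 5% (boundary handling assumed)


def overall_impact(technical: Impact, schedule: Impact, cost: Impact) -> Impact:
    """Assumed rule: a risk is rated by its worst dimension."""
    return max(technical, schedule, cost)


print(cost_impact(12).name)
print(overall_impact(Impact.MINOR, Impact.MAJOR, Impact.MEDIUM).name)
```

A project that weights cost, schedule, and technical impact differently would replace `overall_impact` with its own rule.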
The results of this step are used to rank the identified risks and the possible alternatives for those risks so that informed plans can be put into place to address those risks. NASA/SP-2007-6105, NASA Systems Engineering Handbook, 273 describes tools and techniques for analyzing and managing risks, including:
- Risk matrices - to facilitate discussions regarding "the status and effects of risk-handling efforts, and communicate risk status information."
- Failure mode effects analysis (FMEA) and failure modes, effects, and criticality analysis (FMECA) – "an ongoing procedure by which each potential failure in a system is analyzed to determine the results or effects thereof on the system, and to classify each potential failure mode according to its consequence severity."
- Fault tree analysis (FTA) - "identify potential failure modes for a product or process, assess the risk associated with those failure modes, rank the issues in terms of importance, and to identify and carry out corrective actions to address the most serious concerns."
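A risk matrix and a ranked risk list can be sketched in a few lines. The example below assumes each risk is a tuple of (identifier, likelihood, impact) on the 1 to 5 scales above. The red/yellow/green thresholds and the tie-breaking rule (higher impact first, so that a rare but severe risk ranks above a frequent but mild one with the same exposure) are assumptions for illustration only; projects define their own criteria.

```python
from collections import Counter


def color(likelihood: int, impact: int) -> str:
    """Assumed thresholds on exposure (likelihood x impact)."""
    exposure = likelihood * impact
    if exposure >= 15:
        return "red"
    if exposure >= 6:
        return "yellow"
    return "green"


def rank(risks):
    """Sort by exposure, highest first; ties are broken by higher impact."""
    return sorted(risks, key=lambda r: (r[1] * r[2], r[2]), reverse=True)


def render_matrix(risks) -> str:
    """Text matrix: rows are likelihood 5..1, columns are impact 1..5, cells are risk counts."""
    counts = Counter((likelihood, impact) for _, likelihood, impact in risks)
    lines = []
    for likelihood in range(5, 0, -1):
        cells = " ".join(f"{counts.get((likelihood, impact), 0):2d}" for impact in range(1, 6))
        lines.append(f"L{likelihood} |{cells}")
    return "\n".join(lines)


# Hypothetical risks: (identifier, likelihood, impact)
risks = [("R1", 1, 5), ("R2", 5, 1), ("R3", 4, 4), ("R4", 2, 2)]

print([risk_id for risk_id, _, _ in rank(risks)])
print(render_matrix(risks))
```

Note that R1 and R2 have the same exposure but would likely call for different responses, which is why a single exposure number should not be the only input to ranking.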
See also Topic 8.05 - SW Failure Modes and Effects Analysis and SWE-179 - IV&V Submitted Issues and Risks.
3.4 Plan To Address Software Risks
After the team identifies and analyzes the initial set of risks, a plan for managing those risks (and any risks identified later in the project life cycle) is needed. This plan may be standalone or be captured in the Software Development Plan/Software Management Plan (5.08 - SDP-SMP - Software Development - Management Plan) and updated throughout the project life cycle to reflect the current risk management status. It is also important to inform providers, typically via their contract, that their risk management plans will be reviewed periodically by the acquirer. See also Topics 7.03 - Acquisition Guidance and 7.04 - Flow Down of NPR Requirements on Contracts and to Other Centers in Multi-Center Projects.
Typical options for addressing risks include:
- Accepting all or part of a risk.
- Eliminating the risk.
- Mitigating the risk (reducing the likelihood, reducing the negative effects).
- Monitoring the risk.
- Conducting further research on the risk.
The risk management plan needs to include topics such as:
- Risk control and tracking steps describing what will be tracked.
- Risk control actions.
- Criteria for taking corrective actions.
- Continuous risk management activities that identify potential technical problems before they occur and mitigate the impact of those problems on the outcome of the project.
- Risk owner, a role responsible for responding to the risk.
The team needs to consider costs associated with managing, controlling, and mitigating risks when developing the risk management plan. This can be especially important for projects with limited or constrained budgets.
Project-level risk management plans need to describe coordination with program-level plans to ensure proper risk tracking and information sharing. Once the plan is created, it is reviewed and approved by an appropriate level of project management before it is implemented.
3.5 Track Software Risks
Risks that are not eliminated need to be tracked throughout the project life cycle to ensure their mitigation strategies remain effective. For low-risk items that are not formally included in the risk management plan, consider using a watch list so that they are not forgotten and to help ensure that they do not escalate to a higher level of risk later in the project.
Additionally, conditions that the team has identified as risk triggers are also monitored and tracked until those situations are no longer risk factors. Risk status also needs to be tracked and weighed against risk criteria to determine if corrective action needs to be taken.
If a risk management tool is in use for the project, risks need to be added to and tracked using this tool. A tracking tool could be a simple spreadsheet or database for a small project, a tool purchased specifically for tracking risks, or part of an integrated tool used to track multiple aspects of the project.
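For a project that tracks risks in a spreadsheet, the register can be kept as a CSV file. The sketch below uses only the Python standard library. The column names, the `status` values, and the exposure threshold used to build the watch list are assumptions for the example, not handbook requirements.

```python
import csv
import io

# Assumed column layout for a minimal risk register.
FIELDS = ["risk_id", "risk_statement", "probability", "impact", "status"]


def save_register(risks, stream) -> None:
    """Write risk rows (dicts keyed by FIELDS) as CSV."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(risks)


def load_register(stream):
    """Read risk rows back, restoring the numeric columns."""
    rows = []
    for row in csv.DictReader(stream):
        row["probability"] = int(row["probability"])
        row["impact"] = int(row["impact"])
        rows.append(row)
    return rows


def watch_list(risks, max_exposure: int = 4):
    """Open, low-exposure risks that are watched rather than actively mitigated."""
    return [
        r for r in risks
        if r["status"] == "open" and r["probability"] * r["impact"] <= max_exposure
    ]


register = [
    {"risk_id": "SW-001", "risk_statement": "Late delivery of interface spec",
     "probability": 4, "impact": 4, "status": "open"},
    {"risk_id": "SW-002", "risk_statement": "Compiler upgrade changes timing",
     "probability": 1, "impact": 3, "status": "open"},
    {"risk_id": "SW-003", "risk_statement": "Test lab unavailable",
     "probability": 2, "impact": 2, "status": "closed"},
]

buffer = io.StringIO()          # stands in for a file opened with newline=""
save_register(register, buffer)
buffer.seek(0)
loaded = load_register(buffer)
print([r["risk_id"] for r in watch_list(loaded)])
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.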
See also SWE-086 - Continuous Risk Management.
3.6 Control Software Risks
When a risk occurs, action needs to be taken. Those actions should have been included in the risk management plan and need to be implemented in this step. Their effectiveness also needs to be measured so adjustments to the plan can be made, if necessary.
3.7 Communicate Software Risk Information
Risk information is communicated to all relevant stakeholders throughout the project life cycle. Stakeholders include project managers, project technical personnel, test team members, and anyone else affected by or with the need to know about risks, their impact, and their mitigations. Project life cycle reviews are one mechanism for risk communication.
Information, such as the effectiveness of risk mitigations and action plans, needs to be communicated to project managers, technical authorities, and other roles that make risk decisions and risk-based decisions throughout the project life cycle.
3.8 Document Software Risks And Mitigation Plans
Recording software risks and mitigation plans is an activity that the team needs to do as part of all previous steps. Documentation could include:
- Analysis records and decisions based on that analysis.
- Records of risk acceptance (approval signatures and reasons for acceptance).
- Records of planned mitigations and control mechanisms.
- A list of identified risks.
- A list of planned controls.
- Risk acceptance rationale.
The software safety manager should assure that risks affecting software safety are captured, addressed, and managed as part of the program, project, and facility risk management processes, and those risks which could impose a system hazard are captured in the system hazard analyses.
See also Topic 8.10 - Facility Software Safety Considerations and 5.05 - Metrics - Software Metrics Report.
3.9 Roles And Responsibilities
The table below shows roles and responsibilities typical for continuous risk management:
Role | Responsibilities |
---|---|
Center safety and mission assurance (SMA) organizations | Provide risk management consultation, facilitation, and training to program/project organizations. 009 |
Software management | Review and approve risk management plan; ensure continuous risk management is implemented; designate the risk manager; ensure that key decisions are risk-informed; coordinate management of risks across affected projects or project elements. |
Software Risk Manager | Overall responsibility for software risk management; ensures risk management plan is developed. 009 |
Project software team members | Bring risks to management's attention; support Risk Manager in monitoring and controlling risks. |
A recommended practice is that the Software Lead Engineer maintains a list of software risks independent of the program’s risk list. Frequently, the program risks are larger than any given software risk item. The software risk data should be maintained in an organizational database.
3.10 Additional Guidance
Additional guidance related to this requirement may be found in the following materials in this Handbook:
3.11 Center Process Asset Libraries
See the following link(s) in SPAN for process assets from contributing Centers (NASA Only).
SPAN Links |
---|
4. Small Projects
Projects with limited budgets may consider using spreadsheets or small databases to track their project risks rather than purchase a tool for this purpose. Small projects could also consider using tools available at the Center level since those may have no associated purchase or lease costs.
5. Resources
5.1 References
- (SWEREF-001) Software Development Process Description Document, EI32-OI-001, Revision R, Flight and Ground Software Division, Marshall Space Flight Center (MSFC), 2010. This NASA-specific information and resource is available in Software Processes Across NASA (SPAN), accessible to NASA-users from the SPAN tab in this Handbook.
- (SWEREF-009) NPR 8000.4B, NASA Office of Safety and Mission Assurance, 2017. Effective Date: December 06, 2017 Expiration Date: December 06, 2022 See also the Risk Management Plan template.
- (SWEREF-041) NPR 7123.1D, Office of the Chief Engineer, Effective Date: July 05, 2023, Expiration Date: July 05, 2028
- (SWEREF-103) Software Risk Identification, 580-SP-013-03, Software Engineering Division, NASA Goddard Space Flight Center (GSFC), 2014. This NASA-specific information and resource is available in Software Processes Across NASA (SPAN), accessible to NASA-users from the SPAN tab in this Handbook.
- (SWEREF-104) Software Risk Monitoring and Control, 580-SP-014-03, Software Engineering Division, NASA Goddard Space Flight Center (GSFC), 2014. This NASA-specific information and resource is available in Software Processes Across NASA (SPAN), accessible to NASA-users from the SPAN tab in this Handbook.
- (SWEREF-122) Alberts, C.J., 1996.
- (SWEREF-197) Software Processes Across NASA (SPAN) web site in NEN SPAN is a compendium of Processes, Procedures, Job Aids, Examples and other recommended best practices.
- (SWEREF-223) ISO/IEC 16085, IEEE STD 16085-2006. NASA users can access IEEE standards via the NASA Technical Standards System located at https://standards.nasa.gov/. Once logged in, search to get to authorized copies of IEEE standards.
- (SWEREF-271) NASA STD 8719.13 (Rev C), Document Date: 2013-05-07
- (SWEREF-273) NASA SP-2016-6105 Rev2.
- (SWEREF-276) NASA-GB-8719.13, NASA, 2004. Access NASA-GB-8719.13 directly: https://swehb-pri.msfc.nasa.gov/download/attachments/16450020/nasa-gb-871913.pdf?api=v2
- (SWEREF-346) NPR 8705.5A, NASA Office of Safety and Mission Assurance, 2010. Effective Date: June 07, 2010, Expiration Date: June 07, 2022
- (SWEREF-380) Software Risk Checklist, Flight Software Branch, Software Risk Management Plan, NASA Marshall Space Flight Center (MSFC). This is a list of generic risks organized by life cycle phase. This NASA-specific information and resource is available in Software Processes Across NASA (SPAN), accessible to NASA-users from the SPAN tab in this Handbook.
- (SWEREF-500) Public Lessons Learned Entry: 272.
- (SWEREF-512) Public Lessons Learned Entry: 625.
- (SWEREF-524) Public Lessons Learned Entry: 803.
5.2 Tools
NASA users can find tools relevant to this requirement in the Tools Library in the Software Processes Across NASA (SPAN) site of the Software Engineering Community in NEN.
The list is informational only and does not represent an “approved tool list”, nor does it represent an endorsement of any particular tool. The purpose is to provide examples of tools being used across the Agency and to help projects and centers decide what tools to consider.
6. Lessons Learned
6.1 NASA Lessons Learned
The NASA Lessons Learned database contains the following lessons learned related to risk management:
- Lewis Spacecraft Mission Failure Investigation Board. Lesson Number 0625: Adopt Formal Risk Management Practices 512. "Faster, Better, Cheaper methods are inherently more risk-prone and must have their risks actively managed. Disciplined technical risk management must be integrated into the program during the planning and must include formal methods for identifying, monitoring, and mitigating risks throughout the program. Individually small, but unmitigated risks on Lewis produced an unpredicted major effect in the aggregate."
- Identification, Control, and Management of Critical Items Lists. Lesson Number 0803: The Use of Probabilistic Risks Assessments 524: "Probabilistic risk assessments have proven to be useful procedures in providing product development teams with an insight into factors of safety and to strengthen critical item or single failure point retention rationale. Margins of safety have a strong influence on the acceptability of retaining potential failure modes or critical items if it can be proven that risk of failure is reduced to an acceptably low level."
- Flight Anomaly of Atmospheric Trace Molecule Spectroscopy (ATMOS) Instrument, Risk Assessment. Lesson Number 0272 500: Lesson Learned No. 2 states: "Low-cost Shuttle Transportation System (STS)-borne experiments with plans for repeated flights, exemplified by the Atmospheric Trace Molecule Spectroscopy (ATMOS) spectrometer, require risk assessments different from those used for single launch experiments."
6.2 Other Lessons Learned
No other Lessons Learned have currently been identified for this requirement.
7. Software Assurance
7.1 Tasking for Software Assurance
1. Confirm and assess that a risk management process includes recording, analyzing, planning, tracking, controlling, and communicating all software risks and mitigation plans.
2. Perform audits on the risk management process for the software activities.
7.2 Software Assurance Products
- Software Engineering Plans Assessment
- Risk Management Process Audit Report (Results and findings from audits on the risk management process performance for software activities, including any risks or issues.)
Evidence that Task 1 confirmation has occurred.
Objective Evidence
- Software risks
- Software assurance audit results
- Software status charts and data
7.3 Metrics
- # of software work product Non-Conformances identified by life cycle phase over time
- # of Risks trending up over time
- # of Risks trending down over time
- # of Risks with mitigation plans vs. total # of Risks
- # of Risks by Severity (e.g., red, yellow, green) over time
- # of Risks identified in each life cycle phase (Open, Closed)
- # of process Non-Conformances (e.g., activities not performed) identified by SA vs. # accepted by the project
- Trends of # Open vs. # Closed over time
- # of Non-Conformances per audit (including findings from process and compliance audits, process maturity)
- # of Open vs. Closed Audit Non-Conformances over time
- Trends of # of Non-Conformances from audits over time (Include counts from process and standards audits and work product audits.)
- # of Compliance Audits planned vs. # of Compliance Audits performed
- # of software process Non-Conformances by life cycle phase over time
See also Topic 8.18 - SA Suggested Metrics.
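A few of the risk-related metrics listed above can be computed directly from a risk list. The sketch below assumes each risk is a dictionary with hypothetical `status`, `severity`, and `mitigation` keys. It produces a single snapshot; trends over time would come from saving one snapshot per reporting period.

```python
from collections import Counter


def risk_metrics(risks):
    """Snapshot of selected risk metrics for one reporting period."""
    open_risks = [r for r in risks if r["status"] == "open"]
    return {
        "total": len(risks),
        "with_mitigation_plan": sum(1 for r in risks if r.get("mitigation")),
        "open": len(open_risks),
        "closed": sum(1 for r in risks if r["status"] == "closed"),
        # Severity counts cover open risks only (an assumed convention).
        "open_by_severity": dict(Counter(r["severity"] for r in open_risks)),
    }


# Hypothetical risk list
risks = [
    {"risk_id": "SW-001", "status": "open", "severity": "red",
     "mitigation": "Add second reviewer for interface changes"},
    {"risk_id": "SW-002", "status": "open", "severity": "yellow", "mitigation": None},
    {"risk_id": "SW-003", "status": "closed", "severity": "green",
     "mitigation": "Reserve backup test lab"},
    {"risk_id": "SW-004", "status": "open", "severity": "yellow",
     "mitigation": "Prototype early"},
]

snapshot = risk_metrics(risks)
print(snapshot)
```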
7.4 Guidance
Risk management is one of the areas where software assurance can add the most benefit. The software assurance involvement with risk management begins early in the project and continues throughout. Software assurance has two primary areas of involvement with risk:
Task 1:
For the first SA task, software assurance will confirm that the project is planning and performing all the activities needed to manage its risk appropriately.
Software assurance will confirm that the project has adequately planned its risk management by reviewing the project risk management plan. Check that the plan for risk management is established and documented. Think about the following questions:
- Is there a risk strategy including the criteria for developing/implementing a mitigation plan for risk?
- Has a risk management process been established (or chosen from a Center asset library and tailored for the project)?
- Has a tool or a process for documenting and tracking the risks been chosen?
- Are regular meetings set up to review the risks and update their status and take any necessary action to control the risk?
- Has the communication path for risk status been established, including the path for elevating risks with increasing severity levels?
- Have all the necessary people been identified with their roles for participation in the risk process?
- Are software assurance personnel included in the risk review meetings?
Throughout the project, confirm that the risk management plan is being followed:
- Are risks being identified and recorded on a continuing basis?
- Are the risk review meetings being held regularly and are the risks being updated frequently?
- Are the more severe risks being watched closely so mitigations can be implemented as necessary?
- Are the appropriate people participating in the risk management process?
- Are there any issues with the project risk management process that need to be brought up to the project manager?
Task 2:
For the second SA task, software assurance needs to do a process audit periodically to make sure the project is following the planned risk management process. Software assurance will track any findings to closure. If audit findings are not being addressed by the project, then they should be elevated through the SA reporting chain.
Independently, software assurance should look at all aspects of the project to provide a second set of eyes, identify any risks it sees in project activities, and bring them to the attention of project management. (Usually, this occurs through the participation of SA in the project risk management process or through software assurance participation in reviews.)
Software assurance should keep risk in mind in all of its activities and notify the project of any risks it identifies while performing its regular tasks. These tasks include:
- Any process or product audits performed.
- Attendance and participation in any reviews (including peer reviews) or project meetings.
- Any analysis performed.
Software assurance personnel are typically members of the project risk boards and should submit their risks to the risk board for inclusion. These risks should be tracked to closure. If the project risk board does not accept any risks submitted by SA, SA still tracks these risks and elevates them again if the likelihood or severity of the risk seems to rise.
For every task that involves performing an audit, all audit findings should be promptly shared with the project and addressed as described in the handbook guidance.
See also Topic 8.12 - Basics of Software Auditing.
7.4.1 Checklist for Auditing the Risk Management Process
The checklist below can be used to audit a project's risk management process.
7.5 Additional Guidance
Additional guidance related to this requirement may be found in the following materials in this Handbook: