SOS! Ensuring safety and security in an expanding system of systems landscape
Position paper
TNO authors: Sezen Acur, Teun Hendriks, Yoram Meijaard, Carolien van der Vliet-Hameeteman
A collaboration between TNO-CST and TNO-ESI stresses the importance of considering security and safety throughout the entire system development lifecycle, and advocates integral systems engineering for cyber resilience.
How can Systems Engineering ensure resilient safety and security over the full lifecycle?
Security and safety are emergent properties of a system and are interlinked: security breaches may cause safety incidents. No system can provide absolute security and safety due to the limits of human certainty. The uncertainty referred to here exists in the lifecycle of every system; constraints such as cost, schedule, performance, feasibility, and practicality are some examples. Therefore, systems engineers must make trade-offs across contradictory, competing, and conflicting needs and constraints to provide adequate levels of safety and security, whilst keeping the system operational.
The system of systems landscape is changing, which makes it more difficult to ensure safety and security. Systems are increasingly complex and software intensive, and are deployed in distributed systems of systems and solution ecosystems. Systems still need to be correct and display safe behavior, even when interacting with this dynamic environment. Proving these properties now overwhelms the established Verify & Validate approach used by systems engineers to cope with component failures and software bugs.
Safety and security standards are established to provide guidelines for the protection of systems. Standards are the basis for acceptable processes and technical protocols to arrange for safety and security of systems and services. However, compliance with a standard provides neither guarantees nor certainty with respect to the safety and security of a system.
In fact, today’s large and complex systems of systems invariably have potential weaknesses and possible residual flaws. Unavoidable gaps in safety and security knowledge are to be expected within such complex systems.
While operating a system, there may be situations where a lack of knowledge leads to incorrect and unsafe system operation. For such events, upfront and collective stakeholder agreements are important to determine what steps are needed to achieve rapid incident recovery, provide business continuity, and remain resilient.
Consequently, systems engineering should embed safety and security in its Systems Engineering (SE) process over the full system lifecycle. SE should be able to evaluate and incorporate requirements to enhance safety and security early in the SE process. Furthermore, SE should be able to manage and update incident response and recovery capabilities over the full system lifecycle.
This position paper advocates incorporating resilience thinking and resilience engineering in SE, driving system capabilities that support mission or business functions. The aim is to minimize the effects of incidents and adversarial actions on systems, and to minimize the impact on their capabilities and system usage behavior. Collaboration between systems engineering, safety and security, and operations is now crucial for the creation and operation of resilient systems and systems of systems (SoS) which fulfill the needs of society and are trustworthy for society.
Systems Engineering, safety and security are crucial for creation and operation of resilient systems and SoS which fulfill the needs of society and are trustworthy for society. SE should ensure that effects of incidents and adversarial actions on systems are minimized and disruptions in operation are quickly resolved.
To ensure adequate resilience in systems, system operation, systems engineering, safety and security should collaborate. Foundational principles are identified and an overall methodology is suggested. However, practical guidelines are missing.
Next steps are to investigate how to make cyber-resilience practical, how to assess threats, and how to enable orchestrated response and recovery. We need to understand what countermeasures are adequate. We need to learn more about threats and vulnerabilities in an SoS landscape. These need to be explored further to give practical resilience guidelines to the various disciplines.