The specification defines new terms for its primary architectural elements: a resilience target, which can be recovered; a resilience engine, which performs recovery actions; and a resilience authority, which sets recovery policies and initiates new actions when they are needed to return the resilience target to a trusted state.
A resilience target is a component that has mutable code and configuration information, whether firmware, software or both. By defining a resilience target within a device, the architect is able to draw a boundary around the code, configuration and runtime environment that other components in the specification can fix or service when a compromise occurs or patches need to be deployed due to a known vulnerability. Because it is anticipated that a resilience target might be compromised by malware, the architecture for recovery assumes the resilience target will not assist in the recovery processes. In fact, malware that compromises the resilience target may actively try to prevent recovery from occurring.
The resilience engine is a component that can service one or more resilience targets, even when they are uncooperative. It can function even when the remote network is unavailable, and it supports configurable policies that govern when and how it performs servicing actions. To respond to circumstances not foreseen when the device was created, the resilience engine is expected to be able to receive new instructions from an authorized entity called a resilience authority.
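As a rough illustration of accepting instructions only from an authorized entity, the sketch below updates a servicing policy when a message carries a valid authentication tag. The use of HMAC-SHA256 with a shared key, the JSON message format, and the names `accept_instruction` and `restart_interval_hours` are all assumptions made for this example; the text above does not say how authorization is performed.

```python
import hashlib
import hmac
import json


def accept_instruction(message: bytes, tag: bytes, key: bytes, policy: dict) -> dict:
    """Return an updated policy if the message is authentic, else the old policy.

    Hypothetical scheme: the resilience authority and the engine share `key`,
    and the authority sends an HMAC-SHA256 tag alongside a JSON message.
    """
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(expected, tag):
        return policy  # unauthenticated instructions are ignored
    updated = dict(policy)  # copy so the caller's policy is not mutated
    updated.update(json.loads(message))
    return updated
```

Because the check runs locally, the engine can keep applying its last accepted policy while the network is unavailable.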
Three resilience building blocks enable the resilience engine to reliably service the resilience target. The first is a secure execution environment that allows the resilience engine to run without interference from the resilience target. The simplest example is a reboot in which the resilience engine starts first in the boot sequence and decides when the resilience target starts. The resilience target can’t interfere with the resilience engine if it isn’t running.
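The boot-order idea can be sketched as a small simulation in which the engine runs before the target and decides whether the target may start as-is. The digest check, the recovery image, and the names `ResilienceTarget` and `boot` are assumptions for illustration only, not details taken from the specification.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ResilienceTarget:
    """Hypothetical model of a target: a mutable code image and a run flag."""
    image: bytes
    running: bool = False


def boot(target: ResilienceTarget, trusted_digest: str, recovery_image: bytes) -> list:
    """Run the engine first; start the target only after it is in a trusted state."""
    actions = []
    # The target is not running yet, so it cannot interfere with this check.
    if hashlib.sha256(target.image).hexdigest() != trusted_digest:
        target.image = recovery_image  # servicing action: restore a known-good image
        actions.append("recovered")
    target.running = True  # the engine decides when the target starts
    actions.append("started")
    return actions
```

In this sketch a tampered image is replaced before the target ever executes, which is the property that starting the engine first is meant to provide.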
The second resilience building block is a storage protection latch. Storage protection latches are intended to provide read and write protection for persistent storage. Immediately after a device is turned on or restarted, the storage is accessible for reading and writing. Once a storage protection latch is enabled, it prevents reading, writing or both on an area of persistent storage. Storage protected by a storage protection latch provides an ideal location for a resilience engine to store its code and configuration. When the resilience engine runs first during boot, it can read and modify its persistent storage, but before it starts the resilience target, it can switch on the storage protection latch so the resilience target cannot modify storage used by the resilience engine.
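A minimal software model of this latch behavior might look like the following. The class `LatchedStorage` and its methods are hypothetical, and the model assumes that a latch, once enabled, can only be cleared by a restart; real latches would be enforced by hardware or firmware rather than by a Python object.

```python
class LatchedStorage:
    """Toy model of persistent storage guarded by read and write latches."""

    def __init__(self):
        self._data = {}
        self._read_locked = False
        self._write_locked = False

    def enable_latch(self, read: bool = False, write: bool = False) -> None:
        # Latches can only be switched on here; there is no call to switch them off.
        self._read_locked = self._read_locked or read
        self._write_locked = self._write_locked or write

    def reset(self) -> None:
        """Model a power-on or restart: latches clear, stored data persists."""
        self._read_locked = False
        self._write_locked = False

    def read(self, key):
        if self._read_locked:
            raise PermissionError("read latch is enabled")
        return self._data[key]

    def write(self, key, value) -> None:
        if self._write_locked:
            raise PermissionError("write latch is enabled")
        self._data[key] = value
```

In use, the engine would write its configuration, call `enable_latch(write=True)`, and only then start the target; any later `write` fails until the next `reset`.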
The third and final resilience building block is a watchdog counter. The watchdog counter is how the resilience engine gains control to perform servicing actions even if the resilience target is not cooperative. The simplest example is a “latchable” watchdog counter. The resilience engine can configure it to unconditionally restart the platform after a fixed period of time, for example once a day. The restart gives the resilience engine a chance to perform servicing actions. For some use cases, restarting a platform could be disruptive for the user, so the specification defines other types of watchdog counters to address different situations.
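The latchable watchdog can be sketched as a countdown that cannot be reconfigured once latched. The class `LatchableWatchdog`, its tick-based notion of time, and the choice to clear the latch on restart are assumptions for this illustration.

```python
class LatchableWatchdog:
    """Toy countdown that forces a restart and cannot be changed once latched."""

    def __init__(self):
        self.remaining = None  # None means the counter is not armed
        self.latched = False

    def configure(self, ticks: int) -> None:
        if self.latched:
            raise PermissionError("watchdog is latched until the next restart")
        if ticks <= 0:
            raise ValueError("ticks must be positive")
        self.remaining = ticks

    def latch(self) -> None:
        if self.remaining is None:
            raise RuntimeError("configure the watchdog before latching it")
        self.latched = True

    def tick(self) -> bool:
        """Advance one time unit; return True when the platform must restart."""
        if self.remaining is None:
            return False
        self.remaining -= 1
        if self.remaining <= 0:
            # Restart: the counter and the latch clear, and the engine runs first again.
            self.remaining = None
            self.latched = False
            return True
        return False
```

The engine would call `configure` and `latch` before starting the target, so that even a compromised target cannot postpone the restart.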
Cyberattacks are evolving and continually improving. This means that solutions believed to be secure at the time of manufacturing consistently need patches for unanticipated vulnerabilities or mistakes during the lifecycle of devices. Adopting the new cyber resilient building blocks provides strong protections for the resilience engine and reliable servicing opportunities when vulnerabilities or compromises of the resilience target need to be rectified. For a solution to be secure over time, robust building blocks like those mentioned above, as well as active support, are needed.
To protect the ever-increasing number of IoT devices, these capabilities must be designed to be both simple to implement in hardware or firmware and simple for software to use. Such simplicity decreases the vulnerability of the security-critical firmware that operates them, while also minimizing the cost, power consumption and size of the hardware.
By adopting and implementing these building blocks, device architects have a robust starting point for ensuring the resilience of IoT devices, both now and in the future. As technology advances, new cyberattacks and vulnerabilities will arise that threaten the security of the growing number of IoT devices. However, through the development of specific use cases that require this new specification to ensure device security, manufacturers will be able to implement the particular building blocks required for their devices and the applications in which they will be used. Doing so safeguards IoT devices throughout their lifecycle, regardless of the sophistication of the potential attacks.