Pipeline Publishing, Volume 4, Issue 4

	This Month's Issue:
	Maintaining Network Health

Network, Heal Thyself!
Technological Advances Put Building Blocks in Place to Create Self-Healing, Reactive Network Architecture

By Ken Ferderer

Since the beginning of network computing, networks have been designed as a collection of independent devices that have a limited awareness of each other, and no awareness of the collective whole outside of the basic routing fabric. As a result, networks have always had a limited ability to dynamically adjust to problems or events occurring within the network and are incapable of adjusting to any changes outside the network.

Traditionally, network engineers have attempted to build around these shortcomings with redundant routes, standby devices, and other mechanisms that introduce some basic resiliency into the network. These simple workarounds have always raised larger questions:

Is it possible for a network and its collection of independent devices to react appropriately to changes in the environment without manual intervention?
Could a network ever recognize an event outside of its routing fabric that requires a change to its behavior or to the operation of the collective whole?
And based on what happens, could it modify the behaviors of multiple independent devices to accommodate that event?

Take as an example a sophisticated emergency response network, which links together special sensors that detect fires and chemical, radiological or nuclear threats. Based on the ‘type’ of event recognized by the sensor arrays, the underlying logical network must be dynamically reconfigured into one of any number of predefined configurations.

In one such scenario, the sensors may report that a fire has been detected and, based on this event, the network should immediately alter its current logical configuration to provide secure connectivity between first responders, including police and fire departments, local authorities, and local news agencies. If, however, a chemical or nuclear event is detected, the underlying network should instantaneously reconfigure itself to securely connect all federal response agencies and command and control, and to route around any unresponsive sites.
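
The scenario above amounts to a lookup from an event type to a predefined logical configuration. A minimal Python sketch of that idea follows. The event names, party names, and the select_configuration helper are hypothetical illustrations, not part of any real product or of the architecture described in this article.

```python
# Hypothetical mapping from a sensor event type to the parties that the
# logical network should securely connect. All names are illustrative.
PREDEFINED_CONFIGURATIONS = {
    "fire": ["police", "fire_department", "local_authorities", "local_news"],
    "chemical": ["federal_response_agencies", "command_and_control"],
    "nuclear": ["federal_response_agencies", "command_and_control"],
}


def select_configuration(event_type, unresponsive_sites=()):
    """Return the sites to connect for an event, skipping unresponsive ones."""
    try:
        sites = PREDEFINED_CONFIGURATIONS[event_type]
    except KeyError:
        raise ValueError(f"no predefined configuration for event {event_type!r}")
    skipped = set(unresponsive_sites)
    # "Routing around" an unresponsive site is modeled here as leaving it out.
    return [site for site in sites if site not in skipped]
```

In this sketch an unknown event type raises an error rather than silently choosing a configuration, on the assumption that an emergency network should never guess which parties to connect.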

To date, this type of dynamic network-level reconfiguration – or self-healing – based on a non-network event has simply not been possible. Even the network management solutions that, by definition, provide a broader view of the deployed network environment are only able to recognize limited network-level events, such as dropped routes, throughput issues, and device failures. These solutions have a very limited capability to automatically react to, or recover from, any type of network event. In fact, most network management solutions are only capable of pushing static configuration files out to deployed devices that have lost their configurations.

In most cases today, the only self-healing network alternative is to construct redundant routes so that if a path becomes unavailable, traffic will automatically re-route onto the secondary path. This solution is far from ideal, as it is impossible to build enough redundant routes and workarounds for every type of potential event, internal or external. Truly dynamic and automatic reconfiguration of the logical network is simply not possible with existing technologies, no matter what device and management vendors would like us to believe. However, times are changing.

Technology advances have finally provided the building blocks required to create a sophisticated networking architecture that is truly dynamic and can self-heal – or react – in response to any number of definable events, even those occurring outside of the network fabric. The essential technologies consist of:

Device-level service-oriented architectures (SOAs): These SOAs are able to expose underlying communication services and other capabilities on a typical network device to higher-level applications;
Advanced policy-driven control and management solutions: These technologies control network characteristics and behavior based on configurable, user-defined policies and implement the behaviors on selected devices;
Sophisticated rules engines: These engines are capable of defining events and the appropriate response criteria for scenarios both within the network fabric and external to the network.
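
One way to picture how these three building blocks might fit together is the Python sketch below. It is an illustration under stated assumptions rather than a description of any vendor's implementation: the Device, Policy, and RulesEngine classes, their methods, and the device and setting names used with them are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Stands in for a network device that exposes services (the SOA layer)."""
    name: str
    applied: list = field(default_factory=list)

    def apply_setting(self, setting):
        # A real device would change its behavior here; the sketch only records it.
        self.applied.append(setting)


@dataclass
class Policy:
    """A user-defined policy: a setting plus the devices it should reach."""
    setting: str
    device_names: list

    def enforce(self, devices):
        for name in self.device_names:
            devices[name].apply_setting(self.setting)


class RulesEngine:
    """Maps event types, internal or external to the network, to policies."""

    def __init__(self, devices):
        self.devices = {d.name: d for d in devices}
        self.rules = {}

    def add_rule(self, event_type, policy):
        self.rules.setdefault(event_type, []).append(policy)

    def handle(self, event_type):
        """Enforce every policy registered for the event; return how many ran."""
        policies = self.rules.get(event_type, [])
        for policy in policies:
            policy.enforce(self.devices)
        return len(policies)
```

In this sketch, registering a rule such as engine.add_rule("fire", Policy("connect-first-responders", ["edge-router"])) and then calling engine.handle("fire") would push that setting to the named device, while an event with no registered rule changes nothing.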


© 2006, All information contained herein is the sole property of Pipeline Publishing, LLC. Pipeline Publishing LLC reserves all rights and privileges regarding
the use of this information. Any unauthorized use, such as copying, modifying, or reprinting, will be prosecuted under the fullest extent under the governing law.