10.5. Restoring and Recovering Resources
While an incident response is in progress, the CERT team should be investigating while working toward data and system recovery. Unfortunately, it is the nature of the breach which dictates the course of recovery. Having backups or offline, redundant systems during this time is invaluable.
To recover systems, the response team must bring any downed systems or applications back online, such as authentication servers, database servers, and any other production resources.
Having production backup hardware ready for use is highly recommended, such as extra hard drives, hot-spare servers, and the like. Ready-made systems should have all production software loaded and ready for immediate use. Only the most recent and pertinent data needs to be imported. This ready-made system should be kept isolated from the rest of the network. If a compromise occurs and the backup system is a part of the network, then the purpose of having a backup system is defeated.
System recovery can be a tedious process. In many instances there are two courses of action from which to choose. Administrators can perform a clean re-installation of the operating system on each affected system followed by restoration of all applications and data. Alternatively, administrators can patch the offending vulnerabilities and bring the affected system back into production.
10.5.1. Reinstalling the System
Performing a clean re-installation ensures that the affected system is cleansed of any trojans, backdoors, or malicious processes. Re-installation also ensures that any data (if restored from a trusted backup source) is cleared of any malicious modifications. The drawback to total system recovery is the time involved in rebuilding systems from scratch. However, if there is a hot backup system available for use where the only action to take is to dump the most recent data, system downtime is greatly reduced.