The Federal Financial Institutions Examination Council (FFIEC) is a mouthful. It’s also a U.S. government interagency body that codifies standards, principles, and forms for this country’s five banking regulators: the Federal Reserve Board of Governors, the FDIC, the National Credit Union Administration, the Office of the Comptroller of the Currency, and the Consumer Financial Protection Bureau. The State Liaison Committee is also on board, which adds state-level banking regulators to the roster.
The FFIEC works to keep these bodies from issuing conflicting regulations, which would make an already challenging set of rules even harder to follow.
FFIEC Business Continuity: Best Practices
One of the FFIEC’s responsibilities is to prescribe business continuity guidance for regulators and the finance industry. FFIEC business continuity standards start with an enterprise-wide business continuity plan (BCP).
In theory, most American businesses already have a BCP. In practice, many BCPs sit and gather digital dust, since implementing and testing disaster recovery can be hard and expensive. However, the FFIEC is quite serious about businesses developing and testing business continuity plans. Although the interagency body does not levy non-compliance fines itself, its member agencies do.
Compliance requirements for financial institutions are more complex than those of other industries, with the possible exception of healthcare. The initial business impact analysis (BIA) requires planners to explore how a disaster will affect all of these areas and more: personnel, phone and digital communications, application environments across the company, and facilities. Planners will also need to know how to protect electronic payment systems, liquidity, and financial disbursement in the face of a disaster.
On-Time Application Recovery
Let’s drill down to the FFIEC’s application recovery guidance, since getting applications back up and running is critical to business health and compliance, and sometimes even to survival. Here is the critical mandate: Recover your applications before downtime or data loss causes significant damage and triggers non-compliance investigations.
According to the FFIEC, business continuity plans should operate both proactively and reactively, both for the business and for the business’s outsourced partners. Proactively, set Recovery Time Objectives (RTO) to minimize downtime and Recovery Point Objectives (RPO) to minimize data loss. Create data protection policies and procedures that enable swift recovery, validation, and compliance reporting; maintain remote DR sites and/or cloud-based failover; and test the BCP at least quarterly.
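To make the two objectives concrete, here is a minimal Python sketch of how a recovery test result might be compared against an application's RTO and RPO. The `RecoveryObjective` record, the `evaluate_drill` function, and the sample figures are hypothetical names and values chosen for illustration; none of them come from the FFIEC.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """Hypothetical record of one application's recovery targets."""
    app: str
    rto: timedelta  # Recovery Time Objective: maximum tolerable downtime
    rpo: timedelta  # Recovery Point Objective: maximum tolerable window of lost data


def evaluate_drill(objective, downtime, data_loss_window):
    """Return the objectives a recovery test missed (empty list if both were met)."""
    misses = []
    if downtime > objective.rto:
        misses.append("RTO")
    if data_loss_window > objective.rpo:
        misses.append("RPO")
    return misses


# Assumed example: a payments application with a 15-minute RTO and a 1-minute RPO.
payments = RecoveryObjective("payments", timedelta(minutes=15), timedelta(minutes=1))
print(evaluate_drill(payments, timedelta(minutes=20), timedelta(seconds=30)))
```

In this made-up drill the application took 20 minutes to come back, so the check flags the RTO while the RPO is still met. A record like this per application is one simple way to feed the validation and compliance reporting the plan calls for.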
Reactive planning lists all procedures in case a disaster does happen. The BCP should contain updated contacts for emergency response personnel, computing locations, and detailed recovery procedures covering everything from server restores to application and data recovery, along with their relative priorities and the resources they require. Although no one wants to think about this, the key IT personnel who know these procedures by heart might not be available. Assume that the person carrying out the procedures is an IT generalist.
Since the emphasis here is not just on restoring data but on restoring applications, your BCP must include a way to keep mission-critical applications running in the event of a disaster. The FFIEC does not especially care how you do this, only that you do. It lists a number of likely options. No one option does it all; in practice you will probably mix and match them to achieve your priority-order recovery objectives. Below are the most common options:
- Remote hot or warm site. A hot site is a secondary DR site that can take over production within minutes to hours. Typically, IT mirrors or replicates production data on a continuous or near-continuous schedule from the primary site. Admins or service providers must practice the same security measures as the primary data center. Warm sites run along the same model, housing duplicate equipment and maintaining fast access to backup data. Warm sites are an economical alternative to mirrored hot sites, although I suggest that warm site owners also deploy failover services for applications with RPOs and RTOs in the seconds.
- Mirrored sites/data centers. Some data centers serve as DR hot sites to each other, such as two or more regional data centers that bi-directionally replicate mission-critical data between duplicate computing environments. This makes a lot of economic sense since it lets businesses make greater use of an expensive DR site investment.
- Disaster Recovery as a Service (DRaaS). DRaaS is widely defined as a cloud-based recovery service. There are several advantages, including customized service levels, reasonable storage pricing, and (unless you’re right next door to your provider’s data center) a remote location. If you go with a DRaaS provider, I strongly suggest springing for the failover option, in which your provider spins up a virtual data center for your mission-critical applications.
- On-premises recovery services. This option assumes that your physical plant is undamaged or that you maintain a cold site with sufficient room and power for computing equipment. You or your service provider maintain sufficient equipment so you can rebuild and relaunch your computing infrastructure within hours to a day. This option may not be quick enough to recover transactional databases but is a good alternative for recovering assets such as business-critical VMs or a production storage appliance.
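As a rough illustration of mixing and matching these options by priority, the Python sketch below orders applications by RTO and suggests an option for each. The cut-off values, function names, and sample applications are assumptions made for this example, loosely based on the time ranges mentioned above; they are not FFIEC thresholds.

```python
from datetime import timedelta


def suggest_option(rto):
    """Map an RTO to a recovery option using illustrative, assumed cut-offs."""
    if rto < timedelta(minutes=1):
        return "failover service"  # RTOs in the seconds
    if rto <= timedelta(hours=4):
        return "hot or mirrored site"  # minutes to a few hours
    return "warm site or on-premises recovery"  # hours to a day


def recovery_order(apps):
    """Return application names sorted so the tightest RTO is recovered first."""
    return sorted(apps, key=lambda name: apps[name])


# Hypothetical application inventory with assumed RTOs.
apps = {
    "reporting": timedelta(hours=12),
    "payments": timedelta(seconds=30),
    "core-banking": timedelta(hours=2),
}

for name in recovery_order(apps):
    print(name, "->", suggest_option(apps[name]))
```

Real plans weigh more than RTO alone (RPO, cost, and dependencies between applications all matter), so treat this as a starting point for the conversation rather than a decision rule.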
Some industries play fast and loose with regulatory compliance. Financial services are not among them. Their challenge is not learning how to take regulation seriously; it’s making sense of the bewildering range of DR possibilities and practices.
If all of this sounds quite complicated, that’s because it is. Seriously consider investing in a Disaster Recovery as a Service offering whose provider knows FFIEC business continuity guidance well and can guide you to the business continuity services and technologies that best fit your organization.