Active Passive Controllers – So misunderstood

When considering high availability storage design, vendors usually deliver a dual controller architecture to provide controller redundancy.

But an important decision that needs to be made is whether or not the controllers should use an active/active or active/passive configuration.

In an active/active (AA) configuration, both controllers work at the same time, each providing storage services to connected hosts. In an active/passive (AP) configuration, only one controller provides services, whilst the other is online in a standby state, ready to take over the services of the active controller when required.

An AA configuration can take advantage of the resources on both controllers, so in this configuration it could be argued that we are making best use of CPU, networking and memory available in both controllers. However, some vendors may stipulate in their best practices that you should not exceed 50% utilisation from each controller.

But if you can only run each controller at 50% utilisation, the argument for being able to use ALL of the resources on both controllers doesn’t quite work. It’s certainly possible to run each controller above 50%, provided you monitor carefully and some background services can ‘back off’ when workloads contend for controller resources in a failover situation. This could be managed by QoS or prioritisation policies, if the vendor has these features available.
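The arithmetic behind the 50% guideline can be sketched in a few lines. This is only an illustration: it assumes two identical controllers whose loads simply add together on failover, which real arrays will not match exactly, and the function name is made up for this example.

```python
def load_after_failover(util_a, util_b):
    """Return (served, shortfall) for the surviving controller.

    Inputs are whole-number percentages of ONE controller's capacity.
    Assumption: controllers are identical and load adds linearly.
    """
    demand = util_a + util_b          # everything lands on the survivor
    served = min(demand, 100)         # a controller cannot exceed 100%
    return served, demand - served    # shortfall must be throttled or queued


for a, b in [(50, 50), (70, 60)]:
    served, shortfall = load_after_failover(a, b)
    print(f"{a}% + {b}% -> survivor at {served}%, shortfall {shortfall}%")
```

With both controllers at 50%, the survivor is fully used and nothing is left unserved. At 70% and 60%, roughly 30% of one controller's worth of work has to back off, which is where QoS or prioritisation policies would come in.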

Nevertheless, the time spent in a degraded state is unknown, and could range from a few hours to a few days whilst awaiting the replacement component.

Monitoring and managing controller utilisation will add a level of administrator overhead and management complexity, as the administrator must decide how to balance the workload evenly between controllers. An AA configuration can also affect the way software upgrades and hardware replacements are performed, as it could introduce some disruption to these processes.

Sometimes systems using AA are configured such that the controllers merely own LUNs/Volumes and hosts have access to both controllers but data access is served by an owning controller. This method is often simpler than the alternative (which effectively treats each controller as an independent controller giving it its own hostname, IP addresses and services).
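As a rough illustration of the ownership model, the sketch below maps each LUN to an owning controller and reassigns ownership when a controller fails. The LUN names, controller labels and function names are all hypothetical, and this is not any vendor's actual API.

```python
# Hypothetical ownership table: LUN name -> owning controller.
ownership = {"lun0": "A", "lun1": "B", "lun2": "A"}


def serving_controller(lun, table):
    """Hosts can reach both controllers, but I/O is served by the owner."""
    return table[lun]


def fail_over(failed, survivor, table):
    """Return a new table in which the survivor owns the failed controller's LUNs."""
    return {
        lun: (survivor if owner == failed else owner)
        for lun, owner in table.items()
    }


after = fail_over("A", "B", ownership)
print(serving_controller("lun0", ownership), "->", serving_controller("lun0", after))
```

The point is that failover here is only a change of ownership; nothing about hostnames or IP addresses has to move.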

In that alternative, choices about disk assignment/ownership, RAID and pool configurations are made at a controller level. This can add further complexity to the management of the system, requiring a deeper knowledge of its design. It works more like a cluster with independent nodes: in a failover situation it requires proper mapping of IP addresses to interfaces and the transfer of processes (possibly with restarts), so that the second node can take over all of the services of the first node and run them on its own resources.

Storage vendors have worked hard to make this process work, and it does work, but it does add complexity to the system, requiring a deep understanding of how all of the components are configured to ensure that a successful failover will occur.

In an active/passive configuration, the system is much simpler. Both controllers have access to all of the resources (networks, disks, etc.), but services run from only one controller, whilst the other sits in a standby state awaiting failover. The passive controller might carry out a small amount of activity, such as NVRAM mirroring from the active controller, but all of its other resources, including networks, CPU and memory, are online, ready and waiting for a manual or automatic failover.

Both controllers are of identical specification, so if the active controller is running at 95% utilisation, when a failover occurs, the standby controller can accommodate the full 95%, resulting in no degradation of performance.

Also, as there is no further complexity in the configuration of the controllers, the failover can be much quicker than on a system with an independent configuration between each controller; and speed of failover means less impact for the services running on the controllers.

The AP configuration also makes upgrades and replacements easier: carry out the work on the passive controller first, quickly fail over, then repeat it on the other controller (now the passive one).
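The ordering described above can be written out as a short sketch. The controller names and the function are hypothetical, and real upgrade procedures will include health checks and vendor-specific steps that are not shown here.

```python
def rolling_upgrade_plan(active, passive):
    """Return the ordered steps for upgrading an active/passive pair.

    Assumption: the pair can fail over in either direction, so the
    controller being upgraded is never the one serving hosts.
    """
    return [
        f"upgrade {passive} (passive)",
        f"fail over: {passive} becomes active, {active} becomes passive",
        f"upgrade {active} (now passive)",
    ]


for step in rolling_upgrade_plan("ctrl-1", "ctrl-2"):
    print(step)
```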

A final thought:

The decision to use an active/active or active/passive controller configuration is one that the intelligent folk working at storage vendors give careful consideration to when designing a storage array. Many factors decide which method will work best for their particular architecture. Just because one vendor uses AA in one form does not necessarily make it better than AA in another vendor's tech. Likewise, just because some use AA does not mean they are superior to others using AP in their tech. Review each technology separately and on its own merits, read the manuals and whitepapers, then make up your own mind. Vendor FUD has no place in this discussion.

Blog written by

Amirul Islam, Technical Director

https://www.linkedin.com/in/amirulislam/
