Hi,
I have a concern that I can't seem to find an easy answer to through all the marketing gloss!
My concern is how to protect against a failure of shared storage, but first I need to explain how we currently operate.
We have 2 data centres on site, which allows me to use Linux clusters with local storage running drbd and heartbeat.
The DCs are linked by fibre, which allows data to be replicated synchronously. If, for whatever reason, the primary server
dies, the other node can take over within 30 seconds with no data loss and no noticeable interruption to the users. This allows
us to have an active-active DC setup.
We have a small vSphere deployment, but it uses local storage: the VMs are static, so a clone of each can sit on the
ESX box in the other data centre. We do wish to expand on this.
I do understand how a SAN will benefit us and add resilience, but if we have one SAN, it is still a single point of failure, e.g.
the worst case is if we lose a data centre. By adding a second SAN and replicating at the hardware level or via software, we can
mitigate this and have up-to-date VMs brought up using the storage in the other data centre. However, it still seems that manual
intervention is required, so it is not as highly available as my drbd cluster.
Is it possible to have ESX servers across 2 DCs, a SAN in each DC, and each ESX host connected to both SANs, so that if one (primary) SAN dies,
the VMs restart via the other SAN with no data loss or manual intervention?
I have tested FT before (with Nexenta), but again, this relies on the same storage.