We are currently installing ABB robot control software (Robview) on a virtual machine (VM) to prevent production loss due to desktop PC or hard drive failure. This will also provide a single licensing point. Data storage capacity will increase greatly, and the system can be restored almost instantly, with only 10 minutes of lost production. The VM will also host PLC software, camera configuration software, and any other software needed for system support. Operators will access the software through Remote Desktop Connection from any PC or laptop.
Fault tolerance in the cloud really has to be built into the applications themselves, which should never assume that any single cloud component will be up all of the time. You don't buy cloud IaaS capacity, for example, assuming that the server running your VM will always be up. I suggest you look at the following two papers and links. The first is on cloud-aware application development. This white paper covers how to build applications for the cloud: scaling out rather than up, always assuming that pieces of the infrastructure can fail, and using microservices. The second is "The anti-fragile organization", which describes ways to test applications by destroying parts of the infrastructure with tools such as Chaos Monkey.
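
To make the "assume anything can fail" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Chaos Monkey or any particular cloud platform works: the replica names, `call_with_failover`, and `chaos_round` are all hypothetical, and the "replicas" are plain in-process functions standing in for VMs or service instances. The client tries each replica in turn, and the test harness deliberately takes one replica down at random before each request.

```python
import random
from typing import Callable, List, Sequence, Tuple

Replica = Callable[[str], str]


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Replica], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            # This replica is down; move on instead of assuming it is up.
            failures += 1
    raise AllReplicasFailed(f"{failures} replicas failed for {request!r}")


def healthy_replica(name: str) -> Replica:
    """Hypothetical stand-in for a working VM or service instance."""
    def handle(request: str) -> str:
        return f"{name} handled {request}"
    return handle


def dead_replica(name: str) -> Replica:
    """Hypothetical stand-in for an instance that has been taken down."""
    def handle(request: str) -> str:
        raise ConnectionError(f"{name} is down")
    return handle


def chaos_round(names: Sequence[str], rng: random.Random) -> Tuple[str, List[Replica]]:
    """Build a replica pool with one randomly chosen replica taken down."""
    victim = rng.choice(list(names))
    pool = [dead_replica(n) if n == victim else healthy_replica(n) for n in names]
    return victim, pool
```

A test in this style would call `chaos_round` repeatedly and check that `call_with_failover` still returns a response every time. With a single VM there is nothing to fail over to, so the same call raises `AllReplicasFailed`, which is the case the reply above is warning about.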