high availability - Everything2.com

High Availability most often describes a system that is free of Single Points Of Failure (often shortened to SPOFs). Thus, the failure of any single component may slow the system down, but will not cause a full loss of service. In practice, if an entire node in a High Availability cluster fails, there will be a brief loss of service (e.g., active TCP connections will be aborted), followed by a slowdown while the caches are being warmed.
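As a rough illustration of the idea, the sketch below simulates a two-node cluster in which a request that fails on one node is retried on the next. The names `Node`, `NodeDown`, and `send_with_failover` are hypothetical and exist only for this example. It models the behaviour in a single process and is not a real clustering setup.

```python
class NodeDown(Exception):
    """Raised when a simulated node cannot serve a request."""


class Node:
    """A hypothetical cluster member that either serves requests or is down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise NodeDown(self.name)
        return f"{self.name} served {request}"


def send_with_failover(nodes, request):
    """Try each redundant node in turn; fail only if every node is down."""
    for node in nodes:
        try:
            return node.handle(request)
        except NodeDown:
            continue  # this attempt is lost; retry on the next node
    raise RuntimeError("total loss of service: all nodes are down")


cluster = [Node("node-a"), Node("node-b")]
first = send_with_failover(cluster, "GET /")   # served by node-a
cluster[0].healthy = False                     # simulate an entire node failing
second = send_with_failover(cluster, "GET /")  # node-b takes over
```

Because no single node is required, service only stops when every node has failed. The lost attempt on the failed node stands in for the brief interruption described above; cache warming is not modelled here.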

In older texts, especially marketing material, you will see Fault Tolerant and Highly Available used interchangeably. Nowadays, Fault Tolerant most commonly describes a system whose hardware lets any component fail without any impact on the software.

Note that a highly available system may be more attractive than a fault tolerant system, as the former is usually also resistant to many forms of software fault.
