Fail Fast vs Fail Safe

Last modified on May 19th, 2013 by Joe.

How a system reacts to a failure characterizes it as a fail fast or a fail safe system. This article discusses whether fail safe or fail fast is better, and then what this has to do with Java.

Fail fast or fail safe – which is better?

Though the phrase ‘fail safe’ sounds better, I feel fail fast is best. Fail safe is not safe, and it does not mean robustness: we are preserving and concealing defects in the system. The resilience exhibited by a fail safe system may not be permanent. That said, a fail safe design is needed in high availability scenarios. When a failure is detected, a workaround is substituted and the availability of the system is ensured.

Fail fast brings out the defect as soon as it is detected. The error is brought out into the open and the system is shut down. Business is obstructed, but we get a chance to rectify the defect: we fix the error, bring the system back up and proceed. It is fixing the error before continuing, not concealing the error condition, that really makes the system robust. Though this results in interrupted availability, over a period it results in a robust system. Fail fast ensures that we don’t ride a punctured bike and create irreversible issues. Do not treat failures in the program as natural, but design the program so that in case of an unexpected failure it fails fast.
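The difference between the two styles can be sketched in a few lines. This is only an illustration under assumed names: the class PortConfig, the default port 8080 and the idea of reading a port setting are all hypothetical and not taken from any real system. The fail safe version substitutes a workaround and hides the bad setting; the fail fast version refuses to continue.

```java
public class PortConfig {

    // Hypothetical default used as the "workaround" in the fail safe style.
    static final int DEFAULT_PORT = 8080;

    // Fail safe: any bad setting is silently replaced, so the defect stays hidden.
    static int failSafePort(String raw) {
        try {
            int port = Integer.parseInt(raw);
            return (port >= 1 && port <= 65535) ? port : DEFAULT_PORT;
        } catch (NumberFormatException e) {
            return DEFAULT_PORT;
        }
    }

    // Fail fast: a bad setting stops the program at once with a clear message.
    static int failFastPort(String raw) {
        try {
            int port = Integer.parseInt(raw);
            if (port < 1 || port > 65535) {
                throw new IllegalStateException("port out of range: " + port);
            }
            return port;
        } catch (NumberFormatException e) {
            throw new IllegalStateException("invalid port setting: " + raw, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(failSafePort("80a")); // keeps running on the default
        System.out.println(failFastPort("80a")); // throws IllegalStateException
    }
}
```

With the fail safe method a typing mistake in the setting may go unnoticed for a long time; with the fail fast method it is reported the first time the program starts.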

Just a thought-provoking question: is fail fast better for a nuclear reactor?

Exception handling and fail fast or safe

Exception handling in Java tempts us to design our programs to be fail safe. Instead, we should use exception handling to fail gracefully and quickly. An exception should not be carried around in the control flow for too long. The golden rule of exception handling is: throw early and catch late. When we hit an error condition, the exception should be thrown immediately. We should not catch an exception unless we are sure what to do with it, and the action we take on catching it should negate the failure.
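A minimal sketch of "throw early and catch late" could look like this. The class and method names (ThrowEarlyCatchLate, parseQuantity, orderTotal, handleRequest) and the unit price are made up for the illustration. The lowest layer throws as soon as it sees a bad value, the middle layer does not catch because it cannot repair anything, and only the top layer, which knows how to answer the caller, catches.

```java
public class ThrowEarlyCatchLate {

    // Throw early: validate at the boundary and fail at once on a bad value.
    static int parseQuantity(String raw) {
        int quantity = Integer.parseInt(raw.trim()); // NumberFormatException on bad text
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive: " + quantity);
        }
        return quantity;
    }

    // Middle layer: no try/catch here, because it has no way to negate the failure.
    static int orderTotal(String rawQuantity, int unitPrice) {
        return parseQuantity(rawQuantity) * unitPrice;
    }

    // Catch late: the one place that knows what to do with the failure.
    static String handleRequest(String rawQuantity) {
        try {
            return "total=" + orderTotal(rawQuantity, 25);
        } catch (IllegalArgumentException e) { // also covers NumberFormatException
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("4"));
        System.out.println(handleRequest("-1"));
        System.out.println(handleRequest("abc"));
    }
}
```

Note that the handler catches only the exception type it can deal with. Anything else, a NullPointerException for example, is allowed to travel up and stop the program, which is the fail fast behaviour we want for unexpected failures.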

Fail fast vs fail safe Java iterators

A Java iterator gives us an interface to traverse the items of the underlying collection. While we are using the iterator, the underlying collection should not be structurally modified, except through the iterator itself. If this contract is not honoured, which can happen in a multi-threaded environment and also in a single thread that modifies the collection inside its own loop, then we get a ConcurrentModificationException.

Fail fast Iterators

We need to remember two points about this. Point one is that this behaviour is not guaranteed. The check for modification is not synchronized with the modification itself, so a modification of the collection may go unnoticed under certain circumstances. So while programming, this behaviour should not be relied upon. Examples of collections with fail fast iterators are ArrayList, Vector and HashSet.

From Java API,

“Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.”
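Here is a small single-threaded sketch of the fail fast behaviour, using ArrayList. The class name FailFastDemo and its helper methods are made up for the illustration. The first method modifies the list behind the iterator's back, the second removes through the iterator, which is the supported way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class FailFastDemo {

    // Removes through the list while a for-each loop (an implicit iterator) is running.
    // Returns true if the iterator detected the modification and threw.
    static boolean removeThroughList(List<String> list, String target) {
        try {
            for (String item : list) {
                if (item.equals(target)) {
                    list.remove(item); // structural change the iterator does not know about
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator itself, so the iterator stays in step with the list.
    static List<String> removeThroughIterator(List<String> list, String target) {
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println("detected: " + removeThroughList(first, "a"));

        // Best effort only: removing the second-to-last element may end the loop
        // quietly without any exception, depending on the implementation.
        List<String> second = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println("detected: " + removeThroughList(second, "c"));

        List<String> third = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeThroughIterator(third, "c"));
    }
}
```

The second call in main is there to show why the exception should be used only to detect bugs: the same mistake may or may not be reported, depending on where in the list it happens.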

Fail safe Iterators

Point two is that not all collections are fail fast. With a fail safe iterator, even if the underlying collection is modified, the iteration does not fail by throwing ConcurrentModificationException. Such an iterator is either created directly on the collection in a way that tolerates concurrent changes, or created on a clone (snapshot) of that collection taken when the iterator is created. One example which supports fail safe iterators is ConcurrentHashMap.
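Both flavours can be tried out with classes from java.util.concurrent. In this sketch the class name FailSafeDemo and its methods are made up for the illustration. ConcurrentHashMap is the example named above; CopyOnWriteArrayList is added here as an example of an iterator that works on a snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class FailSafeDemo {

    // CopyOnWriteArrayList: the iterator walks the snapshot taken when it was created,
    // so the element added during the loop is never seen by that loop.
    static List<String> iterateSnapshot() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> seen = new ArrayList<>();
        for (String item : list) {
            seen.add(item);
            if (item.equals("a")) {
                list.add("d"); // goes into a new copy, not into the snapshot
            }
        }
        return seen;
    }

    // ConcurrentHashMap: the iterator works on the live map and tolerates changes.
    // Whether a change made during the loop is seen by the loop is not guaranteed.
    static Map<String, Integer> modifyWhileIterating() {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        for (String key : map.keySet()) {
            map.remove("two");  // no ConcurrentModificationException here
            map.put("four", 4);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(iterateSnapshot());
        System.out.println(modifyWhileIterating().keySet());
    }
}
```

The price of not failing is that the loop may work on stale data: the snapshot loop never sees "d", and the map loop may or may not visit "four".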

Comments on "Fail Fast vs Fail Safe"

  1. maddy says:

    Nice Explanation…

    It is going to help me to understand some concepts more clearly.

  2. kaviyarasu says:

    Nice explanation sir…..

  3. kaviyarasu says:

    sir i need to know how can we develop mobile web application and what ide can we use for it..??

  4. Syed Viqar Raza says:

    Sir,
    The explanation is very good, but I stuck in 1 point.
    How does Fail fast will take advantage of robustness when the error leads to shutting off system?

  5. Joe says:

    When there is a fault, a fail fast design makes the system shut down. But it alerts us and gives us a chance to fix the issue. Though there will be disruption in service, over a period the system will become stable and error free.

    When we have fail safe design, there is a possibility that we will conceal a defect in the system itself. It may come later at a critical stage and haunt us. When it comes out it may be very devastating and there is a possibility that this may cause side effects also.

    Overall, it depends on the nature of the application. For a general web application where high availability is not critical, we should go with fail fast.

  6. Joe says:

    In mobile web application, there are many platforms to develop for. If you are starting fresh, I recommend you to start with Android. Eclipse IDE is my choice for Android development.

  7. Gaurav says:

    I love your compact and to the point articles. :-)

  8. Yogita says:

    Nice Article Sir.. I am convinced with the point and need.

  9. Jawed says:

    In short and to the point..good one..keep it up. Your effort helping others to learn

  10. Santhosh Reddy Mandadi says:

    Very good Joe

  11. Amar Sannaik says:

    Nice explanation

  12. […] handling should be used judiciously. Earlier I wrote about fail fast vs fail safe and that article loosely relates to this context and my personal preference is to fail fast. […]

  13. Ganeshji Marwaha says:

    Nice article @Joseph. My take is that, you should fail fast during development and testing and fail safe during production. It is better not to expose the system failures to the user when you have an opportunity to substitute it with alternative actions. Coming to your analogy of the punctured tire. These days, cars are equipped with what is called as tubeless tires. Even if those tires are punctured, you can just refill the air and run the vehicle for even a couple of days before you fix the actual problem. I feel that this is an example of fail safe, as it does not make you looking for a puncture shop in the middle of nowhere, at the same time, gives you enough information and time to fix the issue. Again, it all depends on context. Nothing is separable from the context. In a different context, the opposite could be true.

    Overall, great article. Keep writing thought provoking technology articles like this more and more

  14. Joe says:

    Thank you so much for the valuable comment Ji and privileged to have you comment. Yes I agree with you and it depends on the context.

  15. Shrikant Kale says:

    Nice Explanation… but an example will add more advantage.

  16. Fayeq Ali Khan says:

    Excellent Effort Sir, its always a pleasure to learn from you. Thanks for your time and patience to write down these articles.

  17. Kazim says:

    Awesome!

  18. Mahesh says:

    Good article !!!

  19. Puja says:

    Good explanation ! Thanks Joe

  20. Ashish says:

    I really appreciate your articles.They are always very good and to the point explanation.

  21. Joe says:

    Thanks Ashish.

  22. Ravi.K says:

    Good evening Joe,

    This is the Art of explaning the things Effectively at the same time in Simpler manner.
    where can i find java programs using Collections concepts and helps in understanding more.

    Regards,
    Ravi.K

  23. anju says:

    hello

  24. anju says:

    hello sir

    r u working somewhere?

  25. ambuj says:

    Sir you should write a book on java because your explanation is so interesting and to the point.

  26. Niraj says:

    thanks for sharing knowledge. keep it up.

  27. venkat sarma says:

    Very nice article.

    Today I had an interview in that they asked how to prevent fail-fast I said use iterator but he is not satisfied. can you please explain.

  28. Binh Thanh Nguyen says:

    Thanks,nice post

  29. Anonymous says:

    superb

  30. […] modified concurrently. It may reflect the state when it was created and at some moment later. The fail-safe property is given a guarantee based on […]
