Tech Talks ~ The Importance of Reliability

By Nishadha

I was the second presenter in the Tech Talk session series at Cinergix. I chose to focus my presentation on a QA Quality Factor known as “Reliability”. Generally, describing something or someone as “reliable” suggests that it is trustworthy and dependable. Reliability is time-bound: it implies successful operation over a certain period of time.

Reliability in software can be defined as “the probability of a computer program performing its intended functions, without any failures for a specified time under a specified environment”. For web applications such as Creately, reliability is an important Quality Factor that needs to be considered. Achieving reliability will give you benefits in the areas of:

Customer Satisfaction – An unreliable product will severely hurt customer satisfaction, so high reliability is a mandatory requirement.

Repeat Business – Customers will return to a website that is reliable, which has a positive impact on future business.

Reputation – When a product is reliable the company will have a favorable reputation

Competitive Advantage – Companies can publish their predicted reliability numbers to gain an advantage over competitors who either do not publish their numbers or have lower numbers.

Warranty Costs – Reliable products fail less frequently during the warranty period, which lowers repair and replacement costs as well as refunds.

Cost Analysis – Reliability data can be used for cost analysis. It can show that although the initial cost of a reliable product might be higher, its overall lifetime cost is lower than a competitor’s because it requires fewer repairs or less maintenance.

To achieve reliability in software, activities can be followed in the areas of:

1. Error prevention

2. Fault detection and removal

3. Measurements to maximize reliability, specifically measures that support the first two activities

1. Error prevention activities

Many error prevention activities contribute to reliability. It would be impossible to cover the whole spectrum in this post alone, so I will mention a few to give you the gist of what these activities entail. Ideally, your requirements should clearly and accurately specify the functionality of the final product. Moreover, you have to follow proper coding standards, perform regular code reviews for correctness and safety, and unit test the modules independently. You should also perform load testing to determine the system’s behavior under both normal and anticipated peak load conditions, and regression testing after additions or modifications to ensure that the existing functionality still works as before.
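To make unit and regression testing a little more concrete, here is a minimal sketch in Python using the standard unittest module. The apply_discount function and its rules are purely hypothetical, invented for this illustration; they are not taken from any real product.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical module under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    # Unit test: exercises the module on its own, with no other dependencies.
    def test_regular_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    # Unit test: invalid input must be rejected rather than silently accepted.
    def test_rejects_invalid_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(200.0, 120)

    # Regression test: pins behavior that must stay the same after later
    # additions or modifications (here, a zero discount changes nothing).
    def test_zero_discount_keeps_price(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)


if __name__ == "__main__":
    unittest.main()
```

Running the same suite after every change is what turns these tests into a regression check.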

2. Fault detection and removal activities

There are two aspects that need to be considered here – Software Testing & Software Inspection.

3. Measurements

Reliability metrics are units of measure for system reliability. System reliability is measured by counting the number of operational failures and relating these to the demands made on the system at the time of failure. Here you need to consider both Static Code Metrics (which give information at the code level) and Dynamic Metrics (which give information about actual runtime behavior). Examples of Static Code Metrics are Source Lines of Code (SLOC), Number of Modules, Number of Go To Statements, Number of Classes and Weighted Methods per Class (WMC). An example of a Dynamic Metric is Failure Rate Data such as:

– Probability of Failure on Demand (POFOD)
POFOD = 0.001 (the service fails for one in every 1000 requests)
– Rate of Occurrence of Failures (ROCOF)
ROCOF = 0.02 (two failures for every 100 operational time units)
– Mean Time to Failure (MTTF)
Average time between observed failures (for example, an MTTF of 50 means a failure is expected every 50 operational time units)
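As a rough illustration, the three failure rate metrics above could be computed from operational counts as follows. This is a minimal sketch: the OperationLog structure and its field names are assumptions made for this example, and the sample numbers are chosen only to reproduce the POFOD and ROCOF values quoted above.

```python
from dataclasses import dataclass


@dataclass
class OperationLog:
    """Hypothetical summary of one observation period."""
    requests: int          # total service requests (demands) observed
    failed_requests: int   # requests that ended in a failure
    operating_time: float  # total operational time, in any consistent unit
    failures: int          # failures observed during operating_time


def pofod(log):
    """Probability of Failure on Demand: failed requests per request."""
    if log.requests == 0:
        raise ValueError("no requests observed")
    return log.failed_requests / log.requests


def rocof(log):
    """Rate of Occurrence of Failures: failures per operational time unit."""
    if log.operating_time == 0:
        raise ValueError("no operating time observed")
    return log.failures / log.operating_time


def mttf(log):
    """Mean Time to Failure: operational time units per observed failure."""
    if log.failures == 0:
        raise ValueError("no failures observed, MTTF is undefined")
    return log.operating_time / log.failures


# Sample figures matching the examples in the list above.
sample = OperationLog(requests=1000, failed_requests=1,
                      operating_time=100.0, failures=2)
print(pofod(sample), rocof(sample), mttf(sample))
```

Note that in this simple model MTTF is just the reciprocal of ROCOF, so two failures per 100 time units gives an MTTF of 50 time units.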

Problem Reports

When talking about problem reports, it is imperative that you use error logs & access logs to determine the following:

– Date of occurrence, nature of failures, consequences

– Type of faults, fault location
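Pulling these details out of an error log could look something like the sketch below. The log line format (timestamp, level, location in square brackets, message) and the sample lines are assumptions invented for this example; a real server log will almost certainly need a different pattern.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line format: "YYYY-MM-DD HH:MM:SS LEVEL [location] message"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<location>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def parse_failures(lines):
    """Return one record per ERROR line: date, fault location and nature."""
    failures = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None or match.group("level") != "ERROR":
            continue  # skip unparseable lines and non-error entries
        failures.append({
            "occurred_at": datetime.strptime(
                match.group("timestamp"), "%Y-%m-%d %H:%M:%S"),
            "location": match.group("location"),
            "nature": match.group("message"),
        })
    return failures


def failures_by_location(failures):
    """Count failures per fault location to show where problems cluster."""
    return Counter(failure["location"] for failure in failures)


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    "2024-01-05 10:15:02 INFO [export] export finished",
    "2024-01-05 10:17:45 ERROR [export] timeout while rendering diagram",
    "2024-01-05 11:02:13 ERROR [login] session store unreachable",
    "2024-01-06 08:30:00 ERROR [export] timeout while rendering diagram",
]

if __name__ == "__main__":
    for location, count in failures_by_location(parse_failures(SAMPLE_LOG)).items():
        print(location, count)
```

Grouping by location is only one possible view; the same records could just as easily be grouped by date or by the nature of the failure.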

So there you have it. I hope this rather techy blog post serves as a good starting point for assessing your site or app with regard to reliability. I cannot stress enough that reliability is one of the cornerstones of building a great site or app. If you have any queries, comments or complaints, please do go ahead and let us know.

References:

– Linda Rosenberg, Ted Hammer, and Jack Shaw. Software Metrics and Reliability. IEEE International Symposium on Software Reliability Engineering, 1998.
– http://swreflections.blogspot.com/2009/08/lessons-in-software-reliability.html
– http://www.tectrends.com/tectrends/article/00172844.html
– http://www.eweek.com/c/a/Enterprise-Applications/Measuring-SAAS-Reliability/

Comments

  • On performing all the cost benefit analysis, it occurs that if reliability is maintained right from the beginning lot of cost could be saved.

  • Quality Assurance is important in each and every aspect as to get better and satisfactory product as the end, Quality assurance is like the insurance of your product.