Testing in Production: A New Paradigm for Shift-Right


Michael T. Nygard states, “Too many production systems are like Schrodinger’s cat—locked inside a box, with no way to observe its actual state.”

Testing in production (TiP) lets a software development (Dev) and IT operations (Ops) team prepare for possible bugs. It’s also helpful in analyzing the user’s experience. However, it is essential to understand that testing in production is not releasing untested code in the hope that it works or waiting for the bugs to be detected once end-users use it.

Techniques for testing in production:

A/B testing
Canary deployments
Continuous monitoring

This article covers various facets of TiP and why this is a crucial tool in the paradigm of Shift-Right testing.

Great Expectations for Testing

It’s not easy capturing all defects in the development lifecycle or simulating a live, real-world environment. So Dev and quality assurance (QA) teams put valuable effort into white-box and black-box testing, exploratory testing, and automation testing.

In addition, they validate user flows in several environments, such as Dev, QA, and User Acceptance Testing (UAT). Dev and QA teams test these environments with mock test data, covering as many scenarios, corner cases, and out-of-the-box strategies as they can, but they cannot always anticipate what the end user intends to do.

Additionally, partially testing code can be dangerous as it does not prove that the code will function correctly in production. Further, it’s important not to wait for issues to arise for users.

A few case studies and surveys show that top companies constantly release new features to a fraction of their traffic to measure the impact. The figure below shows that most modern software development follows a lifecycle where developed code and testing propagate through increasing environment layers. Finally, the code deploys to production, and we say the product is “go-live” or released.

But how is this reflected in reality? Consider the following myths:

It works in all pre-production environments, and all testing types pass. So it will work in production.
I don’t always test my code, but when I do, I do it in production.

In today’s age of digital transformation, production systems run alongside other deployed software, so “production” will always have its own set of unique issues. Therefore, it is imperative to have TiP.

TiP – A few key characteristics

So what exactly is TiP, and how do we distinguish it from other tests that happen throughout the lifecycle? A few key characteristics of TiP are:

A set of tests that incorporate new changes with live traffic and a group of users.
Tests that analyze user experience including failures, sudden breakage, slow performance, usability, and acceptance.
Tests that impact the quality of the product by getting continuous feedback from the end-users.
An activity that iteratively and progressively increases customer trust in a product and raises expectations of it.

TiP – The various techniques

TiP, also known as Shift-right testing, continuously tests the product when it is in production or near-production. This approach helps software developers find unexpected scenarios that they did not detect previously and ensures the correct behavior and performance of the application.

The map below is a representative set of various shift-right techniques that span the TiP lifecycle. These techniques can effectively speed up the overall software release cycle:

1. Bug Bash

Despite the testing procedures, a few defects are likely to make it past the development phases and into production. There will always be hidden issues, no matter how thorough your testing is or how successful your automated tests are in development, and these bugs can impact end-users.

Bug Bash is one of the methods used by several companies to ensure product quality. Typically, all the internal stakeholders, content team, survey team, marketing team, product owners, etc., are part of the Bug Bash.

Before the product deploys to “live” clients, the latest updated code is put through its paces one more time to ensure everything is in working order. Then, for the Bug Bash, participants install the most recent version of the product, play around with the features, and provide feedback.

The core philosophy of a Bug Bash is to get other eyes, typically not those fully embedded into the software teams, on the product before releasing it to production users and ensure we haven’t overlooked anything.

There are eight steps for a successful Bug Bash:

Set a date and time.
Send invites.
Create teams.
Prepare scenarios.
Define the bug cycle process and template.
Start the hunt.
Award prizes for outstanding catches.
Wrap up the bash.

2. A/B Testing

A/B testing, also known as split testing or bucket testing, compares two versions of a website or app to see which performs better. It is essentially an experiment in which users are shown one of two or more variations at random, and statistical analysis determines which variation performs better against a specific conversion objective.
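The two halves of an A/B test, random assignment and statistical comparison, can be sketched in a few lines. This is a minimal illustration, not a replacement for a dedicated experimentation platform: the function names, the experiment label, and the conversion counts are all hypothetical, and the comparison assumes a standard two-proportion z-test with large samples.

```python
import hashlib
import math
from typing import Tuple


def assign_variant(user_id: str, experiment: str = "checkout-button") -> str:
    """Place a user in bucket 'A' or 'B' (roughly 50/50).

    Hashing the user id keeps the assignment stable across visits,
    so the same user always sees the same variation.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 5000 users converted on A, 260 of 5000 on B.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the variations is unlikely to be noise; the threshold to act on (commonly 0.05) is a choice the team makes before the experiment starts.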

3. Canary Testing

Canary testing is a powerful technique to test new features and functionality in production while causing the least amount of disruption to users. The words canary testing and canary deployment are interchangeable in this context.

A canary release is a software testing strategy for reducing the risk of releasing a new software version into production by progressively distributing the update to a small subset of users before releasing it to the whole platform. Blue-green deployments, feature flags, and dark launches are related release techniques that are often mentioned alongside canary releases, although each works somewhat differently.
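A progressive rollout of this kind has two moving parts: deciding which users see the new version, and deciding whether to widen the rollout or pull it back. The sketch below illustrates both under stated assumptions. The function names, the rollout steps (1, 5, 25, 50, 100 percent), and the error-rate tolerance are hypothetical choices for illustration, not values prescribed by any tool.

```python
import hashlib

ROLLOUT_STEPS = [1, 5, 25, 50, 100]  # hypothetical percentages


def in_canary(user_id: str, rollout_percent: int) -> bool:
    """Route a stable subset of users to the new version.

    Each user hashes to a fixed bucket from 0 to 99, so users who were
    in the canary at 5% stay in it when the rollout widens to 25%.
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_percent


def next_rollout_step(current_percent: int,
                      canary_error_rate: float,
                      baseline_error_rate: float,
                      tolerance: float = 0.01) -> int:
    """Widen the rollout if the canary looks healthy, else roll back to 0."""
    if canary_error_rate > baseline_error_rate + tolerance:
        return 0  # canary is worse than baseline: roll back
    for step in ROLLOUT_STEPS:
        if step > current_percent:
            return step
    return 100  # already fully rolled out


print(next_rollout_step(5, canary_error_rate=0.002, baseline_error_rate=0.002))
print(next_rollout_step(25, canary_error_rate=0.050, baseline_error_rate=0.002))
```

The first call widens the rollout because the canary matches the baseline; the second rolls back because the canary's error rate exceeds the baseline by more than the tolerance.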

4. Destructive Testing

Destructive testing is a software assessment method used to find the points at which a program fails. Essentially, the process entails deliberately misusing the software, such as entering data that is inaccurate or in the wrong format, to see whether the application would fail if an end-user made that mistake.

The term is borrowed from materials engineering, where destructive testing (DT) breaks down a material to determine its physical attributes, such as strength, flexibility, and hardness.
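In software, the idea of entering inaccurate or badly formatted data can be sketched as a small harness that feeds bad input to a handler and checks that each one is rejected cleanly. Everything here is hypothetical: `parse_quantity` stands in for whatever input handler is under test, and the list of bad inputs is only a starting point.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical input handler: accepts whole numbers from 1 to 1000."""
    text = raw.strip()
    if not text.isdigit():
        raise ValueError(f"not a whole number: {raw!r}")
    value = int(text)
    if not 1 <= value <= 1000:
        raise ValueError(f"out of range: {value}")
    return value


# Inputs an end-user (or an attacker) might plausibly type by mistake.
BAD_INPUTS = ["", "   ", "-5", "3.14", "1e3", "ten", "0", "1001",
              "12; DROP TABLE orders", "9" * 500]


def run_destructive_checks() -> list:
    """Return the inputs that were NOT rejected cleanly (empty means pass)."""
    failures = []
    for raw in BAD_INPUTS:
        try:
            parse_quantity(raw)
            failures.append(raw)  # bad data was accepted
        except ValueError:
            pass  # rejected with the expected, user-facing error
        except Exception:
            failures.append(raw)  # crashed with an unexpected error
    return failures


print(run_destructive_checks())
```

The harness distinguishes between a clean rejection and an unexpected crash, because only the former is acceptable behavior under improper usage.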

5. Fault Injection Testing

Fault injection testing is a type of software testing that intentionally introduces defects into a system to ensure that it can withstand and recover from them. It is commonly performed before deployment, generally under stress conditions, to identify flaws that could otherwise surface in production.

Fault injection testing in software is completed either at compile-time or during runtime. A compile-time injection is a testing technique that involves changing the source code to simulate software system flaws. Modifications or mutations to existing code, such as changing a line of code to reflect a different value, can be used to accomplish these changes. Additionally, testers can modify code by adding or inserting new code, such as additional logic values.
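Runtime injection, the counterpart to the compile-time approach, can be sketched by wrapping a dependency so that a fraction of calls fail while the program is running. This is a minimal sketch: `fetch_price`, `get_price_with_fallback`, and the `InjectedFault` exception are hypothetical names, and real fault injection tools operate at the network or infrastructure level rather than inside a single process.

```python
import random


class InjectedFault(ConnectionError):
    """Raised in place of a real dependency failure."""


def with_fault_injection(func, failure_rate: float, rng: random.Random):
    """Wrap func so that roughly failure_rate of calls fail at runtime."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("simulated dependency outage")
        return func(*args, **kwargs)
    return wrapper


def fetch_price(sku: str) -> float:
    """Hypothetical downstream call that normally succeeds."""
    return 9.99


def get_price_with_fallback(fetch, sku: str,
                            retries: int = 3, fallback: float = 0.0) -> float:
    """Code under test: retry, then degrade to a fallback instead of crashing."""
    for _ in range(retries):
        try:
            return fetch(sku)
        except ConnectionError:
            continue
    return fallback


rng = random.Random(42)  # seeded so that a run can be repeated
always_down = with_fault_injection(fetch_price, failure_rate=1.0, rng=rng)
print(get_price_with_fallback(always_down, "sku-1"))  # falls back, no crash
```

The point of the experiment is the last line: with the dependency fully unavailable, the caller should degrade to its fallback rather than propagate the failure to the user.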

Chaos engineering applies fault injection to production systems as a deliberate practice. Faults are injected in controlled chaos experiments, and the results show how the system behaves under failure. If fault injection is a method of introducing failure, chaos engineering is a strategy for using fault injection to build more dependable systems.

Chaos testing has grown in popularity as a way to verify software quality while the software is running in production. Many firms have benefited from this relatively new method, which has changed how we assess software robustness. Chaos testing is a form of resilience testing that is most often applied to cloud-based systems. Because today’s systems are so distributed, they require a high level of fault tolerance, and evaluating that tolerance calls for a different approach to testing.

Chaos testing, popularized by Netflix, is a method of deliberately causing failures in an application in production. The Netflix engineering team developed Chaos Monkey, one of the first chaos testing tools. Chaos Monkey creates faults by disabling nodes in the production network, the live network that serves movies and shows to Netflix users.
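The idea of disabling nodes at random can be illustrated with a toy simulation. This is not Chaos Monkey's API or implementation; the `Cluster` class and `unleash_monkey` function are hypothetical stand-ins that show only the principle that a service with redundant nodes should keep answering requests when one of them disappears.

```python
import random


class Cluster:
    """Toy stand-in for a group of redundant service nodes."""

    def __init__(self, node_names):
        self.healthy = {name: True for name in node_names}

    def handle_request(self) -> str:
        """Serve from the first healthy node; fail only if none are left."""
        for name, is_up in self.healthy.items():
            if is_up:
                return f"served by {name}"
        raise RuntimeError("total outage")


def unleash_monkey(cluster: Cluster, rng: random.Random) -> str:
    """Disable one randomly chosen healthy node and return its name."""
    candidates = [name for name, is_up in cluster.healthy.items() if is_up]
    victim = rng.choice(candidates)
    cluster.healthy[victim] = False
    return victim


cluster = Cluster(["node-a", "node-b", "node-c"])
victim = unleash_monkey(cluster, random.Random())
print(f"disabled {victim}; {cluster.handle_request()}")
```

If `handle_request` raised after a single node was disabled, the experiment would have exposed a missing failover path, which is exactly the kind of weakness chaos testing is meant to surface.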

6. User Acceptance Testing

User acceptance testing (UAT) is a testing phase in which the target audience tests the product in the "real world." UAT is usually the final stage of the software testing process, conducted before distributing the tested program to its target market. The goal of UAT is to ensure that the software is fit for its purpose.

UAT is an excellent way to confirm that the time and money spent on the program have produced a quality product, while boosting transparency with software users. UAT also lets developers work with real-world scenarios and data and, if successful, confirms that the software meets the business requirements.

Best Practices for Testing in Production

Smaller, more frequent releases are the norm in today’s agile world. Although such techniques reduce some dangers, the high frequency raises the chances of releasing vulnerable code into the world. If done correctly, however, testing in production can improve the effectiveness of the app testing strategy.

We have curated a list of best practices for TiP:

You must always use real browsers and devices, which may seem obvious, but it is essential to note. The production environment must consist of an actual device, browser, and operating system combination. It’s impossible to judge the software’s performance without putting it in a real-world setting because no emulator or simulator can accurately simulate real-world user conditions.
Timing is everything. Run tests when traffic is heavy. A genuinely successful application should perform flawlessly even when it is under a heavy load. Since production testing aims to find flaws in the actual world, it must occur under the most challenging conditions.
Bring in a chaos monkey. Netflix engineer Cory Bennett states, “We have found that the best defense against major unexpected failures is to fail often. By frequently causing failures, we force our services to be built in a way that is more resilient.” A chaos monkey randomly throws failures into production, forcing engineers to develop recovery systems and more robust, adaptable bug resolution practices.
Monitor continuously. This is necessary to see what is going on with the servers or databases. For successful production testing, monitoring is an absolute must. Keep an eye on critical user performance metrics when running a production test to see whether the test has an undesirable impact on the user experience.
Prioritize the defects reported by end-users. If they require immediate attention, strive to fix them as soon as possible; if they take time, have suitable replies with proof to inform the end-user. Accepting our flaws isn’t a bad thing.
Allow users to engage in exploratory production testing if at all possible. By clearly telling people about new features and releases, you can solicit their input. This makes it feasible to collect end-user insight without disrupting their experience. In addition, users will be less surprised or upset by bugs if they know that they are beta-testing a product.
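The advice above to keep an eye on critical user performance metrics during a production test can be expressed as a simple guardrail. The sketch below compares 95th-percentile latency during a test against a baseline window. The function names, the choice of p95, and the 20% regression threshold are assumptions made for illustration; a real setup would read these samples from a monitoring system.

```python
import math


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def guardrail_breached(baseline_ms, during_test_ms,
                       max_regression: float = 0.20) -> bool:
    """True if p95 latency during the test exceeds the baseline p95
    by more than max_regression (0.20 means 20%)."""
    baseline_p95 = percentile(baseline_ms, 95)
    current_p95 = percentile(during_test_ms, 95)
    return current_p95 > baseline_p95 * (1 + max_regression)


# Hypothetical latency samples in milliseconds.
baseline = [100] * 100
during_test = [100] * 90 + [500] * 10  # the slowest 10% of requests got worse

if guardrail_breached(baseline, during_test):
    print("Guardrail breached: pause the production test and investigate.")
```

Tail percentiles are used here instead of the mean because a production test that harms a minority of users can leave the average almost unchanged.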

We have also put together a table of tools and libraries that will aid your efforts in Testing in Production.

Scenario	Tools/Methods	Benefits
Bug Bash	Communication channels such as Slack, Zoom, and Skype; rooms to discuss product features and defects.	Catches more bugs of several types. Cuts regression testing time. Incorporates other people’s experiences and perspectives. Encourages collaboration across teams.
A/B testing	Google Analytics and Google Optimize, Optimizely, Visual Website Optimizer (VWO)	Reduces bounce rates. Helps to increase conversion rates. Results are easy to understand. Inexpensive. Increases sales.
Canary testing	-	Reduces risk, because only a small percentage of users are affected by a bug. Increases assurance when releasing new features at a faster rate. Allows a feature to be removed quickly if it is defective, slows down the application, or causes negative user feedback. Shortens the feedback loop by bringing new features to production sooner.
Destructive testing	Alpha/beta testing, regression testing, equivalence partitioning, acceptance testing	Helps to check the robustness of a software product. Helps to understand how the software behaves under improper usage.
Fault injection testing and chaos engineering	Chaos Monkey, Gremlin, Chaos Toolkit	Improves the resilience of a system. Prevents significant revenue losses by avoiding lengthy outages. Helps improve incident response. Improves service availability and durability.
User acceptance testing (UAT)	FitNesse, Watir, Usersnap	Provides a finished result that is satisfactory. Assists in the delivery of a bug-free final product. Provides users a finished product that is in good operating order. Fulfills all functionalities that a finished product should have.

Benefits of Shift-Right

Reduces the risk associated with continuous delivery.
Assures customers that the product is ready for production.
Permits engineers to add, remove, or change features based on feedback.
Increases the efficiency of software.
Supports the distribution of products more quickly.
Captures problems before the end-users see them.

Conclusion

Production testing is increasingly becoming an unavoidable aspect of the testing process. Without genuine user experience, it is hard to predict and fix all defects when millions of people access a single piece of software from thousands of different devices, browsers, browser versions, and operating systems.

As a result, DevOps-aligned developers and organizations benefit greatly from production testing. It helps to improve user experience, protect brand reputation, and increase income by allowing developers to be better prepared for dealing with abnormalities.

Production testing is, without a doubt, an essential part of software development in today's world.

Since over 90% of software companies adopt agile methodologies, the number of production releases has increased. Unfortunately, each production release can change how things work in the real world. Therefore, DevOps teams must check all modifications in production as early as possible to ensure the reliability of any program. It can be detrimental to a software company's reputation if customers discover these issues before the company does.

References