How to do SMART Regression Test?

Regression testing is a critical part of any software development phase; it can’t be skipped or overlooked if you want to deliver high-quality products to your customers. After multiple enhancements and bug fixes, only a regression test can confirm that the application is still stable and good enough to release into production. To ensure there are no side effects from new code, the regression test must be carefully crafted so that it delivers maximum coverage with minimal effort. That is the primary reason why your regression test has to be SMART.

“~50% of production defects leaked by QA team are due to inadequate regression tests”

Did you know that about 50% of the production defects leaked by a QA team can be traced to inadequate regression test coverage? It may surprise you, but a thorough root cause analysis of the defects leaked by your QA team is likely to confirm it. If it holds true for your QA team, how do you build the right regression test? Building it is challenging because multiple factors need to be considered in planning a regression test with adequate coverage.

Running as many test cases as possible through automation is not the SMART way of doing a regression test. In today’s Agile delivery model, finding the time for a thorough regression is nearly impossible. To address these challenges, I am going to explain the key principles and factors to consider in building a SMART Regression Test.

“Smart Regression is NOT running as many tests as possible through automation”

10 Key factors for performing Smart Regression Test:

1. Coverage:

Test coverage is a very important factor in deciding the quality of your regression test. To ensure adequate coverage, follow the guidelines below.

Coverage by Functionality/Feature – Does your regression test cover all the functionalities/features in the application? Do you ensure that regression tests are added or modified in the regression suite whenever a functionality is added or modified? List the functionalities of the application and map each regression test case to the appropriate functionality, so that you can confirm regression coverage for every functionality in the application.

Coverage by critical user behaviors/paths/flows – Do you understand the critical behaviors/paths in each functionality? Does your regression test cover all the critical paths/flows in the application? From the System Test cases designed for a functionality, identify the critical path scenarios and make sure they are added to the regression test suite.

Coverage by critical data combinations – Do you have regression tests that cover the critical data combinations needed to validate the business logic implemented in each functionality? It is important that the test data combinations used for the regression test are generated using systematic methods (e.g. the pair-wise test combination method). This lets you validate the stability of the application by running a minimal number of tests with maximum coverage. There are multiple tools on the market that can help you derive the right test data combinations without redundancy. A few tools of my choice: Hexawise (licensed), Microsoft PICT (open source).
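To make the pair-wise idea concrete, here is a minimal sketch in Python. It is a naive greedy selection, not the algorithm used by Hexawise or PICT, and the parameter names and values (`browser`, `role`, `payment`) are hypothetical. It keeps picking the combination that covers the most still-uncovered value pairs until every pair of values across any two parameters appears in at least one test.

```python
from itertools import combinations, product


def pairwise_tests(parameters):
    """Greedy all-pairs selection over a dict of {parameter: [values]}."""
    names = list(parameters)
    # Every value pair (across any two parameters) that must appear in some test.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    # Candidate tests: the full cross-product (fine for small models only).
    candidates = [
        dict(zip(names, values))
        for values in product(*(parameters[n] for n in names))
    ]

    def covers(row, pair):
        (a, va), (b, vb) = pair
        return row[a] == va and row[b] == vb

    selected = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda row: sum(covers(row, p) for p in uncovered))
        selected.append(best)
        uncovered = {p for p in uncovered if not covers(best, p)}
    return selected


# Hypothetical test model: 3 x 2 x 2 = 12 exhaustive combinations.
PARAMETERS = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "role": ["member", "client"],
    "payment": ["card", "wallet"],
}

tests = pairwise_tests(PARAMETERS)
for test in tests:
    print(test)
```

For this small model the selection needs fewer rows than the 12 exhaustive combinations while still exercising every pair of values. Dedicated tools handle constraints and larger models far better than this sketch.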

Identifying the right regression tests from the existing System Test cases is a skill and has to be done thoughtfully, considering the above three factors (functional coverage, path coverage and data coverage). Don’t just dump your System Test cases into the regression suite; that leads to inadequate or redundant coverage. Keep a proper balance when adding new test cases to the regression suite, and first consider whether an existing regression test case can be modified to cover the new enhancement. Adding lots of new test cases and inflating the test count is not a smart way to manage your regression test.
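The mapping of regression test cases to functionalities can be checked with a few lines of Python. This is only a sketch: the functionality list and the test-to-functionality mapping below are hypothetical, and in practice they would come from your test management tool.

```python
from collections import defaultdict

# Hypothetical inventory of application functionalities.
FUNCTIONALITIES = ["Login", "Registration", "Payment"]

# Hypothetical mapping of regression test case -> functionality it covers.
REGRESSION_SUITE = {
    "TC-01": "Login",
    "TC-02": "Login",
    "TC-03": "Registration",
}


def coverage_by_functionality(functionalities, suite):
    """Return {functionality: [test ids]} including functionalities with no tests."""
    mapped = defaultdict(list)
    for test_id, functionality in suite.items():
        mapped[functionality].append(test_id)
    return {f: sorted(mapped.get(f, [])) for f in functionalities}


coverage = coverage_by_functionality(FUNCTIONALITIES, REGRESSION_SUITE)
gaps = [f for f, test_ids in coverage.items() if not test_ids]

print(coverage)
print("Functionalities without regression tests:", gaps)
```

A report like this measures the suite by what it covers rather than by how many test cases it holds.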

Quote for thought: “Measure the quality of your regression test by coverage and not by volume of test case count”

2. Classification:

Having the right classification in your regression suite is an important practice that supports proper identification, selection and reporting of your regression tests. A few guidelines on how you can classify your regression suite:

By Application, by Functionality/Feature, by Priority (P1/P2/P3), by Sub-functionality (if any), by business criticality of functionality (High, Medium, Low), by Execution mode (Manual/Auto)

As mentioned under the first factor (Coverage), once you have identified the critical and non-critical paths in your application/functionality, ensure that your regression tests are classified by test priority (P1/P2/P3).

Example for Classification of regression tests:

Reg. Test Case ID	Application	Functionality	Business Criticality	Test Priority	Exec. Mode	# Defects identified (in Past Reg. Test)
TC-01	Member	Login	High	P1	Automation	0
TC-02	Member	Login	High	P2	Automation	3
TC-03	Client	Registration	Medium	P1	Manual	5
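A classified suite like the one in the table can be held in a simple data structure so that tests are easy to select and report on. The sketch below uses the three rows from the table; the field names and the `select` helper are my own illustrative choices, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionTest:
    case_id: str
    application: str
    functionality: str
    business_criticality: str  # High / Medium / Low
    priority: str              # P1 / P2 / P3
    exec_mode: str             # Manual / Automation
    past_defects: int          # defects identified in past regression runs


# The three example rows from the classification table.
SUITE = [
    RegressionTest("TC-01", "Member", "Login", "High", "P1", "Automation", 0),
    RegressionTest("TC-02", "Member", "Login", "High", "P2", "Automation", 3),
    RegressionTest("TC-03", "Client", "Registration", "Medium", "P1", "Manual", 5),
]


def select(suite, **criteria):
    """Return the tests whose fields match every given criterion."""
    return [
        test for test in suite
        if all(getattr(test, field) == value for field, value in criteria.items())
    ]


p1_tests = [t.case_id for t in select(SUITE, priority="P1")]
automated_login = [t.case_id for t in select(SUITE, functionality="Login", exec_mode="Automation")]

print(p1_tests)
print(automated_login)
```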

Quote for thought: “A well classified regression test is always easy to manage & maintain”

3. Impacts

Impact assessment is the most important factor in identifying a smart regression test. The QA team can’t do impact assessment in a silo; it needs collaboration with the Business and Dev teams to ensure a thorough assessment of the various impacted areas in the application.

I will classify the application impact areas into two categories.

Functional impact – Which existing application functionalities are directly or indirectly impacted by the introduction of a new or enhanced functionality? To do this impact assessment, it is important for QA to have a ‘Functional Interaction Matrix’ or ‘Impact Assessment Matrix’ for the application in scope. What is a ‘Functional Interaction Matrix’? It is a simple 2-dimensional matrix which shows the interaction between the different functionalities in the application.

Note:

* To interpret, read from the row label on the left to the column label on top
* As in the above example, an interaction matrix can also be drawn up between two different applications that depend on or relate to each other

Interpretation of above Functional Interaction Matrix:

- Functionality 1 interacts with Functionality 2 & 3
- Functionality 2 interacts with Functionality 1
- Functionality 3 interacts with Functionality 2
- Functionality 4 interacts with Functionality 3

By using the above interaction matrix one can understand the impacted areas and identify the regression suite based on the impact of new/enhanced functionalities introduced into the application.

In the above approach, the QA team has to review the identified areas with the Business and Dev teams to ensure that the impact assessment is accurate and that no impact areas needing regression tests are missed.
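The interactions listed in the interpretation above can be encoded and queried in a few lines of Python. This is a sketch under assumptions: the short names `F1` to `F4` stand for Functionality 1 to 4, and treating an interaction as an impact in both directions is my own simplification that should be confirmed with the Business and Dev teams.

```python
FUNCTIONS = ["F1", "F2", "F3", "F4"]

# Row functionality -> functionalities it interacts with (from the interpretation list).
INTERACTS = {
    "F1": {"F2", "F3"},
    "F2": {"F1"},
    "F3": {"F2"},
    "F4": {"F3"},
}


def as_matrix(functions, interacts):
    """Build the 2-dimensional matrix: rows on the left, columns on top, 1 = interacts."""
    return [
        [1 if col in interacts.get(row, set()) else 0 for col in functions]
        for row in functions
    ]


def impacted_by(changed, interacts):
    """Functionalities to consider for regression when `changed` is modified.

    Assumption: impact flows both ways, so we include what `changed`
    interacts with and everything that interacts with `changed`.
    """
    direct = set(interacts.get(changed, set()))
    reverse = {f for f, targets in interacts.items() if changed in targets}
    return sorted((direct | reverse) - {changed})


for name, row in zip(FUNCTIONS, as_matrix(FUNCTIONS, INTERACTS)):
    print(name, row)

print("Change in F3 impacts:", impacted_by("F3", INTERACTS))
```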

Technical Impact – To understand the impacts created by technical changes in the underlying code/configuration, the QA team has to work closely with the Dev team to understand the impacted functionalities. For example, any enhancement or bug fix in a critical component of the code structure will have a direct or indirect impact on certain functionalities. So, the QA team needs to engage with the Dev team to get more insight into the technical impacts and identify the functionalities that need regression testing.

Once you have a clear understanding of the impacted functionalities that need regression testing, your regression will be focused and targeted rather than generic. This is a key step in defining a smart regression test.

Quote for thought: “If you know the impact well then your regression is right on target”

4. Historical

Insight from historical data is an important factor in identifying a smart regression test. Historical data can come from multiple sources, as listed below.

Past Regression defects – Do you track the defects found in past regression tests? Functional areas that were error-prone in the past are good candidates for regression testing. Historical regression test results should help to identify risky/brittle functionalities. Based on the distribution of historical defects in your application, classify the features/functionalities into High-Risk/Medium-Risk/Low-Risk areas. This is a key input for identifying risky areas that are not directly impacted by the introduction of new functionalities.

Past Production leaked defects – Do you track the production defects leaked by the QA team? This list carries vital information on why each defect escaped the QA team’s quality gates. Make sure you track these defects and do a thorough RCA (Root Cause Analysis) to find the gaps in the QA process. Based on this information, include specific tests in the regression test that simulate the leaked production defect scenarios.

Defects from ST (System Test) phase – It is also important to look at functionalities where more defects were uncovered during the ST phase and which have gone through multiple defect fix cycles. These are potential risk areas that can fail in the regression test due to code breaks. Do consider this factor for your regression test scope.

Defects from UAT (User Acceptance Test) phase – This is another good area to focus on to identify gaps in the QA test and improve the regression test. In most teams, UAT is done by Business users who are functional experts. If the UAT team finds unique defects that QA did not find in System Testing, we need to do an RCA of those defects to identify the gaps and include the corresponding tests in our regression test.

Changes introduced in past few releases – It is also important to track the changes introduced in the past few releases. Such functionality is potentially unstable because it is relatively new to the product and hasn’t been exercised many times in past regressions. New functionalities from recent releases can therefore be high-risk candidates for the regression test.
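The classification of functionalities into risk areas from historical defects can be sketched as follows. The defect log and the thresholds (`high=5`, `medium=2`) are hypothetical; every team will need to calibrate them against its own defect distribution.

```python
from collections import Counter

# Hypothetical defect log: (functionality, phase where the defect was found).
DEFECT_LOG = [
    ("Registration", "regression"),
    ("Registration", "regression"),
    ("Registration", "production"),
    ("Registration", "UAT"),
    ("Registration", "ST"),
    ("Login", "regression"),
    ("Login", "ST"),
    ("Login", "UAT"),
    ("Search", "ST"),
]


def classify_risk(defect_log, high=5, medium=2):
    """Bucket each functionality by its historical defect count (thresholds are assumptions)."""
    counts = Counter(functionality for functionality, _phase in defect_log)

    def level(count):
        if count >= high:
            return "High"
        if count >= medium:
            return "Medium"
        return "Low"

    return {functionality: level(count) for functionality, count in counts.items()}


risk = classify_risk(DEFECT_LOG)
print(risk)
```

High-risk functionalities found this way are candidates for the regression scope even when the current release does not touch them directly.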

Quote for thought: “You may forget the history, but always remember & carry the key lessons from the history”

5. New Vs Existing

Should the regression test consist only of tests for existing functionality, or can it also contain tests for new functionality?

In my view, the regression test should include tests for existing functionality and for new functionality that is ready for go-live (in a done state). We need to remember that the goal of the regression test is to confirm the stability of the application. In most cases, we run the regression test towards the end (after completing Sprint testing, System testing, or the testing for a release scope). So while identifying the regression scope, we have to draw on existing/established functionalities and also on the new User Stories/enhancements for which system testing is complete (waiting for go-live). Validating both new and existing functionality in the regression test gives good confidence in the overall stability of the application.

Quote for thought: “Don’t overlook the new when you are too focused on the old/existing ones”

6. Execution Mode

Once we have decided on the regression scope, we need to plan the execution mode (Manual or Automation). Regression tests are always the first choice for test automation, and one should aim to run as much of the selected regression scope as possible through automated execution. At the same time, we need checks and balances on the outcome of automated execution. It is a common notion in QA teams that automated tests find few or no defects. The problem is not with automation or the tool used; rather, we need to evaluate all the factors explained in this article to ensure there are no gaps in regression test identification.

In addition, my recommendation is to do some limited manual testing (exploratory tests and out-of-the-box scenarios) to check that nothing has escaped the quality gates of the regression test.

Quote for thought: “Speed is great, but sometimes walking will help you find hidden mysteries”

7. Iterate

Just as Agile methodology constantly evolves the product through iteration and frequent feedback, we need to constantly evolve our regression test based on changing application requirements, newly introduced features and historical data, so that it meets all the coverage parameters and stays on par with the application changes. The regression suite should never be stagnant; there should be a continuous effort, on a monthly or per-release basis, to enhance the regression test bed. This is a key factor in sustaining a smart regression test.

Quote for thought: “Don’t be stagnant for long time, it will stink later”

8. Phasing

Running the regression test in the right phase of software development is very important for a smart regression test. Usually, regression tests are run only in the System Test phase, which may seem like a good idea because all the code is available for test with complete system integration. But finding defects late in the software development cycle is very costly and leads to a lot of rework and potential delays to the release timeline. So, to do a smart regression test, we need to shift our regression approach left, to the early phases of development, to find defects early in the cycle. We can identify a subset of our regression suite (Priority 1 tests), classify it as a Sanity/Smoke test, and run it during the development phases to validate stability as developers incrementally deliver new code into the system. Automation will be a key player in the shift-left regression approach. This gives developers quick feedback on the impact of new code and enables them to fix issues early in the development cycle.

Quote for thought: “Start as early as possible, so that you can finish well”

9. Timing

Timing the regression is as important as phasing it. Most teams plan the regression test only at the end of System Testing or just before the release. But, as mentioned earlier, finding lots of regression defects at the end of the System Test phase creates a lot of pain and rework. My recommendation is to run continuous regression in parallel throughout System Test. If the regression test is automated, this approach is easy to adopt.

Key recommendations on timing:

Run Sanity/Smoke test (Subset of Regression [P1 tests]) – Immediately after major code deployments. This can be done along with your In-Sprint testing in each Sprint.

Run parallel regression test during System Test phase – Run a minimal, targeted regression test based on the impact assessment of new code deployments. This can be done along with your In-Sprint testing in each Sprint.

Run extended regression at end of System Test phase – This could be after Sprint +1 or before release. Based on the time available, one can plan for extended regression (full scope) or targeted regression (focused on impacted areas).
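The three timing recommendations above can be sketched as a single scope-selection function. The stage names, the tuple layout and the extra `TC-04` row are hypothetical; the point is only that one classified suite can serve the smoke run, the targeted run and the extended run.

```python
# Hypothetical suite entries: (test id, functionality, priority).
SUITE = [
    ("TC-01", "Login", "P1"),
    ("TC-02", "Login", "P2"),
    ("TC-03", "Registration", "P1"),
    ("TC-04", "Payment", "P3"),
]


def regression_scope(suite, stage, impacted=frozenset()):
    """Pick the regression scope for a given point in the release cycle."""
    if stage == "after_deployment":
        # Sanity/Smoke run: P1 subset only.
        return [test_id for test_id, _func, priority in suite if priority == "P1"]
    if stage == "during_system_test":
        # Targeted run: only functionalities flagged by the impact assessment.
        return [test_id for test_id, func, _priority in suite if func in impacted]
    if stage == "end_of_system_test":
        # Extended run: full scope.
        return [test_id for test_id, _func, _priority in suite]
    raise ValueError(f"unknown stage: {stage}")


print(regression_scope(SUITE, "after_deployment"))
print(regression_scope(SUITE, "during_system_test", impacted={"Login"}))
print(regression_scope(SUITE, "end_of_system_test"))
```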

Quote for thought: “There is no good time or bad time; running all the time will spare you surprises”

10. Governance

Finally, governance is a very important factor in driving the success of a smart regression test. Consider the processes recommended below to ensure governance of the regression test.

Regular checkpoints to validate the maintenance of Regression Suite – Master Tracker (tracker with details of all the test cases part of regression scope, with all the appropriate classifications, with clear summary of the regression pack)

Process to review the regression suite maintenance activity and ensure that the desired test coverage is achieved

Process to publish the Regression Test dashboard (metrics on TC development/maintenance, automation, execution results, defects identified, etc.)

Process to review the impact assessment process with Business & Dev teams

Process to review and sign-off the regression scope identified for each sprint/release

Quote for thought: “Process will never let you down, it will keep you in momentum and push you further”

Summary

Coverage:

Do ensure that the regression test meets the test coverage by 3 important areas. Coverage by Functionality, coverage by critical paths and coverage by data combination.

Classification:

Classify the regression test by application, functionality, sub-functionality (if any), Test priority (P1/P2/P3), Business Criticality (High/Medium/Low), Execution Mode (Manual/Automated).

Impacts:

Do impact assessment of new/enhanced functionalities introduced based on the functional impact and technical impact.

Historical:

Do consider the data from past production defect leakage, past regression defects, and defects from ST & UAT to identify the high-risk areas, and enhance the regression test to validate the stability of the high-risk functionalities.

New Vs Existing:

Have balance in the regression test to consider not just the regression tests of the existing functionality but also include the regression test for the new functionalities that are waiting for go-live.

Execution mode:

Focus to achieve maximum automation of the regression test and do consider the inclusion of manual regression test to cover some exploratory/out-of-the-box scenarios.

Iterate:

Regression test needs to be continuously evolved based on the changing application requirements, new features introduced and historical data, to ensure that the regression test meets all the coverage parameters, and it is on par with the application changes.

Phasing:

Planning the regression test only at the end of the System Test phase can be costly because of defect fixes and rework. Instead, plan to run regression right from the development phase to get early feedback on system stability and to act quickly on the issues identified. Think about shift-left regression testing (running targeted regression in the early phases).

Timing:

Plan to do regression test progressively during the System Test phase (Run Sanity/Smoke test in the beginning of code drops, run targeted regression in parallel with System test, Run extended regression at the end of system test).

Governance:

Establish governance process to validate maintenance of regression suite, measure the outcome of regression test and process to include key stakeholders for review & sign-off on regression scope.

You have reached the end of this article; thank you very much for your patience and curiosity. I hope this article has helped you learn and implement a few new ideas in your regression test. TEST SMARTER AND REGRESS SMARTER...

- Rakesh PG

[Smart QA Minds]

Test Type	Scope of Validation
Meta Data Test	· Validate based on ETL mapping sheets, DB Schema of Source & Target · Verify that data field types and formats are specified appropriately · Verify that source data type and target data type are same · Verify that length of data types in both source and target are equal · Verify the name of columns & naming standards in the table against mapping doc · Verify that the columns that cannot be null have the ‘NOT NULL’ constraint · Verify that the unique key and foreign key columns are indexed as per the requirement · Compare table and column metadata across environments to ensure that changes have been migrated appropriately
Data Quality Test	· Validate data of source table to target (dimension) table · Validate row creation date for accuracy · Validate for null check in target columns with not null constraints · Validate for duplicate check based on unique key, primary key and any other constraints as per the business requirements · Validate for duplicate check in target columns generated by combining values of multiple source columns · Validate number columns of source table has only numbers · Validate date format in source table for correctness & uniformity with other date columns · Validate data is not truncated in the column of target tables · Check for any rejected records
Data Cleansing Test	· Validate deletion of unnecessary columns before loading into the target area as per the business rules · Validate replacement of incorrect/invalid data by default value and reporting invalid data as per the business rules · Validate the accuracy of any rejected records as per business rules
Data Integrity Test	· Validate the data integrity between dimension and fact tables · Validate that the fact table doesn’t contain keys that don’t exist in the dimension tables · Schema data integrity validation (Star schema/Snowflake/Galaxy) · Validate count of records with null foreign key values in the child table · Validate count of invalid foreign key values in the child table that do not have a corresponding primary key in the parent table · Validate that the data conforms to reference data standards - values in certain columns should adhere to the values in a domain · Compare domain values across environments – Validate that reference data values from the development environment have been migrated properly to the test and production environments · Track reference data changes - Baseline reference data and compare it with the latest reference data so that the changes can be validated
Data Transformation Test	· Validate the accuracy of computed/generated data in target table · Validate the accuracy of computed values based on the boundary conditions given in the transformation business logic/rules · Validate the accuracy of computed values based on the equivalence class conditions given in the transformation business logic/rules
Data Completeness Test	· Validate all expected data is loaded accurately into target table · Validate record count between source & target tables · Check for any rejected records · Validate based on aggregate value (ex: total sales, total claims, total premium paid) · Validate source & target table based on descriptive statistics (Min, Max, Avg, sum, count of null) · Validate complete data set in source and target table using Minus query · Validate matching rows in source and target table using Intersect query
Incremental ETL Test	· Validate new inserts in source table are getting processed and reflected in target tables as per ETL process & business rules · Validate new updates in source table are able to lookup the existing record in the target table and update it. · Validate that the changed data values in the source are reflecting correctly in the target data · Validate by comparing all the records that got updated in the last few days in source & target on the incremental ETL run frequency · Validate that the denormalized values of target tables are updated based on the changes in source data
ETL Performance Testing	· Validate the ability of ETL jobs/system to handle multiple users & transactions · Validate that the response time of the various data load into data warehouse are within prescribed and expected time frames · Validate the scalability of ETL system with addition of more loads based on the business expectations
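As an illustration of the Data Completeness checks in the table, here is a minimal sketch using Python’s built-in `sqlite3` module. The table names (`source_sales`, `target_sales`) and the data are hypothetical, and SQLite spells the Minus query as `EXCEPT`; other databases use `MINUS` or `EXCEPT` depending on the dialect.

```python
import sqlite3

# In-memory database with a hypothetical source and target table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source_sales (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE target_sales (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO source_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO target_sales VALUES (1, 10.0), (2, 20.0);
    """
)


def row_count(connection, table):
    """Record count check (table names here are trusted constants, not user input)."""
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def missing_in_target(connection, source, target):
    """Rows present in source but absent from target (Minus/EXCEPT query)."""
    query = f"SELECT * FROM {source} EXCEPT SELECT * FROM {target}"
    return connection.execute(query).fetchall()


source_rows = row_count(conn, "source_sales")
target_rows = row_count(conn, "target_sales")
missing = missing_in_target(conn, "source_sales", "target_sales")

print("source:", source_rows, "target:", target_rows)
print("missing in target:", missing)
```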
