What Should We Test?
This seemingly elementary question is one of the most complex in the world of quality and testing… and perhaps the most important because, need I remind you, exhaustive testing is impossible. As you will have recognized, we are talking here about test scope, an essential element of test plans and a topic already covered by Arnaud Verin on the blog “La taverne du testeur”.
In order to reduce the scope of this article, I will limit myself to testing in an Agile environment. I will, therefore, start with a quick reminder of how a product is developed in Agile.
1. Quick reminder of Agile and its impact on testing
A product developed in Agile is software built incrementally: each iteration delivers a new increment on top of the previous ones.
This creates many constraints, such as:
- Having a potentially usable product (designed, developed, and tested!) at each increment
- Delivering value with each increment
- Delivering quality at each increment
- Having a team able to do all these tasks
In Scrum, the increments are generally between 2 and 4 weeks long. Compared to the long testing periods required for the software as a whole in traditional cycles like the V-cycle, this is significantly shorter… especially as increments can be much shorter than 2 weeks.
Agile methods therefore generate at least these 3 types of challenges for testing:
- Frequency of execution challenges (we test often)
- Time challenges (you have to test in the limited time available)
- Quality challenges (you have to reach a sufficient level of quality to have a potentially usable product)
The scope is therefore at the center of the problems. To what extent should we test to achieve the right level of quality while respecting the constraints of time and frequency of execution?
This is the subject of the rest of this article!
2. What elements of your product should be tested in Agile?
Testing the new features
Each new product increment adds functionality through User Stories, and it is important to test these new features in-depth to ensure that they meet the expectations.
The whole issue now comes down to defining “in-depth” and “expectations”.
For the expectations, we must take into account:
- The value that the functionality should bring. This should be tested both for its value and for its response. BDD is a fairly effective practice for dealing with this element.
- The ability to meet the stated functional and non-functional requirements: this can be covered in part with scripted black-box tests, tests to be automated (as they will be re-run) or exploratory tests
- The ability to meet unstated functional and non-functional requirements: this is perhaps one of the most complex elements. It requires a detailed understanding of the context. These elements can be covered in part by exploratory tests which can, for example, identify problems of ergonomics, performance, consistency, etc.
- The quality of the code, which must be maintainable, stable, and clean: this is done in particular with code analysis tools such as SonarQube, code reviews, and unit tests (which can be written in TDD)
- The quality of the documentation. This can be done through reviews and depends strongly on the type of documentation edited for the product
- …
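To illustrate the first item of the list, a BDD scenario can be written as an ordinary test structured in Given/When/Then steps. The following is only a minimal sketch: the `Cart` class, its methods, and the discount rule are hypothetical and exist purely for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical shopping cart, used only to illustrate the scenario."""
    items: list = field(default_factory=list)

    def add(self, name: str, price_cents: int) -> None:
        self.items.append((name, price_cents))

    def total_cents(self) -> int:
        subtotal = sum(price for _, price in self.items)
        # Assumed business rule: 10% off once the subtotal reaches 100.00.
        if subtotal >= 10_000:
            return subtotal * 90 // 100
        return subtotal


def test_discount_applies_from_100():
    # Given a cart containing goods worth 100.00
    cart = Cart()
    cart.add("keyboard", 6_000)
    cart.add("mouse", 4_000)
    # When the customer asks for the total
    total = cart.total_cents()
    # Then the 10% discount is applied
    assert total == 9_000
```

Written this way, the test states the value expected by the user (the discount) before any detail of the implementation, which is what makes it a useful support for discussion with the Product Owner.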
Striking a Balance
The elements to be tested are numerous. We now need to know to what extent they must be tested in order to go sufficiently “in-depth” while respecting the time constraints: we cannot spend a whole sprint of 2 weeks or more validating the behavior of a single new feature!
This is where it becomes crucial to choose what you test and how you test it. For this, the ISTQB Agile Technical Tester certification proposes an interesting method that can be used for integration testing, system testing, and potentially acceptance testing.
It is based on the risks and the criticality of the functionalities. Based on these 2 elements, the certification proposes a specific mix of automated tests, exploratory tests, and “black box” tests (scripted upstream and manual).
The higher the risk, the more black-box and automated testing should be used. Similarly, this mix also depends on the criticality of the functionality of the system. You will note that exploratory testing is the only test that is always recommended.
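The idea of a mix driven by risk and criticality can be sketched in a few lines of Python. Be careful: the levels, the scoring, and the thresholds below are assumptions invented for the illustration and do not reproduce the table of the certification. Only two rules come from the text above: exploratory testing is always included, and the other techniques are added as risk and criticality increase.

```python
LEVELS = {"low": 0, "medium": 1, "high": 2}


def recommended_mix(risk: str, criticality: str) -> list[str]:
    """Return a hypothetical test mix for a feature.

    The scoring and thresholds are illustrative assumptions,
    not the values given by the ISTQB syllabus.
    """
    score = LEVELS[risk] + LEVELS[criticality]
    mix = ["exploratory"]  # always recommended, whatever the levels
    if score >= 2:
        mix.append("automated")
    if score >= 3:
        mix.append("scripted black-box")
    return mix


for risk, criticality in [("low", "low"), ("medium", "medium"), ("high", "high")]:
    print(risk, criticality, recommended_mix(risk, criticality))
```

In a real team, the thresholds would be agreed upon together and revised as the product and its risks evolve.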
An approach like this can help to make choices to test quickly enough.
Testing new features is good, but a new version of a product is not just about new features. It is necessary to ensure that these new features have not introduced major anomalies and that the old features continue to work properly by testing the whole product.
Testing the whole product
As you will have understood, we are mainly talking about regression here. There is already a specific article on this subject, which you can consult if you wish to go further: Regression Testing: Essential Areas to Focus on.
It is impossible to keep a correct pace if each new feature requires the same level of validation as the previous one. A regression campaign should provide a safety net, but cannot be used to fully validate the product. It is necessary to accept the risk of introducing minor regressions. On the other hand, it is essential to limit the appearance of major regressions as much as possible because users are particularly frustrated when this happens!
The stated objective must therefore be to identify major and critical regressions. Minor regressions can be detected even if this is not the primary objective. In fact, these are regularly identified during exploratory test sessions validating new functionalities.
To identify mainly major regressions, it is interesting to couple 2 complementary approaches:
- The first, and most important, is that of usage. The tests of the regression campaign must correspond to the current uses of the product. This matters because a regression on an obsolete use of the software is not necessarily major, whereas an equivalent regression on a new or widespread use of the product is particularly damaging. This notion of current use, which is predominant, is rather complex to set up and requires regular updating of the regression campaign. To implement it correctly, it is often essential to know how the product is used in production. There are several ways to identify these usages. One simple and efficient way is to analyze logs, from which behaviors can then be identified, as Gravity does.
- The second, which represents a much smaller volume but whose impact is just as significant, is that of “abnormal” behavior or “exceptional events”. These mainly concern uses that we want to protect ourselves from (e.g., a way of not paying but still getting your order thanks to a Timeout) or non-functional elements such as accessibility or security.
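The log analysis mentioned in the first approach can be sketched as follows. This is a simplified illustration and does not describe how Gravity works: the log format (`session_id action` on each line) and the function name `most_common_flows` are assumptions made for the example.

```python
from collections import Counter, defaultdict


def most_common_flows(log_lines, top=3):
    """Group actions by session and count the most frequent user flows."""
    sessions = defaultdict(list)
    for line in log_lines:
        if not line.strip():
            continue  # ignore blank lines
        # Assumed format: "<session_id> <action>"
        session_id, action = line.split(maxsplit=1)
        sessions[session_id].append(action.strip())
    flows = Counter(" > ".join(actions) for actions in sessions.values())
    return flows.most_common(top)


logs = [
    "s1 login", "s1 search", "s1 checkout",
    "s2 login", "s2 search", "s2 checkout",
    "s3 login", "s3 export_csv",
]
print(most_common_flows(logs))
```

The most frequent flows are natural candidates for the regression campaign, while flows that no longer appear in the logs are candidates for removal.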
Once you are aware of these two elements, you must adjust the volume of your tests to the team’s capacity in terms of execution, analysis, and maintenance. Here again, the Gravity tool can save a lot of time by offering the usage part “on the fly”.
The ISTQB Agile Technical Tester certification highlights 5 approaches that can help reduce execution time:
- Prioritize tests to ensure that the most important tests are run
- Have different configurations depending on the environment on which you are delivering
- Decrease the number of GUI tests and increase API and unit tests
- Execute tests according to the lines of code impacted
- Parallelize tests
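Two of these approaches, prioritization and execution according to the impacted code, can be combined in a small selection routine. The sketch below is hypothetical: the `RegressionTest` structure, the `covers` mapping between tests and source files, and the time budget are assumptions, and real tools usually derive the impact from coverage data rather than from a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionTest:
    name: str
    priority: int        # 1 = most important
    duration_s: int      # average execution time in seconds
    covers: frozenset    # source files exercised by the test (assumed known)


def select_tests(catalog, changed_files, budget_s):
    """Keep the tests impacted by the change, most important first,
    until the time budget is used up."""
    changed = set(changed_files)
    impacted = [t for t in catalog if t.covers & changed]
    impacted.sort(key=lambda t: (t.priority, t.duration_s))
    selected, used = [], 0
    for test in impacted:
        if used + test.duration_s <= budget_s:
            selected.append(test)
            used += test.duration_s
    return selected


catalog = [
    RegressionTest("checkout_ui", 1, 120, frozenset({"cart.py", "payment.py"})),
    RegressionTest("payment_api", 1, 10, frozenset({"payment.py"})),
    RegressionTest("profile_api", 3, 5, frozenset({"profile.py"})),
    RegressionTest("cart_unit", 2, 1, frozenset({"cart.py"})),
]
for test in select_tests(catalog, ["payment.py", "cart.py"], budget_s=60):
    print(test.name)
```

With a budget of 60 seconds, the long GUI test is left out while the API and unit tests covering the same files are kept, which is consistent with the third approach of the list.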
We could also talk about shift right and production testing with Canary Release or any other method.
Conclusion
In short, as you can see, there is no ready-made answer to the seemingly simple question “What should we test?”, even if we restrict ourselves to Agile methodologies.
There are, however, whatever the context, elements to take into consideration and to keep in mind. The rest is and will remain a careful mix, requiring the expertise of testers to approach the best compromise between time, cost, and quality of the product through the choice of a constantly evolving scope.
Author: Marc Hage Chahine