Which coverage indicator for your E2E tests?
When building E2E test suites, one challenge is knowing what to test first, and when the suite is sufficient to ensure the desired level of quality.
To make the most rational decisions possible, we need to base them on objective indicators. As far as testing is concerned, one such indicator is coverage. But what does “coverage” mean exactly?
Code or test coverage?
“I want to ensure my code does what it is meant to do”
For developers, code coverage is often used as an indicator. Code coverage is a white-box testing technique, primarily performed at the unit testing level. Several criteria can be used to measure how well the code is covered by the tests. During test execution, we can measure:
- statement coverage
- decision/branch coverage
- path coverage
- condition coverage
- boundary value coverage
- …
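To make the difference between two of these criteria concrete, here is a minimal sketch in Python. The function `shipping_fee_cents` and its pricing rule are hypothetical and exist only for illustration. The first test alone executes every statement of the function, yet it never exercises the case where the `if` is skipped; the second test is what decision/branch coverage additionally asks for.

```python
def shipping_fee_cents(order_total_cents: int) -> int:
    """Hypothetical rule: orders of 5000 cents or more ship for free."""
    fee = 499
    if order_total_cents >= 5000:
        fee = 0
    return fee


def test_free_shipping_for_large_orders() -> None:
    # This single test runs every statement of shipping_fee_cents,
    # so statement coverage is already 100%.
    assert shipping_fee_cents(6000) == 0


def test_fee_for_small_orders() -> None:
    # Only this test takes the path where the `if` body is skipped,
    # which branch (decision) coverage requires in addition.
    assert shipping_fee_cents(1000) == 499
```

With a tool such as coverage.py, branch measurement is typically enabled with the `--branch` option, for example `coverage run --branch -m pytest`.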
“I want to ensure my application works as planned”
Test coverage (whatever testing level we are speaking about) consists of measuring the percentage of elements that are activated (covered) by a test, or a test suite. Depending on the need, you can cover many kinds of elements.
Those elements can come from specifications or user stories, or can be the result of a previous analysis. Here are a few examples (the list is not exhaustive).
- Is my specification well implemented? Let’s go for requirement coverage. Which of the specified requirements (functional or not) are covered by my tests? This is why test management tools most often come with requirement traceability features.
- Is my user story well-developed? We defined a user story, we had an example mapping session, so we identified different cases related to the story. Are all those examples/cases covered by our tests?
- Following a risk assessment, we have identified the potential risks and issues that may arise, including low-probability scenarios that could have a significant impact on the software. Assuming what we identified is correct, are those scenarios covered?
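Whatever the elements are (requirements, examples from a user story, risk scenarios), the measurement has the same shape: a set of elements to cover, and a mapping from each test to the elements it activates. The following is a minimal sketch under that assumption. The requirement identifiers and test names are made up, and a real test management tool would supply this traceability data itself.

```python
from __future__ import annotations


def element_coverage(
    elements: set[str], traced_by_test: dict[str, set[str]]
) -> tuple[float, set[str]]:
    """Return the coverage rate and the elements no test activates."""
    activated: set[str] = set()
    for covered_by_one_test in traced_by_test.values():
        activated |= covered_by_one_test
    covered = activated & elements  # ignore links to unknown elements
    uncovered = elements - covered
    rate = len(covered) / len(elements) if elements else 0.0
    return rate, uncovered


# Hypothetical traceability data, for illustration only.
requirements = {"REQ-1", "REQ-2", "REQ-3", "REQ-4"}
traced_by_test = {
    "test_login": {"REQ-1"},
    "test_checkout": {"REQ-2", "REQ-3"},
}

rate, uncovered = element_coverage(requirements, traced_by_test)
print(f"{rate:.0%} covered, missing: {sorted(uncovered)}")
```

The list of uncovered elements is usually more actionable than the rate itself, because it says where the next test should go.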
Is the overall test coverage rate a good indicator?
We often hear coverage summed up in a single number: the overall test coverage rate. But whether we’re talking about code or test coverage, what can this indicator tell us?
“I have 75% code coverage”, “My tests cover 65% of my functional requirements”…
Great… What decision are we going to make based on this figure? Is it too much? Is it enough? What lies in the missing percentage? Perhaps it matters more to our application than what is covered?
Is a line of code that logs information as essential as a line that stores some value in the database? Same for requirements, risks, …
We definitely need to be wary of indicators that are too high-level, since they can mask coverage shortfalls where coverage really matters in the end.
So, unless we have 100% code/test coverage, overall coverage can only indicate the effort spent on testing an application. One more question… What threshold value should lead us to test more?
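A small worked example shows how a single figure can hide a gap. In the sketch below, the areas, their sizes, the `critical` flag, and the 80% threshold are all invented for illustration. The overall rate comes out at a reassuring 75%, while the one critical area is not covered at all.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    name: str
    total_items: int    # lines, requirements, risks...
    covered_items: int
    critical: bool      # assumption: someone has judged business criticality


def overall_rate(areas: list[Area]) -> float:
    total = sum(a.total_items for a in areas)
    covered = sum(a.covered_items for a in areas)
    return covered / total if total else 0.0


def critical_gaps(areas: list[Area], threshold: float) -> list[str]:
    """Names of critical areas whose own coverage is below the threshold."""
    return [
        a.name
        for a in areas
        if a.critical
        and a.total_items > 0
        and a.covered_items / a.total_items < threshold
    ]


# Hypothetical figures, for illustration only.
areas = [
    Area("logging", total_items=50, covered_items=50, critical=False),
    Area("reporting", total_items=30, covered_items=25, critical=False),
    Area("payment", total_items=20, covered_items=0, critical=True),
]

print(f"overall: {overall_rate(areas):.0%}")
print("critical gaps:", critical_gaps(areas, threshold=0.8))
```

The threshold here is arbitrary. The sketch does not answer the question of which value is right; it only shows that a per-area view surfaces what the overall number hides.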
And for my E2E tests?
Let’s have a look at the ISTQB definition of end-to-end testing:
“A test type in which business processes are tested from start to finish under production-like circumstances.”
This means that we should test as if we were a user, so “from the User Interface”. Our tests will go through all the application layers, down to the database.
“As if we were a user” also means that we will perform flows of activities (tasks in a business process) in our application, from the moment we enter it to the moment we leave it.
Bridging the Gap Between Assumptions and Realities
Obviously, we can’t cover all the paths a user might take through our application. But if we put ourselves in the user’s shoes, we want to make sure that their key journeys won’t break, release after release.
For that, code coverage doesn’t give us enough information, and neither does test coverage based on requirements. We can expect users to follow some specific paths in our application, but users are often unpredictable. As testers, we may find a gap between what we think the user would do and what they actually do… between theory and reality…
We need to understand what the user really does in our application. The more an activity or flow of activities is key for a user, the more we want to avoid regressions on it. But what makes an activity or flow “key” for a user? This is where Usage-centric Testing comes into the picture. We believe that what users do most often is key to them.
By recording real usage of the application, we can extract trends and determine which activities or flows are the most common for the users. Among these common flows, cover the ones that are critical for your business.
Imagine you could determine whether your E2E tests cover those flows, then complete your E2E tests with the uncovered ones. This is what we call “Usage coverage”, and this is what Gravity does.
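To illustrate the idea only, here is a minimal sketch of a usage coverage computation in Python (3.9 or later). It is not how Gravity works internally. It assumes each recorded session has already been reduced to an ordered tuple of activity names, that each E2E test exercises one such flow, and that a flow counts as covered only on an exact match. All names and data are hypothetical.

```python
from __future__ import annotations

from collections import Counter

Flow = tuple[str, ...]


def usage_coverage(
    sessions: list[Flow], tested_flows: set[Flow]
) -> tuple[float, list[tuple[Flow, int]]]:
    """Share of recorded sessions whose flow is exercised by an E2E test,
    plus the untested flows ordered from most to least frequent."""
    counts = Counter(sessions)
    total = sum(counts.values())
    covered = sum(n for flow, n in counts.items() if flow in tested_flows)
    untested = [
        (flow, n) for flow, n in counts.most_common() if flow not in tested_flows
    ]
    rate = covered / total if total else 0.0
    return rate, untested


# Hypothetical recorded sessions, for illustration only.
sessions: list[Flow] = (
    [("login", "search", "checkout")] * 5
    + [("login", "profile")] * 3
    + [("login", "search")] * 2
)
tested_flows: set[Flow] = {("login", "search", "checkout")}

rate, untested = usage_coverage(sessions, tested_flows)
print(f"usage coverage: {rate:.0%}")
for flow, n in untested:
    print(f"untested flow seen {n} times: {' > '.join(flow)}")
```

Exact matching on whole sessions is a deliberate simplification. Real usage data is noisier, so a practical approach would likely need to group similar flows before comparing them with the test suite.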