It is early afternoon on Friday.
As the week is coming to an end, so is Team Alpha’s Sprint.
The team is rushing to finish the last User Stories in the Sprint. Marion is putting the last touches on the Daily Report User Story. Just a bit more tweaking of the CSS… and… we’re done! Marion shoots Kate, the PO, a WhatsApp message: “Hi Kate, the daily report story is done, can you please check it out and accept it?” A couple of hours pass and Kate is finally done with the grueling series of back-to-back meetings she’s been enduring today.
Kate logs into the staging system and takes a look at the daily report. That’s weird, the report does not include beta users. And when you refresh it, the order of rows changes. After just a few minutes of testing, Kate uncovered 8 issues. Some of them are quite basic. How can there be so many issues with such a straightforward User Story? What could have been misunderstood about the detailed description she wrote?
It’s 17:34. Marion is starting to pack for the long weekend. Her phone rings. It’s Kate. For Marion, this is déjà vu.
This scenario describes a problem that is all too common in software development. Miscommunicated expectations and assumptions left uncovered till late in the game. This costs us dearly. In blood, sweat, and tears.
Enter ATDD.
Acceptance-Test-Driven-Development proposes to address this challenge by discussing expectations before we jump into the implementation.
Before we implement a User Story, we get together and discuss how this story is going to be validated using a small set of Examples / Scenarios / Acceptance Tests (I’ll use those interchangeably). “We” typically includes someone who understands the Customer/User (the PO), someone who is going to implement the solution (a Developer), and someone who understands how we typically screw things up (QA). This trio is sometimes referred to as the Three Amigos.
ATDD is also known as Specification By Example, after the list of Examples the discussion centers around.
Lately, a format has been adopted as the lingua franca for those Acceptance Criteria: a language that guides us in formulating the Examples in a way that is concrete, lends itself well to automation, and yet can be easily read by non-technical people. This format is called Gherkin, and virtually all Test Automation tools used by Development Teams either support it natively or have plug-ins that add support for it. These include popular frameworks like Cucumber (the tool that introduced the language), Cypress (web), Pytest (Python), Robot Framework, Appium, and many, many more.
Note that BDD (Behavior-Driven Development) is another common name for this approach, although there are some nuances between ATDD and BDD.
The Gherkin format for Acceptance Criteria looks like this:
Scenario: Vote for a retrospective insight card
  Given that I am looking at a list of insights and there are no votes
  And I am taking part in a session where each participant is limited to 3 votes
  When I vote card "A" up
  Then I see that my number of available votes is 2
  And I see that card "A" has one vote
Gherkin also supports a tabular format so we don’t need to repeat ourselves when there are many variations of a particular scenario:
Scenario Outline: voting for retrospective insight cards
  Given there are 3 maximum votes per participant
  And I have voted <voted> times already
  When I try to vote <new votes> for a card
  Then I should see <remaining> votes
  And I should see this <message>

  Examples:
    | voted | new votes | remaining | message                              |
    | 0     | 1         | 2         | None                                 |
    | 2     | 1         | 0         | This concludes your voting           |
    | 3     | 1         | 0         | You cannot exceed a total of 3 votes |
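To give a feel for how such a table can drive an automated check, here is a minimal sketch in plain Python. The function name `try_vote`, the `VoteResult` type and the decision to reject an over-limit attempt as a whole are my assumptions for illustration; they are not part of Gherkin or of any particular tool, and a real team would normally wire the steps through a framework such as Cucumber.

```python
from typing import NamedTuple, Optional

MAX_VOTES = 3  # assumed limit, taken from the "Given" step of the outline


class VoteResult(NamedTuple):
    remaining: int
    message: Optional[str]


def try_vote(voted: int, new_votes: int, max_votes: int = MAX_VOTES) -> VoteResult:
    """Hypothetical voting rule: report what the participant sees after voting."""
    if voted + new_votes > max_votes:
        # Assumption: the whole attempt is rejected and the count stays unchanged.
        return VoteResult(
            max_votes - voted, f"You cannot exceed a total of {max_votes} votes"
        )
    remaining = max_votes - (voted + new_votes)
    if remaining == 0:
        return VoteResult(0, "This concludes your voting")
    return VoteResult(remaining, None)


# The Examples table, one tuple per row: (voted, new votes, remaining, message).
EXAMPLES = [
    (0, 1, 2, None),
    (2, 1, 0, "This concludes your voting"),
    (3, 1, 0, "You cannot exceed a total of 3 votes"),
]

for voted, new_votes, remaining, message in EXAMPLES:
    assert try_vote(voted, new_votes) == VoteResult(remaining, message)
```

Each row of the table becomes one check, so adding a variation discussed by the three amigos is a one-line change.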
Pretty straightforward, but there are many questions that come up when you actually get started with ATDD:
Q: Who writes those Acceptance Criteria?
A: The process of creating (and, more importantly, discussing) the Acceptance Criteria should include at least three perspectives: the Customer (typically the PO), the technology (typically a developer) and Quality (typically a Test Architect / QA Analyst). They can be written by any one of them, but they would then need to be reviewed by the other two.
Since “reviewing” is a passive endeavor, my experience tells me it is better for those “three amigos” to actually sit together and write them down together. An alternative is to use a back-briefing mechanism which gives the developer an active role: PO briefs QA and Dev on the User Story. Developer writes Acceptance Criteria with assistance from QA, all the while asking the PO questions, and then the PO reviews and comments. Less interactive, less conducive to having some great discussions, but still has value.
Q: At which level do we write Acceptance Criteria? Feature? User Story? Task?
A: While there can be a definite benefit to defining Acceptance Criteria at the Feature level, features naturally tend to have a coarse granularity and won’t go deep enough into the details, therefore leaving many potential issues undiscussed and undetermined. Accordingly, User Story level is where I would start. Tasks typically represent a technical breakdown of the User Stories and don’t require a discussion with the Product Owner and the Quality Analyst / Test Architect, so no need for Acceptance Criteria there, at least not of the type we are discussing here.
Q: I’m not sure there is value in this practice
A: I wouldn’t be sure myself without trying it out first. How about taking on an experiment to validate this assumption? Here’s what such an experiment could look like:
Carve off a small portion of the User Stories the team takes on, and use ATDD just on this small portion. Gather data for both the User Stories for which you used ATDD, and for those you did not, then compare.
You could, for example, ask the team to estimate each User Story, then define Acceptance Criteria for a few of them, and give the team an opportunity to revise the estimates for this subset of User Stories, factoring in the new knowledge gained from the discussion. You could then measure the Cycle Time (calendar time from In-Progress to Done) for both types of stories and, factoring in the original estimates, compare weighted Cycle Times. I would also capture the amount of time spent on defining the Acceptance Criteria, so you can do some ROI calculation.
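The comparison could be tallied along these lines. This is only a sketch: the `Story` fields, the choice of "cycle days per estimated point" as the weighting, and all the numbers are assumptions made up for illustration, not data from a real team.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Story:
    estimate: float          # original estimate, e.g. in story points
    cycle_days: float        # calendar days from In-Progress to Done
    used_atdd: bool
    atdd_hours: float = 0.0  # time spent defining Acceptance Criteria


def weighted_cycle_time(stories):
    """Average cycle days per estimated point (one possible weighting)."""
    return mean(s.cycle_days / s.estimate for s in stories)


# Made-up numbers, purely for illustration.
stories = [
    Story(estimate=3, cycle_days=6, used_atdd=False),
    Story(estimate=5, cycle_days=10, used_atdd=False),
    Story(estimate=3, cycle_days=3, used_atdd=True, atdd_hours=1.5),
    Story(estimate=5, cycle_days=5, used_atdd=True, atdd_hours=2.0),
]

with_atdd = [s for s in stories if s.used_atdd]
without_atdd = [s for s in stories if not s.used_atdd]

days_per_point_with = weighted_cycle_time(with_atdd)
days_per_point_without = weighted_cycle_time(without_atdd)
atdd_hours_spent = sum(s.atdd_hours for s in with_atdd)

print(f"Days per point with ATDD:    {days_per_point_with:.2f}")
print(f"Days per point without ATDD: {days_per_point_without:.2f}")
print(f"Hours spent on Acceptance Criteria: {atdd_hours_spent:.1f}")
```

Setting the difference in days per point against the hours spent on Acceptance Criteria gives a rough starting point for the ROI discussion. With only a handful of stories the result is anecdotal, so treat it as a conversation starter rather than proof.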
Q: Automating the Testing of the Scenarios is a lot of work
A: The main value of the practice is the discussion and its ensuing insights. Unless developers on the team are eager to automate some of the Acceptance Criteria, I believe it’s fine to start without any automation initially.
Q: Developers don’t want to do ATDD, how do I still promote it?
A: You can try it out without scaring people by just sitting down with a developer before she starts working on a User Story, and asking questions in the spirit of ATDD without even referring to that acronym. For example you could ask:
- “Can you walk me through the tests you intend to perform on this User Story before it’s Done?”
- “What data are you going to use for the tests?”
- “If there’s only one test you are going to automate what would it be?”
- “How confident will you feel releasing this to production once you have carried out these tests?”
- “Tell me about a similar User Story you have implemented in the past… Were there issues that were uncovered near or after the completion of the implementation? What were those issues? How likely are we to encounter similar issues for this User Story?”
Try to push for the use of fully qualified examples using actual data, as concrete examples are more likely to elicit the kind of thinking that leads to some great questions.
Example: instead of “I will test that the amount is correct”, prefer something like “When there are two items costing $0.99 each, then the total amount displayed should be $1.98”.
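A concrete example like this one translates almost word for word into a check. The sketch below assumes a hypothetical `cart_total` helper and uses Python's `Decimal` so that currency amounts are added exactly.

```python
from decimal import Decimal


def cart_total(prices):
    """Hypothetical helper: sum item prices exactly, avoiding float rounding."""
    return sum(prices, Decimal("0"))


# "When there are two items costing $0.99 each,
#  then the total amount displayed should be $1.98"
total = cart_total([Decimal("0.99"), Decimal("0.99")])
assert total == Decimal("1.98")
assert f"${total}" == "$1.98"
```

Notice how the concrete numbers immediately raise questions the vague version hides, such as rounding, currency formatting and what happens with an empty cart.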
Think about the actors (user, customer, support person etc.). What are their different needs from the system?
These questions are very likely to surface some questions and assumptions. Great! Time to talk to the PO…
Q: Developers don’t want to write Gherkin. Should I insist?
A: I don’t think you should push this particular agenda. I’ve seen teams use MS Word Documents with a table which essentially contains the Given/When/Then format. Surprisingly, a Word Document is to some, less intimidating. Furthermore, the PO was very happy to get a Word Document and go over it to validate the Acceptance Criteria. Yes, it will make the jump to automation harder, but everything in good time, I guess.
Q: When should we write them?
A: To reduce waste, you probably want to write those “Just in Time”, right before the team starts designing and implementing the User Story. However, ATDD is likely to affect the estimates for the User Stories. If your team is using Scrum as its framework, and hence needs to predict how much work it can bring to completion (Done-Done) within the Sprint, you may want to do this before the Sprint Planning; otherwise the team will have a hard time committing. This can be done gradually during the previous Sprint as part of the Backlog Refinement process, or in an ATDD Workshop just before the Sprint Planning. The former can be more effective since people have time to “sleep on it” and to ask other stakeholders, who may not always be available, for clarifications about questions that inevitably come up while writing Acceptance Criteria.
Q: What if the User Story’s estimate goes up when we create Acceptance Criteria and end up surfacing more work than we originally anticipated?
A: That’s great! It means you have uncovered some important knowledge that was missing, and now you have a more realistic view of the effort needed. You may want to split this User Story into smaller User Stories if the resulting larger User Story is too large to be completed in a few days. In effect, writing Acceptance Criteria is a powerful User-Story Splitting technique.
Q: Should all scenarios / Acceptance Tests be automated as End-To-End tests?
A: No. User-Story level tests should be User-centric, and therefore tend to be large, hard to implement, run and maintain. I would automate only those that have to become part of our pre-release regression testing suite (which should be small and fairly fast). The rest can either be tested manually as once-offs, or, if they represent permutations of a mainstream flow, can be pushed down the test pyramid and implemented as Medium (Component / Integration) or Small, Unit level tests, which are much cheaper to implement and maintain down the road.
Q: How do I make sure we don’t miss important scenarios?
A: The best way to cover most important scenarios is to meaningfully involve stakeholders who hold diverse perspectives in the discussion. In addition, I would recommend using checklists like a Definition-of-Ready & Definition-of-Done. These should remind us of things we tend to forget but have to be part of a complete User Story. In fact, my opinion is that the very definition of a good Definition-of-Ready is one that helps us identify missing Acceptance Criteria. Note that you will still miss some Acceptance Criteria and will have some “Duh!” moments even when you adopt ATDD. But the goal is to drastically reduce those.
Q: Aren’t the Acceptance Criteria sort of a Test Plan? And if so, do we need both?
A: The output should definitely be as inclusive as the test plan for the User Story, and there should be no need to create a separate test plan, but again, the main benefit of ATDD is the structured discussion which leads to surfacing assumptions and sharing expectations. Note that in addition to the Acceptance Criteria, it is helpful to have a discussion (often without the PO) about how to test and automate those Acceptance Criteria (level of automation, need for simulators, environments etc.).
Q: Isn’t it the same as TDD?
A: Well the two terms having three letters in common is a dead giveaway that there are some similarities, and indeed, both aim at pulling quality to the left by thinking about test scenarios before writing the code. But there are some differences, summarized in the table below. Please note that each brings its own set of benefits, and ideally both would be practiced. It’s not an “either or” choice.
Aspect | ATDD | TDD |
Focus | Top-down: Doing the right thing. Customer/User Focus | Bottom-up: Doing things right. Technical focus |
Scope | End-to-End use cases. Perhaps Component level as well. At the User Story level. | Unit-level: Class, Method. At the task level. |
Who defines the tests | Developer + PO + QA (the three amigos) | Developer |
When do we define the tests | Just before implementation (& design) of User Story starts | During the implementation of the User Story. Tests are emergent |
Automation | There is value even with no automation of the Acceptance Criteria. When automated, automation is implemented by a feature developer (not a test automation expert) | Key to the practice. Implemented by the same developer (or pair/mob) who implements the User Story |
Outcome | Uncovered assumptions. A common understanding of the important aspects of the User Story and its quality attributes | Lean design, Testable code, Unit tests which become part of the Continuous Integration build |
Q: Do all User Stories / Projects benefit from ATDD?
A: I think the majority do, but by no means all of them.
If we use the Stacey chart below to look at the uncertainty in the “what” and “how” dimensions, at the bottom left corner, someone (presumably the PO) knows exactly what is needed, and someone (typically developers) knows exactly how to make this happen. In this case, the risk we want to reduce is one of knowledge equalization: making sure the developers understand what the PO knows about the problem space, and that the PO understands the tradeoffs the developers can make in the solution space. Here, ATDD is very valuable.
Where requirements uncertainty is higher, ATDD can surface assumptions and push us to understand better what the customer needs. Where uncertainty is very high, we don’t know how to answer many of the questions, and the right strategy there is not to try to get it right the first time, but rather to use small experiments to get validated learning early. ATDD may have less value there.
On the technology uncertainty axis, if we don’t know what’s possible, what’s easy and what’s hard, it’s very tough to estimate, and to make sound implementation tradeoffs, and we may want to take on a Spike to answer some technology questions. At this side of the spectrum, full blown ATDD would probably be too much and too early.
Q: Do all teams need to use ATDD?
A: ATDD is a solution to a challenge. The main challenge is the lack of efficiency and speed resulting from partial understanding of what the User Story is all about. This lack of understanding translates into blocks and rework down the road, effectively slowing you down.
Since it is a well-established fact (and my personal experience) that issues found early can be much, much cheaper to address, the ROI can be huge.
If this challenge does not exist in your situation, move on. I have to say though, that I cannot remember seeing a team that doesn’t practice ATDD or some close alternative, and for which this problem is nonexistent. Please let me know if your experience contradicts this observation.
Q: If we spec out the User Stories in great detail, isn’t this a slide back to detailed documentation? Isn’t this an insidious fall back to waterfall?
A: With ATDD, the emphasis is still on just enough documentation, just in time. We don’t try to create detailed specs for something that is going to be implemented months down the road, but rather for a tiny piece of work (the User Story) which is going to be implemented as early as tomorrow. Furthermore, even after we have defined the Acceptance Criteria and started the implementation, Acceptance Criteria are likely to be refined, and more may be created since we inevitably uncover more knowledge, so there’s no hard phase change here.
Q: Having a Developer, a QA Analyst and the PO discuss User Stories at this level of detail will surely be time consuming and will mean developers will have less time for programming during the Sprint, and therefore we will end up lowering our velocity.
A: A few things to consider here:
- Our goal is not to spend as much time as possible coding, or to create as many lines of code as we can. It is rather to make as much Impact as we can, in as short a time as possible. By avoiding the rework that results from misunderstandings, we can be more efficient, and create more impact in less time
- Understanding what is important about a User Story will need to happen sooner or later. Doing it later will often be costlier, so avoiding spending time doing ATDD early on will actually waste more time later on. Where’s the time saving there?
- If you are not doing some form of ATDD, chances are that you are taking on too much work into your Sprint, and not finishing it all, and your velocity might be fake and include things that are sorta Done.
Q: Should the whole team be included in all the ATDD workshops?
A: If the team is small, then probably. For larger teams, you may want to have User Story Owners from the team, so they can participate selectively in the ATDD workshops. In some cases one developer refines a specific User Story, and a different Developer ends up implementing it, but there’s no harm there (actually having another pair of eyes can be great).