Discussion and audio clips from a previous interview with Kent Beck
This transcript started as an auto-generated transcript.
PRs welcome if you want to help fix any errors.
[music] Welcome to Test and Code, a podcast about software development and software testing. Kent Beck's Twitter profile says "programmer, author, father, husband and goat farmer", but I know him best for his work on Extreme Programming, Test First Programming, and Test Driven Development. He's the one. The reason you know about TDD is because of Kent Beck. I first ran across his writings when I was studying Extreme Programming in the early 2000s. Although I don't agree with all of the views he has expressed over his long and prolific career, I respect him as one of the best sources of information about software development, engineering practices and software testing.

Along with Test First Programming and Test Driven Development, Kent started an automated test framework that turned into JUnit. JUnit, and its model of setup and teardown wrapping test functions, as well as base-test-class-driven test frameworks, became what we know of as xUnit-style frameworks now, which includes Python's unittest. He discussed this history and a lot more on episode 122 of Software Engineering Radio. The episode is titled "The history of JUnit and the future of testing with Kent Beck", and is from September 26, 2010. I'll put a link in the show notes. I urge you to download it and listen to the whole thing. It's a great interview, still relevant and applicable to testing in any language, including Python.

However, I know that many of you aren't going to listen to it, and there are a few portions of the interview that I really want to share with you. So here's what I did. I tracked down the right people to ask permission to pull some clips out of that interview and play them on this podcast. They said it was OK. Actually, since SE Radio is part of IEEE now, my request ended up going to someone at IEEE Computer Society. They said yes, which is cool, so here we are. Oh yeah, I did ask Kent via Twitter if it was OK if I introduced him as a goat farmer from Oregon. His reply: "that's fine".
So here we go, some bits of software testing wisdom from a goat farmer in Oregon.
[2:22] The first clip is about having your tests be readable and tell a story.
"I always strive for a kind of declarative expression in my tests. You should be able to just kind of read a test and it tells a story. That is, somebody coming along later and reading it should be able to understand something important about the program."
Sometimes, normal programming good practices don't apply to software tests. One example is DRY. DRY stands for "don't repeat yourself", and many people take it to mean that if you have any repeated chunks of code, you should put those into a function and call that function instead. Software tests naturally have code similar to other tests, and it's tempting to put the common lines in a separate function. Here's Kent on the topic:
"DRY in particular I don't subscribe to for test code, because I want my tests to read like a story."
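Kent doesn't give code in the interview, but here is a minimal pytest-style sketch of the trade-off (the account example and all names are hypothetical). The declarative tests repeat a little setup, and in exchange each one reads top to bottom as a complete story:

```python
# Hypothetical example, not from the interview: keeping tests
# declarative and readable instead of maximally DRY.

def make_account(balance):
    """Toy account used only for this illustration."""
    return {"balance": balance}

def withdraw(account, amount):
    """Reduce the balance; reject overdrafts."""
    if amount > account["balance"]:
        raise ValueError("insufficient funds")
    account["balance"] -= amount

# Each test repeats its setup, so it reads as one self-contained story.
def test_withdrawal_reduces_balance():
    account = make_account(balance=100)
    withdraw(account, 30)
    assert account["balance"] == 70

def test_overdraw_is_rejected():
    account = make_account(balance=100)
    try:
        withdraw(account, 130)
    except ValueError:
        pass  # expected: the overdraft is refused
    else:
        raise AssertionError("expected ValueError")
```

Pulling the two `make_account` lines into a shared helper would be DRYer, but then someone reading a failing test has to chase down the helper to reconstruct the scenario.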
[3:21] Another thing that Kent brought up was the idea that tests should advance your knowledge of the software under test. A test that fails should legitimately tell you new information about the problem in your software. He also warns against having multiple tests that tell you the same thing about your software.
"Tests should have, in medicine they call it differential diagnosis, where they say I'm going to order this test, and based on the results of this, you know, whatever, blood test, I will be able to rule out a bunch of stuff and confirm some other things. So, every test should have this kind of, maybe this is an information theory thing, should be able to differentiate good programs from bad programs. If you have a test and it doesn't do anything to advance your understanding of good programs and bad programs, then that's probably a useless test. But if you took the space of all possible programs to solve your problem, you know, almost all of them won't, and a few of them will. A test should lop off a big portion of that space and say nope, any program that doesn't satisfy this test is definitely not going to solve the real problem. So, there's a part of that. And then there's a sense of redundancy. If you have a bunch of tests that tell you exactly the same thing, then I would look to see which of them adds the least value and delete them. But they have to really cover exactly the same cases."
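As a rough illustration of that redundancy point (purely hypothetical code, not from the interview): the first two tests below pass or fail together for any implementation of `is_even`, so one of them adds no information, while the third covers a genuinely different slice of the input space.

```python
# Hypothetical illustration of redundant versus informative tests.

def is_even(n):
    """Return True when n is an even integer."""
    return n % 2 == 0

def test_four_is_even():
    assert is_even(4)

def test_four_is_even_again():
    # Same input, logically the same check as the test above: it can
    # never fail unless test_four_is_even also fails. A candidate
    # for deletion, in Kent's terms.
    assert is_even(4) is True

def test_negative_odd_is_not_even():
    # A different case: this rules out implementations that mishandle
    # negative numbers, which the tests above say nothing about.
    assert is_even(-3) is False
```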
[5:06] This next clip is one of my favourites. You see, when I first learned about Test First Programming and Test Driven Development, I understood it to be useful at the user API level, with an idea of functional units. I also found it very useful to write tests at layer interfaces, especially when I was working on a layer closer to the hardware and I wanted to test, from my level down, functionality that was ready in the hardware but didn't have upper layers ready yet, and no API available yet. I think the level, the interface where you apply your tests, is a pragmatic decision based on the circumstances you're in. But that's not how a lot of people saw it. A lot of TDD proponents, other than Kent, came around and pushed isolated unit tests, and tried to shove end-to-end tests and system tests back over the fence to QA teams. So I'm very pleased to hear Kent talk about testing at different levels, or as he puts it, different scales.
"Something I didn't communicate very effectively in my first discussions of TDD is the importance of testing at various scales. So, TDD is not a unit testing philosophy. I write tests at whatever scale I need them to be to help me make my next step of progress. So sometimes they're what somebody else would call a functional test. So, for example, forty percent of the JUnit tests work through the public API. Sixty percent of them are working on lower level objects. The public API is quite good for testing, probably because we've written so many tests, so I don't know if those proportions are... I don't want to claim those proportions are anything more than one data point. Like, should you have 40-60, should you have 10-90 or 90-10, I really don't know, but just this idea of moving... Part of the skill of TDD is learning to move between scales, right. So I write a test that my customer says "oh, this scenario should result in a five". So you write a test that says this scenario should result in a five, and then you're down deep in the intestines of your program and you're thinking, oh, I see, well this object when given a five and a seven should return the five. Well that's a good place to write a test because that's another piece of the story that needs to be told. But, you know, is that Acceptance Test Driven Development, or is that BDD? I think that erecting rigid walls between the styles is actually a mistake. Like the scales, as a programmer I want to understand all those scales. Tests help me understand, so I write tests at all those scales."
[8:18] So let's say you have tests in place that give you information about your system and tell a story well. The tests are software, and have to be maintained. You shouldn't have tests in your system that are hard to understand, because at some point that test will fail, and someone will have to figure out why it's failing. That's where readability and value are very important. I'm totally sick of people saying that end-to-end tests are fragile, meaning they break all the time. Listen, if you write a test using your user-facing API, even if it's a long story, it's kind of like something your customers are going to do with your software. If it breaks, or fails, that's your customers' code that will break too. That's serious! If it really is a test problem, then that's just weird, but I have to say, end-to-end tests don't have to be long stories. Focused functional tests can be short. But sometimes a test has to be long to match a real customer use model. So be it, it's long. But if it fails, take it as seriously as you would a customer defect report. Here's Kent on the topic.
"I still go places and people say "oh yeah, we did a bunch of tests, but then the tests stopped working, so we threw them out", which just seems bizarre to me. I mean, like, Aristotle would be shocked. The logic just doesn't add up. This test said if the test is running my program is running, and if the test is not running then my program's not running. And the test stops running, and your next act is to delete the test, or just stop running it or ignore the test report that you get. Like, wow, that means your program's not running. But somehow, I mean, there's a lot of other pressures on people other than get your program running. I guess that's the conclusion that I can draw from that, but it's kind of, it's too bad, I think. There's potential value there, people could produce more value as programmers if they trusted the tests, and paid more attention to them. But I, you know, there's a lot of other things going on in software development than coding."
[10:32] So far, I agree with everything I've played. I thought it might be fair to play a clip that I don't agree with. I'll just play it and discuss it afterwards.
"I think it's worth being dogmatic as a learning tool, right. What if I just said I'm always going to write tests for everything. And then you discover, oh, I'm glad I did this. Here I'm sorry that I did it. So I won't do it in... what's the commonality in the experiences where I wished that I hadn't written tests, what's the commonality in the experiences where I'm glad I wrote the tests, then let me infer... I'll use that to inform my behaviour going forward."
He's saying that it's OK to teach TDD in a dogmatic way, that people will learn with these training wheels on, and that when they outgrow the dogmatism, they'll let their common sense dictate how much they should test and what needs to be tested and what doesn't. But I think history tells us you can't always rely on people's common sense to kick in, and there's a bunch of people out there saying things like "you're not really doing TDD right", "that's not a unit test because you aren't using mocks", "unit tests shouldn't touch the database or the file system", "that's not really Scrum, it's ScrumBut", and stuff like that. Anyway, I'm sick of the dogmatism and the excuse that people are smart enough to know we don't really mean test everything. We should teach people what they really ought to do, not some idealised version that they are supposed to just know to change when they get the hang of it.
[12:07] Anyway, this is six years later, and I'd love to get Kent Beck in an interview sometime, and ask him about these clips, and about goat farming. And I'm really curious if he's loopy about IPAs and pinot noir, like half the rest of Oregon. It's unreal, especially in the summer; you'd be amazed how hard it is to find a beer that isn't an IPA, or one of its variants. So what did we cover?
Your tests should tell a story. Be careful of DRY, inheritance, and other software development practices that might get in the way of keeping your tests easy to understand. All tests should help differentiate good programs from bad programs, and not be redundant. Test at multiple levels and multiple scopes, where it makes sense. Differentiating between TDD, BDD, ATDD, et cetera, isn't as important as testing your software to learn about it. Who cares what you call it?

[13:11] But there's lots more great stuff in that interview. Please check it out. Show notes can be found at pythontesting.net/23. This episode was brought to you by Patreon supporters. Visit pythontesting.net/support for more info, or go directly to patreon.com/testpodcast, and help keep the show coming. On Twitter, I'm @brianokken, and the show is @testpodcast. Thanks for listening. [music]