Some information about software testing is just wrong. I’m not talking about opinions. I have lots of opinions and they differ from other peoples opinions. I’m talking about misinformation and old information that is no longer applicable.

Transcript for episode 79 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.

Some information about software testing is just wrong. I’m not talking about opinions. I’m talking about misinformation, and old information is no longer applicable. I ran across a few bits lately that I want to address. I’ll try to point out places is where I disagree with somebody based on my opinion and not facts. But mostly I like to focus on fixing outdated information, incorrect facts, outright deception, and bizarre logic. I’ve been waiting a while to do this for quite a while, so this should be fun. This is Test Encode Episode 79. Thank you to Pantheon for sponsoring this episode. Pantheon makes building, managing and optimizing website simpler. Check them out at pantheon. Dot IO slash testing Code welcome to Test and Code, the podcast about software development, software testing, and Python.

It’s summer. The weather has been nice here in Portland. I’ve got a lot done that’s been on my Todo list. Around the house. I changed a worn out flapper thingy in the bathroom. I replaced an anti siphon valve on an outdoor faucet to fix a drip. There.

I reattached the top to a side table that was broken for a while, and I secured the legs onto another table. And now I’m going to try to fix something else. I’m going to try to fix some misinformation about software testing.

Let’s try to tackle a few of these today.

All of the following are wrong. First, integrated tests can’t work. I can prove it with some wacky math. Two tests have to be blazing fast or they won’t get run. Three TDD is about design, not about testing. All of those are wrong. Seems obvious to me, but I keep running into these ideas and it bugs me. Here’s an example. I was listening to a podcast called A Question of Code with Ed and Tom Hazeladine. I actually love the show. It’s great. There was an episode called Why Should You Write Tests? Where Tom gave Ed some kind of cringeworthy advice about testing. At least some of it was a bit painful for me to listen to, but it was still fun. But that’s not what this episode is about. I bring it up because Tom brought up Gary Bernhardt’s Boundaries talk, and he also talked about JB. Rainsberger’s talk, Integrated Tests, or Scam. This talk is the first time I encountered the wacky math and bizarre logic issue.

I’m not sure if Rainsberger made it up or if he heard it somewhere else, but then Gary repeated the weird stuff and then Tom repeated it, and really, this is all going to stop. So here’s the idea. Say you have a function that has an if statement. To hit both paths of the if statement, you need two tests or two test cases. Now let’s say you have function A that has six paths, A calls, B which has five paths, and B calls C which has two paths. If you use isolated unit tests or mockest unit tests, you need six tests for A, five test for B, and two tests for C. To completely test them, or at least get all the paths, one test for each path through the function.

So in total, for A, B, and C, that’s six plus five plus two or 17 test cases. If you test them together with integrated tests, all of the combinations of logic paths multiply instead of add for six times five times two equals 60 paths. So you need 60 test cases. That’s 43 more test cases than the isolated unit test model. Are you lost yet? I’m kind of lost that’s weird logic. But let’s take it further. If you extend that to your entire application with much deeper dependency trees, then you end up with thousands of test cases that you need really fast. It blows up exponentially.

So that’s what people remember about these talks. I think just that the combinatorial path problem is huge. I think that’s one of the reasons why people remember that’s not the point of these talks. Neither one of the talks or that’s the point. I think the reason why people remember that, though, is that the other half of the talks are even more bizarre and everyone gets lost and just takes away the first bit. But of course, that’s my opinion slipping in. Let’s go back to the math.

The isolated unit test argument is that testing function in isolation is enough, and you don’t have to test things together. I call bullshit, but still, if all of these paths are independent, there’s nothing stopping you from using the same test 17 test cases from the unit test model and running them as integrated tests. You’re still going to hit all of the paths of each of the functions. If you don’t care about the combinatorial bit, then don’t do that.

It is correct that any nontrivial application cannot be exhaustively tested. However, with tools like equivalents partitioning, boundary value analysis, code coverage tools, feature testing, and practice, we can do pretty darn good with way fewer tests than Bernhard and Reinsberger are implying. I think the weird math thing there is there to defend the complexity of their supposed cures. Here’s where I definitely mix my opinion in. But my opinions aren’t just mine.

They come from others as well.

Rainsburg after this attack on integrated tests, Reyesburger goes on to draw that the conclusion. To solve the problem of integration, you create more tests, and I think they’re called collaboration and contract tests. I think I may have forgotten the names, but I got lost through all of this and it just sounded complicated. I even listened to the whole thing twice, and I don’t quite get how this could all work. It seems like a lot of work to avoid higher level tests. I don’t know why we’re trying to avoid higher level tests.

One of the things I really do appreciate, though, is that Reinsberger covered the idea that when you test any container like data structure. You should include the cases of zero items, one item, a few items, many items, and air conditions. These cases, of course, can be derived from equivalent partitioning and boundaries of the partitions. But that’s a good five rule of thumb of test these five things cases, but that’s just for container like data structures. There’s all sorts of other types of things you got to test that you need other if you had rules of thumb, you’d have lots of rules. Or you can just use equivalent partitioning and boundary value analysis and cold coverage and other things to make sure you’re covering things pretty good. Okay. Gary Bernhardt also has some good stuff. Plus, Gary is an incredible public speaker, and he had some really cool slides.

One of the things I liked is his emphasis on the importance of separating algorithmic and functional code from imperative code. I think that’s cool. So algorithmic or function code is functions that they don’t touch anything, they just take input and they provide output. And every time you give them the same input, they could give the same output.

An imperative code is stuff that depends on state or state of the system or touches hardware or something.

Separating those, I think, does make sense, mostly because complicated algorithmic code is nice to have separated. So you can put unit test and code logic. That’s definitely one of the places where I like unit tests is if you’ve got complicated logic in areas to test that part.

Okay, but that’s where his talk went haywire.

Gary’s cure is to modify the system design to have all the functional stuff in the inside and push all the imperative code to the boundaries.

And so you can test the entire inner bit with unit tests and not care about how things go together, I think. I’m not sure what he was talking about there. Or maybe you can test the whole thing as a fun. I don’t know. Again, I watched this talk two or three times, and I still got lost. I don’t really know how he’s saying that we should test it even like, gives an example and says that some of these things can be tested by inspection or by manually testing it, but I don’t think that helps us really a lot.

Anyway, I’m still not sure how this would be really easier to test or maintain, because it’s sort of a lot of easy problems can become complicated if you try to muck with your design. Okay, so that’s my opinion. But I’m not alone in thinking that you should not add massive complexity to your tests with lots of mocks or doubles, nor you should complicate your system design just for the sake of unit testing.

To defend myself, first up, we have Martin Fowler. He’s one of the creators of the Manifesto for Agile software development, often incorrectly referred to as the Agile Manifesto, Martin describes the history of unit testing in an article link to and describes the split between Sociable and solitary tests and Sociable being functions that talk to other functions and you don’t knock out things, and solitary tests being test where you isolate a function with mocks and whatever. In the beginning of XP and Test First and TDD unit tests were Sociable. There were no mocks or doubles.

There were external dependencies that were difficult, like databases or services were stubbed out sometimes.

But the unit tests were just things. They were called unit tests because there was a unit of behavior. They were testing behavior at the interface of a function, no matter what the dependencies were.

There were people around, like test specialists that didn’t like them calling it unit tests, but at first they get too much voice in it. But in the early 2000s, the notion of isolating functions and their dependencies started to get thrown around again. I think this is mostly by TDD consultants, I think because I don’t think people like they’re really developing real systems would look at that problem and think, yeah, throwing a ton of mocks in there is a great idea. I just don’t get it.

The isolated unit test ideas described it as mockest TDD by Fowler, and the Sociable notion is the classicist TDD. I don’t know. Those are just two terms you might run across.

How about Gary Bernhardt’s idea of changing the design to make Mocking quote easier, I guess.

Here I’ll direct you to David Heinemyer Hanson and his great paper, Test Induced Design Damage, and also his 2014 Ruby on Rails talk, where he talks about the usefulness of high level tests and the danger and harm of unit tests and discourages changing design for the sake of tests. I quote above all, you do not let your test drive your design. You let your design drive your tests kind of says it all, but he goes through and talks about it more eloquently. I do encourage you to watch the video, too.

There’s a lot of great stuff there.

But the paper also describes how he recommends testing MVC style applications, which are what a lot of like that would be because Ruby on Rails is kind of for Ruby. It’s kind of like the I’ve not used either of these too much. I haven’t used Ruby on Rails at all, but I’ve heard it said that Ruby on Rails is similar to Django in the Python world, but yeah, take that for what it’s worth. Anyway, let’s call that one done and move on to other things.

Before we get to the next two items of misinformation, let’s thank Pantheon for sponsoring this episode.

Pantheon Makes building, managing, and Optimizing Website Simpler as the leading Web Ops platform, Pantheon features the fastest hosting on the planet, automated workflows with Dev, test and live environments, plus professional development tools and easy integrations with GitHub, CircleCI, Jira, and many more.

Stop putting out fires and start building sites and experiences that deliver results by using Pantheon, rated as one of G two crowd’s top ten software products of 2019.

Get started for free at Pantheon IO Test And Code. That’s Pantheon IO Test And Code.

Next item is the notion that tests have to be blazing fast or they won’t get run. This is just plain outdated. It doesn’t apply anymore. Back in the day, when all this X unit stuff started in the 90s, we didn’t have continuous integration servers running our tests. We didn’t have easy branching in the early revision control systems. We didn’t have build and deploy pipelines. The idea then was to create test suites that could be run by developers before and after they modified some code, mostly after. This is often tied.

This was often tied into compile steps by putting test runs the code to run the test actually in make files or something like that. But the idea was to run all of the tests before checking in your code to make sure that and because once you checked it in, if it was broken, you would affect everyone else on the team. There was no branches or merging. It was just in there, and everybody else would be affected. In that situation, tests really needed to be pretty fast. If they were slow, developers wouldn’t run them. That’s right. Running test was optional. Developers could choose to not run the tests. We don’t live there anymore. We’d like developers to run tests before they commit their code, but we don’t trust them to. If they don’t, it only hurts them. We can set up continuous integration to not allow their branch to be merged until after all the test pass and the code coverage numbers haven’t dropped. And we can set up multiple stages of testing. Of course, we still want tests to be fast. Slow is pointless, but how fast? Martin Fowler and David Heinermier Hanson both described tests you run during code as a subset of the whole suite. And this subset should be focused and fast and doesn’t have to be super thorough. But still, a few seconds is probably fine. Maybe even 30 seconds. I don’t know. This is why you’re developing. Maybe even a minute. And then before committing your code or pushing or whatever, you run a commit suite. This can be longer. Martin and Kent Beck both talk about shooting for ten minutes for the pre commit tests. That might be too painful for your team. It might be fine for my team. I’d like to have about three or four minutes. This is plenty of time for somebody to kick off the test, go walk around, get some coffee, come back, and it’s just a little break. And the test and code. I don’t know what your team is like, but I don’t think it has to be super fast. I think ten minute Mark sounds fine. And then after. But even if you don’t have that after you commit your code, maybe you could just wait, put it up in the CI server after you commit and push. The CI server is going to run all those tests again anyway, and possibly even more tests, and it will still be on your branch. And before your branch is allowed to merge with Master or with the main, or with whatever your normal code is merging into, it’s got to pass all the tests. And many teams also have code reviews that are needed at this point, so a little bit of delay and testing on the server isn’t going to kill anybody. And the build pipeline could also include longer tests like running on different operating systems, or different hardware, or different versions of dependent systems. Apache Beam even includes in their project what they call post commit tests, which are really post merge tests. And what that does is runs most of the tests or some of the tests before the merge happens to make sure sanity checking, and then after the merge happens, run even more tests, and then they have some protocol for what happens if those tests fail. It’s an interesting idea, but anyway, our modern continuous integration pipelines and easy branching didn’t exist in the 90s. We can’t keep using the unit test advice from the 1990s. We have a different world now. On a related sub note of misinformation is that there’s advice out there that says unit test shouldn’t touch the hard disk or the database because it’s too slow.

That’s just not true anymore.

I’ll link to David Heinemyer Hansen’s article called Slow Database Test Policy. Our systems now have SSDs, speedy CPUs, tons of Ram, and our databases are way faster, and databases are even set up to be designed to allow testing easier. You can run test databases with Caching even, and even in memory that they don’t even write to disk. All of Base Camp, for example, is tested in a few minutes with no mocking of the database at all. I’ll end this section with a reminder that jumping through Hoops to make your tests faster also make your tests harder to read and maintain, and is a form of premature optimization. In the words of Donald New, premature optimization is the root of all evil. Actually, let’s quote more. It’s a good quote. Programmers waste enormous amounts of time thinking or worrying about the speed of non critical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintaining are considered, and your tests have to be maintained and debugged. Also, we should forget about small efficiencies, say about 97% of the time. Premature optimization is the root of all evil, yet we should not pass up our opportunities in that critical 3%. So use profiling. Figure out which tests are the top slow tests of your suite, and you maybe speed those up. But yeah, speeding up all of them just by mocking up the database or something is silly. It’s just going to make your test harder to deal with. My voice is starting to go out, so I’m glad there’s only one more and this one’s pretty short. This last one is TDD is about design, not about testing. Yeah, whatever. Of course, this is totally opinion, right? This seems like opinion and there’s no way I could argue fact versus fiction on that. But let’s go to the source. Tester and development is known as tester and development or TDD now because of Kent back and Kent has the can you sleep at night test? What is that? So Kent back emphasized at the end of the day, as developers and programmers, it’s our job to make sure we sleep at night knowing we didn’t break anything. This is one of the main purposes of an automated test suite, regardless of whether it has been written in a test first way or after the code. The objective of the test is to make sure things are working and that the new code doesn’t cause any problems with the existing code. If Kent back was thinking that TDD is about the test checking that your code works, I feel pretty good about agreeing with him. I’m also going to link to an article I wrote called my reaction to his TDD dead. That links to some great discussions as well as lays out my ideas of test driven development. That seems like a good place to stop. Let me know if there are other testing Dragons you’d like me to take on.

Thank you to Pantheon for sponsoring this episode. Pantheon makes building, managing, and optimizing websites simpler. Check them out at pantheon. Io Test And Code. That link is also in the show notes at 79 I’ve also included quite a few links of the different items I’ve discussed in this episode.

I’d like to also thank my Patreon supporters and the people who support my work by purchasing the Python testing with pytest book. I got a really nice note just this morning and the note said hey, I just wanted to let you know that I’m rereading the Python’s book. There is so much that I missed the first time around. It’s really excellent. That was from Ruvin Lerner. Thanks Ruben, but that’s all for now. Go out and test something.