unittest is an xUnit style framework, which makes it fatally flawed for software testing. This episode is my opinion of “Why NOT unittest?”, or more broadly, “What are the fatal flaws of xUnit?”
Transcript for episode 173 of the Test & Code Podcast
This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.
00:00:00 In the preface of Python Testing with pytest, I list some reasons to use by test under a section called Why pytest.
00:00:06 Someone asked me recently a different but related question of Why not Unit Test?
00:00:12 Unit Test is an X Unit style framework. For me, X Unit style frameworks are fatally flawed for software testing.
00:00:19 That’s what this episode is about. My opinion of Why not Unit Test?
00:00:23 Or more broadly, what are the fatal flaws of X Unit?
00:00:29 Thank you, PyCharm for sponsoring this episode. Pycharm saves me time. Error highlighting lets me know as soon as I type a mistake.
00:00:37 Code completion and parameter hint pop ups save me from having to look stuff up.
00:00:42 The Get integration makes all of my Get workflows super easy.
00:00:46 And then you get a whole bunch of bonus features like the database front end that are super cool and a super fast debugger.
00:00:52 I wonder how PyCharm will save you time. Find out yourself at testandcode.com. Pycharm welcome to Test and Code.
00:01:25 In the preface of Python Testing with pytest.
00:01:28 I list some reasons to use pytest under a section called Why pytest. By the way, the second edition is now content complete. I’ve incorporated feedback from many technical reviewers and updated just about every chapter. Now it’s off to copy editing and indexing. The print version hopefully will be available in February. Very exciting. Anyway, back to Why pytest. This is what I say in the book. Here are a few reasons pytest stands out above many other testing frameworks. Simple tests are simple to write. Complex tests are still simple to write. Tests are easy to read. You can get started in seconds. You can use assert in tests for verification, not things like self assert equal or self assert less than just assert. And you can use pipest to run tests written for Unit Tests or Nos. Someone contacted me after reading this and said, this seems like the second half of the story, the first half being Why not Unit Test? Like I had done an evaluation of testing alternatives and come to the conclusion that pytest was best, but they also wanted to hear more about what didn’t I like about Unit Test part? I think that’s fair enough. I decided to leave it out of the book because in the book I don’t really want to Slam Unit Test, but I think it’s fine to talk about it on the podcast. This episode is the response. This isn’t going to be a treatise on why you shouldn’t use Unit Test. If Unit Test works for you, great. It just doesn’t for me. But it’s not just Unit Test, really. It’s the X Unit system altogether that I think is fatally flawed.
00:03:12 Unit Test is based on J Unit, which is for Java. J Unit was not the first of its kind. Guest Unit for Small Talk started all of this, and I think Kent Beck was the one that did both of these. I think he had some help with other people.
00:03:27 Now there are frameworks like JUnit for tons of different languages, and they are collectively known as X Unit.
00:03:36 There are some old references to Unit Test being referred to as Pi Unit even in the documentation for Python 2.7. However, as far as I know, the module has been Unit Test since it’s been part of the standard library in Python 2.1, and I don’t hear people talking about Pi Unit that much anymore. However, Unit Test is still an X Unit style framework. X Unit style frameworks were all that I’ve known up until the time I ran into Pi Test. I tried Unit Test actually around the same time I tried pytest around 2012 or 2013. My first blog post about Unit Test is from January 23, 2013, just a couple of weeks before my first Pite Test blog post. But I’ve used several other custom built X Unit style frameworks. Even way before then. There were internal tools at companies.
00:04:34 A couple of the versions were Pythonbased Xunit tools, and one was written in C Sharp. So what is X Unit? I’m going to describe it from the perspective of someone writing tests specifically with Unit Test, but I think it applies to most X Unit based systems. To write a test or to use it, I create a new file and I put my test in it. Maybe I call the file testunderscoresomething Pi in it. I’ve got an import Unit test at the top. Now I create a class, say test my stuff, and I derive that class from Unit test test case. This is common amongst all of the frameworks. You got to derive from a special test case.
00:05:16 Inside my test class, I’ve got to have test methods for things that I’m testing. And these test methods, since they are methods of a class, they have to have a first parameter of self, and then within the methods I’m going to check some things. And to check things, those checks are done using things like self assert equal and self dot assert true, etcetera. A whole bunch of these assert methods that Unit test provides that you can reference through the self object.
00:05:47 Now let’s say I’ve got three test methods in the class. I can write more and I can write more classes in the file.
00:05:56 They’re kind of a way classes are really a way to organize tests. So I can organize them in a whole bunch of classes if I want or different modules.
00:06:04 And then at the end of the file it’s traditional to stick a couple of lines at the end of the file, like if Dunder name equals Dunder main, call unit test main. These are actually optional now, but I still see them in a lot of documentation. What these lines do allow you to do is they allow you to run the test file in the command line with just Python test something Pi.
00:06:32 Without those lines, you can still run your test, but you run it with Python M unit test test something. So you have to call the unit test module with dashmunit test and then give it your file name. Okay, so right now already, there’s a lot of stuff some people don’t like about unit test. It’s not the focus of my problems with it, but let’s list a few of them. So far, you have to derive from a class, the unit test test case and have to have all test code in a class.
00:07:07 Okay, that’s part of what I don’t like also, but there’s more to the story. So pytest with pytest you can use classes, but you don’t have to, and you don’t have to derive from anything. The naming conventions of unit tests is in Camel case. They got this from the Java version, not the normal snake case of Python. That’s actually a minor gripe for me. It doesn’t really bug me too much, but bugs a lot of people. pytest is closer to modern Python conventions. And then there’s all of the assert methods. All the assert methods are hard to remember. It’s a small annoyance to me, because in reality you end up using a handful of methods frequently, and you have to know where the list is if you need to use it later on and look for something else.
00:07:57 A lot of times it’s just assert equals is what I want to use. But equals is not just one. It’s not just assert equals. You have assert sequence equals and assert DICT equals and assert list equals and a certain set equals. And there’s just all these different types of equal. Why can’t we just have one? But the other ones, like for instance, list equal, will have a better output for lists that are not equal. And the reason for all of these helper functions are so that it isn’t so that you can’t compare with just a normal assert, which you can actually, but you just won’t get good output. It’s the output. So if the assert fails, you want to have explicit output as to what’s gone wrong. And that’s where all these helper functions do. And then it’s the Dunder name equals Dunder main thing at the end. Some people say it’s boilerplate. That’s unnecessary. And it is you don’t have to have it, so you can just take it off if you don’t want. So that’s some of the gripes that I’ve heard about unit tests, and they’re just minor things for me.
00:08:59 But let’s go on. Before I get into my fundamental complaints, let’s also talk about the setup and tear down procedure for these tests, because if I want some code to run before each test in a class, I can write a set up and tear down methods. Okay, once again, one of the things that bugs me is this is Camel case again. So set up isn’t just set up, it’s set up with a capital U, and tear down is with a capital D and no underscores.
00:09:25 The set up runs before each test method, and the tear down runs after each test method. And then I can connect to a database or grab some data or something. But okay, then if I want to have data in my setup and I want to pass that to a test function, I’m going to need to store it as a member variable, like say self DB or self some data or something. Now if I want the data or connection to persist for more than one test, not just the set up and tear down, I can do set up class and tear down class, and that allows me to share data between multiple tests.
00:10:04 Like I can keep a connection to a database or a service open.
00:10:08 Except that those are not test methods, those are class methods. And the data stored is not self dot something. It’s like class dot something. So you got to store it as class variables instead of instance variables. Okay, so remember that class. Also, because it’s a class method, you have to remember that class methods in Python need to be decorated with the Atlas method decorator. Okay, also what if you have related tests in a class but it doesn’t need some resource? Well that’s too bad. These are like broad scale things. Your only option if you want to run a test without the resource being initialized for a particular test is to move that test to a new class because these are big Hammers.
00:10:55 So why are set up and tear down normal methods instead of class and tear down class class methods?
00:11:02 Now we’re getting at one of the fundamental problems of X Unit frameworks.
00:11:07 So my mental model of how unit Test and other X Unit system deals with my test class turns out to be wrong. So my incorrect assumption was that my class gets instantiated into an object and then unit test calls setup class and then it calls set up and then like say test one and then tear down and then set up and then test two, and then tear down, etc. E. And then the tear down class if it’s available.
00:11:36 The setup class and tear down class are only called once for the whole class, and the setup and tear down are wrapped around each test function. This is my mental model of how this works, but it’s wrong. So what actually happens is unit Tests will look at how many test functions there are. So in my case there’s three. So it instantiates three instances of the test class. So I’ve got three test objects and then it runs the class method setup class.
00:12:07 But it doesn’t matter what objects. It could have done that before, but it waits until after for some reason. And then each of the test objects have their set up and then one of the test methods and then the tear down called and then the class method tear down class is called. So that’s why you can’t have set up class and tear down class be normal methods, because unit test really doesn’t want you to try to pass data through instance variables because you’ll have three instances. You can’t pass it between those three anyway, and set up and tear down or pass you can pass the test through instance variables because those are run on the object. So that’s also why setup and tear down methods are three objects, and they all get their own instance data. The reasoning behind all of this is to encourage test isolation, I think, but it’s a little bit of a head shift to try to get this understood. To me, at least, it was a little strange.
00:13:06 Now that we know what X unit is and how it works, kind of. Let’s get to what I think are the fundamental issues with Unit Test and Next Unit. So my fundamental issues are that the class inheritance and class structure thing is weird.
00:13:20 The class object model is used for test isolation and for test function organization, and that’s not normal object oriented program reasons to have a class, having to have the correct mental model to understand why class methods are needed for setup class and tear down class, and why you use set up and tear down or not their instance methods.
00:13:45 It’s annoying to have to keep that straight. And a lot of people that use Python never write their own classes. So forcing them to learn how to use classes just for test code. I think that seems mean.
00:13:58 In fact, I tried to teach somebody an ex unit based system before way before coming up with this, and they made a comment of and I didn’t even think of it as being a problem.
00:14:09 This is how you do it. And they said, okay, they took a deep breath and said, I’ve never been really comfortable with object oriented programming, but I’ll go read up on it and read up on classes and then come back to this. And I had to let them know that mostly it was bookkeeping stuff and you didn’t have to learn object on your program to use test classes. But that’s the fear for a lot of people, and I think it’s a valid fear. There’s a lot of code that people don’t write their own classes. I think that’s valid.
00:14:43 Okay, so when you’re writing your test functions, you have to remember that it’s a method of a class. So every method needs itself parameter, even if you don’t need it. If you don’t grabbing data, and if you’re not passing data between setup and your test functions, then test classes really don’t help you much, but you have to have them anyway.
00:15:04 Also, I think just that classes and objects are a weird thing to use for a mostly procedural problem anyway. That’s my belief. I don’t think that class inheritance and class structure things should be necessary, and specifically having to use a specific test case to derive everything from it’s. Odd.
00:15:23 Anyway. Okay, the second big fundamental problem is resource handling through setup and tear down. This is error prone and clunky.
00:15:32 One of the things to remember is that tear down doesn’t get called if the setup fails. So if there’s an assert or an exception thrown and set up, the tear down won’t get called. What does this mean? So if you have two resources set up and set up. So let’s say you have like a connection to a database and a connection to an API endpoint.
00:15:54 If you have two research set up, you have to make sure that they get cleaned up properly, and that clean up has to be both in the set up and the teardown function.
00:16:02 You heard that correctly. It has to be in the set up also. So let’s say you have your two connections, an API endpoint, a database connection. If the database connection fails, then you have to remember if you set up the API first and then you set up a connection to the database, but the database fails, you have to remember to go back and tear down the API connection in the setup function or else it won’t get torn down. It’s just the way it is. If you have a whole bunch of things you have to connect, you have to make sure they get cleaned up in the setup.
00:16:32 And then my third gripe is kind of a mixture of these two things, mixing the two classes and resource usage because of the granularity of setup and tear down and set up and tear down class.
00:16:46 It’s weird. I don’t like that. Having to organize your functions, organizing your functions into classes. If that’s what floats your boat, great. But having to do it based on what’s similar in testing and what similar features, that makes sense. But organizing them just around what resources they use, that’s a little clunky. So if you want to, let’s say set up a database connection in the class set up. You need to have all of your tests be part of that class or it’s not going to work or you’re going to have multiple if you have multiple classes, you’ll have multiple connections to the database.
00:17:29 So organizing similar test functions into classes might be handy. Having to organize your test functions together to maximize use of shared resources is weird. And having to break apart test functions because they use different resources, that’s also annoying and weird. So how does pytest fix this? In Pytes, you don’t have to use classes. Just using functions is fine. No self is needed. You can use classes if you want. You don’t have to. Resource set up and tear down can be in a fixture, and it usually is in a fixture. So resources depend on other resources. I mean, fixtures can depend on other fixtures. So let’s say I’ve got a database connection.
00:18:09 You can set it up for any test that needs it, and it can persist for a long time. I can set the scope to be session scope or module scope or whatever.
00:18:19 And then I want to like, let’s say, but I’ve got this. It’s a temporary database, but I want to reset it to a known state at the beginning of each test. That would be easy. So I’ve got to reset the database fixture. That depends on the database connection fixture, and then the test can just depend on both of those are one of them or something, and it works. The session and package and modules, module scope for the connection and the function scope for the reset. They just all work.
00:18:50 And you can even have multiple layers of this stuff and it doesn’t get too confusing. So what about the tear down having to have the tear down everywhere? Well, with pytest fixtures, if you put a resource in handling in a fixture, you’ve got the set up and the tear down in the same function. It’s just separated by a yield statement. It looks weird at first, but it’s really handy to have the set up and tear down together. And then let’s say my connection to the fixture set up for the API endpoint that succeeded. It’s going to be there. How about the database? If that connection fails, I’m never going to run the test, but the tear down for the API call is always going to get called anyway, so you have less to care about there. It’s nice.
00:19:38 So you can still use classes to organize your test if you want to, but you don’t have to. And you can have individual tests that need no resources in the same class as a test that needs several resources. And if you only run that one, you won’t initialize the resources. That’s nice. If it makes sense for your test organization to keep things together, go for it. Also plain asserts rock instead of all of these assert methods having just assert. And so how does Python do this magic? Well, it rewrites the asserts into it has all those helper functions also, but you just don’t have to know about them, you don’t have to look them up. It looks at your arguments and says, oh, you’re comparing two lists. I’m going to give you the equal method that’s optimized for list comparison, and it just does it behind the scenes. I love that. So there’s also like parameterization that’s pretty amazing and plugins and really easy ways to filter test base on markers and command line flags, et cetera. But that’s a whole bunch of other reasons why I love Pi Test. That’s not around the reasons why I don’t like Unit Test and X Unit. I’ve never really told people the reasons why I don’t like Unit Test and Code X Unit Frameworks, so I’ve already kind of covered it in this episode.
00:20:57 As a recap. What are the reasons the fundamental flaws that I think are the class model is annoying and broken for the problem it’s trying to solve and organizing and running tests and setting up resources. I think the class model is overkill and annoying. So that’s number one. Number two resource handling through methods set up and methods set up and tear down and set up class and tear down.
00:21:23 It’s just weird and it’s problematic and then you can’t change resources very easily like you can with Python fixtures. So those are the fundamental problems I see with X unit style frameworks. However if you don’t have to set up data or resources in your tests, if you are just using test methods and you’re not using setup and tear down or maybe you only have one resource to deal with and mostly all the tests needed. Well in those cases unit test might work fine for you as long as you’re okay with writing classes and writing self and all your methods and keeping a list of assert methods bookmarked and handy to look up fine but I think for projects that have multiple resources or resources with state requiring levels of state reset I don’t think I would want to work with X unit systems like unit tests again for complex systems. I think it’s just fundamentally broken for those problems and I don’t think it can be fixed. I could be wrong. Let me know if you think I am but pytest is cool and it’s the only test framework I’ve encountered that thought of the problem differently. Most test frameworks are this X units style. pytest has fixtures for dealing with resources, set up and tear down of resources in the same function. You don’t have to remember to clean them up. I also like not having to teach people about class methods and object oriented stuff just to teach them how to write a procedural test code. I like pytest elegance pytest fits my way of thinking and working way closer than any test framework I’ve used before.
00:22:59 I guess I’ll end it there.
00:23:05 Thank you PyCharm for sponsoring this episode. Visit testandcode.com/pycharm for a four month free trial of PyCharm. Save time, use Pytran. Thank you Patreon supporters join them at testandcode.com support. That’s all for now. Now go ahead and test something.