In this episode, I talk with Niklas Meinzer about techniques to speed up test suites discussed in a recent article he wrote and how they can apply to just about any project.


Transcript for episode 85 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.


Fixed software testing strategy is one of the best ways to save developer time and shorten software development life cycle time. Software test suites grow from small quick suites at the beginning of a project to larger suites as we add test and the time to run the test suites grow with it. Fortunately, pytest has many tricks up its sleeve to help shorten those test suite times. Nicholas Minzer is a software developer that recently wrote an article on Optimizing Test Suite. In this episode, I talked with Nicholas about the optimization techniques discussed in the article and how they can apply to just about any project. Thank you to Azure Pipelines for sponsoring this episode. Many organizations and open source projects are using Azure Pipelines already automate your builds and deployments with Pipelines, so you spend less time with the nuts and bolts and more time being creative. Get started for free at azure.com pipelines.

Welcome to Testing Code, the podcast about software development, software testing, and Python.

I am super excited to have Nicholas Meitzer here on Testing Code. We tried earlier this week and for some various internet reasons didn’t work. So we’re crossing our fingers that today will be great. So welcome to the show, Nicholas.

Thank you.

For people that are not familiar with who you are, could you introduce yourself?

Yeah, sure. Hi, my name is Nicholas. I’m a developer from the south of Germany Frywork.

I work in medical it and yeah, we make a web app and it’s a Python based web app. That’s right. Yeah.

Okay.

We use Vexyc, which is the Http stack of Flask and basically we build our own stuff on top of that.

That’s probably an interesting topic in and of itself.

Yeah.

That’s actually the first topic I ever gave a talk on.

I think there’s still a recording somewhere out there.

Talked about Vector Python 2015, I think.

Okay. Yeah.

If people want to Hunt that down.

Yeah. And you’ve spoken at a couple of Euro Pythons, right?

I think justice one and then a couple of Python de.

Okay. I didn’t know there was a specific Germany related PyCon.

Yeah, you can’t really make out a difference. Pikon is all in English as well and it’s basically just the same people.

Basically, we just find excuses over the years to meet and have a good time.

Yeah. I got to try to figure out ways to get to more Pythons. So how long have you been involved with Python?

I think about six or seven years. Basically. When I first joined my company, I was still in University and I had touched with Python just like just to get my feet wet. But really I was more of a Cy. And then this company posted an ad on a student mailing list looking for student assistant who is proficient in Python and web applications. And yeah, I answered and said, I can do none of this, but I’m very interested. And that’s how I ended up there.

I love that. That’s great.

Yeah, I learned everything there.

I’m sure I can learn it, but I don’t know it yet.

Exactly.

Nice.

Bring the spirit and learn the skills along the way. Right.

I think more people should do that. That’s great. And then pipe test specifically. Have you been using that for long?

Yeah, we haven’t. Using that from the pytest two days.

Are we now on five? I’m actually forgetting. What is it?

Five? Yeah, we’re up to five.

Yeah. There you go. I don’t know if they follow any specific scheme with that, but it feels like they’ve been taking that number up faster in the last couple of months.

Well, there’s been a lot of excitement around it. There’s been people using it for all sorts of stuff.

And so one of the things I wanted to have you on anyway, but you wrote an interesting article about optimizing a test suite, and oddly enough, it’s not something people talk about that much.

Yeah, that’s right.

I was a bit surprised myself when I basically went to Google and tried to find out how to improve a test suite.

We have a test suite, and personally, I felt it took too long. Maybe we can have a discussion about this separately. And I think you had an episode recently where you basically said that the speed of a test it doesn’t really matter anymore. Right.

And I actually listened to that at the time when I was researching this, and I thought, wait, am I totally wrong here? But I think basically you can look at it from different angles. Right.

Well, I was going to bring that up at the end, but let’s go ahead and hit that.

Does the speed of a test suite matter?

Clearly it matters. But why does it matter to you?

Yeah, well, to put specific numbers on it. I think our test suite, when I started this project took about five to six minutes, which many people will say it’s not very long. Right. So I’ve been working on some open source projects which have test suited that run 20 minutes or half an hour or something, but I don’t know for my personal taste and my team members agree. And we just like to run our test suite very often. We like to practice TDD, and when you’re working on a feature, then you want to get immediate feedback. Basically, you want to see do they destroy anything?

Where am I in the development cycle? So you want to run your tests often and you don’t want to wait for them. And another thing is when we have the test running in your continuous integration environment, then, yeah, you can post a PR and do something else and come back later. But when you really want to get something merged and you don’t want to sit around just waiting for the green tickets to arrive.

Yeah.

Okay. So on my defense on the line being okay, with long test suites, it’s more on the side of I want to make sure that we’re testing thoroughly, and I want to make sure that people time is valued more than computer time.

And for certain situations, I mean, I would just love to have a five or ten minute test suite that would be awesome.

And in some cases that’s unacceptable. If I’m working on a small Python library, I mean, that’s ridiculous to take that long.

But if I’m working with hardware and software together and test signals, it has to take a while. The speed is definitely still important. For instance, let’s say I’ve got a test suite that runs for half an hour. If it’s only running at night or running a few times a day, there’s plenty of half an hour slots. Let’s say we’re okay with half an hour, but then I add hardware. Now we’re supporting four different kinds of hardware. If I can run them in parallel, great, then I can just run them all at the same time. But if I can’t run them in parallel for some reason, then it’s 2 hours instead of half an hour. So test suite times are very important. But one of the things we do since we do have to deal with long test times is focus in like when we’re developing a particular feature, we usually don’t run the whole suite. We run like a subset.

Oh yeah, okay, that makes sense.

So do you do any of that while you’re developing or since you’ve got this down to so fast, you just run it all the time? Everything.

When I’m developing a specific feature and I don’t only run the tests that are relevant to it. So I do use filtering, obviously.

And yeah, you can obviously do much more with Pythons. You can group stuff with marks and things like that and basically have something that there’s a specific term to call it. So tests that you run on your own machine and test that only run in CLI later.

This episode of Test and Code is sponsored by Azure Pipelines. Azure Pipelines is a continuous integration and continuous delivery service that supports Python and any other language and lets you run automatic builds and tests of your code on Windows, Mac and Linux. It is fully integrated with GitHub and lets you define your continuous integration and delivery pipelines with a simple YAML document. Azure Pipelines is free for individuals and small teams. If you are maintaining an open source project, you can get unlimited build minutes and ten concurrent pipelines. Many organizations and open source projects are using Azure Pipelines already. Get started for free at Azure. Compipelines and automate your bills and deployments with Pipelines. So you spend less time with the nuts and bolts and more time being creative. Again, that’s Azure.com Pipelines.

I like that you write up this and you kind of picked up like three different areas that I think apply to any test project. One of the first things is to pay attention to the collection time. Can you describe that a bit?

Sure. So that’s something that is especially annoying when you’re developing. And even if you only run one or two tests, Python still has to collect the whole test suite. And that’s done by importing all of the modules that you pointed to. And yeah, basically, depending on what you pointed to, that can take a long time. And we initially had that pointed at our roots directory. So Python really went and imported all of our basically our entire repository to find all the tests. And that took a long time. I think it was about 10 seconds. And that doesn’t sound very long, but it’s very annoying when you just want to run a test over and over again. Yeah, it can add up.

Basically the Pipe test feature, there’s a collection only thing, so you can just tell pytest, don’t run anything, just collect stuff. Did you use that to time how long that took?

Yeah, that was one of the interesting command line flags for Python for pytest that I discovered along the way. So you can, as you said, tell Pipe is to just collect the tests, show you what it’s collected, and then be done with it, and then you can wrap a time around that call, and then you know immediately how long your collection time is.

One of the other things that some people have done for startup time for pytest is to try to limit the number of plug ins that are even the built in plugins that are being used. Have you explored that?

Oh, no, I didn’t actually. Okay, that’s probably a good idea. And I was also wondering if maybe there was some way to cash this collection, because I would think that you could look at the files and what has changed and maybe somehow figure out which files you wouldn’t have to import again, I don’t know if that’s maybe something a future feature of Pytos. I don’t know. Yeah.

For large test suites, that might be an interesting thing. So you picked up on and it’s a simple thing is that a lot of people start pipest and talks, does this by default to just start pipest at the root directory of a project. And that includes not just your test, but everything. That’s a really quick, easy, simple optimization is to just add the appropriate setting in the any file to say, hey, just test this directory or list of directories and files.

That’s right. So the easiest thing to do if you don’t have an inifile, it will just take the current directory or whatever directory you give it on the command line. But most projects will have an in file, and then you can give it a list of directories of file to importance. Right. And that’s what we did. So, yeah, we pointed it directly at our test directory, which sped things up to about 3 seconds or 2 seconds. Then we had a problem that we were missing some tests and those were actually dog tests from a couple of our modules which still had doctors or still have dog test and code can tell Pi test to also import and run dog tests. But for that, you obviously have to look at the modules where the dog tests are. And what we ended up doing is just giving these two or three modules that have dog test directly to Pythons as well.

Okay.

Which is sort of a compromise.

Yeah, that’s nice. I like it. One of the other parts that you talked about was set up and tear down optimization. And I know that most projects could take a look at this. In your article, you discussed the database connection. What were the fixtures like before the optimization? And what did you do to fix that?

Right. So obviously as a web app with the database connections, we have a lot of many of our tests use a test client with the database connections to basically mimic Http calls to it, which is very nice to program, very handy. And you can write tests that are really close to what you will then later be doing in production. But obviously this way you have a lot of boilerplate that needs to be run for each test. So basically we had I think it’s called a client fixture that tests could request and then that would basically create the complete database for them. And we use a SQL in memory database. So we use SQL. Alchemy for the RM. It’s really cool because you can just use a secret in memory database, which is really fast and it’s easy to throw away after the tests. But even this fast operation, if you do it a thousand times, it still accumulates, right?

Yeah.

And so what I then wanted to do is to look at a way where I can for the whole test session only once created a database, keep it, and for each test just clear it. So each test still gets a fresh database, but it doesn’t have to be created for each test.

It was just cleared. And the rationale behind it, this is obviously that data. This is just me assuming things. But I guess database engines are optimized for stuff that they do often, which is for example, deleting stuff or inserting stuff, selecting stuff, but creating tables, that’s not really an operation that you do often. So I guess it’s not as optimized. That’s just some assumptions there. But yeah, basically what I did is I split the split the picture up into two fixtures. One, that’s just the database engine, that’s a session scope fixture now.

So the database engine will only be created once per test run, per test session. Basically it creates all the tables and it’s done there available for the other fixture, which is the client fixer still. And it just uses this database engine and just clears all the contents before it yields, basically.

Nice. Yeah. So the clearing of it can happen before every test, but you only set up the database at the beginning of the test suite. That’s right.

And this is, of course, some trial and error to figure out which is the quickest configuration in which you can do this. And really, if you just tell it to drop all the tables, there’s a little hack you can do for SQL. You can just disable all of the consistency checks so you can drop the tables in whatever order, and it won’t do any consistency checks after each drop. That speeds it up a little. So you disable all the checks, drop all the tables, and enable the checks again.

Okay.

And then you’re good to go.

Now, I don’t know if Sqllight allows it, but some people also do this double level thing with transactions around a test. You can start a transaction before the test runs and then roll it back after the test. Yeah, sounds interesting, but it really depends on what your database functionality allows and whether you’re doing. If you’re doing transactions within the test, then that won’t help you much.

Probably not. Yeah, we’re doing transactions.

I guess the RM does it.

Yeah. But the simpleness of just starting with an open database, it was like the best example I could come up with for using multilevel fixtures. It’s used all over the place. I do that with test instruments, the connection to a device and making sure that certain firmers loaded, and all sorts of extra work that has to be done can be done once. And then I know that the code within the test is not changing certain things, but just right before a test, a lower latency call that’s just to get us back to a reset state. And yeah, it speeds the things up a lot. So, I mean, this is great for databases. It’s great for really any resource where you can set the connecting or setting up of a resource, but you don’t have to do it every time. So that’s cool. I like it. Yes.

Python Pictures is a mighty tool, and scoping can be a bit confusing. I think if you start out using pictures of different scopes that use each other, and you must be careful in which order you use the stuff.

Yeah. Function scope can use a module scope, but not the other way around.

It makes sense if you think about it, but sometimes you just have to do some extra thinking how you want to structure your stuff.

Yeah. One of the things that people run into right away is using temp directories, because the default temp directory fixtures are function scope, which you kind of want them to be. It’s a temporary directory for your test. If you’re using that in a fixture, that’s module scope, you suddenly have to use a different fixture. Because of the Scoping rules.

There you go. Is there an extra fixer for the other scopes?

Yes. There’s a temper factory and then there’s a temp path factory. That’s a session scope. You can have temporary directories, be pytest, native objects, or if you want to use path Lib, there’s one for that too.

So it’s cool.

Now the last thing I wanted to talk about was you have this story about PDF reports.

That’s right.

So what happened with the PDFs?

So that’s another interesting thing that I noticed that took some time.

Basically our software spits out PDFs on the site whenever interesting things happen. It’s a medical software and doctors, hospitals, they want records of everything. So it’s baked into the software that PDFs are created for certain events when documents within the software change state and stuff like that. And it’s so back then that you can sometimes you test something completely different and you change the state of a document and PDF comes out, you don’t even notice. And for the creation of the PDFs, what we use is latex or latex or whatever you want to call it that thing.

And we use that with an external call to open call process call.

Yeah. And obviously that takes some time to open the call and let the PDF be generated and come back to mitigate that.

Basically you can mock this out in every test.

But I wanted to find a way to do this to mock it out automatically without having to change all tests, but sometimes be able to not Mark it out. So there are obviously a couple of tests where you want to test this kind of functionality and then you want to tell it to not knock it out. So yeah, it’s in the article I posted a little code snippet. So what I use is a fixture. It’s set on auto use and it goes into the module that makes the poem call and marks it out.

I really love this example and I like that you set it up as an auto use. I think it’s a good use for an auto use fixture. But then you also pay attention that there are cases where you’re not going to want to have it happen.

So there are certain tests that if a test is really testing that PDF functionality, then you can Mark the test with don’t mock the PDF output or something like that.

Yeah, exactly.

That’s a great way to get around it, because my first thought is to only Mark it when you want to Mark it. But then that’s most of the tests and flipping that logic of just marking the tests that need it to be not marked. I think that’s a great idea. This is of course helpful for people that are printing PDF reports using this tool. However, I think it’s an interesting idea just to think about the code under test what you’re testing. There might be some expensive operations going on that for a lot of the tests don’t have to be done because that’s not the focus of your test and looking for ways to stub out functionality.

It does change. It changes the system under test, but it also speeds up the testing quite a bit. It’s pretty cool. Yeah.

And you can think about different applications for this, I don’t know, making calls, external services or something that just want to block for the whole test suite and don’t want to think about it for every test.

Are you happy with how fast your suite is now?

Yeah, I’m happy with it. And I’m definitely also even more happy about the things that I just learned on the way when you have the time. And I think I worked on this for on and off for about two or three weeks. And if you have the time to just dive into these things, that’s really fun and understanding that you have after this.

Well, you probably have a really good understanding of what everything in your test suite is doing and probably understand your system under test better also.

Yeah, definitely better than before.

Yeah. I love that you wrote about this.

I’d like to have more people talk about this. We had this gnarly test thing or even this mildly annoying test set up that we sped it up using this one method. And I think it helps other people. So if anybody out there listening has some other cool examples, write about it. And even if you don’t write about it, contact me and maybe we can get it on the show. It would be cool. Wouldn’t you like to have more people talking about this like you said?

Yeah, definitely. I mean, we had a good Twitter feed going there with many people jumping in, suggesting stuff, thanks to everyone who participated in this. I think I got all my starting points from my investigation from that Twitter feed. I mean, there’s also obviously there’s the obvious suggestions, like use. pytest exists to just have your stuff run on multiple cars. And that’s obviously a good idea if you just want to just throw more cars at it, right?

Yeah.

I just wanted to see if there are other things and there are other things.

Cool. Any future plans to yeah.

Well, something I did learn along the way. I did not actually put this into the article, but actually our test suite, we have a couple of plugins to our software, and they have their own pilot test suite, and they use some stuff from the core test suite. And that is something that I also overhauled in this project. And for this, I basically learned how to write my own pytest plug in and use that expose my own hooks and stuff and use them. And that’s something that I found so interesting. That’s basically my next article that I want to write. And I also actually I think this was my first or second contribution to Pi test because I just found that the documentation that was somewhat improvable and I shot it in PR and was merged quickly. So if you read something about pytest plugin hooks, some of that may be from me.

The hook system is incredibly powerful and incredibly not documented enough.

It’s very great. Yeah, it’s really cool. I think I stole some of that and borrowed some of that architecture with the hooks for our own software.

Well, thank you so much for talking with me.

Thanks for having me.

Thank you to Nicholas for this great interview and for writing about test suite optimization. Also thank you to Patreon supporters for continuing to support the show. You can join them by going to testingco. Comsupport and a big shout out. And thank you to Azure Pipelines for sponsoring this episode. Automate your bills and deployments with pipelines so you spend less time with the nuts and bolts and more time being created. Get started for free at azure.com pipelines that link is also in the show notes at testandcode.com 85 that’s all for now. Now go ahead and test something and maybe spend a little time optimizing that testing you.