164 - Debugging Python Test Failures with pytest

An overview of the pytest flags that help with debugging.

Transcript for episode 164 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.

00:00:00 Hey, welcome to Test and Code. Thanks for listening. I am just wrapping up Chapter 13 of the second edition of the Python Testing with pytest book. So Chapter 13 is about debugging test failures, and it goes over for a lot of things. It goes over debug flags within pytest. It goes over using Debuggers and even combining PDB and talks together for Debugging in individual talks environments. It’s pretty exciting. It’s a new chapter for this edition. What I wanted to talk about today, though, was just a snippet of this because one of the things I did within this chapter is pull together all of the Pytest flags that can be used for debugging. And I thought it’d be fun to just go through those. So we’re going to do that today.

00:01:01 Welcome to Testing Code.

00:01:13 Thank you, PyCharm, for sponsoring this episode. Why not debug test failures right in PyCharm and not with command line flights? When you’re debugging with PyCharm, you can use breakpoints really easily. You can start the debugger easily. You get a window for local variables, and you even have those local variables displayed in the edit window where a variable assignment happens. It’s pretty nice to actually debug within PyCharm. Debugging with PyCharm or debugging with flags is not mutually exclusive. Pycharm has run configurations. So if you have a test and you say run that test, it’s going to create a run config, and there’s a dropdown menu that you can edit and you can add my test flags. So you can add X to stops on fail or all the other flags that we’re going to talk about later. And it’s really easy to use both flags and PyCharm together to debug efficiently. Check out Pycharm at testingco.com, PyCharm and get four months of pro for free.

00:02:18 Okay, let’s go through some of these flags. These are great. So I’ve been using these for years, but I never really pulled them together into one list until I started working on this Debugging Test Failures chapter for the second edition of the book. Again, check out the book.

00:02:36 Information about the book is at pitestbook.com first of all, there’s a lot of flags that you can use to specify exactly which tests you want to run.

00:02:46 You can use M for markers and K to use keywords. I’m not really going to talk about this much more here.

00:02:54 Right now, I’m focusing on the flags that are specifically after a test has failed. So you’ve ran by test once and there’s some failures. So what do you do?

00:03:06 Again, the entire Chapter 13 is about what you do completely, but I’m just going to talk about the flags that I find useful.

00:03:15 The combination of two flags that I like a lot is LF and X. So LF is the short version of Last failed. So it’s dash. Last failed. Actually, it’s going to be a pain in the butt to talk about dashes all the time. They’ll just talk about the flags. So last failed will run just the test that failed the last time. So if you run if you’ve got five failures and pass it again, run it again with LF, it’ll do just those last five. What if you want to run all of them but start with the failed ones? Well, then you can say FF or failed first. That runs all of the tests but starts with the failed ones and then the rest of them. And then X is shortcut. It’s also exit first. It’s a shortcut for Max fail equals one. So the idea is I want to stop after the first failure, and the combination of those two things, LF and X, I use all the time because I’ve got a test run with failure. Maybe I want to use verbose or something to look at it again, but I really just want to start with the first failure and go from there. So X and LF together will do that. It’ll just run the last failed ones, but it’ll stop after the first failure. That one is perfect to combine with other things.

00:04:46 More recently, there were a couple of new flags added since the last version of the book, for instance, stepwise and stepwise skip. So these are pretty cool. So these are almost like the LF and X combined. So what stepwise does is it just runs the whole suite, but it stops at the first test failure. And then if you run by test again with step wise, it just runs that first failure. It just starts there, starts with that failure, and if that test passes, it continues to the next failure. So you can kind of debug your code and debug your test at the same time and just keep running by test, step wise, and then it’s subsequent runs, it will just run that one test until it’s green again, and then it goes to the next failure if there’s any.

00:05:40 And then there’s stepways skip, which is okay, let’s say you’re doing this step away thing and you’re stuck and you don’t know how to fix this one, but you’d like to maybe do the next one. So skip says skip the first one and then go to the next failure. So that’s kind of cool to use those together.

00:06:00 I skipped over a couple Max fail equals NUM, which is nice if you’ve just got a ton of failures, but you just want to see the first few. You can set Max fail equals like three or four something just to see a few failures, especially if your failures generate huge logs. It might be really nice just to have this on a lot anyway, like maybe set to Maxwell to ten or five or something to avoid a whole bunch of screenshots of all the failures. Maybe. I don’t know. It’s useful if it’s there.

00:06:32 Like I said, I usually just use Max fail equals one when I’m debugging, just to find the first one and a shortcut to that is X or just stop on the first failure. A couple of other ones that aren’t really just for debugging, but I use them for this B for both.

00:06:53 It displays all the test names passing and failing. It’s kind of fun if you’re feeling like the dots aren’t giving you enough positive feedback about all your past. Test.

00:07:03 Verbose will help with that. It’s neat, but then also you can add extra ones, so you can do VV or a couple of these. It’s extra verbose, and you can have more information in your trace backs. And speaking of trace backs, the trace back flag. There is a flag for that to have the trace back style. And just to be honest, often when I’m first running a bunch of tests, I will turn off trace backs because I don’t want to see them right now. So I do that with TB equals no and it shuts off trace backs.

00:07:39 And then when I do want to look at the first failure, I can turn trace back on and just do the last failed. So often I’ll start with trace back, but there’s a bunch of other options too. There’s Auto, which is just a kind of a medium length thing.

00:07:53 Trace back there’s long for the full trace back. You can have short for just like a few lines around the failure. There’s a single line you can say trace back to the line and then native, which to be honest, I’m not sure what that does. It’s just whatever the trace back thing is on your machine or with Python, but I don’t use it.

00:08:15 Another handy one is L or show locals, and I wish I actually did this more because it’s very useful. Sometimes I forget about it, but it does the trace back and shows the local variables alongside the stack trace. So you’ve got your stack trace, and it’ll also show you the stack trace might include the local variable listed there, but not the values. Dashl shows you what the values of all the local variables in the function that had the assert, so that’s super handy. It also encourages people too, including me, to have more local variables of intermediate stages so that I can use this when I’m debugging.

00:09:00 Now, specifically around debuggers, I’m not going to go through this. This would be hard to do in a podcast anyway, but there is a few flags around using PDB as the debugger. So PDB is a command line debugger built into Python. It’s actually kind of fun to use.

00:09:20 I was reluctant to try it at first because I like ideas, but there are times where this is really handy, and so just trying a little bit is good. I do have a full description of how to use or not a full description, but an introduction to PDB within the Chapter 13, but I’ll just list these flags. So PDB. What that does is as soon as there’s an assert or right after the assert, it launches PDB. It is an interactive debugging session. I don’t find this incredibly useful because it’s after the assert. I probably want to step through the test function before the assert. So there’s another one called trace. And what trace does is it starts the debugger.

00:10:11 It’s as if there’s a Breakpoint at the start of every single test, which is cool, but that’s not the failure. It’s just every single test. So if you combine that again, if I combine the LF for the last failed with trace, then I get exactly what I want. I start the debugger at the beginning of the test that failed the first test that failed. That’s perfect. So I use this a lot. This is great.

00:10:40 Again. Also, PDB isn’t for everybody.

00:10:45 For instance, you can use a flight called pdbcls for a PDB class where you can pass in a particular class object that gets called instead of the PDB debugger. This is rather complicated, but the documentation on pipest.org specifically tells you how to use the IPython terminal debugger and that’s kind of fun to use that instead of PDB. But anyway, so those are the flags run through those again. At last. Failed, failed first exit first. Max fail.

00:11:24 I forgot new first. So new first is kind of neat. It just runs all the tests in the order of file modification time. So while you’re writing tests, you kind of probably want to run the tests that you just wrote recently first. And so ordering them in that order is kind of neat. Okay, then there’s stepwise, stepwise, skip. We’ve got verbose and trace back options. Show locals and then various ways to start PDB with PDB or trace and then also PDB class to have an alternate debug. So those are the flags that I wanted to talk about today.

00:12:03 My favorite combination probably is with TB equals no at first and to turn off trace backs. Then when I’m debugging the failures, I do last failed and X and Dashl if I can remember it for show locals and that will just run the first failure and that’s it. And stop after that and show the locals. That’s great. And if I combine that with trace for the debugger, I can pop into the debugger. So those are the flags I want to talk about.

00:12:36 Let me know if I missed one or if there’s other ways that you really like to debug test failures. I’d love to hear about that. Thanks.

00:12:50 Thank you, PyCharm for sponsoring this episode. Check them out at testandcode.com. Pycharm. Thank you, Patreon supporters join them by going to testandcode.com support. Those links are also in our show notes at Test And Code.com. Onenew, six, four. That’s all for now. Go ahead and test something.