168 - Understanding Complex Code by Refactoring into Larger Functions

To understand complex code, it can be helpful to remove abstractions, even if it results in larger functions. This episode walks through a process I use to refactor code that I need to debug and fix, but don’t completely understand.

Transcript for episode 168 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.

00:00:00 In this episode, we’re going to talk about a technique that helps me understand complicated bits of software and sometimes results in larger functions. Thank you to PyCharm for sponsoring this episode. PyCharm helps me to understand and play with my code. The refactoring tools are amazing. A simple one is just to rename a method and it just gets renamed everywhere. There’s a whole bunch of other cool refactoring tools as well. If I changed a bunch of code, I can visually see the difference between my code and the Git repo code, and I can even visually walk through the local history to see all of my changes. I actually love refactoring, and PyCharm helps me have fun while I’m doing it. Try PyCharm for four months by going to testandcode.com/pycharm.

00:01:01 Welcome to Test and Code. I want to describe a method that I use frequently to understand complicated code, and I think there’s a generalizable lesson here, but I’m not really sure how to express it. So I’ll discuss what happens, and we’ll go from there. A recent example that I’m thinking of is from refactoring some test code, but I’ve done this technique with production source code as well, in several programming languages. So I had a complicated test. By complicated, I mean that I don’t know exactly what it does just by reading it. I can tell that it’s similar to other code that I’ve seen, but it’s a little different.

00:01:51 This is just one test function. It may have been subjected to copy paste modify at some point, but maybe it was just generated like this. In any case, I don’t quite know what’s going on.

00:02:05 The function calls other functions to do some of the work. So I go down and look at those functions, and if those functions kind of do a little bit more work than I’m expecting by the function name, that helps to confuse me more. So in this particular case, I had a test function that I didn’t quite know what was going on because some of the code was hidden in helper functions.

00:02:30 And then I’ll pull this technique that I’m about to describe out of my toolbox when a couple of things are true: the code isn’t working completely in all the cases where I need it to be working, and it’s kind of urgent that it starts working, so surgery is necessary. So this is the technique. I start with isolating the code. I may stick the test function or whatever code I’m refactoring into its own file. In this case, I did, I put it in its own file. If it’s not really appropriate to move the code, I might just surround it with big comment blocks so that I can visually see all of my changes versus the rest of the file. And then I start removing functions that it calls, and I don’t just delete them. I replace the function call with all of the contents of that function, so I’m expanding my test function, or the function I’m refactoring, by removing abstractions, replacing the function calls with the contents of those functions. And then I do that with all of them, especially with the functions that are either nontrivial or where the contents of the function don’t quite match what I expect the function to do based on the function name. I want to point out that while I’m doing this, so in this case it was a test function, I will run the test function lots of times during this refactoring process. I want to make sure that I’m changing the structure of the code, but not the behavior. So I’ll keep running the test to make sure that it doesn’t change the outcome. I’m not looking to fix anything right now. I’m looking to understand what the code does. Now, if this is source code that has tests for it, the same is true. I’ll run those tests to make sure the tests are resulting in the same way: the same failures are happening, the same passes are happening. That’s what I’m looking for when I’m changing the structure. And if there aren’t any tests for this function that I’m refactoring, I’ll write some. 
So I’m walking through this function, expanding it by replacing calls to other functions that are confusing to me with their contents. I end up having a lot of code there, but I’ll use visual white space to group the code into code blocks that logically seem like they should be going together.
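A minimal sketch of this inlining step, assuming a hypothetical shopping-cart test. None of these names come from the episode; they are invented to show a helper that does more than its name suggests, and the same test after the helper bodies are pasted in and grouped with blank lines and one-line comments:

```python
# Hypothetical example: every name here is invented for illustration.

# --- Before: the test hides work in helpers ---
def make_cart():
    # The name suggests "create a cart", but it also adds an item
    # and sets a discount.
    cart = {"items": [], "discount": 0.0}
    cart["items"].append(("widget", 10.0))
    cart["discount"] = 0.1
    return cart


def cart_total(cart):
    subtotal = sum(price for _, price in cart["items"])
    return round(subtotal * (1 - cart["discount"]), 2)


def test_total_before():
    cart = make_cart()
    cart["items"].append(("gadget", 20.0))
    assert cart_total(cart) == 27.0


# --- After: helper bodies pasted inline, grouped with blank lines ---
def test_total_inlined():
    # create an empty cart
    cart = {"items": [], "discount": 0.0}

    # add a default item and a 10% discount (was hidden in make_cart)
    cart["items"].append(("widget", 10.0))
    cart["discount"] = 0.1

    # add the item this test actually cares about
    cart["items"].append(("gadget", 20.0))

    # compute the discounted total (was cart_total)
    subtotal = sum(price for _, price in cart["items"])
    total = round(subtotal * (1 - cart["discount"]), 2)

    assert total == 27.0


# Run both versions: the structure changed, the outcome should not.
test_total_before()
test_total_inlined()
```

The inlined version is longer, but everything the test does is now visible on one page, and each comment is a candidate name for a function that might be extracted later.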

00:04:53 And I’ll add comments, code comments. Hopefully I’m adding comments that are one liners, that maybe can be used as names for possibly new functions. I’m thinking of an analogy here. So at this point, I’m really just pulling all of the code that’s relevant together into one page. And I think of these like those crime shows where people fill up a wall with pictures of evidence and people and such, and they move things around. They can move stuff around, and they can stick sticky notes around to describe things, or put labels, or use string to connect concepts and places and events and people. That’s what I’m doing here. I’m getting everything on one page so I can see it all and reason about it. This may be a lot of code, actually, and I may have to scroll up and down to be able to see it all. I mean, I’ve got a pretty wide monitor, but I don’t have a really tall monitor, so there’s that. But I think it’s okay. It’s good to get it all on one page. Now I’ve got these grouped blocks of code. Some of the groupings and some of the code feel like clutter. It doesn’t feel like it’s helping me understand it, but I know it’s important. Especially if that code is being reproduced in multiple places, if it’s duplicated, these might be good things to go back into functions. But instead of putting them in the function file, or the miscellaneous file, or wherever it came from, I’ll just put it right next to the function I’m refactoring. So it’s a new function, a new abstraction, and it’s just right near the test or the function I’m refactoring because it might not be final. It’s just still there. So I’ll move some code into new functions, but in the same file, right near the function I’m working on. And like I said, it may be temporary, it may be wrong, but I’ll keep it close by. The point of these functions is not for code reviews. It’s really just to help me reason about this function. 
I just want it to help me with my understanding. At this point, I’m trying to reason about the whole flow of the code, and now some of the code that I’ll see might just seem wrong or unnecessary for where it is in the flow. This might be my bug. So at this point, I want to run the test code to see that the behavior hasn’t changed, and then try removing the code that I think shouldn’t be there, or changing the code if something seems wrong, and then I’ll retest that behavior. So at this point I am trying to fix the behavior. So I’m alternating between restructuring and trying to modify the behavior, but doing those in single steps and running the tests in between. And actually I do a lot of that. I’ll remove some code that seems unnecessary, or seems redundant, or seems like it’s in the wrong place in the flow. I run lots of tests and just reason about the code.

00:07:53 I’m trying to understand it. This process has definitely helped me understand what’s going on in the code, and usually enough to find the problem. So if I found the problem, awesome.

00:08:06 Now what? Now I have a choice. I’ve fixed the problem, but now I have this mess that I’ve created. Is it better or worse than the mess that I started with? So which code is better, the old code or the new code? Now, don’t take the new code at face value, because I was just doing this process to help me understand what was going on in the code. So now that I know where the fix is, looking at it with this knowledge, I might be able to look at the old code and now it makes sense to me, and I can see the fix, and maybe I can change it where it is. Maybe it’s just one line or a couple of lines, or maybe there are some actions in one of the functions that shouldn’t be there. I’ll try testing that and testing everything else that this might affect. This might be the right way to do it. It’s all in source control, so I can compare my code with what’s in source control and see where it is. And it’s okay to throw away all this code that I’ve been writing. The point was to learn. But I might want to keep the new code. The new code might actually be way better than the old code. It probably still needs some cleanup, because I wasn’t really intending it to be final, but it might be better than what was there before.

00:09:24 If I do go with the new code, those functions that I expanded, if they’re not being used by anybody else, might need to be removed. They might be abstractions that are no longer necessary. And the extra helper functions that I made, maybe those need to go someplace else so they can be reused by other code, but maybe they’re just fine where they are. So there it is. There’s my process for really understanding a new piece of code or a complicated bit of code. And by complicated and complex, of course, I mean complicated to me as I’m reading it. Is this generalizable? I think so, because I use it a lot. I just don’t really know what to call it. I also want to point out that the process isn’t very long. This is usually a several hour process, but I often pull out this technique when I’ve already spent like an entire day, or maybe at least a few hours, debugging a problem and I’m kind of stuck. So to be honest, if I would have used it earlier, the problem would have been done faster. The resulting code also might have larger functions than we had before, but larger functions are okay. If it’s more readable, that’s fine. That’s all. Please let me know if you find this technique useful or if you do something similar. It’d be interesting to talk about.

00:10:55 Please visit the show notes at testandcode.com/168. That will include the link to the PyCharm Pro four month free trial. Please check them out. Thank you, PyCharm, for sponsoring. That’s all for now. Now go out and test something.