Harry Percival has completed his second book, “Architecture Patterns with Python”. So of course we talk about the book, also known as “Cosmic Python”. We also discuss lots of testing topics, especially related to larger systems and systems involving third party interfaces and APIs.
This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.
00:00:01 Harry Percival has completed his second book, Architecture Patterns with Python. So of course, we talk about the book, also known as Cosmic Python, but we also discuss lots of other testing topics, especially related to larger systems and systems involving third party interfaces and APIs.
00:00:19 He was actually the very first guest that I had on the podcast back in episode nine. And it was super cool to have him back for episode 102.
00:00:28 Thank you, Oxylabs, for sponsoring this episode. Oxylabs is a top provider of innovative services including realtime crawler, web scraper, and residential and data center proxies, trusted by more than 500 companies. Find out what they can do for you at oxylabs.io/testandcode.
00:01:01 Welcome to Test and Code, a podcast about software development, software testing, and Python.
00:01:12 Welcome to Test and Code. I am really excited to have Harry Percival on again.
00:01:19 I should have looked this up, but you were like one of the first or second or third guests I had on the show. So way back in the day, I was excited to have you on because you had just had a TDD book come out.
00:01:32 What a coincidence.
00:01:34 Yeah. But since then, there’s been a lot of change in your life and work. Right. You’re doing a lot of different things. Do you want to reintroduce yourself, who Harry is today?
00:01:44 Great. Well, thank you for having me on for the second time, Brian. I’m really flattered. So we spoke a little while ago, around the time of what I’m now calling my first book. It might have been 2015 or something, so maybe it’s been five years. I don’t know. At the time I was working at PythonAnywhere, which is a platform as a service for hosting Python stuff. And that’s where I really learned about TDD. So that was the topic of my first book. They were doing extreme programming and really rigorous TDD, pair programming, everything. And the first book was really about trying to just share that with the world. I kind of wrote this book almost as, like, pair programming with the audience, saying, look, here’s TDD and how you do it if you’ve never heard of it. And I think at the time when I first started writing that book in 2013, testing was the exception. People were talking about testing as this good idea, but, I don’t know, like flossing your teeth: you really should do that, and then maybe you don’t quite get round to it. I realize I’m talking to Americans here, who all systematically floss their teeth every day. So that was a bad example.
00:02:45 Right. But good intentions is where we were in the testing world. A lot of people in 2013 were saying, oh, you know, I don’t have time for it, or, you don’t really need it. We’ve really seen the world of Python, I think, evolve to testing being the default assumption. We’re doing testing because that’s how you write software properly, and if you don’t, you’re the exception case. I think that’s been a real change in mindset. I think it’s a mark of the maturity of the Python world. And as we all sort of evolve as a community, we come along to this second thing, which is what I’m talking about in the new book: a lot of companies are working in Python. We’ve been doing it now five years, ten years, 15 years, and a lot of things that were initially just a little Python script or a quick Django web app or a little thing we’ve thrown together for a startup are really evolving into bigger and bigger applications. People work for Dropbox, they work for Instagram, they work for Google, they work on these things that were thrown together in weeks and then suddenly, three, four, five years down the line, are really serious business applications. The challenges you have are not the same as the challenges you have when you’re a startup, and managing a large code base is a different kettle of fish. That’s why I think we’re seeing interest in the typing module. Not that I’m an unreserved fan of the typing module; that is still a controversial topic with me and with the world. These questions of how do I manage a larger code base over time are the questions that I started to run into and talk about in the new book.
00:04:10 I think that’s an awesome topic. In a recent episode, someone equated working in an existing code base to walking into the middle of a chess game. There definitely are different challenges for a large code base.
00:04:25 One of the biggest things to me is you can’t keep the whole thing in your head at once and nobody remembers all the decisions that were made in the past.
00:04:34 Interesting. So what’s the name of your new book while we have it out? We brought it up already.
00:04:38 Yeah. So you know, the old book, I really wanted to call it Obey the Testing Goat. And then O’Reilly said, well, that’s a bit too wacky, Harry. You have to call it something sensible. Test driven development with Python. Fine. This new book, we really wanted to call it Cosmic Python. And O’Reilly said, okay, that’s a little too wacky. Come on, let’s call it something sensible. And we called it Architecture Patterns with Python. We narrowly avoided them forcing us to call it Enterprise Architecture Patterns with Python.
00:05:06 We convinced people that Enterprise would lose us as many readers as it would buy us.
00:05:11 Oh, totally. I would not read something that had Enterprise in the title.
00:05:14 I don’t think it’s correct either, but Architecture Patterns with Python, aka Cosmic Python.
00:05:19 Cosmic. I like Cosmic.
00:05:20 I’ll tell you why. This is a little joke of Bob’s. He talked me into it. Bob is my co-author. I’ll talk about him more. But Bob said, I want to call it Cosmic Python because cosmos is the opposite of chaos. You see, you can look it up as a quote by Carl Sagan, but in the ancient Greek, cosmos means order, and chaos is the opposite of order. So Cosmic Python is like Python code with order, or how to avoid chaos in your Python application. That’s the kind of tagline.
00:05:46 Oh, that’s cool. What’s Bob’s last name?
00:05:48 Bob Gregory. Okay, so you can find our stuff at cosmicpython.com, and that’s got a link to stuff about the book. As always, it’s going to be freely published online. You can already preview it there via the GitHub AsciiDoc source rendering. Okay, Cosmic Python. That’s my new jam. That’s the new thing I’m pushing. It’s a new brand name.
00:06:10 No, I’m excited to read it. Any idea when I can start reading it? When is it coming out?
00:06:16 Well, you can read it now if you like. Everything is online and viewable on GitHub. If you ask me nicely, I’ll send you a PDF if you want, but don’t tell O’Reilly. Right. And it comes out formally in April. We’ve got April 9 or something for the release date. So we’re hoping to have it just in time for PyCon.
00:06:33 At the same time as we launch the kind of paid for print edition, we’ll be launching the free online edition. So it’ll be Creative Commons license as always.
00:06:42 You did that with the free online sort of thing where you can read it online. You did that with the first book, too, right?
00:06:47 Was that something you came up with or is that something that O’Reilly had done already?
00:06:52 They had done it before. I was pushing for it. At the time I was an activist with a political party in the UK called the Pirate Party, big agitators for copyright reform. That’s a bit more dormant these days, but I still believe that’s the way to go. If you’re trying to publish things in the modern media, your choice is not between having your book available for free and having it only available for paid. Your choice is between having your book available for money and also available for free as pirate copies, versus having it available for free and being happy for it to be free. It doesn’t actually reduce or increase the number of people who read it for free just because you authorize or do not authorize the free copies.
00:07:32 That’s it. The other thing is we live in an open source world, right. We benefit so much from people donating their time for free, for Python, for Django, for all of these projects that we use.
00:07:43 I think it’s a similar sort of thing. This book is not my book, it’s mine and Bob’s book. But even then, these are not all our ideas, and the language that we’re using is not our language. So it’s our way of giving back and saying, look, this is our knowledge. We want to share it with everyone. If you’re happy to pay for it, we love that. I really love that. And if you just can’t afford it or you just want to sample it first, of course, have it for free. And it gets you around export bans, right? If people in countries that have some kind of argument with the US can’t read physical books that are published by US publishers, they can still go online.
00:08:13 And there are people who are just waiting. Like, this book is going to cost $40 or something. And that’s fine for you and me, and it’s fine for highly paid software engineers in the Western world. But there are kids and students out in other countries who can’t be thinking about $40. Even if you try and drop the local pricing, forget it. So, yeah, of course, everyone should have access to it for free. That’s my point.
00:08:33 Yeah, I think that’s a great idea. I’d like to see more of that, just becoming the norm. But anyway, cool.
00:08:39 I don’t want to put down anyone who doesn’t do that. If someone says, I’d rather have mine paid, I think that’s a totally valid stance. This is just the way I see it for me.
00:08:46 Yeah. Well, like, mine is totally paid. I’d like to look into the options of doing something more along the lines of your model in the future, because I have lots more books in me.
00:08:55 Some publishers are more skeptical than others about it, right? So I did talk to other people before O’Reilly, and they were like, what? Give it away for free? You’re mad. Okay, well, forget you guys.
00:09:06 But O’Reilly was forward thinking about it. And then also, working with a co-author versus working by yourself, was that quite a difference?
00:09:15 Yeah, it was. It worked out really well. The secret with this book is it’s not really my book. It’s Bob’s book. I just made him write it.
00:09:23 So I cannot pretend. There’s an engineering blog somewhere that has seven stages of knowledge of a topic, from, like, heard about it, through to apprentice, through to novice, through to journeyman, through to master. And I am basically an expert beginner in this topic. I think I know a lot more than I know, and I know that I know very little. Bob is the person who actually has the experience and has done this stuff for ten years. So I just made him write it all down in a book, and as a result asked all the right questions and learned some of it along the way.
00:09:53 Why am I interviewing you then? Should we interview Bob instead?
00:09:57 Sure. Yeah. I’d like to hang up now and get him on. He’s also much funnier than I am.
00:10:03 Thank you to Oxylabs for sponsoring this episode. Oxylabs is a top provider of innovative web data gathering services such as realtime crawler, web scraper, and residential and data center proxies.
00:10:17 Oxylabs is now introducing their next generation residential proxies, which are a significantly improved data gathering solution. They provide a stable and fast proxy pool with more than 30 million global IP addresses, and they are resource efficient, with the proxy management, user agents, and IP rotation all done on the Oxylabs side. Oxylabs has a deep understanding and knowledge of how to acquire web data, and they provide a dedicated account manager for every client. Already trusted by more than 500 companies. Visit oxylabs.io/testandcode to find out more about their services and to apply for a free trial of their next generation residential proxies. That’s oxylabs.io/testandcode.
00:11:05 I did want to talk to you about a few more things.
00:11:09 Test driven development: does that come up in your book at all? The new one?
00:11:12 It does, yeah. We eventually put it in the title. We said that the architectural patterns are about enabling TDD. They’re about enabling TDD and DDD, if you’ve come across that, domain driven design, and we’re going to talk about event driven microservices, which is kind of the second half of the book. But yeah, absolutely. One of the things that got me really excited about these architectural patterns, and we’re talking about things like ports and adapters and clean architecture and onion architecture and dependency injection, is this. These are all things that people in dynamic languages are quite skeptical of, with good reason. So I came along and I saw all this stuff that Bob was doing with, like, dependency injection in classes and blah, blah, blah, and thought: this is nuts. Eighteen different levels of indirection before you can do anything. What is this all about? Where I really saw the effect was in TDD and in the test pyramid. I moved from a world where we’re looking at a lot of tests with mocks, a lot of complicated unit tests, a lot of Django tests that end up using the database, and they’re fast enough, but maybe they’re kind of slow, and the ratio between your end to end, acceptance, and slow tests and your fast unit tests is not very good. And I actually saw a world where you have a test pyramid, where genuinely the ratio between end to end, slow tests and fast unit tests is like an order of magnitude or two. So you can really have a world where you just run your tests and in a few seconds all of your tests run, and then the acceptance and end to end tests are just a little validation that you wired things up correctly. It’s a whole different world to one in which you’re waiting for builds to run overnight, or you’re like, okay, there it goes on Jenkins, better give it 40 minutes.
00:12:52 So it’s a different world. Is it better?
00:12:55 Well, yeah. It’s all about feedback cycles, isn’t it? If you can get the feedback on whether your code is correct faster, then you just develop faster and it’s more of a pleasure. And that whole thing of having to wait a long time for a slow, flaky build to tell you that something is broken. Well, maybe. I mean, it might just be flaky. If you can minimize that and instead maximize the amount of time you run just actual unit tests that say: yes, okay, all your edge cases are covered, all your business logic is great, everything is the way it should be. And, like, you occasionally make a mistake and misconfigure some config variable, so, yeah, okay, the API is broken. But that’s the one thing that you do, like, once a month, and you find out from a slow test. Then great. I mean, people are instinctively skeptical about the test pyramid until they see it in real life. And the way you achieve it, I’m not saying that the ways that we’ve done it are the only ways of doing it, but they’re the ways that I’ve seen work, and when you get there, it’s wonderful.
00:13:46 Wow, this sounds magical. I’m still very skeptical.
00:13:52 I think that’s right. And I was skeptical, too. It’s very much like TDD when I was being taught it by Charles and the gang at PythonAnywhere. I’d drag my feet at every single thing. I’d be like, what, we’re writing end to end tests as well as unit tests? Isn’t that duplication? What, you’re going to rerun the tests in between writing every single little line of code? What, you’re going to write a test for a one line function? This is mad. And they were like, yeah, Harry, come on. You’ll see. And then over time, with experience, you see how it works. And it was similar here. What’s all this dependency injection? What is this, like, commands and mappers and CQRS? Can’t we just use Django?
00:14:30 And over time, all these things that seem mad at first, you sort of learn the justifications for them, and then they make sense. Either that, Brian, or it’s just Stockholm syndrome. I can’t tell. You’ll have to judge it yourself.
00:14:42 Okay. There’s also it works in my realm, but it might not work in somebody else’s domain.
00:14:47 There’s definitely a thing here, and we’re at pains to talk about this in the book, which is that every single one of these things comes with a trade off. There’s a Rich Hickey quote. You’ve heard the saying, Brian, that economists know the price of everything and the value of nothing? Have you ever heard that saying? I think so. There’s a sort of counterpart in programming: programmers know the benefits of everything and the trade offs of nothing.
00:15:10 Like, we’re all very quick to go, hey, this is better. And we’re just a little too slow to go, okay, it is better in the following aspects, but it also costs you the following things. When we do things like throw in dependency injection or decoupling IO from your model, blah, blah, blah, every single time you’re adding layers of indirection, it does cost you something. And we’re at pains to say, okay, look, you are paying a cost here, you’re paying a cost here, you’re paying a cost here. So when is it worth it? I think it’s worth it when you have a complex application. Like we said at the start of this conversation about the last five years: a little startup that just needs to get basically a thin wrapper around a database in front of as many people as possible on the Internet? Yes, great. Django is your friend, you’re going to save a lot of time, go for it. But when you have an actual complex domain, like a business that has a workflow and rules and edge cases and complicated concepts that have to interplay with each other, and different teams of developers and different teams within the business that speak different languages, then and only then does it start becoming worth it to go, okay, well, maybe some of the shortcuts and time savers that we get from frameworks, Django, and Python magic are more like brakes. And if we add these levels of indirection, if we apply these slightly counterintuitive patterns, then by putting a little bit more work in here, we make this thing over here more manageable and easier to deal with over time.
00:16:35 Well, I will reserve judgment and read more about it first. The fear I have is twofold. One is that complex systems are sometimes complex because of the software we put in place to avoid complexity.
00:16:50 Yeah. And we’re going to try and make it the opposite of that.
00:16:52 So that you’re actually, like, surfacing the logic and putting it all in one place where you can see it and you can unit test it. And, like, that is the logic here, and it’s got nothing to do with your database and it’s got nothing to do with your web framework and it’s got nothing to do with some API for Django REST Framework or forms or some clever Flask thing or some plugin to do with OAuth. No, the business logic is here. It has no dependencies. It’s just Python.
00:17:19 Yeah. That’s a great place to focus on your business logic tests and stuff, if your code doesn’t have a bunch of nonbusiness logic in it. So that’s great. Okay, I’ll take off my skeptic hat. The other part that’s amusing to me, or I guess just an observation, is, I think this is Fred Brooks, I can’t remember, and I’m probably massacring this, though. Somebody said that software architecture often mimics the hierarchy of the company.
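To make the "it’s just Python" point concrete, here is a minimal sketch of what a dependency free domain model and its unit test might look like. The names (OrderLine, Batch, allocate) are hypothetical, chosen only for illustration, and are not taken from the conversation; the point is only that nothing below imports a database driver or a web framework.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderLine:
    """A hypothetical value object: one line of a customer order."""
    order_id: str
    sku: str
    quantity: int


@dataclass
class Batch:
    """A hypothetical batch of stock that order lines can be allocated to."""
    reference: str
    sku: str
    purchased_quantity: int
    allocations: set = field(default_factory=set)

    @property
    def available_quantity(self) -> int:
        # Stock left after subtracting everything already allocated.
        return self.purchased_quantity - sum(
            line.quantity for line in self.allocations
        )

    def can_allocate(self, line: OrderLine) -> bool:
        # The business rule: same product, and enough stock left.
        return self.sku == line.sku and self.available_quantity >= line.quantity

    def allocate(self, line: OrderLine) -> None:
        if not self.can_allocate(line):
            raise ValueError(f"cannot allocate {line} to batch {self.reference}")
        self.allocations.add(line)


def test_allocating_reduces_available_quantity():
    # Plain objects in, plain assertions out: no mocks, no database.
    batch = Batch("batch-001", "SMALL-TABLE", purchased_quantity=20)
    batch.allocate(OrderLine("order-1", "SMALL-TABLE", 2))
    assert batch.available_quantity == 18
```

A test like this runs in microseconds, which is what would make a wide base of fast unit tests practical.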
00:17:46 Conway’s Law, I think.
00:17:47 Is that Conway?
00:17:49 Yeah, exactly. The software systems end up reflecting the organizations that they serve. Absolutely.
00:17:53 Yeah. That’s for good or bad. Partly that comes about because individual teams need to have control over the software that they’re writing. But that sometimes isn’t the right way to slice an onion. You’ve got to be careful with that. But actually, more people talking about it and more opinions and more books out there is good, because I don’t think we’ve really talked about this a lot, large scale systems in Python, in very many books that I’m aware of. I am an anti test pyramid person, but I know that a lot of people love it out there, and it may just be my domain. I don’t know how to write a unit test and tie the test criteria to the requirements of an application. Yeah, I know how to unit test a little thing, but a unit test often is making sure that the code runs as the developer expects the code to run. That is different than making sure the code runs as it needs to in order to fulfill the requirements of the system.
00:18:45 Yeah, right. Absolutely.
00:18:47 And I think I had a similar thing. Like, I was complaining earlier about builds that take forever. I used to work at PythonAnywhere, and PythonAnywhere is a platform as a service.
00:18:59 So that whole domain, that whole business, is about building a web application that transforms all the keystrokes and stuff users do in a browser into input and output to processes running on a distributed cluster of services with containerisation and shared file systems and complicated things like that. So it’s all about trying to turn one set of boundaries and edges and UI input things into another set of permanent storage and process boundaries.
00:19:32 It’s very hard at PythonAnywhere to see how you would make a pure, ethereal domain out of that, because the whole thing was about piping. There, the right kinds of tests are going to be integration tests. So this book is really aimed at people who have a domain. Is there a thing here that you need to model conceptually?
00:19:52 We don’t need to really conceptually model a Unix process.
00:19:56 Right. We just say: here it is, wire up that Unix process to keystrokes from the browser. But ask yourself: is there a conceptual model here? Is there a thing where you have different kinds of objects and relationships between them, and permissions and rules and interactions and groupings and invariants and kinds of constraints that you want to apply? If you have that, then this is where this stuff pays off.
00:20:20 You brought up mocks a little bit. So does thinking about the architecture more allow you to write cleaner tests without as many mocks, or to use the mocks differently? Or how do mocks relate to this?
00:20:33 So in the book, we pretty much don’t use mocks at all. I have a talk, and maybe somewhere a blog post, about testing REST APIs. It’s all about, okay, we often reach for mocks as a default tool because they’re nice and quick and convenient. They sometimes come with a cost, which is that you have these tests where every single test in your test file has four or five mocks, even the ones that don’t care. And when you look at your test setup, you do this mock, that mock, set the return value, and then you have your code under test, and then you check this mock was called with that, and that mock was called with this, and this one had this method called. It ends up being quite hard to reason about what those tests are doing. That’s the pitfall. And so the thing that we propose in the book is that once you start working on this architecture where you say, I want to separate out my business logic, the stuff that’s purely conceptual, and have that as just pure Python objects, then you can test that completely without mocks, because it’s just Python classes talking to one another. You do a bunch of setup with some Python objects and classes and you make some assertions about some Python objects and classes. So that can be a completely dependency free, mock free world. And then the other side of things you basically want to test with integration and end to end tests. And then halfway between those two things, we discuss in the book the idea of, okay, if you want to start building ways of plugging your ethereal domain into the real world, and you want to keep that domain free of dependencies, independent from your database, you have to do this kind of inversion of the dependency.
You have to say, okay, instead of having my business model objects inherit from a Django models class, so that they map one to one with a database table, or from a SQLAlchemy table object, because if you do that, then your model depends on the database. You want to turn that around and say instead: I’m going to ask my database to look at my model and design tables to match the model. And when I ask my database for rows, I’m going to get those rows and transform them into model objects. Invert that dependency and make the database depend on the model rather than the model depending on the database. Once you do that, you get into this story where inverting the dependency sometimes leads to dependency injection, and it often leads to you thinking about your infrastructure as, okay, what is the idea of a database? What’s an abstraction that I can build around that to represent the idea of fetching stuff from my database? I might have a class that can get things, list things, and add things to the database.
00:23:05 That’s going to be my interface between the domain, the model, and the database. It’s going to be this layer that says get, list, add. When you do that, it’s really easy. I can make a mock that does get, list, and add, and it’s a very simple mock compared to mocking a Django session or a SQLAlchemy session. But I can also build a fake version of a database adapter in six or seven lines of code. And so the thing that we’re pushing there is: instead of using mocks, you force yourself to identify what your external dependencies are and you build a little wrapper for each of them. Okay, for my database, I need a thing that can get, add, and list. For my, I don’t know, SMS notifications, I need a thing that can send a notification as a string. And for my file system, I need a thing that can list files and read a file of a given name. Then I have that abstraction of a file system, and I can build an implementation of it that uses a real file system, an implementation that uses S3 as a file system, an implementation that uses Dropbox, and an implementation that’s just in memory for my tests. If I decide on an abstraction for something, then it’s much easier to make a really simple fake one for your tests and a real one that touches the real world, without using mocks. Deliberately not using mocks kind of forces you to keep those things simple. I’ve said a lot of things in a few minutes there.
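The get, list, add layer described above might be sketched roughly as follows. This is an illustration under stated assumptions, not the book’s code: Product, ProductRepository, and both implementations are hypothetical names, and the standard library sqlite3 module stands in for whatever database or ORM a real project would use. Note the direction of the dependency: the storage classes import and build the domain object, while the domain object knows nothing about storage.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Product:
    """Hypothetical domain object: plain Python, no database base class."""
    sku: str
    name: str


class ProductRepository(Protocol):
    """The narrow storage interface the rest of the code depends on."""
    def add(self, product: Product) -> None: ...
    def get(self, sku: str) -> Product: ...
    def list(self) -> List[Product]: ...


class SqliteProductRepository:
    """Real adapter: turns rows into domain objects and back."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS products (sku TEXT PRIMARY KEY, name TEXT)"
        )

    def add(self, product: Product) -> None:
        self._conn.execute(
            "INSERT INTO products (sku, name) VALUES (?, ?)",
            (product.sku, product.name),
        )

    def get(self, sku: str) -> Product:
        row = self._conn.execute(
            "SELECT sku, name FROM products WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None:
            raise KeyError(sku)
        return Product(*row)

    def list(self) -> List[Product]:
        rows = self._conn.execute(
            "SELECT sku, name FROM products ORDER BY sku"
        ).fetchall()
        return [Product(*row) for row in rows]


class FakeProductRepository:
    """In-memory fake for tests: a dict and a few one-line methods."""

    def __init__(self) -> None:
        self._products = {}

    def add(self, product: Product) -> None:
        self._products[product.sku] = product

    def get(self, sku: str) -> Product:
        return self._products[sku]

    def list(self) -> List[Product]:
        return sorted(self._products.values(), key=lambda p: p.sku)


def product_names(repo: ProductRepository) -> List[str]:
    # Code that uses storage only sees the three-method interface.
    return [product.name for product in repo.list()]
```

Because product_names only relies on the three methods, it can be exercised in a unit test with FakeProductRepository and in an integration test with SqliteProductRepository(sqlite3.connect(":memory:")), with no mocking library involved.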
00:24:26 No, I like it.
00:24:28 It’s just a generally good idea for external dependencies anyway to put an adapter layer or something in the middle, an interface layer that minimizes the width of the interface.
00:24:42 If you only need access to three methods in an interface, then hide the rest of them so that you can’t even call them.
00:24:49 Precisely. And so one of the things I suggest people try: if you have an interface and it has, like, 10 million options, but you only need three of them, and you just reach for mocks, then you can mock out any of the 10 million things in that interface for your actual code. And then when you use one more, you just add one more mock, and one more, and one more mock. And then when you decide that you’ve used the wrong one, and actually a slightly different part of the API that you depend on would be a neater way of getting the thing that you want, you need to go back and change 350 mocks that mocked out one specific part of your dependent API and change them to a different one. So that’s the danger when you go directly to the interface. As you say, let’s build an adapter that minimizes that surface. That’s great. And then the adapter gives you that decoupling. If you say, I’m not allowed to use mocks, well then having that adapter be something you can swap between a real adapter and a fake adapter, and having to actually build your own fake adapter that’s a fake version of the real adapter, forces you to keep it simple, because it’s really annoying to maintain a fake adapter that has loads and loads of methods on it. So it exerts pressure on your abstractions of your dependencies, trying to keep them simple.
00:25:56 Yes. It also allows you the lean software goal of being able to decide late. If you have a small interface that you have to go through to get access to your database, and you decide three quarters of the way through your project that MySQL is just not right and you want to switch to Postgres or something else, you have one place in the software you have to change, that adapter, and test the heck out of that, and that’s it. The rest of your system should be fine.
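As a rough sketch of an adapter that narrows a wide interface, take the file system example mentioned earlier: the operating system offers a huge API, but the hypothetical application below only needs to list files and read one by name. All class and function names here are assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List, Protocol


class FileStore(Protocol):
    """The only two operations the application is allowed to use."""
    def list_files(self) -> List[str]: ...
    def read_file(self, name: str) -> str: ...


class LocalFileStore:
    """Real adapter over the local disk; hides everything else pathlib offers."""

    def __init__(self, root: Path) -> None:
        self._root = Path(root)

    def list_files(self) -> List[str]:
        return sorted(p.name for p in self._root.iterdir() if p.is_file())

    def read_file(self, name: str) -> str:
        return (self._root / name).read_text(encoding="utf-8")


class InMemoryFileStore:
    """Hand-written fake: small enough that keeping it in sync is painless."""

    def __init__(self, files: Dict[str, str]) -> None:
        self._files = dict(files)

    def list_files(self) -> List[str]:
        return sorted(self._files)

    def read_file(self, name: str) -> str:
        return self._files[name]


def total_line_count(store: FileStore) -> int:
    # Application logic sees only the two-method interface.
    return sum(
        len(store.read_file(name).splitlines()) for name in store.list_files()
    )
```

If the application later needs a third operation, it has to be added to both the real adapter and the fake, which is the pressure toward a small interface described above. An S3 or Dropbox backed store would be another class with the same two methods.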
00:26:26 I mean, that’s the theory at least.
00:26:27 Yeah, ORMs already buy you a little bit of that. Django in theory should make it relatively easy to switch from MySQL to Postgres.
00:26:35 Oh, that’s true.
00:26:36 The bits that are hard in practice are the same bits that are hard in practice if you build your own interfaces.
00:26:41 Maybe the better analogy would be switching, and I’m not a Django expert, so I’m not sure if this is even possible, to Mongo or some other document database. Probably a little bit harder.
00:26:56 In the book we build an app and it has a database. And then, just for fun, in an appendix we say, you know what, let’s say we’ve got this far in the project and the business is like, you know, we’ve just decided that we’re going to give you some spreadsheets and we’d like you to output spreadsheets instead. And so you’re like, ah, I built this whole thing around the assumption that I’m reading and writing to a database. But because we’ve built this nice abstraction around permanent storage, switching a SQLAlchemy repository to being a CSV repository is just a matter of rewriting one file, and all the domain logic is completely separate.
00:27:27 Oh, that’s cool.
00:27:28 Swap the database for raw CSV files, and it’s pleasingly straightforward.
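A CSV backed repository along these lines could look something like the sketch below. It is not the appendix from the book; the Product class, the two column layout, and the get, list, add method names are assumptions for illustration. Only the standard library csv and pathlib modules are used.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class Product:
    """Hypothetical domain object, unchanged by the switch in storage."""
    sku: str
    name: str


class CsvProductRepository:
    """Same get/list/add interface as a database repository, backed by a CSV file."""

    FIELDS = ["sku", "name"]

    def __init__(self, path: Path) -> None:
        self._path = Path(path)

    def _load(self) -> List[Product]:
        # A missing file is treated as an empty store.
        if not self._path.exists():
            return []
        with self._path.open(newline="", encoding="utf-8") as f:
            return [Product(row["sku"], row["name"]) for row in csv.DictReader(f)]

    def add(self, product: Product) -> None:
        # Simple but not efficient: rewrite the whole file on every add.
        products = self._load() + [product]
        with self._path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=self.FIELDS)
            writer.writeheader()
            for p in products:
                writer.writerow({"sku": p.sku, "name": p.name})

    def get(self, sku: str) -> Product:
        for product in self._load():
            if product.sku == sku:
                return product
        raise KeyError(sku)

    def list(self) -> List[Product]:
        return self._load()
```

Code written against get, list, and add would not need to change when this class replaces a database backed one, which is the "rewriting one file" claim in miniature.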
00:27:33 So now both you and I are talking as if we speak the same language, but it is different than the language that a lot of people speak with mocks. What I’m specifically saying is I’ve got the assumption, and it sounds like you do too, that the only thing really worth mocking or faking is external dependencies, something outside of your code. Now, that isn’t the entire story in the world of testing. I don’t know if it’s prevalent in Python, but it is prevalent in the Ruby community and some in the C world to just kind of mock everything, even if it’s my code but not the code I’m working on today: I’ll mock the dependencies of a function. That is not something you do normally, right?
00:28:17 Yeah. No, it is not. I may get my concepts backwards here, but I think this is the London school versus the classic or Detroit school of TDD. So the London school says define what all your different classes and collaborators are going to be, and then test each one in isolation: you mock out the collaborators and you test each unit with mocks and test them all separately. And that’s the London school. The wrong ones.
00:28:41 The wrong ones.
00:28:42 It’s not at all the wrong ones. Right. Because obviously there’s loads of really smart people who like this. Right. But there’s the other side of it, which is the classic school, which says no. Okay. Just try and set up your state before, run your assertions, and look at the state afterwards, which is the way that’s more instinctive. And what I would say is we’ve gone down this path, or I’ve gone down this path in my career, of the classic school. And I figured out ways of making that work. I have not seen the London School work. That’s not because it doesn’t. It’s just because I haven’t seen it.
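To make the contrast a bit more concrete, here is a small sketch of the same behavior tested both ways. `Inventory` and `place_order` are hypothetical names invented for this example; the mock-based test uses `unittest.mock.Mock` from the standard library.

```python
from unittest.mock import Mock


class Inventory:
    """Hypothetical collaborator that tracks stock levels per SKU."""
    def __init__(self) -> None:
        self.stock: dict[str, int] = {}

    def remove(self, sku: str, qty: int) -> None:
        self.stock[sku] = self.stock.get(sku, 0) - qty


def place_order(inventory, sku: str, qty: int) -> None:
    """Code under test: placing an order takes stock out of the inventory."""
    inventory.remove(sku, qty)


def test_london_style() -> None:
    # Mock the collaborator and assert on the interaction with it.
    inventory = Mock()
    place_order(inventory, "LAMP", 2)
    inventory.remove.assert_called_once_with("LAMP", 2)


def test_classic_style() -> None:
    # Set up state, run the code, then look at the state afterwards.
    inventory = Inventory()
    inventory.stock["LAMP"] = 10
    place_order(inventory, "LAMP", 2)
    assert inventory.stock["LAMP"] == 8
```

The first test is coupled to how `place_order` talks to its collaborator; the second only cares about the resulting state.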
00:29:15 Yeah. It also fits better. So I’ve talked with people about mocking or faking external dependencies. I always try to bring up that the third option is to just include that in your architecture. You can design an architecture that has an option that stubs out some external system, like, for instance, an email notification system. You can design into it that instead of actually emailing somebody, any email that goes out just gets logged to a file or thrown into a directory or stuck in memory somewhere, so you can retrieve it later. Yes, that is mostly for the purposes of testing, but it’s really handy to be able to turn those options on, even if you’re doing user interface testing or something like that, to be able to just speed these things up, to sort of turn off external systems but still have the behavior of the system. Right. That can just be part of your architecture.
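One way to picture an email system with a built-in "don't really send" option is sketched below. All names here (`Notifier`, `SmtpNotifier`, `RecordingNotifier`, `make_notifier`, `welcome_user`, and the settings keys) are assumptions made up for the example; only `smtplib` and `email.message` are real standard-library modules.

```python
import smtplib
from email.message import EmailMessage
from typing import Protocol


class Notifier(Protocol):
    def send(self, to: str, subject: str, body: str) -> None: ...


class SmtpNotifier:
    """Real implementation: actually sends mail through an SMTP server."""
    def __init__(self, host: str, port: int, sender: str) -> None:
        self._host, self._port, self._sender = host, port, sender

    def send(self, to: str, subject: str, body: str) -> None:
        msg = EmailMessage()
        msg["From"] = self._sender
        msg["To"] = to
        msg["Subject"] = subject
        msg.set_content(body)
        with smtplib.SMTP(self._host, self._port) as smtp:
            smtp.send_message(msg)


class RecordingNotifier:
    """Stub built into the architecture: keeps outgoing mail in memory."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str, str]] = []

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def make_notifier(settings: dict) -> Notifier:
    """Pick the backend from configuration (hypothetical setting names)."""
    if settings.get("EMAIL_BACKEND") == "record":
        return RecordingNotifier()
    return SmtpNotifier(
        settings["SMTP_HOST"], settings["SMTP_PORT"], settings["EMAIL_SENDER"]
    )


def welcome_user(notifier: Notifier, email: str) -> None:
    """Application code: unaware of which backend it is talking to."""
    notifier.send(email, "Welcome", "Thanks for signing up.")
```

With `EMAIL_BACKEND` set to `"record"`, a system-level or UI test can run the whole sign-up flow and then inspect `notifier.sent` instead of waiting on a mail server.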
00:30:19 I think you’re not a skeptic at all, Brian. You’re already a convert.
00:30:22 You just don’t realize it. This isn’t unit testing.
00:30:27 This is even for system level testing. Makes it faster, right? I wish more people talked about that. Let’s actually bring up one thing while we’re on this topic. I got an email question recently, or I don’t know, over Twitter or something, saying specifically, if my code is depending on a REST API from somewhere, how do I deal with that? How do I make that?
00:30:49 And how do you do it, Brian? Because you’ve already given a hint of your answer to this.
00:30:52 I don’t really. This isn’t something I’ve had to do.
00:30:56 But my first thoughts would be, I like the idea of a recorded system. So taking a snapshot of some live data that would come off of that API and somehow serving it to my system without having to actually talk to it.
00:31:11 There’s a tool out there called VCR.py. I’m not necessarily recommending it, but it’s quite smart. You can run all your tests with no mocks or anything like that, and it goes out and calls the real APIs that you depend on. And then as you’re running your tests, it records all of the outgoing and incoming HTTP calls the first time you run it, and then the second time you run it, it just replays what it saw before. So whenever it sees the same request, it gives the same response back. And so it sort of freezes that thing in time and just replays the same responses forever, until you decide to flush your prerecorded responses and get new ones recorded. Hence the name, VCR.py. I played with that a bit, and I found that, for people in my team, it can get very confusing, because you’re never quite sure what’s talking to what, and the algorithms for saying what is the same request as last time are not always straightforward.
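The record-and-replay idea can be sketched in a few lines of plain Python. This is a hand-rolled illustration of the concept only; it is not VCR.py's implementation or API. `Recorder`, `call_real`, and `ignore_fields` are hypothetical names, and requests are modeled as simple dictionaries.

```python
import json
from typing import Callable


class Recorder:
    """Calls the real API the first time, then replays the stored response."""

    def __init__(self, call_real: Callable[[dict], dict], ignore_fields=()) -> None:
        self._call_real = call_real
        self._ignore = set(ignore_fields)
        self.cassette: dict[str, dict] = {}

    def _key(self, request: dict) -> str:
        # Deciding what counts as "the same request" is the tricky part:
        # here, two requests match if they agree on every non-ignored field.
        relevant = {k: v for k, v in request.items() if k not in self._ignore}
        return json.dumps(relevant, sort_keys=True)

    def send(self, request: dict) -> dict:
        key = self._key(request)
        if key not in self.cassette:
            self.cassette[key] = self._call_real(request)  # record
        return self.cassette[key]  # replay
```

Clearing `cassette` corresponds to flushing the prerecorded responses so that fresh ones are recorded on the next run.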
00:32:03 So VCR.py is great at saying, okay, here’s a request I’ve seen before, so send back the same response as before. Well, what if every request has a uniquely generated randomized ID in it? Okay, now I have to make a special thing that matches requests that says, okay, well, ignore this field because that’s a randomized ID. Although actually the test is going to send out three of them. So I need to somehow notice this one is the first, this one is the second, this one is the third, and pretty soon you’re in some complexity. So I would say to people, check it out, but be aware that things are not always that simple. And then I can answer your question about how you would test it. I think when you were saying, well, why not design into your system the idea that you can have email notifications, and the real one sends email, and then for testing I can have a version that just saves those emails to memory somewhere? That’s where I would kind of go. And the other part of the answer is saying, well, if you have a dependency, what I would like to do is build a little adapter, some sort of wrapper around that dependency that has a very simple API and says, in terms of my client code, what do I need from this external API? And I’m just going to define that interface, and then I’ll have the real one that talks to the real API, and for my unit tests I can have a fake one that just works in memory. So maybe a more concrete example would be, I don’t know, some sort of payments. You have a payment provider and you want to do some sort of thing that says, okay, well, when someone signs up, go check with the payment provider whether I actually already have an outstanding payment from this person. And I also then maybe need to initiate a new one. And the credit card provider has like 250 different methods in their API that you could use, but you actually only want to do two things.
They’re all named in weird, complicated language that that credit card provider likes, like it’s PayPal or Stripe or something. And they have their terminology for transactions versus payments versus cards versus payment methods versus blah, blah, blah. And you don’t care about any of that in your app. You just want to say check if they exist already and set up a new account. Right? So you’re going to build a little interface that just has two methods: check if something exists already, which takes a string representing a user ID or a username or something, and set up a new account, which takes like a dollar amount, whatever. So that’s what I would do: I’d build a little interface that just wraps that API. I’m going to build a fake version of that interface that maybe I can set up to return true or false for “does it exist already?” And I can set it up to return okay or not okay for “did I set up a new payment correctly?” And I can build a really nice little fake without using mocks, just a little in-memory class that pretends to be that API. These fakes are usually a wrapper around a list or a wrapper around a dict. So, you know, you put things into it, you fetch things, you take things out of it. And then for the real one, I’m going to have an end-to-end test at some point that checks that you really can talk to that third party API. And so hopefully they have a nice sandbox. I’m going to have maybe a couple of integration tests that just test my real adapter and say, okay, well, if I check whether there is an existing payment, it should say no. If I then set up a payment, it should say okay. If I check again whether there is an existing payment, it should now say yes. And this is talking to the real API with a real set of data. So that’s where I’m going to have integration tests for my little adapter.
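A minimal sketch of that two-method interface and its in-memory fake might look like the following. The names (`PaymentProvider`, `payment_exists`, `set_up_payment`, `FakePaymentProvider`, `sign_up`, `check_adapter_contract`) and the return values are assumptions for illustration; a real adapter for PayPal or Stripe would implement the same two methods on top of the provider's own client library.

```python
from decimal import Decimal
from typing import Protocol


class PaymentProvider(Protocol):
    """Everything the application needs from the third party: two methods."""
    def payment_exists(self, user_id: str) -> bool: ...
    def set_up_payment(self, user_id: str, amount: Decimal) -> bool: ...


class FakePaymentProvider:
    """In-memory fake for unit tests: a thin wrapper around a dict."""
    def __init__(self, accept: bool = True) -> None:
        self.accept = accept  # configure whether new payments succeed
        self.payments: dict[str, Decimal] = {}

    def payment_exists(self, user_id: str) -> bool:
        return user_id in self.payments

    def set_up_payment(self, user_id: str, amount: Decimal) -> bool:
        if not self.accept:
            return False
        self.payments[user_id] = amount
        return True


def sign_up(user_id: str, amount: Decimal, payments: PaymentProvider) -> str:
    """Application logic written purely against the small interface."""
    if payments.payment_exists(user_id):
        return "already-paying"
    if payments.set_up_payment(user_id, amount):
        return "ok"
    return "payment-failed"


def check_adapter_contract(provider: PaymentProvider, user_id: str) -> None:
    """The integration scenario from the discussion: no, then okay, then yes."""
    assert not provider.payment_exists(user_id)
    assert provider.set_up_payment(user_id, Decimal("9.99"))
    assert provider.payment_exists(user_id)
```

The same `check_adapter_contract` scenario can be pointed at the fake in unit tests and at the real adapter, against a sandbox, in a handful of integration tests.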
00:35:20 So one end to end test, a handful of integration tests, as many as you need, and then all your unit tests can just use your fake and you can have as many of them.
00:35:28 I think that’s great. For instance, there are some integration tests that are actually talking to either the live system or a test server. A lot of these service providers do provide a test service, but they’re not intending all of your developers to run it a thousand times a day. It’s intended to maybe run at night, or maybe just run when you change your code or when you get notified that there are changes to the interface.
00:36:00 That’s a common challenge.
00:36:02 There’s a real contrast in quality between third party providers. First of all, you’re lucky if they do give you a test sandbox where you can mess about. And then you’re really lucky if they give you a good one where it’s really easy to set up test data and, especially, to clean up test data. So I’ve seen things where you say, hey, can we use a sandbox? And they’re like, sure. And you say, okay, so we’re going to run our tests against it. Yeah, great, that’s great. And you go, okay, so I have five developers and they’re going to be running the tests like 10, 20 times a day. And each test run generates, I don’t know, two or three hundred entries. So by the end of the month, we’ll probably have about 20,000 or 25,000 entries in your database. So how do we clean it up? And they go, okay, so maybe it’s not sustainable. And in those cases, in the same way that you built a fake version of your interface for your unit tests, you might actually consider building a fake version of the third party dependency for your integration tests, for the tests that run every day. Once you’ve set up your credit card payments thing, right, do you really need to run against the real third party credit card provider every single time you run your suite of integration tests?
00:37:08 Yeah, probably not.
00:37:09 How often do they really change their API? And especially if that is a small aspect of your code that is not really central to your life. Your code is not about credit card payments. They’re important, so you want to check them every so often, but you do not need to check them 35 times a day. Then build a fake. You can have a little bit of fun writing a one-page, one-file Flask application that just pretends to be your credit card payment provider, and it just emulates the endpoints that they would provide. And you put it in a little Docker container and it runs when you run your tests in your integration test environment. It’s one of the little Docker containers, and it’s called, say, credit card provider. And your app is configured to talk to that, and it sends the HTTP requests to a port on localhost instead of a port on the public Internet.
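A sketch of such a stand-in service is shown below. The discussion mentions a one-file Flask app; to keep this example free of third-party dependencies it uses the standard library's `http.server` instead, which is a substitution made for this sketch. The endpoints (`GET /payments/<user_id>`, `POST /payments`), the JSON shapes, and the name `start_fake_provider` are all invented for illustration and do not correspond to any real provider's API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-memory state for the fake provider: user_id -> amount (as a string).
PAYMENTS: dict[str, str] = {}


class FakeProviderHandler(BaseHTTPRequestHandler):
    """Emulates two hypothetical endpoints of a payment provider."""

    def _reply(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        # GET /payments/<user_id>: does a payment already exist?
        user_id = self.path.rsplit("/", 1)[-1]
        self._reply(200, {"exists": user_id in PAYMENTS})

    def do_POST(self) -> None:
        # POST /payments with JSON {"user_id": ..., "amount": ...}
        length = int(self.headers.get("Content-Length", 0))
        data = json.loads(self.rfile.read(length))
        PAYMENTS[data["user_id"]] = data["amount"]
        self._reply(201, {"status": "ok"})

    def log_message(self, format, *args) -> None:
        pass  # keep test output quiet


def start_fake_provider(port: int = 0) -> ThreadingHTTPServer:
    """Start the fake on localhost in a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), FakeProviderHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a containerized setup, the same file could be the entry point of the small container that the application is configured to talk to instead of the real provider.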
00:37:54 Love it. So then you set up. Okay, fine. So that’s great. And then what we’ll do is we’ll just be a bit smart and, like, once a month we’ll run the tests against the actual thing. Or if we’ve made a change to code that’s near payments, like in the accounts module, then for any pull request that has a change to that accounts module, we’re going to run against the real API, or whatever it will be. And you give yourself the option of running the real thing or the fake thing.
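The "real thing or fake thing" switch could be as simple as the sketch below. Everything in it is an assumption for illustration: the environment variable names, the local fake URL, the `accounts/` path prefix, and the choice of the first day of the month for the monthly run.

```python
from datetime import date
from typing import Iterable, Mapping

# Hypothetical address of the local fake provider used in everyday test runs.
FAKE_PROVIDER_URL = "http://localhost:8081"


def provider_url(env: Mapping[str, str]) -> str:
    """Return the provider address the app should talk to for this test run."""
    if env.get("USE_REAL_PAYMENT_PROVIDER") == "1":
        return env["PAYMENT_PROVIDER_URL"]
    return FAKE_PROVIDER_URL


def should_use_real_provider(changed_files: Iterable[str], today: date) -> bool:
    """Run against the real API once a month, or when payment-adjacent code changed."""
    touches_accounts = any(path.startswith("accounts/") for path in changed_files)
    monthly_run = today.day == 1
    return touches_accounts or monthly_run
```

A CI job could call `should_use_real_provider` with the pull request's changed files and set `USE_REAL_PAYMENT_PROVIDER` accordingly.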
00:38:19 Yeah. Actually doing risk benefit analysis on your tests as well as the rest of your life. That’s good.
00:38:26 Yeah. Well, I was in a lot of pain just debugging PayPal specifically. Yes, I’m calling them out. There you go. I did it. Fine. Sue me. But, yeah, just debugging PayPal.
00:38:37 The test sandbox, and this was years ago, maybe it’s better now, was lacking, and it wouldn’t clean up, and the tests were slow.
00:38:45 The man hours that we would spend, person hours that we would spend, debugging PayPal test failures. And they had never changed anything and we had never made a mistake. It was always just the tests being flaky. And at some point, this is not... well, yeah.
00:38:58 And also just minimizing that sort of stuff. If it is a complicated interface that you’re interacting with, isolating that code from the rest of your code is good anyway. Anything that talks to PayPal or anything that talks to Stripe or anything like that, just isolating that in a little tiny piece of your software that, unless something changes, you don’t have to change.
00:39:20 And when I make a thing called my payment provider interface, which has its two methods on it, and I set it up to talk to PayPal, and then after like six weeks I’m really angry with PayPal, then I change my payment provider interface so that instead of talking to PayPal, it talks to Stripe, and none of the rest of my code needs to know the difference.
00:39:38 Yeah, that’s good.
00:39:41 I got a funny story about that, but I don’t think I’ll share it right now.
00:39:45 We’re going to get sued today.
00:39:48 Well, this has been a blast. I can talk to you about testing and stuff for hours, but we probably should wrap it up. If people want to know more about you, are you still at the same place, or where should they go to find out more?
00:40:00 Yeah, come and check out cosmicpython.com. There are pictures of random nebulae in there, as you might expect. So cosmicpython.com links to the book. There’s a link to some old, excellent blog posts by Bob on the ports and adapters architecture. I see that. And yeah, I hope to give a talk at PyCon this year about mocks, provisionally entitled “Stop Using Mocks”.
00:40:23 Nice. Just to say, thank you so much for having me on, Brian. It’s an absolute pleasure talking to you. It’s nice to see you on camera, and I hope to see you again in real life. Thanks to everybody that’s out there listening. Sorry if I spoke too fast or if it was annoyingly English accented, and give us a shout out on the internet if you’ve enjoyed it. Yeah, especially if you haven’t enjoyed it. If you think this is all nonsense and, like I said, STFU, then yeah, tell us. I’m hjwp on Twitter, so tell me all about it.
00:40:48 Yeah. And if you have any aggressively different opinions or aggressively the same opinions, come on the show and let’s talk about it.
00:40:57 Let’s have the DHH “test-induced design damage” debate. Maybe that’s what this is all about. Let’s have that conversation.
00:41:03 All right. Well, thanks a lot.
00:41:04 Bye bye.
00:41:07 Thank you, Harry. That was a really fun episode. I look forward to reading your new book and seeing you at PyCon. Thank you to Patreon supporters for continuing to support the show. Join them by going to testandcode.com/support.
00:41:21 Thank you to Oxylabs for sponsoring this episode. Find out what they can do for you by going to oxylabs.io/testandcode. That link is also in the show notes at testandcode.com/102. That’s all for now. Now go out and test something.