You need tests for your web app. And it has a database. What do you do with the database during testing? Should you use the real thing? or mock it? Jeff Triplett says don’t mock it.
In this episode, we talk with Jeff about testing web applications, specifically Django apps, and of course talk about the downsides of database mocking.
This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.
00:00:00 You need tests for your web app and it has a database. What do you do with the database during testing? Should you use the real thing or mock it? Jeff Triplet says. Don’t mock it. Jeff is a partner at RFS, a director of the Psf and President and cofounder of the Django Events Foundation of North America America. In this episode, we talked to Jeff about testing Web applications, specifically Django apps, and of course, talk about the downsides of database Mark. This episode is brought to you by Datadog and Config Cat.
00:00:26 Config Cat is a feature flag service. Easily use flags in your code with Config Cat libraries toggle your feature flags visually on a dashboard, hide or expose features in your application without redeploying code. Set targeting rules to allow you to control who has access to the new features. It allows you to get features out faster, test in production, and do easy rollbacks with Config Cat, simple API and clear documentation. You’ll have your initial proof of concept up and running in minutes. Whether you’re an individual or a team. You can try it out with their forever free plan or get 25% off any paid plan with the Code Test and Code 2021 release features faster with less risk with Config Cat. Check them out today at configcat.com.
00:01:26 Welcome to Test and Code.
00:01:36 Hello, and welcome to Test and Code. I am thrilled today to have Jeff Triplet on. I’ve been following Jeff online for a long time, but we’ve never actually spoken, so this will be fun and we’re going to talk about Django and other things. So welcome, Jeff.
00:01:54 Hi, Brian. Good to be here. I’m a developer out of Lawrence, Kansas. I work for Revolution Systems, which is a Django consulting Python consulting agency. I’m a partner there, and I’m on the Python Software Foundation board. So I’m a director voted in by the community, which I appreciate being voted for and not voted for and just appreciate getting to represent people.
00:02:15 I’ve also ran Jengocon us with a bunch of organizers. For the last six years, I started a nonprofit or co founded a nonprofit called deathna, which is the Django Events Foundation of North America. And then I also run a weekly Django newsletter called Django News with William Vincent, who I think has been on your show maybe before or one of your other shows.
00:02:35 Yeah. He’s involved with the newsletter with you.
00:03:18 And I just heard that you’re trying to automate some of this.
00:03:24 I’m sure it’s maybe Fool’s errands. I don’t know who’s goal.
00:03:30 Yeah, most of it we use a product called Curated because we figured we really didn’t want to write our own newsletter software. So it’d be really easy just to write like newsletter software and never really write about what’s going on in the community. So instead we just decided let’s do something that’s off the shelf and then we can just focus on it like 15 minutes. We can actually be doing the thing that we want to do versus taking time to write software to run it and write it. So I’ve been trying to finish some of the automation pieces around like tweeting out links and stuff like that, which I think that probably if I was successful, we probably had two or three of those go out today.
00:04:07 How did you get into working with Django?
00:04:11 I think about eleven years ago or so. I was living in southwest Missouri and I was kind of looking at where do I want to go career wise? Because I felt like I wanted to do more Python development or more web and I was interested in startups and kind of side projects and stuff. And then I kind of happened on Ruby on Rails out of Chicago and I’d spent time in Chicago and really like Chicago, and then I noticed there was a weird blip called Django and that blip on the map was in Lawrence, Kansas, which was only about a three hour drive from where I lived. And so I noticed it was kind of funny because Matt Croydon and Simon Wilson, who were both people I really respected and read their blogs for years, were both like PHP developers and they both had switched over to Python. At one point I remember seeing Django and I think I actually emailed Simon Wilson to say, hey, have you seen this cool framework called Django? And I had no idea that he was one of the co creators of it. And so I don’t know if Simon, if he ever listens to this, if he has that email still, but I can’t remember if he replied or not. But I applied for the job to come work at the Lawrence Journal World. I didn’t get the job, but they told me try back in a year or something, we’ll probably be hiring again. And I think about like eleven months, twelve months to the day. I got an email from him and said, hey, if you’re still interested since you’re local, and I applied and came to work here and worked for the newspaper for probably three years. It’s actually one of the newspapers other companies.
00:05:32 They basically forked off the Django people to work on newspaper software.
00:05:36 And so I came to work on what was called Ellington CMS, which was kind of the second, I think, Django app, but it was definitely the biggest for the time. And then three years later, I left to come work with Frank Wyles and Jacob Kaplan Moss, who was another creator of Django.
00:05:54 They invited me to come work at Repsys, and so I’ve been working with them for the last, I think, eleven years or ten years or something like that. And they asked me to be a partner a couple of years ago. So nice. We’ve been doing Python Jango consulting. So even though I do Django, probably half of my year, last year was writing Flask code. So I don’t just do Django code. It’s anything and everything in Python.
00:06:14 So you do a little bit of everything. Were you a developer before Django?
00:06:19 I actually was director of operations for an ISP, but I actually left the company. I was working in ASP in Microsoft World. I left that company to do PHP side work or contracting. And then a group, they were starting an ISP, and they invited me to just basically join them because they had servers and they’re like, well, it’s pretty easy to scale PHP on Linux server. So why don’t you come join us? And then part of the time you can help us write software for the business, and the other half, then you can work on creating a consulting agency or bringing basically, like, web applications to the Joplin, Missouri Metro area. And so I did that. And three, four months later, they made me director of operations. And then I ended up running ISP for five years. So I did like to run like, I wrote a lot of PHP code and part of my hobby at the time. We ended up switching over to Django and Python at the time because I was pretty good with PHP, but it’s kind of tough to write server scripts and stuff with it. It’s much easier. Python just works better. I feel like if you’re going to run it from a console level. So we slowly started switching the ISP over to more and more Python, and Django came out. And at the time, it wasn’t a great framework. There was a lot of magic. So when Django had the magic removal, that’s when it really hit my radar, as this is kind of perfect because I don’t have the right admin code, which means it’s easier for people to input data who isn’t just me. And so our staff could work on data projects and stuff that we had. And we ended up running like a news portal for Joplin for a while. So it kind of grew from there. So I liked being a developer. I’ve kind of found myself in management roles over the years, but I’m kind of happy to do whichever.
00:07:55 Now, are you still in Kansas?
00:07:58 Yeah, I still live in Lawrence, Kansas. I feel like me and Frank Wyles are here of kind of that Lawrence Journal World group trying to think who else is here? Daniel Lindsley, who wrote Haystack.
00:08:08 He actually moved to Seattle and moved back here. So there’s still some Django people from, like, you know, that worked in that era still around here.
00:08:16 Okay. Is your job now remote? I mean, do you work from home or do you have an office?
00:08:21 We do have an office that’s probably seven blocks from my house. I think I’ve been to it three times in the last year.
00:08:27 I’ve got a three year old. And, like, with all things with COVID, we’ve mostly worked from the home and stuff like that because of it.
00:08:35 I have this thing about testing. I think it’s a good thing. And you wrote something on Twitter saying something like, by the way, don’t ever mock your database or something like that. So I wanted to poke that. How do you feel about testing with Django and why they need to tell people to not mock the database.
00:08:55 So as a consultant, I see a lot of good and misguided, and I see a lot, basically.
00:09:03 And so I think I stumbled upon a client who was doing a lot of database mocking. And this is a code base that’s kind of older. So at the time, ten years ago ish, let’s say mocking your database was something that people encouraged because Docker wasn’t a thing back then. It wasn’t prevalent. So the ability to spin up, like Postcards or MySQL boxes on the fly and connect to them was kind of slow. And so I think because it was a little harder to configure for a while, we started seeing people who were mocking up their databases.
00:09:33 I think now there’s just no reason to do it. I think it’s more complicated. It’s harder for me.
00:09:39 For one, I guess I don’t want to just say mocking is bad. Mocking can be really good. And so I guess for me, the lines for Mocking is like resource boundaries. Like if you’re going to write code that uses Boto and you’re going to connect the S Three, then it’s kind of hard to write tests. And if you have a team of developers, you don’t really want to distribute test keys and have people writing tests which write the S Three, and then everybody has to have their own buckets and their own credentials. So using a library called Motto is a really good way to mock up S Three, Amazon services in general, AWS services. So Motto, I think, is really good because you can actually write to it. You can fake where data can go. You can verify that things would have gotten saved to S Three if you have the right credentials. And it’s just a really nice library to work with. And I think the other kind of like anything to do with certain types of state, like with time zones, state timestamps. I think freezing or mocking dates is a really good way to test.
00:10:39 I did have a client issue not that long ago where they didn’t have data loaded for daylight savings time for this year, but they had a lot of tests that looked at like two or three years ago, daylight savings time and trying to predict that in different dates. And so testing past dates, present dates, and future dates, I think those are really good applications to mock that way, because you just don’t know what’s going to happen when your tests try to run at February 28 at midnight to know is there going to be the next day of February 29 or not?
00:11:17 Yeah. If some user interaction happens over the boundary of a day, what happens to that? Things like that.
00:11:28 Exactly. So I think with, like with databases now.
00:11:33 Yeah. I just think it’s just a lot of overhead. And what I see is a lot of drift that happens where people will write a layer of mocking and then they’ll add new features to their code base, run their test and code a while. When they mock their databases, the mocks no longer reflect the actual code and what’s happening on the servers. And so I’ve seen it where a feature can be like five lines of code, but then they may have to overhaul 50 lines of code to basically make the mocks happy. But when you remove the mocks and test against the database, these problems don’t exist. And so I would avoid that drift in general if it’s a distributed database. Like maybe, I don’t know if MongoDB would even matter, but if you consider S three a remote database or something because of how you’re saving objects, mock those kinds of things, for sure. But Postgres and my sequel, there are ways to speed up the test so that you don’t have to worry about performance issues when running them.
00:12:27 Okay. But doesn’t all of the unit test advice in the world say don’t touch the database and don’t touch the file system and things like that?
00:12:38 I guess I don’t really read that much about best practice of database mocking.
00:12:44 I understand being able to test isolated areas, but to me that’s kind of a function call as well. So testing individual pieces. I agree. And I believe in integration testing.
00:12:55 And for the most part, I’m more of a performance testing person. So because being a consultant, people usually call because they’re having some major issues, like why is their homepage taking 5 seconds to run? But other pages on the website may take a second. And so what we do is look at that and break that problem down to see why like real world. This happened once and somebody had what’s called Django middleware, which is like functions that call every time a request is made and they were hitting the database and pulling up a lookup table that had a quarter million records in it every single time somebody pulled up their home page. And so things like that, by looking at performance testing and measuring the number of database queries that happen every time that Homepage view gets hit very quickly revealed that you have this big table. And why do you want to load this for every single request? And so those issues are kind of fun to find.
00:13:50 And I do believe in testing. So if clients I’ve seen that don’t test much have hidden performance issues they just don’t know about because they don’t have the metrics to see where they’re coming from.
00:14:05 This episode of Test and Code is brought to you by Data Dog. Do you have an application that is performing slower than you like? Do you know why requests have high latency? With Data Dog, you can find the root cause fast troubleshoot your app’s performance with Datadog’s end to end distributed tracing and continuous code profiling to quickly detect what happened and why down to the line of code. And in one click you can correlate individual requests with related logs and infrastructure metrics to get full stack observability with zero contact switching. Start tracking the performance of your apps with a free trial at Test And Code. Comdatog and Datadog will send you a free T shirt. And to be honest, my Data Dog T shirt is one of my favorites. I wear it all the time.
00:14:59 If I don’t mock the database, then do I use the live database or is there some other alternative?
00:15:05 I like to use pytest fixtures and I’ll create fixtures on the fly.
00:15:10 Fixtures kind of get a bad rap, I think. I think pytest really reinvents maybe, or reinvigorates fixtures because like in early Jenga days, what was common practice or best practice was to use a bunch of JSON files that you would load up and pipe. Test fixtures have the right level of granularity that you can say, give me one object or give me a thousand objects and be able to test against it pretty quickly. So that’s my go to is pytest fixtures.
00:15:38 What does that mean? Do I have a test that uses a fixture and that fixture goes out and spins up a small database or something and throws data in it?
00:15:48 Or what do you mean I will test with a live database. It’ll be a local database, depending on the size of the test suite.
00:15:57 Instead of using it on this cache, we might use it in memory cache. That way your rights are really fast. And so usually the first couple of seconds of any test suite I write, that might be what loads up database fixtures. Some of those get written to the database, some of those are just on demand so you can pull up an object. It will give you everything minus hitting save on the object Django so that you can hit that object and run functions against it. So if you’re testing some logic like maybe a property or something that needs to calculate something that’s on the field but doesn’t need to hit the database, I’ll just use one of those fixtures and not save it because you can do the calculations based on seeing the fields themselves if I need to save it. I mean I can do thousands of tests in a minute. So to me that’s kind of the line. Does my test suite take more than a couple of minutes to run? Then let’s see what we can do to maybe isolate some of the tests or look at the fixtures to see am I Loading too much data?
00:16:51 That’s kind of my general use case is how do we keep this under a couple of minutes?
00:16:56 Okay, that seems reasonable. So you have like maybe a session scope fixture that loads some live data locally in the database and maybe you want to speed it up you can use in memory database and then you just have a whole bunch of tests that are using the same database then yeah, okay, cool, nice. Also the performance aspect. So one of the things that I think I don’t really address too much is the performance aspect.
00:17:22 Again because of being a consultant, it’s usually the I get brought into a job because somebody has some clients have a ton of test and code thing I like about Pi test is being able to Mark tests based on time and say like slow tests or any tests that take over a second and then immediately isolating those and then we kind of know what we need to work on. The other part is let’s say a client only has about 30 40% test coverage. And so one thing I like to do is go and write just very loose tests around like all of their views or all of the rest APIs that really just calls the endpoints even if it doesn’t have enough data and if it records $500 hundred error or whatever, the point is just to live hit that endpoint to get some kind of response from it, and then do an assertion on what those response codes are. And then that way we at least have some coverage. When you look at coverage reports to know that we’re at least touching as many aspects of the code base as we can. And then from there that’s when I put the marks in to say these are slow tests that take over a second. Sometimes these will be 27 2nd test and nobody really understood why does this one test take longer? Because they don’t really measure it. And then from there we come back and start looking at like why does this take a while?
00:18:36 Probably if it’s taking 27 seconds it’s trying to access some outside web service or an internal web server. It’s timing out and you just can’t see it. And then from there we can start making determinations of maybe for tests we should time this out after half a second or a second and we can start assessing what’s going on here. Another round I like to do, too, is just seeing if a client says, like, usually what I’ll ask them is what is like the bread and butter of their application? And this could be vegan butter. This can be gluten free bread, like pick for your condiments or sides of choice. And basically, what is the area of your code that’s your main business logic, like what’s paying the bills. It’s probably not going to be your about page, but there’s probably some core functionality. That’s why I’m going to go back then and start peppering the certain queries calls and stuff where we’re looking at the actual database calls. And those are the tests that we’re going to start with, and we’re going to fill in the actual fixtures to make the data look live and then measure what’s going on inside it.
00:19:36 Yeah, I like that. I like that you brought that up. One of the things I try to encourage people to do is prioritizing the parts of your system that makes sense.
00:19:44 And it’s completely valid to say the reason why people use my software is this thing. That thing needs to work then. So test more around that than around things like saving to Excel or something like that.
00:19:58 Exactly. That’s where we want, like your 80 90% test coverage. But for some of your ancillary content pages, like your blog, maybe, unless your blog is where you make your money, then it’s probably okay to have 20% or 40% test coverage there. It’s just priority wise, it’s probably not worth the time and money to prioritize putting tests there versus the part that pays your bills or where you accept transactions.
00:20:20 Yeah, definitely. And if you have, for instance, a high transaction area, something that service that people are using needs to be fairly fast also, whereas maybe your contact form is okay if it’s a little laggy because people are just sitting there waiting for it to show up anyway.
00:20:40 So you go in as a consultant. So you see a lot of different testing styles then probably. Are a lot of people embracing pytest with Django testing and with other web testing, or is there a lot of unit tests out there still?
00:20:56 So we’re kind of in a period where I feel like the people who haven’t moved over from Python Two to Python Three have kind of got the message, like Pip doesn’t really work anymore. You’re not getting new versions of Python 27 anymore. So I’m seeing a lot of old unit tests more and more, but thankfully running Pi tests to bootstrap those is not bad. It’s a pretty decent experience. So getting people to pytest is kind of the first thing I do 90% of the time, I can just install pytest, make a couple of configure a Pi test Ini file, and just run with it. And so I do see about 50% old unit test, but most of that is just because of this. I think we have people that are decided that how long do we want to run on Python Two now that it’s not being maintained anymore?
00:21:44 Okay, so you think people that are starting out on Python Three are starting with pytest?
00:21:50 I feel like it’s got to be a good I mean, if you care about testing, I feel like that’s where a lot of the articles are being written now is about pytest. So yeah, that’s at least my recommendation for sure.
00:22:01 It took me kind of a while to embrace pytest, maybe because I didn’t understand the way some of the plug in architecture worked. So it was a little weird, like writing a test function and using arguments that I didn’t know where those came from. But once I kind of wrap my brain around pytest fixtures, I really like how it works now.
00:22:20 Do you recommend people like, rewrite their unit tests if they already have an existing set of tests or you just run those with pipest?
00:22:30 I would run those with. And I think it just kind of depends on where they’re at in testing, like if they already have good test coverage. I don’t really believe that code gathers dust or Moss as it gets older, so it may not be worth it from a business perspective or justifiable for them to rewrite everything. But I think as you write new tests or as you touch old tests, especially because of that functional design in most Pi test, I don’t think I think it actually kind of makes things easier because Lacey Williams Central, I think you had on a while back, she works at Representative and we’ve worked on tests and different APIs and stuff together. And at one point we decided just to switch everything from kind of that classic unit test style to just a pure Python style. And she absolutely hated it and kind of cursed my name, I think, for the first couple of days of it. But then after some time we look back and go, actually, that wasn’t too bad. And so those tests kind of saved me a couple of months ago and I couldn’t figure something out. And I was trying to wrap my head. I don’t even remember what the problem was, but I couldn’t wrap my head around either author some piece of it. And I went back and looked at those test and code enough. Lisa had figured it out and that was like a huge time save for us.
00:23:43 Cool with testing with Django. I know that Django has implemented has this way where you can run you don’t have to run through the web interface, then you don’t have to run Selenium or something I can hook right into the Django guts and run locally, right? I don’t know how that works.
00:24:03 Yes, you can. I don’t really know how it works either. It’s just kind of a request object. And the Django developers, I guess, figured out some mechanics so that it looks like the test servers running.
00:24:56 And it’s not that I’m awesome, it’s just if you do enough testing and once you get in the hang of it, you don’t really need to go back and manually test as much. You still should. Like, these are healthy things to do, but from Rest APIs and stuff, it’s kind of a solved problem, I guess, right?
00:25:14 I guess for Rest APIs definitely.
00:25:18 Is that where your time is often spent, or are you testing things that have web pages to look at?
00:25:27 I’d say it’s probably 90% or more Django Rest Framework or Flask or probably fast API more and more.
00:26:31 Okay, so do you see any of the test suites you run across or run across have like Selenium components or to test the front end?
00:26:45 I don’t work with it as much, but yeah, I had a client a while back that used robot framework and it fires up Selenium, I think, or Google Chrome and does a bunch of manual automated tests. It reminded me when I saw this test run, it reminded me of Fantasia where the little buckets have the water and they’re all walking like to see everything dance around your browser and everything. But you can switch that to like a headless mode and stuff too and get good tests from it.
00:27:10 It was pretty impressive. I don’t tend to write those types of applications, but if things switch to be more servicide, then whatever that is for Pi test is what I’m going to look into and do more that Dom style testing for seeing what the browser sees and filling in data.
00:27:26 Okay, cool.
00:27:28 Well, I’m glad that you can trust web stuff without having to do this Lenient style. That just doesn’t like fun to me. But maybe that’s just me.
00:27:38 Yeah. I mean, I don’t know that I think testing is necessarily fun, but I like that feeling when you hit deploy with a reasonable amount of confidence that you can go get coffee or have a weekend.
00:27:49 So while I don’t love testing, I love the freedom and security maybe from a good test suite that lets me know something is going to break or not break at least the obvious problems.
00:28:00 Yeah, I’m thinking about databases again. The local databases that you run, are they pulled from the live data or do you have some dummy data that you fill in 90% of the time it’s dummy data.
00:28:16 I have had clients before that will have some pretty sizable use cases. Usually the more Selenium type driven tests those tend to rely, I’ve noticed on more like where they’ll copy client data to a staging server or they did at one point in time and then they tend to maintain those. So probably every ten clients I have that are just generated data, there will be one that has a really complex because one of them was kind of they dealt with a lot of encryption and a lot of like serial number type encoding stuff. It was kind of harder to fake and have reasonable data. So I think what they did is they had a pretty good system around. If they found the regularities and bugs, they could freeze that data and they would copy that down to their staging servers and that’s what we would run test off of too. And then that way after they kind of anonymized it and stuff so that everything was secure. But yeah, they were using kind of real world data to generate their fake data and then keeping that data set up.
00:29:09 For generating fake data, there’s a bunch of tools available. Do you have a favorite that you use?
00:29:15 Yeah, model bakery is my favorite. It’s got a pretty nice API and does a pretty good job of generating fake data along with filling out the objects in a nice way.
00:29:28 I’ll have to check that out.
00:29:32 Does it work with Django models then?
00:29:35 Yeah, that was the model part come from?
00:29:38 Yeah, it’s for Django models. Essentially it uses Faker and so Faker you can use on anything like Faker I assume works with SQL alchemy but Faker is just create a bunch of random first names, last names, addresses or whatever and so they have integration with Faker with it.
00:29:56 Okay. Nice. So you don’t have to do that yourself. You can give it a model template or something and it’ll yeah.
00:30:04 You can either just give it a model like pass the Django model into it or the string representation of the app and the model name and then it will generate as many data fake objects as you want.
00:30:16 Yeah, that seems like it would be very helpful. Nice.
00:30:19 Well, I think this is a fun discussion so thanks so much for joining me today, Jeff.
00:30:25 Yeah, thanks for having me.
00:30:27 It’s nice to talk to you every once in a while and if people want to find you on the web where do they find you?
00:30:33 I’m pretty vocal on Twitter at web ology so you can either find me there. My website is jefftriplet.com which most people can’t spell but that’s fine.
00:30:42 Thanks a lot and we’ll catch you later.
00:30:44 Thanks, Brian.
00:30:49 Thank you, Jeff. Lots of great information. Thank you. Also to Patreon supporters, join them at Test And Code. Comsupportthank config cat for sponsoring release features faster with less risk with config cat, check them out at configat.com, try them for free or use code Test And Code 2021 for 25% off. Thank you data dog for sponsoring modern end to end monitoring and security. See inside any stack, any app at any scale, anywhere. Get started with a free trial at testandcode.com data dog and davidog will send you a free tshirt. Those links are in the show email@example.com one, five, four. That’s all for now. Now go out and test something.