26 - pyresttest – Sam Van Oort

Interview with Sam Van Oort about pyresttest, “A REST testing and API microbenchmarking tool”. A question in the Test & Code Slack channel was raised about testing REST APIs. There were answers such as pytest + requests, of course,

Transcript for episode 26 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.

This is Test and Code, episode 26. I’m your host, Brian Okken, and on this episode I talk to Sam Van Oort about pyresttest.

A question in the Test and Code Slack channel was raised about testing REST APIs.

There were answers such as pytest plus requests and other various answers, but there was also a mention of a tool called pyresttest, which I hadn’t heard of. I checked it out on the GitHub repo and was struck by how user friendly the user-facing test definitions were. So I contacted the developer, Sam Van Oort, and asked him to come on the show and tell me about his tool and why he developed it.

Today’s podcast is supported by Patreon supporters. Visit pythontesting.net to find out how you can help get more shows on the air and help me pay for some services like audio editing and transcripts.

Welcome to Test and Code, a podcast about software development and software testing.

Thanks for coming on.

Thank you for inviting me in to do this interview.

I wanted to talk about pyresttest. Am I pronouncing that right? Is there a good website for this?

The README docs on GitHub are where most of the information lives right now, so there should be links to navigate around. There’s a fair bit of documentation up there too, as well as some code examples.

Okay, so to back up a little bit, can you tell me a little bit about yourself first before we get into pyresttest?

Sure. So I guess I should give you some background.

I’ve done a lot of different things, but I started getting seriously into programming when I was working in science labs.

My original background was in chemistry, but then I branched into research and nuclear physics.

Yeah. So it’s kind of a weird starting point for getting seriously into programming, but I found that there was kind of a lack of applications for the kinds of data analysis that I wanted. So I basically ended up writing my own and then from there, via a long and circuitous path, I ended up as a professional software engineer, most recently.

Well, my current employer produces an enterprise flavor of Jenkins, so we’re in the DevOps space doing CI/CD work.

I’m currently a Jenkins core contributor, so my day job is primarily Java, but I also do Python for fun and, of course, for this side project.

Before this, I also worked at Red Hat, and I did a mix of different languages there, primarily Java and Python, plus a variety of other things in JavaScript, the usual. Currently, I guess my claim to fame in Python besides this is that I have production code running at Red Hat that is serving a large number of users.

In fact, if you’ve installed RPMs on legacy versions of RHEL, odds are very good that you’ve gone through one of the code paths that I touched.

Oh, that’s cool.

So not in a deep way, mind you. I didn’t write that application, but yeah, so I’m used to working on things that are fairly business critical at the companies I work at.

At CloudBees, currently I’m engaged with a lot of our bigger customers, and I do.

I’m sorry, I didn’t catch that. What was the name of the company again?

CloudBees.

CloudBees. Like as in the insect bees?

Yeah.

Okay.

Our logo at one point was actually a beehive.

Okay.

So I guess, is that more or less what you’re looking for? Sort of where my background is. I’m not trained in QA.

I tend to do testing as incidental to my main work.

Yeah.

That’s actually how I got into it is making sure the software that my own team is working on works as it’s supposed to.

So I came across pyresttest because somebody on the Test and Code Slack channel was asking about testing REST APIs. And one of the people mentioned that there was this pyresttest that we should check out, and I had never heard of it. So I Googled it, of course, and found the GitHub site, or the GitHub README. And one of the things that intrigued me about it is that right on the front you’ve got a sample test that looks really easy to read.

And that’s what intrigued me is that the interface to the user seems very friendly.

Thank you.

I haven’t tried to run it, but I’d like to know more about what problem you are trying to solve that wasn’t solved by other products. Do you use this, what is it used for, and where are its limits?

If somebody’s listening to this and trying to figure out whether they should use it, what’s the right kind of project to use this on?

Sure. So that’s actually a really good question. So the reason it came about is that I was doing a project at Red Hat that involved a very significant change in our services layer, all of our REST APIs.

It’s used to serve a lot of the core business functions at Red Hat.

And in the process of this, we were doing a lot of rearchitecting and changing the deployment model and changing, basically doing a lot of deeper changes to the services.

And we needed a way to validate that, after making those changes, they worked.

Basically, the way it started was a very simple bash script that I used to invoke a series of curl commands and look to see if they return successfully.

Very basic, right?

Yes.

There wasn’t really anything that was super easy for that purpose.

It sort of served its purpose well enough. But as you can imagine, being a bash script, there were some pretty serious limitations.

You couldn’t do very deep inspections.

The scripting was very brittle and hard to extend.

If you’ve written anything on your side in Bash, you know that it’s not an ideal language beyond shorter scripts or kind of limited functionality. I mean, it is a Turing-complete language, but support for typing is very weak. Use of arrays is very painful in some cases.

Bash is not designed to scale to complex features.

Yes.

So what I ended up doing was basically rewriting the whole thing in Python.

So the interface is YAML. Am I getting that right?

Yeah, I picked YAML because I wanted a way to declare tests for systems that was easy to use. I had a choice basically between a config file, which was basically like a delimited set of address or URL plus expected response code or method type.

You can have that like a one line for each test, but that’s very limiting. That’s how it started out with the Bash test.

And then I looked at using XML or JSON. XML, as you know if you have worked with it, and I assume we all have...

It’s not exactly the most concise language, right?

Yeah.

Parsing it can be very... there are lots of gotchas, things like attributes and formatting and schemas and validation and all of that.

And I think XML was intended to be like a human readable thing, but the way most people use it, it’s definitely not very readable.

Yeah, it’s readable, but only after a fashion.

And JSON is wonderful for server to server communication. But if you use it in practice, its support for data structures is not as human readable.

It’s not cleanly formatted.

People who’ve worked a lot with AWS APIs have some familiarity with the challenges of using JSON for everything. I settled on YAML basically because I thought it would be easier to read. It’s very close to something like Markdown or one of the other minimal markup languages. And the structure of the code very naturally maps to how you would write a piece of text.

To give an example, I’m looking at an example here.

You can declare a test with just test colon, and then you can give it a name and a URL, and that’s a basic test. Now what does that do? Does it just validate that you can get something from this URL?

Yeah, sure. I think I didn’t properly answer your question earlier. So there’s kind of three levels to pyresttest.

The basic case that I was initially trying to solve is just checking that basically if you make a call to a given API, it returns successfully.

So by default you’ll look for like a 200 response code.

Okay.

Except for POST and PUT, which can return different values.

If you’ve created something or if it already existed, you’re basically validating a set of HTTP response codes. That’s like the first level. The next level is going a little deeper. You can validate that the response content matches certain patterns. There’s a pluggable set of validators that can check, for example, text in the response body, look for headers, or pick specific codes you expect to see back. Like if you have an API that’s creating records for employees, validating that when you try to create an employee that already exists, it returns an error. And then sort of the next level is you can build this up using variable bindings and templating.
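As a rough illustration of the first level described above (checking the response code, with a default that depends on the HTTP method), here is a minimal Python sketch. The `DEFAULT_EXPECTED_STATUS` table, the function names, and the `/api/person/` URL are assumptions made up for this example; they are not taken from pyresttest itself.

```python
# Hypothetical per-method defaults, chosen for illustration only.
# They are NOT copied from pyresttest's source.
DEFAULT_EXPECTED_STATUS = {
    "GET": [200],
    "POST": [200, 201],
    "PUT": [200, 201, 204],
}


def expected_statuses(test):
    """Return the status codes a test accepts, falling back to per-method defaults."""
    if "expected_status" in test:
        return test["expected_status"]
    method = test.get("method", "GET").upper()
    return DEFAULT_EXPECTED_STATUS.get(method, [200])


def status_ok(test, status_code):
    """True if the status code returned by the server is acceptable for this test."""
    return status_code in expected_statuses(test)


# A basic test: only a name and a URL, so a plain 200 is expected.
basic = {"name": "Basic get", "url": "/api/person/"}

# Creating a record: the default for POST also accepts 201 in this sketch.
create = {"name": "Create person", "url": "/api/person/", "method": "POST"}

# Creating a record that already exists: the test explicitly expects an error code.
duplicate = {
    "name": "Create duplicate person",
    "url": "/api/person/",
    "method": "POST",
    "expected_status": [409],
}

print(status_ok(basic, 200), status_ok(create, 201), status_ok(duplicate, 409))
```

The point of the sketch is that a test only has to state what differs from the default, which is what keeps the simplest test down to a name and a URL.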

Oh, really? Okay.

This lets you build basically arbitrarily complex kinds of requests and responses.

You could template out URLs, cookies, and request bodies.

And also the same thing applies to the validations.
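To make the idea of variable bindings and templating concrete, here is a small sketch using Python’s standard `string.Template`. The `render` helper, the field names, and the `$person_id` and `$first_name` variables are hypothetical, and this is not a claim about how pyresttest implements templating internally.

```python
from string import Template


def render(test, variables):
    """Substitute $name placeholders in every string field of a test definition."""
    rendered = {}
    for key, value in test.items():
        if isinstance(value, str):
            rendered[key] = Template(value).substitute(variables)
        else:
            # Non-string fields (lists of status codes, etc.) pass through untouched.
            rendered[key] = value
    return rendered


# A templated test: the URL and the request body both refer to bound variables.
template_test = {
    "name": "Get person $person_id",
    "url": "/api/person/$person_id/",
    "body": '{"first_name": "$first_name"}',
    "expected_status": [200],
}

bound = render(template_test, {"person_id": 7, "first_name": "Bob"})
print(bound["url"])
print(bound["body"])
```

The same substitution step could be applied to the expected values in a validation, which is the "same thing applies to the validations" part.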

Now is this something I’d probably use for the most part on a production site, just to run it occasionally and make sure everything’s still working?

Yeah. So there’s kind of two ways to do it. I think that’s a very common use. What’s often done is that it’s hooked into deployments. So every time you run a deployment, you’ll fire pyresttest to validate that all of the APIs work at whatever level of detail you want. Now, the other approach is to use it as a functional testing apparatus.

So something a little bit higher level than just a basic unit test, but you’re actually testing that when you launch your application, it works.

You’re testing from the external interface, all of your requests and responses behave as you’d expect.

Yeah. Okay.

I’m trying to figure out some of the limits of it. If I had an API that did some work with a side effect that’s not obvious from one request and retrieval, and I had to interrogate some other part of the system to see if it worked, that would be difficult using this, I’m guessing.

Yeah, it’s really designed to work with HTTP requests. It’s not designed to work with browsers, and it’s not designed to couple into the database. But there is extension functionality provided that would allow you to do that if you chose. You’d have to write some more involved Python code to do that.

Okay.

Yeah, it’s built to solve the most common case, but also to let you grow it out if you want to do something more involved.

Well, this definitely looks like it’s way easier than Bash, I guess. Continuing on, so this is implemented in Python.

Have you looked at like, for instance, using requests or something like that to do a similar functionality?

Yeah.

I guess it’s a loaded question. My first reaction, before I even looked, was that with requests it wouldn’t be that hard to write a test suite to test an API. However, the interface is really concise, and I would be hard pressed to write something more concise than this for the end user interface using anything else.

Thank you. That’s high praise.

Well, I don’t know if it’s really that high praise, because my job isn’t to test APIs.

So that’s kind of the most common question I’ve gotten about this: well, couldn’t I also do this with unittest or tox plus requests, or insert your Python testing framework of choice?

Yes, you could.

It is a direct competitor to that combination.

But as you’ve kind of highlighted, the big advantage is that the syntax here is very terse. It’s designed to be very easy to use, very readable. It’s more or less declarative, which makes it a little bit more straightforward to follow the logic.

And I think the big win here is that the syntax is designed to be language independent. So if you’re in an environment where, for example, Java, Go, Python, maybe Ruby, and JavaScript are all used for server-side applications, you don’t need to know Python to write a test in pyresttest. You just need to do a little bit of YAML writing.

So the big pros and cons of unittest and requests versus pyresttest: you can do a lot more.

You have a lot more flexibility if you’re writing tests in your Python code. Obviously you’re not as restricted as you are by YAML syntax, but of course you have to provide a lot more yourself. You have to check what response code you expect. You have to implement your own validations on the input. You have to have various handling for things like timeouts, network failures, and so forth.

So it starts out very simple, writing the basic test case in requests.

But as you grow out the list of conditions it has to handle, it starts to become more complex. And that’s one of the places where something like this can be very helpful, and that’s true for frameworks in general.

Yeah, I think that it’s a natural question to say, how does this compare to other test frameworks? And I guess I would rather look at it and see that this could definitely complement another test framework. I’m sure that the more complex things that you can handle are great as well. But at the very least, for making sure that the about page and some of the simpler stuff in an application works, it seems like overkill to use a Python test framework, whereas you could use them in combination. And if there are side effects that you need to check, well then use something else to check those side effects. But for the things that are just HTTP request checks, why not use both?

It’s designed to be complementary.

The idea is that you’ll have different test frameworks designed for different things.

This is designed to kind of cover the 90% case. If you have one of those special snowflake cases that needs the other 10%, then write your own code basically, or write an extension.

I think the idea here is that I wanted to focus on one thing or one cluster of things and do it well. Like I intentionally avoided including load testing, for example.

Yeah, okay.

There’s a ton of really good tools for that, everything from Gatling to Apache Benchmark to Siege to Locust to the others. I could name five of them off the top of my head, and there’s not really a need for that. This has benchmarking because I think that’s important for many APIs, but it intentionally doesn’t have load testing.

It’s not intended to do unit testing either. It’s not designed to do sort of low-level functional tests. It’s designed to be very much external interface oriented.

Now, I haven’t figured this out yet, but I haven’t really looked that hard. What is the output of this?

We’re talking over each other, but can I hook it up to a Jenkins build server or something like that?

Sure. It actually produces a summary output.

You’ll get a listing of all of the tests that were run, the groups, the overall pass/fail statistics for each, and errors.

There’s also a PR out that outputs it in JUnit format, or xUnit, I should say, so you could parse that. That’s currently in the process of being integrated. There were some architectural challenges that blocked that early on.

Mostly overcome now.

Yeah. So it’s designed to integrate with that. Also, if you want to hook it into scripting in something like Jenkins, it provides a return code, so if you just run it as a shell command, you’ll get back a failure if any of the tests fail.
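The return-code behavior described here is what lets a build server treat a failed test run as a failed step. A minimal sketch of the idea follows; the `exit_code` and `main` names and the shape of `results` are assumptions for illustration, not pyresttest’s actual code.

```python
import sys


def exit_code(results):
    """Map (test name, passed) pairs to a process exit code: 0 if all passed, else 1."""
    failures = [name for name, passed in results if not passed]
    for name in failures:
        # Report each failure on stderr so it shows up in the build log.
        print(f"FAILED: {name}", file=sys.stderr)
    return 1 if failures else 0


def main(results):
    """Entry point a command-line wrapper could call once all tests have run."""
    sys.exit(exit_code(results))


print(exit_code([("Basic get", True), ("Create person", True)]))
```

A shell step in Jenkins (or any other CI tool) then only needs to run the command; a non-zero exit status marks the step as failed without any output parsing.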

That’s great.

So many people forget to use error codes now in command line stuff, so that’s cool.

It probably helps coming from a Linux background, having worked at Red Hat.

Yeah, that’s good. Design a tool to do one thing well. I like it.

Now, is this an active project then?

Right now it’s been a little bit quieter lately because I’ve been kind of bogged down with work responsibilities. But yeah, it’s still under development. It still gets issues and PRs submitted.

I’m actually looking at doing a bigger push over the holiday season, when I have some time to do coding, to integrate some of the planned features and tie together some of the submitted features.

Okay.

It’s got a small open source community, but it’s had a number of contributors offer stuff.

Is there any particular part of it that you’d like to have anybody help with, or if somebody wanted to get in there and help you with it? I guess there’s some issues and whatnot that somebody could look at.

Yes. I actually have a tag that I’ve created for issues, to mark them as things that I would really like to have community assistance with, or that would be good candidates for someone making their contribution.

Oh, yeah. You’re using the help wanted tag also.

Yeah.

That’s kind of my marker for hey, this is something that would be great for someone to chip in with.

I think one of the bigger things that someone could do that’s really helpful is that there’s a pluggable, registry-based approach for comparators, validators, and extractors, which is sort of my approach to...

So I should take a step back and explain what those are, because I haven’t done that.

Basically the way we do functional tests here, it’s a request response model.

The way you get information out of a response is that you can use an extractor to extract some variable from it to use in future tests. Like if you create a user, you’ll get their ID, and then you can use that to validate that the user has certain properties in future requests, or to do things with it. You have validators that validate some property of your responses, for example that it has a valid username, or that it is returning JSON.

There are comparators that let you check some extracted property against another, like looking to see that when you create a user, the created username matches the one you submitted.

Okay.

And then there’s generators which let you generate templated content to use in testing, like creating, say, a list of usernames.

So all of those are pluggable and there’s a set that are provided.
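One way to picture a pluggable, registry-based design like the one described is a set of dictionaries mapping a short name to a function. The sketch below is only an illustration of that pattern: the registry names, the `register` decorator, and the `json_key`, `eq`, and `contains` plugins are all hypothetical and are not pyresttest’s real API.

```python
import json

# Hypothetical registries: each maps a short name (as it might appear in a
# test definition) to the Python function that implements it.
EXTRACTORS = {}
COMPARATORS = {}


def register(registry, name):
    """Decorator that files a function in a registry under the given name."""
    def decorator(func):
        registry[name] = func
        return func
    return decorator


@register(EXTRACTORS, "json_key")
def extract_json_key(body, key):
    """Extractor: pull a top-level key out of a JSON response body."""
    return json.loads(body)[key]


@register(COMPARATORS, "eq")
def compare_eq(actual, expected):
    """Comparator: the extracted value must equal the expected value."""
    return actual == expected


@register(COMPARATORS, "contains")
def compare_contains(actual, expected):
    """Comparator: the extracted value must contain the expected value."""
    return expected in actual


def run_compare(body, spec):
    """Look up the named extractor and comparator and apply them to a response body."""
    extractor_name, query = spec["extract"]
    actual = EXTRACTORS[extractor_name](body, query)
    return COMPARATORS[spec["comparator"]](actual, spec["expected"])


body = '{"id": 42, "username": "bsmith"}'
spec = {"extract": ("json_key", "username"), "comparator": "eq", "expected": "bsmith"}
print(run_compare(body, spec))  # True
```

Adding a new plugin in this sketch is just another decorated function, which matches the "name it and have some code" spirit of the description.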

I’ve got things that cover a lot of common cases, but there’s always room for someone to contribute more, like more XML validation maybe, or just about any feature you think is useful that could be implemented that way is a great thing to submit.

Yeah.

So it’s designed to kind of grow that way, and I’d always love to have more features added via those.

Okay, well, I think now I’m just intrigued and I’m trying to find time in my life where I can go play with this, but I definitely think it’s cool. To get this installed and running, it looks like it’s just a Python install with some curl requirements, I guess.

Yeah, it’s just a pip install.

That’s one of the things I’ve put a lot of work into. I’ve got a sort of test harness that runs on my own personal Jenkins server to validate the installation, because it was kind of a challenge to ensure it worked across a range of environments.

Yeah, it really runs on everything, actually.

I have a PR out, in fact, that lets it run in Windows too, from someone recently.

Okay. I didn’t see that. So it doesn’t currently work in Windows.

It doesn’t explicitly have support for Windows yet, but it looks like the changes that were needed were very trivial.

Okay.

So it’s very easy to adapt to it. And of course, on Windows, well, there’s options to use like a virtual machine or run Docker or something like that.

Yeah.

I tried to make it multiplatform because originally it needed to support everything from very old versions of Linux like RHEL 5 through to very modern systems, so I put a lot of work into trying to get compatibility there.

Yeah. Let’s see. All the way up through the Python versions starting at 2.6. A lot of people are dropping 2.6 support.

The reason it still has 2.6 support is that it’s being used at Red Hat for testing at least a couple of projects, and sometimes those might be deployed on older servers.

Okay.

So keeping back compatibility there was kind of important.

I might decide to drop 2.6 support in one of the later versions if I need to add additional features that are kind of blocked by 2.6.

For now, it’s maintained.

Okay.

Anything else you want to cover about it that we haven’t hit on so far?

Sure. So I’ll highlight what I think are kind of the cool things about it, if that’s okay.

Oh, yeah, definitely.

So you mentioned the YAML syntax. One of the things I’m proud of is that this is designed in a very flexible way, so that you can plug in new validators and new functionality just by basically giving it a name. For example, testing or extracting content via JMESPath, which is kind of like XPath, a JSON query language: it’s just as simple as adding an extractor, jmespath colon and then whatever query.

So it’s designed to be really easy to add something. And because it uses a registry-based method, you just basically have to name it and have some code, and you can use it in your test syntax almost immediately.

I also put a lot of work into setting up defaults when parsing the test structure.

So by default, the basic test is going to have timeouts. It’ll have a basic test for HTTP response code.

You’ll cover the things you’d expect to see initially, and then you can just modify that as you go. So it kind of follows the Python philosophy of batteries included, and it also basically follows convention over configuration in general.

So I do see that, for instance, you have a config that you can specify, I guess a configuration for a test set. So can I have multiple test sets all in one file with different configurations? Like different timeouts for different parts of the system?

Absolutely.

Okay, cool.

And you also have an option to set up imports within tests, so you can compose groups of tests that include other files in different ways. So you could have one set of tests that’s exhaustive and includes a full suite, and then you could have a fast test set that imports a subset of those.
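The composition idea (a full suite that pulls in other files, and a fast suite that pulls in a subset) can be sketched as a recursive flattening step. Everything in this example is hypothetical: the in-memory `files` dictionary stands in for test files on disk, and the `import` and `test` keys are simplified stand-ins rather than pyresttest’s exact file format.

```python
def collect_tests(name, files, seen=None):
    """Flatten a test file and everything it imports into one ordered list of tests."""
    seen = set() if seen is None else seen
    if name in seen:
        # Guard against import cycles and against loading the same file twice.
        return []
    seen.add(name)
    tests = []
    for entry in files[name]:
        if "import" in entry:
            tests.extend(collect_tests(entry["import"], files, seen))
        else:
            tests.append(entry["test"])
    return tests


# In-memory stand-ins for test files; the file names are made up.
files = {
    "smoke.yaml": [{"test": "about page"}, {"test": "login"}],
    "users.yaml": [{"test": "create user"}, {"test": "delete user"}],
    "full.yaml": [{"import": "smoke.yaml"}, {"import": "users.yaml"}, {"test": "reports"}],
    "fast.yaml": [{"import": "smoke.yaml"}],
}

print(collect_tests("fast.yaml", files))
print(collect_tests("full.yaml", files))
```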

So I can build these up and import other tests.

Yes. This is actually something that was added fairly early on by a collaborator at Red Hat.

That’s pretty cool. I like it.

Thank you. There’s some nifty stuff in here.

There were some serious limitations in a lot of the test utilities that I had seen before this. Either they required a lot of custom code to build something, or what they tested was very restricted. There wasn’t a lot of flexibility to extend it.

Yeah.

I definitely want to thank you for coming on and talking about it. I’ll give you a chance, of course, to talk more about it. But one of the things I wanted to mention was that the main reason why I was excited to get you on here was just this.

It’s the beginner mindset.

It’s kind of a Zen thing.

It’s the beginner’s mind. I knew all of the tools I knew how to test with, and this was a different tool. And my first reaction is I don’t know how to use it. But if you’re testing a REST API, I want people to just take a look at this, because it may save you some time. And especially, I think in a lot of situations, if you’ve got some people who are responsible for making sure an application is up and running, and adding to it, who are not programmers, I think this is a really great option for people who are not developers themselves to maintain a test suite.

I’ve definitely tried to kind of cater to that use case, and I’ve tried as much as possible to make it friendly to new users.

But it seems like it’s powerful enough for the hardcore developer to get a lot out of it also, especially on a team where, let’s say, the development team is the test team, and they don’t have a lot of time for the regression suite. And I like that you position it as, I mean, it could be more thorough testing, but at the very least, it’s a really great way to do smoke testing of an application.

Thank you. Yeah. I mean, that’s kind of one thing I was kind of looking to address is how hard it is to get a lot of test suites up and running.

In many cases, think about setting up an environment, making sure that all of the different components work. Getting Selenium up and running in particular can be very painful.

I wanted something that was very easy to land that you could use both on a server for smoke testing. It’s designed to run both locally on a server, from a central test server or from a developer’s box.

Basically, I just wanted to make this process a lot easier because I’ve seen so many cases where it’s very painful, and I was trying to do something about that.

Yeah. I guess one of the things I hadn’t even noticed in looking at a lot of the examples is that you don’t specify the root of an application. Is that specified in the launching of the test, or how do I tell it the root?

Correct.

There’s a base URL that you give it. All of the tests are appended to that. There’s a way to override that. Of course, there’s an option if you want to hard code the full URL there. But the intent there is you can just change the server location to switch from running a test against a local application to, say an application running in your test environment to running in your production environment.
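The base URL mechanism is what lets one test file run unchanged against local, test, and production servers. A minimal sketch of that joining step follows; the `full_url` helper, the rule that a URL starting with a scheme counts as a hard-coded override, and the example hostnames are all assumptions for illustration.

```python
def full_url(base_url, test_url):
    """Append a test's relative URL to the base URL; leave hard-coded full URLs alone."""
    if test_url.startswith(("http://", "https://")):
        # Assumed override rule: a test that spells out a full URL ignores the base.
        return test_url
    return base_url.rstrip("/") + "/" + test_url.lstrip("/")


# The same test definition pointed at two different environments.
test_url = "/api/person/"
print(full_url("http://localhost:8000", test_url))
print(full_url("https://test.example.com/", test_url))
```

Because only the base changes between runs, the environment can be chosen by whoever launches the tests rather than by whoever writes them.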

Okay. Yeah. So then you have the exact same test running in two different environments, and the base URL is going to be different.

Okay, now I get it. It’s just the first argument on the command line.

Yeah.

What we were doing is basically you land a config file that has the test descriptions, and then you have a little script that invokes that and gives the server URL. So, to some extent, it actually separates control of the test execution from control of the test definition, which can be important, for example, for security.

Yeah. And then, of course, those URLs are going to be different, like you said, with a test bed or within a continuous build environment.

Okay, I interrupted you. But any more cool features you want to highlight?

Well, I think the way I did the extensions with support for separate, extract, validate, compare, generate.

I haven’t really seen that approach used in any other testing frameworks.

Maybe you know of one that does. I’d be curious to take a look at how other people have done it. I’ve certainly borrowed ideas wherever. I think there’s something useful that someone else does.

But I made some mistakes in the early design, I think, because I was still kind of learning some of the more abstract aspects of Python. But I think that was a really good architectural decision that I made, because it lets you compose everything very flexibly. It’s the same approach as, for example, bash scripting.

Each command does one thing and does it well, and you can pipe them together to create very complex results.

Can you, I guess, expand on what you mean by that? You said generate, extract, validate. What do you mean by that?

Yeah, so it’s laid out in the docs. They have a section, the Advanced Guide, that goes into this in detail if people are curious about how you can put it together. But basically the idea is you’re creating a sort of pipeline for all of your test scenarios.

For each request, you’ve got templating and variables that are stored in the context, and each test can build on the test before that. So, for example, one common case would be you create a test user.

Let’s say you set it up to give them permissions. You validate that the permissions are what they should be. You validate that the user can’t access things they’re not supposed to. You validate that the user’s account can then be disabled or deleted. So you create a full life cycle. So that’s the goal, right? That’s a very common scenario. Translating that to pyresttest, you have all of the tools you need to build that. So for example, you can use a generator to supply random usernames or, say, addresses.

You generate a dummy address like 1234 Anytown USA.

Okay.

Like say Bob Smith, you can use a list of different possibilities for the generators.

So that gives you the first part, the templating using a generator to generate dummy data. Then you can plug the output of that into your next test by using an extractor to pull out the user ID or the username and store that to a variable. You can then use that in your next test to validate that the user has the properties you want, using a validator that compares the variable to the extracted output from your response when you, say, look up the user.

Then you can use that same variable again.

Say, when you set permissions, to compare the permission expected with the permission output, or to check that there is a restriction in place on that user, or a permission present, using one of the comparator functions, or an extractor plus an exists check.
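The generate, extract, store, and compare pipeline described over the last few answers can be sketched end to end. To keep the example runnable offline, it uses a made-up in-memory `FakeUserApi` instead of real HTTP calls; the class, the `context` dictionary, the URLs, and the variable names are all hypothetical and only illustrate how each step feeds the next.

```python
import itertools
import json
from string import Template


def username_generator(names):
    """Generator stand-in: cycle through a fixed list of dummy usernames."""
    return itertools.cycle(names)


class FakeUserApi:
    """In-memory stand-in for a REST service, so the sketch runs without a network."""

    def __init__(self):
        self.users = {}
        self.next_id = 1

    def post_user(self, body):
        """Create a user from a JSON body and return (status, response body)."""
        user = json.loads(body)
        user["id"] = self.next_id
        self.users[self.next_id] = user
        self.next_id += 1
        return 201, json.dumps(user)

    def get_user(self, url):
        """Look up a user by the ID at the end of the URL."""
        user_id = int(url.rstrip("/").rsplit("/", 1)[-1])
        if user_id not in self.users:
            return 404, ""
        return 200, json.dumps(self.users[user_id])


api = FakeUserApi()
context = {}  # variables shared between tests

# Step 1: a generator supplies dummy data, which is templated into the request body.
names = username_generator(["bsmith", "ajones"])
context["username"] = next(names)
status, body = api.post_user(Template('{"username": "$username"}').substitute(context))
assert status == 201

# Step 2: an extractor pulls the new user's ID out of the response and stores it.
context["user_id"] = json.loads(body)["id"]

# Step 3: the next test reuses the stored variable in its URL, then compares the
# username in the response with the one that was submitted.
status, body = api.get_user(Template("/api/user/$user_id/").substitute(context))
matches = status == 200 and json.loads(body)["username"] == context["username"]
print(matches)  # True
```

Later steps in the life cycle (setting permissions, checking restrictions, deleting the account) would follow the same shape: template from the context, send, extract, compare.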

You set this up so that people can write their own generators and validators and extractors and just plug them into the system.

Yeah, it’s a very short snippet of Python needed to define any of them. And I actually have a little sample one. I mean, you could write one of these in five lines of code.

Yeah.

About that long.

Okay.

But I could use some other library as well to generate data that does it.

Random addresses or whatever.

Yeah.

And you can hook that in. Just create an implementation of the generator and then tell it to import the extension on the command line, and it will load it from the folder you specify. It’s got some smarter handling to deal with the way Python builds up the libraries.

So you’ve said that you’re not a QA person yourself, correct?

No.

Have you worked with QA teams that use this tool?

Yes.

I’ve answered a lot of different questions from people all over the world working with it in different ways.

Brazil, China, and India seem to have a lot of the users at the moment, as well as, of course, the US and Europe.

Everyone seems to have kind of a different approach. So like, some people are using it like you mentioned, as a smoke testing tool.

Other people are using it as a functional testing tool or a validation.

A validation somewhere between code deployment and, like, validating a test environment, or for health checking. I mean, there’s a bunch of different ways people have hooked it into their systems.

Yeah. Like I was even thinking of my own personal sites, like Test and Code and Python Testing. I mean, I have something that makes sure that it’s still alive, but I don’t have something in place to make sure that all the pieces are still working, and I could probably easily put that together with this, so that’d be cool.

I would be a little dishonest if I didn’t admit this has some weaknesses as well.

Okay, yeah, tell me those.

So I think one of the bigger ones is that Curl has been a consistent pain point getting it installed.

Partly that’s due to the fact that it’s a native library.

It brings a lot of low-level functionality that this is very dependent on, especially for benchmarking. But I think if I were doing this over again, I probably would have started out with a pluggable HTTP library, so that I could switch between a curl-based and a requests-based implementation, because of how painful working out all of the installation issues was.

I think I’ve got most of those solved now, but I would rather not have had to do it.

Are you using Curl to actually submit the request?

Yeah.

Conceptually it could be rewritten with requests or something, correct?

Maybe, yeah.

So there’s some kind of partial work in there that’s going to decouple it a little bit from that. One of the other mistakes I made early on in design was I wrote a very procedural way of executing the tests, which was very linear, very rigid. But as it became more complex, with the different options that people wanted to use, things like masking outputs, sometimes customizing headers, sometimes using templating, sometimes not, I had to add a lot more conditional logic. So I’m working on isolating those into smaller functions, which will also make it a lot easier to pull out the curl part and replace it with requests, or allow different ways of outputting the results.
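As a rough illustration of the pluggable HTTP library idea, here is a minimal sketch, not pyresttest's actual design. The names `Response`, `HttpBackend`, `UrllibBackend` and `run_test` are hypothetical, and the standard library's `urllib` stands in for a requests-based or PycURL-based backend so the sketch has no third-party dependencies.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol
import urllib.request


@dataclass
class Response:
    """Hypothetical minimal response type shared by all backends."""
    status: int
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)


class HttpBackend(Protocol):
    """Anything with this method can execute a test's HTTP call."""

    def send(self, method: str, url: str,
             headers: Dict[str, str], body: Optional[bytes]) -> Response:
        ...


class UrllibBackend:
    """Standard-library backend; a PycURL or requests backend would have the same shape.

    Note: urlopen raises HTTPError for 4xx/5xx responses, which a real
    backend would need to catch and convert into a Response.
    """

    def send(self, method, url, headers, body):
        request = urllib.request.Request(url, data=body, headers=headers, method=method)
        with urllib.request.urlopen(request) as resp:
            return Response(resp.status, resp.read(), dict(resp.headers.items()))


def run_test(backend: HttpBackend, method: str, url: str,
             expected_status: int = 200) -> bool:
    """The test runner only talks to the backend interface, never to a library directly."""
    response = backend.send(method, url, headers={}, body=None)
    return response.status == expected_status
```

With this shape, swapping the HTTP library means passing a different backend object to `run_test`, and the test logic itself does not change.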

Okay.

I will say that for people who are curious about requests versus PycURL, PycURL is very fast in comparison. It’s great for benchmarking because you get a lot of low-level control. You could do just about anything you might possibly want to do with HTTP requests and responses, but it has a lot of quirks to it.

I would strongly encourage people who are looking at doing networking to try requests first and then only go to PycURL if they need the performance or the low-level functionality.

So what was the reason that pyresttest uses curl?

Which of those or both?

It’s a combination of the low-level functionality and the fact that I wanted to do benchmarking, and libcurl makes it very easy to gather timing information, like time to connect, DNS lookup time, time to the first response from the server, processing time, and then how long it takes to transmit the data. Requests doesn’t really make that very straightforward.
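To make the timing information concrete, here is a small sketch of reading libcurl's timers through PycURL. It is an illustration, not pyresttest's own code: the function names `fetch_timings` and `split_phases` and the phase labels are assumptions. libcurl reports each timer as seconds elapsed since the start of the request, so per-phase durations come from subtracting neighbouring timers.

```python
from io import BytesIO


def split_phases(cumulative):
    """Turn libcurl's cumulative timers (seconds since start) into per-phase durations."""
    return {
        "dns_lookup": cumulative["namelookup"],
        "connect": cumulative["connect"] - cumulative["namelookup"],
        "tls_and_setup": cumulative["pretransfer"] - cumulative["connect"],
        "wait_for_first_byte": cumulative["starttransfer"] - cumulative["pretransfer"],
        "transfer": cumulative["total"] - cumulative["starttransfer"],
    }


def fetch_timings(url):
    """Perform one GET with PycURL and return libcurl's cumulative timers."""
    import pycurl  # imported lazily so split_phases works without PycURL installed

    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.WRITEDATA, BytesIO())  # collect the body in memory and ignore it
    try:
        curl.perform()
        return {
            "namelookup": curl.getinfo(pycurl.NAMELOOKUP_TIME),
            "connect": curl.getinfo(pycurl.CONNECT_TIME),
            "pretransfer": curl.getinfo(pycurl.PRETRANSFER_TIME),
            "starttransfer": curl.getinfo(pycurl.STARTTRANSFER_TIME),
            "total": curl.getinfo(pycurl.TOTAL_TIME),
        }
    finally:
        curl.close()
```

Calling `split_phases(fetch_timings(url))` for some URL would give one row of timing data per request, which is the kind of raw measurement a benchmark can then aggregate.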

And how are you using some of that data?

Yeah, so you can actually get an output of all of the different timing information from a benchmark.

So basically it supports pulling out any of the statistics you can get from Curl, everything from bytes sent to bytes received to each component of timing. It also supports aggregates across those benchmarks. So you can do, for example, the median, the average, the geometric average. There’s a couple of different ways of doing it.
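The aggregates mentioned here can be sketched with Python's standard `statistics` module. This is an illustration of the idea rather than pyresttest's implementation; the `aggregate` helper and the sample timings are made up.

```python
import statistics


def aggregate(samples):
    """Summarise one metric (for example total request time) across benchmark runs."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "geometric_mean": statistics.geometric_mean(samples),  # Python 3.8+
    }


# Hypothetical total times in seconds for five runs, one of them an outlier.
total_times = [0.10, 0.12, 0.11, 0.13, 0.90]
summary = aggregate(total_times)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With a single slow outlier in the data, the arithmetic mean is pulled up the most, the geometric mean less so, and the median hardly at all, which is one reason to offer more than one kind of average.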

Okay.

So you can basically get the statistics for every request or aggregated across, say, 100 runs.

Oh, does it build up on it then?

Pardon?

Do I run the entire suite a hundred times, or every time it runs, does it add to the metrics? I guess that’s what I’m asking.

I would guess that you specify that. I want to.

Yeah, it looks like benchmark runs. You specify how many times to collect, how many times to run through the tests.

You can also output directly to a file.

CSV files.

You can get either the raw data or the aggregates.

Okay.

Standard deviation, totals or sums.
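The difference between raw and aggregated CSV output can be sketched as follows. This is a hypothetical illustration using the standard `csv` module, not the file layout pyresttest actually writes; the metric names, the helper functions and the choice of sample standard deviation are assumptions.

```python
import csv
import io
import statistics


def write_raw(rows, out):
    """Raw output: one CSV row per benchmark run."""
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


def write_aggregates(rows, out):
    """Aggregate output: one CSV row per metric, summarised across all runs."""
    writer = csv.writer(out)
    writer.writerow(["metric", "mean", "std_deviation", "sum"])
    for metric in rows[0]:
        values = [row[metric] for row in rows]
        writer.writerow([metric, statistics.mean(values),
                         statistics.stdev(values), sum(values)])


# Three hypothetical benchmark runs of the same request.
runs = [
    {"total_time": 0.10, "size_download": 512},
    {"total_time": 0.14, "size_download": 512},
    {"total_time": 0.12, "size_download": 512},
]

raw_csv = io.StringIO()
write_raw(runs, raw_csv)

agg_csv = io.StringIO()
write_aggregates(runs, agg_csv)
```

Replacing the `io.StringIO` objects with files opened using `newline=""` would write the same data to disk.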

Yeah. I guess this sort of stuff is important, too. So you might have an application that functionally still works the same, but for some reason, one of the parts takes like five times longer than it used to. That’d be something you’d want to red flag and look at, I guess, right?

Yeah. I mean, it was helpful. Also, when I wanted to make changes to specific APIs, I would use it to test whether changing the way I did database lookups made the request significantly faster on average, in aggregate.

One thing I’d love to see is someone add a 99th percentile response time to this, because that’s something I hadn’t done and I really wish I had added it.

99th percentile, explain to me what that is.

Especially if you’re coupling together a series of micro services.

The total response time to some action is determined by each one of the component requests, each one of the smaller services. Right? Yeah. You’re basically adding those together.

Each hop back and forth adds to the total time.

Any one of those that goes extremely slowly once in a while will slow down the whole request, the whole processing pipeline.

So it’s not just important to have your average request time or processing time be fast, but also the slowest ones need to be still fairly fast.
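A quick worked example of why the slow calls matter more as services are chained. This is a simplified model that assumes each service call is slow independently with the same probability; the 1% rate and the function name are made up for illustration.

```python
def chance_of_slow_request(per_call_slow_rate: float, calls: int) -> float:
    """Probability that at least one of `calls` independent service calls is slow."""
    return 1.0 - (1.0 - per_call_slow_rate) ** calls


# If each service is slow 1% of the time, how often is the whole request slow?
for hops in (1, 5, 20):
    print(f"{hops:2d} calls: {chance_of_slow_request(0.01, hops):.1%} of requests hit a slow call")
```

Under these assumptions, a slowdown that affects one call in a hundred at a single service affects a much larger share of user-visible requests once twenty calls are involved.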

Oh, right.

Okay.

Especially for environments like, think about Amazon or Google, where you’re running at large scale.

You need to have predictable response times for your system. So it’s actually far more important to have your long tail, your 99th percentile, the longest fairly normal request time, still be fairly close to your average.

It doesn’t matter if, say, 500 requests in 1000 are extremely fast if the other 500 run very slowly, because that’s going to be user visible very quickly.
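To show what a 99th percentile adds over an average, here is a minimal sketch using the nearest-rank definition of a percentile. The `percentile` helper and the response times are hypothetical, and this is not a feature of pyresttest as described in the interview.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# 1000 hypothetical response times in seconds: 980 fast ones and 20 slow ones.
response_times = [0.05] * 980 + [3.0] * 20

mean_time = statistics.mean(response_times)
median_time = statistics.median(response_times)
p99_time = percentile(response_times, 99)

print(f"mean={mean_time:.3f}s median={median_time:.3f}s p99={p99_time:.3f}s")
```

In this made-up data set the mean and median both look healthy, while the 99th percentile exposes the three-second requests that some users will actually experience.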

Sometimes it only just takes one bad experience for somebody to leave. Right.

Exactly.

Like, think about pulling up email.

If your inbox loads very quickly 100 times and then one time in 100, it takes like five minutes to load, you’re going to be pretty cranky.

Yeah. Users are darn picky.

Yeah. They’re going to notice the one five-minute load. They’re not going to notice the other hundred that were very fast.

Yeah, definitely.

Okay, so this is one project. Do you have other projects that you’re working on other than this?

I guess I didn’t ask this. Is this something that you’re still developing at work, during work time, or is this a side project?

This one’s a side project for me.

I was doing a little bit on work time when I was at Red Hat, but my day job is primarily Jenkins related. So this is personal stuff.

I do have some other side projects that I’ve done. I don’t have any other ones that are very active at the moment that aren’t related to my work. I have done some little things, like benchmarking. There’s a benchmarking library I put together to test the different HTTP libraries. That’s how I know that PycURL is much faster than requests, for example.

I’ve got numbers for that.

A little bit of this, a little bit of that. Small stuff.

Cool.

I guess I got nothing else other than just I think it’s really awesome that you put this together and I’m glad that you came on the show to share it with us.

Well, thank you. I guess I should ask. This is sort of my first time doing an interview like this. Is this the kind of things that you wanted to know about it?

I’m new to this too.

So my background is mostly as a developer, but for a very long time I’ve been very involved with the part of developing that’s making sure your stuff works, and also being in development teams where there’s not a dedicated QA team.

I found there’s a lack of information out there. There is information out there on how to do test-driven development, but test-driven development is focused on unit tests, and all of that pipeline of TDD and XP and all that assumes that you can ignore the big picture because the QA team is going to take care of that.

That’s one of the reasons why I started this: to talk about a lot of things, but one of them is, what do you do when there’s no QA team? The development team has to spend their time testing a little differently.

You can’t spend half your time writing unit tests and half your time writing code and then there’s no other half of your time left over to do system level tests.

That’s a problem I’ve seen. And the other thing is I really wanted to try to find out more about how other people test stuff. Most of the time I’m making up my own stuff and how we test, so listening to other people is really important, finding out what people are using to test.

I’m kind of curious what you’re seeing as far as system level testing, because I see my own corner of the world, obviously, and I try to keep abreast of what’s going on, but there’s always new things coming out.

So I guess repeat the question.

So I guess I was just kind of curious what you’ve seen in terms of system level testing as separate from unit testing and maybe browser testing.

Most of my time is not spent on web applications, so I deal with environments where I don’t have to deal with a browser. But there’s a lot of complex systems that involve lots of teams and lots of development, and there’s got to be a reasonably quick way to make sure everything works from top to bottom and not just looking at one little corner of your world.

My work right now, my career, has been in electronic test instruments. Right now we use Python test tools to test instruments, and the interface is visible via Python. It’s a text-based interface and you can make things happen. You can send things and make settings, and then you have to check to see if the results are correct or the side effect is correct.

These sorts of checks are common across a lot of stuff. And one of the things I’ve seen now is that, as web interfaces have become more complex with microservices and multiple teams and multiple layers, it’s similar to the embedded world, where there isn’t one person that’s the expert of the entire system anymore. It’s a lot of people working together, and it’s a similar complex problem.

It’s always the boundaries that get you, isn’t it?

Yeah.

I am fighting an uphill battle for tests that are at a larger scale rather than unit tests. There’s nothing wrong with unit tests, but the really hard problems that keep me at work late at night or going in on the weekends are never something that’s isolated to one piece of functionality or one piece of the code.

It’s timing problems or thread priorities, or there’s weird system interactions that no unit test would ever find.

And setting up a full system environment is also very challenging. One of the things that I’ve kind of tried to cope with working on the Jenkins project is how to test in a fairly realistic scenario.

Does your server application start up? Will it start across different Linux distributions?

Do all of the things work?

Yeah, that’s a hard problem.

Fortunately, I usually have one piece of hardware that I have to deal with and not multiple environments, but then I also deal with different problems. Even a very simple round trip test takes quite a while, because I’ve got latencies with things like settling delays and just the timing interaction for the entire electronic system.

I’d like to check out some more of your podcast to see if there’s some interesting things in there that I might be able to apply because it sounds like you’re coming to a very challenging problem.

There’s a lot of great stuff. One of the reasons I picked Python is that testing, and how to do tests, is very active right now in Python. I don’t know if it is in other areas, but I like Python and it is here. The pytest community is pushing the boundaries, I believe, on how to do a very good test environment that’s as elegant as possible without being too magical.

And I don’t want to dismiss unittest, either. The unittest module within Python is often dismissed as being too complex, but it is still being developed and it’s a pretty powerful piece of software as well. So I wanted to bring a voice to some people that aren’t really heard about a lot. I like listening to podcasts and I like listening to technical stuff, but I’m not a web developer, so I wanted to create content that is applicable to web people and everybody else as well.

It’s very interesting. I think Python is a wonderful language to develop in because it’s got such a dynamic community.

There’s often more than one solution to your problem in terms of libraries and there’s a lot of growth and development with new approaches and new ideas. It makes it very exciting.

Well, you know what’s funny is I thought I was going to have, like, about a 20-minute conversation with you, and it looks like we’re pushing an hour or so.

We should probably wrap it up, but is there any particular call to action you’d like to say before we go?

Well, I just encourage people to grab a copy of pyresttest, give it a try. Let me know if it works for you. Let me know if it doesn’t. I’m always taking a look at new features, bug reports, really. The whole point is to do something useful for people, so if it’s not useful, I’d like to know what to do to make it better.

Awesome. Hey, I really enjoyed talking with you, and we should keep in touch, Sam.

Absolutely. Brian, it’s been a pleasure. Thank you for making some time to chat.

Alright, well, I guess I’ll wrap it up there and thanks a lot. Thanks.

So when this comes out, I’ll let you know and it may be a couple of weeks before I get to it, but anyway, I guess I better get back to work.

Yeah, me too. I appreciate it. Hopefully I’ll be able to trim down my overly verbose answers to something a little bit more useful.

Oh, no, I don’t have time for that. We’ll probably just ship it as is.

Fair enough.

All right, talk to you later. Bye.

Take care. Bye.