Let’s say you have a web application and you want to make some changes to improve it. You may want to A/B test it first to make sure you are really improving things. But really, what is A/B testing? That’s what we’ll find out on this episode with Leemay Nassery.


Transcript for episode 100 of the Test & Code Podcast

This transcript started as an auto-generated transcript.
PRs welcome if you want to help fix any errors.


00:00:00 Let’s say you have a web application and you want to make some changes to improve it. You may want to do some A/B testing first to make sure you are really improving things.

00:00:09 But really, what is A/B testing? That’s what we’ll be finding out about in this episode with Leemay Nassery.

00:00:18 Thank you, Oxylabs, for sponsoring this episode. Oxylabs is a top provider of innovative services including real-time crawlers, web scrapers, and residential and data center proxies. They are trusted by more than 500 companies. Find out what they can do for you at oxylabs.io/testandcode. Welcome to Test and Code, a podcast about software development, software testing and Python.

00:01:03 I’m excited to have Leemay Nassery on Test and Code today. She’s joined me and we’re going to talk about A/B testing. I can’t believe that I’ve gotten so many episodes in and we haven’t talked about A/B testing.

00:01:16 So welcome to the show.

00:01:18 Thank you. Hi. I’m excited to chat about A/B testing.

00:01:20 You have a really kind of an interesting background, I think. So tell me who you are.

00:01:27 Yeah, sure. Thanks for the intro.

00:01:29 So my background, I guess you could say it’s interesting. I don’t know if it’s that interesting, but I’ve spent like the last nine or so years at Comcast, essentially working on software development in one way or another, whether I was like an engineer, lead engineer, a manager, a director, or back to a manager.

00:01:47 So, yes, the joke is (personally, I don’t know if anyone finds it funny) that I’ve kind of made my own rotational program at Comcast, to an extent. Like, I’ve moved every two years or so to a different team.

00:01:58 But the majority of my experience has been, in one way or another, working on web services or platforms to help improve our content discovery platform. So that’s, like, how people get to watching TV shows or movies when they watch TV via Comcast cable. And then recently, and specific to this topic, the past three years my focus has been on all things A/B testing and how to measure different experiences to improve the product.

00:02:28 Okay, so A/B testing is really crucial to your role right now?

00:02:31 Yeah, definitely.

00:02:33 I guess so. In a nutshell, my team supported, or at least in one way or another supported, building algorithms to improve how we surface TV shows or movies. And then specific to that, the way we measure how we change an algorithm is via A/B testing, to make sure that we’ve improved that experience and not degraded it.

00:02:54 Cool.

00:02:57 Now, when I think of A/B testing, I think of people changing something on a website, the traditional thing, I mean.

00:03:05 I think that’s usually what I think too. It’s like an ecommerce website or something like that.

00:03:08 Yeah. So is the A/B testing you’re doing within, like, when I’m on my Comcast remote looking at this stuff on the TV, or is it through the website?

00:03:20 Yeah, it’s all of the above. I mean, I think at Comcast we’re trying to do it all of the above. We’re trying to not only A/B test the things that you see when you look at your TV, but also when you’re on the browser or on your mobile device, and also navigating the content, or even just, like, navigating how you use your broadband Internet. There’s so many... everything, in a nutshell. Most things should be A/B tested, especially when they’re, like, customer facing. And that’s kind of the goal of Comcast.

00:03:51 The majority of my focus has been on the TV experience, but I know in general there are other teams here focusing on the different platforms or products.

00:04:00 Okay, well, let’s back up a little bit. If nobody’s ever heard of A/B testing, what is it?

00:04:07 Yeah, that’s a great question. I think at a high level, or in its simplest form, one way to think about it is that it’s a way to experiment.

00:04:16 Another way to think about it is it’s simply comparing two different versions of something. So what do I mean by version? I mean, you have a control, which is like the original way your website, for example, looked, and then you have a variation or test, and that’s like the new way. And you want to make sure that you test those experiences against each other so that you’re not degrading the experience for the customer. Similar to how you test your code, like how you write unit tests and functional tests and regression tests and integration tests. Once all that’s done, once you’ve actually validated that your code works, I think the next natural thing, once it’s in production, is making sure that you’re testing it so that your users like it. Right?
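
To make the control-versus-variation idea concrete, here is a minimal Python sketch (the function names and the left/right button markup are made up for illustration, not how any particular product does it): each user is deterministically bucketed into a control or test group, and the page renders whichever version that bucket calls for.

```python
import random

VARIANTS = ["control", "test"]

def assign_variant(user_id: str) -> str:
    # Seed the RNG with the user id so the same user always sees the same variant.
    return random.Random(user_id).choice(VARIANTS)

def render_buy_button(user_id: str) -> str:
    # Control: the original button on the left; test: the new button on the right.
    if assign_variant(user_id) == "test":
        return '<button style="float: right">Buy</button>'
    return '<button style="float: left">Buy</button>'

if __name__ == "__main__":
    for uid in ["alice", "bob", "carol"]:
        print(uid, assign_variant(uid), render_buy_button(uid))
```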

00:04:57 Yeah, I’m actually thinking that it might be more like optimization stuff. So if you profile a piece of code, you have to measure it ahead of time to figure out how fast something is.

00:05:10 If you’re trying to speed it up, you make a change to see if it’s really faster or slower or something.

00:05:15 Right? Yeah, very similar. Exactly.

00:05:19 So it’s like A/B testing, but it’s really the current state versus a change.

00:05:27 Yeah.

00:05:28 Okay.

00:05:29 Yeah.

00:05:30 I don’t know if you’ve been hearing this term in your world, but everyone keeps saying they want to make more data-driven decisions.

00:05:38 And one way to do that is via A/B testing. I mean, when you work on a platform that, in one way or another, a user or a human interacts with, you want to make sure that the changes that you iterate, or how you iterate on your platform, are in fact good for them. And how do you do that? Do you do it based on feelings? Do you do it based on, like, I feel like moving this button from the left to the right is better for the user, or do you do it based on, I think it is, but let’s test it out?

00:06:06 Yeah.

00:06:09 I really love Macs, and I wish I could go back in time to where Windows and Mac kind of diverged and get them to decide whether or not the menu bar is attached to the window or on the top of the screen.

00:06:22 Right.

00:06:22 And just pick one.

00:06:25 Exactly. When you said Mac, the first use case that came to my head is the space bar. I’ve been struggling hardcore with my space bar.

00:06:32 I wish they somehow, and this is not really A/B testing, I wish they kind of just tested it, stress tested it. I don’t know. Do you have the new one? Not the new, but the MacBook Pro?

00:06:42 That’s not from four years ago. Do you have a problem with your space bar?

00:06:45 I don’t. What’s wrong with the space bar?

00:06:48 It doesn’t work correctly. I’ll do one space and then five spaces will happen.

00:06:54 Yeah. Anyways, that would be stress testing.

00:06:57 I feel the issue I’ve got is I end up not having my hands up high enough in the air, and my thumbs will hit the trackpad and move the cursor to some random place in the text, and then you’ll just start typing there.

00:07:12 Yes, I’ve seen that happen before, but I have not personally experienced that.

00:07:16 And I know there’s a setting to say ignore the trackpad if I’ve been typing recently. I don’t remember where that is, so I’m going to have to Google it again.

00:07:25 It’s hard to find settings.

00:07:27 Okay, so when you say you’re measuring to be data-driven, trying to make the user experience better.

00:07:34 Yeah.

00:07:35 You have to measure something then.

00:07:37 Yeah, definitely.

00:07:38 What do you measure?

00:07:39 That’s a great question. So there’s definitely data. So it’s kind of like first you need something to test and then you need to make sure you have a question in mind. So you want to make sure you have a hypothesis of some sort and then you want to validate that. So let’s say my hypothesis is like, well, this is the simplest use case. I’m going to move a button from the left to the right.

00:08:02 And I’m going to say that just moving that button is going to increase people buying stuff from my website. Well, what’s your metric?

00:08:10 How are you going to measure that that is indeed the case?

00:08:14 One way to do that is, like, well, for your test segment, so for the people that are getting this new experience, do you see an increase in transactions? Do you see more people buying stuff because of that change you made? Another way to measure it as well: are people taking longer to buy stuff? Like, is moving this button resulting in sessions that are longer than the control, longer than when the button was on the other side?

00:08:38 So it’s important to understand, like, a conversion rate or some sort of engagement metric to validate your hypothesis or what you’re trying to test, essentially, yeah.
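
As a rough illustration of the metric side (the numbers and field names here are hypothetical), a conversion rate is just the fraction of users in a segment who completed the action you care about, computed separately for control and test:

```python
# Hypothetical per-user records, one list per segment.
control_users = [{"user": "u1", "purchased": True},
                 {"user": "u2", "purchased": False},
                 {"user": "u3", "purchased": False}]
test_users    = [{"user": "u4", "purchased": True},
                 {"user": "u5", "purchased": True},
                 {"user": "u6", "purchased": False}]

def conversion_rate(users):
    # Fraction of users in the segment who completed the action.
    return sum(u["purchased"] for u in users) / len(users)

print("control:", conversion_rate(control_users))  # 0.33...
print("test:   ", conversion_rate(test_users))     # 0.66...
```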

00:08:50 So there might actually be different metrics that might not all be going in the same direction, right?

00:08:57 For sure. And I think that’s why it’s important to think somewhat holistically of what you’re testing. And I know this is something that we’ve done at Comcast where we’ll make one change to one specific part of our product. But we also want to make sure this one change isn’t like stealing traffic away from other ways of doing something.

00:09:18 Two things. A, you can have a lot. This is one pitfall of A/B testing you can get stuck in: like, all you do is figuring out metrics and data to validate your test. And then the other route is you could have too little data, and you could only be looking at one way of slicing your results, and that’s not the bigger picture.

00:09:38 Yeah.

00:09:40 I’d like to thank a new sponsor of Test And Code, Oxylabs. Oxylabs is the top provider of innovative web data gathering services such as real-time crawlers and web scrapers. They also have residential and data center proxies. Oxylabs is introducing their next generation residential proxies, which are a significantly improved data gathering solution. They provide a stable and fast proxy pool with more than 30 million global IP addresses, and they are resource efficient, with the proxy management, user agents and IP rotation all done on the Oxylabs side. Oxylabs has a deep understanding and knowledge of how to acquire web data, and they provide a dedicated account manager for every client. They are already trusted by more than 500 companies. Visit oxylabs.io/testandcode to find out more about their services and to apply for a free trial of their next generation residential proxies. That’s oxylabs.io/testandcode.

00:10:38 I can imagine somebody making a change that makes it so that people add things to their cart and purchase things faster. Right?

00:10:46 But at the same time, people are spending, oddly enough, less time on your site and buying fewer things. So in total, you’re making less money.

00:10:56 Right. So I think it’s really important to think about the like, and I’m an engineer, so I don’t want to sound like I’m a product person, but it is important to think about your business as a whole.

00:11:09 I think every change is important, especially when you’re running a business.

00:11:13 So you have to understand your product to an extent, is what I’m saying. It takes time. A/B testing is expensive. You have to really think. It’s not something for... I’m not very patient, but you need patience with A/B testing.

00:11:25 Why is it expensive? What do you mean by that?

00:11:28 It takes time. I think there’s two ways to think about it.

00:11:33 You could just build a feature, make a bunch of changes to your platform, ship it, and you could be like, well, I like it. It looks great. We all agree.

00:11:42 And that’s not good, because you could be degrading the experience. But on the other hand, you could build a feature, spend time on figuring out the proper A/B tests, and then run the A/B test. That means that your feature has not been released for another, let’s pretend, three months.

00:11:58 So I think it’s, like, time and resources is what I mean by expensive. Like, it takes time to run an A/B test. It takes time to look at your metrics, to build the right segments, to make sure that you’re really looking at the bigger picture when you’re making a change. And like, I think most humans, or at least myself, I’m not very patient and... sometimes I just want the change to go out.

00:12:20 So it is expensive in the sense of, like, resourcing and time.

00:12:24 Okay.

00:12:24 You do have to figure out what you want to do.

00:12:27 You brought up segments. What does that mean?

00:12:30 Yes, that’s a good point. So when we talk about A/B testing, I did mention earlier how you’re measuring a change. And how do you measure that change? You have your old experience, and then you have the new variation or test experience that you want to validate. So what you do is you have essentially two segments, or two sample groups, or two sets of users. And for one segment, you give them the control experience, and then for the other segment, you give them the new experience.

00:12:59 So it’s kind of just like they’re called segments. They can be called cohorts or audience segments, but it’s the sample of subscribers or samples of customers that you’re validating these changes against.
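
One common way to build those segments (a sketch of the general idea, not necessarily how any particular platform does it) is to hash the account id into a fixed number of buckets, so a given account always lands in the same group and you can control what fraction of accounts gets the new experience:

```python
import hashlib

def bucket(account_id: str, buckets: int = 100) -> int:
    # Stable bucket in the range 0-99, derived only from the account id.
    digest = hashlib.sha256(account_id.encode()).hexdigest()
    return int(digest, 16) % buckets

def segment(account_id: str, test_percent: int = 10) -> str:
    # The first `test_percent` buckets get the new experience; the rest stay on control.
    return "test" if bucket(account_id) < test_percent else "control"

print(segment("account-12345"))
```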

00:13:11 So do you do this with just, like, half the people, or do you pick people from the West Coast versus the East Coast?

00:13:18 That’s a great question, too.

00:13:20 Yeah. So I think it depends on your feature, like, what you’re trying to test, but you do have to be cognizant, or you have to be careful, in how big your segment is. You don’t want it to be too small such that it’s not statistically significant. And what I mean by that... you’ll hear people say that phrase all the time. If you want to pretend like you know stuff about A/B testing, just throw out the words statistically significant, conversion rate, and sample size, and everyone will be like, oh, this person knows what they’re talking about.

00:13:50 You want the number of customers or users that are going to get this experience, you want that to be big enough such that your conversion rate is valid. So you don’t want to just give it to five people; that’s not enough. Maybe those five people have totally different user behaviors and it’s not valid.

00:14:10 There’s bias in the data, or it’s just not enough to really measure whether this is a good feature or not.

00:14:17 It comes back around to, like, your conversion rate: how are you actually going to get enough data to figure out if this is valid?
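
To make “statistically significant” a little more concrete, here is a back-of-the-envelope two-proportion z-test in plain Python (the conversion counts and segment sizes are made up; a real analysis would also plan sample size and power up front):

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    # conv_*: number of conversions, n_*: number of users in each segment.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Five users per segment tells you nothing; a few thousand starts to.
z = two_proportion_z(conv_a=40, n_a=5000, conv_b=70, n_b=5000)
print(z)  # roughly 2.9; |z| > 1.96 is the usual 95% confidence threshold
```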

00:14:25 Somebody might just suddenly have like, a family reunion and they’re watching tons of movies.

00:14:29 Exactly. Yeah. And that brings up a great point. This is something that is really important at Comcast or any platform. There’s seasonality. Like, there’s not the family reunion, but there’s, like, holidays.

00:14:40 When you run your A/B test is very important. Should you run your A/B test during a holiday where everyone is home? Probably not. I mean, you could. But then also make sure to run it a little bit longer, past the holidays.

00:14:53 You avoid that seasonality or the change in user behavior because of the time of year.

00:14:59 Okay, so let’s say I have, like, 10% of my customers do this, and it looks like it’s good. So do you just roll it out at that point, or do you do a larger segment?

00:15:09 I think that’s also a very good question. It depends, at least from my experience, we’ve allocated more accounts or more subscribers or more users to validate that. We see similar patterns.

00:15:24 So it’s not just this, like, 10%. Let’s double it to 20% and validate, okay, at 20% the conversion rate that we saw, or the engagement that we saw, with the 10% is the same. But I play it safe, or we play it safe. You could probably, if the 10% is statistically significant, sure, go for it. I think it depends on the team itself. Another reason to play it safe, though, too, is if, let’s say, you’re introducing a new web service or a new feature into your engineering ecosystem, it could be a good way to validate that you can handle the traffic, you can handle the load that you’re then introducing to this new feature.

00:16:01 So I think it’s best to, like, once you’ve tested against the 10%, I don’t think it hurts to then do, like, an incremental rollout to 20%, validate that the metrics are the same, and that your platform can handle that.
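
A staged rollout like that often boils down to a small piece of configuration driving the same bucketing logic sketched earlier (the stage names and percentages below are illustrative, not anyone’s real schedule):

```python
# Illustrative rollout schedule: widen the test segment in stages,
# re-checking metrics and load before moving to the next stage.
ROLLOUT_STAGES = [
    {"name": "pilot",  "test_percent": 10},
    {"name": "expand", "test_percent": 20},
    {"name": "full",   "test_percent": 100},
]

def test_percent_for(stage_name: str) -> int:
    for stage in ROLLOUT_STAGES:
        if stage["name"] == stage_name:
            return stage["test_percent"]
    raise ValueError(f"unknown rollout stage: {stage_name}")

# The segment() sketch above would take this percentage as its input.
print(test_percent_for("expand"))  # 20
```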

00:16:14 And then do you also do real-time or in-production monitoring, so that once you roll out a feature completely, you still have the opportunity to roll it back if things crash?

00:16:31 Yeah, that’s a really great question, too.

00:16:34 I personally have not built a platform that can do that, but it’s kind of like the dream, to have a platform where you’re consistently monitoring how things are going and then you have triggers, or you have, like, circuit breakers or something. You see something that’s not as it should be, then you can quickly roll back. And I think there are tools or there are frameworks in the A/B testing world that do that for you, but the one we built at Comcast previously was in house. We didn’t build that feature yet, but I think that makes sense. It’s similar to how, like, you’d want the same thing when you just deploy code in general: you’d want to easily roll back if you see something that’s not what you expect, right?

00:17:18 Yeah, I know it’s different models for different applications, but it’s sort of similar idea. So the decision on if things are going well and then whether to roll out or increase or whatever, is that done by people now or do you have a system taking care of that?

00:17:41 From my experience, it’s done by people. I know that there are platforms that can do that programmatically, but at least from my experience, at least at Comcast, there’s usually been, like, a review process, which is kind of old school. But there are a lot of stakeholders when it comes to A/B tests here, and there’s a lot of, like, checks and balances. Everyone wants to make sure that this test is valid and that the metrics are valid and that we’re not forgetting about one specific part of the platform that we’re also measuring. Again, that whole construct of holistically looking at what your change is doing to the product.

00:18:17 So in my experience, it’s been more manual. But I know that there are platforms out there that, once things look good, it goes.

00:18:25 Do you do multiple experiments at once?

00:18:29 That’s a good question.

00:18:32 Multiple experiments at once? I think it depends.

00:18:37 Technically, you can run as many experiments on your platform as it allows you to run, as long as they’re not on the same segment of customers or on the same page.

00:18:49 But I think the one thing to be careful with is, if one experiment affects the outcomes of the other, you might get results that are kind of useless. And what I mean by that is, let’s say you have a page and you’ve got two A/B tests that you want to run on that one page. If you make two changes to that page and you happen to have accounts, or you happen to have customers, that are in both those segments, how do you know which experience increased or improved your conversion rate?
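
One simple way to avoid that overlap (a sketch of one approach; real experimentation platforms often use “layers” for this) is to make sure a given account is only ever enrolled in one of the experiments running on that page, and then split control and test within each experiment:

```python
import hashlib

# Hypothetical experiment names for two changes on the same page.
PAGE_EXPERIMENTS = ["new_hero_row", "bigger_artwork"]

def _hash(value: str) -> int:
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)

def experiment_for(account_id: str) -> str:
    # Each account belongs to exactly one of the page's experiments.
    return PAGE_EXPERIMENTS[_hash(account_id) % len(PAGE_EXPERIMENTS)]

def variant_for(account_id: str, experiment: str) -> str:
    # A second, independent hash decides control vs. test within that experiment.
    return "test" if _hash(experiment + ":" + account_id) % 2 else "control"

exp = experiment_for("account-12345")
print(exp, variant_for("account-12345", exp))
```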

00:19:19 Right. There’s probably some data people that can figure that out.

00:19:23 I agree. Yeah.

00:19:25 That’s one thing I wish I did. I wish I paid more attention in college to statistics. I feel like it’s used so often, and I undervalued that class.

00:19:34 Yeah. I mean, calculus was required, but statistics wasn’t. What gives? I’ve never used calculus.

00:19:40 Right. And I remember taking the statistics class, like my senior year, and I was like, already checked out. I was like, this is something I should’ve taken, like freshman year when I was fresh and paying attention to everything.

00:19:51 Statistics was horrible for me. It was 4:00 in the afternoon, fourth floor of a 100-year-old building.

00:19:57 That sounds terrible. The 4:00 part sounds terrible.

00:20:00 And I think it was a 100-year-old teacher also.

00:20:03 Oh, no.

00:20:03 And I got like two days into the class, and I’m like, there’s no way I’m not sleeping through this. So I dropped it.

00:20:10 Yeah, that’s like the psychology class. I dropped psychology because it was one of those, like, extra. What is it called? The electives.

00:20:19 But that would have been good, too. Like some basic psychology. I wish I would have taken that.

00:20:23 I think there’s a lot like that that I wish I focused on in College, even just, like normal history. I don’t know much about history.

00:20:30 I took history in school, like in high school, but it was kind of one of those things, like, I just need to pass it so that I can go enjoy math classes or whatever. Yeah.

00:20:42 Anyways, I wish I just knew more now. I feel like the older I get, the more I want to know. I valued learning less when I was younger. Now I just want to learn everything.

00:20:52 Yeah.

00:20:54 People blowing off school now, high school kids, and, like, what do you mean? This is the last time you’re ever going to get a free education, right?

00:21:01 That’s a great point. But it’s, like, so undervalued. Did you think of that when you were in high school?

00:21:05 No, I thought of it as a game.

00:21:07 Exactly. I just need to get an A and then go to the next thing.

00:21:11 Yeah.

00:21:12 If I can learn nothing and still get an A, I would do that.

00:21:15 Exactly. That was me.

00:21:18 So anyway, so the whole statistical aspect.

00:21:21 Yeah.

00:21:23 I guess you just have to be careful. I think you can run as many A/B tests as you want to run if they’re all on different pages. Sure. Like, you’re not going to impact the results if you’re testing in different parts of your product.

00:21:34 Well, let’s say I’ve got, like, a startup or small business and it’s just, like, me or something. Right. I know that there are tools for just, like, web apps and websites where I can do A/B testing, but it’s just going to report back to me, and then I have to decide to go and make the change whenever I decide to.

00:21:51 Yeah.

00:21:53 And I’m sure I’ll get a whole bunch of people to contact me and tell me all the other ways to do it.

00:21:58 No, that’s what I’m curious about.

00:22:01 Yeah. Now I hear you.

00:22:05 I was curious. You wrote down in some of our notes that you might have some case studies from Comcast that you could share.

00:22:12 Yeah, definitely. I have one that I thought was really interesting.

00:22:16 So we did an A/B test where we introduced this more personalized experience into our video platform. Meaning, like, it was just a more personalized page, you could say. I’m trying to talk like, if you don’t know the product, I’m just trying to make it sound simpler.

00:22:33 And this page just returned, like, the TV shows that you would like and the movies that you would like. So it’s geared towards you. And we ran this A/B test, and the most interesting aspect of running this A/B test was when we were analyzing the data. So when we were initially analyzing the data, we were looking at our segments, we had our test and our control segments, and we saw an increase in engagement. So we saw that more people were watching TV, for example, with the new personalized experience. That’s what you expect. Right. But it was, like, about 4%, and we didn’t feel like that was a lot.

00:23:10 It’s a whole page just for you, literally. It’s based on what we think you’ll like, based on what you previously watched on our platform. It seems like this should just be, like, a home run. Right. So I remember thinking, this is surprising that it’s only a 4 or 5% increase. And then while thinking about this, what we did is we decided to take our segments and then break up those segments into different cohorts. So what I mean by that is, we looked at the data and we were like, well, let’s see if we can find commonality within our segments and then look at our conversion rate. So we found customers that watched more TV on demand, and we did analysis on them. And then we saw customers that watched more, like, for example, TV on live, like linear. We call it linear TV. So TV that’s not available for you ahead of time; it’s on live right now. And what we found is, if we broke that segment down into different traits, that the conversion rate was different. So for people who watch more on demand, and this new page was available from the on demand menu, we found that those customers saw, like, a 21% increase in their conversion rate.

00:24:25 That was really interesting. And this is probably something, again, that researchers or statisticians would figure out to do from the start of the A/B test. But it was interesting that we saw, like, a smaller conversion rate initially, and then once we analyzed traits about those customers and did the analysis again, specific types of users had a higher increase in engagement, if that makes sense.
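
That kind of cohort breakdown is easy to sketch with pandas (the accounts, traits, and numbers below are made up; they just mirror the shape of the analysis described, overall lift versus lift per trait):

```python
import pandas as pd

# Hypothetical per-account results from the test segment.
df = pd.DataFrame({
    "account":   ["a1", "a2", "a3", "a4", "a5", "a6"],
    "trait":     ["on_demand", "on_demand", "linear",
                  "linear", "on_demand", "linear"],
    "converted": [1, 1, 0, 0, 1, 1],
})

# Overall conversion rate for the segment vs. conversion rate per trait.
print(df["converted"].mean())
print(df.groupby("trait")["converted"].mean())
```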

00:24:54 Yeah. There’s some people that are just never going to use on demand.

00:24:57 Right.

00:24:59 If you include that in your account, in your segments, that kind of like decreased our conversion rate.

00:25:06 Interesting.

00:25:10 Comcast.

00:25:12 That’s a big sort of market. There’s all sorts of people that interact with TV.

00:25:17 Right.

00:25:18 Both in new ways and old ways. Whereas if you were a Netflix, there’s no live TV on Netflix.

00:25:25 Yeah. It’s just like an on demand aspect. Right, exactly. I was wondering, how long is it going to take for the word Netflix to be mentioned?

00:25:34 We did it. I don’t know how long we’ve been in this conversation, but you brought it in.

00:25:37 Sorry. Is that a good thing or a bad thing?

00:25:40 No, it’s a joke.

00:25:44 I have a funny Netflix story.

00:25:47 Oh, yeah. Save that for later.

00:25:48 Yeah.

00:25:50 They do A/B testing and they do it well. So that’s pretty cool.

00:25:53 Anyway, I had another question. I’d like to hear more stories, but I was curious, in the Comcast case, how long are these experiments? Do you run them for two weeks, a month?

00:26:09 That’s a great question. I think it depends on the use case that we’re testing. For this test where we were introducing the new, highly personalized experience, it was a few months. Like, it did take a while, but that was for various reasons. I mean, we started it kind of during the holidays, so we wanted to see results past the holidays, the whole seasonality aspect. And then also we did what we kind of spoke about earlier: it was available for X number of accounts or X number of subscribers, and we did something similar where we increased it, we doubled it, and then we wanted to ensure that, doubling it, we still saw the same usage patterns. So that one did take a while. But it also was one of our first real big A/B tests. And similar to what I said earlier, we also introduced a new web service, or new infrastructure, a new component, into our ecosystem. So we also wanted to validate that that could handle traffic as expected.

00:27:08 Oh, yeah. Okay. Any other cool case studies?

00:27:13 That one I think was our coolest. We did run a case study. We did run an A/B test where we were testing just one row.

00:27:22 And what I thought was super interesting when we did that analysis was to see which placement in the row matters a lot. So let’s say you have a row of content, meaning you have ten different TV shows. Like, you have ten different TV shows in one row.

00:27:37 And when we were looking at the data and this is kind of where there’s so much data to look at, you can be in analysis paralysis or whatever we would call it.

00:27:47 But I remember the interesting aspect of doing an A/B test with just one row was, it was interesting to analyze the data and see, should we put the best TV show for you in the second spot, or should we put it in the spot right underneath the first? Meaning, is our customer more likely to go navigate with their remote to the left or down? To the right or down? So it was just interesting how you could analyze the data and figure out the best placement for content.
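
For the row-placement question, the analysis might look something like this (hypothetical click logs and an assumed fixed number of impressions per slot; the point is just grouping clicks by the tile's position in the row):

```python
from collections import Counter

# Hypothetical click events: the slot in the row that was clicked.
clicks = [1, 1, 2, 1, 3, 2, 1, 6, 2, 1, 1, 2]
impressions_per_slot = 100  # assume every slot was shown 100 times

ctr_by_slot = {slot: count / impressions_per_slot
               for slot, count in sorted(Counter(clicks).items())}
print(ctr_by_slot)  # {1: 0.06, 2: 0.04, 3: 0.01, 6: 0.01}
```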

00:28:21 But then again, I think that’s the best part about A/B testing: you can make more data-driven decisions, which I think is kind of the goal in life, making informed decisions on what you’re doing.

00:28:34 Now, speaking of analysis paralysis, does every change go through A/B testing, or do some changes just go, or whatever?

00:28:46 That’s a really good question. I mean, yeah, definitely the latter of that statement.

00:28:50 Some changes just make sense. You know what I mean? There are also ways to make changes without doing A/B testing. I guess I could have talked about that earlier. But you can make a change, but you can make an informed change. Like, you could do qualitative analysis. I know we have UX researchers that will get a group of human beings and study how they interact with the platform, and then make decisions on how to change the product based on qualitative analysis. But you could also, like, with algorithms, if you don’t want to A/B test, you could do what we call offline evaluations. And that’s just evaluations using some specific metric, that’s specific to your platform or your algorithm, to measure: is this algorithmic change going to still return good results? So there are ways to evaluate changes without A/B testing. And there is the whole, again, because it is expensive... so if it’s blatantly clear that there is a specific part of your site that sucks, just change it, and then you can measure the impact of that change. You can take data before you made that change, take data after you made that change, and kind of gauge: was this a good change? Do you know what I mean?
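
A before-and-after comparison like that is the simplest possible check. A minimal sketch (with made-up daily numbers) is just comparing the same metric over the two windows, keeping in mind that seasonality and other concurrent changes can confound it in a way a real A/B test would not:

```python
# Daily conversion counts (hypothetical) for the week before and the
# week after a change that shipped without an A/B test.
before = [120, 118, 131, 125, 119, 140, 138]
after  = [129, 133, 127, 141, 136, 150, 147]

def mean(xs):
    return sum(xs) / len(xs)

lift = (mean(after) - mean(before)) / mean(before)
print(f"observed lift: {lift:.1%}")  # a gauge, not a causal claim
```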

00:30:03 Yeah, it’s a content platform.

Is A/B testing ever coming in with things like what content to have and highlight and what descriptions of the shows and stuff like that?

00:30:17 Yeah, that’s like a big aspect of it. That’s not something that I’ve personally had experience with, but I know that there are companies and teams and platforms that are doing that. And I had a recent conversation with someone about this: you could have the best algorithm, but if you’re not surfacing it in the correct way, it doesn’t matter. Meaning, like, the words and how you represent the data that the customer sees is very important.

00:30:42 You could say “for you,” or you could say... I can’t think of another way to say “for you.” But the context in which an algorithm is used should be A/B tested. Like, you should figure out what’s the best way to represent this TV show, or what’s the best way to represent the metadata around an asset, or the metadata around a movie or TV show that the customer sees, that’s more compelling or more likely for the customer to click on it or to have some conversion rate.

00:31:11 Yeah, I’m just skimming through and you’ve got, like, there’s a picture of the show, right?

00:31:17 There’s a lot. I mean, I know that’s something that I’ve seen other streaming platforms do. They’ll A/B test just the images that are surfaced to them.

00:31:26 Interesting.

00:31:30 There’s a rabbit hole there.

00:31:32 Yeah, totally. I mean, I haven’t done that, but that definitely seems like you could take different images of the TV show. And if there’s a darker one, if this person is more likely to watch, like, crime or scary TV shows.

00:31:47 Okay.

00:31:49 Is this fun for you?

00:31:52 Is what fun? Talking about A/B testing or doing A/B testing?

00:31:55 Well, just in the role of being involved with the A/B testing part of Comcast.

00:31:58 Yeah, it’s super fun. Definitely. I recently switched into our broadband part of the world, and one of my goals is to figure out how we can run more A/B tests, simply with the apps that help customers navigate their WiFi or their Internet experience. I think A/B testing is super fun.

00:32:19 I think the act of having a question and then seeing if that question or having a thought and then validating that thought with data is a fun experience.

00:32:30 Yeah. I kind of wish more of our life was like that. Make two decisions.

00:32:35 And then you could, like, undo one decision.

00:32:39 That would be super cool.

00:32:41 We just moved into a new building at work, and so if they could A/B test the width of the steps on that.

00:32:50 Yeah, that’s a great one.

00:32:52 That’d be cool. Except for the damage.

00:32:53 It’s like a safe thing to do.

00:32:56 Well, yeah. It might be a little expensive to just try it out and then rip out the stairs if it doesn’t work.

00:33:00 Right. Yeah. See, that’s what I mean. A/B testing is expensive.

00:33:04 Yeah. Well, you have to implement the feature even if you end up not keeping it.

00:33:08 Right. And that’s a big part of it, is, like, you can’t get attached to features. Like, you could easily throw away the feature, regardless of the staircase example we’re talking about.

00:33:19 There could be an A/B test that results in a bad conversion rate. You have to be okay with throwing away that code. As long as you learn something from it, I guess it’s okay.

00:33:28 Yeah. So how often do changes go into the Comcast platform?

00:33:33 That’s a great question.

00:33:34 Are they going in daily?

00:33:36 Yeah, I think it depends on the team. I mean, I know there are teams that do somewhat daily releases. I know there are teams that do monthly releases. I know there are teams that do quarterly releases. It kind of just depends on what team you’re on. The team I’m on now that’s specific to broadband or your WiFi network. We do a release every Thursday. Okay. So it kind of just depends. But then the team that I was on previously, the personalization team, we kind of did a release whenever we needed it. I mean, I remember there was one week where we literally did a release every day in preparation for this feature that we were rolling out.

00:34:09 So it depends.

00:34:10 Okay. And these are full releases to customers and everything.

00:34:14 Yeah.

00:34:15 Cool.

00:34:16 Yeah. It’s a different world.

00:34:18 Yeah. I’ve only known this world, so I don’t know how it is on the other side.

00:34:22 Well, I mean, we’ve got to ship software that goes to instruments, and the customer has to download the file and load it on their instrument.

00:34:34 That’s interesting.

00:34:36 They would kill us if we released every day. And I guess, can you force an update or no?

00:34:44 They’re not connected to the internet.

00:34:45 Oh, wow.

00:34:48 So your changes, when they go out... they could potentially have code on their devices, or I guess instruments, from, like, five years ago.

00:34:57 Yes.

00:34:58 And then also there’s no way to do the other world thing, where you can do metrics on what people use and stuff.

00:35:06 We don’t have that.

00:35:07 That’s really interesting. I’ve never thought about that side of the software world.

00:35:13 Like your car... I don’t know. Your car probably now does report back to... Yeah, I think so.

00:35:19 It’s like diagnostics or something like that, right? Isn’t that something that, like, Microsoft or something always asks you: can we send data to track how things are doing, or something like that, for diagnostics, right?

00:35:32 But my car has never done that for me.

00:35:35 I don’t own a car but I’m sure that they do that at this point.

00:35:37 Maybe Teslas? Teslas... well, they probably do. Yeah, if Tesla blew up, all the cars would just stop or something.

00:35:46 Yeah.

00:35:47 Well, cool. Well, I know so much more about A/B testing now. So cool. Thank you for coming on the show.

00:35:53 Thank you. Have fun.

00:35:57 Thank you to Leemay for teaching me so much about A/B testing. Thank you to Patreon supporters for continuing to support the show. Join them by going to testandcode.com/support.

00:36:07 Thank you to Oxylabs for sponsoring this episode. Find out more about them at oxylabs.io/testandcode. That link is also in the show notes at testandcode.com/100. That’s all for now. Go out and test, or maybe A/B test something.