Lots of Python projects are starting to use GitHub Actions for Continous Integration & Deployment (CI/CD), as well as other workflows.

Tania Allard, a Senior Cloud Developer Advocate at Microsoft, joins the show to answer some of my questions regarding setting up a Python project to use Actions.


Transcript for episode 123 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.


00:00:00 Lots of Python projects are starting to use GitHub Actions for continuous integration and deployment, as well as other workflows. Tanya Lard, a senior cloud developer advocate at Microsoft, joins the show to answer some of my questions regarding setting up a Python project. You use Actions. This episode of Test and Code is brought to you by listeners like you that support the show through Patreon and by Datadog. Listen to their segment in the show and visit Test And Code. Comdayddog to get started.

00:00:42 Welcome to Testing Code, because software engineering should include more testing on Test And Code today. I’m thrilled to have Tanya Alard, welcome to the show. And before we get started, too much, could you tell the people in my audience who you are?

00:01:03 Yes. First of all, thank you for having me in your podcast. And I’m Tanya Alert. Like Brian has introduced me, I’m a senior developer advocate at Microsoft. I specialize in all things scientific computing, machine learning and open source. And I love Python. I love tinkering with Python, and I love automation. And that’s basically also one of the reasons why I’ve been very interested in sporting the community that are wanting to use GitHub Actions or struggling with for mainly the scientific computing community, which is like my side of Python, I mostly interact with.

00:01:40 Oh, really? Is it different sort of problems that scientific computing has then, just like, say, your web developer person?

00:01:47 Yeah, well, mostly now the scientific folk. Well, the scientific computing folks encompass people that are doing any kind of modeling, simulation, using the scientific Python ecosystem, machine learning, data science.

00:02:01 So because of the kind of work that we work on workflows and highly dependent workflows and data, some things operate a bit different, libraries are different, and people tend to use different packages.

00:02:17 I know a lot of open source packages are migrating towards GitHub Actions. What are GitHub Actions anyway?

00:02:23 It can get also very confusing because there is GitHub Actions that is the actual product of GitHub. And it’s basically a set of tools that allows you to do continuous integration and continuous delivery and also a great load of workflow automation. So you can start doing some sort of automated labeling check in your PR. You can integrate with some web hooks, for example, to get notifications through things like Slack, or send messages through Twilio or stuff of the such. So it’s very, very versatile. And then within the way we work with Actions is to define a workflow, and it can be whatever you want to do. For example, create Wills and publish your package in IPA and do your set of tests, for example, build your documentation and publish it wherever that you’re publishing it, but also in the sense of ordinary, authentic computing machine learning. And it opens up a lot of possibilities as well. And then there’s also a thing called Actions, which are like individually composable steps or units of work. Let’s say that you say you’re working with Docker containers, you can have an action that will log you into Docker, build your image, and push it directly within your main workflow. I don’t know if I made you feel more confused though.

00:03:50 Are you having trouble visualizing bottlenecks and latency in your apps and not sure where the issue is coming from or how to solve it? With datadogs in monitoring platform, you can use their customizable built in dashboard to collect metrics and visualize a performance in real time. Datadog automatically correlates logs and traces at the level of individual request, allowing you to quickly troubleshoot your Python applications. Plus, their service map automatically plots the flows of requests across your app architecture so you can understand, dependencies and proactively. Monitor the performance of your apps.

00:04:26 Start tracking the performance of your apps, sign up for a free trial and install the agent and data dog will send you a free T shirt to get started. Visit Test And Code. Comdatog the problem space is fairly broad. We’re just doing work based on work being done on a server, based on some sort of trigger, whether it’s a commit or merge request or something else.

00:04:57 And the interaction with GitHub Actions is a little different than I’m used to. Within GitHub Actions, you don’t really have to interact with the interface that much. You mostly interact through workflow files. Are there other differences between other competing CI systems and actions that you know of?

00:05:18 If you’re a maintainer for an open source project and you wanted to cover, for example, the three main platforms Windows, Mac OS and Linux, then you would have needed to have a combination of Travis, a Bayer Recycle CI to cover that. So GitHub Actions give you the ability to have just one single workflow and be able to test your code or package your code for those three platforms, which for me it makes it much easier because I didn’t have to keep track of different tools as well. And also the kind of triggers that you can have for your workflows is very diverse. You have the common ones that you’ve already mentioned, like pull requests or commits, but you can also have because it’s directly integrated with GitHub as part of GitHub. Then whenever you create a tag, then you can also use that. You can have scheduled actions. Let’s say that you want an action to run every Monday at 09:00 in the morning, because I don’t know that’s when you build your application and deploy, then you can do that as well. And that gives you a lot of flexibility. So if you need to do, for example, build Docker images every so often, those Chrome actions are really good. You can create web hooks, which I don’t think you can do with things like Travis, in the sense that if you need notifications or a report on your workflows or your CI or CD, you can get them sent to your email or Slack and you can also start doing interesting things like chat apps through GitHub Actions, which you can do with any other tools either, because it doesn’t have direct integration with GitHub.

00:07:06 Okay. Having it packaged with GitHub is very convenient.

00:07:11 Yeah, definitely.

00:07:12 There’s no doubt about it. It’s very convenient. It wasn’t obvious to me how to get it started. You have a GitHub directory with a workflow directory and then you drop some YAML file in there with some syntax and that gets run. And I don’t remember if you have to turn it on somewhere. Or do you have a magic checklist for how to get started?

00:07:33 No. So basically it is that kind of magic that you create your GitHub file, your workflow directory, and then you don’t do demo file and that creates a workload, you don’t have to initialize it or click button. So that actually enabled.

00:07:52 Okay. It just runs. If you have that there, GitHub will do it.

00:07:55 Yeah, exactly. Github will directly identify that you already have workflows and run them whenever your next trigger occurs. One of the nice things that now Actions have and they didn’t used to have in the past is that if you’ve never used Actions, you don’t have Actions in your repository. You can go to the Actions tab and it will try and identify the program language that you’re using. So in this case, if it’s Python, it will suggest it will add a very basic main Jamal or workflow. Jml file. So that will give you the starting point and it will start with some things like install a specific version of Python and then try probably look for your test folder or do your test directory and then you can build up from there. So that’s also a nice add on and there are lots of starting configurations now I think there is for node. Js. I don’t use one for now. Therefore Docker soft Python.

00:08:53 I don’t know. Like Ross, I think it’s very diverse.

00:08:57 So I’ve got a directory that I’ve got so far, one file in it. A couple of questions around that. Can I have more than one workflow then per project?

00:09:06 Yeah, definitely. I actually like separating my workflows depending on what you’re doing.

00:09:12 So if I have one, for example that builds a Docker image, I have a file called Docker Jamal. If there is one that is only for testing, I have a testing Jamal then one for build and publishing. So you can have as many workflows as you have. Because I also find that if you have everything in one main JAMA file, it can end up being super long. Even if you’re using things like talks or knocks. It can get quite long if you’re doing a lot of different things.

00:09:44 Yeah. So also I’ve got a couple of small projects, but somebody did a full request to add a pre commit workflow as a separate workflow to test things like linting and stuff. It’s pretty cool to have that separate because it is nice to be able to look at the dashboard and say, okay, well, all my tests and stuff are in one workflow, and that was fine, but the Lending stuff failed. So it’s a different panic level for me of like, oh, my system doesn’t work anymore versus it’s not formatted, right? Or something like that.

00:10:24 Definitely, yeah, definitely.

00:10:26 Those actions get run on the merge request. I don’t know if that’s automatic or if I set that up or not, but does it run all the workflows and all the workflows have to pass before the merge? Or is the merging and workflow not associated with each other at all?

00:10:39 I normally just wait until all of the workflows have passed to be able to marry, and that also depends on your trigger, because if you have your trigger request, that’s going to trigger your workflows. But if you don’t have it, it doesn’t. But you can also set it so that it’s on merge rather than when the full request is open. So you have those two variants, but I just normally wait for all workflows, and you can also have fast fails. For example, if you’re testing against different versions of Python, let’s say they have 363-7380r as many versions as you want, and then if fails for one, then you can have a fail pass it, and it doesn’t continue the workflows for the other different versions as well.

00:11:25 Oh, that’s great, because there’s not much point. If your policy is that everything has to pass, then there’s not really much point in testing everything else, and there’s a good chance it’s just going to fail on everything. Possibly, yes, there’s reasons why we test on multiple ones because we’re testing compatibility issues, but if there’s an actual, like, something broken in the code, it’s probably going to fail everywhere. Interesting. Have you done some of these sessions with helping people migrate?

00:11:53 Yeah, so I’ve been working with well, actually, because of the tweet that I put up there, a few people have reached out. I’m working on those projects. I’ve migrated a lot of my own personal projects and projects that I regularly contribute to or maintain.

00:12:10 I’ve done this in the past as well. When people were also migrating to a short pipeline. I was helping folks, for example, in Jupiter World in Paper Mill, and I always find this very interesting.

00:12:23 It’s something that I quite enjoy. I don’t know if I’m a weird person, but I really enjoy this kind of problems.

00:12:32 So what kind of problems are different projects having the same problem, or are they all unique?

00:12:36 Some problems have shared problems. For example, there is not a very straightforward way to install and start an environment just because GitHub Actions doesn’t provide native and condo support. So there are a couple of actions that folks have created out in the community and are available in the marketplace. But I still have to find one that does things right every time or the way that I would expect it. So sometimes there are issues fiddling with empowerment versions.

00:13:12 And also there have been some problems with especially machine learning or scientific computing. Sometimes you need to compile things on the fly or you have some sort of memory limits. So those also have been interesting things to overcome. And I actually have been working with some people from Imperial College London and they’re doing something really, really interesting. They’re doing all of their benchmarking for their software or for their code using GitHub, Actions and virtual machines. So we’ve had to find a way to spin off certain virtual machines to do the benchmarking, extract the reports, and then decommission the VMs. That was actually very interesting. That was not straightforward at all. Like finding a good way to do this complex workflow was quite challenging.

00:14:09 Yeah, that sounds challenging.

00:14:11 You did mention that you have used that you’ve helped people do Azure Pipelines in the past. Also, are people migrating from Azure Pipelines to give actions or people staying pretty much leaving those alone?

00:14:24 I’ve seen a few people that are still running on a shared pipelines, but also I think a lot of folks because of the convenience of having jury code and Juris CinCin within GitHub altogether, starting to migrate. And I think also a sure pipelines because it is integrated within a short DevOps. There is a lot of stuff. So if you just want to do the CI or testing or the CD part of things, it was very confusing. It was quite overloaded. If you wanted to do just that bit.

00:14:59 I forgot a question that I had right away and I think I know the answer. But the name of your YAML file, does it matter?

00:15:05 No, not really. I think a lot of tutorials or a lot of the documentation that we’re going to find out there is going to suggest you using main Java. But as I mentioned before, you can have different files and you can name them as pretty much as you wish. Yeah. Depending on the task or the workload they’re doing.

00:15:24 Okay. And then once you have if you’ve got like say two or three different files with different purposes and workflows, they can have different rules of when they run.

00:15:36 But if they all have the same rules, they have the same triggers. You can go to the dashboard and you see all of them as separate tabs then, right?

00:15:45 Yes, that’s correct. So you can have each individual workflow with your own triggers. Probably there is some of those that you only need to run whenever you’re making a release. There are some others. For example, you’re Linting. If you’re using things like Black, for example, and Flake Eight, you want to run that ideally every time that you’re pushing or something. So you can see the different triggers. It makes it very, very easy to identify when the workflows are running, then you can see everything without pretty much leaving GitHub at all.

00:16:18 Having those different workflows are good. I was actually just talking to somebody yesterday about tests that you don’t really need to run. And he was like, okay, so I got like some weird code that’s off in a corner that is just for it’s hard to run because it’s for a different operating system that I’m normally using. So I don’t run tests for that. Oh my gosh, that’s exactly the reason why you need to write tests for that and have your CI system run it, because if it breaks, you’re never going to see it because you’re not running that code.

00:16:49 Yeah, exactly. And then I think being able to on all operating systems domain ones, it is very easy. Now, for example, I don’t use Windows. I develop either on Linux or macOS, but you always want to test that. It works for those of your users or collaborators that are using Windows.

00:17:11 You can even test on Mac with GitHub Actions, right?

00:17:14 Yes, that’s correct.

00:17:15 Okay. That’s pretty cool. Are there different flavors of Linux that I need to test for or just picking one? Is that sufficient?

00:17:22 Usually I normally test on Ubuntu because that’s what is available normally.

00:17:28 Okay.

00:17:28 And I think it’s one of the mostly used distributions.

00:17:34 I don’t use anything else because normally you would already have all of the required Linux packages installed beforehand rather than you having to do Apt, get, for example, if it’s Gdmbased or something or APK.

00:17:51 Okay.

00:17:51 If it’s not a registration.

00:17:52 So I definitely want to run my tests on multiple operating systems.

00:17:57 Do you recommend that I run like my Linting? Do I need to run that on multiple operating systems or just picking one sufficient?

00:18:03 Yeah. I think for Linting, for example, you can just do one operating system if you’re already doing, I think defaults from Contaporage do a lot of different operating systems of and they do leverage Windows, Mac, News, Linux, Mac and Windows because they do have to be all of the packages that are then published kind of force to be able to be installed through Konda.

00:18:33 That’s cool. So they’re using GitHub Actions.

00:18:36 I don’t know if they’ve migrated, they were using a shirt pipelines, but I can check now, but I think they are a very good example of workflows on right. For testing and building on multiple operating systems.

00:18:52 Call to action for people, since a lot of people are migrating. If you’re doing something cool within your pipeline, if you can throw comments in there, that will be awesome because people are looking at GitHub projects to see how they’re doing things.

00:19:06 Yeah. And also one of the nice things of GitHub Actions is that people can publish their own actions, which are the individual steps. So let’s say that you always use Docker. Again, I use that a lot because it’s one of the actions that I use most of the time. So if you’re using Docker across jewelry projects or across different workflows, instead of you having to rewrite the same bits in every workflow demo file, you can Polish that action and then you can pull it. And it’s basically like doing a Pip install package. But instead of doing pipe installed or doing I use this certain action and this version so you don’t have to rewrite that part of your workflow.

00:19:55 Okay. Neat. Is there a place for people to get started? Where do people go to just get started and learn the basics?

00:20:01 There is the GitHub Actions documentation, and there is also a GitHub repo called awesome GitHub Actions. It’s one of the awesome collections. Sorry. Dressener and Community have done an amazing job at collecting different examples, tutorials actions that are in the marketplace. But also if you go to the GitHub marketplace and look for actions that do linting or continuous delivery or workflow automation, you could probably find those building blocks already done for you. Or you could go to the repository and have an idea of what people are doing.

00:20:40 Okay.

00:20:40 Yeah. So I think those are some good resources.

00:20:44 But any other calls to action you want to have people do if they want to know more information?

00:20:50 I’m just going to open the same call to action that I’ve been opening for folks in scientific computing. If you need a hand, just reach out to me. I am more than happy to help you all. I’m not very good at website, so I will probably be less helpful there, but I can also help. Okay, just try it.

00:21:09 Are you doing that as part of your role as a developer advocate?

00:21:13 It is part of my role. It’s not something that I have to that is like, this is one of the actions that you have to do as a developer advocate, but it’s one of the things that I just want to do to give back to the community, help those open source maintainers that are struggling a tiny bit. Yeah. It’s also something that I enjoy doing, so it’s a bit of everything.

00:21:36 I think that’s wonderful.

00:21:38 We get so much from open source projects, so giving back to the community is really cool. And then if people want to know more about you, how do they ask you on Twitter?

00:21:48 Is that the best people can reach me out on Twitter? I am easily finding and also they can go to my website that is trialart trlart Dev and they can contact me from there.

00:22:04 Thanks so much for answering my questions about GitHub act. I’m sure people will reach out to you. So thanks a lot.

00:22:10 Thank you for having me.

00:22:14 Thank you, Tanya. I’m inspired now to try out some new things with GitHub Actions and thank you Datadog for sponsoring visit, testandcode.com Datadog to get started. And thank you all of the listeners that support the show through Patreon join them by going to testandcode.comsupport all of those links and links to items that we talked about in the show are up on the show notes at Test And Code.com. One, two, three. That’s all for now. Now go out and test something.