pip : “pip installs packages” or maybe “Package Installer for Python” pip is an invaluable tool when developing with Python. You can use pip to install Python packages from pypi.org, or a different index, or a local directory. The way pip installs from a local directory is about to change, and the story is fascinating.

Transcript for episode 163 of the Test & Code Podcast

This transcript starts as an auto generated transcript.
PRs welcome if you want to help fix any errors.

00:00:00 Pip Pip installs packages or maybe package installer for Python. I forgot to ask Stefan if he knows what it really stands for. Regardless, Pip is an invaluable tool when developing with Python. A lot of people know Pip as a way to install third party packages from Pipi.org. You can also use Pip to install from other indexes or is it indices. You can also use Pip to install a package in a local directory. That’s the part I want to jump in and explore with Stefan Bedouli. The way Pip installs from a local directory is about to change and I wanted to talk to Stefan to see if I need to care about the change or not. It turns out, even if you don’t need to know all the details of this, the story is fascinating.

00:00:58 Welcome to Testing Code.

00:01:08 Welcome to Testing Code. I’m super happy to have Stefan Bedouli on the show, so thanks for coming on.

00:01:18 Thanks for before we start, tell me a little bit about yourself.

00:01:24 Sure. So I’m in Belgium.

00:01:29 Maybe my French accent is showing, so apologies about that. So I’ve been a professional software engineer by passion and profession since about 30 years now.

00:01:40 And like ten years ago, I’ve been fortunate to be a founding partner for European software project company named Daxon, where we focus on crafting solutions based on open source and open standard technologies.

00:01:55 That joined my patient for open source, and that’s my main occupation these days.

00:02:01 And I’m also an elected board member of the Odo Community Association, which is an open source community creating and maintaining a large number of addons for the Odo ERP.

00:02:11 And last year, I think in April the Pep team asked me if I would like to get the commit bit, which I accepted. And I’m not trying to help as best as I can.

00:02:25 You’ve been doing a lot of open source in the last ten years.

00:02:28 Yeah.

00:02:29 Is that kind of when you started working with Python or have you been using Python for longer?

00:02:33 Oh, I’ve been using Python since I think, version 1.4 or something.

00:02:39 Okay. Back in the 90s.

00:02:41 Okay, nice.

00:02:43 I’m a very long time user of Python, but I never really had the chance to get involved in the community for various reasons.

00:02:54 Okay, so you’ve been involved with Pip for a while then, or at least in the last year or so?

00:03:00 Yeah. And I think I started becoming more involved in this in the packaging community like five years ago or something, because we wanted to solve basically installation problems in the audio community. The thing is that we maintain like maybe 3000 different extension modules for Odo and there are dependencies between all those modules. And telling our user which module to install to get a working system was a bit complex because our modules are spread in like 200 different GitHub repositories. And so telling them you need to install that one and that one and that one is really difficult. So together with the OCA board, we decided to publish our addons on Pip. So I started working on that. And actually I it for various reasons. All kinds of HKS and little issues in Python packaging.

00:04:11 There are some of them for historical reasons. And issues I had were mostly about the wheelbuilding process around working with VCs references in requirements or editable installs, for instance.

00:04:27 And so I didn’t really find any tool helping for that. So I wanted to contribute as upstream as possible and so I started contributing to people to help fix those.

00:04:41 And I must say that the team was pretty friendly so I could do some fixes, make some improvements, like something I contributed back then as a mechanism to cache the wheels you build locally from a VCs reference. So when you do Pip install, get plus https, blah blah blah with a commit ash in the reference is going to cache it. So if you reinstall it, it’s going to go faster.

00:05:11 Also, at some point I got interested in Pipfree also in relation with VCs reference, because in that case Pip freeze back then just did not work. It didn’t spit out the original VCs reference for that. I was told, okay, that’s an interoperability problem, so you have to write a pet.

00:05:35 So I went on and it took a year. It was quite an experience.

00:05:41 But then eventually it became Pep Six 10, which was accepted.

00:05:49 So I implemented that in Tip recently. Also I found like there was kind of consensus around modern editable installs artists, I thought.

00:06:02 So I proposed to write a pet for that one, which I quote with Daniel Holmes.

00:06:08 And that was also quite a ride with a competing step. It was quite interesting, but in the end it was accepted when I signed. These days I’m trying to implement that. For instance, now I try to a short list of improvements, a new feature I would like to add to pipeline or things that are bit loose like for instance, you know that everyone is saying you should never do setup by install because you need to get Pip installed right. But actually behind your back, Pip is actually doing setup by install in many situations and that’s something we want to change in Pip to go to a process where we build the wheel and then install it.

00:06:57 But it takes time.

00:07:00 These are important improvements to modernize the packaging ecosystem, but it takes time because we have to keep compatibility and care for deprecation.

00:07:13 But it’s really fun and quite interesting.

00:07:18 That’s so cool. I think I got the right person on the line. Then.

00:07:23 One of the things that surprised me, I feel like I use an abuse Pip a lot.

00:07:31 Partly because I work in a situation where we’re behind a firewall and sometimes with needing to set up proxies and stuff as a company, we’ve solved that already. But for a while a kind of a hacky solution was to just download all this stuff, like do it, download and download everything that we needed into just a directory somewhere, and then just point that up, point all the Pip installs through that instead using the index flag or something, and then just pointing it there or not an index.

00:08:15 Yeah. With find links.

00:08:17 Yeah, find links. That’s it. Yeah. When I was using setup tools, I would do the editable installs all the time. Of course, now I’ve switched and I’m using Piproject Tamil and Flit often and editable installs don’t really work anymore.

00:08:35 If you try to edit install, like with Pip install, e with a directory, and if it’s a flip package, it’ll pop up and say you can’t do it right now. It’s a setup tools thing.

00:08:51 Yes, that’s correct. And hopefully in three months it will be. Oh, really possible. Yeah.

00:08:57 That’s so quick. I’m so excited.

00:09:05 Thank you, Python for sponsoring this episode. Python 310 release candidate is now available. Yay. Does your editor support it though? Pycharm does. Pycharm 2021 Two can be used today for Python 310 as well as previous versions of Python. Of course, the coded Insight feature will help you with structural pattern matching, explicit type aliases and unit types, and all the other goodies of Python 310. There’s lots of other updates too, like Code with me has been updated with synchronized code completion and even the ability to share consoles. That’s cool. And did you know that you can run tests as a pre commit action? Actually, on that note, did you know PyCharm had its own pre commit actions? You’re going to preferences, version control, commit. There’s all sorts of stuff you can do, including with 2021 running tests. That’s nice. Try PyCharm for four months by going to Test And Code PyCharm, even with a flip based project, without the editable, you can say Pip install and then a directory and it’ll do it correctly.

00:10:15 Great.

00:10:16 However, when I do that, I get this big deprecation warning and that was the impetus to try to get somebody on to explain this.

00:10:30 It looks scary and I just want to know, is it scary? Do I need to worry about it?

00:10:36 So I’m going to go ahead and read it. So if I do like Pip install in a directory name and I usually give it a path like boo or something, it says Deprecation of future Pipers will change local packages to be built in place without first copying to a temporary directory.

00:10:54 We recommend that you use feature entry build. I think to test your packages with this new behavior before it becomes default.

00:11:08 21. Three will remove support for this functionality. You can find discussions at and it gives a link to an issue.

00:11:16 Okay, so I assume you’re familiar with this.

00:11:20 Yeah.

00:11:21 What does this mean and do I have to care about it?

00:11:25 Yeah. So first thing, it’s totally valid to do Pip install of local directory right. You don’t need to worry about that. And it will continue to be supported.

00:11:42 Okay, good. Because I have that instruction in my book and I want to make sure I can keep printing that.

00:11:47 Yeah, no worries.

00:11:50 It will continue to work.

00:11:54 The thing is that actually, most people don’t even need to worry about that deprecation warning, because the vast majority of users will not be impacted when we change the behavior of that particular installation mode.

00:12:16 Actually, I think I need to explain what Pep does when you do Pip install directory.

00:12:22 Yeah, that’d be great.

00:12:24 Currently, when you do that, Pip is first trying to do a copy of the directory and all its content and sub directories to a temporary location and then doing the installation from there.

00:12:41 All right.

00:12:42 Okay.

00:12:43 And it turns out that sounds easy, but it turns out that actually it’s not. And over time, people have raised all kinds of problems with that approach.

00:13:02 And so what we are changing and I can explain after why we are doing this, what we are changing is that instead of copying to a temporary directory and installing from there now, the installation will take place from the original directory.

00:13:25 What that means if your project is using Setup Tools, for instance, when you do a setup install or set up by install of a setup Tools, project Setup Tools is creating a build directory locally, and it’s creating a info directory locally.

00:13:52 And that’s the thing that will change is that for Setup Tools, project people will start seeing those work or artifacts appearing in their project directory when they do this kind of install. That’s the difference.

00:14:09 If you use Flit, for instance, you will not see that because Flit is not creating artifacts locally to do its built.

00:14:19 Actually, it will not make any difference.

00:14:25 So maybe you’re all set already in your case.

00:14:31 Yeah, we use the bonus.

00:14:34 And actually last year or something, there was a try to make the switch of behavior in pipe without a duplication phase.

00:14:47 But then we discovered that was creating some problems for some workflows, like, for instance, people installing from a read only directory, then Setup Tools cannot create those artifacts. And it’s brilliant.

00:15:06 And there was even more exotic situations where doing successive builds from inside, like a Docker image with different Python versions would create slightly different artifacts that could be incompatible and set of tools would not detect it.

00:15:31 That could lead to Invalid wheels and things like that.

00:15:36 And so we decided to do deprecation period to warn people that something was changing and that they should pay attention to it and prepare for the change, which could come with the October release or maybe later if we discover other problems along the way.

00:15:57 Okay, so this 21 three is supposed to come around in October. Yeah, that’s just right around the corner. So cool.

00:16:04 It’s doing four release per year. So yeah, the next one is in October.

00:16:11 Okay.

00:16:14 Mostly I’m getting out of it is I personally don’t need to worry about it. However, if you were doing something kind of interesting like from a read only directory that seems different, is that going to work then? Or is a read only directory just not going to work in the future?

00:16:34 It really depends on your build back end.

00:16:37 If you use a build back end like Flit which only creates artifacts in temporary directories and so on, and manage temporary directories by itself, it will work. We set up tours if it will not work. So that’s a workflow we are breaking and we know you are breaking. And I’ll explain why we are doing it because I think it’s important in the pipe team, we don’t like to do breaking changes, it’s important to give stability to the community.

00:17:10 But in this case, basically that was the only possible solution.

00:17:19 And in that case either setup tools will have to evolve to support that use case or people will have to copy themselves to a writeable directory and then do the installation from there.

00:17:38 And basically that brings me to the reason we are doing that.

00:17:44 I think it’s worth explaining for the community so people understand why we had to do that.

00:17:53 Actually there are several reasons.

00:17:58 The first one is performance. Actually we got many feedback in the PPC tracker that doing Pip install for local directory was slow in some situation and maybe you have not seen it, but if you have like for instance a project that has multigabyte Git directory and even if it could be a large project with a large Git history, if you do Pip install from the root of that project, it’s going to copy everything to a temporary directory is going to copy that multi gigabyte, get directory to your Templation and it’s slow and people get out of this space in their temp location and stuff like that and say why there’s no reason.

00:18:52 So that’s one thing. And actually we do have access to ignore the docs or dot docs directories because those tend to be big and we don’t need them to build.

00:19:05 That was going to be my next question because my tox directory is right there also pretty big.

00:19:12 But we do have an exception, a specific exception for talks and knocks in the Pip code base to work around that. And to be honest, that’s a bit of a hack and people are asking can’t you ignore also the git directory? And you say well no, we can’t because some projects use like setup tools SEM to generate version numbers from Git Tags.

00:19:45 For that one they need a Git directory to do a correct bill.

00:19:51 So we cannot get directory. So that’s one reason performance.

00:19:56 And then the other reason is sometimes your Python project is inside a larger project, like when you have I don’t know an application like Blender or something which has a Python bindings, and they typically live in the subdirectory of the main project.

00:20:20 And to build the Python binding of such projects, they usually need something that is not inside the Python module directory, but just a sibling directory or something.

00:20:36 And if we copy the directory where the Python project is, we don’t get those sibling directories. And also the build fails.

00:20:46 And there are many situations like that.

00:20:49 And long story short, basically there is no way for people to know how to do this copy to a temporary directory correctly. In all cases, it’s just impossible.

00:21:03 And that’s the reason. And there was lots of discussion. But that’s the reason we reached this conclusion that we must just stop doing that and basically push that problem of knowing what to copy to somewhere else. And that somewhere else can be the user, it can be the build back end.

00:21:27 If you do a custom build back end with pet five, one, seven, you know what you need. And if you want your bill to work in a read only directory where you know exactly what you need to copy to a template location to do your bid and things like that.

00:21:43 Okay.

00:21:46 That’S the story.

00:21:48 Yeah. All the different corner cases.

00:21:53 That’s an interesting story because I think when that was done at the beginning, I was not there.

00:22:03 We could not find this story exactly why this was done.

00:22:07 But basically we can imagine that it’s really to allow working from read only directory or avoiding leaving build artifacts in the source directory. And those are good reasons. But the problem is that that was done at the time where people set up tools were really tied together.

00:22:33 And today the company is trying to untie those five, one, seven and so on.

00:22:42 Basically, my conclusion is that that feature is useful, but basically it was implemented in the wrong tool. It was implemented in papers. It should have been implemented in the blue backend.

00:22:54 Oh, yeah. I mean, somewhere to flip setup tools could have had an option to copy it into a different directory.

00:23:01 Yeah, exactly.

00:23:04 Wow.

00:23:05 That’s fascinating.

00:23:09 I’m glad we had this discussion. Another reason why I wanted to talk about it openly is because of I had a couple of people comment because like I said, I put a sample project in my book and I tell people to Pip install with a slash, took the path so that you can get the local directory installed. And that will kind of disable Pip from trying to find it on pipe.

00:23:38 And a couple of people, not too many, but a couple of people have been surprised and said, I don’t understand why you’re doing that because Pip is only for Pi packages.

00:23:49 That’s just a misunderstanding. That’s how a lot of people use it. And I guess a lot of people just use it for grabbing things off IPI, but you can use it for so much more.

00:24:00 Yeah.

00:24:03 The way I like to see. Pip is a command line interface for the database of install distribution in the Python environment. Things like you can install things in it, you can uninstall, you can upgrade.

00:24:22 But the source of your packages is not necessarily Pip. Of course it can be local index, it could be local directories, it can be a VCs. You can install from git. Url. There are quite a lot of supporting people for that.

00:24:44 And a lot of companies like my company is doing a local thing that looks like Pipe, but it’s basically a mirror into Pip for things that are not there and then it caches them. But it also has local stuff and company only thing that we don’t want to share with the rest of the world.

00:25:09 Yeah, that’s important.

00:25:11 Actually, Pip is a very flexible tool.

00:25:17 I don’t think it will ever grow to be something like Poetry or Pipelines, something like Full environment management. That’s not the scope at the moment, but I personally see as a very important building block maybe to build such IR level tools.

00:25:38 There are tools where you can afford using a CLI to manage your install and install stuff that could be the right building block to to help build such higher level tools.

00:25:55 One of the fun things that happened since I’ve been paying attention to Pip is the Pip list turning into a column list or something, but it also went through a phase where it was sort of a warning showed up. I can’t remember if it was a warning or just a notice to people that this was going to change.

00:26:21 Just be aware of it.

00:26:26 Is that because I assume this is because a lot of people are using Pipelist in scripts and in other cases where it might mess things up if the output changed at all.

00:26:38 Yeah, we have to be very careful because people do all sorts of things with Pip and I think at the time parting the output of Paperlist was things people were doing.

00:26:51 In the meantime, Paper list has a JSON output. So I hope people are not parsing the column output anymore.

00:27:00 And if they do, they really need to switch to the JSON output because like for instance, when we support modern editable, that’s something I was working this week. And the column format will change because we are adding a column with the editable project location.

00:27:20 Oh good.

00:27:22 So when you do a Fleet install minus SIM link, for instance, people will recognize that and display the project location in the people list column.

00:27:38 People should not parse that column output, but then there is the JSON output.

00:27:44 That’s so cool. I didn’t know that you could do JSON output. That’s nice.

00:27:49 Yeah. Click list minus formatting equal.

00:27:55 There’s all sorts of goodies in here. So people should really do like Pipelist help and see all the cool things you can do.

00:28:03 Yeah, it can do quite. And one long term project I don’t know still need to be discussed with other maintenance like for instance having Richard JSON. Output to get like metadata on the project. So really you have a way to kind of query what is installed in very much details and more details than today where you just have basically the name and the version unlimited information.

00:28:32 I didn’t even know you could do pipelist o and list your outdated things.

00:28:38 That’s a cool one too.

00:28:40 That’s neat, sweet, cool people. Definitely and I’m one of them take Pip for granted. It’s just part of my tool chain and there’s so much power in there. Very cool. Well, I learned a whole bunch and so thank you so much, Stefan for coming on the show.

00:28:59 My pleasure.

00:29:05 Thank you Stefan for your work on Pip and packaging and for joining us today to talk about how Pip install of a local directory works. Thank you Patreon supporters join them at testandcode.com support thank you PyCharm for sponsoring the show. Check them out at testandcode.com PyCharm those links are in the show notes of course@testingco.com 163 that’s all for now. Now go out and Pip install something.