How python deployment wouldn't have to suck
By: Kerim in python
Today I read a post on why Python deployment sucks. Reading through it, I had to agree very often. I think nothing mentioned in that post is actually new, or hasn't been thought about by most of us. At least I have thought about it several times in the past. What was missing, however, was (as usual) a solution.
I am not really an absolute specialist when it comes to such questions in conjunction with Python. But looking around at other languages, I think it's not really that difficult to come up with some "theoretical" solution that would work, provided one could muster the time and people to actually implement and promote it.
establish a ".P" (dot-p) Framework
Microsoft has done it with .NET and frankly it is a good solution. Why not establish a versioned "standard" framework for Python, including not only Python with its included batteries but a bit more than that: components and libs that work in conjunction with one another, so that we have a more stable basis for programs? This framework would also register itself as the handler of Python files and perhaps include an update function.
autoinstall of the framework
In order to make it easier for the user, we should create binary applications that contain a functional exe (in the case of Windows) that downloads the framework if it is not installed already. This way we would reduce the need for inexperienced (or lazy) people to download it manually or to spend too much time reading the requirements of the software.
All this would allow us to distribute source code or perhaps "compiled" pycs to each other without any problem (except for the base binary with the installer). Our installation routines would only have to set up the start menu entries and the icons, and copy the scripts to the appropriate directory.
But actually we could do more than just that...
autoinstall of packages
One of my worst nightmares when it comes to Python is the "required libs" section in the installation instructions. I would LOVE to have a function in the standard lib (or the .P Framework) that would allow needed packages to be loaded dynamically. My idea would be to expand the standard import mechanism. Something like:
import myPackage,{>2.0}
would lead the interpreter to look locally for the package "myPackage" with a version of 2.0 or greater. If it doesn't find it, it should simply shop in the cheeseshop (or who knows where), download and install it, and then continue with the program. With that mechanism you wouldn't have the problem of downloading half a dozen packages from the net before you can try a 200 KB program.
Now of course all these ideas are, as usual, just that... ideas. I would hope for someone to actually pick them up and write a good and decent PEP for it, or simply do it. I know it sounds pretty arrogant to ask such features from someone else and not do it yourself. Frankly, I think this is not even something that can be done by one person alone, even if he had the time.
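To make the "look locally first" half of that idea concrete, here is a minimal sketch in present-day Python. The function names (parse_version, is_satisfied) are made up for illustration, the version comparison is deliberately naive, and the download-and-install step is left out entirely. It only answers the question "is a good enough version already here?".

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '2.10.1' into (2, 10, 1); non-numeric parts are ignored (naive)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def is_satisfied(dist_name, min_version, get_version=metadata.version):
    """Return True if dist_name is installed locally at min_version or newer.

    get_version is injectable so the check can be tried without touching
    the real installation; by default it asks importlib.metadata.
    """
    try:
        installed = get_version(dist_name)
    except metadata.PackageNotFoundError:
        # Nothing installed under that name: the hypothetical installer
        # would have to go shopping at this point.
        return False
    return parse_version(installed) >= parse_version(min_version)
```

A call such as is_satisfied("myPackage", "2.0") would then decide whether the imagined import hook has to fetch anything at all.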
Your comments ?




on 29 September 2009 at 04:21 Flavio said …
Well, there is PyPI. What we would need in order to accomplish the automatic installation of the "extra batteries" for a given package would be to develop meta-packages, like the Debian distribution has. These metapackages would be a simple list of packages to install.
In this department of metapackages, we have Buildout, pip, virtualenv, etc. If only these projects got together... That's the blessing/curse we see in Python web frameworks.
It would obviously help to have package maintainers guarantee that their packages are available on PyPI as binary eggs, ideally for multiple platforms. It has to be binary eggs because Windows users don't have compilers on their systems and don't care about installing them. Besides, on my Ubuntu box, I can't install scipy or matplotlib via easy_install (WTF!?!), that's very sad.
my 2 cents.
on 29 September 2009 at 08:49 bryancole said …
Package / distribution management is hard to get right and to get everyone to agree on. Presenting the user with the security issues associated with downloading and installing packages from PyPI automatically is not realistic. Particularly so in a corporate environment, where auto-installing software is often forbidden.
I think an easier solution would be some better, standard tools for building "custom python runtimes". These would be self-contained packages containing the python VM and all the additional libraries and extensions required by an application. This would be very similar to the output of tools like py2exe or bbfreeze, but *without* the application libraries. I.e. the runtime contains all the 3rd-party stuff that the application author did *not* write (or, like virtualenv, but *with* the python VM included). The application can then be distributed as an egg, and it is run using the custom runtime library provided by the author.
The advantage of this approach is that the custom runtime is immune to the existence or version of a system python install, it contains only the libraries actually required by the application. Keeping it separate from the application libraries makes it very easy to upgrade the application egg. The basic assumption is that the application libraries may be upgraded fairly often but the runtime will be upgraded/changed much less frequently.
Having a single standard runtime (i.e. a standard install of python from www.python.org) is probably not realistic either, because compiled modules will not work across versions and new python releases are fairly frequent.
on 29 September 2009 at 09:00 bryancole said …
I see virtualenv does bundle the VM. However, it doesn't find all the required libraries for you, or pack them into a zip-archive, like py2exe or bbfreeze do.
on 29 September 2009 at 11:12 Kerim Mansour said …
I find the solutions based on something like py2exe rather awkward.
Think about it this way....
If you have a Python application, chances are that it is well below 1 megabyte in size. Using py2exe blows it up to at least 5 MB, and if you have a GUI then even more.
Now, to distribute, let's say, 10 MB applications with functionality that you could also have as, let's say, a 500 KB download when using .NET is frustrating.
You might ask yourself if this is really important, but I found that many people really do take such issues seriously despite having a DSL6000 line.
on 29 September 2009 at 12:34 Simon Hibbs said …
The Python standard library is the equivalent of the .NET framework. That's its job. It is continually being revised and expanded, and there are several larger supersets out there such as the Enthought version, so on this point you're asking for something that either already exists or is already doable.
I think an auto-installer for the framework is a possibility. This might work by creating a small executable stub containing the Python application itself and an application that checks for an existing Python installation, plus a downloader/installer that pulls down the Python software and/or the application's dependencies. For Windows or the Mac this would be a huge win, but really on Linux you already have package and dependency management software and should really work through those.
Simon
on 29 September 2009 at 14:54 Kevin Teague said …
You can't do auto-installs from import statements since Python distribution names on PyPI don't necessarily map to module and package names. For example, the docutils distribution supplies the roman module. There is no reverse lookup available when doing 'import roman' to determine that the docutils distribution is needed. Even if such a reverse lookup was created, there would still be the potential for ambiguity. For example, there is nothing stopping a developer from creating a distribution dealing with the Roman Empire and creating a roman module in it.
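The ambiguity can be sketched with a toy reverse index. The data below is hand-written for illustration: "roman-empire" is a made-up distribution, and the PROVIDES table and function names are assumptions, not anything PyPI actually offers.

```python
from collections import defaultdict

# Hypothetical index data: distribution name -> top-level modules it ships.
PROVIDES = {
    "docutils": ["docutils", "roman"],
    "roman-empire": ["roman"],  # made-up distribution for illustration
}


def build_reverse_index(provides):
    """Map each module name to the sorted list of distributions shipping it."""
    index = defaultdict(list)
    for dist, modules in provides.items():
        for module in modules:
            index[module].append(dist)
    return {module: sorted(dists) for module, dists in index.items()}


def candidates_for(module, index):
    """Return every distribution that could satisfy 'import <module>'."""
    return index.get(module, [])


index = build_reverse_index(PROVIDES)
print(candidates_for("roman", index))  # ['docutils', 'roman-empire']
```

With two candidates for a single module name, an import-driven installer has no principled way to pick one.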
There are also many problems to tackle when dealing with library dependencies. If your app requires 'myPackage > 2.0', then you need to interpret what the 'latest' release of that distribution is. Version numbering of Python distributions is not standardized right now. When downloading a distribution, do you only use PyPI? Are there mirrors available? Private package indexes you want to pull from? Custom build settings and libraries that need to be linked to? Because there are many different, valid answers to these questions depending on what is being installed where, it makes sense to allow the installer to determine the answers to these questions and resolve all of the library dependencies at install time. Finally, if you embedded "import myPackage,{>2.0}" in your code, then there would be no way to easily parse that dependency metadata from your code. This information is much more easily accessible when it's put into the 'install_requires' field of a distribution's setup.py file.
You can use tools such as Buildout to automate library dependency installation, and with custom recipes even use it to automatically generate complete, self-contained application files. However, the Ruby people are one-up on Python here in that they have a known location for placing and discovering available libraries via gems, so it's possible to have an installer for Ruby skip the installation of already available packages. You can do this with Buildout, but that's a developer-specific thing, not something an end user could take advantage of. On the flip side, because you can write something like 'gem "extlib", ">= 1.0.8"; require "extlib"' in Ruby, they have problems with authors producing libraries which are unnecessarily coupled to specific environments or library versions.
on 29 September 2009 at 15:11 Jens said …
imho the CPython implementation is, like most (all?) "script interpreter frameworks", targeted at developers and probably was never meant to be used by non-programmers... nor were windows or mac the main target OSes. That given, i'd guess adding an auto-installer is your best option. I agree with Simon that it's already possible...
I wouldn't recommend an autoinstaller for application dependencies, as i have yet to see one that's doing its job and just works, without getting in your way..... The app developer should do it and the framework should ensure isolation; maybe a modern version of java's classpath. I know there is pythonpath, but nobody seems to use it...
Or skip CPython and use IronPython, getting .net for free. Or Jython and get Java.
on 30 September 2009 at 11:59 Jonathan Hartley said …
Some nice thoughts there. Once I've got my current project building and distributing properly, I'm going to pick one tiny aspect of this issue and try and do my bit to fix something.
on 2 October 2009 at 05:37 Kerim Mansour said …
I mostly agree with Simon. Python itself would qualify as the .P Framework with its slogan "batteries included".
Most projects I know, however, do need more than just Python. So any "extra-installation-on-request" feature makes sense in my view.
Simon proposed an autoinstaller for the framework.
Programming such a thing is not really the problem.
On windows a simple check:
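    import winreg  # this module is called _winreg on Python 2

    try:
        # Python installers normally register themselves under this key
        reg = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, "SOFTWARE\\Python\\PythonCore")
        winreg.CloseKey(reg)
    except EnvironmentError:
        print("Python was not found on this system")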
would do the job.
Then you would install Python.
But I would not distribute this as a Python program, because in order to do so you would end up with the same problem again... you would need to pack your Python sources into an exe along with the Python runtime. Then you already have your x MB again, without any functionality of your originally intended program.
Perhaps the solution lies in platform-dependent installers that are included in a batch script that tries to start your own code?
Something like:
1) checkandInstall python
2) run myModule.py
@Kevin:
I agree with your summary of the current state of Python. The question is whether we could "define" a standard that would change that ;)
For example, your Python program could come with a config/manifest file that explicitly lists the distributions needed instead of the modules.
These would then be autoinstalled.
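Just to illustrate what such a manifest could look like, here is a rough sketch of a parser. The file format (one distribution per line, an optional comparison operator and version, '#' for comments) and the function name parse_manifest are my own assumptions, not an existing standard.

```python
import re

# Hypothetical manifest line: a distribution name, optionally followed by
# a comparison operator and a version, e.g. "docutils >= 0.5".
REQUIREMENT = re.compile(
    r"^([A-Za-z0-9_.\-]+)\s*(?:(>=|<=|==|>|<)\s*([0-9][0-9A-Za-z.]*))?$"
)


def parse_manifest(text):
    """Return (name, operator, version) tuples for each requirement line.

    Operator and version are None when a line names only the distribution.
    """
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        match = REQUIREMENT.match(line)
        if match is None:
            raise ValueError("cannot parse requirement: %r" % raw_line)
        requirements.append(match.groups())
    return requirements
```

An installer could walk that list and fetch whatever is missing. Since the file names distributions rather than modules, it sidesteps the reverse-lookup problem Kevin described.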