Running tests before installation with setuptools

I dream of packages that refuse to install if their tests are failing. At least, I made it happen for mine.

My solution is gory but practical.

In setup.py I added:

import sys
import unittest

def test():
    """Script the equivalent of the command line: python -m unittest discover"""
    loader = unittest.TestLoader()
    # discover() takes a glob pattern (fnmatch style), not a regex
    suite = loader.discover(".", "test_*.py")
    runner = unittest.TextTestRunner()
    result = runner.run(suite)
    if not result.wasSuccessful():
        raise Exception("Tests failed: aborting install")
    print("#### Tests passed")

if "install" in sys.argv or "bdist_egg" in sys.argv or "sdist" in sys.argv:
    test()
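
With this in place, python setup.py install (or sdist / bdist_egg) runs the discovered tests first, and the install aborts with a traceback if any of them fail.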


And since practicality beats purity...

Still wondering if it is a bad idea.



Okay, I test my packages before deploying; okay, there is tox. But no test setup can cover as many variations as the environments users actually have. Even though I don't find it elegant, at least it has the authoritative, psycho-rigid behaviour I want.

Just like in Perl, if it does not pass the tests, it should not be installed.

I still wish I could:
  • make an autoreport tool (calling a REST server) to know how reliable my packages are, and which OSes/Python versions have problems;
  • have a tool that lets users interact with the ticketing system;
  • bypass the tests with a --force flag, and call the test suite in a unified way once the package is installed (see the sketch after this list).
Still dreaming.
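
For what it's worth, here is a minimal sketch of the --force bypass and the autoreport idea, done with plain sys.argv inspection before setup() runs. The --force flag, the report URL and the package name are hypothetical (nothing here is existing setuptools machinery), and it assumes Python 3's urllib:

import json
import platform
import sys
import unittest
import urllib.request

# Hypothetical --force flag: strip it before setuptools parses sys.argv,
# and remember whether it was passed so the tests can be skipped.
FORCE = "--force" in sys.argv
if FORCE:
    sys.argv.remove("--force")

def report(result):
    """POST a tiny test report to a (hypothetical) REST endpoint."""
    payload = json.dumps({
        "package": "mypackage",                  # hypothetical name
        "python": platform.python_version(),
        "platform": platform.platform(),
        "tests_run": result.testsRun,
        "failures": len(result.failures),
        "errors": len(result.errors),
    }).encode()
    req = urllib.request.Request(
        "https://example.invalid/test-reports",  # hypothetical URL
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=5)
    except OSError:
        pass  # reporting is best effort; never block the install on it

def test():
    """Run the discovered test suite and abort the install on failure."""
    suite = unittest.TestLoader().discover(".", "test_*.py")
    result = unittest.TextTestRunner().run(suite)
    report(result)
    if not result.wasSuccessful():
        raise SystemExit("Tests failed: aborting install (use --force to skip)")

if not FORCE and {"install", "bdist_egg", "sdist"} & set(sys.argv):
    test()

The reporting part is deliberately best effort: a network hiccup should never be the reason an install fails.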

4 comments:

Thomas K said...

I'm interested to hear some more rationale for this. Imagine that I use pip to install your package. A test fails, and the installation aborts. What do I do?

Most likely, I'll curse you a bit, bypass the test system (either by finding a flag or just by editing setup.py) and see if the functionality I need will work anyway. If it does, I've wasted a couple of minutes working out how to skip the tests. If it doesn't, I'd have found the problem anyway.

The other aspects - especially aggregating test failures - clearly make sense. But can you explain the advantage of refusing to install a package with a failing test?

jul said...

Well, if we set aside the fact that I am always right (except when I make mistakes), there is more than one rationale. The first being: not losing time.

1) The sooner we spot and fix a problem, the smaller it is likely to be. If my tests say «it is wrong» as I check behaviour and reported bugs (liar, you have not written the regression tests yet), then you have a ticking logic bomb in your code just waiting to explode. The longer you wait to report the problem, the more code will be impacted (yours and mine). And I hate coding.
2) And if your code works while the tests are failing, you might be relying on a side effect that makes further API changes trickier.
3) That's what perl -MCPAN -e shell (then install Package) and gem install package do, and it makes me more confident in using a package. It may be a placebo effect, but I feel relieved when tests pass.

I admit that I am lazy. That's the main rationale. Not having to support code that doesn't work is plain laziness.

Thomas K said...

I agree about the advantages of testing. I could understand 'run the tests before installation, and report any failures'. It's just refusing to install if there are any test failures that I take issue with, because if that happened to me I'm quite sure that I would work around it to get the package installed anyway.

Re using dodgy code: there may well be a bug in a module or function that I'm not even using. And surely it's up to me whether I risk using code that's failed a test - if there's no clear alternative, I'll probably take that chance.

By contrast, in Debian packaging, the tests are run when the package is built, but not on installation. If one version fails the tests, the package isn't built, and the previous version will stay in the repository until someone fixes it or decides some tests can be skipped. So the end user can always install it without having to override anything.

jul said...

@Thomas.

I do agree: the ideal solution is to accept the installation even if the tests fail; that's the rationale for a *--force* flag in setuptools.

I like to take one step after another, and squash the problems that bug me at a slow pace. I may have a solution for this next week (I hope I can monkey-patch the argument parser used in setuptools).

Regarding Debian, you overlook the sid <-> testing <-> stable workflow, which amounts to having human beings test and report problems after packages are installed, before a package is considered «stable/tested».