Hello everyone,
as the (still ongoing) failure to start Jenkins jobs shows, our current
way of integrating with external dependencies (global-jjb) is beyond
fragile.
The way our jobs work is that:
1) we have a base image, created by builder-packer-* jobs on a regular
basis, which rolls up distro upgrades plus some other things we need
(like mininet)
2) the Jenkins job launches on that base image and calls two scripts
from global-jjb, both of which end up installing more things (see the
sketch after this list):
a) python-tools-install.sh
b) lf-env.sh
3) the actual job runs
4) some more steps run, invoking lf-env.sh to set up another Python
environment.
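For illustration, each job run currently boils down to something like
this (the exact invocations are my approximation, not verbatim
global-jjb internals):

    # runs on every single build, on top of the packer-built image;
    # pip-installs lftools and friends -- network- and PyPI-dependent
    bash global-jjb/shell/python-tools-install.sh
    # sourcing lf-env.sh provides lf-activate-venv, which creates yet
    # another virtualenv and pip-installs packages into it
    . global-jjb/shell/lf-env.sh
    lf-activate-venv lftools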
Now, it is clear that everything in 1) is invariant and updated in a
controlled way.
The problem is with 2), where again, everything is supposed to be
invariant for a particular version of global-jjb -- yet we reinstall
these things on every single job run.
Not only is this subject to random breakage (like now, or whenever pip
repositories are unavailable), it also takes around 3 minutes of each
job execution. That does not sound like much, but it is a full 30%(!)
of the runtime of yangtools-release-merge (which takes around 10
minutes).
We obviously can and must do better: global-jjb's environment-impacting
scripts must all be executed during the builder-packer run, so that
they become proper invariants.
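As a rough sketch (the provisioning script and the stamp file are
hypothetical, not existing global-jjb pieces), the packer-time step
could look like this:

    #!/bin/bash
    # hypothetical provisioning step for the builder-packer-* jobs:
    # run the environment-impacting scripts once, at image build time
    set -eu
    # step 2a from the list above; its pip installs become part of
    # the base image instead of per-job work
    bash global-jjb/shell/python-tools-install.sh
    # stamp the image with the global-jjb version it was built
    # against, for the compatibility check proposed below
    git -C global-jjb describe --tags > /tmp/lf-env-version
    sudo mv /tmp/lf-env-version /etc/lf-env-version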
For that, global-jjb needs to grow two things:
1) a way to install *all* of its dependencies without doing anything
else, for use in packer jobs
2) compatibility checks on the environment to ensure it is up to date
enough to run a particular global-jjb version's scripts (a sketch of
such a check follows)
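For point 2, the check could be as simple as comparing the version
stamped into the image against the minimum the current scripts need --
a minimal sketch, with the stamp path and version numbers purely
illustrative:

    #!/bin/bash
    # hypothetical check run at the start of every job
    set -eu
    STAMP=/etc/lf-env-version   # written at image build time
    REQUIRED=0.40.0             # illustrative minimum version
    baked=$(cat "$STAMP" 2>/dev/null || echo 0)
    # sort -V orders versions; if the smaller of the two is not
    # REQUIRED, the image predates what these scripts expect
    if [ "$(printf '%s\n' "$REQUIRED" "$baked" | sort -V | head -n1)" \
            != "$REQUIRED" ]; then
        echo "image baseline $baked is older than required $REQUIRED" >&2
        exit 1
    fi

With a check like that, an image which predates a global-jjb bump
fails fast with a clear error instead of breaking mid-job.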
With that, our jobs should be both faster and more reliable.
Does anybody see a reason why this would not work?
If not, I will be filing LFIT issues to get this done.
Regards,
Robert