IT-22529 Vexxhost stack instantiation failed

Robert Varga

So this, again, is the interplay between CSIT, Vexxhost, OpenStack and us.

There were wide-scale Jenkins problems during the past 24 hours, for example:

spent 12 hours waiting in the queue. This job normally executes within
90 minutes.

has 2 new failures in three-node scenarios

took 13 hours to build.
It usually completes in 20 minutes.
reveals this:

01:03:35 -----END_OF_BUILD-----
This is after the end of the build, i.e. post-build actions, which the ODL
community has no way of affecting.

01:03:35 [JaCoCo plugin] Loading packages..
12:21:28 [JaCoCo plugin] Done.
i.e. it took 11 hours and 18 minutes...

12:21:28 [JaCoCo plugin] Loading packages..
12:26:24 [JaCoCo plugin] Done.
... then 5 minutes ...

12:26:24 [JaCoCo plugin] Loading packages..
13:43:35 [JaCoCo plugin] Done.
... and some 77 minutes ...

13:43:35 [JaCoCo plugin] Loading packages..
14:25:10 [JaCoCo plugin] Done.
... and some 42 minutes ...

14:25:10 [JaCoCo plugin] Loading packages..
14:25:20 [JaCoCo plugin] Done.
14:25:20 [PostBuildScript] - [ERROR] An error occured during post-build processing.
14:25:32 org.jenkinsci.plugins.postbuildscript.PostBuildScriptException: java.lang.InterruptedException
and this is me, taking that job out back. It would probably still be
running, holding back retriggers (of which 7 have since passed).
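For the record, the gaps quoted above follow directly from the console timestamps. A throwaway sketch (assuming same-day, 24-hour timestamps, which holds for the log shown):

```python
from datetime import datetime, timedelta

def gap(start: str, end: str) -> timedelta:
    """Elapsed time between two HH:MM:SS Jenkins console timestamps,
    assuming both fall on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)

# The JaCoCo "Loading packages.. / Done." pairs from the log above:
pairs = [("01:03:35", "12:21:28"),   # ~11 h 18 min
         ("12:21:28", "12:26:24"),   # ~5 min
         ("12:26:24", "13:43:35"),   # ~77 min
         ("13:43:35", "14:25:10"),   # ~42 min
         ("14:25:10", "14:25:20")]   # 10 s

for start, end in pairs:
    print(f"{start} -> {end}: {gap(start, end)}")
```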

While all of this was going on,
shows OpenDaylight infra to be happy as a clam for the last 24 hours.

Something is seriously not adding up here and it is Not Fun[*].


[*] All of this delayed Phosphorus CSIT results by ~10 hours, i.e. no
new data during morning CET.

On 21/07/2021 22:22, Project Services wrote:

Hello *Robert Varga*,

*Eric Ball* has added a comment on your request *IT-22529*

_The resource usage is currently low, so I was able to successfully
re-run the ovsdb job. I'm re-running the openflowplugin job now.

I'll also submit a request to Vexxhost to expand or remove the
memory cap. We have a limit on the total number of instances active
at once, which should keep our resource usage from hitting any of
the usage caps. But it seems that the memory cap doesn't take into
account the RAM used by stacks dynamically spun up by the jobs, so a
large number of those jobs running concurrently can still have this
effect._

Hello *Robert Varga*,

*System Administrator* changed the status of your request *IT-22529*
to *Waiting for customer*.
