Infra is better now and the missing BGPCEP patch is merged, so I think we can pick the next AR build as RC.
Any updates on the Magnesium SR3 issues?
I know you can't join the next TSC at 9 am PST, so can you let us know if increasing the vexhost quota is going to be your long-term solution? I'd like to be able to release Magnesium SR3 this week if possible.
On Tue, Dec 1, 2020 at 9:41 AM Luis Gomez <ecelgp@...
I think increasing the vexhost quota might not be the right solution here because: 1) it may impact the vexhost bill (this is why you added a quota in the first place) and 2) it is hard to adjust the quota so that no CSIT fails.
If you see my last comment in the ticket, I remember that when we hit this issue in the past, Thanh adjusted the maximum number of robot minions allowed to run in parallel, and by doing so the CSIT jobs were queuing in Jenkins without failing. If we do this again, we avoid any impact on the vexhost bill, and we can adjust the maximum number of robot minions internally without involving vexhost.
On Dec 1, 2020, at 2:26 AM, Anil Belur <abelur@...
Greetings Daniel, Luis:
I've raised a request with vexhost to increase the quota limits. The primary reason the quotas were put in place was to make sure we don't see a spike in the invoice like the one seen a few months ago.
Do we know why we are exceeding the limits recently, given that the quotas have been in place for a while?
On Wed, Nov 25, 2020 at 3:15 PM Luis Gomez <ecelgp@...
Here it is:
On Nov 24, 2020, at 3:53 PM, Robert Varga <nite@...
On 24/11/2020 18:29, Daniel de la Rosa wrote:
I think it is better to fix the infra issues first but I'm gonna
let @Robert Varga <mailto:nite@...> confirm.. So do we need to open
an LFN IT ticket ?
No hurry, I guess -- all code is in, we now just need to be confident it
This looks like we want LF IT to take a look with some amount of urgency
-- so can we file a ticket, please?
On Tue, Nov 24, 2020 at 9:05 AM Luis Gomez <ecelgp@...
BTW are we in a hurry to release Mg SR3 or can we fix the infra
Considering the distribution test launches a bunch of CSIT jobs in
parallel, the fix for this is either:
- Increase the max number of cloud instances: not sure if there is a
penalty in doing this, it should not if we just pay for cloud usage
- Implement a CSIT job execution queue: instead of failing the CSIT
job, the jobs could be queued until the cloud resources are available.
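To illustrate the second option, here is a minimal sketch of a queue that holds jobs until cores are free instead of failing them. It is not how Jenkins or the cloud provider actually schedules anything; the class `CoreQueue`, the job names, and the core limit of 6 are all hypothetical and chosen only to keep the example small.

```python
from collections import deque


class CoreQueue:
    """Admit jobs while cores are free; hold the rest in FIFO order.

    Hypothetical illustration only, not a Jenkins or OpenStack API.
    """

    def __init__(self, core_limit):
        self.core_limit = core_limit
        self.used = 0
        self.running = {}        # job name -> cores held
        self.waiting = deque()   # (job name, cores) in arrival order

    def submit(self, name, cores):
        if cores > self.core_limit:
            raise ValueError(f"{name} needs {cores} cores, limit is {self.core_limit}")
        self.waiting.append((name, cores))
        self._admit()

    def finish(self, name):
        # Release the cores held by a finished job, then admit waiters.
        self.used -= self.running.pop(name)
        self._admit()

    def _admit(self):
        # Strict FIFO: stop at the first waiting job that does not fit.
        while self.waiting and self.used + self.waiting[0][1] <= self.core_limit:
            name, cores = self.waiting.popleft()
            self.running[name] = cores
            self.used += cores


queue = CoreQueue(core_limit=6)  # assumed limit, for illustration
for job in ("csit-a", "csit-b", "csit-c", "csit-d"):
    queue.submit(job, cores=2)
print(sorted(queue.running), list(queue.waiting))

queue.finish("csit-a")           # frees 2 cores, so the waiting job starts
print(sorted(queue.running), list(queue.waiting))
```

The fourth job waits rather than hitting the limit, and it starts as soon as an earlier job releases its cores.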
On Nov 23, 2020, at 10:44 PM, Luis Gomez via
All the reds here:
are infra failures:
*03:48:16* WARN: Failed to initialize stack. Reason: Resource CREATE failed: Forbidden: resources.vm_1_group.resources.resources.instance: Quota exceeded for cores: Requested 2, but already used 356 of 350 cores (HTTP 403) (Request-ID: req-cbc3d8c8-b59b-4430-aed7-ab664be171d8)
which means there is not enough capacity to test an ODL distribution.
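As a rough aid for reading such failures, the sketch below pulls the numbers out of the warning quoted above and works out how far over the limit the request would land. The regular expression is written against this one message only, so its coverage of other quota errors is an assumption, and `parse_quota_error` is a hypothetical helper name.

```python
import re

# The warning quoted above, copied verbatim from the thread.
LOG_LINE = (
    "*03:48:16* WARN: Failed to initialize stack. Reason: Resource CREATE failed: "
    "Forbidden: resources.vm_1_group.resources.resources.instance: Quota exceeded "
    "for cores: Requested 2, but already used 356 of 350 cores (HTTP 403) "
    "(Request-ID: req-cbc3d8c8-b59b-4430-aed7-ab664be171d8)"
)

# Pattern derived from this single message; other messages may differ.
QUOTA_RE = re.compile(
    r"Quota exceeded for (?P<resource>\w+): "
    r"Requested (?P<requested>\d+), but already used (?P<used>\d+) of (?P<limit>\d+)"
)


def parse_quota_error(line):
    """Return the quota figures from a 'Quota exceeded' line, or None."""
    match = QUOTA_RE.search(line)
    if match is None:
        return None
    info = {
        "resource": match.group("resource"),
        "requested": int(match.group("requested")),
        "used": int(match.group("used")),
        "limit": int(match.group("limit")),
    }
    # How far past the limit usage would be if the request were granted.
    info["over_by"] = info["used"] + info["requested"] - info["limit"]
    return info


print(parse_quota_error(LOG_LINE))
```

For this message the reported usage (356) is already above the limit (350), so the job could not have started even with a smaller request.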
On Nov 22, 2020, at 10:21 PM, Daniel de la Rosa
<ddelarosa0707@... <mailto:ddelarosa0707@...>> wrote:
Hello TSC and all
Friendly reminder to help on this ASAP
On Fri, Nov 20, 2020 at 9:35 AM Daniel de la Rosa via
Hello TSC and all
I have picked Magnesium AR 473 as RC so please help with CSIT