[release] [kernel-dev] Regression in Mg SR1


JamO Luhrsen
 


On 5/1/20 2:36 PM, JamO Luhrsen via lists.opendaylight.org wrote:


On 5/1/20 1:01 PM, Luis Gomez wrote:
FYI this change merged on April 27th:

here's the revert:
https://git.opendaylight.org/gerrit/c/controller/+/89554



It produced regressions in all of these suites:

This one seems easiest to try to debug. The gist of the problem is this:

- bring up older controller version and do some configs
- copy snapshots/ and *journal/ folders off to new controller version
- start new controller version
- notice that the data/config is not there (404 on cars:cars)
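
The copy step in the list above can be sketched roughly as follows. This is only an illustration, not the code the CSIT suite runs: the function name `copy_persistence` and the two root directories are hypothetical, and where `snapshots/` and `*journal/` actually live inside a controller install is assumed to be passed in by the caller.

```python
import shutil
from pathlib import Path


def copy_persistence(old_root: Path, new_root: Path) -> list[str]:
    """Copy snapshots/ and every *journal/ directory from old_root to new_root.

    Both roots are hypothetical parameters; the thread does not say where
    these folders sit inside the controller installation.
    Returns the sorted names of the directories that were copied.
    """
    copied = []
    # The two patterns named in the thread: "snapshots" and "*journal".
    candidates = [old_root / "snapshots", *old_root.glob("*journal")]
    for src in candidates:
        if not src.is_dir():
            continue  # nothing was persisted under this name
        dst = new_root / src.name
        # dirs_exist_ok allows copying into a freshly unpacked install.
        shutil.copytree(src, dst, dirs_exist_ok=True)
        copied.append(src.name)
    return sorted(copied)
```

Checking the returned list before starting the new controller would at least confirm that both kinds of folder were found and copied, which helps separate a test-side copy problem from a controller-side recovery problem.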

That's all I have though, by looking at the robot logs. Looking at the karaf log, it's
weirdly silent after the new controller boots up like normal. All that's there are
the two log statements we write to it from robot:

2020-05-01T01:47:57,330 | INFO | pipe-log:log "ROBOT MESSAGE: Starting test controller-akka1.txt.Verify_Data_Is_Restored" | core | 123 - org.apache.karaf.log.core - 4.2.6 | ROBOT MESSAGE: Starting test controller-akka1.txt.Verify_Data_Is_Restored
2020-05-01T01:51:01,859 | INFO | pipe-log:log "ROBOT MESSAGE: Starting test controller-akka1.txt.Archive_Older_Karaf_Log" | core | 123 - org.apache.karaf.log.core - 4.2.6 | ROBOT MESSAGE: Starting test controller-akka1.txt.Archive_Older_Karaf_Log


The bgp jobs seem to be even more broken though. More ERRORs, etc. Not sure
if we need to look at those separately or not.


These suites all deal, to some degree, with the snapshot folder, which might have changed after the mentioned patch.
Did snapshot change? I know journal did, but we addressed that here:
https://git.opendaylight.org/gerrit/c/integration/test/+/88658

I am not sure at this moment whether we should investigate the issues and repair the tests (which can take a while) or just revert and try again in the next SR.

I would guess some folks might also have reservations about introducing this change in an SR.
Once the revert I created gives me a distribution, I'll run it through these four
jobs in the sandbox. If those all pass like expected, and we don't get any quick
fix on the CSIT side, it might make sense to merge the revert and get moving
on a new release candidate.

FYI, The four jobs reported above were all fine with the revert.

JamO


Thanks,
JamO

BR/Luis



    


Daniel de la Rosa
 

Thanks @Jamo Luhrsen. So @Robert Varga, I think we should merge this revert ASAP so we can respin and get a new Magnesium SR1 release.

Thanks




--
Daniel de la Rosa
Customer Support Manager
Lumina Networks Inc.
e: ddelarosa@...
m:  +1 408 7728120


Robert Varga
 

On 04/05/2020 17:38, Daniel de la Rosa wrote:
Thank @Jamo Luhrsen <mailto:jluhrsen@...> So @Robert Varga
<mailto:nite@...> i think we should merge this revert ASAP so we can
respin and get a new Magnesium SR1 release ASAP
The revert is in, next autorelease should have it.

Regards,
Robert


JamO Luhrsen
 

On 5/4/20 1:12 PM, Robert Varga wrote:
On 04/05/2020 17:38, Daniel de la Rosa wrote:
Thank @Jamo Luhrsen <mailto:jluhrsen@...> So @Robert Varga
<mailto:nite@...> i think we should merge this revert ASAP so we can
respin and get a new Magnesium SR1 release ASAP
The revert is in, next autorelease should have it.
It'll be this one to look at:
https://jenkins.opendaylight.org/releng/view/autorelease/job/autorelease-release-magnesium-mvn35-openjdk11/261


Thanks,
JamO

Regards,
Robert


Daniel de la Rosa
 


On Mon, May 4, 2020 at 1:34 PM Jamo Luhrsen <jluhrsen@...> wrote:

On 5/4/20 1:12 PM, Robert Varga wrote:
> On 04/05/2020 17:38, Daniel de la Rosa wrote:
>> Thank @Jamo Luhrsen <mailto:jluhrsen@...> So @Robert Varga
>> <mailto:nite@...> i think we should merge this revert ASAP so we can
>> respin and get a new Magnesium SR1 release ASAP
> The revert is in, next autorelease should have it.

It'll be this one to look at:
https://jenkins.opendaylight.org/releng/view/autorelease/job/autorelease-release-magnesium-mvn35-openjdk11/261


Thanks,
JamO

> Regards,
> Robert
>

