[release] [releng][TSC] phosphorus release status - master branch has been locked


Robert Varga
 

On 23/09/2021 05:54, Daniel de la Rosa wrote:
Phosphorus AR#212 integration #169 has only one test case failed
bgpcep https://jenkins.opendaylight.org/releng/job/bgpcep-csit-1node-userfeatures-all-phosphorus/167/
so looks to me like a good RC candidate
It seems we have a few regressions in both BGP and PCEP. I suspect I know the culprit behind at least one of the failures, should have an updated bgpcep ready later today.

Regards,
Robert


Daniel de la Rosa
 

It seems that we are still having BGP and PCEP issues, but are they all critical? Or can we fix them later so we can release Phosphorus?



Robert Varga
 

On 27/09/2021 21:18, Daniel de la Rosa wrote:
It seems that we are still having Bgp and pcep issues but are they all critical ? Or can we fix them later so we can release phosphorus  ?
Yeah, it's three test cases, all of them are PCEP-related. They are failing reliably, which seems to indicate a systemic problem.

I'll try to see if I can debug/repro it tomorrow. We might punt to SR1 (which is around the corner) if it ends up being something hard.

Regards,
Robert






Robert Varga
 

On 28/09/2021 05:24, Robert Varga wrote:
Yeah, it's three test cases, all of them are PCEP-related. They are failing reliably, which seems to indicate a systemic problem.
I'll try to see if I can debug/repro it tomorrow. We might punt to SR1 (which is around the corner) if it ends up being something hard.
Alright, this looks like a pccmock issue:

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/bgpcep-csit-1node-userfeatures-all-phosphorus/179/robot-plugin/log.html.gz#s1-s2-k2-k3

https://jira.opendaylight.org/browse/BGPCEP-981 tracks it.

Regards,
Robert


Robert Varga
 

On 28/09/2021 13:33, Robert Varga wrote:
Alright, this looks like a pccmock issue:
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/bgpcep-csit-1node-userfeatures-all-phosphorus/179/robot-plugin/log.html.gz#s1-s2-k2-k3
https://jira.opendaylight.org/browse/BGPCEP-981 tracks it.
Okay, I think I found the culprit. A fixed bgpcep should be out in about two hours or so and should be reflected in the next AR build.

Bye,
Robert


Robert Varga
 

On 27/09/2021 21:18, Daniel de la Rosa wrote:
It seems that we are still having Bgp and pcep issues but are they all critical ? Or can we fix them later so we can release phosphorus  ?
Okay, so we have one remaining issue in BGPCEP, but after spending today trying to make sense of what is going on, it is not a regression, just shifted timing in code.

It may have been detected transiently before, but now it is getting caught in every build.

The underlying problem has been there since late 2017, maybe even as a day-0 issue.

From the BGPCEP perspective we are good to release https://jenkins.opendaylight.org/releng/job/autorelease-release-phosphorus-mvn35-openjdk11/221/

Regards,
Robert


Daniel de la Rosa
 



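Thanks Robert. I've come up with this Phosphorus release approval for the TSC:

https://wiki.opendaylight.org/display/ODL/Phosphorus+Formal+Release+Approval

but I couldn't find the right mri test ... it is not 39, right?

https://jenkins.opendaylight.org/releng/view/autorelease/job/integration-distribution-mri-test-phosphorus/39/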






Robert Varga
 

On 29/09/2021 20:29, Daniel de la Rosa wrote:
Thanks Robert.. I've come up with this phosphorus release approval for TSC
https://wiki.opendaylight.org/display/ODL/Phosphorus+Formal+Release+Approval
but I couldn't find the right mri test ... it is not 39 right?
https://jenkins.opendaylight.org/releng/view/autorelease/job/integration-distribution-mri-test-phosphorus/39/
So the mri-test-phosphorus does not kick off of autorelease, but runs periodically. I have kicked off #40, but #39 should be accurate.

Regards,
Robert


Daniel de la Rosa
 



On Thu, Sep 30, 2021 at 4:26 AM Robert Varga <nite@...> wrote:
So the mri-test-phosphorus does not kick off of autorelease, but runs
periodically. I have kicked off #40, but #39 should be accurate.

Both #40 and #39 are hitting this issue:

ERROR: Build aborted. Can't trigger undefined projects. 1 of the below project(s) can't be resolved:
 > netconf-csit-1node-scale-max-devices-only-master



 



Luis Gomez
 

I think netconf devs got confused about how our CSIT branches work. I do not see any CSIT job for Phosphorus, and therefore I do not think we are testing netconf in this release:

https://jenkins.opendaylight.org/releng/view/netconf/


BR/Luis








Robert Varga
 

On 30/09/2021 17:38, Luis Gomez wrote:
I think netconf devs got confused on how our CSIT branches work, I do not see any CSIT job for phosphorous and therefore I do not think we are testing netconf in this release:
https://jenkins.opendaylight.org/releng/view/netconf/
I think netconf devs would be less confused if you could share some writeup of how exactly test execution is wired across integration/test and releng/builder :)

Please note that NETCONF is an MRI project, hence its naming follows its versioning scheme since Silicon GA.

Since autorelease has already branched stable/phosphorus, but netconf.git has _not_ branched 2.0.x, netconf's master is still hosting netconf-2.0.x and that in turn is integrated into *both* Phosphorus and Sulfur.

The same is true for odlparent, mdsal, controller, aaa and bgpcep. Yangtools is a bit ahead of the curve and has 7.0.x for Phosphorus and master is hosting yangtools-8.0.0-to-be (and destined for Sulfur).

I hope that paints the current state of git branches enough to understand where we are.
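
As a rough illustration, the branch layout described above can be written down as data. This is only a sketch: the name `HOSTED_IN` and the helper function are made up for this example and are not part of any ODL tooling, and only the netconf and yangtools branches named in this mail are listed. It shows why naming CSIT jobs after a SimRel would test the same netconf master code twice.

```python
# Which simultaneous releases (SimRel) each MRI project branch currently feeds,
# as described in this mail. All names here are illustrative assumptions.
HOSTED_IN = {
    ("netconf", "master"): ["Phosphorus", "Sulfur"],  # netconf-2.0.x, not yet branched
    ("yangtools", "7.0.x"): ["Phosphorus"],
    ("yangtools", "master"): ["Sulfur"],              # yangtools-8.0.0-to-be
}


def simrel_named_runs(project, branch):
    """Return the CSIT runs implied if jobs were named after each SimRel."""
    return [
        f"{project}/{branch} CSIT for {simrel}"
        for simrel in HOSTED_IN[(project, branch)]
    ]


for key in HOSTED_IN:
    print(key, "->", len(simrel_named_runs(*key)), "run(s)")
```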

Currently all testing of MRI projects is still wired on the assumption that what we are testing is int/dist with that particular MRI project. See bgpcep tests for an example.

That is quite wrong, as MRI projects' CSIT must be able to execute on whatever artifacts are produced by, for example, the {mriproject}-maven-stage-{branch} job.

Tomas has sunk a significant chunk of his time into correcting this and we are still kicking the tires on that here:

https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-maven-mri-stage-master/

What that job does is build and stage netconf for release (just like autorelease does for openflowplugin) and then trigger

https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-distribution-mri-test-master/

This job runs the NETCONF CSIT, not on int/dist's karaf.tar.gz, but rather on netconf's netconf-karaf.tar.gz.

Which is pretty much what should be happening for all MRI projects, as that way we get fully-vetted artifacts before we decide to run our {project}-release-merge job to publish them to Nexus (step done manually by Anil for autorelease).
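
The intended flow for an MRI project can be sketched as a chain of job names. Treat this as an assumption-laden illustration: the two templates containing `mri` are inferred from the netconf job URLs in this mail, `{project}-release-merge` is taken as written here, and the real releng/builder definitions may well differ.

```python
# Job-name templates. The first two are inferred from the netconf job URLs in
# this mail (an assumption); the third is quoted from the text as-is.
STAGE = "{project}-maven-mri-stage-{branch}"
TEST = "{project}-distribution-mri-test-{branch}"
RELEASE = "{project}-release-merge"


def mri_pipeline(project, branch):
    """Return job names in order: stage, CSIT on the staged artifacts, release."""
    names = {"project": project, "branch": branch}
    # str.format ignores unused keyword arguments, so RELEASE works unchanged.
    return [template.format(**names) for template in (STAGE, TEST, RELEASE)]


print(mri_pipeline("netconf", "master"))
```

The point of the ordering is that the release step only runs after CSIT has passed on the project's own staged artifacts.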

Based on all of this, please start saying goodbye to the idea that MRI projects have CSIT job names bearing SimRel names -- currently that would mean executing netconf/master CSIT twice (sulfur and phosphorus).

Also, please be patient while we bring all of this long-overlooked task to fruition.

As for the netconf-csit-1node-scale-max-devices-only job specifically, that one underwent a rewrite and is still pending stabilization. The test never really worked as it should have.

As for the overall Phosphorus GA release, I just do not see the point in holding it up anymore. Once it is out, we will finally be able to wipe Aluminium, which reduces cognitive load significantly, since the genius Netvirt will no longer be in the picture.

If there is anything wrong in GA, we have SR1 scheduled exactly 4 weeks from today.

Regards,
Robert




Daniel de la Rosa
 





OK. I've sent an email to the TSC so the current Phosphorus candidate can be approved.


Thanks

 

Regards,
Robert

>
> BR/Luis
>
>
>> On Sep 30, 2021, at 8:19 AM, Daniel de la Rosa
>> <ddelarosa0707@... <mailto:ddelarosa0707@...>> wrote:
>>
>>
>>
>> On Thu, Sep 30, 2021 at 4:26 AM Robert Varga <nite@...
>> <mailto:nite@...>> wrote:
>>
>>     On 29/09/2021 20:29, Daniel de la Rosa wrote:
>>     >
>>     >
>>     > On Wed, Sep 29, 2021 at 9:52 AM Robert Varga <nite@...
>>     <mailto:nite@...>
>>     > <mailto:nite@... <mailto:nite@...>>> wrote:
>>     >
>>     >     On 27/09/2021 21:18, Daniel de la Rosa wrote:
>>     >      > It seems that we are still having Bgp and pcep issues but are
>>     >     they all
>>     >      > critical ? Or can we fix them later so we can release
>>     phosphorus  ?
>>     >      >
>>     >      > On Thu, Sep 23, 2021 at 5:08 AM Robert Varga <nite@...
>>     <mailto:nite@...>
>>     >     <mailto:nite@... <mailto:nite@...>>
>>     >      > <mailto:nite@... <mailto:nite@...> <mailto:nite@...
>>     <mailto:nite@...>>>> wrote:
>>     >      >
>>     >      >     On 23/09/2021 05:54, Daniel de la Rosa wrote:
>>     >      >      >
>>     >      >      > Phosphorus AR#212 integration #169 has only one test case failed
>>     >      >      >
>>     >      >      > bgpcep https://jenkins.opendaylight.org/releng/job/bgpcep-csit-1node-userfeatures-all-phosphorus/167/
>>     >      >      >
>>     >      >      > so looks to me like a good RC candidate
>>     >      >
>>     >      >     It seems we have a few regressions in both BGP and PCEP. I suspect I
>>     >      >     know the culprit behind at least one of the failures, should have an
>>     >      >     updated bgpcep ready later today.
>>     >
>>     >     Okay, so we have one remaining issue in BGPCEP, but after spending today
>>     >     trying to make sense of what is going on, it is not a regression, just
>>     >     shifted timing in code.
>>     >
>>     >     It may have been detected transiently before, but now it is getting
>>     >     caught in every build.
>>     >
>>     >     The underlying problem has been there since late 2017, maybe even as a
>>     >     day-0 issue.
>>     >
>>     >     From BGPCEP perspective we are good to release
>>     >     https://jenkins.opendaylight.org/releng/job/autorelease-release-phosphorus-mvn35-openjdk11/221/
>>     >
>>     >
>>     > Thanks Robert.. I've come up with this phosphorus release approval for TSC
>>     >
>>     > https://wiki.opendaylight.org/display/ODL/Phosphorus+Formal+Release+Approval
>>     >
>>     > but I couldn't find the right mri test ... it is not 39 right?
>>     >
>>     > https://jenkins.opendaylight.org/releng/view/autorelease/job/integration-distribution-mri-test-phosphorus/39/
>>
>>     So the mri-test-phosphorus does not kick off of autorelease, but runs
>>     periodically. I have kicked off #40, but #39 should be accurate.
>>
>>
>> Both #40 and #39 are having this issue
>>
>> ERROR: Build aborted. Can't trigger undefined projects. 1 of the below project(s) can't be resolved:
>>   > netconf-csit-1node-scale-max-devices-only-master
>>
>>
>>
>>
>>     Regards,
>>     Robert
>>
>>
>>
>>
>


Luis Gomez
 

OK, I think I was the confused one here :)

After reading your mail, it makes sense to start testing MRI local distribution as you are doing here:

https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-distribution-mri-test-master/

Also for this job that is testing opendaylight distribution (not local one):

https://jenkins.opendaylight.org/releng/view/distribution/job/integration-distribution-mri-test-phosphorus/

I would have kept the stream name in the netconf job name (even if this means more jobs), just to track the test trend for a specific ODL stream; now multiple ODL versions use the same (master) job.

But no biggie, I guess we can even remove the netconf project from the distribution MRI test if it is already covered in the local distribution test.

BR/Luis

On Sep 30, 2021, at 9:12 AM, Robert Varga <nite@...> wrote:

On 30/09/2021 17:38, Luis Gomez wrote:
I think netconf devs got confused about how our CSIT branches work. I do not see any CSIT job for phosphorus, and therefore I do not think we are testing netconf in this release:
https://jenkins.opendaylight.org/releng/view/netconf/
I think netconf devs would be less confused if you could share some writeup of how exactly test execution is wired across integration/test and releng/builder :)

Please note that NETCONF is an MRI project, hence its naming follows its versioning scheme since Silicon GA.

Since autorelease has already branched stable/phosphorus, but netconf.git has _not_ branched 2.0.x, netconf's master is still hosting netconf-2.0.x, which in turn is integrated into *both* Phosphorus and Sulfur.

The same is true for odlparent, mdsal, controller, aaa and bgpcep. Yangtools is a bit ahead of the curve and has 7.0.x for Phosphorus and master is hosting yangtools-8.0.0-to-be (and destined for Sulfur).

I hope that paints the current state of git branches enough to understand where we are.

Currently all testing of MRI projects is still wired on the assumption that what we are testing is int/dist with that particular MRI project. See bgpcep tests for an example.

That is quite wrong, as MRI projects' CSIT must be able to execute on whatever artifacts are produced, for example, {mriproject}-maven-stage-{branch} job.

Tomas has sunk a significant chunk of his time into correcting this, and we are still kicking the tires on that here:

https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-maven-mri-stage-master/ . What that job does is build and stage netconf for release (just like autorelease does for openflowplugin) and then trigger

https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-distribution-mri-test-master/ . This job runs the NETCONF CSIT, but not on int/dist's karaf.tar.gz, but rather on netconf's netconf-karaf.tar.gz.

Which is pretty much what should be happening for all MRI projects, as that way we get fully-vetted artifacts before we decide to run our {project}-release-merge job to publish them to Nexus (step done manually by Anil for autorelease).

Based on all of this, please start saying goodbye to the idea that MRI projects have CSIT job names bearing SimRel names -- currently that would mean executing netconf/master CSIT twice (sulfur and phosphorus).
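
To make the duplication concrete, here is a minimal Python sketch. It is an illustration only: the mapping below is transcribed from the branch state described in this mail (the yangtools branch name "7.0.x" is assumed from the version stream), and the function names are hypothetical, not part of releng/builder or integration/test.

```python
# Hypothetical illustration; this mapping mirrors the branch state described
# in this mail, not any real releng/builder configuration.

# Which git branch of each MRI project feeds each simultaneous release (SimRel).
SIMREL_BRANCH = {
    "netconf": {"phosphorus": "master", "sulfur": "master"},
    "yangtools": {"phosphorus": "7.0.x", "sulfur": "master"},
}


def simrel_named_runs(project):
    """CSIT runs if jobs are named after SimRels: one (simrel, branch) pair each."""
    return sorted(SIMREL_BRANCH[project].items())


def branch_named_runs(project):
    """CSIT runs if jobs are named after git branches: one per distinct branch."""
    return sorted(set(SIMREL_BRANCH[project].values()))


def duplicated_branches(project):
    """Branches that SimRel-named jobs would end up testing more than once."""
    branches = [branch for _, branch in simrel_named_runs(project)]
    return sorted({b for b in branches if branches.count(b) > 1})


if __name__ == "__main__":
    for project in sorted(SIMREL_BRANCH):
        print(project, simrel_named_runs(project), duplicated_branches(project))
```

With SimRel-named jobs, netconf's master shows up under both phosphorus and sulfur, whereas branch-named jobs would cover it once.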

Also, please be patient while we bring all of this long-overlooked task to fruition.

As for netconf-csit-1node-scale-max-devices-only job, specifically, that one underwent a rewrite and is still pending stabilization. The test never really worked as it should have.

As for the overall Phosphorus GA release, I just do not see the point in holding it up anymore. Once it is out, we will finally be able to wipe Aluminium, reducing cognitive load significantly due to the genius Netvirt no longer being in the picture.

If there is anything wrong in GA, we have SR1 scheduled exactly 4 weeks from today.

Regards,
Robert

BR/Luis