[integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue


Sam Hague
 



On Mar 23, 2018 12:20 AM, "B Sathwik" <b.sathwik@...> wrote:

Vishal,

   Suneelu was asking for the ovsdb TRACE logs for the 3 node CSIT runs.

   I need to know how to enable the same while running 3 node CSIT jobs in sandbox.

In sandbox, simply add your custom trace settings in the CONTROLLERDEBUGMAP param, like ovsdb:TRACE. Read the comment on that parameter. You can add multiple log settings.
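
As a rough illustration of the kind of value described here, the sketch below expands a debug-map string into per-module logger lines. The space-separated `module:LEVEL` format, the `org.opendaylight.` prefix, the `log4j.logger.` line layout, the `genius:DEBUG` entry, and the helper name are all assumptions made for illustration; the comment on the CONTROLLERDEBUGMAP parameter in the job remains the authority on the real syntax.

```python
# Hypothetical helper: expand a CONTROLLERDEBUGMAP-style string into logger lines.
# Assumed format (not confirmed by the thread): space-separated "module:LEVEL" pairs.

VALID_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}


def expand_debug_map(debug_map, prefix="org.opendaylight."):
    """Return one logger line per "module:LEVEL" entry in debug_map."""
    lines = []
    for entry in debug_map.split():
        module, sep, level = entry.partition(":")
        level = level.upper()
        # Reject entries with no colon, no module name, or an unknown level.
        if not sep or not module or level not in VALID_LEVELS:
            raise ValueError(f"bad debug map entry: {entry!r}")
        lines.append(f"log4j.logger.{prefix}{module} = {level}")
    return lines


if __name__ == "__main__":
    for line in expand_debug_map("ovsdb:TRACE genius:DEBUG"):
        print(line)
```

Validating the entries up front is a deliberate choice in this sketch: a mistyped level would otherwise only show up as missing log output after a long CSIT run.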

 

Any pointers ?

 

Regards

Sathwik

From: Vishal Thapar
Sent: Thursday, March 22, 2018 5:39 PM
To: Faseela K <faseela.k@...>

Subject: RE: Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Hi Faseela,

 

I didn’t say that the issue will not occur if we enhance Genius 3 node CSIT, only that Genius 3 node CSIT isn’t configured like an actual cluster deployment.

 

Yes, there is an issue in OVSDB with disconnect/connect in rapid succession [1], and that is the issue we’re hitting in Genius 3 node CSIT. The issue is not with create/delete of the bridge but with connect/disconnect on the OVSDB channel. Suneelu had fixed it for HWVTEP, but there were some open questions for OVSDB and the discussion on the fix was still open. It would be good to revive this discussion at DDF, where Jamo and Anil would both be there. If we have an OVSDB 3 node CSIT where we can reproduce this reliably, we can try 2-3 options and test which one works.

 

The reason it is intermittent is that the issue depends on EOS (Entity Ownership Service). If the switch connects to a node that isn’t the leader for the OVSDB instance, you run into this issue. Also, there are these exceptions; I am not sure whether they are a cause or an effect of the issue.

 

Caused by: java.lang.IllegalArgumentException: Metadata not available for modification NodeModification [identifier=(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry, modificationType=TOUCH, childModification={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry[{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)target=tcp:10.30.170.65:6640}]=NodeModification [identifier=(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry[{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)target=tcp:10.30.170.65:6640}], modificationType=DELETE, childModification={}]}]
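
For triage, it can help to reduce a message like this to the few fields that matter. The following is only a sketch: the `summarize` helper and its regular expressions are hypothetical and not part of OVSDB or the CSIT tooling, and the message text is rebuilt from the exception quoted above.

```python
import re

# Rebuild the exception text quoted in the thread from its repeated parts.
NS = "(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)"
ENTRY = f"{NS}manager-entry[{{{NS}target=tcp:10.30.170.65:6640}}]"
MESSAGE = (
    f"Metadata not available for modification NodeModification [identifier={NS}manager-entry, "
    f"modificationType=TOUCH, childModification={{{ENTRY}=NodeModification [identifier={ENTRY}, "
    "modificationType=DELETE, childModification={}]}]"
)


def summarize(message):
    """Extract modification types (in order) and distinct manager targets."""
    return {
        "modification_types": re.findall(r"modificationType=([A-Z_]+)", message),
        "targets": sorted(set(re.findall(r"target=([a-z]+:[0-9.]+:[0-9]+)", message))),
    }


if __name__ == "__main__":
    print(summarize(MESSAGE))
```

For the message above this yields the modification types TOUCH then DELETE and the single target tcp:10.30.170.65:6640, that is, a DELETE of one manager-entry under a TOUCH of the manager-entry list. It only restates what the exception says and does not settle whether the exception is a cause or an effect.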

 

 

Regards,

Vishal.

 

[1] https://lists.opendaylight.org/pipermail/ovsdb-dev/2018-February/004567.html

 

From: Faseela K
Sent: 22 March 2018 15:19
To: Vishal Thapar <vishal.thapar@...>
Cc: integration-dev@lists.opendaylight.org; ovsdb-dev@....org; B Sathwik <b.sathwik@...>; genius-dev@....org
Subject: Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Hi Vishal,

 

    As we have already discussed, genius 3 node CSIT is randomly failing because the bridge does not show up in the topology/operational DS on delete and create of the bridge.

    You were indicating that the clustered CSIT of genius will need some enhancements (add HAPROXY?) so that this issue will not occur.

    Could you please give pointers to Sathwik, so that he can start looking into it?

    Also, even if we don’t use HAPROXY and we delete and create a bridge, why does the ovsdb plugin have an issue detecting it?

    https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/genius-csit-verify-3node-upstream/220/robot-plugin/log.html.gz

 

Thanks,

Faseela


_______________________________________________
integration-dev mailing list
integration-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/integration-dev



B Sathwik <b.sathwik@...>
 

Hi,

 

Started sandbox job with ovsdb TRACE logs for 3node genius CSIT.

 

https://jenkins.opendaylight.org/sandbox/job/genius-csit-3node-sathwikgate-all-fluorine/

 

Regards

Sathwik

 

From: Sam Hague [mailto:shague@...]
Sent: Friday, March 23, 2018 7:41 PM
To: B Sathwik <b.sathwik@...>
Cc: Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; integration-dev@...; ovsdb-dev@...; genius-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: Re: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

 

 


 


Vishal Thapar <vishal.thapar@...>
 

There seem to be some infra issues going on:

 

07:48:39 Cleaning up Robot installation...

07:48:39 $ ssh-agent -k

07:48:39 unset SSH_AUTH_SOCK;

07:48:39 unset SSH_AGENT_PID;

07:48:39 echo Agent pid 8178 killed;

07:48:39 [ssh-agent] Stopped.

07:48:39 Robot results publisher started...

07:48:39 -Parsing output xml:

07:48:40 Failed!

07:48:40 hudson.AbortException: No files found in path /w/workspace/genius-csit-3node-sathwikgate-all-fluorine with configured filemask: output.xml

07:48:40       at hudson.plugins.robot.RobotParser$RobotParserCallable.invoke(RobotParser.java:77)

 

 

From: B Sathwik
Sent: 26 March 2018 12:51
To: Sam Hague <shague@...>
Cc: Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; integration-dev@...; ovsdb-dev@...; genius-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: RE: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 


 


Tomáš Markovič <tomas.markovic@...>
 

Also, from
07:48:19 [ ERROR ] Expected at least 1 argument, got 0.
you can see that you are using the wrong testplan:

genius-sathwikgate-fluorine.txt / genius-sathwikgate.txt

Neither file exists, so change them to the testplan you want.
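
A quick local check along these lines can catch a missing testplan before a sandbox run is started. This is a sketch only: the `pick_testplan` helper and the `testplans` directory are hypothetical, and the lookup order (stream-specific name first, then the generic name) is inferred from the two file names listed above rather than confirmed by the job definition.

```python
from pathlib import Path


def pick_testplan(plans_dir, candidates):
    """Return the first candidate file that exists in plans_dir, else None."""
    for name in candidates:
        path = Path(plans_dir) / name
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # Hypothetical directory; point this at wherever the testplan files live.
    plan = pick_testplan(
        "testplans",
        ["genius-sathwikgate-fluorine.txt", "genius-sathwikgate.txt"],
    )
    print(plan if plan else "no matching testplan found; fix the job's testplan setting")
```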

Regards,
Tomas Markovic





B Sathwik <b.sathwik@...>
 

Hi,

 

Changed the test plan accordingly and rebuilt it.

 

Facing the following error. It’s an infra issue.

 

2: Waiting for 15 minutes to create sandbox-genius-csit-3node-sathwikgate-all-fluorine-3.

1: CREATE_FAILED

ERROR: Failed to initialize infrastructure. Reason: Resource CREATE failed: OverLimit: resources.vm_1_group.resources[1].resources.volume: VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed gigabytes quota. Requested 40G, quota is 8192G and 8160G has been consumed. (HTTP 413) (Request-ID: req-ebe75897-6320-49ea-b052-6d139ff869d1)
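
The numbers in this error can be checked with a few lines. The sketch below uses a hypothetical `quota_headroom` helper and a regular expression written for this one message, and computes how much volume quota was left when the request was made.

```python
import re

# Quota sentence copied from the error above.
ERROR = ("Requested volume or snapshot exceeds allowed gigabytes quota. "
         "Requested 40G, quota is 8192G and 8160G has been consumed.")


def quota_headroom(message):
    """Return (requested, available) in gigabytes parsed from the quota error text."""
    m = re.search(r"Requested (\d+)G, quota is (\d+)G and (\d+)G has been consumed", message)
    if m is None:
        raise ValueError("message does not look like a volume quota error")
    requested, quota, consumed = (int(g) for g in m.groups())
    return requested, quota - consumed


if __name__ == "__main__":
    requested, available = quota_headroom(ERROR)
    print(f"requested {requested}G, available {available}G, short by {requested - available}G")
```

With 8192G of quota and 8160G consumed, only 32G remained, so the 40G volume request was 8G over the limit. The failure comes from shared infrastructure capacity rather than from the job configuration.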

 

 

Regards

Sathwik

 

From: Tomáš Markovič [mailto:tomas.markovic@...]
Sent: Monday, March 26, 2018 1:45 PM
To: B Sathwik <b.sathwik@...>
Cc: Sam Hague <shague@...>; genius-dev@...; Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; ovsdb-dev@...; integration-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: Re: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 


 


Vishal Thapar <vishal.thapar@...>
 

Thanks Tomas, I missed the testplan part as I was facing the exact same issue in my patch test job and wrongly assumed the cause was the same. After Sathwik’s change, it is indeed the same infra issue.

 

https://jenkins.opendaylight.org/releng/job/genius-csit-1node-gate-all-fluorine/19/console

 

Regards,

Vishal.

 

From: B Sathwik
Sent: 26 March 2018 14:19
To: Tomáš Markovič <tomas.markovic@...>
Cc: Sam Hague <shague@...>; genius-dev@...; Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; ovsdb-dev@...; integration-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: RE: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 


 


Faseela K
 

Sam mentioned in the last genius weekly meeting that there is already a JIRA for this, and that Suneelu is working on it.

@Vishal : Could you please share the JIRA?

We are hitting the issue intermittently, and when we try to debug with ovsdb TRACE logs, it never happens.

 

Thanks,

Faseela

 

From: Vishal Thapar
Sent: Monday, March 26, 2018 2:22 PM
To: B Sathwik <b.sathwik@...>; Tomáš Markovič <tomas.markovic@...>
Cc: Sam Hague <shague@...>; genius-dev@...; Faseela K <faseela.k@...>; ovsdb-dev@...; integration-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: RE: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Thanks Tomas, I missed the testplan part as I was facing exact same issue in my patch test job and wrongly assumed cause was same. After Sathwick’s change, it is indeed same infra issue.

 

https://jenkins.opendaylight.org/releng/job/genius-csit-1node-gate-all-fluorine/19/console

 

Regards,

Vishal.

 

From: B Sathwik
Sent: 26 March 2018 14:19
To: Tomáš Markovič <tomas.markovic@...>
Cc: Sam Hague <shague@...>; genius-dev@...; Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; ovsdb-dev@...; integration-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: RE: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Hi,

 

Changed the test plan accordingly and rebuild it

 

Facing the following error. It’s a infra issue

 

2: Waiting for 15 minutes to create sandbox-genius-csit-3node-sathwikgate-all-fluorine-3.

1: CREATE_FAILED

ERROR: Failed to initialize infrastructure. Reason: Resource CREATE failed: OverLimit: resources.vm_1_group.resources[1].resources.volume: VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed gigabytes quota. Requested 40G, quota is 8192G and 8160G has been consumed. (HTTP 413) (Request-ID: req-ebe75897-6320-49ea-b052-6d139ff869d1)

 

 

Regards

Sathwik

 

From: Tomáš Markovič [mailto:tomas.markovic@...]
Sent: Monday, March 26, 2018 1:45 PM
To: B Sathwik <b.sathwik@...>
Cc: Sam Hague <shague@...>; genius-dev@...; Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; ovsdb-dev@...; integration-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: Re: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Also from,

07:48:19 [ ERROR ] Expected at least 1 argument, got 0.

You can see you are using wrong testplan:

 

genius-sathwikgate-fluorine.txt / genius-sathwikgate.txt

 

which do not exist, so change them accordingly to what you want.

 

Regards,

Tomas Markovic

 

 

On Mon, Mar 26, 2018 at 9:20 AM, B Sathwik <b.sathwik@...> wrote:

Hi,

 

Started a sandbox job with ovsdb TRACE logs for the 3 node genius CSIT.

 

https://jenkins.opendaylight.org/sandbox/job/genius-csit-3node-sathwikgate-all-fluorine/

 

Regards

Sathwik

 

From: Sam Hague [mailto:shague@...]
Sent: Friday, March 23, 2018 7:41 PM
To: B Sathwik <b.sathwik@...>
Cc: Vishal Thapar <vishal.thapar@...>; Faseela K <faseela.k@...>; integration-dev@...; ovsdb-dev@...; genius-dev@...; K.V Suneelu Verma <k.v.suneelu.verma@...>
Subject: Re: [integration-dev] Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

 

 

On Mar 23, 2018 12:20 AM, "B Sathwik" <b.sathwik@...> wrote:

Vishal,

   Suneelu was asking for the ovsdb TRACE logs for the 3 node CSIT runs.

   I need to know how to enable the same while running 3 node CSIT jobs in sandbox.

In sandbox, simply add your custom trace settings in the CONTROLLERDEBUGMAP param, like ovsdb:TRACE. Read the comment on that parameter. You can add multiple log settings.
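
As a rough illustration of what such a setting expands to, here is a minimal sketch. It assumes the parameter value is a space-separated list of module:LEVEL pairs and that module names are prefixed with org.opendaylight.; neither detail is confirmed in this thread, so check the comment on the parameter. The helper names and the genius:DEBUG entry are hypothetical; log:set is the standard Karaf console command.

```python
from typing import List, Tuple


def parse_debug_map(debug_map: str) -> List[Tuple[str, str]]:
    """Split 'ovsdb:TRACE genius:DEBUG' into (module, LEVEL) pairs."""
    pairs = []
    for entry in debug_map.split():
        module, sep, level = entry.partition(":")
        if not sep or not module or not level:
            raise ValueError(f"expected module:LEVEL, got {entry!r}")
        pairs.append((module, level.upper()))
    return pairs


def karaf_log_commands(debug_map: str,
                       prefix: str = "org.opendaylight.") -> List[str]:
    """Build one Karaf 'log:set LEVEL logger' command per entry.

    The prefix is an assumption about how module names map to loggers.
    """
    return [f"log:set {level} {prefix}{module}"
            for module, level in parse_debug_map(debug_map)]


for command in karaf_log_commands("ovsdb:TRACE genius:DEBUG"):
    print(command)
```

Running the same commands by hand in the Karaf console of each controller node should have an equivalent effect on a local cluster.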

 

Any pointers ?

 

Regards

Sathwik

From: Vishal Thapar
Sent: Thursday, March 22, 2018 5:39 PM
To: Faseela K <faseela.k@...>

Subject: RE: Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Hi Faseela,

 

I didn’t say that the issue will not occur if we enhance Genius 3 node CSIT, only that Genius 3 node CSIT isn’t configured like an actual cluster deployment.

 

Yes, there is an issue in OVSDB with disconnect/connect in rapid succession [1], and that is the issue we’re hitting in Genius 3 node CSIT. The issue is not with create/delete of the bridge but with connect/disconnect on the OVSDB channel. Suneelu had fixed it for HWVTEP, but there were some open questions for OVSDB and the fix was still under discussion. It would be good to revive this discussion at DDF, where Jamo and Anil will both be present. If we have an OVSDB 3 node CSIT where we can reproduce this reliably, we can try 2-3 options and test which one works.

 

The reason it is intermittent is that the issue depends on EOS (Entity Ownership Service). If the switch connects to a node that isn’t the leader for the OVSDB instance, you run into this issue. Also, there are these exceptions; I’m not sure whether they are a cause or an effect of the issue.

 

Caused by: java.lang.IllegalArgumentException: Metadata not available for modification NodeModification [identifier=(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry, modificationType=TOUCH, childModification={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry[{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)target=tcp:10.30.170.65:6640}]=NodeModification [identifier=(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)manager-entry[{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)target=tcp:10.30.170.65:6640}], modificationType=DELETE, childModification={}]}]

 

 

Regards,

Vishal.

 

[1] https://lists.opendaylight.org/pipermail/ovsdb-dev/2018-February/004567.html

 

From: Faseela K
Sent: 22 March 2018 15:19
To: Vishal Thapar <vishal.thapar@...>
Cc: integration-dev@...; ovsdb-dev@...; B Sathwik <b.sathwik@...>; genius-dev@...
Subject: Genius CSIT intermittent 3 node failures due to OVSDB reconnect and connect issue

 

Hi Vishal,

 

    As we have already discussed, genius 3 node CSIT is randomly failing because the bridge does not show up in the topology/operational DS on delete and create of the bridge.

    You were indicating that the clustered CSIT of genius will need some enhancements (add HAPROXY?) so that this issue will not occur.

    Could you please give pointers to Sathwik, so that he can start looking into it?

    Also, even if we don’t use HAPROXY and just delete and create a bridge, why does the ovsdb plugin have an issue detecting the same?

    https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/genius-csit-verify-3node-upstream/220/robot-plugin/log.html.gz

 

Thanks,

Faseela


_______________________________________________
integration-dev mailing list
integration-dev@...
https://lists.opendaylight.org/mailman/listinfo/integration-dev

 



 


Jamo Luhrsen <jluhrsen@...>
 

On 4/27/18 11:39 AM, Faseela K wrote:
Sam was mentioning in the last genius weekly meeting that there is a JIRA already for this, and Suneelu is working on it.
@Vishal: Could you please share the JIRA?
We are hitting the issue intermittently, and when we try to debug with ovsdb TRACE logs, it never happens.

Has this been tested multiple times with TRACE logs enabled without ever hitting
the issue? If so, that leads me to believe some race condition is happening
so perfectly that the little bit of slowdown we get with extra logging
is enough to avoid it. fun :)

JamO

Thanks,
Faseela
_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev


Jamo Luhrsen <jluhrsen@...>
 

re-sending with new email for Vishal

On 4/27/18 1:46 PM, Jamo Luhrsen wrote:
On 4/27/18 11:39 AM, Faseela K wrote:
Sam was mentioning in last genius weekly meeting that there is a JIRA already for this, and Suneelu is working on it.

@Vishal : Could you please share the JIRA?

We are hitting the issue intermittently, and when we try to debug with ovsdb TRACE logs, it never happens.

Has this been tested multiple times with TRACE logs enabled without ever hitting
the issue? If so, that leads me to believe some race condition is happening
so perfectly that the little bit of slowdown we get with extra logging
is enough to avoid it. fun :)
JamO

Thanks,

Faseela
