[ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS


Sam Hague
 



On Thu, Mar 29, 2018, 1:49 PM Luis Gomez <ecelgp@...> wrote:

> On Mar 29, 2018, at 11:18 AM, Jamo Luhrsen <jluhrsen@...> wrote:
>
> Since the beginning the idea of compatible-with-all is just that. If your feature
> is in that list, then we should be able to have it installed and do anything we
> want with any other feature that is installed.
>
> at some point, there were features that were found to be compatible in a single
> node installation, but not in a clustered installation. That was messy and it
> sounds like OFP is still working around that by not installing all in 3node jobs.
>
> personally, I prefer that we keep the noise and let these things happen so bugs
> can be found and fixed. I also know that others prefer it the other way where
> we only bring in the exact things we care about. Doing the latter means we will
> miss bugs.

Fair enough, any project is free to test whatever it sees fit. For OFP I do not think we are at a point where the cluster is stable with just the OFP feature, so there is no point in adding more noise.
Same thinking: too much noise. Once we get basic 3node working we can add in -all if it makes sense, but right now it doesn't make sense to test a huge blob of unrelated features for use cases that may never happen.

>
> JamO
>
> On 3/28/18 10:11 PM, Vishal Thapar wrote:
>> Hi Jamo,
>> It did help uncover the problem, but I’d say the main question is: when we say compatible-with-all, does it inherently imply compatibility in a cluster setup too? I'm not sure how many of the compatible-with-all features even have a cluster job running. Maybe we want to make 3-node cluster CSIT a requirement for compatible-with-all?
>> Sam,
>> I believe Genius 3-node is just re-using the 3node CSIT, and probably something needs to be tweaked to use -only instead of -all. We don’t need and shouldn’t be using -all.
>> Regards,
>> Vishal.
>> *From:* Sam Hague [mailto:shague@...]
>> *Sent:* 29 March 2018 01:41
>> *To:* Jamo Luhrsen <jluhrsen@...>
>> *Cc:* Vishal Thapar <vishal.thapar@...>; Michael Vorburger <vorburger@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
>> *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
>> On Wed, Mar 28, 2018 at 12:26 PM, Jamo Luhrsen <jluhrsen@...> wrote:
>>    Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
>>    probably found it easier *because* we are running the -all- job.
>> I don't know if it has anything to do with the -all job; my original questions were just why we are
>> using -all and why sfc is in a -all distro:
>> - We know 3node has issues, as Luis mentioned, so I thought no one used -all in the 3node jobs.
>> - sfc ovs is firing in this job, but I don't know how anything using ovs can truly be independent from
>> other bundles. Maybe this SfcOvsNodeDataListener is agnostic, but it still seems problematic. Genius
>> is supposed to allow coexistence, so maybe this is OK. But moving away from the -all distro would
>> also make this issue go away.
>>    JamO
>>    On 3/27/18 10:21 PM, Vishal Thapar wrote:
>>        Missed Luis’ mail suggesting we not use -all for 3-node tests. I think we should prioritize that ASAP.
>>        Regards,
>>        Vishal.
>>        *From:* ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On Behalf Of* Vishal Thapar
>>        *Sent:* 28 March 2018 10:50
>>        *To:* Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
>>        *Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
>>        *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
>>        Hi Sam, Michael,
>>        This is after Tim’s fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix
>>        Genius CSIT to use the Genius-only distribution instead of -all. The issue is coming from a deadlock in
>>        bgpcep start/close. Adding more of the stack trace:
>>        2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 (owned by opendaylight-cluster-data-notification-dispatcher-55):
>>                  at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)
>>                  at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)
>>                  at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)
>>                  at java.util.ArrayList.forEach(ArrayList.java:1257)
>>        2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(
>>        2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):
>>                   at sun.misc.Unsafe.park(Native Method)
>>                   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>>                   at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>>                   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>>                   at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>>                   at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
>>                   at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>>                   at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
>>                   at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
>>                   at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
>>                   at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
>>                   at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
>>                   at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)
>>        So technically bgpcep is not coming up correctly in -all, but since autorelease/distribution checks don’t test
>>        for functionality, it wasn’t caught until Genius ran a CSIT with -all. It could also be some poorly written
>>        start/close in BGPCEP; Michael can probably give more input on this. Then we can decide what to raise a bug on.
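[Editor's note] The two traces quoted earlier are a textbook lock-order inversion: one dispatcher thread holds the singleton group's fair ReentrantLock and then needs the service instance's monitor (instantiateServiceInstance), while another thread holds that monitor (close()) and then needs the group lock (unregisterService). The sketch below is a hypothetical, minimal Java reduction of that pattern; the class and member names are invented and this is not the actual mdsal/bgpcep code. `tryLock` with a timeout is used only so the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical reduction of the deadlock in the trace: thread A takes the
// group's fair ReentrantLock then needs the service monitor; thread B holds
// the service monitor then needs the group lock. Opposite acquisition order.
public class LockOrderSketch {
    static final ReentrantLock groupLock = new ReentrantLock(true); // fair, as in the trace
    static final Object serviceMonitor = new Object();              // stands in for the service instance

    public static boolean run() {
        AtomicBoolean inversionSeen = new AtomicBoolean(false);
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread a = new Thread(() -> {
            groupLock.lock();                   // "startServices()" path
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (serviceMonitor) { // "instantiateServiceInstance()": blocks until B releases it
                }
            } finally {
                groupLock.unlock();
            }
        });

        Thread b = new Thread(() -> {
            synchronized (serviceMonitor) {     // "close()" path holds the instance monitor
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                try {
                    // "unregisterService()" would call groupLock.lock() and hang forever;
                    // since A holds the lock and is parked on serviceMonitor, this must time out.
                    if (!groupLock.tryLock(1, TimeUnit.SECONDS)) {
                        inversionSeen.set(true);
                    } else {
                        groupLock.unlock();
                    }
                } catch (InterruptedException ignored) { }
            }
        });

        a.start(); b.start();
        joinQuietly(a); joinQuietly(b);
        return inversionSeen.get();
    }

    static void awaitQuietly(CountDownLatch l) {
        try { l.await(); } catch (InterruptedException ignored) { }
    }

    static void joinQuietly(Thread t) {
        try { t.join(); } catch (InterruptedException ignored) { }
    }

    public static void main(String[] args) {
        System.out.println("lock-order inversion demonstrated: " + run());
    }
}
```

The usual fix for this shape of bug is to never call listener/close callbacks while holding the internal lock, or to enforce a single acquisition order.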
>>        Thanks and Regards,
>>        Vishal.
>>        *From:* Michael Vorburger [mailto:vorburger@...]
>>        *Sent:* 27 March 2018 23:30
>>        *To:* Sam Hague <shague@...>
>>        *Cc:* Vishal Thapar <vishal.thapar@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...>; Tim Rozet <trozet@...>
>>        *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
>>        On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@...> wrote:
>>             Curious, but why is the genius 3node job bringing in sfc features? Is it because this is an integration-all
>>             test? Even so, I don't think it makes sense to do this.
>>        Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in the
>>        distribution includes *ALL* features.
>>        If you don't think that makes sense, add integration-dev to the reply and take it up with the maintainers of
>>        integration/distribution.
>>             On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:
>>                 On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:
>>                     Adding Tim.
>>                     Are the logs below from the original job Valentina mentioned, or the latest? The original failure
>>                     was from a bad fix which we did fix, so we need to understand if this is another issue.
>>                     On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:
>>                         Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing
>>                         lots of deadlock exceptions:
>>                 Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or
>>                 before?) that "Deadlocked thread stack trace: " ERROR log? From what little I understand so far,
>>                 this JIRA should probably be opened in project controller, although it could be caused by a mistake
>>                 made in application code. Can you please clarify exactly what caused this to start happening? If it's
>>                 a particular change, it would be interesting to look more closely at whether it does anything special
>>                 (wrong) re. locking and threading etc.
>>                 Tx,
>>                 M.
>>                 --
>>                 Michael Vorburger, Red Hat
>>                 vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch
>>                         2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl | 360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s, remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0, Starting=0, Active=557, Stopping=0, Failure=0}
>>                         2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 | SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add() bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]
>>                         2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(
>>                         2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):
>>                                  at sun.misc.Unsafe.park(Native Method)
>>                         https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz
>>                         Regards,
>>                         Vishal.
>>                         *From:* ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On Behalf Of* Sam Hague
>>                         *Sent:* 26 March 2018 21:17
>>                         *To:* Valentina Krasnobaeva <valentina.krasnobaeva@...>
>>                         *Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...>
>>                         *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
>>                         Yes, that patch was the issue. Here is a fix that should merge soon.
>>        https://git.opendaylight.org/gerrit/#/c/70106/
>>                         On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:
>>                             Hi OVSDB Team, Hi All
>>                             Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were
>>                             rebased), I always see this error in the ovsdb log
>>                             when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":
>>                             2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer | 60 - io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c, L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
>>                             java.lang.NullPointerException: null
>>                                      at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
>>                                      at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
>>                                      at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                             io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                                    io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                                    io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                                    io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>        io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                             io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>>                             [60:io.netty.common:4.1.16.Final]
>>                                      at
>>                                    io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
>>                             [60:io.netty.common:4.1.16.Final]
>>                                      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
>>                             [63:io.netty.transport:4.1.16.Final]
>>                                      at
>>                             io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
>>                             [60:io.netty.common:4.1.16.Final]
>>                                      at
>>                                    io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
>>                             [60:io.netty.common:4.1.16.Final]
>>                                      at java.lang.Thread.run(Thread.java:748) [?:?]
>>                             2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler | 60 - io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8, L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
>>                             ovsdb process:
>>                             root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring
>>        pid 16312
>>                             (healthy)
>>                             root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server
>>                             /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
>>                             --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key
>>                             --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
>>                             --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
>>                             --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>                             from its log:
>>                             2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
>>                             2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
>>                             2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
>>                             2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
>>                             2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
>>                             2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
>>                             2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
>>                             2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
>>                             2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
>>                             2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
>>                             2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
>>                             2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
>>                             and of course, there is no br-int on ovsdb:
>>                               # ovs-vsctl show
>>                             d3b8068d-fcfb-4654-b6fa-5fea41d9922d
>>                                  Manager "tcp:10.100.0.51:6640"
>>                                  ovs_version: "2.6.1"
>>                             I suppose this latest patch from ovsdb could have provoked the issue:
>>                             $ git show 77f0ef93babd
>>                             commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
>>                             Author: Tim Rozet <trozet@...>
>>                             Date:   Fri Mar 23 14:23:40 2018 -0400
>>                                  OVSDB-449 Fixes loading keystores dynamically per connection
>>                                  Currently when OVSDB manager is started with SSL, the SSL Context is
>>                                  only fetched once, which inherently means the keystores are only read
>>                                  that one time.  This patch changes the behavior so that the SSL Context
>>                                  is fetched everytime a new connection is opened, which in turn will
>>                                  update the keystore data.
>>                                  Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
>>                                  Signed-off-by: Tim Rozet <trozet@...>
>>                                  (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
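[Editor's note] As a hedged illustration of the pattern this commit describes (this is NOT the actual OvsdbConnectionService code; the class name, the Supplier, and the exception are invented stand-ins), fetching the SSL context per connection instead of caching it once picks up keystore changes, but it also means a null lookup result is now possible on every new connection, which is consistent with the NPE surfacing inside initChannel() in the trace above if the result is not guarded:

```java
import java.util.function.Supplier;

// Hypothetical sketch of "fetch the SSL context per connection" (OVSDB-449).
// The Supplier stands in for the keystore-backed SSLContext lookup.
public class PerConnectionSslContext {
    private final Supplier<Object> contextSource;

    public PerConnectionSslContext(Supplier<Object> contextSource) {
        this.contextSource = contextSource;
    }

    // Called once per incoming connection, so keystore changes are picked up.
    // A null result here is the kind of condition that shows up as an NPE
    // inside a Netty ChannelInitializer when it is not guarded.
    public Object contextForNewConnection() {
        Object ctx = contextSource.get(); // re-reads keystores every time
        if (ctx == null) {
            throw new IllegalStateException("SSL context unavailable for new connection");
        }
        return ctx;
    }
}
```

With the explicit guard, a bad keystore read fails the one connection with a clear message instead of an anonymous NullPointerException in the channel pipeline.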
>>                             Has anyone already had this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.
>>                             Thanks for any help,
>>                             Regards
>>                             _______________________________________________
>>                             ovsdb-dev mailing list
>>                             ovsdb-dev@...
>>                             https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev


Luis Gomez
 

On Mar 29, 2018, at 11:18 AM, Jamo Luhrsen <jluhrsen@...> wrote:

Since the beginning the idea of compatible-with-all is just that. If your feature
is in that list, then we should be able to have it installed and do anything we
want with any other feature that is installed.

at some point, there were features that were found to be compatible in a single
node installation, but not in a clustered installation. That was messy and it
sounds like OFP is still working around that by not installing all in 3node jobs.

personally, I prefer that we keep the noise and let these things happen so bugs
can be found and fixed. I also know that others prefer it the other way where
we only bring in the exact things we care about. Doing the latter means we will
miss bugs.
Fair enough, any project is free to test whatever it feels. For OFP I do not think we are in a point the cluster is stable with just OFP feature so no point to add more noise.


JamO

On 3/28/18 10:11 PM, Vishal Thapar wrote:
Hi Jamo,
It did help uncover the problem, but I’d say main question is when we say compatible-with-all, does it inherently imply compatibility in a cluster setup too? Not sure how many of compatible with all even have a cluster job running. Maybe we want to make 3 node cluster csit a requirement for compatible with all?
Sam,
I believe Genius 3-node is just re-using the 3node csit and probably something needs to be tweaked to not use all but -only. We don’t need and and shouldn’t be using -all.
Regards,
Vishal.
*From:*Sam Hague [mailto:shague@...]
*Sent:* 29 March 2018 01:41
*To:* Jamo Luhrsen <jluhrsen@...>
*Cc:* Vishal Thapar <vishal.thapar@...>; Michael Vorburger <vorburger@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
On Wed, Mar 28, 2018 at 12:26 PM, Jamo Luhrsen <jluhrsen@... <mailto:jluhrsen@...>> wrote:
Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
probably found it easier *because* we are running the -all- job.
I don't know if it has anything to do with the -all job - my original questions were just why are
we using -all and why is sfc in a -all distro.
- We know 3node has issues, as Luis mentioned, so I thought no one used -all in the 3node jobs
- sfc ovs is firing in this job, but I don't know how anything using ovs can truly be independent from
other bundles. Maybe this SfcOvsNodeDataListener is agnostic but it still seems problematic. Genius
is supposed to allow coexistence so maybe this is OK. But moving away from the -all distro would
also make this issue go away.
JamO
On 3/27/18 10:21 PM, Vishal Thapar wrote:
Missed Luis’ mail, which suggests not using -all for 3 node tests. I think we should prioritize that ASAP.
Regards,
Vishal.
*From:*ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On Behalf Of *Vishal Thapar
*Sent:* 28 March 2018 10:50
*To:* Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
Hi Sam, Michael,
This is post Tim’s fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix Genius CSIT
to not use -all but Genius only. The issue is coming from a deadlock in bgpcep start/close. Adding more of the
stack trace:
2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 -
org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked
thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on
org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 (owned by
opendaylight-cluster-data-notification-dispatcher-55):
at
org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)
at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)
at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1257)
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 -
org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose -
there are 3 deadlocked threads!! :-(
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 -
org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked
thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on
java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by
opendaylight-cluster-data-akka.actor.default-dispatcher-2):
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
at
org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
at
org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown
Source)
So, technically bgpcep is not coming up correctly in -all, but since autorelease/distribution checks don’t test
for functionality it wasn’t caught until Genius ran a CSIT with -all. It could also be some poorly written start/close in
BGPCEP; probably Michael can give more input on this. Then we can decide what to raise a bug on.
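For what it's worth, the two "Deadlocked thread stack trace" errors above read as a classic lock-ordering inversion: the akka dispatcher owns the singleton group's ReentrantLock and is blocked entering ProgrammingServiceImpl's monitor in instantiateServiceInstance(), while the notification dispatcher owns that monitor (inside close()) and is blocked on the ReentrantLock in unregisterService(). Here is a self-contained sketch, not OpenDaylight code (all names are illustrative), that reproduces that shape and detects it via ThreadMXBean, the same JVM facility a watcher thread like infrautils' ThreadsWatcher can rely on:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockSketch {

    /** Builds the two-thread cycle and returns how many deadlocked threads the JVM reports. */
    public static int demonstrate() throws InterruptedException {
        final ReentrantLock groupLock = new ReentrantLock(true); // stands in for the singleton group's fair lock
        final Object serviceMonitor = new Object();              // stands in for ProgrammingServiceImpl's monitor
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Mirrors the akka dispatcher: owns the group lock, then blocks on the monitor.
        Thread akka = new Thread(() -> {
            groupLock.lock();
            bothHoldFirstLock.countDown();
            awaitQuietly(bothHoldFirstLock);
            synchronized (serviceMonitor) { } // never acquired: the other thread owns it
        }, "akka-dispatcher-sketch");

        // Mirrors the notification dispatcher: owns the monitor, then blocks on the group lock.
        Thread notification = new Thread(() -> {
            synchronized (serviceMonitor) {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                groupLock.lock(); // never acquired: the other thread owns it
            }
        }, "notification-dispatcher-sketch");

        akka.setDaemon(true);         // daemon threads so the JVM can still exit despite the deadlock
        notification.setDaemon(true);
        akka.start();
        notification.start();

        // Poll for up to ~5 s; findDeadlockedThreads() covers cycles through both
        // object monitors and ownable synchronizers such as ReentrantLock.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads found: " + demonstrate());
    }
}
```

The usual fix for this shape is to establish a single acquisition order, or to avoid calling out to application code (here, instantiateServiceInstance/close) while holding the infrastructure lock.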
Thanks and Regards,
Vishal.
*From:*Michael Vorburger [mailto:vorburger@...]
*Sent:* 27 March 2018 23:30
*To:* Sam Hague <shague@...>
*Cc:* Vishal Thapar <vishal.thapar@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@...> wrote:
Curious, but why is the genius 3node bringing in sfc features? Is it because this is an integration-all
test? Even so, I don't think it makes sense to do this.
Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in
distribution includes *ALL* features.
If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of
integration/distribution.
On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:
On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:
Adding Tim.
Are the logs below from the original job Valentina mentioned or the latest? The original failure was from a
bad fix which we did fix, so we need to understand if this is another issue.
On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:
I don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing
lots of deadlock exceptions:
Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or
before?) that
"Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should
probably be
opened in the controller project. Although it could be caused by a mistake made in application code. Can
you please
clarify exactly what caused this to start happening? If it's a particular change, it would be
interesting to
look more closely at whether it does anything special (wrong) re. locking and threading etc.
Tx,
M.
--
Michael Vorburger, Red Hat
vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch
2018-03-27T05:53:29,133 | INFO | awaitility[checkBundleDiagInfos] |
SystemReadyImpl |
360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s,
remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0,
Starting=0, Active=557, Stopping=0, Failure=0}
2018-03-27T05:53:29,245 | INFO | opendaylight-cluster-data-notification-dispatcher-66 |
SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT |
SfcOvsNodeDataListener::add()
bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]
2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked
threads!! :-(
2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace:
opendaylight-cluster-data-notification-dispatcher-55 locked on
java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by
opendaylight-cluster-data-akka.actor.default-dispatcher-2):
at sun.misc.Unsafe.park(Native Method)
https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz
Regards,
Vishal.
*From:*ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On
Behalf Of *Sam Hague
*Sent:* 26 March 2018 21:17
*To:* Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
OXYGEN/Fluorine) in attempt to connect to OVS
Yes, that patch was the issue. Here is a fix that should merge soon.
https://git.opendaylight.org/gerrit/#/c/70106/
On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:
Hi OVSDB Team, Hi All
Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I
always see this error in the ovsdb log
when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":
2018-03-26T16:37:14,404 | WARN | nioEventLoopGroup-5-1 | ChannelInitializer | 60 -
io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c,
L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
at
org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325)
[400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
at
org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321)
[400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
[63:io.netty.transport:4.1.16.Final]
at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
[63:io.netty.transport:4.1.16.Final]
at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
[63:io.netty.transport:4.1.16.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
[63:io.netty.transport:4.1.16.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
[60:io.netty.common:4.1.16.Final]
at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
[60:io.netty.common:4.1.16.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
[63:io.netty.transport:4.1.16.Final]
at
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
[60:io.netty.common:4.1.16.Final]
at
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
[60:io.netty.common:4.1.16.Final]
at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO | nioEventLoopGroup-4-1 | LoggingHandler | 60 -
io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8,
L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
ovsdb process:
root 16311 0.0 0.0 28768 2152 ? S<s 17:05 0:00 ovsdb-server: monitoring
pid 16312
(healthy)
root 16312 0.0 0.0 29040 5304 ? S< 17:05 0:00 \_ ovsdb-server
/etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
--remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key
--certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
--no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
--pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
from its log:
2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
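As an aside, the growing waits in that log (then 2 seconds, then 4 seconds) are ovsdb-server's standard doubling reconnect backoff. A rough sketch of that policy, where the 1 s floor and 8 s cap are illustrative assumptions rather than values read from this log:

```java
// Hypothetical sketch of the doubling backoff visible in the ovsdb-server log
// above: each failed connection attempt doubles the wait, up to a cap, and a
// stable connection resets the schedule.
public class ReconnectBackoff {
    private final long minMillis;
    private final long maxMillis;
    private long backoffMillis;

    public ReconnectBackoff(long minMillis, long maxMillis) {
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        this.backoffMillis = 0; // 0 means "no failure seen yet"
    }

    /** Next wait before reconnecting, doubling after every failure. */
    public long nextBackoffMillis() {
        backoffMillis = (backoffMillis == 0)
                ? minMillis
                : Math.min(backoffMillis * 2, maxMillis);
        return backoffMillis;
    }

    /** A successful, stable connection resets the schedule. */
    public void connectionSucceeded() {
        backoffMillis = 0;
    }

    public static void main(String[] args) {
        ReconnectBackoff backoff = new ReconnectBackoff(1000, 8000);
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + backoff.nextBackoffMillis() + " ms");
        }
    }
}
```

With a 1000 ms floor and 8000 ms cap this yields 1000, 2000, 4000, 8000, 8000 ms - the same doubling shape as the 2 s and 4 s waits in the log.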
and of course, there is no br-int on ovsdb:
# ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
Manager "tcp:10.100.0.51:6640"
ovs_version: "2.6.1"
I suppose that this latest patch in ovsdb could provoke the issue:
$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date: Fri Mar 23 14:23:40 2018 -0400
OVSDB-449 Fixes loading keystores dynamically per connection
Currently when OVSDB manager is started with SSL, the SSL Context is
only fetched once, which inherently means the keystores are only read
that one time. This patch changes the behavior so that the SSL Context
is fetched everytime a new connection is opened, which in turn will
update the keystore data.
Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
Signed-off-by: Tim Rozet <trozet@...>
(cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
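The pattern the commit message describes can be sketched roughly as follows. This is illustrative code, not the actual OvsdbConnectionService implementation (the class and method names here are invented): the point is that the SSL context is looked up through a supplier on every new connection so refreshed keystores take effect, and a null result is handled explicitly rather than being allowed to surface as a NullPointerException inside the channel initializer, as in the trace above.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;
import java.util.function.Supplier;

// Illustrative sketch of "fetch the SSL context per connection" with an
// explicit guard for the "no context available" case.
public class PerConnectionSslContextProvider {
    // Re-evaluated on every connection; e.g. rebuilds from the current
    // keystore files. May legitimately return null before SSL is configured.
    private final Supplier<SSLContext> contextSource;

    public PerConnectionSslContextProvider(Supplier<SSLContext> contextSource) {
        this.contextSource = contextSource;
    }

    /** Called once per incoming connection; empty means "no SSL available yet". */
    public Optional<SSLContext> contextForNewConnection() {
        return Optional.ofNullable(contextSource.get());
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Without keystores the caller sees an empty Optional instead of an NPE.
        PerConnectionSslContextProvider noKeystores =
                new PerConnectionSslContextProvider(() -> null);
        System.out.println("without keystores: " + noKeystores.contextForNewConnection().isPresent());

        // Once a context can be built, each new connection picks it up.
        SSLContext ctx = SSLContext.getDefault();
        PerConnectionSslContextProvider configured =
                new PerConnectionSslContextProvider(() -> ctx);
        System.out.println("with keystores: " + configured.contextForNewConnection().isPresent());
    }
}
```

If the real initChannel() assumed a non-null context, a supplier that starts returning null per connection would produce exactly this kind of channel-initialization failure.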
Has anyone already had this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.
Thanks for any help,
Regards
_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev


Jamo Luhrsen <jluhrsen@...>
 

Since the beginning the idea of compatible-with-all is just that. If your feature
is in that list, then we should be able to have it installed and do anything we
want with any other feature that is installed.

at some point, there were features that were found to be compatible in a single
node installation, but not in a clustered installation. That was messy and it
sounds like OFP is still working around that by not installing all in 3node jobs.

personally, I prefer that we keep the noise and let these things happen so bugs
can be found and fixed. I also know that others prefer it the other way where
we only bring in the exact things we care about. Doing the latter means we will
miss bugs.

JamO

On 3/28/18 10:11 PM, Vishal Thapar wrote:
Hi Jamo,
It did help uncover the problem, but I’d say main question is when we say compatible-with-all, does it inherently imply compatibility in a cluster setup too? Not sure how many of compatible with all even have a cluster job running. Maybe we want to make 3 node cluster csit a requirement for compatible with all?
Sam,
I believe Genius 3-node is just re-using the 3node csit and probably something needs to be tweaked to not use all but -only. We don’t need and and shouldn’t be using -all.
Regards,
Vishal.
*From:*Sam Hague [mailto:shague@...]
*Sent:* 29 March 2018 01:41
*To:* Jamo Luhrsen <jluhrsen@...>
*Cc:* Vishal Thapar <vishal.thapar@...>; Michael Vorburger <vorburger@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
On Wed, Mar 28, 2018 at 12:26 PM, Jamo Luhrsen <jluhrsen@... <mailto:jluhrsen@...>> wrote:
Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
probably found it easier *because* we are running the -all- job.
I don't know if it has anything to do with the -all job - my original questions were just why are
we using -all and why is sfc in a -all distro
- We know 3node has issues so as Luis mentioned, so I thought no one used -all in the 3node jobs
- sfc ovs is firing in this job, but I don't know how anything using ovs can truly be independent from
other bundles. Maybe this sfcovsdblistener is agnostic but it still seems problematic. genius
is supposed to allow coexistence so maybe this is OK. But moving away from -all distro would
also make this issue go away.
JamO
On 3/27/18 10:21 PM, Vishal Thapar wrote:
Missed Luis’ mail who suggests not using -all for 3 node tests. I think we should prioritize that ASAP.
Regards,
Vishal.
*From:*ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>
[mailto:ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>] *On Behalf
Of *Vishal Thapar
*Sent:* 28 March 2018 10:50
*To:* Michael Vorburger <vorburger@... <mailto:vorburger@...>>; Sam Hague <shague@...
<mailto:shague@...>>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...>>; Guillaume
Gaudonville <guillaume.gaudonville@... <mailto:guillaume.gaudonville@...>>;
<ovsdb-dev@... <mailto:ovsdb-dev@...>> <ovsdb-dev@...
<mailto:ovsdb-dev@...>>; Tim Rozet <trozet@... <mailto:trozet@...>>; Valentina
Krasnobaeva <valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
OXYGEN/Fluorine) in attempt to connect to OVS
Hi Sam, Michael,
This is post Tim’s fix and is different issue. Yes, Genius CSIT is using -all, probably need to fix Genius CSIT
to not use -all but Genius only. Issue is coming from some deadlock in bgpcep start/close. Adding more of the
stack trace:
2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 -
org.opendaylight.infrautils.me <http://org.opendaylight.infrautils.me>trics-impl - 1.4.0.SNAPSHOT | Deadlocked
thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on
org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442
<mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442>
<mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442
<mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442>> (owned by
opendaylight-cluster-data-notification-dispatcher-55):
         at
org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)
         at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)
         at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)
         at java.util.ArrayList.forEach(ArrayList.java:1257)
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 -
org.opendaylight.infrautils.me <http://org.opendaylight.infrautils.me>trics-impl - 1.4.0.SNAPSHOT | Oh nose -
there are 3 deadlocked threads!! :-(
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 -
org.opendaylight.infrautils.me <http://org.opendaylight.infrautils.me>trics-impl - 1.4.0.SNAPSHOT | Deadlocked
thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on
java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86
<mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86>
<mailto:java.util.concurrent.locks.ReentrantLock
<mailto:java.util.concurrent.locks.ReentrantLock>$FairSync@2da5ef86> (owned by
opendaylight-cluster-data-akka.actor.default-dispatcher-2):
          at sun.misc.Unsafe.park(Native Method)
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
          at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
          at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
          at
org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
          at
org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
          at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.re
<http://1.re>moveRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
          at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
          at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
          at
org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown
Source)
So, technically bgpcep is not coming up in -all correctly but since autorelease/distribution check don’t test
for functionality it wasn’t caught till Genius ran a CSIT with -all. Could be some poorly written start/close in
BGPCEP also, probably Michael can give more inputs on this. Then we can decide what to raise bug on.
Thanks and Regards,
Vishal.
*From:*Michael Vorburger [mailto:vorburger@... <mailto:vorburger@...>]
*Sent:* 27 March 2018 23:30
*To:* Sam Hague <shague@... <mailto:shague@...> <mailto:shague@... <mailto:shague@...>>>
*Cc:* Vishal Thapar <vishal.thapar@... <mailto:vishal.thapar@...>
<mailto:vishal.thapar@... <mailto:vishal.thapar@...>>>; Valentina Krasnobaeva
<valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>
<mailto:valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>>; Kiran N Upadhyaya
<kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...> <mailto:kiran.n.upadhyaya@...
<mailto:kiran.n.upadhyaya@...>>>; Guillaume Gaudonville <guillaume.gaudonville@...
<mailto:guillaume.gaudonville@...> <mailto:guillaume.gaudonville@...
<mailto:guillaume.gaudonville@...>>>; <ovsdb-dev@...
<mailto:ovsdb-dev@...> <mailto:ovsdb-dev@...
<mailto:ovsdb-dev@...>>> <ovsdb-dev@...
<mailto:ovsdb-dev@...> <mailto:ovsdb-dev@...
<mailto:ovsdb-dev@...>>>; Tim Rozet <trozet@... <mailto:trozet@...>
<mailto:trozet@... <mailto:trozet@...>>>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
OXYGEN/Fluorine) in attempt to connect to OVS
On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@... <mailto:shague@...>
<mailto:shague@... <mailto:shague@...>>> wrote:
    Curious but why is the genius 3node bringing in sfc features, is it because this is an integration-all
test? Even
    so, I dont think it makes sense to do this.
Not 100% sure what you mean, but yes my understanding is that the distribution integration-all feature in
distribution includes *ALL* features.
If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of
integration/distribution..
    On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@... <mailto:vorburger@...>
<mailto:vorburger@... <mailto:vorburger@...>>> wrote:
        On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@... <mailto:shague@...>
<mailto:shague@... <mailto:shague@...>>> wrote:
            Adding Tim.
            Are the logs below from the original job valentina mentioned or the latest ? The original failure
was from a
            bad fix which we did fix, so need to understand if this is another issue.
            On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...
<mailto:vishal.thapar@...> <mailto:vishal.thapar@... <mailto:vishal.thapar@...>>>
            wrote:
                Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing
lots of
                Deadlock Exceptions:
        Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or
before?) that
        "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should
probably be
        openend in project controller. Although it could be caused by a mistake made in application code. Can
you please
        clarify exactly what caused this to start happening? If it's a particular change, it would be
interesting to
        look more closely at it does anything special (wrong) re. locking and threading etc.
        Tx,
        M.
        --
        Michael Vorburger, Red Hat
vorburger@... <mailto:vorburger@...> <mailto:vorburger@... <mailto:vorburger@...>> |
IRC: vorburger @freenode | ~ = http://vorburger.ch
        <http://vorburger.ch/>
                2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  |
                360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s,
                remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0,
                Starting=0, Active=557, Stopping=0, Failure=0}

                2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 |
                SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add()
                bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

                2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked
                threads!! :-(

                2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace:
                opendaylight-cluster-data-notification-dispatcher-55 locked on
                java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by
                opendaylight-cluster-data-akka.actor.default-dispatcher-2):

                         at sun.misc.Unsafe.park(Native Method)
                https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

                Regards,
                Vishal.

                *From:*ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>
                [mailto:ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>] *On
                Behalf Of *Sam Hague
                *Sent:* 26 March 2018 21:17
                *To:* Valentina Krasnobaeva <valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>
                *Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...>>;
                Guillaume Gaudonville <guillaume.gaudonville@... <mailto:guillaume.gaudonville@...>>;
                <ovsdb-dev@... <mailto:ovsdb-dev@...>>
                <ovsdb-dev@... <mailto:ovsdb-dev@...>>
                *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
                OXYGEN/Fluorine) in attempt to connect to OVS
                Yes, that patch was the issue. Here is a fix that should merge soon.

                https://git.opendaylight.org/gerrit/#/c/70106/

                On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...
                <mailto:valentina.krasnobaeva@...>> wrote:

                    Hi OVSDB Team, Hi All

                    Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I
                    always see this error in the ovsdb log,
                    when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":

                    2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 -
                    io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c,
                    L:/10.100.0.51:6640 - R:/10.100.0.11:34774]

                    java.lang.NullPointerException: null
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
                    [63:io.netty.transport:4.1.16.Final]
                             at
io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
                    [63:io.netty.transport:4.1.16.Final]
                             at
io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
                    [63:io.netty.transport:4.1.16.Final]
                             at
io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
                    [63:io.netty.transport:4.1.16.Final]
                             at
io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
                    [60:io.netty.common:4.1.16.Final]
                             at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
                    [60:io.netty.common:4.1.16.Final]
                             at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
                    [60:io.netty.common:4.1.16.Final]
                             at
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
                    [60:io.netty.common:4.1.16.Final]
                             at java.lang.Thread.run(Thread.java:748) [?:?]
                    2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 -
                    io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8,
                    L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
                    ovsdb process:
                    root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring
pid 16312
                    (healthy)
                    root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server
                    /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
                    --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key
                    --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
                    --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
                    --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
                    from its log:
                    2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
                    2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
                    2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
                    2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
                    2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
                    2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
                    2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
                    2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
                    2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
                    2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
                    2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
                    2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
                    and of course, there is no br-int on ovsdb:
                      # ovs-vsctl show
                    d3b8068d-fcfb-4654-b6fa-5fea41d9922d
                         Manager "tcp:10.100.0.51:6640"
                         ovs_version: "2.6.1"
                    I suppose, that this latest patch from ovsdb could provoke the issue:
                    $ git show 77f0ef93babd
                    commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
                    Author: Tim Rozet <trozet@... <mailto:trozet@...>>
                    Date:   Fri Mar 23 14:23:40 2018 -0400
                         OVSDB-449 Fixes loading keystores dynamically per connection
                         Currently when OVSDB manager is started with SSL, the SSL Context is
                         only fetched once, which inherently means the keystores are only read
                         that one time.  This patch changes the behavior so that the SSL Context
                         is fetched everytime a new connection is opened, which in turn will
                         update the keystore data.
                         Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
                         Signed-off-by: Tim Rozet <trozet@... <mailto:trozet@...>>
                         (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
                    Has anyone already had this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.
                    Thanks for any help,
                    Regards
                    _______________________________________________
                    ovsdb-dev mailing list
ovsdb-dev@... <mailto:ovsdb-dev@...>
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev
_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@... <mailto:ovsdb-dev@...>
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev


Vishal Thapar <vishal.thapar@...>
 

Hi Jamo,

 

It did help uncover the problem, but I’d say the main question is: when we say compatible-with-all, does it inherently imply compatibility in a cluster setup too? I'm not sure how many compatible-with-all projects even have a cluster job running. Maybe we want to make 3-node cluster CSIT a requirement for compatible-with-all?

 

Sam,

I believe Genius 3-node is just re-using the 3node csit, and probably something needs to be tweaked to use -only rather than -all. We don’t need and shouldn’t be using -all.

 

Regards,

Vishal.

 

From: Sam Hague [mailto:shague@...]
Sent: 29 March 2018 01:41
To: Jamo Luhrsen <jluhrsen@...>
Cc: Vishal Thapar <vishal.thapar@...>; Michael Vorburger <vorburger@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

 

 

On Wed, Mar 28, 2018 at 12:26 PM, Jamo Luhrsen <jluhrsen@...> wrote:

Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
probably found it easier *because* we are running the -all- job.

I don't know if it has anything to do with the -all job - my original questions were just why are we using -all and why sfc is in an -all distro:

- We know 3node has issues, so as Luis mentioned, I thought no one used -all in the 3node jobs.

- sfc ovs is firing in this job, but I don't know how anything using ovs can truly be independent from other bundles. Maybe this sfc ovsdb listener is agnostic, but it still seems problematic. genius is supposed to allow coexistence, so maybe this is OK. But moving away from the -all distro would also make this issue go away.


JamO

On 3/27/18 10:21 PM, Vishal Thapar wrote:

Missed Luis’ mail, which suggests not using -all for 3-node tests. I think we should prioritize that ASAP.

Regards,

Vishal.

*From:*ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On Behalf Of *Vishal Thapar
*Sent:* 28 March 2018 10:50
*To:* Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

Hi Sam, Michael,

This is post Tim’s fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix Genius CSIT to not use -all but Genius only. The issue is coming from some deadlock in bgpcep start/close. Adding more of the stack trace:

2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 <mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442> (owned by opendaylight-cluster-data-notification-dispatcher-55):

         at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)

         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)

         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)

         at java.util.ArrayList.forEach(ArrayList.java:1257)

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 <mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86> (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

          at sun.misc.Unsafe.park(Native Method)

          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)

          at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)

          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)

          at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)

          at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)

          at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)

          at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)

          at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)

So, technically bgpcep is not coming up correctly in -all, but since autorelease/distribution check don’t test for functionality, it wasn’t caught till Genius ran a CSIT with -all. It could also be some poorly written start/close in BGPCEP; Michael can probably give more input on this. Then we can decide what to raise a bug on.
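The two stack traces above form a classic ABBA lock-ordering cycle: the instantiateServiceInstance path holds the singleton group's fair ReentrantLock and then calls into the service, while the close() path holds the service and then needs the group lock in unregisterService. A minimal, self-contained sketch of that cycle, detected the same way infrautils' ThreadsWatcher detects it (the lock and thread names are illustrative stand-ins, not the actual mdsal code):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockSketch {
    public static void main(String[] args) throws Exception {
        ReentrantLock groupLock = new ReentrantLock(true);   // stand-in for the singleton group's fair lock
        ReentrantLock serviceLock = new ReentrantLock(true); // stand-in for the service instance's lock
        CountDownLatch bothHeld = new CountDownLatch(2);     // ensures each thread holds its first lock

        Thread t1 = new Thread(() -> {
            groupLock.lock();            // startServices path: group lock first...
            bothHeld.countDown();
            await(bothHeld);
            serviceLock.lock();          // ...then the service -> blocks forever
        }, "startServices");
        Thread t2 = new Thread(() -> {
            serviceLock.lock();          // close() path: service lock first...
            bothHeld.countDown();
            await(bothHeld);
            groupLock.lock();            // ...then unregisterService needs the group lock -> blocks forever
        }, "closeService");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();

        // What a ThreadsWatcher-style monitor does: poll the JVM for deadlocked threads.
        // findDeadlockedThreads() covers ownable synchronizers such as ReentrantLock.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        for (int i = 0; i < 100 && ids == null; i++) {
            Thread.sleep(50);
            ids = mx.findDeadlockedThreads();
        }
        System.out.println("deadlocked threads: " + (ids == null ? 0 : ids.length));
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The fix for this shape of bug is usually to never invoke service callbacks (instantiate/close) while holding the group lock, or to always take the two locks in the same order.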

Thanks and Regards,

Vishal.

*From:*Michael Vorburger [mailto:vorburger@...]
*Sent:* 27 March 2018 23:30
*To:* Sam Hague <shague@... <mailto:shague@...>>
*Cc:* Vishal Thapar <vishal.thapar@... <mailto:vishal.thapar@...>>; Valentina Krasnobaeva <valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>; Kiran N Upadhyaya <kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...>>; Guillaume Gaudonville <guillaume.gaudonville@... <mailto:guillaume.gaudonville@...>>; <ovsdb-dev@... <mailto:ovsdb-dev@...>> <ovsdb-dev@... <mailto:ovsdb-dev@...>>; Tim Rozet <trozet@... <mailto:trozet@...>>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:

    Curious, but why is the genius 3node bringing in sfc features? Is it because this is an integration-all test? Even
    so, I don't think it makes sense to do this.

Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in the distribution includes *ALL* features.

If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution.

    On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@... <mailto:vorburger@...>> wrote:

        On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:

            Adding Tim.

            Are the logs below from the original job Valentina mentioned or the latest? The original failure was from a
            bad fix which we did fix, so we need to understand if this is another issue.

            On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@... <mailto:vishal.thapar@...>>
            wrote:

                Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of
                Deadlock Exceptions:

        Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that
        "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be
        opened in project controller, although it could be caused by a mistake made in application code. Can you please
        clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to
        look more closely at whether it does anything special (wrong) re: locking and threading etc.


        Tx,

        M.

        --

        Michael Vorburger, Red Hat
        vorburger@... <mailto:vorburger@...> | IRC: vorburger @freenode | ~ = http://vorburger.ch
        <http://vorburger.ch/>

                2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  |
                360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s,
                remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0,
                Starting=0, Active=557, Stopping=0, Failure=0}

                2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 |
                SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add()
                bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

                2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                                  | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked
                threads!! :-(

                2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                                  | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace:
                opendaylight-cluster-data-notification-dispatcher-55 locked on
                java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86
                <mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86> (owned by
                opendaylight-cluster-data-akka.actor.default-dispatcher-2):

                         at sun.misc.Unsafe.park(Native Method)

                https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

                Regards,

                Vishal.

                *From:*ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>
                [mailto:ovsdb-dev-bounces@... <mailto:ovsdb-dev-bounces@...>] *On
                Behalf Of *Sam Hague
                *Sent:* 26 March 2018 21:17
                *To:* Valentina Krasnobaeva <valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>
                *Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...>>;
                Guillaume Gaudonville <guillaume.gaudonville@... <mailto:guillaume.gaudonville@...>>;
                <ovsdb-dev@... <mailto:ovsdb-dev@...>>
                <ovsdb-dev@... <mailto:ovsdb-dev@...>>
                *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
                OXYGEN/Fluorine) in attempt to connect to OVS

                Yes, that patch was the issue. Here is a fix that should merge soon.

                https://git.opendaylight.org/gerrit/#/c/70106/

                On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...
                <mailto:valentina.krasnobaeva@...>> wrote:

                    Hi OVSDB Team, Hi All

                    Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I
                    always see this error in the ovsdb log,

                    when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640 <http://10.100.0.51:6640>":

                    2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 -
                    io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c,
                    L:/10.100.0.51:6640 <http://10.100.0.51:6640> - R:/10.100.0.11:34774 <http://10.100.0.11:34774>]


                    java.lang.NullPointerException: null
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
                    [60:io.netty.common:4.1.16.Final]
                             at
                    io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
                    [60:io.netty.common:4.1.16.Final]
                             at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
                    [60:io.netty.common:4.1.16.Final]
                             at
                    io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
                    [60:io.netty.common:4.1.16.Final]
                             at java.lang.Thread.run(Thread.java:748) [?:?]
                    2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 -
                    io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8,

                    L:/10.100.0.51:6640 <http://10.100.0.51:6640> - R:/10.100.0.11:34776 <http://10.100.0.11:34776>]

                    ovsdb process:

                    root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312
                    (healthy)

 


Sam Hague
 



On Wed, Mar 28, 2018 at 12:26 PM, Jamo Luhrsen <jluhrsen@...> wrote:
Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
probably found it easier *because* we are running the -all- job.
I don't know if it has anything to do with the -all job - my original questions were just why we are
using -all and why sfc is in an -all distro:
- We know 3node has issues, so as Luis mentioned I thought no one used -all in the 3node jobs.
- sfc ovs is firing in this job, but I don't know how anything using ovs can truly be independent of
other bundles. Maybe this SfcOvsNodeDataListener is agnostic, but it still seems problematic. genius
is supposed to allow coexistence, so maybe this is OK. But moving away from the -all distro would
also make this issue go away.

JamO

On 3/27/18 10:21 PM, Vishal Thapar wrote:
Missed Luis's mail, in which he suggests not using -all for 3-node tests. I think we should prioritize that ASAP.

Regards,

Vishal.

*From:*ovsdb-dev-bounces@lists.opendaylight.org [mailto:ovsdb-dev-bounces@lists.opendaylight.org] *On Behalf Of *Vishal Thapar
*Sent:* 28 March 2018 10:50
*To:* Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...m>; Guillaume Gaudonville <guillaume.gaudonville@...om>; <ovsdb-dev@....org> <ovsdb-dev@....org>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...om>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

Hi Sam, Michael,

This is after Tim's fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix Genius CSIT to use only Genius features instead of -all. The issue is coming from some deadlock in bgpcep start/close. Adding more of the stack trace:

2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 <mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442> (owned by opendaylight-cluster-data-notification-dispatcher-55):

         at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)

         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)

         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)

         at java.util.ArrayList.forEach(ArrayList.java:1257)

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 <mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86> (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

          at sun.misc.Unsafe.park(Native Method)

          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)

          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)

          at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)

          at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)

          at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)

          at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)

          at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)

          at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)

          at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)

          at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)

So, technically bgpcep is not coming up correctly in -all, but since autorelease/distribution checks don't test for functionality, it wasn't caught until Genius ran a CSIT with -all. It could also be some poorly written start/close in BGPCEP; Michael can probably give more input on this. Then we can decide what to raise a bug on.
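[Editor's note] The two stack traces above are the classic AB/BA lock-ordering deadlock: the akka dispatcher thread holds the singleton group's ReentrantLock (inside startServices) and is blocked entering the synchronized ProgrammingServiceImpl, while the notification dispatcher holds the ProgrammingServiceImpl monitor (inside its synchronized close()) and is blocked acquiring the group lock (inside unregisterService). A minimal, self-contained sketch of the same pattern follows, detected with the same ThreadMXBean API that infrautils' ThreadsWatcher relies on; all class and thread names here are illustrative stand-ins, not the actual ODL code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockDemo {

    /** Provokes an AB/BA deadlock and returns how many threads the JVM reports deadlocked. */
    public static int countDeadlocked() throws InterruptedException {
        final Object serviceMonitor = new Object();               // stands in for the synchronized service
        final ReentrantLock groupLock = new ReentrantLock(true);  // stands in for the group's fair lock
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like startServices(): takes the group lock, then calls into the synchronized service.
        Thread starter = new Thread(() -> {
            groupLock.lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                synchronized (serviceMonitor) { }                 // blocks: the closer owns the monitor
            } finally {
                groupLock.unlock();
            }
        }, "akka-dispatcher-2");

        // Like close() -> unregisterService(): enters the synchronized service, then wants the group lock.
        Thread closer = new Thread(() -> {
            synchronized (serviceMonitor) {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                groupLock.lock();                                 // blocks: the starter owns the lock
                groupLock.unlock();
            }
        }, "notification-dispatcher-55");

        starter.setDaemon(true);                                  // daemons, so the JVM can still exit
        closer.setDaemon(true);
        starter.start();
        closer.start();

        // Same detection API that ThreadsWatcher uses; it covers monitors AND ReentrantLocks.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 50; i++) {
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(100);
        }
        return 0;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + countDeadlocked());
    }
}
```

The usual fix is to establish a single lock order (or hand the close() work off so it never holds the service monitor while taking the group lock).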

Thanks and Regards,

Vishal.

*From:*Michael Vorburger [mailto:vorburger@...]
*Sent:* 27 March 2018 23:30
*To:* Sam Hague <shague@... <mailto:shague@...>>
*Cc:* Vishal Thapar <vishal.thapar@... <mailto:vishal.thapar@ericsson.com>>; Valentina Krasnobaeva <valentina.krasnobaeva@...om <mailto:valentina.krasnobaeva@6wind.com>>; Kiran N Upadhyaya <kiran.n.upadhyaya@...m <mailto:kiran.n.upadhyaya@ericsson.com>>; Guillaume Gaudonville <guillaume.gaudonville@...om <mailto:guillaume.gaudonville@6wind.com>>; <ovsdb-dev@....org <mailto:ovsdb-dev@...ylight.org>> <ovsdb-dev@....org <mailto:ovsdb-dev@...ylight.org>>; Tim Rozet <trozet@... <mailto:trozet@...>>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:

    Curious, but why is the genius 3node bringing in sfc features - is it because this is an integration-all test? Even
    so, I don't think it makes sense to do this.

Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in integration/distribution includes *ALL* features.

If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution.

    On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@... <mailto:vorburger@...>> wrote:

        On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:

            Adding Tim.

            Are the logs below from the original job Valentina mentioned or the latest? The original failure was from a
            bad fix which we did fix, so we need to understand if this is another issue.

            On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@... <mailto:vishal.thapar@ericsson.com>>
            wrote:

                Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of
                Deadlock Exceptions:

        Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that
        "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be
        opened in the controller project, although it could also be caused by a mistake in application code. Can you please
        clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to
        look more closely at whether it does anything special (wrong) re locking and threading etc.


        Tx,

        M.

        --

        Michael Vorburger, Red Hat
        vorburger@... <mailto:vorburger@...> | IRC: vorburger @freenode | ~ = http://vorburger.ch
        <http://vorburger.ch/>

                2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  |
                360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s,
                remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0,
                Starting=0, Active=557, Stopping=0, Failure=0}

                2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 |
                SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add()
                bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

                2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                                  | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked
                threads!! :-(

                2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                                  | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace:
                opendaylight-cluster-data-notification-dispatcher-55 locked on
                java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86
                <mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86> (owned by
                opendaylight-cluster-data-akka.actor.default-dispatcher-2):

                         at sun.misc.Unsafe.park(Native Method)

                https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

                Regards,

                Vishal.

                *From:*ovsdb-dev-bounces@lists.opendaylight.org <mailto:ovsdb-dev-bounces@lists.opendaylight.org>
                [mailto:ovsdb-dev-bounces@lists.opendaylight.org <mailto:ovsdb-dev-bounces@lists.opendaylight.org>] *On
                Behalf Of *Sam Hague
                *Sent:* 26 March 2018 21:17
                *To:* Valentina Krasnobaeva <valentina.krasnobaeva@...om <mailto:valentina.krasnobaeva@6wind.com>>
                *Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...m <mailto:kiran.n.upadhyaya@ericsson.com>>;
                Guillaume Gaudonville <guillaume.gaudonville@...om <mailto:guillaume.gaudonville@6wind.com>>;
                <ovsdb-dev@....org <mailto:ovsdb-dev@...ylight.org>>
                <ovsdb-dev@....org <mailto:ovsdb-dev@...ylight.org>>
                *Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
                OXYGEN/Fluorine) in attempt to connect to OVS

                Yes, that patch was the issue. Here is a fix that should merge soon.

                https://git.opendaylight.org/gerrit/#/c/70106/

                On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...om
                <mailto:valentina.krasnobaeva@6wind.com>> wrote:

                    Hi OVSDB Team, Hi All

                    Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I
                    always see this error in the ovsdb log,

                    when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640 <http://10.100.0.51:6640>":

                    2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 -
                    io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c,
                    L:/10.100.0.51:6640 <http://10.100.0.51:6640> - R:/10.100.0.11:34774 <http://10.100.0.11:34774>]

                    java.lang.NullPointerException: null
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at
                    org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
                             at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
                    [63:io.netty.transport:4.1.16.Final]
                             at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
                    [60:io.netty.common:4.1.16.Final]
                             at
                    io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
                    [60:io.netty.common:4.1.16.Final]
                             at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
                    [63:io.netty.transport:4.1.16.Final]
                             at
                    io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
                    [60:io.netty.common:4.1.16.Final]
                             at
                    io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
                    [60:io.netty.common:4.1.16.Final]
                             at java.lang.Thread.run(Thread.java:748) [?:?]
                    2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 -
                    io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8,
                    L:/10.100.0.51:6640 <http://10.100.0.51:6640> - R:/10.100.0.11:34776 <http://10.100.0.11:34776>]
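[Editor's note] For context on the NPE itself: OVSDB-449 made the listener fetch the SSL context per connection, and the trace shows initChannel at OvsdbConnectionService.java:325 dereferencing something null - presumably the freshly fetched context, which can legitimately be absent when the switch connects over plain TCP as here. The sketch below uses hypothetical stubbed names (not the real OVSDB or Netty classes) to show the defensive shape that avoids this kind of NPE, i.e. only installing the SSL handler when a context actually exists:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the real OVSDB code: a source of SSL contexts
// that is consulted per connection and may come back empty-handed.
interface SslContextSource {
    Object fetchSslContext(); // may return null when no keystore/SSL is configured
}

final class PipelineInitializer {
    private final SslContextSource sslSource;

    PipelineInitializer(SslContextSource sslSource) {
        this.sslSource = sslSource;
    }

    /** Returns the handler names that would be installed on the channel pipeline. */
    List<String> initChannel() {
        List<String> pipeline = new ArrayList<>();
        Object sslContext = sslSource.fetchSslContext(); // fetched per connection
        if (sslContext != null) {
            pipeline.add("ssl");          // SSL handler only when a context exists
        }                                  // plain-TCP connections proceed without it
        pipeline.add("jsonRpcDecoder");
        pipeline.add("messageHandler");
        return pipeline;
    }
}

public class NullGuardDemo {
    public static void main(String[] args) {
        // Plain-TCP manager connection: no SSL context, but no NPE either.
        System.out.println(new PipelineInitializer(() -> null).initChannel());
        // SSL connection: the SSL handler is installed first.
        System.out.println(new PipelineInitializer(() -> new Object()).initChannel());
    }
}
```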

                    ovsdb process:

                    root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312
                    (healthy)
                    root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server
                    /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
                    --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key
                    --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
                    --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
                    --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

                    from its log:

                    2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connecting...
                    2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>: connected
                    2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connection closed by peer
                    2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connecting...
                    2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>: connected
                    2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connection closed by peer
                    2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    waiting 2 seconds before reconnect
                    2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connecting...
                    2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>: connected
                    2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connection closed by peer
                    2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    waiting 4 seconds before reconnect
                    2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640 <http://10.100.0.51:6640>:
                    connecting...

                    and of course, there is no br-int on ovsdb:

                      # ovs-vsctl show
                    d3b8068d-fcfb-4654-b6fa-5fea41d9922d
                         Manager "tcp:10.100.0.51:6640 <http://10.100.0.51:6640>"
                         ovs_version: "2.6.1"

                    I suppose, that this latest patch from ovsdb could provoke the issue:

                    $ git show 77f0ef93babd
                    commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
                    Author: Tim Rozet <trozet@... <mailto:trozet@...>>
                    Date:   Fri Mar 23 14:23:40 2018 -0400

                         OVSDB-449 Fixes loading keystores dynamically per connection

                         Currently when OVSDB manager is started with SSL, the SSL Context is
                         only fetched once, which inherently means the keystores are only read
                         that one time.  This patch changes the behavior so that the SSL Context
                         is fetched everytime a new connection is opened, which in turn will
                         update the keystore data.

                         Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
                         Signed-off-by: Tim Rozet <trozet@... <mailto:trozet@...>>
                         (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)

                    Has anyone already hit this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.

                    Thanks for any help,

                    Regards


                    _______________________________________________
                    ovsdb-dev mailing list
                    ovsdb-dev@...rg <mailto:ovsdb-dev@...ylight.org>
                    https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev





Jamo Luhrsen <jluhrsen@...>
 

Isn't this a new problem just uncovered/introduced? If so, then doesn't that mean we
probably found it easier *because* we are running the -all- job.

JamO

On 3/27/18 10:21 PM, Vishal Thapar wrote:
Missed Luis’ mail who suggests not using -all for 3 node tests. I think we should prioritize that ASAP.
Regards,
Vishal.
*From:*ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On Behalf Of *Vishal Thapar
*Sent:* 28 March 2018 10:50
*To:* Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
Hi Sam, Michael,
This is post Tim’s fix and is different issue. Yes, Genius CSIT is using -all, probably need to fix Genius CSIT to not use -all but Genius only. Issue is coming from some deadlock in bgpcep start/close. Adding more of the stack trace:
2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 <mailto:org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442> (owned by opendaylight-cluster-data-notification-dispatcher-55):
        at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)
        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)
        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)
        at java.util.ArrayList.forEach(ArrayList.java:1257)
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 <mailto:java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86> (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):
         at sun.misc.Unsafe.park(Native Method)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
         at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
         at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
         at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)
So, technically bgpcep is not coming up in -all correctly but since autorelease/distribution check don’t test for functionality it wasn’t caught till Genius ran a CSIT with -all. Could be some poorly written start/close in BGPCEP also, probably Michael can give more inputs on this. Then we can decide what to raise bug on.
Thanks and Regards,
Vishal.
*From:*Michael Vorburger [mailto:vorburger@...]
*Sent:* 27 March 2018 23:30
*To:* Sam Hague <shague@... <mailto:shague@...>>
*Cc:* Vishal Thapar <vishal.thapar@... <mailto:vishal.thapar@...>>; Valentina Krasnobaeva <valentina.krasnobaeva@... <mailto:valentina.krasnobaeva@...>>; Kiran N Upadhyaya <kiran.n.upadhyaya@... <mailto:kiran.n.upadhyaya@...>>; Guillaume Gaudonville <guillaume.gaudonville@... <mailto:guillaume.gaudonville@...>>; <ovsdb-dev@... <mailto:ovsdb-dev@...>> <ovsdb-dev@... <mailto:ovsdb-dev@...>>; Tim Rozet <trozet@... <mailto:trozet@...>>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS
On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:
Curious but why is the genius 3node bringing in sfc features, is it because this is an integration-all test? Even
so, I dont think it makes sense to do this.
Not 100% sure what you mean, but yes my understanding is that the distribution integration-all feature in distribution includes *ALL* features.
If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution..
On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@... <mailto:vorburger@...>> wrote:
On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@... <mailto:shague@...>> wrote:
Adding Tim.
Are the logs below from the original job valentina mentioned or the latest ? The original failure was from a
bad fix which we did fix, so need to understand if this is another issue.
On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@... <mailto:vishal.thapar@...>>
wrote:
Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of
Deadlock Exceptions:
Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that
"Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be
openend in project controller. Although it could be caused by a mistake made in application code. Can you please
clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to
look more closely at it does anything special (wrong) re. locking and threading etc.
Tx,
M.
--
Michael Vorburger, Red Hat
vorburger@... <mailto:vorburger@...> | IRC: vorburger @freenode | ~ = http://vorburger.ch
<http://vorburger.ch/>
2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  |
360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s,
remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0,
Starting=0, Active=557, Stopping=0, Failure=0}
2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 |
SfcOvsNodeDataListener | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add()
bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]
2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked
threads!! :-(
2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace:
opendaylight-cluster-data-notification-dispatcher-55 locked on
java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by
opendaylight-cluster-data-akka.actor.default-dispatcher-2):
        at sun.misc.Unsafe.park(Native Method)
https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz
Regards,
Vishal.
*From:* ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] *On
Behalf Of *Sam Hague
*Sent:* 26 March 2018 21:17
*To:* Valentina Krasnobaeva <valentina.krasnobaeva@...>
*Cc:* Kiran N Upadhyaya <kiran.n.upadhyaya@...>;
Guillaume Gaudonville <guillaume.gaudonville@...>;
<ovsdb-dev@...>
*Subject:* Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest
OXYGEN/Fluorine) in attempt to connect to OVS
Yes, that patch was the issue. Here is a fix that should merge soon.
https://git.opendaylight.org/gerrit/#/c/70106/
On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:
Hi OVSDB Team, Hi All
Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I
always see this error in the ovsdb log
when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":
2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 -
io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c,
L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
        at
org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at
org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113)
[63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606)
[63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656)
[63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510)
[63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423)
[63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
[60:io.netty.common:4.1.16.Final]
        at
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
[60:io.netty.common:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
[63:io.netty.transport:4.1.16.Final]
        at
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
[60:io.netty.common:4.1.16.Final]
        at
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
[60:io.netty.common:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 -
io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8,
L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
ovsdb process:
root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312
(healthy)
root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server
/etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info
--remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key
--certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert
--no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log
--pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
from its log:
2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
and of course, there is no br-int on ovsdb:
 # ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
    Manager "tcp:10.100.0.51:6640"
    ovs_version: "2.6.1"
I suppose that this latest patch to ovsdb could have provoked the issue:
$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date:   Fri Mar 23 14:23:40 2018 -0400
    OVSDB-449 Fixes loading keystores dynamically per connection
    Currently when OVSDB manager is started with SSL, the SSL Context is
    only fetched once, which inherently means the keystores are only read
    that one time.  This patch changes the behavior so that the SSL Context
    is fetched everytime a new connection is opened, which in turn will
    update the keystore data.
    Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
    Signed-off-by: Tim Rozet <trozet@...>
    (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
Has anyone already had this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.
Thanks for any help,
Regards
_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev


Vishal Thapar <vishal.thapar@...>
 

Missed Luis’ mail, which suggests not using -all for 3-node tests. I think we should prioritize that ASAP.

 

Regards,

Vishal.

 

From: ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Vishal Thapar
Sent: 28 March 2018 10:50
To: Michael Vorburger <vorburger@...>; Sam Hague <shague@...>
Cc: Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

Hi Sam, Michael,

 

This is post Tim’s fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix Genius CSIT to use only Genius features instead of -all. The issue is coming from a deadlock in bgpcep start/close. Adding more of the stack trace:

 

2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 (owned by opendaylight-cluster-data-notification-dispatcher-55):

        at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)

        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)

        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)

        at java.util.ArrayList.forEach(ArrayList.java:1257)

 

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):
         at sun.misc.Unsafe.park(Native Method)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
         at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
         at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
         at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)

 

So, technically bgpcep is not coming up correctly in -all, but since autorelease/distribution checks don’t test for functionality, it wasn’t caught until Genius ran a CSIT with -all. It could also be some poorly written start/close in BGPCEP; Michael can probably give more input on this. Then we can decide what to raise a bug on.
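For anyone unfamiliar with how the ThreadsWatcher output above is produced: detecting deadlocked threads like this is something the JDK exposes directly via ThreadMXBean. The following is a minimal, self-contained sketch of that detection mechanism, not the actual infrautils code; it builds a deterministic two-thread deadlock (two monitors acquired in opposite order, with a latch ensuring both first locks are held) and then polls the JVM's deadlock detector, as a periodic watchdog would.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {

    // Builds a deterministic two-thread deadlock and returns how many
    // deadlocked threads the JVM reports.
    static int detectDeadlock() throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Both threads must hold their first lock before either attempts
        // the second, so the deadlock always occurs.
        final CountDownLatch bothHeld = new CountDownLatch(2);

        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                bothHeld.countDown();
                try { bothHeld.await(); } catch (InterruptedException ignored) { }
                synchronized (lockB) { } // blocks forever
            }
        }, "deadlock-demo-1");
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                bothHeld.countDown();
                try { bothHeld.await(); } catch (InterruptedException ignored) { }
                synchronized (lockA) { } // blocks forever
            }
        }, "deadlock-demo-2");
        t1.setDaemon(true); // daemon threads, so the stuck pair cannot keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();

        // Poll for deadlocked threads, as a periodic watchdog would.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        for (int i = 0; i < 200 && ids == null; i++) {
            Thread.sleep(25);
            ids = mx.findDeadlockedThreads();
        }
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + detectDeadlock());
    }
}
```

Running `java DeadlockDemo` prints `deadlocked threads: 2` once both threads are stuck; a real watchdog would follow up with `ThreadMXBean.getThreadInfo()` to log the stack traces shown above.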

 

Thanks and Regards,

Vishal.

 

From: Michael Vorburger [mailto:vorburger@...]
Sent: 27 March 2018 23:30
To: Sam Hague <shague@...>
Cc: Vishal Thapar <vishal.thapar@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@...> wrote:

Curious, but why is the genius 3node job bringing in sfc features? Is it because this is an integration-all test? Even so, I don't think it makes sense to do this.

 

Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in distribution includes *ALL* features.

 

If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution.

 

On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:

On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:

Adding Tim.

 

Are the logs below from the original job Valentina mentioned or the latest? The original failure was from a bad fix, which we did fix, so we need to understand whether this is another issue.

 

 

On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of Deadlock Exceptions:

 

Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be opened in the controller project, although it could be caused by a mistake in application code. Can you please clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to look more closely at whether it does anything special (wrong) re. locking and threading.


Tx,

M.

--

Michael Vorburger, Red Hat
vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch

 

 

2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  | 360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s, remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0, Starting=0, Active=557, Stopping=0, Failure=0}

2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 | SfcOvsNodeDataListener           | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add() bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

        at sun.misc.Unsafe.park(Native Method)

 

 

https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

 

Regards,

Vishal.

 

From: ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Sam Hague
Sent: 26 March 2018 21:17
To: Valentina Krasnobaeva <valentina.krasnobaeva@...>
Cc: Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

Yes, that patch was the issue. Here is a fix that should merge soon.

 

 

On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:

Hi OVSDB Team, Hi All

Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I always see this error in the ovsdb log

when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":

2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 - io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c, L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) [60:io.netty.common:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) [60:io.netty.common:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 - io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8, L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
   

ovsdb process:

root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312 (healthy)                                                                                                                                                                                                                                                                                                                                                       
root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

from its log:

2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...

and of course, there is no br-int on ovsdb:

 # ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
    Manager "tcp:10.100.0.51:6640"
    ovs_version: "2.6.1"

I suppose that this latest patch to ovsdb could have provoked the issue:

$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date:   Fri Mar 23 14:23:40 2018 -0400

    OVSDB-449 Fixes loading keystores dynamically per connection
   
    Currently when OVSDB manager is started with SSL, the SSL Context is
    only fetched once, which inherently means the keystores are only read
    that one time.  This patch changes the behavior so that the SSL Context
    is fetched everytime a new connection is opened, which in turn will
    update the keystore data.
   
    Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
    Signed-off-by: Tim Rozet <trozet@...>
    (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
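The change this commit describes, fetching the SSL context on every new connection rather than caching it once at startup, can be sketched roughly as below. This is a hypothetical illustration only: the `Channel` interface and `initChannel` signature here are invented for the sketch and are not the actual OVSDB or Netty code. It also shows the null guard that would turn the NullPointerException reported at the top of this thread into a clean connection rejection.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not the actual OVSDB code: names and types are
// invented to illustrate the per-connection SSL context pattern.
public class PerConnectionSsl {

    interface Channel { void close(); }

    // Called once per incoming connection. The context is fetched fresh
    // each time, so keystore changes on disk are picked up, and a null
    // context (keystore missing or not yet readable) closes the channel
    // instead of letting later pipeline code throw an NPE.
    static boolean initChannel(Channel ch, Supplier<Object> sslContextSupplier) {
        Object sslContext = sslContextSupplier.get();
        if (sslContext == null) {
            ch.close();
            return false;
        }
        // ... an SSL handler built from sslContext would be added to the
        // channel pipeline here ...
        return true;
    }

    public static void main(String[] args) {
        Channel ch = () -> { }; // no-op channel for the demo
        System.out.println(initChannel(ch, () -> null));         // context unavailable: rejected cleanly
        System.out.println(initChannel(ch, () -> new Object())); // context available: handler installed
    }
}
```

The trade-off the commit accepts is a keystore read per connection in exchange for picking up rotated certificates without a restart; the null guard matters precisely because that read can now fail on any connection attempt, not just at startup.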

Has anyone already had this issue? If so, please share your workaround. I use openvswitch 2.6.1-1.

Thanks for any help,

Regards


_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev

 

 

 

 


Vishal Thapar <vishal.thapar@...>
 

Hi Sam, Michael,

 

This is post Tim’s fix and is a different issue. Yes, Genius CSIT is using -all; we probably need to fix Genius CSIT to use only Genius features instead of -all. The issue is coming from a deadlock in bgpcep start/close. Adding more of the stack trace:

 

2018-03-27T05:52:15,752 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-akka.actor.default-dispatcher-2 locked on org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl@23621442 (owned by opendaylight-cluster-data-notification-dispatcher-55):

        at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.instantiateServiceInstance(ProgrammingServiceImpl.java:175)

        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.lambda$startServices$0(ClusterSingletonServiceGroupImpl.java:648)

        at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl$$Lambda$1108/2009038775.accept(Unknown Source)

        at java.util.ArrayList.forEach(ArrayList.java:1257)

 

2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(
2018-03-27T05:52:16,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):
         at sun.misc.Unsafe.park(Native Method)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
         at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:224)
         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
         at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.unregisterService(ClusterSingletonServiceGroupImpl.java:393)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:151)
         at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl$1.removeRegistration(AbstractClusterSingletonServiceProviderImpl.java:123)
         at org.opendaylight.yangtools.concepts.AbstractRegistration.close(AbstractRegistration.java:41)
         at org.opendaylight.bgpcep.programming.impl.ProgrammingServiceImpl.close(ProgrammingServiceImpl.java:449)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean$PCEPTopologyProviderBeanCSS.close(PCEPTopologyProviderBean.java:190)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyProviderBean.close(PCEPTopologyProviderBean.java:86)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.closeTopology(PCEPTopologyDeployerImpl.java:151)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.updateTopologyProvider(PCEPTopologyDeployerImpl.java:92)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl.lambda$onDataTreeChanged$0(PCEPTopologyDeployerImpl.java:75)
         at org.opendaylight.bgpcep.pcep.topology.provider.config.PCEPTopologyDeployerImpl$$Lambda$1076/552124597.accept(Unknown Source)

 

So, technically bgpcep is not coming up correctly in -all, but since autorelease/distribution checks don’t test for functionality, it wasn’t caught until Genius ran a CSIT with -all. It could also be some poorly written start/close in BGPCEP; Michael can probably give more input on this. Then we can decide what to raise a bug on.

 

Thanks and Regards,

Vishal.

 

From: Michael Vorburger [mailto:vorburger@...]
Sent: 27 March 2018 23:30
To: Sam Hague <shague@...>
Cc: Vishal Thapar <vishal.thapar@...>; Valentina Krasnobaeva <valentina.krasnobaeva@...>; Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>; Tim Rozet <trozet@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@...> wrote:

Curious, but why is the genius 3node job bringing in sfc features? Is it because this is an integration-all test? Even so, I don't think it makes sense to do this.

 

Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in distribution includes *ALL* features.

 

If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution.

 

On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:

On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:

Adding Tim.

 

Are the logs below from the original job Valentina mentioned or the latest? The original failure was from a bad fix, which we did fix, so we need to understand whether this is another issue.

 

 

On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of Deadlock Exceptions:

 

Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be opened in the controller project, although it could be caused by a mistake in application code. Can you please clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to look more closely at whether it does anything special (wrong) re. locking and threading.


Tx,

M.

--

Michael Vorburger, Red Hat
vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch

 

 

2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  | 360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s, remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0, Starting=0, Active=557, Stopping=0, Failure=0}

2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 | SfcOvsNodeDataListener           | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add() bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

        at sun.misc.Unsafe.park(Native Method)

 

 

https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

 

Regards,

Vishal.

 

From: ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Sam Hague
Sent: 26 March 2018 21:17
To: Valentina Krasnobaeva <valentina.krasnobaeva@...>
Cc: Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

Yes, that patch was the issue. Here is a fix that should merge soon.

 

 

On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:

Hi OVSDB Team, Hi All

Since this morning, with the latest builds of OXYGEN and Fluorine (all branches were rebased), I always see this error in the ovsdb log when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":

2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 - io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c, L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) [60:io.netty.common:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) [60:io.netty.common:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 - io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8, L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
   

ovsdb process:

root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312 (healthy)                                                                                                                                                                                                                                                                                                                                                       
root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

from its log:

2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
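The growing waits in that log (1s, then 2s, then 4s) are ovsdb-server's exponential reconnect backoff: after each connection the controller closes, the next retry delay roughly doubles up to a cap. A toy sketch of the pattern, where the 8-second cap is illustrative and not OVS's actual configuration:

```java
public class ReconnectBackoff {

    /** Doubles the wait after each failed attempt, capped at maxMs --
     *  the pattern visible in the ovsdb-server log above. */
    public static int next(int currentMs, int maxMs) {
        return Math.min(currentMs * 2, maxMs);
    }

    public static void main(String[] args) {
        int wait = 1000;
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + wait + " ms");
            wait = next(wait, 8000);  // illustrative cap
        }
    }
}
```

This prints waits of 1000, 2000, 4000, 8000, 8000 ms, matching the doubling-then-capped sequence in the log.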

and of course there is no br-int bridge on OVS:

 # ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
    Manager "tcp:10.100.0.51:6640"
    ovs_version: "2.6.1"

I suspect that this recent ovsdb patch may have introduced the issue:

$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date:   Fri Mar 23 14:23:40 2018 -0400

    OVSDB-449 Fixes loading keystores dynamically per connection
   
    Currently when OVSDB manager is started with SSL, the SSL Context is
    only fetched once, which inherently means the keystores are only read
    that one time.  This patch changes the behavior so that the SSL Context
    is fetched everytime a new connection is opened, which in turn will
    update the keystore data.
   
    Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
    Signed-off-by: Tim Rozet <trozet@...>
    (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
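If that patch now fetches the SSL context on every new connection, a null context at initChannel time (e.g. when no keystore is configured or readable) would fail in exactly the way the stack trace above shows. A hypothetical, heavily simplified sketch of a null guard — this is not the actual OvsdbConnectionService or Netty code; ChannelSetup and its factory are invented for illustration:

```java
import java.util.function.Supplier;

public class ChannelSetup {

    // Hypothetical stand-in for fetching an SSL context per connection;
    // may yield null when no keystore/certificates are configured.
    private final Supplier<Object> sslContextFactory;

    public ChannelSetup(Supplier<Object> sslContextFactory) {
        this.sslContextFactory = sslContextFactory;
    }

    /** Guarded per-connection setup: fall back to plain TCP instead of
     *  dereferencing a possibly-null context (the suspected NPE above). */
    public String initChannel() {
        Object sslContext = sslContextFactory.get();  // fetched on every connection
        if (sslContext == null) {
            return "handler:plain-tcp";               // no SSL configured
        }
        return "handler:ssl";                         // would wrap with an SSL handler
    }

    public static void main(String[] args) {
        System.out.println(new ChannelSetup(() -> null).initChannel());
    }
}
```

The point of the sketch is only the failure mode: fetching the context once at startup fails once and visibly, while fetching it per connection turns a transient keystore problem into an NPE on every incoming channel.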

Has anyone else hit this issue? If so, please share your workaround. I am using openvswitch 2.6.1-1.

Thanks for any help,

Regards


_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev

 

 

 

 


Luis Gomez
 

integration-compatible-with-all features are brought into any test that has -all- in the job name. These features should be fully compatible, meaning they must not interfere with any other project's feature or system test. If that is not the case, we need to move the offending feature (sfc?) to the integration-all group.

Another question is whether to run the cluster (3node) tests with the -all- tag. I personally do not do that in OFP: I have been told in the past that not all features in ODL are cluster aware, which can be problematic, so for OFP I restrict the -all- tests to single-node functional tests.

BR/Luis


On Mar 27, 2018, at 10:59 AM, Michael Vorburger <vorburger@...> wrote:

On Tue, Mar 27, 2018 at 6:17 PM, Sam Hague <shague@...> wrote:
Curious, but why is the genius 3node job bringing in sfc features? Is it because this is an integration-all test? Even so, I don't think it makes sense to do this.

Not 100% sure what you mean, but yes, my understanding is that the integration-all feature in the distribution includes *ALL* features.

If you don't think that makes sense, add integration-dev in reply and take it up with the maintainers of integration/distribution.

On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:
On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:
Adding Tim.

Are the logs below from the original job Valentina mentioned, or the latest? The original failure was from a bad patch which we fixed, so we need to understand whether this is another issue.


On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

I don't see the socket issue, but Sathwick is now hitting another one: InfraUtils is logging lots of deadlock errors:




Sam Hague
 

Curious, but why is the genius 3node job bringing in sfc features? Is it because this is an integration-all test? Even so, I don't think it makes sense to do this.


On Tue, Mar 27, 2018, 9:10 AM Michael Vorburger <vorburger@...> wrote:
On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:
Adding Tim.

Are the logs below from the original job valentina mentioned or the latest ? The original failure was from a bad fix which we did fix, so need to understand if this is another issue.


On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of Deadlock Exceptions:


Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be openend in project controller. Although it could be caused by a mistake made in application code. Can you please clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to look more closely at it does anything special (wrong) re. locking and threading etc.

Tx,
M.
--
Michael Vorburger, Red Hat
vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch

 

2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  | 360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s, remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0, Starting=0, Active=557, Stopping=0, Failure=0}

2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 | SfcOvsNodeDataListener           | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add() bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

        at sun.misc.Unsafe.park(Native Method)

 

 

https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

 

Regards,

Vishal.

 

From: ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Sam Hague
Sent: 26 March 2018 21:17
To: Valentina Krasnobaeva <valentina.krasnobaeva@...>
Cc: Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

Yes, that patch was the issue. Here is a fix that should merge soon.

 

 

On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:

Hi OVSDB Team, Hi All

Since this morning, with the latest builds of Oxygen and Fluorine (all branches rebased), I consistently see the following error in the OVSDB log when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":

2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 - io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c, L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) [60:io.netty.common:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) [60:io.netty.common:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 - io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8, L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
   

ovsdb process:

root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312 (healthy)                                                                                                                                                                                                                                                                                                                                                       
root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

from its log:

2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...

and, as expected, br-int never gets created on OVS:

 # ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
    Manager "tcp:10.100.0.51:6640"
    ovs_version: "2.6.1"

I suspect this recent OVSDB patch may have introduced the issue:

$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date:   Fri Mar 23 14:23:40 2018 -0400

    OVSDB-449 Fixes loading keystores dynamically per connection
   
    Currently when OVSDB manager is started with SSL, the SSL Context is
    only fetched once, which inherently means the keystores are only read
    that one time.  This patch changes the behavior so that the SSL Context
    is fetched everytime a new connection is opened, which in turn will
    update the keystore data.
   
    Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
    Signed-off-by: Tim Rozet <trozet@...>
    (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)
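The difference the commit message describes can be sketched with just the JDK's javax.net.ssl API. The class and method names below are illustrative, not the actual OVSDB code (which wires this into a Netty ChannelInitializer): caching the context at startup pins the keystore contents, while fetching it per connection picks up keystore updates, and a context that turns out to be null at connect time must be handled inside channel initialization or it surfaces as an NPE like the one above.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;

public class SslContextPerConnection {

    // Old behaviour per the commit message: fetched once at startup,
    // so later keystore changes are never picked up.
    private static SSLContext startupContext;

    static synchronized SSLContext getCachedContext() throws NoSuchAlgorithmException {
        if (startupContext == null) {
            startupContext = SSLContext.getDefault();
        }
        return startupContext;
    }

    // New behaviour: fetch a context for every incoming connection so
    // that updated keystore data is used.
    static SSLContext getPerConnectionContext() throws NoSuchAlgorithmException {
        return SSLContext.getDefault();
    }

    // Stand-in for the Netty initChannel() step: a missing context must be
    // handled here instead of being dereferenced unconditionally.
    static void initChannel(SSLContext context) {
        if (context == null) {
            throw new IllegalStateException("no SSL context available for this connection");
        }
        // ... in the real code, the SSL handler is added to the pipeline here ...
    }

    public static void main(String[] args) throws Exception {
        initChannel(getPerConnectionContext()); // called once per connection
        System.out.println("channel initialized");
    }
}
```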

Has anyone else hit this issue? If so, please share your workaround. I am using Open vSwitch 2.6.1-1.

Thanks for any help,

Regards


_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev

 





Michael Vorburger <vorburger@...>
 

On Tue, Mar 27, 2018 at 5:51 PM, Sam Hague <shague@...> wrote:
Adding Tim.

Are the logs below from the original job Valentina mentioned, or from the latest run? The original failure was from a bad patch which we have since fixed, so we need to understand whether this is another issue.


On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of Deadlock Exceptions:


Cool. Could you please open a new JIRA and include the full stack trace it (should) show after (or before?) that "Deadlocked thread stack trace: " ERROR log? From what little I understand so far, this JIRA should probably be opened in project controller, although it could also be caused by a mistake made in application code. Can you please clarify exactly what caused this to start happening? If it's a particular change, it would be interesting to look more closely at whether it does anything special (wrong) re. locking and threading etc.

Tx,
M.
--
Michael Vorburger, Red Hat
vorburger@... | IRC: vorburger @freenode | ~ = http://vorburger.ch
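For context on the "Deadlocked thread" ERRORs: infrautils' ThreadsWatcher periodically checks for deadlocks using the JVM's own detection facility, ThreadMXBean. The self-contained sketch below (class and demo names are illustrative, not infrautils code) deliberately builds a two-thread monitor deadlock and then asks the JVM to report it, which is the same mechanism those log lines come from:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {

    // Asks the JVM which threads are deadlocked; returns null if none.
    static long[] findDeadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.findDeadlockedThreads();
    }

    // Builds a classic two-lock deadlock, then polls until the JVM reports it.
    static long[] runDemo() throws InterruptedException {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch firstLocksHeld = new CountDownLatch(2);

        Runnable grabAThenB = () -> {
            synchronized (lockA) {
                firstLocksHeld.countDown();
                try { firstLocksHeld.await(); } catch (InterruptedException e) { return; }
                synchronized (lockB) { /* never reached */ }
            }
        };
        Runnable grabBThenA = () -> {
            synchronized (lockB) {
                firstLocksHeld.countDown();
                try { firstLocksHeld.await(); } catch (InterruptedException e) { return; }
                synchronized (lockA) { /* never reached */ }
            }
        };

        Thread t1 = new Thread(grabAThenB, "demo-1");
        Thread t2 = new Thread(grabBThenA, "demo-2");
        t1.setDaemon(true); // daemon so the stuck threads don't keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();

        long[] ids = null;
        for (int i = 0; i < 100 && ids == null; i++) { // poll up to ~5 seconds
            Thread.sleep(50);
            ids = findDeadlockedThreadIds();
        }
        return ids;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = runDemo();
        System.out.println("deadlocked threads: " + (ids == null ? 0 : ids.length));
    }
}
```

A watcher like infrautils' would run this check on a schedule and, on a non-null result, log the stack traces of the reported thread IDs via ThreadMXBean.getThreadInfo().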

 




Sam Hague
 

Adding Tim.

Are the logs below from the original job Valentina mentioned, or from the latest run? The original failure was from a bad patch which we have since fixed, so we need to understand whether this is another issue.

On Mar 26, 2018 11:10 PM, "Vishal Thapar" <vishal.thapar@...> wrote:

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of deadlock errors.


Vishal Thapar <vishal.thapar@...>
 

Don’t see the socket issue, but Sathwick is now seeing some other issue. InfraUtils is throwing lots of Deadlock Exceptions:

 

2018-03-27T05:53:29,133 | INFO  | awaitility[checkBundleDiagInfos] | SystemReadyImpl                  | 360 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 206s, remaining time 93s, diag: Booting {Installed=0, Resolved=7, Unknown=0, GracePeriod=1, Waiting=0, Starting=0, Active=557, Stopping=0, Failure=0}

2018-03-27T05:53:29,245 | INFO  | opendaylight-cluster-data-notification-dispatcher-66 | SfcOvsNodeDataListener           | 468 - org.opendaylight.sfc.ovs - 0.8.0.SNAPSHOT | SfcOvsNodeDataListener::add() bridge name [OvsdbBridgeName{_value=BR1}] DPID [DatapathId{_value=00:00:f6:3e:dc:fc:7b:4e}]

2018-03-27T05:53:29,252 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Oh nose - there are 3 deadlocked threads!! :-(

2018-03-27T05:53:29,253 | ERROR | infrautils.metrics.ThreadsWatcher-0 | ThreadsWatcher                   | 358 - org.opendaylight.infrautils.metrics-impl - 1.4.0.SNAPSHOT | Deadlocked thread stack trace: opendaylight-cluster-data-notification-dispatcher-55 locked on java.util.concurrent.locks.ReentrantLock$FairSync@2da5ef86 (owned by opendaylight-cluster-data-akka.actor.default-dispatcher-2):

        at sun.misc.Unsafe.park(Native Method)

 

 

https://logs.opendaylight.org/sandbox/vex-yul-odl-jenkins-2/genius-csit-3node-sathwikgate-all-fluorine/5/odl1_karaf.log.gz

 

Regards,

Vishal.

 

From: ovsdb-dev-bounces@... [mailto:ovsdb-dev-bounces@...] On Behalf Of Sam Hague
Sent: 26 March 2018 21:17
To: Valentina Krasnobaeva <valentina.krasnobaeva@...>
Cc: Kiran N Upadhyaya <kiran.n.upadhyaya@...>; Guillaume Gaudonville <guillaume.gaudonville@...>; <ovsdb-dev@...> <ovsdb-dev@...>
Subject: Re: [ovsdb-dev] [ovsdb] NullPointerException in OvsdbConnectionService.java:325 (latest OXYGEN/Fluorine) in attempt to connect to OVS

 

Yes, that patch was the issue. Here is a fix that should merge soon.

 

 

On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:

Hi OVSDB Team, Hi All

Since today's morning  with latest builds of OXYGEN and Fluorine (all branches were rebased) I always see this error in ovsdb log,

when devstack executes "sudo ovs-vsctl set-manager tcp:10.100.0.51:6640":

2018-03-26T16:37:14,404 | WARN  | nioEventLoopGroup-5-1 | ChannelInitializer               | 60 - io.netty.common - 4.1.16.Final | Failed to initialize a channel. Closing: [id: 0x2976135c, L:/10.100.0.51:6640 - R:/10.100.0.11:34774]
java.lang.NullPointerException: null
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:325) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at org.opendaylight.ovsdb.lib.impl.OvsdbConnectionService$2.initChannel(OvsdbConnectionService.java:321) [400:org.opendaylight.ovsdb.library:1.6.1.SNAPSHOT]
        at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:113) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:105) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:606) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.access$000(DefaultChannelPipeline.java:46) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1412) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1131) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:656) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:510) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:423) [63:io.netty.transport:4.1.16.Final]
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:482) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403) [60:io.netty.common:4.1.16.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) [63:io.netty.transport:4.1.16.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [60:io.netty.common:4.1.16.Final]
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) [60:io.netty.common:4.1.16.Final]
        at java.lang.Thread.run(Thread.java:748) [?:?]
2018-03-26T16:37:15,410 | INFO  | nioEventLoopGroup-4-1 | LoggingHandler                   | 60 - io.netty.common - 4.1.16.Final | [id: 0x57fd99d7, L:/0:0:0:0:0:0:0:0:6640] READ: [id: 0x1c79d0b8, L:/10.100.0.51:6640 - R:/10.100.0.11:34776]
   

ovsdb process:

root     16311  0.0  0.0  28768  2152 ?        S<s  17:05   0:00 ovsdb-server: monitoring pid 16312 (healthy)                                                                                                                                                                                                                                                                                                                                                       
root     16312  0.0  0.0  29040  5304 ?        S<   17:05   0:00  \_ ovsdb-server /etc/openvswitch/conf.db -vconsole:emer -vsyslog:err -vfile:info --remote=punix:/var/run/openvswitch/db.sock --private-key=db:Open_vSwitch,SSL,private_key --certificate=db:Open_vSwitch,SSL,certificate --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir --log-file=/var/log/openvswitch/ovsdb-server.log --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor

from its log:

2018-03-26T15:09:00.632Z|00005|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:00.632Z|00006|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:00.637Z|00007|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.639Z|00008|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:01.640Z|00009|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:01.642Z|00010|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:01.642Z|00011|reconnect|INFO|tcp:10.100.0.51:6640: waiting 2 seconds before reconnect
2018-03-26T15:09:03.643Z|00012|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
2018-03-26T15:09:03.644Z|00013|reconnect|INFO|tcp:10.100.0.51:6640: connected
2018-03-26T15:09:03.648Z|00014|reconnect|INFO|tcp:10.100.0.51:6640: connection closed by peer
2018-03-26T15:09:03.648Z|00015|reconnect|INFO|tcp:10.100.0.51:6640: waiting 4 seconds before reconnect
2018-03-26T15:09:07.648Z|00016|reconnect|INFO|tcp:10.100.0.51:6640: connecting...
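
The log above shows ovsdb-server's reconnect backoff doubling on each failed attempt (1 s, 2 s, 4 s, ...). As a minimal sketch of that behavior — assuming the usual OVS-style cap on the wait, not quoting actual OVS source — the schedule can be modeled as:

```java
// Sketch of a doubling reconnect backoff matching the log above.
// The 8-second cap is an assumption for illustration, not taken from
// the OVS source.
public class ReconnectBackoff {
    static final int MAX_BACKOFF_SECONDS = 8; // assumed cap

    // Wait in seconds before the Nth reconnect attempt (attempt >= 1):
    // 1, 2, 4, then capped.
    static int backoffSeconds(int attempt) {
        long wait = 1L << (attempt - 1); // doubles each attempt
        return (int) Math.min(wait, MAX_BACKOFF_SECONDS);
    }

    public static void main(String[] args) {
        for (int a = 1; a <= 5; a++) {
            System.out.println("attempt " + a + ": wait "
                    + backoffSeconds(a) + "s");
        }
    }
}
```

Each "connection closed by peer" in the log triggers the next, longer wait, which is why the failures keep recurring but never flood the log.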

and, as expected, there is no br-int bridge on the OVS node:

 # ovs-vsctl show
d3b8068d-fcfb-4654-b6fa-5fea41d9922d
    Manager "tcp:10.100.0.51:6640"
    ovs_version: "2.6.1"

I suspect this recent ovsdb patch may have introduced the issue:

$ git show 77f0ef93babd
commit 77f0ef93babd9a1fd4a7f3508675830882c374aa (HEAD, origin/stable/oxygen)
Author: Tim Rozet <trozet@...>
Date:   Fri Mar 23 14:23:40 2018 -0400

    OVSDB-449 Fixes loading keystores dynamically per connection
   
    Currently when OVSDB manager is started with SSL, the SSL Context is
    only fetched once, which inherently means the keystores are only read
    that one time.  This patch changes the behavior so that the SSL Context
    is fetched everytime a new connection is opened, which in turn will
    update the keystore data.
   
    Change-Id: Iaafbd34afcb5f4708b48eb3d64eca14ef0a107e8
    Signed-off-by: Tim Rozet <trozet@...>
    (cherry picked from commit e4a170988175b98e2411782f2a85b63edb64d4cf)

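Reading the commit message against the NPE in initChannel above, a plausible failure mode is that the per-connection SSL context lookup can now return null (for example, for a plain-TCP connection like this one, or before the keystores are readable), and the channel initializer dereferences it. The following is a purely illustrative sketch of the needed guard — the `SslContextSupplier` interface and the string return value are hypothetical stand-ins, not the actual OvsdbConnectionService code:

```java
import javax.net.ssl.SSLContext;

// Illustrative sketch (NOT the real OvsdbConnectionService code): if a
// per-connection SSL context lookup may yield null, the channel
// initializer must check before wiring an SSL handler, or it throws a
// NullPointerException exactly as in the trace above.
public class SslContextGuard {

    // Hypothetical stand-in for a per-connection context supplier that
    // may return null, e.g. on a plain-TCP connection.
    interface SslContextSupplier {
        SSLContext fetch();
    }

    // Returns a description of how the channel would be initialized.
    static String initChannel(SslContextSupplier supplier) {
        SSLContext ctx = supplier.fetch();
        if (ctx == null) {
            // Fall back to plaintext instead of dereferencing null.
            return "plaintext";
        }
        return "ssl:" + ctx.getProtocol();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(initChannel(() -> null));    // prints "plaintext"
        SSLContext tls = SSLContext.getInstance("TLSv1.2");
        System.out.println(initChannel(() -> tls));     // prints "ssl:TLSv1.2"
    }
}
```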
Has anyone else hit this issue? If so, please share your workaround. I am using openvswitch 2.6.1-1.

Thanks for any help,

Regards


_______________________________________________
ovsdb-dev mailing list
ovsdb-dev@...
https://lists.opendaylight.org/mailman/listinfo/ovsdb-dev

 


Sam Hague
 

Yes, that patch was the issue. Here is a fix that should merge soon.


On Mon, Mar 26, 2018 at 8:17 AM, Valentina Krasnobaeva <valentina.krasnobaeva@...> wrote:
