Serviceutils distro failure
Michael Vorburger <vorburger@...>
On Tue, Jun 26, 2018 at 1:45 PM, Robert Varga <nite@...> wrote:
On 26/06/18 13:40, Faseela K wrote:
Oh! So loading https://github.com/opendaylight/genius/blob/5ad5d6bf9b17bb66ddb35e56e895d220795febc6/srm/api/src/main/yang/srm-rpcs.yang together with https://github.com/opendaylight/serviceutils/blob/7162c48453a8c369ca7abbe3e83ae269977c9f96/srm/api/src/main/yang/srm-rpcs.yang could really never work?
Would it then be crazy to suggest, more as a medium-term enhancement request than a short-term blocker bug, that it would be really useful to get a much clearer failure in the log highlighting this, instead of the current obscure internal errors seen in https://jira.opendaylight.org/browse/GENIUS-169 ?
If I understand what happened here correctly, then, in an ideal world, we really should have failed much earlier and much more clearly. I'm thinking of something like: "bundle org.opendaylight.serviceutils.srm-api failed to load because its srm-rpcs.yang steps on the 'srm-rpcs' module name already used by srm-rpcs.yang in org.opendaylight.genius.srm-api", and that should mark the bundle as not activated (which we could pick up on in ready & diagstatus). That would be cool and very helpful, instead of the current (apparent) silent ignoring of a module it does not like...
Faseela, renaming the YANG modules in serviceutils:srm is definitely the way to resolve this and move forward!
|
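A minimal sketch of the kind of early check suggested in the message above: it only illustrates detecting two bundles that ship a YANG module with the same name, and is not how mdsal or the controller actually load schemas. The function names, the regex-based parsing (which ignores YANG comments), and the placeholder namespaces are all assumptions made for illustration; only the bundle names and the 'srm-rpcs' module name come from the thread.

```python
import re
from collections import defaultdict

# Matches the opening "module <name> {" statement of a YANG file.
# Rough on purpose: it does not skip comments and ignores submodules.
MODULE_RE = re.compile(r"^\s*module\s+([A-Za-z_][A-Za-z0-9_.-]*)\s*\{", re.MULTILINE)


def module_name(yang_text):
    """Return the declared module name, or None if no module statement is found."""
    match = MODULE_RE.search(yang_text)
    return match.group(1) if match else None


def find_module_name_clashes(sources):
    """sources maps a bundle name to its YANG text.

    Returns {module name: [bundle names]} for every module name
    declared by more than one bundle.
    """
    owners = defaultdict(list)
    for bundle, text in sources.items():
        name = module_name(text)
        if name is not None:
            owners[name].append(bundle)
    return {name: bundles for name, bundles in owners.items() if len(bundles) > 1}


# Placeholder YANG snippets; the namespaces are made up for this example.
example_sources = {
    "org.opendaylight.genius.srm-api": 'module srm-rpcs {\n  namespace "urn:example:first";\n}',
    "org.opendaylight.serviceutils.srm-api": 'module srm-rpcs {\n  namespace "urn:example:second";\n}',
}

for name, bundles in find_module_name_clashes(example_sources).items():
    print(f"module name '{name}' is declared by more than one bundle: {', '.join(bundles)}")
```

A check like this could run at build time over the YANG files of all bundles in a distribution, which is where a clash between genius and serviceutils would first become visible.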
Vishal Thapar <vthapar@...>
On Thu, Jun 21, 2018 at 10:50 PM, Michael Vorburger <vorburger@...> wrote:
1. build distribution and run karaf.
2. feature:repo-add mvn:org.opendaylight.integration/odl-integration-all/0.9.0-SNAPSHOT/xml/features
3. feature:install odl-integration-all
I may be missing 'git pull' to get latest distribution though.
If I manage to stay awake after dinner. At least will fire off a build before going to bed.
|
Michael Vorburger <vorburger@...>
On Thu, Jun 21, 2018 at 6:32 PM, Vishal Thapar <vthapar@...> wrote:
Cool! Can you describe the exact steps one has to take? Like literally "for dummies", describe for anyone interested in locally reproducing this exactly what you did... ;-)
If you could locally "mvn clean install" rebuild with the patches from https://git.opendaylight.org/gerrit/#/q/topic:MDSAL-354, and share a new log which includes that - that would be very interesting! (There will be [probably MUCH] more after that "IllegalStateException: Schema for interface org.opendaylight.yang.gen.v1.urn.opendaylight.serviceutils.srm.rpcs.rev170711.SrmRpcsService is not available." ) right; that's exactly the same as originally, below.
|
Vishal Thapar <vthapar@...>
Hi Michael, I was able to reproduce in my local distro build, so let me know what logging to enable, I can take a shot. Regards, Vishal. On Thu, Jun 21, 2018 at 9:32 PM, Michael Vorburger <vorburger@...> wrote:
|
Michael Vorburger <vorburger@...>
On Thu, Jun 21, 2018 at 5:42 PM, Faseela K <faseela.k@...> wrote:
OK, but if it's only possible to reproduce this locally with a binary blob that we cannot trust on how it was actually built (which is quite concerning!), and if I understand correctly what you are saying, we cannot reproduce this in a local build where we could include logging / debug patches. As a next step I would therefore suggest we wait for the additional logging I've just proposed via https://git.opendaylight.org/gerrit/#/q/topic:MDSAL-354 to be merged to master, then reproduce this on a new distribution job run including those patches, look at those logs and take it from there. Sounds like a plan?
|
Michael Vorburger <vorburger@...>
https://jira.opendaylight.org/
|
Faseela K <faseela.k@...>
Hi, The issue cannot be reproduced locally, even when I try to install odl-integration-all from distribution. But whenever I download the distribution built on Jenkins[0] from my patch to enable serviceutils in distribution, the feature:install fails with the same error as below. Thanks, Faseela
From: Michael Vorburger [mailto:vorburger@...]
Sent: Thursday, June 21, 2018 6:56 PM To: controller-dev <controller-dev@...>; mdsal-dev@... Cc: Stephen Kitt <skitt@...>; serviceutils-dev@...; Vishal Thapar <vthapar@...>; Faseela K <faseela.k@...> Subject: Re: Serviceutils distro failure
On Thu, Jun 21, 2018 at 1:41 PM, Michael Vorburger <vorburger@...> wrote:
I had to refresh my own memory by looking at the code in odlparent:bundles-test-lib and infrautils:ready, so just as a refresher: This isn't actually "timing out", in this case. I was wrong to suggest that "somehow the build in distribution is too slow perhaps" - there are no timeouts to increase to get this to pass. It's *NOT* waiting those hard-coded max. 5 minutes. What's happening here is that one of the bundles is in "bundleState = Failure" (grep that log for that), and that fairly quickly and early on - there are only 17s between the following two key log messages:
2018-06-20T18:35:25,908 | INFO | awaitility[checkBundleDiagInfos] | SystemReadyImpl | 350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 2s, remaining time 297s, diag: Booting {Installed=0, Resolved=4, Unknown=0, GracePeriod=117, Waiting=0, Starting=0, Active=485, Stopping=0, Failure=0}
2018-06-20T18:35:42,573 | ERROR | SystemReadyService-0 | SystemReadyImpl | 350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | Failed, some bundles did not start (SystemReadyListeners are not called) org.opendaylight.odlparent.bundlestest.lib.SystemStateFailureException: diag failed; some bundles failed to start diag: Failure {Installed=0, Resolved=4, Unknown=0, GracePeriod=54, Waiting=0, Starting=0, Active=547, Stopping=0, Failure=1}
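As a small illustration of how the two log messages above can be compared, here is a sketch that parses the timestamps and the trailing "diag: ... {...}" counters. The helper names are made up for this example and the parsing assumes exactly the line layout quoted in this thread, nothing more general.

```python
import re
from datetime import datetime

FIRST = (
    "2018-06-20T18:35:25,908 | INFO | awaitility[checkBundleDiagInfos] | SystemReadyImpl | "
    "350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: "
    "Elapsed time 2s, remaining time 297s, diag: Booting {Installed=0, Resolved=4, Unknown=0, "
    "GracePeriod=117, Waiting=0, Starting=0, Active=485, Stopping=0, Failure=0}"
)
SECOND = (
    "2018-06-20T18:35:42,573 | ERROR | SystemReadyService-0 | SystemReadyImpl | "
    "350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | Failed, some bundles did not "
    "start (SystemReadyListeners are not called) "
    "org.opendaylight.odlparent.bundlestest.lib.SystemStateFailureException: diag failed; some "
    "bundles failed to start diag: Failure {Installed=0, Resolved=4, Unknown=0, GracePeriod=54, "
    "Waiting=0, Starting=0, Active=547, Stopping=0, Failure=1}"
)


def parse_timestamp(line):
    """Parse the leading timestamp, i.e. everything before the first '|'."""
    stamp = line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S,%f")


def parse_diag_counts(line):
    """Parse the '{State=count, ...}' block at the end of the line into a dict."""
    body = re.search(r"\{([^{}]*)\}\s*$", line).group(1)
    return {key: int(value) for key, value in (item.split("=") for item in body.split(", "))}


elapsed = (parse_timestamp(SECOND) - parse_timestamp(FIRST)).total_seconds()
before, after = parse_diag_counts(FIRST), parse_diag_counts(SECOND)
changed = {state: after[state] - before[state] for state in before if after[state] != before[state]}

print(f"elapsed: {elapsed:.3f}s")
print(f"changed states: {changed}")
```

The difference between the two snapshots shows bundles leaving GracePeriod and ending up in either Active or Failure, which matches the point that this is an early failure rather than a timeout.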
The bundle that fails in Blueprint initialization is the (new) serviceutils:srm-impl, due to that weird schema not being found, so figuring that one out really is the key to resolving this.
Nothing new - just reconfirming previous message, after having stared more at the logs.
Tom or Robert, if you can help clarify how these schemas are normally found by the BindingToNormalizedNodeCodec, and what could cause one not to be found, that would be interesting... ;-) Otherwise I guess we can have some fun for the next weeks learning more about this controller and mdsal code!
|
Michael Vorburger <vorburger@...>
On Thu, Jun 21, 2018 at 1:41 PM, Michael Vorburger <vorburger@...> wrote:
I had to refresh my own memory by looking at the code in odlparent:bundles-test-lib and infrautils:ready, so just as a refresher: This isn't actually "timing out", in this case. I was wrong to suggest that "somehow the build in distribution is too slow perhaps" - there are no timeouts to increase to get this to pass. It's *NOT* waiting those hard-coded max. 5 minutes. What's happening here is that one of the bundles is in "bundleState = Failure" (grep that log for that), and that fairly quickly and early on - there are only 17s between the following two key log messages:
2018-06-20T18:35:25,908 | INFO | awaitility[checkBundleDiagInfos] | SystemReadyImpl | 350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | checkBundleDiagInfos: Elapsed time 2s, remaining time 297s, diag: Booting {Installed=0, Resolved=4, Unknown=0, GracePeriod=117, Waiting=0, Starting=0, Active=485, Stopping=0, Failure=0}
2018-06-20T18:35:42,573 | ERROR | SystemReadyService-0 | SystemReadyImpl | 350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | Failed, some bundles did not start (SystemReadyListeners are not called) org.opendaylight.odlparent.bundlestest.lib.SystemStateFailureException: diag failed; some bundles failed to start diag: Failure {Installed=0, Resolved=4, Unknown=0, GracePeriod=54, Waiting=0, Starting=0, Active=547, Stopping=0, Failure=1}
Nothing new - just reconfirming previous message, after having stared more at the logs.
Tom or Robert, if you can help clarify how these schemas are normally found by the BindingToNormalizedNodeCodec, and what could cause one not to be found, that would be interesting... ;-) Otherwise I guess we can have some fun for the next weeks learning more about this controller and mdsal code!
|
Michael Vorburger <vorburger@...>
+controller-dev & mdsal-dev: On Thu, Jun 21, 2018 at 4:57 AM, Vishal Thapar <vthapar@...> wrote:
For background, this is with https://git.opendaylight.org/gerrit/#/c/73212/, currently reverted on master. It passes locally (I just tried), so I suspect another timing related issue - somehow the build in distribution is too slow perhaps.
The error shown (unresolved dependencies ... DOMSchemaService) is probably just an effect, not a cause. This, which appears earlier in that https://jenkins.opendaylight.org/releng/job/distribution-check-fluorine/69/consoleFull log, seems to be more interesting:
2018-06-20T18:35:42,573 | ERROR | SystemReadyService-0 | TestBundleDiag | 350 - org.opendaylight.infrautils.ready-impl - 1.4.0.SNAPSHOT | NOK org.opendaylight.serviceutils.srm-impl:0.2.0.SNAPSHOT: OSGi state = Active, Karaf bundleState = Failure, due to: Declarative Services Blueprint 6/20/18 6:35 PM Exception: Unable to initialize bean .component-2
org.osgi.service.blueprint.container.ComponentDefinitionException: Unable to initialize bean .component-2
 at org.apache.aries.blueprint.container.BeanRecipe.runBeanProcInit(BeanRecipe.java:738)
 (...)
Caused by: org.osgi.service.blueprint.container.ComponentDefinitionException: Error processing "rpc-implementation" for class org.opendaylight.serviceutils.srm.impl.SrmRpcProvider
 at org.opendaylight.controller.blueprint.ext.RpcImplementationBean.init(RpcImplementationBean.java:69)
 (...)
Caused by: java.lang.IllegalStateException: Schema for interface org.opendaylight.yang.gen.v1.urn.opendaylight.serviceutils.srm.rpcs.rev170711.SrmRpcsService is not available.
 at com.google.common.base.Preconditions.checkState(Preconditions.java:585)
 at org.opendaylight.mdsal.binding.dom.adapter.BindingToNormalizedNodeCodec.getModuleBlocking(BindingToNormalizedNodeCodec.java:303)
This doesn't look great, right? How can a Schema for a generated code interface just be missing like this? But only in distribution, and not locally reproducible?
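When a console log flattens a stack trace like the one above into a single line, the innermost "Caused by:" entry is usually the one worth reading first. The following is a rough sketch for pulling it out; the regular expression and the function name are assumptions made for this example, and it will misfire if an exception message itself contains something that looks like an " at package.Class.method(" frame.

```python
import re

# Excerpt of the flattened trace quoted in this thread, including its "(...)" elisions.
TRACE = (
    'org.osgi.service.blueprint.container.ComponentDefinitionException: Unable to initialize bean '
    '.component-2 at org.apache.aries.blueprint.container.BeanRecipe.runBeanProcInit(BeanRecipe.java:738) '
    '(...) Caused by: org.osgi.service.blueprint.container.ComponentDefinitionException: Error '
    'processing "rpc-implementation" for class org.opendaylight.serviceutils.srm.impl.SrmRpcProvider '
    'at org.opendaylight.controller.blueprint.ext.RpcImplementationBean.init(RpcImplementationBean.java:69)'
    '(...)Caused by: java.lang.IllegalStateException: Schema for interface '
    'org.opendaylight.yang.gen.v1.urn.opendaylight.serviceutils.srm.rpcs.rev170711.SrmRpcsService '
    'is not available. at com.google.common.base.Preconditions.checkState(Preconditions.java:585)'
)

# Capture the text after "Caused by: " up to the first stack frame that follows it.
CAUSE_RE = re.compile(r"Caused by: (.*?)\s+at [\w.$]+\(", re.DOTALL)


def root_cause(trace):
    """Return the description of the last (innermost) 'Caused by:' entry, or None."""
    causes = CAUSE_RE.findall(trace)
    return causes[-1] if causes else None


print(root_cause(TRACE))
```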
There is also an occurrence of https://jira.opendaylight.org/browse/NETCONF-534 - not sure how worrying that should be? |
Vishal Thapar <vthapar@...>
Hi Michael, Stephen,
We are unable to enable serviceutils in distro due to a failure in the distro job. I tried looking at the logs; they show something wrong with srm-shell, but it works locally, even with a clean m2, so I am not sure what we are missing. Any inputs?
2018-06-20T18:40:23,484 | ERROR | Blueprint Extender: 3 | BlueprintContainerImpl | 82 - org.apache.aries.blueprint.core - 1.8.3 | Unable to start blueprint container for bundle org.opendaylight.serviceutils.srm-shell/0.2.0.SNAPSHOT due to unresolved dependencies [(&(|(type=default)(!(type=*)))(objectClass=org.opendaylight.mdsal.dom.api.DOMSchemaService))]
Regards, Vishal.
|