I tend to agree. The branch lock is relevant only to MSI projects at this point. Those projects do not even have what I would call active committers (with the notable exception of OFP).
Meanwhile the way branch-lock works is quite wrong, as it should only be affecting MSI projects (e.g. those in autorelease.git), but affects everyone (who happens to match their branch naming).
Ditch it for all I care.
Regards,
Robert
I am slowly catching up on things from last month. One item is the subject of GitHub workflows.
There are a number of unresolved issues, some of which may be my ignorance of the outside world (in which case I would *love* to be proven wrong). Here is the list:
1. Support for multiple branches
================================
It is OpenDaylight policy to support up to 3 branches at any given time for any MSI project. For MRI projects, that number gets to 4 for periods lasting 2-5 months -- as is the case for YANG Tools right now, we have:
- yangtools-7.0.x for 2022.03 Phosphorus security support
- yangtools-8.0.x for 2022.06 Sulfur
- yangtools-9.0.x for 2022.09 Chlorine
- yangtools-master for 2023.03 Argon
As far as I know, GitHub does not provide the equivalent of Gerrit cherry-picks out of the box. That certainly was the case ~5 years ago, when I investigated this more deeply.
The crux of the issue seems to be Change-ID and its tie-in with GH PRs. I was told by Andy Grimberg this is nigh impossible to reconcile. Change-ID is critical for cross-referencing commits, because equivalent patches can look very different on each supported branch.
That having been said, I do believe this is fixable by automation, e.g. having a bot assign Change-IDs for a PR and squashing each PR into a single patch -- which then can be projected to Gerrit, allowing for migration. I am not aware of such a bot existing, so I track this as something that would have to be contributed.
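To make the Change-ID idea concrete, here is a minimal sketch in Python of the two pieces such a bot would need: assigning a Change-Id trailer to a squashed PR message, and cross-referencing commits across branches by that trailer. Everything here is an assumption for illustration: the function names, the "repo#number" seed, and the choice to derive the ID by hashing the seed are hypothetical, and this is not how Gerrit's own commit-msg hook computes its IDs. The only thing taken from Gerrit is the trailer format, "I" followed by 40 hex digits.

```python
import hashlib
import re
from collections import defaultdict

# A Gerrit Change-Id trailer is "I" followed by 40 lowercase hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id: (I[0-9a-f]{40})[ \t]*$", re.MULTILINE)
# Loose pattern for a git trailer line such as "Signed-off-by: Jane <j@example.org>".
TRAILER_RE = re.compile(r"^[A-Za-z0-9-]+: \S")


def change_id(message):
    """Return the last Change-Id trailer found in a commit message, or None."""
    found = CHANGE_ID_RE.findall(message)
    return found[-1] if found else None


def ensure_change_id(message, seed):
    """Return `message` with a Change-Id trailer, adding one only if it is missing.

    `seed` is a hypothetical stable key for the PR (for example "repo#number");
    hashing it keeps the ID the same every time the PR is re-squashed.
    """
    if change_id(message) is not None:
        return message
    body = message.rstrip("\n")
    paragraphs = body.split("\n\n")
    last_lines = paragraphs[-1].split("\n")
    # Join an existing trailer block instead of starting a new paragraph.
    in_trailers = len(paragraphs) > 1 and all(TRAILER_RE.match(line) for line in last_lines)
    separator = "\n" if in_trailers else "\n\n"
    digest = hashlib.sha1(seed.encode("utf-8")).hexdigest()
    return body + separator + "Change-Id: I" + digest + "\n"


def cross_reference(commits):
    """Map each Change-Id to the (branch, sha) pairs that carry it."""
    index = defaultdict(list)
    for branch, sha, message in commits:
        cid = change_id(message)
        if cid is not None:
            index[cid].append((branch, sha))
    return dict(index)
```

With something like this, the squashed commit of a PR keeps one Change-Id across updates, and a backport to yangtools-8.0.x can be matched to its master counterpart even when the two diffs differ. A real bot would still need the GitHub and Gerrit plumbing around it, which is the part that does not exist today.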
2. Permissions
==============
GitHub is a system external to LF. As such, I do not think there is infrastructure present to project each project's INFO.yaml into GitHub permissions. AFAICT the only existing thing is the 'OpenDaylight project', which is an all-or-nothing thing. That is something LF IT has to tackle before we consider migrating.
3. Verification
===============
Our current infrastructure is tied to Jenkins. A switch to GH requires that a PR triggers the appropriate jobs in Jenkins. Unless we are talking about a straight-up move to GH Actions, we need point 1. to be solved and verification driven through Gerrit, with results projected back to GH. If GH Actions are in the picture, at least maven-verify needs to be migrated. Again, this needs a community contribution.
Now, I am not a nay-sayer, but we have a sh*tload of things going on, and migrating them requires some serious manpower. For the uninitiated, I like to quote Thanh: "Getting Eclipse build where it is was a twelve month effort for three dedicated people" (IIRC). We no longer have Thanh and all his dedication and experience.
I have used this quote when someone proposed moving to Gradle. Moving to GH workflows is harder.
If we are to tackle this, we need to solve the above problems in order: 2, 1, 3. I will lend my support to anyone seriously committed to this undertaking.
At the end of the day, this is not impossible. The OpenJDK community has executed a full transition from custom Mercurial workflows (webrev et al.) to GitHub PRs -- but that transition includes a metric ton of automation which had to be written from scratch. We as a community are struggling to get to Jenkins Pipelines, which is dead simple in comparison.
So, Venkat, as the one proposing this change, are you in a position and willing to drive it to completion?
Regards,
Robert
This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you.
Hello
As discussed during the TSC meeting of 7th July <https://wiki.opendaylight.org/display/ODL/2022-07-07+TSC+Minutes>,
I'd like to challenge the relevance of branch locks during release processes.
In my opinion they have more cons than pros today.
I agree they used to be meaningful in the past to avoid potential overlaps and inconsistencies.
But that was a time when many active projects and committers were taking part in the release.
I am not convinced the size of the community still justifies such a process today.
And since most active projects and their committers are quite experienced,
I am quite convinced that branch locks slow the release down more than they help.
I am thinking especially of recurring situations where downstream projects face
bugs in their upstream dependencies and have to wait for the branch to be unlocked
to update their poms and trigger stage-releases jobs.
But I might have missed some aspects.
So I would like to have other community members' opinions on the topic.
Best Regards
Guillaume
> I have filed patches for all three of these, once they are merged we should see how CSIT goes.

Alright, so now we are in business, we have https://jenkins.opendaylight.org/releng/job/integration-distribution-test-chlorine/90/ going.
It seems all tests are currently failing due to being run with Java 11; now I am not sure where those definitions are...
Regards,
Robert
Sent: Monday, 8 August 2022 17:45
To: Daniel de la Rosa; TSC
Subject: RE: Code freeze for Sulfur SR2
Hi Daniel
You are right.
For the moment, I assume that we can still proceed as before once again,
because we didn't close the debate about this point.
I still have to send an email with more details to trigger it.
Most people, including myself, were on vacation just after this meeting.
I can do it soon now that everyone is back.
Best Regards
Guillaume
From: Daniel de la Rosa <ddelarosa0707@...>
Sent: Monday, 8 August 2022 17:00:00
To: TSC; LAMBERT Guillaume INNOV/NET
Subject: Code freeze for Sulfur SR2
Hello Guillaume and all, I just remembered that we wanted to shorten the code freeze for new releases, as it is documented in our TSC meeting minutes from July 7th. So based on this, when do we want to announce and/or start the code freeze for Sulfur SR2?
We are going to code freeze Sulfur for all Managed Projects (cut and lock release branches) on Monday, August 15th 2022, at 10 am UTC.
Please remember that we only allow blocker bug fixes in the release branch after code freeze.
Daniel de la Rosa
ODL Release Manager
Thanks
ps. Release schedule and checklist for your reference
Hello,
Did you get a chance to look into the configurations shared?
Thanks & Regards,
Rohini
Cell: +91.9995241298 | VoIP: +91.471.3025332
From: Rohini Ambika
Sent: Friday, July 29, 2022 11:35 AM
To: Rahul Sharma <rahul.iitr@...>
Cc: Anil Shashikumar Belur <abelur@...>; Hsia, Andrew <andrew.hsia@...>; Casey Cain <ccain@...>; Luis Gomez <ecelgp@...>; TSC <tsc@...>; John Mangan <John.Mangan@...>; Sathya Manalan <sathya.manalan@...>; Hemalatha Thangavelu <hemalatha.t@...>; Gokul Sakthivel <gokul.sakthivel@...>; Bhaswati_Das <Bhaswati_Das@...>
Subject: RE: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
Hello Rahul,
Please find the answers below:
- Official Helm chart @ ODL Helm Chart. Attaching the values.yml for reference.
- The fix was to restart the Owner Supervisor on failure. Check-in @ https://git.opendaylight.org/gerrit/c/controller/+/100357
We observed the same problem when testing without a K8s setup, following the instructions @ https://docs.opendaylight.org/en/stable-phosphorus/getting-started-guide/clustering.html. Instead of installing the odl-mdsal-distributed-datastore feature, we enabled the features given in the values.yml.
Thanks & Regards,
Rohini
Cell: +91.9995241298 | VoIP: +91.471.3025332
From: Rahul Sharma <rahul.iitr@...>
Sent: Thursday, July 28, 2022 9:32 PM
To: Rohini Ambika <rohini.ambika@...>
Cc: Anil Shashikumar Belur <abelur@...>; Hsia, Andrew <andrew.hsia@...>; Casey Cain <ccain@...>; Luis Gomez <ecelgp@...>; TSC <tsc@...>; John Mangan <John.Mangan@...>; Sathya Manalan <sathya.manalan@...>; Hemalatha Thangavelu <hemalatha.t@...>; Gokul Sakthivel <gokul.sakthivel@...>; Bhaswati_Das <Bhaswati_Das@...>
Subject: Re: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
Hello Rohini,
Thank you for the answers.
- For the 1st one: when you say you tried with the official Helm charts, which Helm charts are you referring to? Can you send more details on how you deployed these charts (the parameters in values.yaml that you used)?
- What was the temporary fix that reduced the occurrence of the issue? Can you point to the check-in made or the change in configuration parameters? That would be helpful in diagnosing a proper fix.
Regards,
Rahul
On Thu, Jul 28, 2022 at 2:21 AM Rohini Ambika <rohini.ambika@...> wrote:
Hi Anil,
Thanks for the response.
Please find the details below:
1. Is the Test deployment using our Helm charts (ODL Helm Chart)? – We have created our own Helm chart for the ODL deployment. We have also tried the use case with the official Helm chart.
2. I see that the JIRA mentioned in the below email ( https://jira.opendaylight.org/browse/CONTROLLER-2035 ) is already marked Resolved. Has somebody fixed it in the latest version? – This was a temporary fix from our end, and the failure rate has gone down with it; however, we still face the issue when we restart the master node multiple times.
The ODL version used is Phosphorus SR2.
All the configurations are provided and attached in the initial mail.
Thanks & Regards,
Rohini
Cell: +91.9995241298 | VoIP: +91.471.3025332
From: Anil Shashikumar Belur <abelur@...>
Sent: Thursday, July 28, 2022 5:05 AM
To: Rahul Sharma <rahul.iitr@...>
Cc: Hsia, Andrew <andrew.hsia@...>; Rohini Ambika <rohini.ambika@...>; Casey Cain <ccain@...>; Luis Gomez <ecelgp@...>; TSC <tsc@...>
Subject: Re: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
Hi,
I believe they are using ODL Helm charts and K8s for the cluster setup; that said, I have requested the version of ODL being used.
Rohini: Can you provide more details on the ODL version and configuration that Rahul/Andrew requested?
On Thu, Jul 28, 2022 at 8:08 AM Rahul Sharma <rahul.iitr@...> wrote:
Hi Anil,
Thank you for bringing this up.
Couple of questions:
- Is the Test deployment using our Helm charts (ODL Helm Chart)?
- I see that the JIRA mentioned in the below email ( https://jira.opendaylight.org/browse/CONTROLLER-2035 ) is already marked Resolved. Has somebody fixed it in the latest version?
Thanks,
Rahul
On Wed, Jul 27, 2022 at 5:05 PM Anil Shashikumar Belur <abelur@...> wrote:
Hi Andrew and Rahul:
I remember we have discussed these topics in the ODL containers and helm charts meetings.
Do we know if the expected configuration would work with the ODL on K8s clusters setup or requires some configuration changes?
Cheers,
Anil
---------- Forwarded message ---------
From: Group Notification <noreply@...>
Date: Wed, Jul 27, 2022 at 9:04 PM
Subject: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
To: <odl-mailman-owner@...>
A message was sent to the group https://lists.opendaylight.org/g/dev from rohini.ambika@... that needs to be approved because the user is new member moderated.
Subject: FW: ODL Clustering issue - High Availability
Hi All,
As presented and discussed in the ODL TSC meeting held on Friday the 22nd at 10:30 AM IST, I am posting this email to highlight the issues with ODL clustering use cases that we encountered during performance testing.
Details and configurations are as follows:
* Requirement: ODL clustering for high availability (HA) on data distribution
* Env Configuration:
  * 3-node k8s cluster (1 master & 3 worker nodes) with 3 ODL instances running on each node
  * CPU: 8 cores
  * RAM: 20GB
  * Java heap size: Min - 512MB, Max - 16GB
  * JDK version: 11
  * Kubernetes version: 1.19.1
  * Docker version: 20.10.7
* ODL features installed to enable clustering:
  * odl-netconf-clustered-topology
  * odl-restconf-all
* Devices configured: NETCONF devices, all having the same schema (tested with 250 devices)
* Use Case:
  * Fail Over/High Availability:
    * Expected: If any ODL instance goes down or restarts due to a network split or an internal error, the other instances in the cluster should remain available and functional. If the affected instance holds the master mount, the instance elected as the new master should be able to re-register the devices and resume operations. Once the affected instance comes back up, it should be able to rejoin the cluster as a member node and register the slave mounts.
    * Observation: When the ODL instance holding the master mount restarts, an election takes place among the other nodes in the cluster and a new leader is elected. The new leader then tries to re-register the master mount but fails at some point due to the termination of the Akka Cluster Singleton Actor. As a result, the cluster goes into an idle state and fails to assign an owner for the device DOM entity. In this state, configuring already mounted devices and creating new mounts both fail.
* JIRA reference: https://jira.opendaylight.org/browse/CONTROLLER-2035
* Akka configuration of all the nodes is attached. (We increased the gossip-interval time to 5s in the akka.conf file to avoid the Akka AskTimedOut issue while mounting multiple devices at a time.)
We request your support in identifying whether there is any misconfiguration or any known solution for the issue.
Please let us know if any further information is required.
Note: We have tested a single ODL instance without cluster features enabled in the K8s cluster. In case of a K8s node failure, the ODL instance is rescheduled on another available K8s node and operations resume.
Thanks & Regards,
Rohini
Hi TSC, (+ Pano on CC)
As discussed at the last meeting, I would like to propose a survey for OpenDaylight users, in order to get a better angle on our community and users of OpenDaylight.
Feel free to think about any additional questions that might be suitable for the survey. Do keep in mind that it should be short, in order to encourage users to answer.
In order to visualize it better, I created a mockup in a tool called Tally: https://tally.so/r/3xXyDd (Most questions are marked as required, except for one optional field)
The ideal outcome would be 50+ responses and a sort of 1-pager, discussing what the results indicate for the project.
@Pano: Does LFN have some kind of tool where we could create the survey and validate them? I will gladly take part in creating the output of this survey with you.
Looking forward to any feedback,
Filip Čúzy
Marketing Specialist
PANTHEON .tech
Mlynské Nivy 56, 821 05 Bratislava
Slovakia
Tel / +421 220 665 111
MAIL / filip.cuzy@...
WEB / https://pantheon.tech
Done, managed distribution is out there and it shrunk by ~8MiB, which is good news.

On 07/07/2022 01:35, Robert Varga wrote:
> All this has been completed and the MSI projects have been updated. Unfortunately there are three more things blocking autorelease:
> - int/dist's master needs to be built with Java 17
> - we need a centos8-4c-16g builder

Done, we have AR working here: https://jenkins.opendaylight.org/releng/job/autorelease-release-chlorine-mvn38-openjdk17/buildTimeTrend

> - we need to remove Java 11-based autorelease-release-chlorine

We still need https://git.opendaylight.org/gerrit/c/releng/builder/+/101980 merged, because ...

> I have filed patches for all three of these, once they are merged we should see how CSIT goes.

... AR-openjdk17 is not triggering CSIT without it.
Regards,
Robert
> Hello everyone,
> Since we are well in the 2022.09 Simultaneous Release (Chlorine), here is a quick summary of where we are at:
> - MRI projects up to and including AAA have released
> - MSI projects have preliminary patches staged at https://git.opendaylight.org/gerrit/q/topic:chlorine-mri
> - NETCONF is awaiting a bug scrub and the corresponding release. There are quite a few issues to scrub and we also need some amount of code reorg within the repo, which in itself may entail breaking changes. There are quite a few unreviewed patches pending as well. Given the raging summer in the northern hemisphere, I expect netconf-4.0.0 release to happen in about 2-3 weeks' time (i.e. last week of July 2022)
> - BGPCEP has a few deliverables yet to be finished and the corresponding 0.18.0 release being dependent on NETCONF, my working assumption is having the release available mid-August 2022

All this has been completed and the MSI projects have been updated. Unfortunately there are three more things blocking autorelease:
- int/dist's master needs to be built with Java 17
- we need a centos8-4c-16g builder
- we need to remove Java 11-based autorelease-release-chlorine

I have filed patches for all three of these; once they are merged we should see how CSIT goes.
All the patches related to this effort are staged at https://git.opendaylight.org/gerrit/q/topic:chlorine-mri, as usual.
Regards,
Robert
P.S. The docs patches should be finished in the next few days.
Hello OpenDaylight Community,
The next TSC meeting is August 4, 2022 at 10 pm Pacific Time.
As usual, the agenda proposal and the connection details for this meeting are available in the wiki
at the following URL:
https://wiki.opendaylight.org/x/YwGdAQ
If you need to add anything, please let me know or add it there.
The meeting minutes will be at the same location after the meeting is over.
Best Regards
Guillaume
Hello,
Did you get a chance to look in to the configurations shared.
Thanks & Regards,
Rohini
Cell: +91.9995241298 | VoIP: +91.471.3025332
Sent: Friday, July 29, 2022 11:35 AM
To: Rahul Sharma <rahul.iitr@...>
Cc: Anil Shashikumar Belur <abelur@...>; Hsia, Andrew <andrew.hsia@...>; Casey Cain <ccain@...>; Luis Gomez <ecelgp@...>; TSC <tsc@...>; John Mangan <John.Mangan@...>; Sathya Manalan <sathya.manalan@...>; Hemalatha Thangavelu <hemalatha.t@...>; Gokul Sakthivel <gokul.sakthivel@...>; Bhaswati_Das <Bhaswati_Das@...>
Subject: RE: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
Hello Rahul,
Please find the answers below:
- Official Helm chart @ ODL Helm Chart . Attaching the values.yml for reference
- Fix was to restart the Owner Supervisor on failure . Check-in @ https://git.opendaylight.org/gerrit/c/controller/+/100357
We observed the same problem when tested without K8s set up by following the instructions @ https://docs.opendaylight.org/en/stable-phosphorus/getting-started-guide/clustering.html. Instead of installing odl-mdsal-distributed-datastore feature, we have enabled the features given in the values.yml.
Thanks & Regards,
Rohini
Cell: +91.9995241298 | VoIP: +91.471.3025332
From: Rahul Sharma <rahul.iitr@...>
Sent: Thursday, July 28, 2022 9:32 PM
To: Rohini Ambika <rohini.ambika@...>
Cc: Anil Shashikumar Belur <abelur@...>; Hsia, Andrew <andrew.hsia@...>; Casey Cain <ccain@...>;
Luis Gomez <ecelgp@...>; TSC <tsc@...>; John Mangan <John.Mangan@...>; Sathya Manalan <sathya.manalan@...>;
Hemalatha Thangavelu <hemalatha.t@...>; Gokul Sakthivel <gokul.sakthivel@...>; Bhaswati_Das <Bhaswati_Das@...>
Subject: Re: [opendaylight-dev] Message Approval Needed -
rohini.ambika@... posted to
dev@...
[**EXTERNAL EMAIL**]
Hello Rohini,
Thank you for the answers.
- For the 1st one: when you say you tried with the official Helm charts - which helm charts are you referring to? Can you send more details on how (parameters in values.yaml that you used) when you deployed these charts.
- What was the Temporary fix that reduced the occurrence of the issue. Can you point to the check-in made or change in configuration parameters? Would be helpful to diagnose a proper fix.
Regards,
Rahul
On Thu, Jul 28, 2022 at 2:21 AM Rohini Ambika <rohini.ambika@...> wrote:
Hi Anil,
Thanks for the response.
Please find the details below:
1. Is the Test deployment using our Helm charts (ODL Helm Chart)? – We have created our own Helm chart for the ODL deployment. We have also tried the use case with the official Helm chart.
2. I see that the JIRA mentioned in the below email ( https://jira.opendaylight.org/browse/CONTROLLER-2035 ) is already marked Resolved. Has somebody fixed it in the latest version? – This was a temporary fix from our end. The failure rate has reduced due to the fix; however, we still face the issue when we do multiple restarts of the master node.
The ODL version used is Phosphorus SR2.
All the configurations are provided and attached in the initial mail.
Thanks & Regards,
Rohini
From: Anil Shashikumar Belur <abelur@...>
Sent: Thursday, July 28, 2022 5:05 AM
To: Rahul Sharma <rahul.iitr@...>
Cc: Hsia, Andrew <andrew.hsia@...>; Rohini Ambika <rohini.ambika@...>; Casey Cain <ccain@...>; Luis Gomez <ecelgp@...>; TSC <tsc@...>
Subject: Re: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
Hi,
I believe they are using ODL Helm charts and K8s for the cluster setup. That said, I have requested the version of ODL being used.
Rohini: Can you provide more details on the ODL version and configuration that Rahul/Andrew requested?
On Thu, Jul 28, 2022 at 8:08 AM Rahul Sharma <rahul.iitr@...> wrote:
Hi Anil,
Thank you for bringing this up.
Couple of questions:
- Is the Test deployment using our Helm charts (ODL Helm Chart)?
- I see that the JIRA mentioned in the below email ( https://jira.opendaylight.org/browse/CONTROLLER-2035 ) is already marked Resolved. Has somebody fixed it in the latest version?
Thanks,
Rahul
On Wed, Jul 27, 2022 at 5:05 PM Anil Shashikumar Belur <abelur@...> wrote:
Hi Andrew and Rahul:
I remember we have discussed these topics in the ODL containers and helm charts meetings.
Do we know if the expected configuration would work with the ODL on K8s clusters setup or requires some configuration changes?
Cheers,
Anil
---------- Forwarded message ---------
From: Group Notification <noreply@...>
Date: Wed, Jul 27, 2022 at 9:04 PM
Subject: [opendaylight-dev] Message Approval Needed - rohini.ambika@... posted to dev@...
To: <odl-mailman-owner@...>
A message was sent to the group https://lists.opendaylight.org/g/dev from rohini.ambika@... that needs to be approved because the user is new member moderated.
Subject: FW: ODL Clustering issue - High Availability
Hi All,
As presented and discussed in the ODL TSC meeting held on Friday the 22nd at 10:30 AM IST, I am posting this email to highlight the ODL clustering issues we encountered during performance testing.
Details and configurations as follows:
* Requirement : ODL clustering for high availability (HA) on data distribution
* Env Configuration:
* 3 node k8s Cluster ( 1 master & 3 worker nodes) with 3 ODL instances running on each node
* CPU : 8 Cores
* RAM : 20GB
* Java Heap size : Min - 512MB Max - 16GB
* JDK version : 11
* Kubernetes version : 1.19.1
* Docker version : 20.10.7
* ODL features installed to enable clustering:
* odl-netconf-clustered-topology
* odl-restconf-all
* Devices configured : Netconf devices, all with the same schema (tested with 250 devices)
* Use Case:
* Fail Over/High Availability:
* Expected : If any ODL instance goes down or restarts due to a network split or an internal error, the other instances in the cluster should remain available and functional. If the affected instance holds the master mount, the instance elected as the new master should be able to re-register the devices and resume operations. Once the affected instance comes back up, it should be able to rejoin the cluster as a member node and register the slave mounts.
* Observation : When the ODL instance holding the master mount restarts, an election takes place among the other nodes in the cluster and a new leader is elected. The new leader then tries to re-register the master mount but fails partway through because the Akka Cluster Singleton Actor has terminated. The cluster then goes idle and fails to assign an owner for the device DOM entity. In this state, configuration of already mounted devices and of new mounts fails.
* JIRA reference : https://jira.opendaylight.org/browse/CONTROLLER-2035
* Akka configuration of all the nodes is attached. (We increased the gossip-interval to 5s in the akka.conf file to avoid the Akka AskTimedOut issue while mounting multiple devices at a time.)
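When reproducing a failover like the one described above, it can help to record what each member reports about shard leadership before and after the restart. The following is a minimal sketch and not part of the original report. It assumes that Jolokia is enabled on each instance and reachable on port 8181, that the default admin/admin credentials are in use, that members are named member-1, member-2 and so on, and that the default `topology` shard is the one of interest. The helper names are hypothetical. It only inspects shard Raft state; it does not inspect the entity-ownership state that the observation refers to.

```python
import base64
import json
import urllib.request

# Assumption: Jolokia is reachable on port 8181 on every ODL instance.
JOLOKIA_READ = "http://{host}:8181/jolokia/read/"


def shard_mbean_url(host, member, shard="topology", datastore="operational"):
    """Build the Jolokia read URL for one shard replica on one member."""
    ds_type = {
        "config": "DistributedConfigDatastore",
        "operational": "DistributedOperationalDatastore",
    }[datastore]
    mbean = (
        "org.opendaylight.controller:Category=Shards,"
        f"name={member}-shard-{shard}-{datastore},type={ds_type}"
    )
    return JOLOKIA_READ.format(host=host) + mbean


def fetch_shard_state(host, member, user="admin", password="admin"):
    """Fetch and decode the shard MBean from one member (performs an HTTP GET).

    Assumption: basic authentication with the given credentials is accepted.
    """
    request = urllib.request.Request(shard_mbean_url(host, member))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def find_leader(states):
    """Return the member reporting RaftState == 'Leader', or None.

    `states` maps member name -> decoded Jolokia response. None is returned
    when no member, or more than one member, claims leadership.
    """
    leaders = [
        member
        for member, body in states.items()
        if body.get("value", {}).get("RaftState") == "Leader"
    ]
    return leaders[0] if len(leaders) == 1 else None
```

Collecting `fetch_shard_state(host, member)` for every instance into a dictionary and passing it to `find_leader` gives a quick yes/no answer on whether the surviving members agree on a single shard leader after the master instance restarts.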
We request your support in identifying any misconfiguration or any known solution for the issue.
Please let us know if any further information is required.
Note : We have tested a single ODL instance in the K8s cluster without enabling cluster features. In case of a K8s node failure, the ODL instance is rescheduled on another available K8s node and operations resume.
Thanks & Regards,
Rohini
A complete copy of this message has been attached for your convenience.
---------- Forwarded message ----------
From: rohini.ambika@...
To: "dev@..." <dev@...>
Cc:
Bcc:
Date: Wed, 27 Jul 2022 11:03:22 +0000
Subject: FW: ODL Clustering issue - High Availability
Hello all
Gilles, I understand you need a formal approval of the TSC to proceed with LFIT.
So I just created a poll at this URL https://wiki.opendaylight.org/x/MAGdAQ
Please TSC members, can you give your feedback about this?
Thanks in advance
Best Regards
Guillaume
From: app-dev@... <app-dev@...> on behalf of Gilles Thouenon via lists.opendaylight.org <gilles.thouenon=orange.com@...>
Sent: Monday, August 1, 2022 15:08:01
To: TSC
Cc: transportpce-dev@...; LAMBERT Guillaume INNOV/NET; OLLIVIER Cédric INNOV/NET
Subject: [app-dev] TransportPCE evolution
Dear TSC members,
During the TSC meeting of June 30, Guillaume Lambert briefly presented our proposal to evolve the structure of the TransportPCE project. The purpose of this email is to summarize the changes we wish to implement starting with the Chlorine release, for validation by the TSC.
Currently, the TransportPCE project implements data models from the OpenROADM community and from ONF (T-API). All these models are systematically compiled at the beginning of the project build, which is in a way pointless.
This step already takes a lot of time, and will take even longer because in the medium term the project is moving towards implementing additional models (openconfig).
Moreover, past experience, especially with the migration to Sulfur, shows that a number of problems and regressions are directly related to YANG models. Seeing as early as possible how all these models compile when core projects such as yangtools or mdsal evolve would probably help detect bugs quickly during major changes.
This is why we would like to move the compilation of these official models to a new dedicated project (transportpce-models) with its own Gerrit repo (see my request for a new repo, IT-24286).
With each new release (of models and ODL), the models would be compiled once and for all, and the TransportPCE project would consume them as a simple Maven dependency.
These models could, if necessary, be reused by another project (typically the node simulator used for our functional tests, which implements part of these models), and also be used in the CI of yangtools/mdsal/etc.
We thank the TSC in advance for its agreement to implement this change and, if applicable, for any recommendations.
Gilles Thouenon
TransportPCE PTL
This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you.
Hi TSC,
the beta version of the website transformation, from WordPress to Jekyll, is ready for a wider demonstration to you.
- Repository on GitHub with a README. It is still missing some sort of structure for website approval.
- The website itself, hosted on Netlify.
The final version will be hosted on GitHub Pages; the Netlify hosting is for convenience's sake.
- OpenDaylight Wiki page updated.
Filip Čúzy
Marketing Specialist
PANTHEON .tech
Mlynské Nivy 56, 821 05 Bratislava
Slovakia
Tel / +421 220 665 111
MAIL / filip.cuzy@...
WEB / https://pantheon.tech