Re: ODL modernization


Robert Varga

On 09/04/2022 05:47, Luis Gomez wrote:
Hi TSC ex-colleagues,
As I promised, here is a list of topics for ODL modernization. This is all that came to mind for now, but there could be more; the goal here is really to trigger some brainstorming and discussion.
1) Use cases:
Our YANG-based platform made it easy for ODL to become the SDN controller of choice for HW devices supporting NETCONF and other control protocols like BGP-LS, PCEP and OpenFlow, where multi-vendor support is important. However, it made it hard to have a role in the cloud, where YANG and protocols like NETCONF, BGP-LS and PCEP are almost non-existent and multi-vendor support matters less. Now, given the amount of money and effort currently put into the cloud, it would make sense for ODL to at least participate in some cloud use case. For example, I believe ODL can still play a role in the hybrid cloud use case where HW devices need to talk to cloud devices: ODL could take care of the HW devices while another open source controller takes care of the cloud devices.
Strictly speaking: we already do, but through external integrations and those bring very little in terms of engagement/contributions :(

2) SW Platform:
The OSGi/Karaf platform was state-of-the-art 10 years back, when there were no real micro-service platforms like K8s, but now it is just obsolete. In the new paradigm of micro-services, applications are loosely coupled and mostly self-contained (e.g. they run their own processes within a container), although they can share some common resources like a database, a message broker, an API gateway, etc.
Weeeeeeeeeell. I could *very easily* sell you OSGi as:
- the original Java micro-service platform
- the current Java nano-service platform

Also JPMS re-creates a (miserable) subset of OSGi. But that really is semantics, so let's not get distracted here.

OSGi is just another runtime. It always has been, as far as ODL architectural principles are concerned. Unfortunately some projects require it as a core dependency -- just because they did not know better at the time that code was merged. There has been some very solid work done over the past ~2 years to remove those assumptions.

What remains, though, is that OSGi+Karaf is the only thing that is seriously integrated *and tested*. Your next comments and my responses need to be considered with regard to this simple truth.

For ODL to fit in the new paradigm, we would need to:
- Replace the ODL distribution with ODL applications that can run in their own containers: NETCONF, BGP, PCEP, TPCE, etc.
Agree. That is a packaging exercise. From the get-go, we can (mostly, with the notable exception of PCEP right now) do whatever you'd like... more below.

- Consolidate kernel repos and jars: Existing ODL applications share common code called the kernel. Today we have ~6 repositories and a bunch of jars for the kernel, with a single person/organization maintaining the code.
Yeah, no. As the person referenced in that sentence, I have to say that each repo (and project) has a rather well-defined scope -- as per our governance. It also keeps a reasonable straitjacket on what is done and how.

Merging the repos would throw us back ~9 years, to when controller.git contained all of yangtools, mdsal, netconf, adsal, the works. It would also lower the guards we have against layering/design violations.

One example I can quote here is https://git.opendaylight.org/gerrit/c/openflowplugin/+/91313 -- that patch should never have been approved, and it was an upgraded SpotBugs that eventually found it. It took three days to correct -- not something we want to do if long-term maintainability is our goal.

- Replace the Karaf/OSGi framework with something more current to plumb the Java code and jars together (e.g. Spring).
Right, and odlmicro just dropped the ball here in more than one way :(

The long-term plan here is to have OSGi DS annotations and a reasonable DI framework for static deployments. In this regard Blueprint is one of the worst decisions we have ever made.
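To make the DS part concrete, here is a minimal sketch of a constructor-injected component using the standard DS 1.4 annotations. DeviceMonitor is a made-up class for illustration only, and DataBroker is just an example dependency -- this is not a description of any existing ODL component:

    // Hypothetical DS component -- DeviceMonitor is not an existing ODL class.
    import org.opendaylight.mdsal.binding.api.DataBroker;
    import org.osgi.service.component.annotations.Activate;
    import org.osgi.service.component.annotations.Component;
    import org.osgi.service.component.annotations.Deactivate;
    import org.osgi.service.component.annotations.Reference;

    @Component(immediate = true)
    public final class DeviceMonitor {
        private final DataBroker dataBroker;

        @Activate
        public DeviceMonitor(@Reference final DataBroker dataBroker) {
            // The runtime hands us the dependency: no Blueprint XML and no
            // ServiceTracker boilerplate, just a plain constructor.
            this.dataBroker = dataBroker;
        }

        @Deactivate
        void deactivate() {
            // Tear down whatever the component set up in its constructor.
        }
    }

The nice property is that the annotations are class-file-only, so a class like this carries no OSGi runtime dependency and the very same constructor can be invoked by any other DI framework.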

In terms of "reasonable framework":

1. we have static Karaf fully supported, but per-use-case packaging has not been appearing. This is easy pickings most of the time: create a static Karaf, put it into a Docker image (or whatever) and you are done. It is a no-frills solution; boot time is at ~30% of dynamic Karaf's.

2. odlmicro is moribund. Contributors are welcome, but if none step forward, https://github.com/PANTHEONtech/lighty is very much an alternative, albeit one that is not readily integratable into individual projects :(

3. what we *really* want to do is Dagger. That's pretty much the gold standard in Android apps, completely compile-time, all that jazz. While we have the basics ready in some places, no one has prototyped anything reasonable with it.

At this point, I think the idea behind odlmicro should be supplanted by a single goal: make everything wireable via Dagger and provide a Dagger-based equivalent of netconf.git/static/pom.xml to get that use case up and running.
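To sketch what that could look like -- with the caveat that every class name below (ServerConfig, NorthboundServer, DeploymentModule, StaticDeployment) is a made-up placeholder, not existing ODL code -- the wiring would be roughly:

    // Hypothetical Dagger wiring sketch -- all names below are placeholders.
    import javax.inject.Inject;
    import javax.inject.Singleton;

    import dagger.Component;
    import dagger.Module;
    import dagger.Provides;

    // Stand-in configuration, normally read from a file or the environment.
    final class ServerConfig {
        final int port;

        ServerConfig(final int port) {
            this.port = port;
        }
    }

    // Stand-in application service: Dagger generates the factory for its
    // @Inject constructor and validates the dependency graph at build time.
    @Singleton
    final class NorthboundServer {
        private final ServerConfig config;

        @Inject
        NorthboundServer(final ServerConfig config) {
            this.config = config;
        }

        void start() {
            System.out.println("listening on port " + config.port);
        }
    }

    // Supplies the bits that are not constructor-injectable.
    @Module
    final class DeploymentModule {
        @Provides
        @Singleton
        static ServerConfig serverConfig() {
            return new ServerConfig(8181);
        }
    }

    // The compile-time wiring graph: Dagger generates DaggerStaticDeployment.
    @Singleton
    @Component(modules = DeploymentModule.class)
    interface StaticDeployment {
        NorthboundServer northboundServer();
    }

    // Usage: DaggerStaticDeployment.create().northboundServer().start();

The whole graph is resolved at compile time, so a missing or miswired dependency is a build failure rather than a runtime surprise -- which is exactly the property a static, per-use-case deployment wants.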

3) Infrastructure:
Things here are kind of obsolete too. Some ideas to renew the build and test infrastructure:
Build pipeline:
- Move from JJB (not maintained anymore) to Jenkins pipelines or similar (work is ongoing).
Yeah, that's the idea, but I am not aware of anyone actively working on this.

- Move to a continuous release process where a merge to master produces a new release in the artifact repository (staging). This simplifies the release process a lot: just move artifacts from the staging to the release repository.
This ties in with your comment about consolidating repos. Rather than that, I wish we just had reasonable infra to automatically release on patch merge -- including version bumps, CSIT validation, proper git history (which we do not have), all that jazz.

There is just no way I can over-sell this -- this is the core piece of automation we are missing. Requires some real DevOps folks, which we seem to be in short supply of these days.

- Every ODL application (NETCONF, BGP, PCEP, TPCE, etc.) should generate a container automatically after a merge to master. We should be testing this versus the ODL distribution.
Right, and we are building towards this. I think NETCONF is *almost* there, but I am not sure. This is a prerequisite to having maven-stage jobs executing CSIT prior to allowing release, which ties in to the previous point.

System test:
- Robot was the best open source system test framework 10 years back, when organizations kept developers and system engineers separated. Things have changed a lot since, and many agile organizations nowadays have developers doing system integration and writing system test code in addition to product features. The Robot framework is good for system integrators with basic or no coding skills, but bad for developers who have to ramp up in a new language that they soon find very limiting. This is why I think at this moment it would be good to switch to something like pytest, for example.
Yes.

At the end of the day, CSIT must be completely owned by the project it is testing and it must be reasonably maintainable. Neither the int/test organization (we still carry SXP tests?!) nor RF (please point me to a reasonable IDE, can you?) fulfills those criteria :)

- Leverage K8s to do multi-application and scale testing.
I think int/packaging needs some overhaul here. At the end of the day, the first use case we need to have is a Docker image (or whatever, I don't care) based on netconf.git/static being packaged as part of the netconf-maven-merge job. I have no idea what we need to make that happen, though.

Regards,
Robert
