Hi Muthukumaran,
let me explain the use case and how GIT can help solve the configuration issue.
Problem Statement
=================
We need a way to synchronize the configuration across the cluster and to handle the case of partitioning. As we know, recovering network-element state after a partition is relatively easy, because it can be simulated as new network elements appearing when the partition merges. Configuration changes are a different story, because they potentially affect different nodes, for example a new local user being added or a path being set up. In this case the changes made in different partitions need to be merged back.
Proposal of placing the configurations under GIT
================================================
Now let's assume that each controller node, instead of simply accepting a configuration change, saving it in a clustered cache (as is done now) and persisting it when a UI command is issued, follows a different approach.
T0) A controller node receives the configuration change.
T1) The configuration change is serialized to a file (or to more than one, depending on how many components it spans).
T2) The configuration directory, instead of being a plain directory, is kept under GIT revision control. When the change is made, the file change is committed to the repo and pushed to the coordinator of the cluster.
T3) The coordinator of the cluster merges the config and signals the other nodes to pull the new config.
T4) The other nodes pull the config and apply the changes if possible.
Having each change revisioned with GIT helps to detect merge conflicts, if any, and to apply a fast-forward for changes without conflicts.
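To make steps T0 to T4 a bit more concrete, here is a rough sketch that drives the git command line from Python. It is only an illustration under assumptions, not how the controller does or would do it: the bare repository standing in for the coordinator, the branch name `config`, the file name `arp.json`, the helper names and the interval values are all hypothetical (`arp-refresh-interval` is simply borrowed from the example in this thread). The coordinator's signal to the other nodes (T3) is not modelled; node B just pulls.

```python
import json
import subprocess
from pathlib import Path

BRANCH = "config"  # assumed branch name, not something from this thread


def git(cwd, *args):
    """Run one git command in cwd and return its stdout."""
    cmd = ["git", "-c", "user.name=odl-node",
           "-c", "user.email=odl-node@example.invalid", *args]
    done = subprocess.run(cmd, cwd=cwd, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def commit_and_push(node_dir, filename, settings, message):
    """T1 + T2: serialize the change to a file, commit it, push it."""
    path = Path(node_dir) / filename
    path.write_text(json.dumps(settings, indent=2, sort_keys=True) + "\n")
    git(node_dir, "add", filename)
    git(node_dir, "commit", "-m", message)
    git(node_dir, "push", "origin", f"HEAD:refs/heads/{BRANCH}")


def pull_config(node_dir):
    """T4: fast-forward the local copy to what the coordinator holds."""
    git(node_dir, "pull", "--ff-only", "origin", BRANCH)


def demo(workdir):
    """Run the flow with one coordinator repo and two node clones."""
    root = Path(workdir)
    coordinator = root / "coordinator.git"  # stands in for the coordinator
    git(root, "init", "--bare", str(coordinator))

    node_a = root / "node_a"
    git(root, "clone", str(coordinator), str(node_a))
    commit_and_push(node_a, "arp.json",
                    {"arp-refresh-interval": 60}, "initial config")

    node_b = root / "node_b"
    git(root, "clone", "--branch", BRANCH, str(coordinator), str(node_b))

    # T0: node A receives a configuration change.
    commit_and_push(node_a, "arp.json",
                    {"arp-refresh-interval": 30}, "lower arp refresh")
    # T3 (signalling) is skipped here; node B pulls right away.
    pull_config(node_b)
    return json.loads((node_b / "arp.json").read_text())
```

Calling `demo()` on an empty scratch directory should hand back the settings node A committed last, as read from node B's working copy. Node B is strictly behind node A in this run, so the pull is a plain fast-forward; a real deployment would also need the signalling step and a policy for pushes the coordinator cannot fast-forward.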
If a partition happens there will be two or more coordinators, and in this case the GIT repos holding the configuration would drift apart. When the partition heals you can still use the git merge capability to try to work out the new configuration and, if there are issues, ask for human intervention.
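The idea behind merging after a partition heals can be sketched as a three-way merge against the last configuration both sides shared. The snippet below is a simplified, per-key illustration in plain Python, not git's actual line-based merge algorithm; the function name, the dictionary layout and the example values are assumptions made up for this sketch.

```python
MISSING = object()  # marks a key that is absent on one side


def three_way_merge(base, ours, theirs):
    """Merge two diverged configs against their common ancestor.

    Returns (merged, conflicts). conflicts maps a key to the pair
    (our value, their value), with None standing for "deleted".
    """
    merged, conflicts = {}, {}
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, MISSING)
        o = ours.get(key, MISSING)
        t = theirs.get(key, MISSING)
        if o == t:        # both sides agree (or both deleted the key)
            value = o
        elif o == b:      # only the other partition changed it
            value = t
        elif t == b:      # only our partition changed it
            value = o
        else:             # both changed it differently: needs a human
            conflicts[key] = (None if o is MISSING else o,
                              None if t is MISSING else t)
            value = b     # keep the ancestor value until resolved
        if value is not MISSING:
            merged[key] = value
    return merged, conflicts


# Last configuration both partitions agreed on (hypothetical values).
base = {"arp-refresh-interval": 60, "users": ["admin"]}

# Partition A added a local user, partition B changed the interval.
side_a = {"arp-refresh-interval": 60, "users": ["admin", "alice"]}
side_b = {"arp-refresh-interval": 30, "users": ["admin"]}
clean, clean_conflicts = three_way_merge(base, side_a, side_b)

# Both partitions changed the same setting to different values.
side_c = {"arp-refresh-interval": 45, "users": ["admin"]}
kept, real_conflicts = three_way_merge(base, side_b, side_c)
```

In the first case the two changes touch different keys, so both survive and nothing is flagged. In the second case the same key was changed on both sides, so the sketch keeps the ancestor value and reports the key, which is the point where a human would be asked to decide.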
This is to kick-start the conversation about how we can spread the configuration while at the same time making sure it is also revisioned, so that inconsistencies can be addressed.
Regarding the insertion of the config subsystem: I have spoken a bit with Tomas about it. The config subsystem is meant to implement a two-phase commit system with the capability of defining the configuration in Yang. Beside this, it is not meant to spread the data across the cluster, so that case would also benefit from the scheme I described.
That said, I'm really willing to discuss this, and maybe we can work together to implement it. It would be awesome to see this grow in a truly collaborative fashion.
Thanks, Giovanni
On 30-Oct-13 10:11, Muthukumaran Kothandaraman wrote:
Hi Giovanni,
I agree that connection-manager should be the culmination point, and the Infinispan-based implementation meets the need very well, thanks to the transactions. That's why I mentioned that some use cases already have alternative implementations, and hence we must evaluate before we embark on Zookeeper.
>>>this i'm thinking to provide via GIT so we gain sharing and versioning of the configuration.
I am not getting this. Technically, yes, this is a feasible solution. But I am not able to picture the usecases.
The use cases I was thinking of while writing the previous mail were of the following stereotype:
- On node 1 of the cluster I changed arp-refresh-interval; how do the other nodes get to know this? Yes, this could simply be solved via Infinispan (provided we use Infinispan to propagate config changes as well). But whether we want to build that level of dependency on Infinispan is something that should be examined.
But if your use case for git is orthogonal to the above configuration stereotype, it would of course be interesting to discuss. I think we also need to consider the YANG-driven "JMX aspect", which is newly coming up as far as configuration goes. Do you agree?
Regards
Muthukumaran (Muthu) L : 080-49141730
P(existence at t) = 1- 1/P(existence at t-1)
From: Giovanni Meo <gmeo@...>
To: Muthukumaran Kothandaraman/India/IBM@IBMIN
Cc: "dev (controller-dev@...)" <controller-dev@...>, controller-dev-bounces@..., "Bainbridge, David" <dbainbri@...>, "'integration-dev@...'" <integration-dev@...>
Date: 10/30/2013 01:46 PM
Subject: Re: [controller-dev] Clustering question
--------------------------------------------------------------------------------
Hi Muthukumaran,
just some answers inline ...
On 30-Oct-13 07:43, Muthukumaran Kothandaraman wrote:
> Hi David,
>
> I had earlier contemplated on Zookeeper option. Mainly because of the deployment
> aspect as mentioned by Giovanni, I had put it in back-burner :-)
>
> Particularly in Zookeeper, we would require minimum 3 Zookeeper members because
> the cluster "consensus" has to be reached via "majority voting" and even numbers
> would mean "hung parliament" :-)
>
> So, deploying ODL with isolated cluster of Zookeeper(s) could become an
> installation issue. My thoughts are more on these lines if we bring back
> Zookeeper approch
>
> - Each ODL instance must have embedded Zookeeper client - obviously
> - There must be a Zookeeper cluster of 3 members minimum
> - The code in ODL must be abstracted in such a manner that modules do not see
> zookeeper directly. Some concocted interface like "ClusterCoordinator" must be
> presented to modules and
> Zookeeper client must be hidden behind this interface
>
> No doubts there are much bigger advantages in using zookeeper,
>
> Some immediate usecases could be
>
> - shared configuration
this i'm thinking to provide via GIT so we gain sharing and versioning of the configuration.
> - coordination in electing "MASTER" controller for switches
this is already provided by connection manager, using the transactional caches on infinispan.
Thanks, Giovanni
> - common pinning location for switch <--> controller mapping (of course this
> overlaps a bit with what Infinispan provides so we need to take this case with a
> fine-toothed comb)
>
> Regards
>
> Muthukumaran (Muthu)
> L : 080-49141730
>
> P(existence at t) = 1- 1/P(existence at t-1)
>
> From: Giovanni Meo <gmeo@...>
> To: "Bainbridge, David" <dbainbri@...>
> Cc: "dev \(controller-dev@...\)"
> <controller-dev@...>,
> "'integration-dev@...'" <integration-dev@...>
> Date: 10/29/2013 03:52 PM
> Subject: Re: [controller-dev] Clustering question
> Sent by: controller-dev-bounces@...
> --------------------------------------------------------------------------------
>
> Hi David,
>
> yes i have looked at ZooKeeper, so the reason why i didn't pick it was because
> the Zookeeper to my knowledge implements its functionality using a server-client
> architecture:
>
> http://zookeeper.apache.org/doc/r3.1.2/zookeeperOver.html
>
> that means that depending on how you deploy your cluster you may have to add
> more zookeper server because else the system may run in bottlenecks. That
> translates in a deployment issue because when you deploy it you need to
> dimension the number of zookeper servers to deploy as well, and take care to
> where you connect in order to avoid to overload any of them.
> Now starting from this points i looked around and run into infinispan which
> allow to deploy your cluster as a P2P network because beside the meet and greet
> phase the other communications happens on a full mesh.
> This is a bit of history, that said if folks have a good zookeeper experience i
> encourage to create an alternative clustering.services-implementation based on
> it so we can actually weight the pro and cons on real code, which is always eye
> opening.
>
> Thanks,
> Giovanni
>
> On 28-Oct-13 18:00, Bainbridge, David wrote:
> > Giovanni,
> >
> > Thanks for this pointer. I was wondering if any thought was given to use
> > something like Zookeeper to maintain the cluster information as opposed to the
> > supernode construct?
> >
> > I understand that structurally this doesn't significantly change how clustering
> > operates (i.e. something needs to be started and used for coordination before
> > controllers are started), but using something like ZK would provide a separation
> > of concerns between controller cluster management and other controller functions
> > as well as potentially provide a mechanism to coordinate between controllers
> > when ODL is not the only controller deployed in a network.
> >
> > Just thinking out loud ...
> >
> > mahalo,
> > /david
> >
> > On 10/28/2013 03:56 AM, Giovanni Meo wrote:
> >> Hi Luis,
> >>
> >> apologize for it, create a wiki page with the information:
> >>
> >> https://wiki.opendaylight.org/view/OpenDaylight_Controller:Clustering:HowTo
> >>
> >> Please let me know if you run into issues.
> >>
> >> Thanks,
> >> Giovanni
> >>
> >> On 28-Oct-13 04:04, Luis Gomez wrote:
> >>> Hi again,
> >>>
> >>> I am looking for a procedure to do system test on a controller cluster. However
> >>> I do not find any instruction on how to setup and verify a cluster, any idea?
> >>>
> >>> Thanks in advance
> >>>
> >>> Luis
> >>>
> >>> _______________________________________________
> >>> controller-dev mailing list
> >>> controller-dev@...
> >>> https://lists.opendaylight.org/mailman/listinfo/controller-dev
> >>>
> >
> > --
> > *David Bainbridge* |Principal Architect |<www.ciena.com>
> > _dbainbri@... <mailto:dbainbri@...>_ | 3939 North 1st Street |San
> > Jose, CA 95134 USA
> > Direct +1.408.904.2103 |Mobile +1.978.835.0771 |Fax +1.408.904.2103
> >
> > --
> Giovanni Meo
> Via del Serafico, 200 Telephone: +390651644000
> 00142, Roma Mobile: +393480700958
> Italia Fax: +390651645917
> VOIP: 8-3964000
> “The pessimist complains about the wind;
> the optimist expects it to change;
> the realist adjusts the sails.” -- Wm. Arthur Ward
> IETF credo: "Rough consensus and running code"
> _______________________________________________
> controller-dev mailing list
> controller-dev@...
> https://lists.opendaylight.org/mailman/listinfo/controller-dev
>
-- Giovanni Meo Via del Serafico, 200 Telephone: +390651644000 00142, Roma Mobile: +393480700958 Italia Fax: +390651645917 VOIP: 8-3964000 “The pessimist complains about the wind; the optimist expects it to change; the realist adjusts the sails.” -- Wm. Arthur Ward IETF credo: "Rough consensus and running code"