How to address OVSDB node connection flaps


suneelu
 

Hi,

   I have created the following JIRA issue and Gerrit review:

https://jira.opendaylight.org/browse/OVSDB-438

   https://git.opendaylight.org/gerrit/#/c/66504/

 

The client connects to only one ODL controller, via HAProxy.

Sometimes, when the client's OVSDB connection flaps, its node goes missing from the operational datastore.

A connection flap can play out in any of the following ways:

1) The client disconnects and connects back to the same controller after some delay.
2) The client disconnects and connects back to the same controller immediately.
3) The client disconnects and connects to another controller after some delay.
4) The client disconnects and connects to another controller immediately.
5) The client disconnects and never connects back.

When the client disconnects, all the ODL controllers try to clean up its node in the operational datastore.
When the client connects, the owner ODL controller tries to create the node in the operational datastore.

If one of the controllers is slow to process its cleanup, its delete can land after the owner has already recreated the node, and the client node ends up missing from the operational topology.
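
To make the ordering problem concrete, here is a toy sketch of the race. It is not the OVSDB plugin code: the operational datastore is modelled as a plain dict, and the names (NODE, on_connect, on_disconnect_cleanup, replay, "odl-1", "odl-2") are made up for illustration only.

```python
# Toy model only: the operational datastore is a dict keyed by node id.
# All names here are hypothetical and do not come from the OVSDB plugin.
NODE = "client-1"


def on_connect(store, node_id, owner):
    """The owner controller creates the node in the oper store."""
    store[node_id] = {"owner": owner}


def on_disconnect_cleanup(store, node_id):
    """A controller removes the node from the oper store on disconnect."""
    store.pop(node_id, None)


def replay(events):
    """Apply the events in the given order and return the final store."""
    store = {}
    for action, *args in events:
        action(store, *args)
    return store


# Both cleanups are processed before the client reconnects: node survives.
in_order = [
    (on_connect, NODE, "odl-1"),
    (on_disconnect_cleanup, NODE),   # cleanup by odl-1
    (on_disconnect_cleanup, NODE),   # cleanup by odl-2
    (on_connect, NODE, "odl-2"),     # client reconnects
]

# The cleanup on odl-2 is delayed past the reconnect: node goes missing.
delayed = [
    (on_connect, NODE, "odl-1"),
    (on_disconnect_cleanup, NODE),   # cleanup by odl-1
    (on_connect, NODE, "odl-2"),     # client reconnects
    (on_disconnect_cleanup, NODE),   # late cleanup by odl-2
]

print(NODE in replay(in_order))
print(NODE in replay(delayed))
```

The only difference between the two runs is where the second cleanup lands relative to the reconnect.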

 

To address this issue, the following has been proposed in the review:

 

When the client connects back to any ODL controller, that controller first fires a delete of the node from the operational datastore and only then creates the node.

The cleanup of the node in the disconnected() callback, and in the other controllers, is delayed and performed only if the node never connects back.
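
A rough sketch of these two rules, under heavy assumptions: this is a toy model, not the code in the review. The OperStore and Controller classes, the shared "connections" map used to tell whether the node connected back, the cleanup_delay value and the explicit "now" clock are all hypothetical and only there to make the example deterministic.

```python
# Toy model only; every class, field and value here is hypothetical.
NODE = "client-1"


class OperStore:
    def __init__(self):
        self.nodes = {}        # node_id -> node data in the oper store
        self.connections = {}  # node_id -> controller holding the connection


class Controller:
    def __init__(self, name, store, cleanup_delay=5):
        self.name = name
        self.store = store
        self.cleanup_delay = cleanup_delay
        self.pending = []      # (due_time, node_id) cleanups not yet run

    def on_connected(self, node_id):
        self.store.connections[node_id] = self.name
        # Rule 1: delete any stale node first, then create it afresh.
        self.store.nodes.pop(node_id, None)
        self.store.nodes[node_id] = {"owner": self.name}

    def on_disconnected(self, node_id, now):
        if self.store.connections.get(node_id) == self.name:
            del self.store.connections[node_id]
        # Rule 2: do not clean up right away, only schedule it.
        self.pending.append((now + self.cleanup_delay, node_id))

    def run_due_cleanups(self, now):
        waiting = []
        for due, node_id in self.pending:
            if due > now:
                waiting.append((due, node_id))
            elif node_id not in self.store.connections:
                # The node never connected back, so clean it up.
                self.store.nodes.pop(node_id, None)
        self.pending = waiting


def simulate(reconnect):
    """Flap the client off odl-1; optionally reconnect it to odl-2."""
    store = OperStore()
    odl1, odl2 = Controller("odl-1", store), Controller("odl-2", store)
    odl1.on_connected(NODE)
    odl1.on_disconnected(NODE, now=0)
    odl2.on_disconnected(NODE, now=0)   # the other controller also reacts
    if reconnect:
        odl2.on_connected(NODE)
    odl1.run_due_cleanups(now=5)
    odl2.run_due_cleanups(now=5)
    return store.nodes


print(simulate(reconnect=True))
print(simulate(reconnect=False))
```

In this model a delayed cleanup can no longer remove a node that has reconnected, and a node that never comes back is still removed once the delay expires.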

 

With this change, responsibility for node cleanup shifts to whichever controller the client is connected to.

That controller can predictably delete and recreate the node in the operational datastore.

This also ensures that reconciliation gets triggered.

 

Thanks,

Suneelu