OF connection close during handshake when switch-idle-timeout is low
Michal Rehak -X (mirehak - Pantheon Technologies SRO@Cisco) <mirehak@...>
Hi Tali,
unfortunately the idle timeout is provided by netty. If we bypass the idle notification during handshake, that would solve your current issue, but we would freeze if the device times out during handshake. The good news is that there was already a request for modifying the idle timeout during the session lifecycle. However, this requires some adaptation in the openflowJava project, where netty is utilized.
@Michal: could you please provide some up-to-date information regarding changing the idle timeout on demand (probably per device)?
Thank you.
Regards, Michal
From: Tali Ben Meir [Tali.BenMeir@...]
Sent: Tuesday, April 14, 2015 16:48
To: Michal Rehak -X (mirehak - Pantheon Technologies SRO at Cisco)
Subject: RE: OF connection close during handshake when switch-idle-timeout is low
Hi,
Actually what we are trying to do is just to detect switch failure quickly (using the NodeRemoved notification from OpenDaylightInvetoryListener). Everything is OK as long as the switch is shut down gracefully, but when the switch box is shut down forcefully, we need to wait 15 sec until the ECHO request times out to detect the failure. We just wanted to make the ECHO more frequent so that switch failure detection would not take more than 1 sec. But when the switch-idle-timeout configuration was set to 1000 msec, the handshake fails and the controller sends a TCP FIN to the switch. Can you think of a way to make switch failure detection quicker without influencing the handshake?
Tali
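To make the timing in the question above concrete, here is a rough sketch of the worst-case detection delay. It assumes the controller sends a single ECHO request when the idle timeout fires and disconnects if no reply arrives within the echo timeout; the class and method names are hypothetical and not part of the plugin.

```java
import java.time.Duration;

/** Hypothetical helper: rough upper bound on how long a silent switch goes undetected. */
final class FailureDetectionBound {
    private FailureDetectionBound() {
    }

    /**
     * Assumption: one ECHO request is sent when the idle timeout fires, and the
     * connection is dropped if no reply arrives within the echo reply timeout.
     */
    static Duration worstCase(Duration idleTimeout, Duration echoReplyTimeout) {
        return idleTimeout.plus(echoReplyTimeout);
    }
}
```

Under that assumption, `worstCase(Duration.ofMillis(1000), Duration.ofMillis(2000))` gives 3000 msec, so a 1 sec detection target would also need a shorter echo reply timeout, not only a lower switch-idle-timeout.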
From: Michal Rehak -X (mirehak - Pantheon Technologies SRO at Cisco) [mailto:mirehak@...]
Hi Tali,
From: Tali Ben Meir [Tali.BenMeir@...]
So is there any nice way to work around this?
- Different timeouts for handshake phase and working phase?
- Echo during handshake? (I guess it may be forbidden based on protocol spec or a very error prone process)
- Other ideas?
Tali
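The first idea in the list above (different timeouts per phase) could look roughly like the sketch below. Everything here is hypothetical: `Phase` and `IdleTimeoutPolicy` are illustration-only names, not types from openflowplugin or openflowJava. In a netty pipeline the chosen value would presumably be applied by installing a new `IdleStateHandler` when the phase changes, but that wiring is an assumption and is not shown.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical connection phases; they mirror, but are not, the plugin's CONDUCTOR_STATE. */
enum Phase { HANDSHAKING, WORKING }

/** Hypothetical policy that picks an idle timeout depending on the connection phase. */
final class IdleTimeoutPolicy {
    private final Map<Phase, Long> timeoutMillis = new EnumMap<>(Phase.class);

    IdleTimeoutPolicy(long handshakeMillis, long workingMillis) {
        if (handshakeMillis <= 0 || workingMillis <= 0) {
            throw new IllegalArgumentException("timeouts must be positive");
        }
        timeoutMillis.put(Phase.HANDSHAKING, handshakeMillis);
        timeoutMillis.put(Phase.WORKING, workingMillis);
    }

    /** Returns the idle timeout in milliseconds to use while in the given phase. */
    long timeoutFor(Phase phase) {
        return timeoutMillis.get(phase);
    }
}
```

For example, `new IdleTimeoutPolicy(15000, 1000)` would keep the default-sized timeout while the handshake runs and switch to 1000 msec once the session is working.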
From: Michal Rehak -X (mirehak - Pantheon Technologies SRO at Cisco) [mailto:mirehak@...]
Hi Tali,
From: Tali Ben Meir [Tali.BenMeir@...]
Hi,
First, thanks for the quick reply.
So if I understand you correctly, can I add a piece of code before
if (!CONDUCTOR_STATE.WORKING.equals(getConductorState())) {
Something like
if (CONDUCTOR_STATE.HANDSHAKING.equals(getConductorState())) { return; }
Would that be OK? Could this cause damage somewhere else?
Additional question: I see the timeout interval between ECHO request and response is hardcoded to 2000 msec. Do I necessarily need to make it less than my switch-idle-timeout, or can several pending ECHO messages co-exist?
Thanks again Tali
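The early return proposed above ignores idle events for as long as the handshake lasts, which is the freeze risk raised elsewhere in this thread if a device stalls mid-handshake. One possible variation, sketched here with hypothetical names only (`ConductorPhase`, `IdleAction`, `IdleEventDecider` are not plugin types), is to ignore idle events during the handshake only up to a deadline.

```java
/** Hypothetical stand-in for the plugin's conductor states. */
enum ConductorPhase { HANDSHAKING, WORKING, OTHER }

/** Hypothetical outcome of an idle event. */
enum IdleAction { IGNORE, SEND_ECHO, DISCONNECT }

/** Hypothetical decision logic for an idle event, with a bounded handshake grace period. */
final class IdleEventDecider {
    private final long handshakeDeadlineMillis;

    IdleEventDecider(long handshakeDeadlineMillis) {
        if (handshakeDeadlineMillis <= 0) {
            throw new IllegalArgumentException("deadline must be positive");
        }
        this.handshakeDeadlineMillis = handshakeDeadlineMillis;
    }

    /**
     * @param phase              current phase of the connection
     * @param millisSinceConnect time elapsed since the connection was accepted
     */
    IdleAction decide(ConductorPhase phase, long millisSinceConnect) {
        switch (phase) {
            case WORKING:
                // In the working phase an idle event triggers an ECHO request.
                return IdleAction.SEND_ECHO;
            case HANDSHAKING:
                // Tolerate idleness while the handshake is still within its deadline.
                return millisSinceConnect < handshakeDeadlineMillis
                        ? IdleAction.IGNORE
                        : IdleAction.DISCONNECT;
            default:
                return IdleAction.DISCONNECT;
        }
    }
}
```

With a 5000 msec deadline, an idle event 500 msec into the handshake is ignored, while one at 6000 msec still disconnects, so a stalled device cannot hold the connection open indefinitely.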
From: Michal Rehak -X (mirehak - Pantheon Technologies SRO at Cisco) [mailto:mirehak@...]
Hi Tali,
From: Tali Ben Meir [Tali.BenMeir@...]
Hi Michal,
My name is Tali from ConteXtream and I have a question about the ConnectionConductorImpl behavior on the SwitchIdleTimeout event. I'm trying to make the OF heartbeat, i.e. ECHO request/response, operate at a very high rate: an ECHO request should be sent every 300 msec. I have tried setting the switch-idle-timeout to values like 300 msec/1000 msec/2000 msec, but while the OF connection is being established (handshake phase), ODL sometimes terminates the connection towards the switch. It never happens when using the default timeout (15000 msec). I have seen you placed the following protective code in ConnectionConductorImpl:
public void onSwitchIdleEvent(SwitchIdleEvent notification) {
    new Thread(new Runnable() {
        @Override
        public void run() {
            if (!CONDUCTOR_STATE.WORKING.equals(getConductorState())) {
                // idle state in any other conductorState than WORKING means real
                // problem and wont be handled by echoReply, but disconnection
                disconnect();
                OFSessionUtil.getSessionManager().invalidateOnDisconnect(ConnectionConductorImpl.this);
I fail to understand why conductor state == HANDSHAKING is erroneous and leads to OF session invalidation. Could you explain? I am using Helium SR1.
Thanks in advance Tali
Tali Ben-Meir SW Engineer ConteXtream Email: tali.benmeir@...