[openflowplugin-dev] https://bugs.opendaylight.org/show_bug.cgi?id=2905


Anton Ivanov
 

On 30/03/15 10:26, Anton Ivanov wrote:
On 30/03/15 10:22, Michal Polkoráb wrote:

Ok, sounds good. Shall I merge your changes now, or are we going to wait until you set up your benchmark environment?


I think this one is obvious. We can merge it now.

This will allow me to concentrate on 2825 and the benchmarks (and cascaded changes to OF, OFJ, etc) for it.

By the way, some of the other projects already set it. OVSDB does, for example.

As for benchmarking it specifically for openflowjava across all possible use cases: a proper benchmark across a variety of scenarios needs a network delay/drop simulator.

I am going to set that up too at some point, as it will be needed to look into a different issue: the netty retry mechanics. Those are obviously also optimized for a small number of high-throughput (rather than low-latency) channels. However, as that code also works around various issues in Java NIO, I would not venture an opinion there until I have tested it properly.

Setting up such a test environment is not trivial, so in the meantime I think we should work on the basis of what the literature on the subject says and what everyone else does: set it to on.
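For readers following along, here is a minimal sketch of what "set it to on" looks like at the plain Java socket level, assuming the option under discussion is TCP_NODELAY (as the NO_DELAY wording and the Red Hat link elsewhere in this thread suggest). The class and method names are hypothetical, and this is not the actual openflowjava patch, which is not shown in the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not taken from the openflowjava code base.
public class NoDelayExample {

    /** Opens a loopback TCP connection, turns TCP_NODELAY on, and reads the option back. */
    static boolean connectWithNoDelay() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);      // port 0 = any free port
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Disable Nagle's algorithm so small writes are sent without waiting
            // for outstanding data to be acknowledged.
            client.setTcpNoDelay(true);
            return client.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay());
    }
}
```

netty exposes the same socket option as ChannelOption.TCP_NODELAY, which is typically passed to a bootstrap's option or childOption call. Where exactly the patch sets it is an open detail here.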

A.


A.


Michal 


From: Anton Ivanov <anton.ivanov@...>
Sent: 30 March 2015 10:49
To: Michal Polkoráb; Edward Warnicke; openflowjava-dev
Cc: openflowplugin-dev@...
Subject: Re: [openflowjava-dev] [openflowplugin-dev] https://bugs.opendaylight.org/show_bug.cgi?id=2905
 
On 30/03/15 09:12, Michal Polkoráb wrote:

Hi Anton,


Do we have any performance numbers (to compare with the original code's performance)?


For this one, not yet. Last week I started setting up a benchmark environment, as I also have to do proper benchmarks for the proposed ser/des fixes (bug 2825).

Unfortunately, I had to switch urgently to clear something else out of my queue. I am just about done with what I had to context-switch to, so I will get back to the benchmark task ASAP.

That said, I have worked on similar things in the past: the difference depends on the RTT.

Based on my recollections from implementing mobile signalling over TCP: for a very small RTT (virtual switch on the same host) it will probably be no more than 50% for a command-acknowledge sequence. For a LAN you are looking at a 2x difference. For a WAN, even more than that. For ~10 ms you may see a difference of several times, depending on the use case.

There is a reasonable write-up on this here:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.2/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Application_Tuning_and_Deployment-TCP_NODELAY_and_Small_Buffer_Writes.html

Java does not do cork, but we can emulate it in the ser/des layer so that is not a big deal.
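One way to read "emulate it in the ser/des layer" is to collect several small serialized messages in a user-space buffer and hand them to the socket in a single write. The sketch below illustrates that idea under that assumption. All class and method names are hypothetical, the 4-byte payloads are placeholders rather than real OpenFlow messages, and the counting sink stands in for a socket output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from the openflowjava code base.
public class CorkEmulation {

    /** Stand-in for a socket stream that counts how many write calls reach it. */
    static final class CountingSink extends OutputStream {
        int writeCalls = 0;
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

        @Override public void write(int b) { writeCalls++; bytes.write(b); }
        @Override public void write(byte[] b, int off, int len) { writeCalls++; bytes.write(b, off, len); }
    }

    /** Buffers serialized messages and releases them in one write, like a user-space cork. */
    static final class CorkingWriter {
        private final OutputStream sink;
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        CorkingWriter(OutputStream sink) { this.sink = sink; }

        /** Appends one serialized message to the pending buffer without touching the sink. */
        void queue(byte[] message) { pending.write(message, 0, message.length); }

        /** Sends everything queued so far as a single write, then clears the buffer. */
        void uncork() {
            if (pending.size() == 0) {
                return;                          // nothing queued, nothing to send
            }
            try {
                sink.write(pending.toByteArray());
                sink.flush();
                pending.reset();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Queues the given number of small messages and returns how many writes hit the sink. */
    static int writeCallsFor(int messages) {
        CountingSink sink = new CountingSink();
        CorkingWriter writer = new CorkingWriter(sink);
        for (int i = 0; i < messages; i++) {
            writer.queue(new byte[] {(byte) i, 0, 0, 8});   // placeholder 4-byte payload
        }
        writer.uncork();
        return sink.writeCalls;
    }

    public static void main(String[] args) {
        System.out.println("writes for 3 queued messages: " + writeCallsFor(3));
    }
}
```

With TCP_NODELAY on, each write to the socket may go out as its own segment, so batching at this layer keeps a burst of small messages from turning into a burst of tiny packets.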

Further write-up here:

http://www.techrepublic.com/article/tcp-ip-options-for-high-performance-data-transmission/

and ... frankly everywhere, all the way back to TCP Illustrated: http://www.amazon.co.uk/TCP-IP-Illustrated-Protocols-APC/dp/0201633469

This is the de facto standard way of implementing low-latency TCP response-ack. If you do not do it, you are guaranteed to get the latencies which the world is demonstrating for our stuff on CBENCH.

A.


Regards,

Michal Polkorab


From: Edward Warnicke <hagbard@...>
Sent: 27 March 2015 12:25
To: Anton Ivanov; openflowjava-dev
Cc: openflowplugin-dev@...
Subject: Re: [openflowjava-dev] [openflowplugin-dev] https://bugs.opendaylight.org/show_bug.cgi?id=2905
 
Looping in openflowjava-dev, where this patch is actually applied.

Ed

On Fri, Mar 27, 2015 at 4:29 AM, Anton Ivanov <anton.ivanov@...> wrote:
Looks like jenkins is out with the daisies. Again :)

It is not even starting build jobs at the moment.

In any case, Abhjit, Michal, please review - this one is obvious.

We never have "throughput" in the sense that TCP as a protocol gives the word. What we have is more of a NO_DELAY scenario.

Additionally, once this has been fixed, there are a couple of other netty tunables (specifically rexmit attempts) which need to be revisited. Hitting a retransmit in a NO_DELAY scenario is a clear indication of an error, or of waiting for a TCP retransmission of an ACK from the peer. In that case, the "beatings will continue until morale improves" approach to the TCP stack is useless: no matter how many times you resubmit, it will not transmit. It just wastes a mass of CPU which could have been used to service other sockets in the netty pool.

I will file that (and submit a patch) once we have finished this one off.

There are some additional issues on Linux related to cork, autocork, etc., but they cannot be dealt with in Java, as it does not do OS-specific TCP options. So there is no way to solve them from inside ODL.

A.
_______________________________________________
openflowplugin-dev mailing list
openflowplugin-dev@...
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev

MichalPolkoráb

Software Developer


Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
+421 918 378 907
/ michal.polkorab@...
reception: +421 2 206 65 111
/ www.pantheon.sk



Anton Ivanov
 

On 30/03/15 10:22, Michal Polkoráb wrote:

Ok, sounds good. Shall I merge your changes now, or are we going to wait until you set up your benchmark environment?


I think this one is obvious. We can merge it now.

This will allow me to concentrate on 2825 and the benchmarks (and cascaded changes to OF, OFJ, etc) for it.

A.





Michal Polkorab
 

Ok, sounds good. Shall I merge your changes now, or are we going to wait until you set up your benchmark environment?


Michal 




Anton Ivanov
 

On 30/03/15 09:12, Michal Polkoráb wrote:

Hi Anton,


Do we have any performance numbers (to compare with the original code's performance)?


For this one, not yet. Last week I started setting up a benchmark environment, as I also have to do proper benchmarks for the proposed ser/des fixes (bug 2825).

Unfortunately, I had to switch urgently to clear something else out of my queue. I am just about done with what I had to context-switch to, so I will get back to the benchmark task ASAP.

That said, I have worked on similar things in the past: the difference depends on the RTT.

Based on my recollections from implementing mobile signalling over TCP: for a very small RTT (virtual switch on the same host) it will probably be no more than 50% for a command-acknowledge sequence. For a LAN you are looking at a 2x difference. For a WAN, even more than that. For ~10 ms you may see a difference of several times, depending on the use case.

There is a reasonable write-up on this here:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.2/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Application_Tuning_and_Deployment-TCP_NODELAY_and_Small_Buffer_Writes.html

Java does not do cork, but we can emulate it in the ser/des layer so that is not a big deal.

Further write-up here:

http://www.techrepublic.com/article/tcp-ip-options-for-high-performance-data-transmission/

and ... frankly everywhere, all the way back to TCP Illustrated: http://www.amazon.co.uk/TCP-IP-Illustrated-Protocols-APC/dp/0201633469

This is the de facto standard way of implementing low-latency TCP response-ack. If you do not do it, you are guaranteed to get the latencies which the world is demonstrating for our stuff on CBENCH.

A.





Michal Polkorab
 

Hi Anton,


Do we have any performance numbers (to compare with the original code's performance)?


Regards,

Michal Polkorab




Edward Warnicke <hagbard@...>
 

Looping in openflowjava-dev, where this patch is actually applied.

Ed



Anil Vishnoi
 

forwarding it to openflowjava mailing list ...

Anil
---------- Forwarded message ----------
From: Anton Ivanov <anton.ivanov@...>
Date: Fri, Mar 27, 2015 at 1:59 PM
Subject: [openflowplugin-dev] https://bugs.opendaylight.org/show_bug.cgi?id=2905
To: "openflowplugin-dev@..." <openflowplugin-dev@...>


Looks like jenkins is out with the daisies. Again :)

It is not even starting build jobs at the moment.

In any case, Abhjit, Michal, please review - this one is obvious.

We never have "throughput" in the sense that TCP as a protocol gives the word. What we have is more of a NO_DELAY scenario.

Additionally, once this has been fixed, there are a couple of other netty tunables (specifically rexmit attempts) which need to be revisited. Hitting a retransmit in a NO_DELAY scenario is a clear indication of an error, or of waiting for a TCP retransmission of an ACK from the peer. In that case, the "beatings will continue until morale improves" approach to the TCP stack is useless: no matter how many times you resubmit, it will not transmit. It just wastes a mass of CPU which could have been used to service other sockets in the netty pool.

I will file that (and submit a patch) once we have finished this one off.

There are some additional issues on Linux related to cork, autocork, etc., but they cannot be dealt with in Java, as it does not do OS-specific TCP options. So there is no way to solve them from inside ODL.

A.



--
Thanks
Anil