Logs not getting compressed


Robert Varga
 

Hello,

it seems our captured logs are no longer getting compressed with .gz:

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-neon/247/robot-plugin/

Given the size of the files involved, it does not make for a good experience
:( Also FF seems to have liked .gz very much.

Can we get compression back? Also, would it make sense to compress them
with 'xz -T 0 -0'?

That should prove faster in builds and lower the Nexus storage footprint
(i.e. page cache pressure).

Thanks,
Robert


Anil Belur
 



On Fri, May 31, 2019 at 7:05 AM Robert Varga <nite@...> wrote:
> Can we get compression back? Also, would it make sense to compress them
> with 'xz -T 0 -0'?

Robert, I'll look into it and get back. 

- Anil 
 


Anil Belur
 



On Fri, May 31, 2019 at 7:51 AM Anil Belur <abelur@...> wrote:
> Robert, I'll look into it and get back.

I agree: some of the .xml files written to the logs are pretty large (1 GiB) in other projects like FD.io,
and we should be compressing these files too. I've looked at the lftools code [2.], and it compresses only *.{txt,log} files,
not *.{xml,html} files.

This change [1.] should fix the issue: it adds the missing file types and also uses lzma instead of gzip.

[1.] https://gerrit.linuxfoundation.org/infra/c/releng/lftools/+/15793
[2.] https://github.com/lfit/releng-lftools/blob/master/lftools/deploy.py#L40-L42
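
As a rough illustration of the idea (not the actual lftools code), here is a minimal Python sketch that gzips every file matching a list of glob patterns. The pattern list, the function name compress_logs and the choice to delete the original file are assumptions made for the example.

```python
import gzip
import shutil
from pathlib import Path

# Hypothetical pattern list: the thread says only *.txt and *.log were
# being compressed and that *.xml and *.html should be covered too.
PATTERNS = ("*.txt", "*.log", "*.xml", "*.html")


def compress_logs(root, patterns=PATTERNS):
    """Gzip every file under root whose name matches one of the patterns.

    Each original file is replaced by a sibling with a .gz suffix.
    Returns the sorted list of paths that were written.
    """
    written = []
    for pattern in patterns:
        # Collect the matches first so the new .gz files do not disturb the walk.
        for src in sorted(Path(root).rglob(pattern)):
            if not src.is_file():
                continue
            dst = src.with_name(src.name + ".gz")
            with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            src.unlink()
            written.append(dst)
    return sorted(written)
```

Calling compress_logs("archives") on a hypothetical archives directory would leave files such as output.xml.gz and log.html.gz in place of the originals, while files that match none of the patterns stay untouched.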

Thanks,
Anil


Thanh Ha <zxiiro@...>
 

On Thu, 30 May 2019 at 17:05, Robert Varga <nite@...> wrote:
> Also, would it make sense to compress them with 'xz -T 0 -0'?

Hi Robert,

The reason we didn't go with xz compression in the past (we looked into it) is that you cannot open an xz file natively in the browser. It's nice, especially for folks who spend a lot of time looking at logs, to be able to navigate to a log file on the server, click, and have it open immediately. If we compressed with xz this would become an inconvenient three-step process where you have to first download, then decompress, then open.

With that consideration, I think staying on gz compression would be best until xz files can be opened directly from the browser, as we can with gz.

Regards,
Thanh


Robert Varga
 

On 31/05/2019 22:29, Thanh Ha wrote:
> On Thu, 30 May 2019 at 17:05, Robert Varga <nite@...
> <mailto:nite@...>> wrote:
>
>     Also, would it make sense to compress them with 'xz -T 0 -0'?
>
>
> Hi Robert,

Hello Thanh,

>
> The reason we didn't go with xz compression in the past (we looked into
> it) is because you cannot open a xz file natively in the browser. It's
> nice especially for folks who spend a lot of time looking at logs to be
> able to navigate to a log file on the server and click and have it
> immediately open. If we compressed with xz this would become an
> inconvenient 3 step process where you have to first download, then
> decompress, then open.

Ah yes, I did not read up on the web-side of the topic. This is a
concern critically important for me.
https://bugzilla.mozilla.org/show_bug.cgi?id=366559 offers some insight,
eventually leading to https://tools.ietf.org/html/rfc7932


> With that consideration I think staying on gz compression would be best
> until xz files can be opened directly from browser like we can with gz.

I just checked, FF67 does not recognize .xz, but it does send:

Accept-Encoding: gzip, deflate, br

which leads me to https://www.chromestatus.com/feature/5420797577396224
and hence -- can we reasonably switch to Brotli?
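
To make the browser-side constraint concrete, here is a small Python sketch of the check a server would effectively have to pass: a precompressed file can only be shown inline if its coding is one the client listed in Accept-Encoding. The suffix-to-coding table and both function names are assumptions for illustration; in particular "xz" is used here only as a placeholder token, since browsers do not advertise it.

```python
def accepted_encodings(header):
    """Return the content codings named in an Accept-Encoding header value.

    Codings with an explicit q=0 are dropped; other q-values are ignored.
    """
    codings = []
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            codings.append(name)
    return codings


# Hypothetical mapping from file suffix to HTTP content coding.
# "xz" is a placeholder: it is not a coding that browsers send.
SUFFIX_TO_CODING = {".gz": "gzip", ".br": "br", ".xz": "xz"}


def opens_in_browser(filename, accept_encoding):
    """True if the file could be served with a Content-Encoding the client accepts."""
    for suffix, coding in SUFFIX_TO_CODING.items():
        if filename.endswith(suffix):
            return coding in accepted_encodings(accept_encoding)
    return True  # uncompressed files need no decoding
```

With the header quoted above, opens_in_browser("output.xml.gz", "gzip, deflate, br") and the .br variant are accepted, while the .xz variant is not, which matches the behaviour described in this thread.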

Regards,
Robert


Thanh Ha <zxiiro@...>
 

On Sat, 1 Jun 2019 at 18:34, Robert Varga <nite@...> wrote:
> I just checked, FF67 does not recognize .xz, but it does send:
>
> Accept-Encoding: gzip, deflate, br
>
> which leads me to https://www.chromestatus.com/feature/5420797577396224
> and hence -- can we reasonably switch to Brotli?

Hi Robert,

Sounds promising and I'm all for this as long as it works out of the box. Anil set up a test last week here:


I'm using the latest version of Chrome which still asks me to open the xz file. I'm not sure if the server-side Apache server needs additional configuration to enable this?

Anil, is this something you can investigate?

Regards,
Thanh


Anil Belur
 



On Sun, Jun 2, 2019 at 8:34 AM Robert Varga <nite@...> wrote:
> which leads me to https://www.chromestatus.com/feature/5420797577396224
> and hence -- can we reasonably switch to Brotli?

Just to clarify: do we really want to move to Brotli here? Using br would take a toll on performance and end up costing more,
unless there is a newer version that I am unaware of that beats lzma/.xz (offers better compression and performance).
The only downside of .xz is that it is not viewable in the browser, due to lack of support in the HTTP Accept-Encoding header.


Anil Belur
 



On Mon, Jun 3, 2019 at 12:15 AM Thanh Ha <zxiiro@...> wrote:
> I'm using the latest version of Chrome which still asks me to open the xz file. I'm not sure if the server-side Apache server needs additional configuration to enable this?
>
> Anil, is this something you can investigate?

Nexus does not return br as the content encoding; it returns only gzip by default. I'm not sure whether these settings need to be
tuned on nginx/Apache.

HTTP/1.1 200 OK
Server: nginx/1.14.2
Date: Mon, 03 Jun 2019 00:27:01 GMT
Content-Type: text/html;charset=UTF-8
Connection: keep-alive
Strict-Transport-Security: max-age=15552000
Content-Encoding: gzip

Even if you changed the settings on Apache to support br, that would not open the .xz log files directly in your web browser.
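
For anyone who wants to repeat this check on a saved response, here is a minimal Python sketch that extracts Content-Encoding from a raw response head such as the one above. The function name and the idea of working from saved text rather than a live request are assumptions for the example.

```python
from email.parser import Parser


def content_encoding(response_head):
    """Return the Content-Encoding of a raw HTTP response head, or None."""
    # The first line is the status line; the rest are RFC 822 style headers.
    _status_line, _, header_text = response_head.partition("\n")
    headers = Parser().parsestr(header_text, headersonly=True)
    value = headers.get("Content-Encoding")
    return value.strip().lower() if value else None


# The response head quoted in this message.
RESPONSE_HEAD = """\
HTTP/1.1 200 OK
Server: nginx/1.14.2
Date: Mon, 03 Jun 2019 00:27:01 GMT
Content-Type: text/html;charset=UTF-8
Connection: keep-alive
Strict-Transport-Security: max-age=15552000
Content-Encoding: gzip
"""
```

Here content_encoding(RESPONSE_HEAD) gives "gzip"; a response served with Brotli would report "br" instead.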
 


Thanh Ha <zxiiro@...>
 

Maybe we need to set up mod_brotli?


Thanh


On Sun., Jun. 2, 2019, 22:24 Anil Belur, <abelur@...> wrote:
> The content encoding from Nexus does not return br and returns only gzip by default. I'm not sure if these settings need to be
> tuned on nginx/Apache.


Anil Belur
 

On Sun, Jun 2, 2019 at 8:34 AM Robert Varga <nite@...> wrote:

> I just checked, FF67 does not recognize .xz, but it does send:
>
> Accept-Encoding: gzip, deflate, br
>
> which leads me to https://www.chromestatus.com/feature/5420797577396224
> and hence -- can we reasonably switch to Brotli?


I ran a few tests to compare br vs xz vs gzip vs pigz --best vs pigz -11 (zopfli) with a sample log file of 148 MiB.
Time taken, longest first: pigz -11 (zopfli) > br > xz > gzip > pigz --best. Caches were dropped between the runs.

# time brotli -f -o output.xml.br output.xml
real 4m44.346s
user 4m43.687s
sys 0m0.517s

# time xz -c output.xml > output.xml.xz
real 0m30.603s
user 0m30.056s
sys 0m0.520s

# time gzip -c output.xml > output.xml.gz
real 0m2.694s
user 0m2.369s
sys 0m0.102s

# time pigz --best -p2 -k output.xml
real 0m1.926s
user 0m3.646s
sys 0m0.137s 

# time pigz -11 -p2 -k output.xml
real    16m44.561s
user    33m22.829s
sys    0m0.149s

Footprint, largest file first: gzip > pigz --best > pigz -11 (zopfli) > xz > br.

-rw-r--r--. 1 root root 148M May 30 08:18 output.xml
-rw-r--r--. 1 root root 2.3M May 30 08:18 output.xml.br
-rw-r--r--. 1 root root 2.9M Jun  3 10:45 output.xml.xz
-rw-r--r--. 1 root root 11158378 Jun  3 10:46 /tmp//output.xml.gz (gzip)
-rw-r--r--. 1 root root 10560053 May 30 08:18 output.xml.gz (pigz --best)
-rw-r--r--. 1 root root  10002016 May 30 08:18 output.xml.gz (Using zopfli)

To summarize, I think we should stick with gzip (use pigz --best) and the current implementation of lftools for the time being, because
browsers do not accept lzma as a content encoding. Moving to brotli would be costly considering that some of the jobs are pushing GiBs of logs,
and this would directly impact job throughput.

Moving our existing implementation to pigz (parallel gz) might be a better choice, although we may not see a significant reduction in file size.
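
A comparison along these lines can be repeated on any sample file with a short Python sketch. This is not the procedure used for the numbers above: it only covers the gzip and lzma modules from the standard library (brotli, pigz and zopfli are left out because they need external tools), and the codec labels and function names are assumptions.

```python
import gzip
import lzma
import time


def benchmark(data, codecs=None):
    """Compress data with each codec; return {name: (seconds, compressed_size)}."""
    if codecs is None:
        # Labels mirror the default levels of the gzip and xz command-line tools.
        codecs = {
            "gzip -6": lambda b: gzip.compress(b, compresslevel=6),
            "gzip -9": lambda b: gzip.compress(b, compresslevel=9),
            "xz -6": lambda b: lzma.compress(b, preset=6),
        }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        results[name] = (time.perf_counter() - start, len(out))
    return results


def report(data):
    """Return one formatted line per codec, largest output first."""
    results = benchmark(data)
    ordered = sorted(results.items(), key=lambda kv: kv[1][1], reverse=True)
    return [
        f"{name:8s} {seconds:8.3f}s {size:10d} bytes ({100 * size / len(data):.1f}%)"
        for name, (seconds, size) in ordered
    ]
```

Reading a sample such as output.xml into memory with open("output.xml", "rb").read() and printing the lines from report() gives a time and size per codec; the absolute numbers will depend on the machine and on the file.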

Cheers,
Anil 
 


Luis Gomez
 

Not sure if anybody has already mentioned this, but output.xml is fine to compress because it is normally a very big file and not used much for debugging tests. However, I would leave log.html as it is today, because it is used a lot and is normally not that big.

On Jun 3, 2019, at 4:41 PM, Anil Belur <abelur@...> wrote:

> To summarize, I'd think we should stick with gzip (use pigz --best) and the current implementation of lftools for the time being because
> lzma is not supported in the response headers.

_______________________________________________
integration-dev mailing list
integration-dev@...
https://lists.opendaylight.org/mailman/listinfo/integration-dev


Thanh Ha <zxiiro@...>
 

Hi Luis,

These files can be opened in the browser while compressed as long as we use a supported compression method. So I think compressing all possible files is a good idea so that you can download faster.

The main requirement for this, though, is that these files must be openable in the browser for our testers. If that is not possible with the compression algorithm used, then I would be against using it.

Regards,
Thanh


On Mon, 3 Jun 2019 at 23:23, Luis Gomez <ecelgp@...> wrote:
> Not sure if anybody has already mentioned this, but output.xml is fine to compress because it is normally a very big file and not used much for debugging tests. However, I would leave log.html as it is today, because it is used a lot and is normally not that big.


Anil Belur
 

Hello Robert:

For the first issue, we have a fix in lftools that gets pulled in with the latest version of global-jjb, v0.38.3, so
.xml and .html files will be compressed going forward.


Thanks,
Anil

On Fri, May 31, 2019 at 1:16 PM Anil Belur <abelur@...> wrote:
> This change [1.] should fix the issue which adds file types and also uses lzma instead of gzip.
>
> [1.] https://gerrit.linuxfoundation.org/infra/c/releng/lftools/+/15793