Update on LZMA compression

Hi All,

This mail is a follow-up to a previous mail thread I started back in
June [1], to look at LZMA compression for certain parts of jmod
packages. I've played a little more with it since then, and here is a
summary of my findings.

Originally I used the Java implementation from the latest 9.20 SDK [2],
but since then I have modified jpkg to use the native implementations of
LZMA and LZMA2. Initially the results were marginally better, but nothing
to write home about. Then I tried archiving the native libraries section
and applying LZMA(2) to the archive as a whole. For the archiving step I
simply used ZIP with the mode set to store (no compression). This gives
much better compression with native LZMA(2), and marginally better
compression with the Java implementation. The native implementation also
has more knobs available for increasing the compression level. The
difference between LZMA and LZMA2 is minuscule, with LZMA actually giving
marginally better compression (I would have guessed this, since LZMA2
uses subchunks to support multithreading). The approach is similar to
some package distributions on Linux that use tar.lzma or tar.xz. In
fact, test programs have shown that ZIP_LZMA(2) gives around the same
level of compression as tar.lzma and tar.xz for typical native library
content.
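
To make the store-then-compress idea concrete, here is a minimal sketch using Python's standard zipfile and lzma modules. It is not the jpkg code: the function names are made up for illustration, and Python's liblzma bindings stand in for the native LZMA SDK, so absolute numbers will differ. FORMAT_ALONE is the legacy .lzma (LZMA) container and FORMAT_XZ uses LZMA2.

```python
import io
import lzma
import zipfile


def store_then_lzma(files, lzma_format=lzma.FORMAT_ALONE):
    """Store every entry uncompressed in one ZIP, then LZMA the whole archive.

    files: dict of entry name -> bytes (a hypothetical stand-in for the
    native libraries section of a package).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name in sorted(files):
            zf.writestr(name, files[name])
    return lzma.compress(buf.getvalue(), format=lzma_format)


def per_entry_deflate(files):
    """Baseline for comparison: a ZIP where each entry is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(files):
            zf.writestr(name, files[name])
    return buf.getvalue()


def unpack(blob, lzma_format=lzma.FORMAT_ALONE):
    """Reverse store_then_lzma: decompress, then read the stored entries."""
    raw = lzma.decompress(blob, format=lzma_format)
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

Because the compressor sees all entries as one stream, content shared between libraries can be matched across entry boundaries, which a per-entry scheme cannot do. The cost is that the whole stream has to be decompressed before any single entry can be read.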

I looked at the jmod packages built on all platforms/architectures. They
differ in size and in the number of classes and native libraries, but
using a combination of ZIP (for archiving) and LZMA (for compression)
gives us very favorable jmod package sizes compared to both the original
and deb packages. Here are a few examples, using the two largest
packages, jdk.boot and sun.desktop, and the total size of all the jdk
module packages (jigsaw-pkgs):


 > : ls -la */jdk.boot*
-rw-rw-r--   1 root 6018     6805823 Jul 28 15:19 jmod-lzma/[hidden email]
-rw-rw-r--   1 root 6018     6807999 Jul 28 15:22 jmod-lzma2/[hidden email]
-rw-rw-r--   1 root 6018     12159909 Jul 28 15:16 jmod/[hidden email]
 > : ls -la */sun.desktop*
-rw-rw-r--   1 root 6018     4548927 Jul 28 15:20 jmod-lzma/[hidden email]
-rw-rw-r--   1 root 6018     4549255 Jul 28 15:24 jmod-lzma2/[hidden email]
-rw-rw-r--   1 root 6018     5731426 Jul 28 15:17 jmod/[hidden email]
 > : du -sk *
26840   jmod
19112   jmod-lzma
19112   jmod-lzma2 << (~29% reduction)


 > : ls -la */jdk.boot*
-rw-r--r--   1 root java     6715654 Jul 29 16:00 jmod-lzma/[hidden email]
-rw-r--r--   1 root java     6709674 Jul 29 16:07 jmod-lzma2/[hidden email]
-rw-r--r--   1 root java     12546939 Jul 29 15:52 jmod/[hidden email]
 > : ls -la */sun.desktop*
-rw-r--r--   1 root java     6228344 Jul 29 16:03 jmod-lzma/[hidden email]
-rw-r--r--   1 root java     6230798 Jul 29 16:11 jmod-lzma2/[hidden email]
-rw-r--r--   1 root java     8606906 Jul 29 15:56 jmod/[hidden email]
 > : du -sk *
62628   jmod
43318   jmod-lzma
43320   jmod-lzma2 << (~30% reduction)


 > : ls -la */jdk.boot*
-rw-rw-r--   1 root 6018     4905902 Jul 28 15:08 jmod-lzma/[hidden email]
-rw-rw-r--   1 root 6018     4909477 Jul 28 15:11 jmod-lzma2/[hidden email]
-rw-rw-r--   1 root 6018     6564630 Jul 28 15:06 jmod/[hidden email]
 > : ls -la */sun.desktop*
-rw-rw-r--   1 root 6018     4768691 Jul 28 15:10 jmod-lzma/[hidden email]
-rw-rw-r--   1 root 6018     4769668 Jul 28 15:13 jmod-lzma2/[hidden email]
-rw-rw-r--   1 root 6018     5985300 Jul 28 15:07 jmod/[hidden email]
 > : du -sk *
21593   jmod
17583   jmod-lzma
17584   jmod-lzma2 << (~19% reduction)
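
For anyone who wants to check the reduction figures, they follow directly from the du totals in the three listings above. A small sketch; the helper name is mine, and the tuples are simply the jmod and jmod-lzma2 totals in the order the listings appear.

```python
def reduction_percent(original_kb, compressed_kb):
    """Percentage by which the compressed total is smaller than the original."""
    return 100.0 * (original_kb - compressed_kb) / original_kb


# (jmod, jmod-lzma2) totals in KB, copied from the three du listings.
totals = [(26840, 19112), (62628, 43320), (21593, 17584)]

for original, compressed in totals:
    print("%d -> %d KB: %.1f%% smaller"
          % (original, compressed, reduction_percent(original, compressed)))
```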

I did not use the maximum LZMA compression level when generating these
results. I found during various runs that the higher levels of
compression give very little gain and increase the compression time
quite a bit. A level just one above the default instead gives very good
compression with a reasonable compression time. That said, generating
packages is a one-time event, so decompression/installation time is more
important. Installation/extraction times of packages using LZMA are
about 1.5 to 2 times longer than those of the existing packages.
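
The size versus time trade-off between levels is easy to measure on your own data. The sketch below uses Python's lzma module, where presets run from 0 to 9 and the default is 6. That numbering belongs to liblzma/xz, and I am only assuming it is comparable to the level setting of the native SDK used for the results above; the function name is hypothetical.

```python
import lzma
import time


def measure_presets(data, presets=(6, 7, 9)):
    """Return a list of (preset, compressed_size, seconds), one per preset."""
    results = []
    for preset in presets:
        start = time.perf_counter()
        blob = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
        elapsed = time.perf_counter() - start
        # Only report a size for output that decompresses back to the input.
        if lzma.decompress(blob) != data:
            raise ValueError("round trip failed for preset %d" % preset)
        results.append((preset, len(blob), elapsed))
    return results


# Example use, with a path of your choosing:
#   with open(path, "rb") as f:
#       for preset, size, secs in measure_presets(f.read()):
#           print(preset, size, round(secs, 2))
```

Higher presets also need considerably more memory while compressing, which is another reason not to default to the maximum.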

I don't see compression as 'one size fits all'. For my tests I added a
new option to jpkg to enable LZMA. I guess the key here is to find a
good match for the JDK packages, and possibly to support
creation/extraction/installation of both the existing GZIP format and
LZMA.
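
As a rough illustration of supporting both formats side by side, the sketch below picks the compressor from an option at creation time and sniffs the stream at extraction time. This is purely hypothetical and says nothing about how jpkg or the jmod format actually records the compression method; a real format would more likely store it in a header field. The function names and the "gzip"/"lzma" option values are mine.

```python
import gzip
import lzma

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def compress_section(data, method="gzip"):
    """Compress a package section with the chosen method (hypothetical option)."""
    if method == "gzip":
        return gzip.compress(data)
    if method == "lzma":
        return lzma.compress(data, format=lzma.FORMAT_ALONE)
    raise ValueError("unknown compression method: %r" % (method,))


def decompress_section(blob):
    """Decompress a section written by compress_section.

    Sniffing is a heuristic: legacy .lzma streams have no magic number,
    so anything that is not gzip is handed to the LZMA decoder.
    """
    if blob[:2] == GZIP_MAGIC:
        return gzip.decompress(blob)
    # The default FORMAT_AUTO accepts both .xz and legacy .lzma streams.
    return lzma.decompress(blob)
```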

It is worth noting that archiving the native libraries within the jmod
package removes the ability to extract a single library individually,
but I don't think this should be a problem or conflict with the goal of
having the package format streamable. We already do something similar
for classes, which are stored as a gzipped pack200 archive. Please shout
if I have misinterpreted this goal.


[1] http://mail.openjdk.java.net/pipermail/jigsaw-dev/2011-June/001332.html
[2] http://www.7-zip.org/sdk.html