Index
- What is lzip?
- What makes lzip different from gzip/bzip2?
- What do you mean I can't restore my files?
- What is lossy compression?
- What are the benefits?
- Are there any drawbacks?
- Why don't more people use lossy compression?
- What is the Lessiss-Moore algorithm?
- What is the PLACeBO algorithm?
- Since PLACeBO doesn't work, why does lzip use it?
- What is the Free Object-Oriented License?
- What is Free Software?
- How can I aid in the development of lzip?
- What are your plans for the future of lzip?
1. What is lzip?
Lzip is the most advanced file compression utility ever conceived.
It is literally years ahead of gzip (though admittedly gzip was around
first), and makes use of mathematical transforms the bzip developers have
never even heard of. The practical upshot of this is that when you use
lzip, you get the best compression on the planet. Smaller file sizes; faster
compression/decompression times.
Used properly, lzip is capable of reducing
a file down to 0% of its original size. Yes, you read that correctly: 0%
of its original size. And regardless of file size, this can be done in
constant time. Now do you see why some people are calling lzip the "holy
grail" of file utilities?
2. What makes lzip different from gzip/bzip2?
Well, other than the performance benefits mentioned above, the real
difference is that lzip uses a "lossy" compression scheme. Most other file
compression utilities use a "lossless" compression scheme, mostly
because the lossless algorithms are better understood and simpler mathematically
(most programmers take shortcuts, particularly in areas that involve a
lot of math).
This has two side effects. The first is that
files compressed with lzip cannot be restored to their original state --
this is the "lossy" in lossy compression. The second is that the performance
is vastly improved. Why don't you go back up to question number one and
read that second paragraph again. We're talking about a constant-time algorithm
that can reduce a file down to 0% of its original size. What's not to like?
3. What do you mean I can't restore my files?
Ha! A common misconception. You can restore your files
after they have been compressed with lzip. They just won't be exactly
the same as they were before. This makes sense when you think about it;
if you lose a lot of weight suddenly, and then put the same weight back
on suddenly, you wouldn't expect to be in exactly the same health that
you were when you started, would you? Compression is a dramatic process,
and dramatic processes often change people. It's no different for your
files.
On the reassuring side, it is important to
note that the compression algorithm used by lzip only discards the unimportant
data. And if it was unimportant before, what makes it so important now? Huh? In
fact, many users may find that compressing their entire file system and
then restoring it will be a good way to learn what is truly important.
4. What is lossy compression?
Simply put, a lossy compression algorithm is one in which not all of
the data is preserved. The JPEG file format uses lossy compression.
By contrast, the GIF format uses lossless compression. And just
look at all the trouble that decision has caused.
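To make the distinction concrete, here is a minimal Python sketch. It is ours, for illustration only, and is not taken from lzip. The lossless half uses the standard zlib module. The lossy half uses two hypothetical helpers, lossy_encode and lossy_decode, which simply keep the high 4 bits of every byte; that is one crude way to discard data, chosen as an assumption for the example.

```python
import zlib


def lossy_encode(raw: bytes) -> bytes:
    """Hypothetical lossy encoder: keep the high 4 bits of each byte,
    packing two input bytes into one output byte."""
    if len(raw) % 2:
        raw += b"\x00"  # pad to an even length so bytes can be paired
    return bytes(
        (raw[i] & 0xF0) | (raw[i + 1] >> 4) for i in range(0, len(raw), 2)
    )


def lossy_decode(packed: bytes) -> bytes:
    """Hypothetical decoder: the discarded low bits come back as zeros."""
    out = bytearray()
    for b in packed:
        out.append(b & 0xF0)          # first byte of the pair
        out.append((b & 0x0F) << 4)   # second byte of the pair
    return bytes(out)


data = bytes(range(256))

# Lossless: the round trip returns exactly the original bytes.
assert zlib.decompress(zlib.compress(data)) == data

# Lossy: the output is half the size, but the round trip is only approximate.
restored = lossy_decode(lossy_encode(data))
print(len(lossy_encode(data)), restored == data)
```

The lossy round trip gives back something close to the input, never the input itself, because the low bits were thrown away at encode time and no decoder can guess them.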
Specifically, lzip uses the Lessiss-Moore
algorithm to do its compression. You specify the level of compression that
you want on the command line, and lzip meets your needs by tweaking the
algorithm. The algorithm used by lunzip is currently a modified version
of the PLACeBO algorithm, although this may change with the next release.
5. What are the benefits?
Numerous; numerous. The size factor, obviously, is a prime benefit.
But on a deeper level, using lossy compression to manage your files is
a way to learn something about yourself. You will most likely experience
a feeling of euphoria or lightheadedness as you watch your free disk space
cascade upwards to 100%. You will become bolder, have increased stamina,
and adrenaline may make you temporarily impervious to pain. You may also
gain a new appreciation for backup devices (this has been widely reported
among the developers).
Lossy compression has benefits that extend
well beyond day-to-day file management. Our short list includes permanent
(irretrievable) archiving, ultra-high-speed transfers over existing network
lines, and high-security "steganographic" storage of sensitive information.
6. Are there any drawbacks?
Not that we know of. Occasionally, in the pre-1.0 days, someone would
compress a file down to 0K and it would be lost for good. But that has
been happening less and less frequently, and these days it has been a long
time since we received any complaints from the people who reported this
originally.
7. Why don't more people use lossy compression?
Probably because it is so new. The Lessiss-Moore algorithm that
lzip uses was only invented a few days ago, and the decompression algorithm
is even now still under development.
There are also a lot of people who are just
content to stay satisfied with the status quo. We call these people "lazy
dopes." Where would the world be today if it weren't for go-getters and
dreamers like Tom Edison, Karl Marx, Henry Ford, or even Voltaire or the
Earl of Sandwich? Just reflect on that next time you're eating lunch,
if you catch my drift.
8. What is the Lessiss-Moore algorithm?
The Lessiss-Moore algorithm was invented by Werner von Lessiss and
R.T. Moore in the middle of the last century. I'm sorry; I meant to
say the middle of last week. [note to nate: change this].
It utilizes a two-pass bit-sieve to first
remove all unimportant data from the data set. Lzip implements this quite
effectively by eliminating all of the 0's. It then sorts the remaining
bits into increasing order, and begins searching for patterns. The number
of passes in this search is set to (10-N) in lzip, where N
is the numeric command-line argument we've been telling you about.
For every pattern of length (10/N)
found in the data set, the algorithm makes a mark in its hash table. By
keeping the hash table small, we can reduce memory overhead. Lzip uses
a two-entry hash table. The data in this table is then plotted in three
dimensions, and a discrete cosine transform transforms it into frequency
and amplitude data. This data is filtered for sounds that are beyond the
range of the human ear, and the result is transformed back (via an indiscrete
cosine) into the hash table, in random order.
Take each pattern in the original data set,
XOR it with the log of its entry in the new hash table, then shuffle each
byte two positions to the left and you're done!
As you can see, there is some very advanced
thinking going on here. It is no wonder this algorithm took so long
to develop!
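The opening sieve stage described above (eliminate all of the 0's, then sort the remaining bits into increasing order) is simple enough to sketch in Python. This is a minimal sketch under our own assumptions, not lzip's actual code: bit_sieve is a hypothetical name, bits are read most-significant first, and the later stages (pattern search, hash table, cosine transform, XOR and shuffle) are left out entirely.

```python
def bit_sieve(data: bytes) -> str:
    """Hypothetical sketch of the two sieve steps only.

    Pass 1 drops every 0 bit; pass 2 sorts what is left in increasing order.
    The result is returned as a string of '1' characters.
    """
    bits = "".join(f"{byte:08b}" for byte in data)  # MSB-first is an assumption
    survivors = [b for b in bits if b != "0"]       # pass 1: eliminate the 0's
    survivors.sort()                                # pass 2: sort remaining bits
    return "".join(survivors)


print(bit_sieve(b"lzip"))
```

Since only 1 bits survive and they are all equal, the output depends only on how many 1 bits the input contained. Two different inputs with the same count come out identical, which is one way to see why the original cannot be recovered afterwards.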
9. What is the PLACeBO algorithm?
PLACeBO was the lzip team's first attempt to implement the Lessiss-Moore
compression filter in reverse. The results were less than astounding, however,
as analysis has shown Lessiss-Moore to be a trapdoor function.
In the end, PLACeBO may be abandoned
in favor of something else. For now, however, it is the method used by
lunzip to decompress lzip-compressed files, even if it has its flaws.
10. Since PLACeBO doesn't work, why does lzip use it?
It may not be perfect, but it is the best tool we have. I don't
want anyone to get the wrong impression: just because PLACeBO doesn't
work, doesn't mean it can't be used. Lunzip makes up for the shortcomings
in PLACeBO by patching in a couple of support functions.
We use the Warren Interior Point Method from
Operations Research to step backwards through the cosine transform. This
method, alternatively known as the Warren "Dice-Prayer" method, is very
useful in OR problems when you don't have the time or perhaps the
willpower to work through Simplex. The application of it to our filtering
problem was not straightforward, but late in the process we added fast
Monte Carlo sorting to the mix and everything seems to have turned out
fine.
11. What is the Free Object-Oriented License?
The Free Object-Oriented License (or FO2L,
or "foo" license) is an Open Source license we created for releasing
the code for lzip. Many people create their own licenses every day, and
we figured we should take a look at the existing ones to see which best
met our needs. Unfortunately, the creation of the Lzip logo graphic took
a lot longer than expected, and we never got around to looking at the existing
licenses. The FO2L is what we came up with
on our own. You can read it here.
12. What is Free Software?
I'm not sure. I've heard a lot about it, though, so I'm going
to assume that it's here to stay. We decided to include "Free" in the name
of our license because we liked the way it sounded, and we needed
an "F" for the acronym to come out how we wanted.
A lot of sites that talk about free software
seem to point here. When I get the chance, I
plan to check it out myself someday.
13. How can I aid in the development of lzip?
We'd love to have you help out! Unfortunately, the odds are pretty
low that you have anything good to offer. You have to be pretty smart to
keep up with the lzip team. We're already staffed with really smart people,
several of whom have quite a bit of experience writing software of this
sort. Those that have computers (unlike myself) tell me that programming
isn't really all that interesting anyway.
But, if you're still up to the challenge,
please tell us in the discussion forum.
14. What are your plans for the future of lzip?
We have many plans, including creating a library in addition to the
standalone program, and adding a GUI with a variety of themeable "skins."
Your suggestions are welcome.
April 1, 2000