ZX7: an "optimal" LZ77 packer

About 20 years ago, I started a M.Sc. in Computing, planning to work on data compression. Although I quickly changed to another subject for a number of reasons, I always kept a desire to produce something in this area...

Last week, motivated by the recent interest in WoS about improving existing packers (data compressors) for the ZX-Spectrum, I finally decided to give it a try. The result is a new packer called "ZX7":

http://www.worldofspectrum.org/infoseekid.cgi?id=0027996

"ZX7" is basically a new implementation of the classic LZ77 ("Lempel-Ziv 1977") algorithm, using a data format very similar to Bitbuster. In comparison to already existing packers, its key characteristics are:
  • AFAIK all existing LZ77 compressors are based on "greedy" or "flexible parsing", because everybody still claims that comparing all possibilities to find the best possible compression would take too much time. However I invented a new compression algorithm for "ZX7" that is "optimal" (i.e. it will always find the best data compression) and nearly instantaneous on modern computers.

  • The "ZX7" decompressor for the ZX-Spectrum is highly optimized. It's just 73 bytes long, requires no extra memory area, and only affects the main registers (HL, DE, BC, AF).
Notice however that "ZX7" uses LZ77 compression only. In practice, its "optimal" packer probably means that "ZX7" should work better than other packers based (almost) solely on LZ77 (such as PuCrunch and Bitbuster). However this is probably not enough to win against more complex packers that use different combinations of algorithms (such as Exomizer and RIP). Perhaps I will also add support for other algorithms into "ZX7" in the future...

In the meantime, I will take some time to compare it with existing packers and post the results here!


EDIT: Updated link to WoS entry
Post edited by Einar Saukas on
Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
«13456712

Comments

  • edited December 2012
    The included .exe doesn't work on 64 bit windows, and when I tried to compile it with GCC, I couldn't because the coder ain't proper C (seems to mix in a bit of C++ syntax).

    I changed a few things to make it work:

    zx7.h , line 42:
    unsigned char *compress(Optimal *optimal, unsigned char *input_data, size_t input_size, size_t *output_size);
    

    zx7.c , line 107:
        output_data = compress(optimize(input_data, input_size), input_data, input_size, &output_size);
    

    compress.c, line 63:
    unsigned char *compress(Optimal *optimal, unsigned char *input_data, size_t input_size, size_t *output_size) {
    

    compress.c, line 72:
        *output_size = (optimal[input_index].bits+18+7)/8;
    

    Now after testing it, it doesn't seem to be that optimal... using MegaLZ's benchmark test, I get:

    123609 bytes for bitbuster_extreme0.1
    121942 bytes for zx7
    118341 bytes for MegaLZ
    117791 bytes for pletter 0.5a
    115547 bytes for RNC PRO-PACK
    114715 bytes for aPLib
    113927 bytes for pucrunch
    110121 bytes for exomizer 2 (no literal sequences)
    110116 bytes for exomizer 2 (no literal sequences,optimized)
    110089 bytes for exomizer 2
    110085 bytes for exomizer 2 (optimized)
  • edited December 2012
    In the meantime, I will take some time to compare it with existing packers and post the results here!

    I suggest you use Ticks for compare time with other packers.

    http://www.worldofspectrum.org/forums/showthread.php?t=41709
  • edited December 2012
    Metalbrain wrote: »
    The included .exe doesn't work on 64 bit windows

    Ops! I compiled it with djgpp without spending enough time on tests. Thanks a lot for the warning!

    Metalbrain wrote: »
    and when I tried to compile it with GCC, I couldn't because the coder ain't proper C (seems to mix in a bit of C++ syntax).

    Ops! I forgot that passing parameters by reference was C++ only. Again I should have checked it better. Thanks for the correction!

    Metalbrain wrote: »
    Now after testing it, it doesn't seem to be that optimal...

    "ZX7" is optimal LZ77 only, i.e. it always provides the best possible compression for the LZ77 method according to the chosen data format. However LZ77 is not always the best compression method, other algorithms may work better for certain parts of a file. Also the Bitbuster data format (adopted by "ZX7" with minor changes) seems to work quite well in most cases for LZ77, but other formats may be more efficient for specific files.

    I'm not surprised other compressors provided better results, since they combine different algorithms, choosing different methods for each part of the file. I would be just surprised if it wasn't able to defeat Bitbuster :)

    Do you have the detailed test results?
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Congratulations! The optimal finding is not an easy problem to solve. (but i must say i haven't looked into it.)

    Also, I'm currently looking for an easy to use PC compressor/ZX decompressor combination. I might actually looking at it to see how good it is for compressing large files, like screen data.

    I'm also looking for something that can compress/decompress large amount of individual map data or sprite data. How good is this algorithm for this? Do you store additional data like dictionaries of trees into every piece of the compressed data?
  • edited December 2012
    2 bytes improvement:
    ; -----------------------------------------------------------------------------
    ; ZX7 decoder by Einar Saukas
    ; "Standard" version (73 bytes only)
    ; -----------------------------------------------------------------------------
    ; Parameters:
    ;   HL: source address (compressed data)
    ;   DE: destination address (decompressing)
    ; -----------------------------------------------------------------------------
    
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine number of bits used for length (Elias gamma coding)
            push    de
            ld      bc, 0
            ld      d, b
    length_size_loop:
            inc     d
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      b
            jr      c, exit                 ; check end marker
            dec     d
            jr      nz, length_value_loop
            inc     bc                      ; adjust length
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            sll     e                       ; opcode for undocumented instruction "SLS E"
            jr      nc, offset_end          ; if offset flag is set, load 4 extra bits
            rr      d
            call    twice
            call    twice
    offset_end:
            rr      e                       ; insert fourth bit into E
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            jr      main_loop
    exit:   pop     de
    twice:  call    rld_next_bit
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    


    Edit: Same bytes but with a clean exit:
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine number of bits used for length (Elias gamma coding)
            push    de
            ld      bc, 0
            ld      d, b
    length_size_loop:
            inc     d
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      b
            jr      c, exit                 ; check end marker
            dec     d
            jr      nz, length_value_loop
            inc     bc                      ; adjust length
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            sll     e                       ; opcode for undocumented instruction "SLS E"
            jr      nc, offset_end          ; if offset flag is set, load 4 extra bits
            call    twice1
            call    twice
    offset_end:
            rr      e                       ; insert fourth bit into E
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            db      $18
    twice1: rr      d
    twice:  call    rld_next_bit
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    
    exit:   pop     de
            ret
    
  • edited December 2012
    "ZX7" is optimal LZ77 only, i.e. it always provides the best possible compression for the LZ77 method according to the chosen data format. However LZ77 is not always the best compression method,

    So it's like designing an optimal bicycle which however still is slower than an unoptimal, average car? ;)

    Well, as long as it is fun, it is perfectly okay !
  • edited December 2012
    68 bytes:
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            ld      bc, 0
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine number of bits used for length (Elias gamma coding)
            push    de
            ld      d, b
    length_size_loop:
            inc     d
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      b
            jr      c, quad-1               ; check end marker
            dec     d
            jr      nz, length_value_loop
            inc     bc                      ; adjust length
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            sll     e                       ; opcode for undocumented instruction "SLS E"
            call    c, quad                 ; insert first bit into D
            rr      e                       ; insert fourth bit into E
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            jr      main_loop+1
    
    quad:   rr      d
            call    twice
    twice:  call    rld_next_bit
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    

    Also you can change this fragment, same bytes but faster:
            ld      d, b
    
    length_size_loop:
            inc     b
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      d
            jr      c, quad-1               ; check end marker
            djnz    length_value_loop
            ld      b, d
    
  • edited December 2012
    64 bytes if longest match is <=255 length
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
            ld      bc, 0
            
    length_size_loop:
            inc     b
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            ret     c                       ; check end marker
            djnz    length_value_loop
    
            push    de
            ld      d, b
            inc     c                       ; adjust length
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            sll     e                       ; opcode for undocumented instruction "SLS E"
            call    c, quad                 ; insert first bit into D
            rr      e                       ; insert fourth bit into E
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            jr      main_loop
    
    quad:   rr      d
            call    twice
    twice:  call    rld_next_bit
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    
  • edited December 2012
    Metalbrain wrote: »
    The included .exe doesn't work on 64 bit windows, and when I tried to compile it with GCC, I couldn't because the coder ain't proper C (seems to mix in a bit of C++ syntax).

    OK, I recompiled the compressor for Windows using Open Watcom instead of DJGPP (seems to work fine on 64 bit Windows). And this time I forced strict C compilation to ensure I didn't screw up again.

    The link in my original post was updated to reference the corrected version. Thanks again!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    I suggest you use Ticks for compare time with other packers.

    http://www.worldofspectrum.org/forums/showthread.php?t=41709

    Good idea!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Timmy wrote: »
    Congratulations! The optimal finding is not an easy problem to solve. (but i must say i haven't looked into it.)

    Thanks! I'm assuming it's not easy considering that LZ77 was invented in 1977, it's still quite popular, but I could not find any optimal packer for it. I even did some theoretical research and identified a few academic papers discussing this subject in recent years, but none of them seemed able to provide a proper solution.
    Timmy wrote: »
    Also, I'm currently looking for an easy to use PC compressor/ZX decompressor combination. I might actually looking at it to see how good it is for compressing large files, like screen data.

    The "ZX7" demo (in TZX) shows how it works to decompress Spectrum screens. You may want to take a look.
    Timmy wrote: »
    I'm also looking for something that can compress/decompress large amount of individual map data or sprite data. How good is this algorithm for this? Do you store additional data like dictionaries of trees into every piece of the compressed data?

    There's no dictionary or additional data, so the overhead for each separate block is quite small. However it depends on the characteristics of your data. In certain specific cases, a simple RLE encoding may provide better results than the most sophisticated packers! You will have to give it a try.

    The "ZX7" documentation is fairly small and should provide all the info you need. If you have further questions, just let me know!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Hi,

    the name supposedly will be confusing.
    http://www.worldofspectrum.org/infoseekid.cgi?id=0012538
    ZX Spectrum 48K BEEPER Music:
    http://mister_beep.republika.pl/
  • edited December 2012
    2 bytes improvement

    Thanks! Please give me some time to check your suggestions, and I will write a proper reply later!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Ralf wrote: »
    So it's like designing an optimal bicycle which however still is slower than an unoptimal, average car? ;)

    Exactly!

    Nowadays we need cars much more often, but it doesn't mean we should stop improving bicycles. Besides, new ideas for bicycle improvements can sometimes also help to make cars (or motorcycles) slightly better...
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Hi,

    the name supposedly will be confusing.
    http://www.worldofspectrum.org/infoseekid.cgi?id=0012538

    Yes, I first thought it was a specialised packer for ZX-7 music data too.
    WIP Tritone Demo
    No more html on dropbox. :(
  • edited December 2012
    the name supposedly will be confusing.
    http://www.worldofspectrum.org/infoseekid.cgi?id=0012538

    Thanks for the info! However I don't think this will be a problem, since "ZX7" and "ZX-7" are completely unrelated (also spelled differently) so nobody should ever be confused about this.

    Besides, ZX7 is not an uncommon name. For instance, it's also a Kawasaki Ninja model:

    http://upload.wikimedia.org/wikipedia/commons/7/79/Kawazaki_ZX7_-_South_Darenth%2C.jpg
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    FrankT wrote: »
    Yes, I first thought it was a specialised packer for ZX-7 music data too.

    It seems like you just contradicted what I just posted! :)
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Also you can change this fragment, same bytes but faster:
            ld      d, b
    
    length_size_loop:
            inc     b
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      d
            jr      c, quad-1               ; check end marker
            djnz    length_value_loop
            ld      b, d
    

    This won't work because it will leave some garbage in D, that will be used to store MSB of the offset value.
    64 bytes if longest match is <=255 length

    Unfortunately this change would generate larger compressed files in most cases, so it's not a good idea.

    However your other suggestions are very clever! I will certainly incorporate some of them (giving you proper credit), thanks a lot!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    This won't work because it will leave some garbage in D, that will be used to store MSB of the offset value.

    Ups. It's true. Sorry in my test the length was <255.
    Unfortunately this change would generate larger compressed files in most cases, so it's not a good idea.

    The idea is that you add an option in the compressor to avoid >255 length (an output a shorter escape secuence). So you offer 2 decrunchers, one of 68 bytes (option disabled) and other of 64 bytes (option enabled). Also you can show a message in case that an unexperienced user pass a file with length<255 bytes suggesting the shorter decruncher. It will save 4 bytes in the code and 1 byte in the bitstream (escape secuence is 1 byte instead 2).

    And in the case that exists matches with >255 length you can try the compressor with and without the option, if the difference is less than 5 bytes the shorter version is also better.
    However your other suggestions are very clever! I will certainly incorporate some of them (giving you proper credit), thanks a lot!

    Thank you for take account my suggestions.
  • edited December 2012
    Following Metalbrain's suggestion, I executed MegaLZ's benchmark test and obtained the following results:
           |New MegaLZ
           |       |Old MegaLZ
           |       |       |PuCrunch
           |       |       |       |ZX7
           |       |       |       |       |Bitbuster Extreme v0.1
           |       |       |       |       |       |HRUST2.1
           |       |       |       |       |       |       |PCD6.2
           |       |       |       |       |       |       |       |RIP v0.01
    -------+-------+-------+-------+-------+-------+-------+-------+------
    code-1 | 12654 | 12863 | 12474 | 12754 | 12827 | 12865 | 12904 | 12118
    code-2 |  7767 |  8034 |  7832 |  7967 |  8119 |  8012 |  8090 |  7689
    code-3 |  8406 |  8524 |  8273 |  8135 |  8181 |  8291 |  8373 |  7891
    -------+-------+-------+-------+-------+-------+-------+-------+------
    scrv-1 |  4672 |  4724 |  4613 |  4727 |  4757 |  4705 |  4764 |  4551
    scrv-2 |  4074 |  4148 |  4131 |  4134 |  4170 |  4144 |  4173 |  4030
    scrv-3 |  3604 |  3688 |  3670 |  3684 |  3717 |  3722 |  3753 |  3544
    scrv-4 |  2211 |  2314 |  2156 |  2225 |  2270 |  2359 |  2366 |  2165
    scrv-5 |  4399 |  4479 |  4415 |  4431 |  4470 |  4486 |  4524 |  4298
    scrv-6 |  4233 |  4324 |  4354 |  4238 |  4300 |  4345 |  4364 |  4129
    scrv-7 |  3788 |  3899 |  3800 |  3856 |  3902 |  3921 |  3942 |  3721
    scrv-8 |  5275 |  5340 |  5288 |  5307 |  5345 |  5308 |  5373 |  5139
    scrv-9 |  4958 |  5043 |  5038 |  5030 |  5081 |  5048 |  5095 |  4844
    -------+-------+-------+-------+-------+-------+-------+-------+------
    text-1 |  3858 |  3986 |  3872 |  3975 |  4077 |  4004 |  4029 |  3726
    text-2 |  7341 |  7663 |  6762 |  7585 |  7766 |  7549 |  7470 |  6525
    text-3 | 10252 | 10681 |  9184 | 10887 | 11156 | 10360 | 10246 |  8847
    text-4 | 20406 | 21417 | 18577 | 21563 | 22246 | 20892 | 20752 | 17334
    text-5 | 10443 | 10919 |  9488 | 10892 | 11225 | 10618 | 10515 |  9088
    -------+-------+-------+-------+-------+-------+-------+-------+------
     TOTAL |118341 |122046 |113927 |121390 |123609 |120629 |120733 |109639
    

    Results above were obtained as follows:
    • PuCrunch: executed using options "pucrunch.exe -c0 -d" (that seems to produce the best results);
    • ZX7: executed using the version I uploaded earlier today;
    • For everything else, I just copied the info from the original MegaLZ benchmark documentation.

    My conclusions:
    • It's no surprise that ZX7 usually provides better results than BitBuster and Old MegaLZ since these are all "pure" LZ77 implementations (AFAIK), thus the "optimal" ZX7 compressor gives a lot of advantage in this case. Although there's still some variation here, because Old MegaLZ data format may be a better choice for certain files and ZX7 data format may be a better choice for others.

    • It's no surprise either that packers based on more complex combinations of different algorithms provide even better results. All commercial packers for modern computers are based on this principle! In particular, the best results were obtained by RIP, which is based on LZSS with Huffman (although depacking is supposedly quite slow).

    • I was just surprised about PuCrunch results. It worked much better than I expected! I realize now it's not "pure" LZ77 as I originally thought, it actually uses a complex combination of LZ77 and RLE. I didn't think this would be so effective but it is! I will take another look at it later.


    EDIT: Updated table above according to new results presented in #40
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited December 2012
    Hi Einar. I have another suggestion to you, in this case it's not mine, it's from Urusergi.

    You can interlace the counter and the bits of the gamma encoding in something like this:
    length_loop:
            call    next_bit
            rl      c
            rl      b
            ret     c
            call    next_bit
            jr      c, length_loop
            inc     bc
            push    de
    

    So you must change also the compressor.
  • edited December 2012
    Sorry for the latest code, Urusergi has revised and this is the correct one:
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine length (Elias gamma coding)
            ld      bc, 0
    length_value_loop:
            call    nc, next_bit            ; load length data bit
            rl      c
            rl      b
            ret     c                       ; check end marker
            call    next_bit                ; load length flag
            jr      nc, length_value_loop
            inc     bc                      ; adjust length
            push    de
    ; determine offset
    

    The correct Elias Gamma coding is here:
    1-> bc=0000000000000001= length 1+1
    0 0 1-> bc=0000000000000010= length 2+1 
    0 1 1-> bc=0000000000000011= length 3+1
    0 0 0 0 1-> bc=0000000000000100= length 4+1
    0 0 0 1 1-> bc=0000000000000101= length 5+1
    0 1 0 0 1-> bc=0000000000000110= length 6+1
    0 1 0 1 1-> bc=0000000000000111= length 7+1
    0 0 0 0 0 0 1-> bc=0000000000001000= length 8+1
    0 0 0 0 0 1 1-> bc=0000000000001001= length 9+1
    0 0 0 1 0 0 1-> bc=0000000000001010= length 10+1
    0 0 0 1 0 1 1-> bc=0000000000001011= length 11+1
    .....
    
  • utzutz
    edited December 2012
    FrankT wrote: »
    Yes, I first thought it was a specialised packer for ZX-7 music data too.

    I was a bit confused as well.

    So just for fun I tried ZX7 with a ZX-7 binary (player code + music data). Here's the result:

    Original Binary: 5838 bytes
    ZX7: 1646
    (New) MegaLZ: 1669
    Bitbuster 1.2 1684
    pucrunch 1960

    With beeper music data it's usually hard to beat MegaLZ, so that deserves an extra round of applause for Einer Saukas, I think. Will be watching the further development of this utility closely.
  • edited January 2013
    utz wrote: »
    I was a bit confused as well.

    So just for fun I tried ZX7 with a ZX-7 binary (player code + music data). Here's the result:

    Original Binary: 5838 bytes
    ZX7: 1646
    (New) MegaLZ: 1669
    Bitbuster 1.2 1684
    pucrunch 1960

    With beeper music data it's usually hard to beat MegaLZ, so that deserves an extra round of applause for Einar Saukas, I think. Will be watching the further development of this utility closely.

    Thanks! It's really a lot of coincidence that ZX7 seems especially useful for compressing ZX-7 :)
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited January 2013
    You can interlace the counter and the bits of the gamma encoding

    This is another brilliant idea! However it would increase every data compressed block by 2 bytes, so I don't think it's worth it.

    The reason is, ZX7 end marker is indicated by (length-1) >= 65536, which is represented as follows:

    00000000000000001xxxxxxxxxxxxxxxx

    In practice, the last 16 bits are omitted since they are irrelevant. However, in "interlaced Elias Gamma Coding" there would be no way to ommit these bits:

    0x0x0x0x0x0x0x0x0x0x0x0x0x0x0x0x1

    Changes that would decrease decompressor size, but increase compressed data, is exactly what I was trying to avoid. Typically Spectrum programs have several compressed data blocks (for instance one for each level) and just one copy of the decompressor, thus saving 2 bytes in every block is usually more important than saving 6 bytes in the decompressor...
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited January 2013
    Is the mediafire ZIP being kept up-to-date with the changes/improvements mentioned here?

    I'm very interested in this method as you mention that it only uses the four main register pairs (AF, BC, DE, & HL) which makes it excellent for use on a ZX81.
  • edited January 2013
    The idea is that you add an option in the compressor to avoid >255 length (an output a shorter escape secuence). So you offer 2 decrunchers, one of 68 bytes (option disabled) and other of 64 bytes (option enabled). Also you can show a message in case that an unexperienced user pass a file with length<255 bytes suggesting the shorter decruncher. It will save 4 bytes in the code and 1 byte in the bitstream (escape secuence is 1 byte instead 2).

    Yes, I agree a shorter length limit may be an advantage for certain specific files. However, the same applies for countless other variations in the compressed format. For instance, we could have an option to limit the maximum offset to 1024 instead of 2048 (this would be enough for certain files), thus saving 1 bit for every "long sequence" and reducing the decompressor size slightly. Another option to limit offset to 512, thus saving 2 bits per "long sequence" and reducing the compressor a little more. Yet another option to limit offset to 256, so offsets between 128 and 256 would spend only 8 bits instead of 12, and the decompressor would reduce a lot. And so on...

    For every change in data format and decompressor that we can imagine, it will be always possible to have files where it works better and others where it's worse.

    This is something I was also trying to avoid. ZX7 was designed for simplicity. I didn't want to provide lots of compressing options, each one requiring a different decompressor, because I don't think most users would bother to test lots of combinations just to save very few bytes.

    For this reason, I tested different combinations for encoding standard LZ77 compression, and the combination that provided the best results in most cases was exactly like BitBuster. This is not surprising, since BitBuster developers certainly executed similar tests to choose it. Thus this is the format I adopted for ZX7, except for minor improvements only.

    Just out of curiosity, I may put together an extra-official "ZX7 Mini" version combining the ideas that allow a very small decompressor routine (length and offset limited to 256 and perhaps "interlaced Elias Gamma Coding"). But I prefer to keep the official release as an all purpose, single format, very easy to use compressor. I hope you understand this design decision!

    Thank you for take account my suggestions.

    On the contrary, thank YOU for all your contributions!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited January 2013
    Based on several (excellent) suggestions from Antonio Villena, after discarding the ones that would increase compression size, I came with this version for the "Standard" decompressor:
    ; -----------------------------------------------------------------------------
    ; ZX7 decoder by Einar Saukas & Antonio Villena
    ; "Standard" version (68 bytes only)
    ; -----------------------------------------------------------------------------
    ; Parameters:
    ;   HL: source address (compressed data)
    ;   DE: destination address (decompressing)
    ; -----------------------------------------------------------------------------
    
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine number of bits used for length (Elias gamma coding)
            push    de
            ld      bc, 0
            ld      d, b
    length_size_loop:
            inc     d
            call    next_bit
            jr      nc, length_size_loop
    
    ; determine length
    length_value_loop:
            call    nc, next_bit
            rl      c
            rl      b
            jr      c, exit                 ; check end marker
            dec     d
            jr      nz, length_value_loop
            inc     bc                      ; adjust length
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            defb    $cb, $33                ; opcode for undocumented instruction "SLL E" a.k.a. "SLS E"
            call    c, long_offset          ; if offset flag is set, load 4 extra bits
            rr      e
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            jr      main_loop
    
    exit:
            pop     de
    long_offset:
            ccf                             ; clear carry flag
            call    twice_rld_next          ; load 2+2 extra bits
    twice_rld_next:
            call    rld_next_bit            ; load 1+1 extra bits
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    
    ; -----------------------------------------------------------------------------
    

    It's 68 bytes long and should run faster than previous suggestions... although I didn't test it yet.

    I will run a few tests before incorporating this into the official release.
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited January 2013
    bobs wrote: »
    I'm very interested in this method as you mention that it only uses the four main register pairs (AF, BC, DE, & HL) which makes it excellent for use on a ZX81.

    Good point!

    bobs wrote: »
    Is the mediafire ZIP being kept up-to-date with the changes/improvements mentioned here?

    Not yet. However the latest version is exactly like the current ZIP, except using the "Standard" compressor posted in #29.

    I need some time to test it properly. I will let you know as soon as I update it!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited January 2013
    Just out of curiosity, I may put together an extra-official "ZX7 Mini" version combining the ideas that allow a very small decompressor routine (length and offset limited to 256 and perhaps "interlaced Elias Gamma Coding"). But I prefer to keep the official release as an all purpose, single format, very easy to use compressor. I hope you understand this design decision!

    Yes, of course. In that extra-official ZX7 Mini I have decided to the alternate Elias Gamma encoding (even bits), and the length limit to 255 bytes.

    This is the 59 bytes decruncher:
    dzx7:
            ld      a, $80
    copy_byte_loop:
            ldi                             ; copy literal byte
    main_loop:
            call    next_bit
            jr      nc, copy_byte_loop      ; next bit indicates either literal or sequence
    
    ; determine length (Elias gamma coding, EVEN bits)
            ld      bc, 1
    length_value_loop:
            call    next_bit                ; load length data bit
            rl      c
            ret     c                       ; check end marker
            call    next_bit                ; load length flag
            jr      nc, length_value_loop
            push    de
            ld      d, b
    
    ; determine offset
            ld      e, (hl)                 ; load offset flag (1 bit) + offset value (7 bits)
            inc     hl
            sll     e                       ; opcode for undocumented instruction "SLS E"
            call    c, quad                 ; insert first bit into D
            rr      e                       ; insert fourth bit into E
    
    ; copy previous sequence
            ex      (sp), hl                ; store source, restore destination
            push    hl                      ; store destination
            sbc     hl, de                  ; HL = destination - offset - 1
            pop     de                      ; DE = destination
            ldir
            pop     hl                      ; restore source address (compressed data)
            jr      main_loop
    
    quad:   cp      a
            call    twice
    twice:  call    rld_next_bit
    rld_next_bit:
            rl      d                       ; insert previous bit into D
    next_bit:
            add     a, a                    ; check next bit
            ret     nz                      ; no more bits left?
            ld      a, (hl)                 ; load another group of 8 bits
            inc     hl
            rla
            ret
    

    The changes that you may apply to this alternative version are:
    -put the limit to 255 bytes.
    -Substract a bit in the end marker.
        *output_size = (optimal[input_index].bits+18+7)/8;
    by
        *output_size = (optimal[input_index].bits+17+7)/8;
    
    and
        /* sequence indicator */
        write_bit(1);
    
        /* end marker > MAX_LEN */
        for (i = 0; i < 16; i++) {
            write_bit(0);
        }
        write_bit(1);
    by
        /* sequence indicator */
        write_bit(1);
    
        /* end marker > MAX_LEN */
        for (i = 0; i < 16; i++) {
            write_bit(0);
        }
    
    -Replace these 2 functions:
    int elias_gamma_bits(int value) {
        int bits;
        --value;
        bits = 2;
        while (value > 1) {
            bits += 2;
            value-= 2;
            value>>= 1;
        }
        return bits;
    }
    void write_elias_gamma(int value) {
        int bits= 0, rvalue= 0;
        --value;
        while ( value>1 )
          ++bits,
          value-= 2,
          rvalue<<= 1,
          rvalue|= value&1,
          value>>= 1;
        rvalue<<= 1,
        rvalue|= value&1;
        write_bit(rvalue & 1);
        while ( bits-- )
          rvalue>>= 1,
          write_bit(0),
          write_bit(rvalue & 1);
        write_bit(1);
    }
    

    Also in your version you can save 1 byte by changing "rr d" by "cp a" or "ccf"
Sign In or Register to comment.