Z80 compressor
Is there one available?
Is it any good?
Peeps have been suggesting we should look into this, so here's a new thread!
I'd do the compressor on the PC.
I've done huffman before, pretty easy. But boring to code.
Mixing compression algorithms can be a win too: compress using various methods and compare the results.
Currently released version of SJOE: https://spectrumcomputing.co.uk/index.php?cat=96&id=42215
I've been using the aplib decompressor quite a bit lately, and it works great for full-screen graphics.
There are many mature compressors; pucrunch and exomizer in particular are good. pucrunch uses an innovative way of marking literals and matches, while exomizer does some magic with reducing offset lengths (basically statistical compression of the offsets). Both do some optimising of the stream, but exomizer looks more advanced in this respect.
By far the best is pmc by Paolo Ferraris; it is LZ backed by Huffman (well, I believe it is actually Shannon-Fano, but I could be wrong). It was written for the minigame compo and has an amazing compression ratio and a very compact decoder. Sadly its decompressor is very slow; it takes maybe 15 seconds to decompress 5 KB.
What is missing is some kind of compressor that would use LZ (with some tweaks like optimising the parse, or keeping an index to a few previously used offsets - 7zip does such things), back it with full Huffman coding, and still have a reasonably fast (at least 1 KB/sec) decompressor.
For screens, Laser Compact 5.2 does the best compression I've seen. You can save the compressed screen with or without the decompressor included, with options to include the attributes or not, and for different thirds of the screen. It's TR-DOS disk only though. Might be worth extracting the compressor code from the disk to make a TAP for all Speccies.
This is the development forum after all ;)
I was going to start with a really simple "factor out the common byte sequences and index into them" approach, but then I realised that the index reference might take up nearly as much space as the factored-out bytes. But I knew this in the back of my mind anyway, if I'm honest.
Compression of the room data would also be good - though I'd probably split the map into segments (each being a bunch of rooms), and unpack the segment you're in when you move from one to the other. Saves doing it too often and improves the compression ratio, as you're working on a bigger data set (unless I statistically analyse the entire room data and then compress map segments against that).
So it's a balance of speed'n'complexity against compression ratio.
Step 1:
Build histogram of data
Step 2:
Build tree and compress using frequency table from (1)
Step 3: Profit
I'm not saying huffman is the holy grail though ;) It's a good start.
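On the PC side, steps 1 and 2 really are only a handful of lines. A rough C sketch of the histogram and tree build (naive two-smallest search, which is plenty quick for 256 symbols; emitting the actual codes is left out):

#include <stdlib.h>

/* One node of the Huffman tree: leaves carry a byte value, internal
   nodes just join two subtrees.  512 nodes covers 256 symbols. */
typedef struct {
    unsigned long freq;
    int left, right;      /* child node indices, -1 for a leaf */
    int symbol;           /* byte value for leaves */
} Node;

static Node nodes[512];
static int node_count = 0;

/* Step 1: histogram of the input data (freq[] must be zeroed by the caller). */
static void build_histogram(const unsigned char *data, size_t len,
                            unsigned long freq[256])
{
    for (size_t i = 0; i < len; i++)
        freq[data[i]]++;
}

/* Step 2: build the tree by repeatedly merging the two least frequent
   live nodes (naive O(n^2) selection, fine for 256 symbols). */
static int build_tree(const unsigned long freq[256])
{
    int alive[512], alive_count = 0;

    for (int s = 0; s < 256; s++) {
        if (freq[s] == 0) continue;
        nodes[node_count] = (Node){ freq[s], -1, -1, s };
        alive[alive_count++] = node_count++;
    }
    while (alive_count > 1) {
        int a = 0, b = 1;                       /* indices into alive[] */
        if (nodes[alive[b]].freq < nodes[alive[a]].freq) { a = 1; b = 0; }
        for (int i = 2; i < alive_count; i++) {
            if (nodes[alive[i]].freq < nodes[alive[a]].freq) { b = a; a = i; }
            else if (nodes[alive[i]].freq < nodes[alive[b]].freq) b = i;
        }
        /* new parent node joining the two smallest */
        nodes[node_count] = (Node){ nodes[alive[a]].freq + nodes[alive[b]].freq,
                                    alive[a], alive[b], -1 };
        /* replace the two merged entries with the parent */
        int lo = a < b ? a : b, hi = a < b ? b : a;
        alive[lo] = node_count++;
        alive[hi] = alive[--alive_count];
    }
    return alive[0];   /* root index (assumes at least one symbol present) */
}

Step 3 is then just deriving the bit codes from the tree and streaming the data out against them.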
I suppose we could frig it to create a single expansion table (or whatever it's called) for all the sprites, and then compress each (or groups) against that. Obviously we want to selectively decompress data, and thus we could do without the overhead that normally goes at the front of each compressed block.
EDIT: And then go from there, a new table is gonna cost you memory.
EDIT2: Yes, I am Cptn Obvious ;)
Edit: I'm only talking about one table/dictionary, that's my point.
The more I think about it, the more I like the idea of:
1. feed the compressor each sprite in turn (or each room data in turn). ie a chunk.
2. The compressor analyses each "chunk" and puts them to one side.
3. When all chunks analysed, the compressor writes out the dictionary (say for LZW)
4. It then compresses each chunk in turn, maintaining an index to each in the output stream.
5. Write out the chunk_number->stream_offset index.
To decompress, you call the decompressor with the chunk number, from which it finds the start/end of the compressed chunk. It then decompresses against the common shared dictionary.
Thus you only need one dictionary.
Something like that.
Ok nasty from a reuse point of view, but great from a memory use point of view.
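To make the layout concrete, something like this - little-endian offsets and the field order are just me guessing at one sensible arrangement:

#include <stdint.h>
#include <stddef.h>

/* Hypothetical packed-blob layout for the scheme above (names are
   illustrative, not from any real tool):
     [0]         number of chunks, n
     [1 .. 2n]   n little-endian 16-bit offsets from the start of the
                 blob to the start of each compressed chunk
     [2n+1 ..]   shared dictionary, then the chunk streams
   The dictionary sits at a fixed, known place so every chunk decodes
   against the same one.  Blob assumed to be under 64K. */

typedef struct {
    const uint8_t *start;   /* first byte of the compressed chunk */
    size_t         length;  /* bytes up to the next chunk (or blob end) */
} ChunkRef;

ChunkRef find_chunk(const uint8_t *blob, size_t blob_len, unsigned chunk_no)
{
    unsigned n = blob[0];
    const uint8_t *table = blob + 1;

    /* 16-bit little-endian offset of this chunk and the next one */
    uint16_t off  = table[2 * chunk_no] | (table[2 * chunk_no + 1] << 8);
    uint16_t next = (chunk_no + 1 < n)
        ? (uint16_t)(table[2 * chunk_no + 2] | (table[2 * chunk_no + 3] << 8))
        : (uint16_t)blob_len;

    return (ChunkRef){ blob + off, (size_t)(next - off) };
}

The caller then hands chunk.start to the decompressor together with the address of the shared dictionary, so only that one dictionary is ever resident.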
Way back in 2003 (has it been that long already!?!) I wrote this huffman decoder masterpiece:
That's it sans decoder tree.
The main bit of huffman compression code:
with the statistical stuff written in C. Of course I don't know how it all works anymore, or if it even works (it was never really tested, but I tend to write bug-free code too ;-P ).
The point is it's not hard and it's not big in asm (not including the tree). I never even considered dictionary compression since I didn't think it would work too well if you wanted random access to individual bits of data on the go.
I was not the only one to implement such a thing here; I remember a couple of others talking about their Z80 implementations over the years.
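The asm listings aren't reproduced here, but the decode loop is just a bit-by-bit tree walk; in C it comes out roughly like this (the node layout is one common choice, not necessarily what I used back then):

#include <stdint.h>
#include <stddef.h>

/* Assumed node layout: each tree node is a pair of 16-bit entries,
   one taken for bit 0 and one for bit 1.  A value < 256 is a decoded
   byte (leaf); anything else is 256 plus the index of the next node. */
size_t huff_decode(const uint16_t tree[][2],    /* decoder tree         */
                   const uint8_t *src,          /* compressed bitstream */
                   uint8_t *dst, size_t out_len)
{
    size_t produced = 0;
    unsigned bitbuf = 0, bits = 0;
    uint16_t node = 0;                           /* start at the root   */

    while (produced < out_len) {
        if (bits == 0) { bitbuf = *src++; bits = 8; }
        unsigned bit = (bitbuf >> 7) & 1;        /* MSB first           */
        bitbuf <<= 1;
        bits--;

        uint16_t next = tree[node][bit];
        if (next < 256) {                        /* hit a leaf: emit it */
            dst[produced++] = (uint8_t)next;
            node = 0;
        } else {
            node = next - 256;                   /* descend             */
        }
    }
    return produced;
}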
Well, we should start an archive. This kind of thing is really useful and does not come up that often for people to remember what's good and what's not, or have working examples to hand.... Unlike sprite code and stuff which is _always_ discussed ;-)
We always have gamedev.net as backup, I was thinking of asking them if they would do a retro forum, but I have only just logged on today after several years ;)
EDIT: http://www.gamedev.net/community/forums/topic.asp?topic_id=563232
Decompressor.
A1. I think it would help to settle on one algorithm that gives good compression and a fast, compact Z80 decompressor.
A2. It should be able to decompress distinct chunks from a single compressed data block, using a single common encoding tree / dictionary.
A3. It should also be easy to adapt the decompressor to work linearly and incrementally, so we can run it until we get X bytes (or just over) from a given chunk, then save its state and come back to it later for more. (I don't mean just decoding lots of little chunks here - I mean the ability to stop mid-stream).
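To put A3 another way: everything the decompressor needs to carry between calls should fit in one small block of state, something like this (field names purely illustrative, and assuming a Huffman-style tree walk):

#include <stdint.h>
#include <stddef.h>

/* Hypothetical resumable-decode state: save this, do something else,
   then call again later and it carries on exactly where it stopped. */
typedef struct {
    const uint8_t *src;      /* next compressed byte to read           */
    uint8_t        bitbuf;   /* partially consumed input byte          */
    uint8_t        bits;     /* bits still valid in bitbuf             */
    uint16_t       node;     /* current position in the decoding tree  */
    const uint8_t *dict;     /* shared tree/dictionary for all chunks  */
} UnpackState;

/* Decode up to 'want' bytes into dst and return how many were produced;
   the state is updated in place for the next call. */
size_t unpack_some(UnpackState *st, uint8_t *dst, size_t want);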
GUI/Script
B1. The compressor should come with a PC GUI, to prepare a config script to gather the files for compressing.
B2. The actual compression should be done by passing this script file to a command-line executable. The GUI can then invoke this command-line executable. That makes it all a bit more versatile and portable.
B3. In the GUI, you select one or more files that you want compressing together. The order of the files is remembered, and can be manipulated up/down. Each file is intended to be compressed as a distinct chunk of data, but with a common decompression tree. You can also specify further sub-divisions of each file into smaller distinct chunks.
B4. Each file is one of three (initially) basic types; ASCII text, binary data and indexed binary data. Plain text is just ASCII to be compressed (although \r\n will be replaced by CHR$(13) in the compressed data). Binary data is just raw bytes.
B5. 'Indexed binary' files have the following form:
1 byte = number of chunks n
2 x n bytes = offset from start of file to start of each chunk. Length of chunk is inferred from the offset to the next chunk, or the end of the file.
Remaining bytes = chunked binary data.
This is for, for example, sprites of varying sizes, or individual maps of different areas. Each sprite or map is one 'chunk'.
B6. For each text file, you can also specify an optional separator character. The text will be split up into chunks based on this separator character. The actual separator character (and any \r\n immediately before it) does not appear in the resultant compressed data.
B7. Similarly for binary files, you can specify an optional fixed 'chunk size', e.g. '32' for 16x16 pixel-only sprites. The compressor will then treat the entire binary file as a series of chunks of this size. (There is no optional setting for the 'indexed binary' file type).
B8. Or if you don't want to use chunks, you can set a global option that ignores all these extra settings, concatenates all your files into one stream and compresses that.
B9. You then specify an output file and an optional report text file, and set two options: whether you want the number of chunks, and the chunk offset table, to appear at the start of the file (as per the 'Indexed Binary' file format, above).
Compressor
C1. When run, the compressor should pool all the files together and create a single optimised tree for all the files and chunks.
C2. It should then compress each file, or each defined chunk of each file, into separate and distinct streams of bytes. These are all then concatenated, in order, into a single binary file. (I know this will waste bits as padding at the end of each chunk, but byte-aligning will make accessing them a lot faster. If you don't want to use chunks like this, you could just specify a single stream).
C3. The number of chunks (optional), the offset to the start of each chunk (optional), and the decoding tree are added to the start of the data block, then all is saved as a single binary.
C4. If required, a report of progress (including the file/chunk before/after sizes and indexing data) is produced as a text output file.
C5. The decompressor can then be given the address of the decoding tree/dictionary and the address of a compressed chunk, and it will decode that chunk.
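As a sketch of what C2/C3 boil down to on the PC side, assuming each chunk has already been squashed into its own buffer (little-endian offsets and the exact field order are my assumptions, the spec above doesn't pin them down):

#include <stdio.h>
#include <stdint.h>

typedef struct {
    const uint8_t *bytes;   /* one already-compressed chunk */
    uint16_t       size;
} Packed;

static void put_u16(FILE *f, uint16_t v)
{
    fputc(v & 0xFF, f);
    fputc(v >> 8, f);
}

/* Emit: optional chunk count + offset table, then the serialised
   decoding tree, then the chunk streams back to back. */
void write_blob(FILE *f,
                const uint8_t *tree, uint16_t tree_size,
                const Packed *chunks, unsigned n,
                int with_header)
{
    /* offsets measured from the start of the file, as in B5 */
    uint16_t pos = (uint16_t)((with_header ? 1 + 2 * n : 0) + tree_size);

    if (with_header) {
        fputc((int)n, f);                       /* number of chunks   */
        for (unsigned i = 0; i < n; i++) {      /* chunk offset table */
            put_u16(f, pos);
            pos += chunks[i].size;
        }
    }
    fwrite(tree, 1, tree_size, f);              /* decoding tree      */
    for (unsigned i = 0; i < n; i++)            /* chunk streams      */
        fwrite(chunks[i].bytes, 1, chunks[i].size, f);
}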
More than one tree
On this, I thought if you wanted a separate tree, you could build a separate file. Or the GUI could let you add a numeric field to each file, to tell it to use tree 1, 2, 3 etc. Or the GUI could have a tree control, and you group like files under a top-level heading which denotes the tree.
http://starbase.globalpc.net/~wyndex/mini03/index.main.html
Download ZDBlast and check the sources. In 2004 it was used to pack 4K RACE too.
For some details check the 2003 and 2004 minigame compo BBS; Paolo describes the algorithm he used in some thread. I don't remember where it was, you'll have to dig for it.
If you want to do incremental extractions, would you also need a limit on the maximum size of a word in the data dictionary, so you know how big a rolling buffer you need to leave to work with?
http://retrospec.sgn.net/users/tomcat/yu/ZX/Utils/Html/FileCompressor.php
does a good job at compression, decompression is VERY fast and it does auto-reloc of the decompressor, etc.
As for MiniGames, well that's obvious.
For really large games I would say compress all levels/screens and only decompress the level/screen you are at.
If it's for loading only and you decompress all the data after loading, I fail to see the need for compressing. OK, it saves some loading time, but on an emulator you don't really notice this. As for size, OK, so it saves some space on a disk/tape or whatever.
But back to compressors:
I myself never use a predefined compressor, but make my own (de)compressor per game by selecting the right routine needed for my program.
Some examples:
123Maze (1K) holds 26 puzzles of 10x10. Compression is done by storing 4 fields in a byte and unpacking them (4 possible values per field).
Lingo (1K) holds 300 five-letter words by using Huffman.
Worldbar (2K, and 4K with AI) holds a screen that is built from all kinds of repeating strings (e.g. INK 7, PAPER 8, " " or INK 6, PAPER 2, " ") which are numbered and printed by a routine that prints the nth string.
SOKOB-ONE (1K) holds 26 levels of Sokoban of different sizes.
In 1 byte up to 4 positions with 8 possible values can be stored (8 values = 3 bits, with the 4th bit used as a value repeater).
In Stratego (now working on) I have compressed and uncompressed graphics. The compressed graphics decompress AFTER the uncompressed graphics so I only need to set UDG or CHARS once.
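For instance, the 123Maze packing is just four 2-bit cell values per byte; in C (for clarity, the real routine is Z80) it would look something like this, with a 10x10 puzzle fitting in 25 bytes:

#include <stdint.h>
#include <stddef.h>

/* Pack cell values 0..3, four to a byte (count assumed to be a
   multiple of 4, which 100 cells is). */
void pack_cells(const uint8_t *cells, size_t count, uint8_t *out)
{
    for (size_t i = 0; i < count; i += 4) {
        out[i / 4] = (uint8_t)( (cells[i]     & 3)
                              | ((cells[i + 1] & 3) << 2)
                              | ((cells[i + 2] & 3) << 4)
                              | ((cells[i + 3] & 3) << 6) );
    }
}

/* Unpack them again, one cell per output byte. */
void unpack_cells(const uint8_t *packed, size_t count, uint8_t *cells)
{
    for (size_t i = 0; i < count; i++)
        cells[i] = (packed[i / 4] >> ((i & 3) * 2)) & 3;
}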
So I would like to see WHAT you want compressed, and I'll think over a special compression routine for it.
Graphics mainly, and room data (which will already have been optimised to use as little space as possible).
The general theory (well as joefish and I have declared) is to compress everything as one stream to improve the compression ratio (theoretically), but to chunk the actual compression stream to allow individual sprites and screens (or groupings of) to be cherry picked and decompressed when needed.
No one is planning to compress in-game. Only decompress.
Using the same compression for all the chunks allows you to have one dictionary, or similar, depending on the compression algorithm used.
Edit: You have to think of the data you're compressing as being arbitrarily random, and assume that it has been optimised, which it should have if you're worth your salt.
(By random I mean not data that you can spot patterns in by eye. Compressors will be able to of course, as that is how they work ;-) Real random data will compress terribly!)
The guts could be a library that a GUI can use as well as the command line tool.
Not everyone has a computer that runs windows, so a windows GUI app is not so useful ;-)
Plus the compression could then be built into Makefiles as part of the development environment.
In my currently-shelved-but-at-the-fully-working-prototype-stage Speccy platformer game, I compress each level's graphics in one chunk, and then each individual room in its own chunk. I could go crazy and figure out a custom compression scheme that would no doubt save me bytes but... well... what I've got works well enough that I no longer have memory issues, and the decompression speed is not a worry either. For me it's much more important to get things working before I worry too much about the size or the speed of the tech.
So remember:
Just replace "slow" with "taking too much memory".
I think a command line program is best option for compressor since you can whack a GUI on top if you want to.
No PC that runs windows? I hope you have Wine or whatever it is called.
Otherwise how can you play Civ IV? :-o