20-column multicolour: a half-baked idea

edited August 2013 in Development
I was writing an email reply to one Mr Jowett about multicolour routines, and in that way that explaining something to someone else makes you think "oh, hang on, that's not right" and re-evaluate what you thought you knew about the subject, I realised that it might be possible to beat the 18-column limit for 8x1-attribute multicolour.

Existing routines write each line of attributes just before the ULA reaches them. While the ULA is busy drawing the last scanline of the character row above, we're busy writing the first line of attribute data for the next row down. While the ULA is rendering that first pixel line, we're just behind it writing data for the second line, and so on.

The key insight is this: while all of this is in full swing, we only need to change each character row SEVEN times, not eight. We will need to write to it an eighth time at some point, so that we're not still showing the last line's data at the top when the next frame comes around - but we can do that in our own time (during the lower and upper border, for example).

(I realise that the lower / upper border is the only opportunity you have to do actual game logic when working with engines like ZXodus and Bifrost, so this is clearly not very practical for real-world projects. But what the hell, we have records to break here…)

So, since there are 7 writes in the space of 8 scanlines, we can over-run the usual 224/228 tstates slightly, and begin each pass slightly later for each successive line. If we start the first line at the earliest possible opportunity (which means writing the leftmost byte as soon as the ULA has passed it, and generally working left-to-right - which unfortunately means fighting against the direction of PUSHes), and finish just before the ULA catches up with us at the end of the last line, can we possibly squeeze in another column or two?

I made a naive attempt at a 20-column routine last night - churning through the PUSHes as fast as possible, without checking where the contention delays fall - and failed on the fifth line, where the ULA caught up and redrew the magenta line (see columns 12-13) before I'd had chance to paint it green:

20column_fail.png

This probably isn't too surprising - we only managed to pull off 18 columns by carefully arranging our PUSHes to minimise contention, so we probably need to do the same here. Unfortunately, since each of our seven passes begins at a different point in the contention pattern, we'll need to solve that jigsaw puzzle seven times over. I think, then, that the next step is for me to build an interactive assembler-type tool (in Javascript, no doubt…) where I can enter my instruction sequence and immediately get a graphical representation of where the contention delays fall, so that I don't have to work that out by hand all the time. Unless someone else wants to get in on the action, that is!
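As a rough starting point for a tool like that, here is a minimal sketch of the per-T-state delay lookup it would need, written in Python rather than Javascript purely for brevity. It assumes the commonly documented 48K figures, none of which come from this post: 224 T-states per scanline, contention first biting at T-state 14335 after the interrupt, 128 contended T-states per line over 192 lines, and a repeating delay pattern of 6,5,4,3,2,1,0,0. The names `contention_delay` and `delay_map` are made up for illustration, and 128K machines would need different constants.

```python
# Assumed 48K constants (not taken from this thread); 128K machines differ.
FIRST_CONTENDED_T = 14335    # first contended T-state after the interrupt
T_PER_LINE = 224             # T-states per scanline
CONTENDED_T_PER_LINE = 128   # T-states per line during which the ULA fetches
SCREEN_LINES = 192
DELAY_PATTERN = (6, 5, 4, 3, 2, 1, 0, 0)


def contention_delay(t):
    """Extra T-states a contended memory access starting at T-state t must wait."""
    offset = t - FIRST_CONTENDED_T
    if offset < 0 or offset >= SCREEN_LINES * T_PER_LINE:
        return 0  # upper or lower border
    if offset % T_PER_LINE >= CONTENDED_T_PER_LINE:
        return 0  # side border and horizontal retrace
    return DELAY_PATTERN[offset % 8]


def delay_map(line):
    """One character per T-state of a scanline: the delay digit, or '.' for no delay."""
    start = FIRST_CONTENDED_T + line * T_PER_LINE
    return "".join(
        str(contention_delay(t)) if contention_delay(t) else "."
        for t in range(start, start + T_PER_LINE)
    )


if __name__ == "__main__":
    # Text-only stand-in for the graphical view: one row of delays for line 8.
    print(delay_map(8))
```

A real tool would also need to know, for each instruction, which of its T-states touch contended memory; this sketch only answers "how long would an access starting here be held up?".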

Comments

  • edited July 2013
    Are you just using PUSHes, or are you doing the odd LD (nn),HL [22 NN NN] too? In the right place, I've found those can shave off a few ticks rather than moving the stack pointer.

    I'm also trying to get my head around the logic of what you're suggesting, and I can't quite work out if it makes sense. I suppose the fact that you've got it to work for a few rows suggests the logic is sound, even if the timing isn't certain.

    Personally I'm happy to drop to 8x2 multicolour - I can do that 24-wide with the luxury of a loop and a start pointer. And as you say, for a game I'd be unwilling to surrender too much top border time.

    The other thought I had was maybe not to use all the registers on each line, and pre-load some of the alternate registers during the spare line for use later on. On each subsequent line you'd have one more register to use as its pre-loaded data gets dumped, so it would need a custom sequence for each of the seven lines rather than one repeated function. But it looks like you were considering that anyway, to get the timing right.

    One thing I'd do is mix up the colour order in your test data though, as it can be hard to spot if red/magenta and green/cyan edges are stable. Try 0,4,1,5,2,6,3,7.
    Joefish
    - IONIAN-GAMES.com -
  • edited July 2013
    61545.png
    multicolour #1 (border effects work on the Unreal emulator, Pentagon 128 timing)
    http://pouet.net/prod.php?which=61545
    marauder2.png
    multicolour #2 (Pentagon 128 only!)
    http://zxaaa.untergrund.net/view_demo.php?id=7599
  • edited July 2013
    Well, there you go then.
    Joefish
    - IONIAN-GAMES.com -
  • edited July 2013
    Impressive. Glad I didn't bet you another steak.
    gasman wrote: »
    (I realise that the lower / upper border is the only opportunity you have to do actual game logic when working with engines like ZXodus and Bifrost, so this is clearly not very practical for real-world projects.

    ZXodus doesn't leave you with any top border time for the game logic because it's sitting waiting to draw the tiles, which it does a couple per frame after it's updated the attributes.
    we only managed to pull off 18 columns by carefully arranging our PUSHes to minimise contention

    That was more luck than judgement on my part as far as ZXodus is concerned. Now that Bifrost* is available I wouldn't suggest using ZXodus for any projects. The ZXodus II Engine is something else altogether but I'm keeping that for my own project.
  • edited July 2013
    I'm using a very similar trick in my early experimental code for multicolor on 128k machines. The benefit of using a 128k machine is, of course, the added fillrate that you get by pre-filling the first two lines of attributes out of 8 in each line of characters. With such a cheap 25% increase in fillrate, you can beat the 18-character limit quite easily even on contended computers.

    The filling is best set up behind the beam, so that you are trying to update things almost immediately after they've been shown. You start behind the beam, but the beam eventually catches up.

    So far I have only experimented with code for uncontended machines, where all this is trivially simple to achieve. But with a 25% increase in fillrate, I'd be surprised if a 22-character-wide 8x1 multicolor was not possible even on a contended 128k machine.

    But I never considered applying this mode of thinking to 48K. It is definitely an excellent idea.
  • edited July 2013
    gasman wrote: »
    So, since there are 7 writes in the space of 8 scanlines, we can over-run the usual 224/228 tstates slightly, and begin each pass slightly later for each successive line.

    The raster takes 4 states to draw each character. Since we need to start drawing right after it has passed the first 2 characters, and end the last line before it reaches the last two, that leaves us 16*4 = 64 states to overrun in total. Divided by 7, it means we have 9.14 extra states for each line. In practical terms, this means just 8 extra states, as contention would eat extra time going over 8... So, the question is... would just 8 extra states per line suffice to write another pair of bytes? I guess it depends on how much free time was left with the previous 18-character routines (which I'm not familiar with).
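The estimate above can be written out as a few lines of Python. This is only a sketch of the arithmetic in the post: rounding down to a multiple of 8 is one reading of "contention would eat extra time going over 8", and the function name `overrun_budget` is hypothetical.

```python
T_PER_CHAR = 4          # T-states the raster spends on one character cell
COLUMNS = 20            # target multicolour width
WRITES_PER_ROW = 7      # attribute passes per character row
CONTENTION_PERIOD = 8   # the contention pattern repeats every 8 T-states


def overrun_budget(columns=COLUMNS):
    """Return (total slack, slack per pass, slack per pass rounded down to 8T)."""
    # The first pass may start once the raster has passed 2 cells, and the
    # last pass must finish before the raster reaches the final 2 cells.
    total = (columns - 4) * T_PER_CHAR
    per_pass = total / WRITES_PER_ROW
    # Assumption: slack that is not a whole contention period is lost.
    usable = int(per_pass // CONTENTION_PERIOD) * CONTENTION_PERIOD
    return total, per_pass, usable


if __name__ == "__main__":
    total, per_pass, usable = overrun_budget()
    print(f"total {total}T, per pass {per_pass:.2f}T, usable {usable}T")
```

For 20 columns this gives 64T in total, about 9.14T per pass, and 8T once rounded to the contention period.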
  • edited July 2013
    gasman wrote: »
    I realised that it might be possible to beat the 18-column limit for 8x1-attribute multicolour.

    Good luck! :)
    gasman wrote: »
    I realise that the lower / upper border is the only opportunity you have to do actual game logic when working with engines like ZXodus and Bifrost, so this is clearly not very practical for real-world projects.

    Not really. Implementing game logic during upper border is a real nightmare, because the game developer would need to ensure this code always takes an exact number of cycles, regardless of what happens. Even if we had lots of multicolor games in development right now, the vast majority would not do it.

    In BIFROST*, the upper border is normally used just to refresh/animate tiles. Even so, there's still some spare time left. Since updating all attributes in the multicolor area could be done in less than 5K T-states, I'm sure it would be possible to do both.

    In ZXodus II, the upper border time is more heavily used for more animations, but it could be changed to play animations slightly slower, and use the remaining time to update pixel lines.

    Therefore your idea would be perfectly practical for most real-world projects!
    gasman wrote: »
    (which means writing the leftmost byte as soon as the ULA has passed it, and generally working left-to-right - which unfortunately means fighting against the direction of PUSHes)

    There's an approach I had in mind since last year for a 128K multicolor renderer. Perhaps it could be adapted to 48K as follows:
    • In 1st raster scan, change the first few bytes directly using LD (nn),rr. This will give enough time for the raster scan to reach roughly the middle multicolor column, so you can now set SP to the middle multicolor column and start PUSHing from there. In the meantime, the raster scan will reach the last multicolor column, so you can set SP there and start PUSHing to complete the pixel line.

    • In 2nd raster scan, set SP to the middle multicolor column, start PUSHing, then set SP to the last multicolor column, and start PUSHing.

    • In 3rd raster scan, set SP near the last multicolor column, start PUSHing, then complete the last few multicolor columns directly using LD (nn),rr.

    • In 4th and 5th raster scans, set SP to the last multicolor column and simply start PUSHing as usual.

    • In 6th raster scan, set SP near the last multicolor column, start PUSHing, then complete the last few multicolor columns directly using LD (nn),rr.

    • In 7th raster scan, set SP near the middle multicolor column and start PUSHing from there, then set SP near the last multicolor column and start PUSHing from there, finally change the last few bytes directly using LD (nn),rr.

    Of course the description above is just a rough estimate. It will take quite some effort to adjust everything properly. But I bet this strategy works.
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    In ZXodus II, the upper border time is more heavily used for more animations, but it could be changed to play animations slightly slower, and use the remaining time to update pixel lines.

    18x18 is about the sweet spot for the ZXodus II Engine though. It updates nine tiles per frame, plays an AY tune, and handles up to 66 animated tiles on screen at once without glitching on a +2A while still leaving enough cycles for game logic. The animations are quite CPU intensive so it uses an interrupt manager to do different things on different frames to keep it all flowing nicely. If I'd written it back in the 80s I'd have made a fortune. :(
  • edited July 2013
    Metalbrain wrote: »
    The raster takes 4 states to draw each character. Since we need to start drawing right after it has passed the first 2 characters, and end the last line before it reaches the last two, that leaves us 16*4 = 64 states to overrun in total. Divided by 7, it means we have 9.14 extra states for each line.

    Actually it's more.

    You didn't take into account the free time before the raster scan passes the first 2 characters. This extra time can be used to load initial values into register pairs AF,BC,DE,HL,AF',BC',DE',HL',IX,IY and to set SP.

    Normally you would need to update each multicolor row 8 times in 8*224T, including the time to initialize all registers 8 times.

    Now you will need to update each multicolor row 7 times in 64+7*224T, including the time to initialize all registers 6 times only. The remaining 224-64=160T per row would still be available to execute another initialization of all registers before the next multicolor row starts.
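The two budgets compared above can be checked with a short Python sketch. It uses only the figures quoted in the post (224T per scanline, 64T of total overrun); the function names are made up for illustration.

```python
T_PER_LINE = 224      # 48K scanline length in T-states
TOTAL_OVERRUN = 64    # total slack from the earlier estimate (16 cells * 4T)


def classic_budget(passes=8):
    """T-states spent per character row when every scanline gets its own pass."""
    return passes * T_PER_LINE


def staggered_budget(passes=7):
    """T-states spent per character row when 7 passes share the overrun slack."""
    return TOTAL_OVERRUN + passes * T_PER_LINE


def spare_per_row():
    """T-states left over per character row for extra register setup."""
    return classic_budget() - staggered_budget()


if __name__ == "__main__":
    print(classic_budget(), staggered_budget(), spare_per_row())
```

With these numbers the classic scheme uses 1792T per character row, the staggered one 1632T, and the difference is the 160T mentioned above.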
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    Of course if McLeod_Ideafix ever finishes the plug-in ULAplus you'll be able to have 32 columns of 8x1 attributes in 64 colours with no CPU overhead. But in the meantime it's fun to push the envelope.
  • edited July 2013
    aowen wrote: »
    Now that Bifrost* is available I wouldn't suggest using ZXodus for any projects.

    Thanks!

    I would be interested in implementing something like BIFROST*2 using a larger 20x20 multicolor area or such, assuming this idea really works. Although putting together a new engine like this would take me a lot of time, so it certainly won't happen this year...
    aowen wrote: »
    The ZXodus II Engine is something else all together but I'm keeping that for my own project.

    Based on everything I already saw, I can testify that project is awesome! :)
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    Cheers, that gives me a lot to think about! :-) I hadn't considered the LD (nn),HL instruction before... on one hand, I'm happy that gives another avenue to explore. On the other hand, it means that the search space for a solution to this problem is now twice as big...
    joefish wrote: »
    The other thought I had was maybe not to use all the registers on each line, and pre-load some of the alternate registers during the spare line for use later on.
    Yup, pretty sure the eventual solution is going to rely on lots of tactical register usage like that, particularly the use of alternate registers, since: A) loading up a bunch of registers is a nice way to fill time without incurring contention delays while the ULA is passing over the screen area; and B) if you do need to write to the screen while contention is going on, a long chain of PUSHes results in less wasted cycles than LD / PUSH / LD / PUSH...
    introspec wrote: »
    The benefit of using a 128k machine is, of course, the added fillrate that you get by pre-filling the first two lines of attributes out of 8 in each line of characters.
    Very clever! I was actually working with 128K timings in my initial test, as that's my 'default' platform of choice, but didn't think of using the shadow screen as well.
  • edited July 2013
    If this ends up working reliably, I will personally be very dismayed if it is called anything other than Jowett Mode.
  • edited July 2013
    gasman wrote: »
    I think, then, that the next step is for me to build an interactive assembler-type tool (in Javascript, no doubt…) where I can enter my instruction sequence and immediately get a graphical representation of where the contention delays fall, so that I don't have to work that out by hand all the time. Unless someone else wants to get in on the action, that is!

    actually, I've already made something like this called asmp. it only supports a couple of opcodes, but the asmp tool has a built-in assembler/simulator with full contention emulation. I'm sure you'll do it better than me, but it's surprising that asmp also writes 7 of 8 raster lines, and uses the 8th one for syncing with contention timing and waiting for about 200ts :D

    I couldn't find anything useful to do in this time, as we cannot lose the sync with contention :)

    by the way, I don't see any other way than structuring registers cleverly and reusing them without reloading, thus sparing some time.

    spectrum only got 8 colours, and enough registers to arrange colours accordingly. I think more than 18 columns is a possibility with some help of pre-computation of registers and 8th raster.
  • edited July 2013
    gasman wrote: »
    On the other hand, it means that the search space for a solution to this problem is now twice as big...

    There's more. Keep in mind that contention will affect your routine differently depending on the location of the multicolor area on screen. I prefer starting at column 1 as introduced by ZXodus, but you may need to choose another position to make your routine work.

    There's also the choice of using IX or IY. Notice that PUSH IX is slower than PUSH DE, but LD (nn),IX takes the same time as LD (nn),DE. Actually I suspect it will be better to start your routine with LD (nn),IX to update the first 2 bytes on screen as soon as the raster scan allows it.
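The trade-offs between these instructions are easier to compare side by side. The sketch below is in Python and uses the standard uncontended Z80 timings; it ignores contention entirely, so it only shows the baseline costs, and the helper names are hypothetical.

```python
# Uncontended T-state counts from the standard Z80 instruction timings.
UNCONTENDED_T = {
    "PUSH DE": 11,
    "PUSH IX": 15,
    "LD (nn),HL": 16,   # the short 22 NN NN encoding
    "LD (nn),DE": 20,   # ED-prefixed
    "LD (nn),IX": 20,   # DD-prefixed
    "LD SP,nn": 10,
}


def penalty(instr, baseline):
    """Extra T-states instr costs compared with baseline (negative means cheaper)."""
    return UNCONTENDED_T[instr] - UNCONTENDED_T[baseline]


def isolated_write_via_stack():
    """Cost of repositioning SP and then pushing a single register pair."""
    return UNCONTENDED_T["LD SP,nn"] + UNCONTENDED_T["PUSH DE"]


if __name__ == "__main__":
    print("PUSH IX vs PUSH DE:      ", penalty("PUSH IX", "PUSH DE"))
    print("LD (nn),IX vs LD (nn),DE:", penalty("LD (nn),IX", "LD (nn),DE"))
    print("LD SP,nn + PUSH DE:      ", isolated_write_via_stack())
```

PUSH IX costs 4T more than PUSH DE, while LD (nn),IX and LD (nn),DE cost the same. For a single isolated pair of bytes, moving SP and pushing costs 21T, against 16T for LD (nn),HL or 20T for the other direct stores.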

    The good news is, you can assume your routine starts perfectly synchronized with the raster scan. Although an interrupt during HALT may have a variable delay from 0 to 3 T-states (or a lot more if the interrupt occurs when executing other instructions), the "anti-flickering mechanism" from BIFROST* can completely eliminate this difference. Anyway that's not something you need to worry about right now.
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    I started investigating this idea and noticed something interesting.

    The "classic" multicolor algorithm takes 224 T for each raster scan pass in a Spectrum 48K. For instance, the first pass on row 1 works as follows:
    ; 16125 T after interrupt
    LD SP,nn        ; 10 T
    LD HL,nn        ; 10 T
    LD DE,nn        ; 10 T
    LD BC,nn        ; 10 T
    EXX             ; 4 T
    LD HL,nn        ; 10 T
    LD DE,nn        ; 10 T
    LD BC,nn        ; 10 T
    PUSH BC         ; 17 T (columns 17 and 18)
    LD BC,nn        ; 10 T
    PUSH BC         ; 22 T (columns 15 and 16)
    LD BC,nn        ; 10 T
    PUSH BC         ; 11 T (columns 13 and 14)
    LD BC,nn        ; 10 T
    PUSH BC         ; 11 T (columns 11 and 12)
    PUSH DE         ; 11 T (columns 9 and 10)
    PUSH HL         ; 11 T (columns 7 and 8)
    EXX             ; 4 T
    PUSH BC         ; 11 T (columns 5 and 6)
    PUSH DE         ; 11 T (columns 3 and 4)
    PUSH HL         ; 11 T (columns 1 and 2)
    ; 16349 T after interrupt
    

    However we could save 8 T simply grouping the first 2 PUSHes:
    ; either 16127 or 16135 T after interrupt
    LD SP,nn        ; 10 T
    LD HL,nn        ; 10 T
    LD DE,nn        ; 10 T
    LD BC,nn        ; 10 T
    EXX             ; 4 T
    LD HL,nn        ; 10 T
    LD DE,nn        ; 10 T
    LD BC,nn        ; 10 T
    PUSH BC         ; 15 T (columns 17 and 18)
    PUSH DE         ; 16 T (columns 15 and 16)
    LD DE,nn        ; 10 T
    LD BC,nn        ; 10 T
    PUSH BC         ; 11 T (columns 13 and 14)
    LD BC,nn        ; 10 T
    PUSH BC         ; 11 T (columns 11 and 12)
    PUSH DE         ; 11 T (columns 9 and 10)
    PUSH HL         ; 11 T (columns 7 and 8)
    EXX             ; 4 T
    PUSH BC         ; 11 T (columns 5 and 6)
    PUSH DE         ; 11 T (columns 3 and 4)
    PUSH HL         ; 11 T (columns 1 and 2)
    ; either 16343 or 16351 T after interrupt
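The per-instruction figures in both listings can be cross-checked with a small Python simulation. This is a sketch under several assumptions that are not stated in the thread: the commonly documented 48K contention model (first contended T-state 14335, 128 contended T-states out of every 224, delay pattern 6,5,4,3,2,1,0,0), code and the I register located in uncontended memory, SP pointing into attribute memory, and PUSH modelled as a 4T fetch, 1T internal cycle, and two 3T writes. All names are made up for illustration.

```python
FIRST_CONTENDED_T = 14335
T_PER_LINE = 224
CONTENDED_T_PER_LINE = 128
DELAY_PATTERN = (6, 5, 4, 3, 2, 1, 0, 0)
UNCONTENDED = {"LD SP,nn": 10, "LD rr,nn": 10, "EXX": 4}


def contention_delay(t):
    """Delay for a contended access starting at T-state t (display area only)."""
    offset = t - FIRST_CONTENDED_T
    if offset < 0 or offset % T_PER_LINE >= CONTENDED_T_PER_LINE:
        return 0
    return DELAY_PATTERN[offset % 8]


def push(t):
    """PUSH rr into contended memory: return the T-state at which it finishes."""
    t += 5                      # opcode fetch (4T) + internal cycle (1T)
    for _ in range(2):          # two byte writes, each possibly held up
        t += contention_delay(t) + 3
    return t


def run(start, program):
    """Return the T-state at which the instruction sequence finishes."""
    t = start
    for instr in program:
        t = push(t) if instr == "PUSH" else t + UNCONTENDED[instr]
    return t


SETUP = ["LD SP,nn"] + ["LD rr,nn"] * 3 + ["EXX"] + ["LD rr,nn"] * 3
TAIL = ["LD rr,nn"] + ["PUSH"] * 3 + ["EXX"] + ["PUSH"] * 3
CLASSIC = SETUP + ["PUSH", "LD rr,nn", "PUSH", "LD rr,nn", "PUSH"] + TAIL
GROUPED = SETUP + ["PUSH", "PUSH", "LD rr,nn", "LD rr,nn", "PUSH"] + TAIL

if __name__ == "__main__":
    print(run(16125, CLASSIC) - 16125)   # length of the classic pass
    print(run(16127, GROUPED) - 16127)   # length of the grouped pass
```

Under those assumptions the model gives 224T for the classic sequence and 216T for the grouped one from either start time, which agrees with the totals in the listings.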
    

    Of course we can't make this change on current 18-column multicolor implementations, otherwise we would be updating attributes too fast. But it means there's potential for saving more time for the 20-column idea...
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    OK, here's my first attempt to implement the 20 columns multicolor routine.

    It's not working yet, but it's VERY close. If I didn't make any mistakes, the timing will only fail in the last instruction, that was supposed to update the last pair of columns in the last raster scan!

    All timings listed below are based on a Spectrum 48K with normal (non-late) timing. This routine assumes that attributes for the first raster scan were previously set during upper border, so it can start working at the second raster scan:
    ; --- PREPARE "PUSH AF/AF'" FOR LATER
    
                  ; 15984T
    LD SP,nn                    ; reference AF/AF' values
                  ; 15994T
    POP AF
                  ; 16004T
    EX AF,AF'
                  ; 16008T
    POP AF
                  ; 16018T
    
    ; --- SET ATTRIBUTES FOR 2ND RASTER SCAN ---
    
    LD SP,nn                    ; reference columns 9 and 10
                  ; 16028T
    LD HL,nn
                  ; 16038T
    LD DE,nn
                  ; 16048T
    LD BC,nn
                  ; 16058T
    EXX
                  ; 16062T
    LD HL,nn
                  ; 16072T
    LD DE,nn
                  ; 16082T
    LD BC,nn
                  ; 16092T
    LD IX,nn
                  ; 16106T
    LD IY,nn
                  ; 16120T
    LD (nn),IX                  ; columns 1 and 2
                  ; 16144T
    LD (nn),IY                  ; columns 3 and 4
                  ; 16168T
    PUSH BC                     ; columns 9 and 10
                  ; 16184T
    PUSH DE                     ; columns 7 and 8
                  ; 16200T
    PUSH HL                     ; columns 5 and 6
                  ; 16216T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16226T
    LD HL,nn
                  ; 16236T
    LD DE,nn
                  ; 16246T
    LD BC,nn
                  ; 16256T
    PUSH BC                     ; columns 19 and 20
                  ; 16267T
    PUSH DE                     ; columns 17 and 18
                  ; 16278T
    EXX
                  ; 16282T
    PUSH BC                     ; columns 15 and 16
                  ; 16293T
    PUSH DE                     ; columns 13 and 14
                  ; 16304T
    PUSH HL                     ; columns 11 and 12
                  ; 16315T
    
    ; --- SET ATTRIBUTES FOR 3RD RASTER SCAN ---
    
    LD SP,nn                    ; reference columns 5 and 6
                  ; 16325T
    LD HL,nn
                  ; 16335T
    LD DE,nn
                  ; 16345T
    LD BC,nn
                  ; 16355T
    EXX
                  ; 16359T
    LD DE,nn
                  ; 16369T
    LD BC,nn
                  ; 16379T
    LD IX,nn
                  ; 16393T
    PUSH BC                     ; columns 5 and 6
                  ; 16408T
    PUSH DE                     ; columns 3 and 4
                  ; 16424T
    PUSH HL                     ; columns 1 and 2
                  ; 16440T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16450T
    LD HL,nn
                  ; 16460T
    LD DE,nn
                  ; 16470T
    LD BC,nn
                  ; 16480T
    PUSH BC                     ; columns 19 and 20
                  ; 16491T
    PUSH DE                     ; columns 17 and 18
                  ; 16502T
    PUSH HL                     ; columns 15 and 16
                  ; 16513T
    EXX
                  ; 16517T
    PUSH BC                     ; columns 13 and 14
                  ; 16528T
    PUSH DE                     ; columns 11 and 12
                  ; 16539T
    PUSH HL                     ; columns 9 and 10
                  ; 16550T
    LD HL,nn
                  ; 16560T
    PUSH HL                     ; columns 7 and 8
                  ; 16571T
    
    ; --- SET ATTRIBUTES FOR 4TH RASTER SCAN ---
    
    LD HL,nn
                  ; 16581T
    LD DE,nn
                  ; 16591T
    LD BC,nn
                  ; 16601T
    PUSH BC                     ; columns 5 and 6
                  ; 16616T
    PUSH DE                     ; columns 3 and 4
                  ; 16632T
    PUSH HL                     ; columns 1 and 2
                  ; 16648T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16658T
    LD HL,nn
                  ; 16668T
    LD DE,nn
                  ; 16678T
    LD BC,nn
                  ; 16688T
    PUSH BC                     ; columns 19 and 20
                  ; 16704T
    PUSH DE                     ; columns 17 and 18
                  ; 16715T
    PUSH HL                     ; columns 15 and 16
                  ; 16726T
    LD HL,nn
                  ; 16736T
    PUSH HL                     ; columns 13 and 14
                  ; 16747T
    LD HL,nn
                  ; 16757T
    PUSH HL                     ; columns 11 and 12
                  ; 16768T
    LD HL,nn
                  ; 16778T
    PUSH HL                     ; columns 9 and 10
                  ; 16789T
    PUSH AF                     ; columns 7 and 8
                  ; 16800T
    EX AF,AF'
                  ; 16804T
    
    ; --- SET ATTRIBUTES FOR 5TH RASTER SCAN ---
    
    LD HL,nn
                  ; 16814T
    LD DE,nn
                  ; 16824T
    PUSH DE                     ; columns 5 and 6
                  ; 16840T
    PUSH HL                     ; columns 3 and 4
                  ; 16856T
    LD HL,nn
                  ; 16866T
    LD DE,nn
                  ; 16876T
    LD BC,nn
                  ; 16886T
    EXX
                  ; 16890T
    LD HL,nn
                  ; 16900T
    LD DE,nn
                  ; 16910T
    LD BC,nn
                  ; 16920T
    PUSH BC                     ; columns 1 and 2
                  ; 16931T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16941T
    PUSH DE                     ; columns 19 and 20
                  ; 16952T
    PUSH HL                     ; columns 17 and 18
                  ; 16963T
    EXX
                  ; 16967T
    PUSH BC                     ; columns 15 and 16
                  ; 16978T
    PUSH DE                     ; columns 13 and 14
                  ; 16989T
    PUSH HL                     ; columns 11 and 12
                  ; 17000T
    LD HL,nn
                  ; 17010T
    PUSH HL                     ; columns 9 and 10
                  ; 17021T
    LD HL,nn
                  ; 17031T
    PUSH HL                     ; columns 7 and 8
                  ; 17048T
    
    ; --- SET ATTRIBUTES FOR 6TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17058T
    LD DE,nn
                  ; 17068T
    LD BC,nn
                  ; 17078T
    EXX
                  ; 17082T
    LD HL,nn
                  ; 17092T
    LD DE,nn
                  ; 17102T
    LD BC,nn
                  ; 17112T
    PUSH BC                     ; columns 5 and 6
                  ; 17128T
    PUSH DE                     ; columns 3 and 4
                  ; 17144T
    PUSH HL                     ; columns 1 and 2
                  ; 17155T
    LD SP,nn                    ; reference columns 17 and 18
                  ; 17165T
    LD HL,nn
                  ; 17175T
    PUSH HL                     ; columns 17 and 18
                  ; 17186T
    EXX
                  ; 17190T
    PUSH BC                     ; columns 15 and 16
                  ; 17201T
    PUSH DE                     ; columns 13 and 14
                  ; 17212T
    PUSH HL                     ; columns 11 and 12
                  ; 17223T
    PUSH IX                     ; columns 9 and 10
                  ; 17238T
    PUSH AF                     ; columns 7 and 8
                  ; 17249T
    LD HL,nn
                  ; 17259T
    LD (nn),HL                  ; columns 19 and 20
                  ; 17280T
    
    ; --- SET ATTRIBUTES FOR 7TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17290T
    LD DE,nn
                  ; 17300T
    LD BC,nn
                  ; 17310T
    EXX
                  ; 17314T
    LD HL,nn
                  ; 17324T
    LD DE,nn
                  ; 17334T
    LD BC,nn
                  ; 17344T
    PUSH BC                     ; columns 5 and 6
                  ; 17360T
    LD BC,nn
                  ; 17370T
    PUSH BC                     ; columns 3 and 4
                  ; 17381T
    PUSH DE                     ; columns 1 and 2
                  ; 17392T
    LD SP,nn                    ; reference columns 17 and 18
                  ; 17402T
    PUSH HL                     ; columns 17 and 18
                  ; 17413T
    EXX
                  ; 17417T
    PUSH BC                     ; columns 15 and 16
                  ; 17428T
    PUSH DE                     ; columns 13 and 14
                  ; 17439T
    PUSH HL                     ; columns 11 and 12
                  ; 17450T
    LD HL,nn
                  ; 17460T
    PUSH HL                     ; columns 9 and 10
                  ; 17471T
    LD HL,nn
                  ; 17481T
    PUSH HL                     ; columns 7 and 8
                  ; 17496T
    LD HL,nn
                  ; 17506T
    LD (nn), HL                 ; columns 19 and 20
                  ; 17528T
    
    ; --- SET ATTRIBUTES FOR 8TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17538T
    LD DE,nn
                  ; 17548T
    LD BC,nn
                  ; 17558T
    EXX
                  ; 17562T
    LD HL,nn
                  ; 17572T
    LD DE,nn
                  ; 17582T
    LD BC,nn
                  ; 17592T
    PUSH BC                     ; columns 5 and 6
                  ; 17603T
    PUSH DE                     ; columns 3 and 4
                  ; 17614T
    PUSH HL                     ; columns 1 and 2
                  ; 17625T
    LD SP,nn                    ; reference columns 15 and 16
                  ; 17635T
    EXX
                  ; 17639T
    PUSH BC                     ; columns 15 and 16
                  ; 17650T
    PUSH DE                     ; columns 13 and 14
                  ; 17661T
    PUSH HL                     ; columns 11 and 12
                  ; 17672T
    LD HL,nn
                  ; 17682T
    PUSH HL                     ; columns 9 and 10
                  ; 17693T
    LD HL,nn
                  ; 17703T
    PUSH HL                     ; columns 7 and 8
                  ; 17720T
    LD HL,nn
                  ; 17730T
    LD (nn), HL                 ; columns 17 and 18
                  ; 17752T
    LD HL,nn
                  ; 17762T
    LD (nn), HL                 ; OOPS!!! TOO LATE FOR COLUMNS 19 AND 20
    

    Now back to the drawing board...
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    All timings listed below are based on a Spectrum 48K with normal (non-late) timing.

    Really those should be referred to as cool and warm timings. ULA-based Spectrums all start off with cool timings and will eventually drift to warm timings if left on long enough. Another reason why I'm not bothering to support them with the ZXodus II Engine.
  • edited July 2013
    ... and when will we see this ZXodus II engine? Might want a play with it, :)
    So far, so meh :)
  • edited July 2013
    polomint wrote: »
    ... and when will we see this ZXodus II engine? Might want a play with it, :)

    Well, I believe you have seen a preview version of it already. If you've got a project you want to use it for let me know. I'm willing to license it on a case by case basis. It will remain proprietary closed source though.

  • edited July 2013
    I think this is going to work:
    ; --- PREPARE "PUSH AF/AF'" FOR LATER
    
                  ; 15984T
    LD SP,nn                    ; reference AF/AF' values
                  ; 15994T
    POP AF
                  ; 16004T
    EX AF,AF'
                  ; 16008T
    POP AF
                  ; 16018T
    
    ; --- SET ATTRIBUTES FOR 2ND RASTER SCAN ---
    
    LD SP,nn                    ; reference columns 5 and 6
                  ; 16028T
    LD HL,nn
                  ; 16038T
    LD DE,nn
                  ; 16048T
    LD BC,nn
                  ; 16058T
    EXX
                  ; 16062T
    LD HL,nn
                  ; 16072T
    LD DE,nn
                  ; 16082T
    LD BC,nn
                  ; 16092T
    LD IX,nn
                  ; 16106T
    LD IY,nn
                  ; 16120T
    LD (nn),IX                  ; columns 1 and 2
                  ; 16144T
    PUSH IY                     ; columns 5 and 6
                  ; 16168T
    PUSH BC                     ; columns 3 and 4
                  ; 16184T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16194T
    LD IX,nn
                  ; 16208T
    PUSH DE                     ; columns 19 and 20
                  ; 16224T
    LD DE,nn
                  ; 16234T
    LD BC,nn
                  ; 16244T
    PUSH BC                     ; columns 17 and 18
                  ; 16259T
    PUSH DE                     ; columns 15 and 16
                  ; 16270T
    PUSH HL                     ; columns 13 and 14
                  ; 16281T
    LD HL,nn
                  ; 16291T
    LD DE,nn
                  ; 16301T
    LD BC,nn
                  ; 16311T
    PUSH BC                     ; columns 11 and 12
                  ; 16322T
    PUSH DE                     ; columns 9 and 10
                  ; 16333T
    PUSH HL                     ; columns 7 and 8
                  ; 16344T
    
    ; --- SET ATTRIBUTES FOR 3RD RASTER SCAN ---
    
    LD HL,nn
                  ; 16354T
    LD DE,nn
                  ; 16364T
    LD BC,nn
                  ; 16374T
    PUSH BC                     ; columns 5 and 6
                  ; 16392T
    PUSH DE                     ; columns 3 and 4
                  ; 16408T
    PUSH HL                     ; columns 1 and 2
                  ; 16424T
    LD IY,nn
                  ; 16438T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16448T
    LD HL,nn
                  ; 16458T
    LD DE,nn
                  ; 16468T
    LD BC,nn
                  ; 16478T
    PUSH BC                     ; columns 19 and 20
                  ; 16489T
    PUSH DE                     ; columns 17 and 18
                  ; 16500T
    PUSH HL                     ; columns 15 and 16
                  ; 16511T
    EXX
                  ; 16515T
    PUSH BC                     ; columns 13 and 14
                  ; 16526T
    PUSH DE                     ; columns 11 and 12
                  ; 16537T
    PUSH HL                     ; columns 9 and 10
                  ; 16548T
    LD HL,nn
                  ; 16558T
    PUSH HL                     ; columns 7 and 8
                  ; 16569T
                  
    ; --- SET ATTRIBUTES FOR 4TH RASTER SCAN ---
    
    LD HL,nn
                  ; 16579T
    LD DE,nn
                  ; 16589T
    LD BC,nn
                  ; 16599T
    EXX
                  ; 16603T
    LD HL,nn
                  ; 16613T
    LD DE,nn
                  ; 16623T
    LD BC,nn
                  ; 16633T
    PUSH BC                     ; columns 5 and 6
                  ; 16648T
    PUSH DE                     ; columns 3 and 4
                  ; 16664T
    PUSH HL                     ; columns 1 and 2
                  ; 16680T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16690T
    LD HL,nn
                  ; 16700T
    LD DE,nn
                  ; 16710T
    LD BC,nn
                  ; 16720T
    PUSH BC                     ; columns 19 and 20
                  ; 16731T
    PUSH DE                     ; columns 17 and 18
                  ; 16742T
    PUSH HL                     ; columns 15 and 16
                  ; 16753T
    EXX
                  ; 16757T
    PUSH BC                     ; columns 13 and 14
                  ; 16768T
    PUSH DE                     ; columns 11 and 12
                  ; 16779T
    PUSH HL                     ; columns 9 and 10
                  ; 16790T
    PUSH AF                     ; columns 7 and 8
                  ; 16801T
                  
    ; --- SET ATTRIBUTES FOR 5TH RASTER SCAN ---
    
    LD HL,nn
                  ; 16811T
    LD DE,nn
                  ; 16821T
    LD BC,nn
                  ; 16831T
    EXX
                  ; 16835T
    LD HL,nn
                  ; 16845T
    LD DE,nn
                  ; 16855T
    LD BC,nn
                  ; 16865T
    PUSH BC                     ; columns 5 and 6
                  ; 16880T
    PUSH DE                     ; columns 3 and 4
                  ; 16896T
    LD DE,nn
                  ; 16906T
    LD BC,nn
                  ; 16916T
    PUSH IY                     ; columns 1 and 2
                  ; 16931T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 16941T
    PUSH BC                     ; columns 19 and 20
                  ; 16952T
    PUSH DE                     ; columns 17 and 18
                  ; 16963T
    PUSH HL                     ; columns 15 and 16
                  ; 16974T
    EXX
                  ; 16978T
    PUSH BC                     ; columns 13 and 14
                  ; 16989T
    PUSH DE                     ; columns 11 and 12
                  ; 17000T
    PUSH HL                     ; columns 9 and 10
                  ; 17011T
    LD HL,nn
                  ; 17021T
    EX AF,AF'
                  ; 17025T
    PUSH HL                     ; columns 7 and 8
                  ; 17040T
                  
    ; --- SET ATTRIBUTES FOR 6TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17050T
    LD DE,nn
                  ; 17060T
    LD BC,nn
                  ; 17070T
    EXX
                  ; 17074T
    LD HL,nn
                  ; 17084T
    LD DE,nn
                  ; 17094T
    LD BC,nn
                  ; 17104T
    PUSH BC                     ; columns 5 and 6
                  ; 17120T
    PUSH DE                     ; columns 3 and 4
                  ; 17136T
    LD DE,nn
                  ; 17146T
    LD BC,nn
                  ; 17156T
    PUSH BC                     ; columns 1 and 2
                  ; 17167T
    LD SP,nn                    ; reference columns 19 and 20
                  ; 17177T
    PUSH DE                     ; columns 19 and 20
                  ; 17188T
    PUSH HL                     ; columns 17 and 18
                  ; 17199T
    EXX
                  ; 17203T
    PUSH BC                     ; columns 15 and 16
                  ; 17214T
    PUSH DE                     ; columns 13 and 14
                  ; 17225T
    PUSH HL                     ; columns 11 and 12
                  ; 17236T
    PUSH AF                     ; columns 9 and 10
                  ; 17247T
    LD HL,nn
                  ; 17257T
    PUSH HL                     ; columns 7 and 8
                  ; 17272T
    
    ; --- SET ATTRIBUTES FOR 7TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17282T
    LD DE,nn
                  ; 17292T
    LD BC,nn
                  ; 17302T
    EXX
                  ; 17306T
    LD HL,nn
                  ; 17316T
    LD DE,nn
                  ; 17326T
    LD BC,nn
                  ; 17336T
    PUSH BC                     ; columns 5 and 6
                  ; 17352T
    LD BC,nn
                  ; 17362T
    PUSH IX                     ; columns 3 and 4
                  ; 17379T
    PUSH BC                     ; columns 1 and 2
                  ; 17390T
    LD SP,nn                    ; reference columns 17 and 18
                  ; 17400T
    PUSH DE                     ; columns 17 and 18
                  ; 17411T
    PUSH HL                     ; columns 15 and 16
                  ; 17422T
    EXX
                  ; 17426T
    PUSH BC                     ; columns 13 and 14
                  ; 17437T
    PUSH DE                     ; columns 11 and 12
                  ; 17448T
    PUSH HL                     ; columns 9 and 10
                  ; 17459T
    LD HL,nn
                  ; 17469T
    LD DE,nn
                  ; 17479T
    PUSH DE                     ; columns 7 and 8
                  ; 17496T
    LD DE,nn
                  ; 17506T
    LD BC,nn
                  ; 17516T
    LD (nn),HL                  ; columns 19 and 20
                  ; 17536T
    
    ; --- SET ATTRIBUTES FOR 8TH RASTER SCAN ---
    
    LD HL,nn
                  ; 17546T
    EXX
                  ; 17550T
    LD HL,nn
                  ; 17560T
    PUSH HL                     ; columns 5 and 6
                  ; 17576T
    LD HL,nn
                  ; 17586T
    LD DE,nn
                  ; 17596T
    LD BC,nn
                  ; 17606T
    PUSH BC                     ; columns 3 and 4
                  ; 17617T
    PUSH DE                     ; columns 1 and 2
                  ; 17628T
    LD SP,nn                    ; reference columns 15 and 16
                  ; 17638T
    PUSH HL                     ; columns 15 and 16
                  ; 17649T
    EXX
                  ; 17653T
    PUSH BC                     ; columns 13 and 14
                  ; 17664T
    PUSH DE                     ; columns 11 and 12
                  ; 17675T
    PUSH HL                     ; columns 9 and 10
                  ; 17686T
    LD HL,nn
                  ; 17696T
    PUSH HL                     ; columns 7 and 8
                  ; 17712T
    LD HL,nn
                  ; 17722T
    LD (nn),HL                  ; columns 17 and 18
                  ; 17744T
    LD HL,nn
                  ; 17754T
    LD (nn),HL                  ; columns 19 and 20
                  ; 17776T
    

    Once again, all timings are based on a Spectrum 48K with normal (non-late) timing. Total execution time is 17776T - 15984T = 1792T = 8 * 224T, so each execution cycle should finish exactly on time to start the following cycle.
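The totals quoted above can be re-checked from the timestamps in the listing. The following is only a sanity-check sketch in Python, not part of the posted routine: the numbers are copied from the listing's comments, while the names (`SECTION_END_T`, `section_durations`) are made up for illustration. Note that a "section" here means the block of code under each heading, which also includes register loads prepared for later scans.

```python
# T-state timestamps copied from the comments in the listing above.
# Names in this sketch are hypothetical; only the numbers come from the post.
START_T = 15984
LINE_T = 224  # T-states per raster scan line on a Spectrum 48K

SECTION_END_T = {
    "prepare AF/AF'": 16018,
    "2nd raster scan": 16344,
    "3rd raster scan": 16569,
    "4th raster scan": 16801,
    "5th raster scan": 17040,
    "6th raster scan": 17272,
    "7th raster scan": 17536,
    "8th raster scan": 17776,
}


def section_durations(start, ends):
    """Return the length in T-states of each section, in listing order."""
    durations = {}
    previous = start
    for name, end in ends.items():
        durations[name] = end - previous
        previous = end
    return durations


durations = section_durations(START_T, SECTION_END_T)
total = sum(durations.values())
passes = [d for name, d in durations.items() if "raster" in name]

print("total:", total, "T =", total // LINE_T, "scan lines")
for name, d in durations.items():
    print(f"{name:>16}: {d}T")
print("passes longer than one scan line:", sum(d > LINE_T for d in passes))
```

Every one of the seven sections runs longer than a single 224T scan line, yet the whole cycle still fits in eight lines, which is the "7 writes in the space of 8 scanlines" idea from the start of the thread.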

    I'm guessing exactly the same code will work just fine on a Spectrum 128K/+2, but I have no idea about the Spectrum +2A/+3. As a matter of fact, I haven't even tried this code in a Spectrum 48K emulator yet, so it's perfectly possible I made a silly mistake somewhere :)
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    I'm guessing exactly the same code will work just fine on a Spectrum 128K/+2, but I have no idea about the Spectrum +2A/+3. As a matter of fact, I haven't even tried this code in a Spectrum 48K emulator yet, so it's perfectly possible I made a silly mistake somewhere :)

    You've got more free t-states per frame on the +2A/+3 than on the other machines so I'd actually start by trying to get it to work on those and then see if it will run on the 48. You may run into problems on the ULA 128s owing to the I/O contention.
  • edited July 2013
    aowen wrote: »
    You've got more free t-states per frame on the +2A/+3 than on the other machines so I'd actually start by trying to get it to work on those and then see if it will run on the 48. You may run into problems on the ULA 128s owing to the I/O contention.

    The number of T-states per frame is not important. The real problem is the limited number of T-states per raster scan line (224T for a Spectrum 48K, 228T for others).

    For this reason, it would never work to plan everything for 228T and hope it would also work for 224T. It has to be the opposite. At first glance, it seems the extra delays due to contention will make the same routine also work on a Spectrum 128K/+2, but I will need some time to test everything properly...
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited July 2013
    The number of T-states per frame is not important.

    True. The I/O contention would only be an issue if you were using the shadow VRAM.
    At first glance, it seems the extra delays due to contention will make the same routine also work on a Spectrum 128K/+2, but I will need some time to test everything properly...

    ULA contention will iron out kinks between 224 and 228 Ts. You just need to start drawing the attributes after a different number of Ts.
  • edited July 2013
    aowen wrote: »
    ULA contention will iron out kinks between 224 and 228 Ts. You just need to start drawing the attributes after a different number of Ts.

    Not necessarily.

    Delays due to memory contention in a Spectrum 48K follow a pattern like this: 6,5,4,3,2,1,0,0,6,5,4,3,2,1,0,0,6,5,...

    Delays due to memory contention in a Spectrum +2A/+3 follow a pattern like this: 7,6,5,4,3,2,1,0,7,6,5,4,3,2,1,0,7,6,...

    The "standard" 18-column multicolor routine executes only 2 PUSHes per raster scan line. Each of these instructions typically takes 1T or 2T longer to execute on a Spectrum +2A/+3, but since there are an extra 4T per raster scan line (228T instead of 224T), this is not a problem.

    Unfortunately it's impossible to implement a 20 columns multicolor routine without executing a lot more accesses during contention, so I'm afraid a Spectrum +2A/+3 may require a completely different routine...
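One rough way to put numbers on the two patterns above is sketched below in Python. It is a back-of-envelope estimate, not a cycle-exact model: it assumes each contended write lands on a uniformly random position in the 8T pattern, and that each PUSH makes two contended writes. The function names are hypothetical.

```python
# Delay patterns exactly as quoted in the post above.
PATTERN_48K = (6, 5, 4, 3, 2, 1, 0, 0)
PATTERN_PLUS3 = (7, 6, 5, 4, 3, 2, 1, 0)

# Extra T-states per raster scan line on a +2A/+3 (228T instead of 224T).
EXTRA_LINE_BUDGET = 228 - 224


def mean_delay(pattern):
    """Average delay of one access, ASSUMING a uniformly random phase."""
    return sum(pattern) / len(pattern)


def extra_per_line(pushes_in_contention):
    """Estimated extra T-states per scan line on a +2A/+3 versus a 48K.

    Assumes two contended writes per PUSH, each at a random phase.
    """
    per_access = mean_delay(PATTERN_PLUS3) - mean_delay(PATTERN_48K)
    return 2 * per_access * pushes_in_contention


print("mean delay 48K  :", mean_delay(PATTERN_48K))
print("mean delay +2A/3:", mean_delay(PATTERN_PLUS3))
for n in (1, 2, 3, 4):
    cost = extra_per_line(n)
    verdict = "fits" if cost <= EXTRA_LINE_BUDGET else "exceeds"
    print(f"{n} PUSH(es): about {cost}T extra, {verdict} the 4T budget")
```

Under these assumptions a single PUSH costs about 1.75T more, in line with the "1T or 2T longer" quoted above, and two of them stay within the extra 4T. The estimate grows linearly with the number of PUSHes executed during contention, so it passes the 4T budget from the third one onward. Real timings depend on where each access actually falls in the pattern.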
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited August 2013
    Unfortunately it's impossible to implement a 20 columns multicolor routine without executing a lot more accesses during contention, so I'm afraid a Spectrum +2A/+3 may require a completely different routine...

    I'll be interested to see if this in fact proves to be the case.
  • edited August 2013
    I'll drop a hint here, drawn from much trial and error, of how to get your first line in synch without slaving over it for months. (I may have said this already - forgive my wonky memory).
    Leave at least one row of ordinary attributes at the top of the screen. ZXODUS does this. I don't do it in Buzzsaw+ but my other multicolour experiments do. You can always find a way to use more upper border time.

    Then, whilst you're in that contended time of the first character row, do something like copy one or two rows of attributes (or even do a whole character row's worth of copying) on the first row of multicoloured characters before the raster gets there. Then, do them again properly as the raster actually arrives. That first pass will iron out all the kinks in timing coming from a delayed interrupt, then your first line of multicolour will work just as well as all the repeated lines.
    Joefish
    - IONIAN-GAMES.com -
  • edited August 2013
    aowen wrote: »
    I'll be interested to see if this in fact proves to be the case.

    Me too :)
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited August 2013
    joefish wrote: »
    I'll drop a hint here, drawn from much trial and error, of how to get your first line in synch without slaving over it for months. (I may have said this already - forgive my wonky memory).
    Leave at least one row of ordinary attributes at the top of the screen. ZXODUS does this. I don't do it in Buzzsaw+ but my other multicolour experiments do. You can always find a way to use more upper border time.

    Then, whilst you're in that contended time of the first character row, do something like copy one or two rows of attributes (or even do a whole character row's worth of copying) on the first row of multicoloured characters before the raster gets there. Then, do them again properly as the raster actually arrives. That first pass will iron out all the kinks in timing coming from a delayed interrupt, then your first line of multicolour will work just as well as all the repeated lines.

    Thanks for the suggestion, but there's a better way. Executing a single instruction ld hl,($4000) at the end of the contention period on each raster scan line will provide a similar result.

    This is the main idea behind the "anti-flickering mechanism" implemented in BIFROST*. This saves a lot of bytes, and you could even use the remaining time on each of these raster scan lines to do something useful if you need.

    In BIFROST* source code, search for comment "synchronize with the raster beam" to find the relevant piece of code.
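Why a contended read can act as a synchronizer is easy to see with a toy model. The Python sketch below is not taken from BIFROST*. It uses the 48K delay pattern quoted earlier in the thread and the fact that ld hl,(nn) ends with two consecutive 3T reads (from nn and nn+1). It assumes that T-states are counted from the start of a pattern cycle, that both reads fall inside the contended window, and that the incoming jitter is less than 8T.

```python
# 48K delay pattern as quoted earlier in the thread.
PATTERN_48K = (6, 5, 4, 3, 2, 1, 0, 0)
ACCESS_T = 3  # T-states a memory read takes once the ULA lets it through


def contended_read(t, pattern=PATTERN_48K):
    """Return the T-state at which a read that starts at t finishes.

    t is counted from the start of a pattern cycle (an assumption of
    this sketch); the read is first held for pattern[t % 8] T-states.
    """
    return t + pattern[t % len(pattern)] + ACCESS_T


def two_reads(t, pattern=PATTERN_48K):
    """Two back-to-back contended reads, like the data reads of ld hl,(nn)."""
    return contended_read(contended_read(t, pattern), pattern)


# Try every possible arrival phase within one 8T pattern cycle.
after_one = sorted({contended_read(t) for t in range(8)})
after_two = sorted({two_reads(t) for t in range(8)})

print("finish times after one read :", after_one)
print("finish times after two reads:", after_two)
```

In this model a single read narrows eight possible arrival phases down to two finish times, and the second read removes the remaining 1T difference, so whatever follows starts at the same T-state regardless of how the code arrived. The model ignores what happens at the very end of the contention period, which is where the post above places the instruction.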
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • edited August 2013
    IT WORKS!!!

    Tested on both Spectrum 48K and Spectrum 128K, obtaining the following result:

    [screenshot: kbyb6c.jpg]

    Unfortunately it's not working (yet) on a Spectrum +3:

    [screenshot: 2sbqrg1.jpg]

    In this case, the last 2 columns during the last raster scan line are not updated fast enough due to (different) contention. Because of this, every 8th pixel line shows the same attribute from the pixel line above (PAPER black instead of blue).
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.