Optimal shifting for set pixel?

edited May 2014 in Development
Hi,

I'm optimizing my set pixel code for speed, but without using page-aligned memory (i.e. a table on a 256-byte boundary). I already have optimal code for computing the memory address and am now trying to optimize the shifting part.

Here's what I came up with so far.
		ld	a,#x		; symbolic. a is x (0->255)
		and	#7		; remainder of division by 8 (0->7)
		inc	a		; avoid special condition without shift / i.e. jr z
		ld	b,a		; our shift counter
		xor	a		; a=0
		scf			; our shift bit
loop:		rra			; to a
		djnz	loop		; until appropriate position

Is there a more optimal approach to creating the bit mask than this?

T.

Comments

  • edited May 2014
    Using a lookup table is probably best, but if you really wanted to do it without, you could use self-modifying code.

    Have a SET 0,(HL) instruction, and modify the last byte from C6 to be (C6 | (x&7)<<3), then execute that.

    (disclaimer: haven't tried it)
  • edited May 2014
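    What kayamon said, plus the table can be placed anywhere ( ld hl, tab \ ld d, 0 \ ld e, entry \ add hl, de \ ld a, (hl))
    Also, this could work (a holds the number of rotates to skip; rlca rather than rla, so that a stale carry flag cannot rotate into the mask):
    ld (jump+1), a \ ld a, 1 \ jump: jr 0 \ rlca \ rlca \ rlca \ rlca \ rlca \ rlca \ rlca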
  • edited May 2014
    Something like this?
    Pixel Set
    ---------
    
    	LD A,#x	(#00-FF)
    	AND #07
    	INC A
    	RLA
    	RLA
    	RLA
    	ADD A,#C6
    	LD (NN+1),A
    NN	SET N,(HL)
    
  • edited May 2014
            ld de,table   ; table must not cross a 256-byte page (no carry into d)
            and 7
            add a,e
            ld e,a
            ld a,(de)     ; second opcode byte of set n,(hl)
            ld (setn+1),a
    setn    set 0,(hl)
            ret
    table   defb #c6,#ce,#d6,#de,#e6,#ee,#f6,#fe
    

    Edit: looking at the previous code, the table values might be incorrect.

    Edit2:
    As mentioned above, C6 and up (in steps of 8) are the correct values.
    When the table is at #nn00:
            ld d,tablehighbyte
            and 7
            ld e,a
            ld a,(de)     
            ld (setn+1),a
    setn    set 0,(hl)
            ret
    table   defb #c6,#ce,#d6,#de,#e6,#ee,#f6,#fe
    
  • edited May 2014
    jamorski wrote: »
    Something like this?
    Pixel Set
    ---------
     
        LD A,#x    (#00-FF)
        AND #07
    ;    INC A
        RLA
        RLA
        RLA
        ADD A,#C6    ; SET 0,(HL) is CB C6; each higher bit adds 8
        LD (NN+1),A
    NN    SET N,(HL)
    

    Like this?
  • edited May 2014
    Dr BEEP wrote: »
    Like this?
    Interesting, the computed version beats the LUT.
  • edited May 2014
    Surely if you're using the last 3 bits of the X-value to set the bit in a byte then aren't you counting them in reverse order to how they appear on the screen?
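
    For the self-modifying variant, one way to account for this could look like the sketch below. It is untested and rests on a few assumptions: the ZX Spectrum layout where the leftmost pixel of a byte is bit 7, x arriving in A, and HL already pointing at the screen byte. It inverts x before masking, since (NOT x) AND 7 = 7 - (x AND 7):

            cpl                 ; A = NOT x
            and 7               ; A = 7 - (x AND 7), i.e. the bit number
            rlca                ; three rotates: A = bit number * 8
            rlca
            rlca
            or #c6              ; second opcode byte of SET n,(HL)
            ld (setn+1),a       ; patch the instruction below
    setn    set 0,(hl)          ; runs as SET n,(HL) with the computed n

    With x AND 7 = 0 the patched byte becomes #FE, i.e. SET 7,(HL), which is the leftmost pixel.
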
    Joefish
    - IONIAN-GAMES.com -
  • edited May 2014
    LD A,#x;
    AND #7;
    LD H,table/256; place the table so that its low address byte (L) is 0
    LD L,A;
    LD A,(HL);

    table defb #80,#40,#20,#10,#08,#04,#02,#01;


    It's untested, but you should get the idea.
  • edited May 2014
    An easy way is to fill 256 bytes with the same sequence, so the code looks like:
    ld h,bw/256
    ld l,X; X is value
    ld a,(hl)
    bw:db #80,#40,#20,#10,#08,#04,#02,#01 ;32 times
    
  • edited May 2014
    g0blinish, X is a passed value, not a constant: this is a pixel set routine, and the rest of X is used to calculate the display buffer offset, so it is going to be used anyway. Repeating the byte sequence is a waste of memory, IMO...

    If it weren't a passed value, we could do:

    LD A,(bw+X);
    bw:db #80,#40,#20,#10,#08,#04,#02,#01 ;32 times
  • edited May 2014
    routine by Busy:
    ;DE=Y.X
    PLOT    PUSH HL,BC:LD H,HIGH PLOTTBL,L,D,B,(HL):INC H
            LD A,(HL),L,E:INC H:OR (HL)
            INC H:LD C,A,A,(BC)
            OR (HL):LD (BC),A
            POP BC,HL:RET
    
    Of course, the routine isn't optimized.

    To use PLOT, some data must be prepared first (1024 bytes):
    PLOTTBL EQU #A000
    
      call FORMER
    
    FORMER  LD DE,#4000,BC,#8000,L,E
    FLP1    LD H,high PLOTTBL
            LD (HL),D:INC H:LD (HL),E:INC H
            LD (HL),C:INC H:LD (HL),B
            RRC B
            LD A,C:ADC A,0:LD C,A
    FBR1    INC D:LD A,D:AND 7
            JR NZ,FNXT:LD A,E:ADD A,32
            LD E,A:JR C,FNXT
            LD A,D:SUB 8:LD D,A
    FNXT    INC L:JR NZ,FLP1
            LD HL,PLOTTBL+#C0,BC,#3F
            LD DE,HL:INC E
            LD (HL),0:LDIR
            RET
    
  • edited May 2014
    Self-modifying code is bad karma (not ROMable). But I guess the jump can still be made with JP (HL), if the shift table is placed on a 256-byte boundary.
    LD H,$HIGH(SHIFTS)
    LD L,#nrshifts
    JP (HL)
    SHIFTS:
    ...
    8 shift instructions in the other direction
    ...
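
    A minimal sketch of that idea, untested and with a few assumptions: x is in A on entry, SHIFTS is assembled at an address whose low byte is 0, and the mask starts as #80 and is rotated right. Entering the chain at offset 7 - (x AND 7) leaves exactly (x AND 7) rotates to execute:

            cpl                 ; A = NOT x
            and 7               ; A = 7 - (x AND 7) = entry offset
            ld l,a
            ld h,SHIFTS/256     ; SHIFTS must sit at #nn00
            ld a,#80            ; mask for the leftmost pixel
            jp (hl)
    SHIFTS  rrca                ; offset 0: 7 rotates remain
            rrca
            rrca
            rrca
            rrca
            rrca
            rrca                ; offset 6: 1 rotate remains
    DONE                        ; offset 7: A = #80 >> (x AND 7)

    Note that this uses HL for the jump, so the screen address has to be kept elsewhere (or computed afterwards).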
    