Part 14
The Character Set

Subjects covered...

	CODE, CHR$
	POKE, PEEK
	USR
	BIN

The letters, digits, spaces, punctuation marks and so on that can
appear in strings are called characters, and they make up the
character set that the +3 uses. Most of these characters are single
symbols, but there are some more, called tokens, that represent whole
words, such as PRINT, STOP, '<>' and so on.

There are 256 characters, and each one has a code between 0 and 255
(there is a complete list of them in part 28 of this chapter). To
convert between codes and characters, there are two functions, CODE
and CHR$.

CODE is applied to a string, and gives the code of the first character
in the string (or 0 if the string is empty).

CHR$ is applied to a number, and gives the single character string
whose code is that number.
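
The two functions undo each other. As a small sketch (the particular
characters chosen here are only examples), try these commands...

	PRINT CODE "A"
	PRINT CHR$ 65
	PRINT CODE ""

...which print 65, then A, then 0. The capital letter A has the code
65, and the last line shows the result for an empty string.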

This program prints out the entire character set...

	10 FOR a=32 TO 255: PRINT CHR$ a;: NEXT a

On the screen will appear the following...

  +----------------------------------------------------------------------+
  |                                                                      |
  |     ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?    |
  |   /=A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _    |
  |   ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ ()   |
  |      '' '' . :'.':. .': :'...::.::A B C D E F G H I J K L M N O P    |
  |   Q R S   S P E C T R U M   P L A Y   R N D I N K E Y $ P I F N      |
  |   P O I N T   S C R E E N $   A T T R   A T   T A B   V A L $   C    |
  |   O D E   V A L   L E N   S I N   C O S   T A N   A S N   A C S      |
  |   A T N   L N   E X P   I N T   S Q R   S G N   A B S   P E E K      |
  |   I N   U S R   S T R $   C H R $   N O T   B I N   O R   A N D      |
  |   < = > = < >   L I N E   T H E N   T O   S T E P   D E F   F N      |
  |   C A T   F O R M A T   M O V E   E R A S E   O P E N   #   C L O    |
  |   S E   #   M E R G E   V E R I F Y   B E E P   C I R C L E   I N    |
  |   K   P A P E R   F L A S H   B R I G H T   I N V E R S E   O V E    |
  |   R   O U T   L P R I N T   L L I S T   S T O P   R E A D   D A T    |
  |   A   R E S T O R E   N E W   B O R D E R   C O N T I N U E   D I    |
  |   M   R E M   F O R   G O   T O   G O   S U B   I N P U T   L O A    |
  |   D   L I S T   L E T   P A U S E   N E X T   P O K E   P R I N T    |
  |     P L O T   R U N   S A V E   R A N D O M I Z E   I F   C L S      |
  |   D R A W   C L E A R   R E T U R N   C O P Y                        |
  |                                                                      |
  |                                                                      |
  |                                                                      |
  |                                                                      |
  |   0   O K ,   1 0 : 1                                                |
  |                                                                      |
  +----------------------------------------------------------------------+

                             The character set


As you can see, the character set consists of a space, 15 symbols and
punctuation marks, the ten digits, seven more symbols, the capital
letters, six more symbols, the lower case letters and five more
symbols. These are all (except the pound sign shown above as '/=', and
the copyright symbol shown above as '()') taken from a widely-used set
of characters known as ASCII (American Standard Code for Information
Interchange). ASCII also assigns numeric codes to these characters,
and these are the codes that the +3 uses.
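
One consequence of using the ASCII codes is that each lower case
letter has a code exactly 32 greater than the matching capital (for
instance, 'A' is 65 and 'a' is 97). As a sketch of how this might be
used to convert a capital letter to lower case...

	PRINT CHR$ (CODE "A"+32)

...which prints 'a'. The brackets make sure that the 32 is added to
the code before CHR$ is applied.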

The rest of the characters are not part of ASCII, but are dedicated to
the ZX Spectrum range of computers. First amongst them are a space and
15 patterns of black and white blobs [although it is difficult to
depict them in ASCII, as you have recently seen]. These are called the
graphics symbols and can be used for drawing pictures. You can enter
these from the keyboard, using what's known as graphics mode. Pressing
the GRAPH key switches on graphics mode, after which the keys 1, 2, 3,
4, 5, 6, 7 and 8 will produce the graphics symbols...

+-----------------------------------------------------------------------------+
|     |     |..## |##.. |#### |.... |..## |##.. |#### |.... |graph|     |     |
|     |     |....1|....2|....3|..##4|..##5|..##6|..##7|....8|off 9|     |     |
|-----------------------------------------------------------------------------|
|       |     |     |     |     |     |     |     |     |     |     |     |   |
|       |GRAPH|     |     |     |     |     |     |     |     |     |     |   |
|-------------------------------------------------------------------------+   |
|         |     |     |     |     |     |     |     |     |     |     |       |
|         |     |     |     |     |     |     |     |     |     |     |       |
|-----------------------------------------------------------------------------|
|            |     |     |     |     |     |     |     |     |     |          |
|            |     |     |     |     |     |     |     |     |     |          |
|-----------------------------------------------------------------------------|
|     |     |     |     |     |                       |     |     |     |     |
|     |     |     |     |     |                       |     |     |     |     |
+-----------------------------------------------------------------------------+


While in graphics mode, pressing CAPS SHIFT together with one of the
keys 1 to 8 produces 'inverted' versions of the same symbols, i.e.
black becomes white and white becomes black...


+-----------------------------------------------------------------------------+
|     |     |##.. |..## |.... |#### |##.. |..## |.... |#### |graph|     |     |
|     |     |####1|####2|####3|##..4|##..5|##..6|##..7|####8|off 9|     |     |
|-----------------------------------------------------------------------------|
|       |     |     |     |     |     |     |     |     |     |     |     |   |
|       |GRAPH|     |     |     |     |     |     |     |     |     |     |   |
|-------------------------------------------------------------------------+   |
|         |     |     |     |     |     |     |     |     |     |     |       |
|         |     |     |     |     |     |     |     |     |     |     |       |
|-----------------------------------------------------------------------------|
|            |     |     |     |     |     |     |     |     |     |          |
| CAPS SHIFT |     |     |     |     |     |     |     |     |     |CAPS SHIFT|
|-----------------------------------------------------------------------------|
|     |     |     |     |     |                       |     |     |     |     |
|     |     |     |     |     |                       |     |     |     |     |
+-----------------------------------------------------------------------------+


The cursor keys won't work properly while all this is going on as the
+3 interprets them as shifted number keys, and prints graphics
characters accordingly.

Pressing the 9 key turns everything back to normal (as does pressing
GRAPH again). The 0 key deletes the character to the left of the
cursor.

Here are the sixteen graphics symbols...


           Symbol    Code                Symbol    Code
            ____                          ____          
           |    |    128                 |####|    143
           |____|                        |####|       
            ____                          ____        
           |  ##|    129                 |##  |    142
           |____|                        |####|       
            ____                          ____        
           |##  |    130                 |  ##|    141
           |____|                        |####|       
            ____                          ____        
           |####|    131                 |    |    140
           |____|                        |####|       
            ____                          ____        
           |    |    132                 |####|    139
           |__##|                        |##__|       
            ____                          ____        
           |  ##|    133                 |##  |    138
           |__##|                        |##__|       
            ____                          ____        
           |##  |    134                 |  ##|    137
           |__##|                        |##__|       
            ____                          ____        
           |####|    135                 |    |    136
           |__##|                        |##__|
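

The codes in this table follow a pattern. Starting from 128, add 1 if
the top right quarter is ink, 2 for the top left, 4 for the bottom
right and 8 for the bottom left. As a sketch (the variable names tl,
tr, bl and br are our own choice, not part of the +3), this program
works out the code for a symbol from its four quarters and prints
it...

	10 INPUT "Top left (0 or 1)? ";tl
	20 INPUT "Top right (0 or 1)? ";tr
	30 INPUT "Bottom left (0 or 1)? ";bl
	40 INPUT "Bottom right (0 or 1)? ";br
	50 LET c=128+tr+2*tl+4*br+8*bl
	60 PRINT c;" ";CHR$ c

Answering 1, 0, 1 and 0 gives the code 138, which is the symbol with
its left half filled in. The inverted version of a symbol with code c
has the code 271-c, as each pair of entries in the table shows.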


After the graphics symbols in the character set, you will see what
appears to be another copy of the alphabet from A to S. These are
characters that you can redefine yourself (though when the machine is
first switched on they are set as letters) - they are called
user-defined graphics. You can type these in from the keyboard by
going into graphics mode, and then using the letter keys A to S.
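
Since the user-defined graphics follow directly after the sixteen
graphics symbols (codes 128 to 143), they have the codes 144 (for A)
to 162 (for S). A one-line sketch to print just these characters...

	10 FOR n=144 TO 162: PRINT CHR$ n;: NEXT n

When the machine is first switched on, this prints the letters A to S.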

To define a new character for yourself, follow this recipe - it
defines a character to show pi.

(i) Work out what the character looks like. Each character has an 8x8
grid of dots, each of which can appear to be either on or off. You'd
draw a diagram something like this (with filled squares representing
the dots which are on)...

                    _______________________________
                   |   |   |   |   |   |   |   |   |
                   |___|___|___|___|___|___|___|___|
                   |   |   |   |   |   |   |   |   |
                   |___|___|___|___|___|___|___|___|
                   |   |   |   |   |   |   |###|   |
                   |___|___|___|___|___|___|###|___|
                   |   |   |###|###|###|###|   |   |
                   |___|___|###|###|###|###|___|___|
                   |   |###|   |###|   |###|   |   |
                   |___|###|___|###|___|###|___|___|
                   |   |   |   |###|   |###|   |   |
                   |___|___|___|###|___|###|___|___|
                   |   |   |   |###|   |###|   |   |
                   |___|___|___|###|___|###|___|___|
                   |   |   |   |   |   |   |   |   |
                   |___|___|___|___|___|___|___|___|


When a dot is on, the +3 prints the ink colour; when a dot is off, the
+3 prints the paper colour. (The terms ink and paper are explained in
part 16 of this chapter.)

We've left a one-square border around the edge because all the other
letters also have one (except for lower case letters with tails, where
the tail goes right down to the bottom).

(ii) Work out which user-defined graphic is to display pi -
let's say the one corresponding to 'P' so that if you press P (after
pressing GRAPH) you get pi.


(iii) Store the new pattern. Each user-defined graphic has its pattern
stored as eight numbers, one for each row.  You can write each of
these numbers in a program as BIN followed by eight 0's or 1's - 0 for
paper, 1 for ink - so the eight numbers for our pi character are...

	BIN 00000000	- top row
	BIN 00000000	- second row down
	BIN 00000010	- third row down
	BIN 00111100	- fourth row down
	BIN 01010100	- fifth row down
	BIN 00010100	- sixth row down
	BIN 00010100	- seventh row down
	BIN 00000000	- bottom row

(If you know about binary numbers, then it should help you to know
that BIN is used to write a number in binary instead of the usual
decimal.) Look at the pattern of binary numbers through half-closed
eyes - you may even be able to see the pi character!
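
To see the decimal value that a BIN number stands for, you can simply
print it. This sketch uses two of the rows from the pi character...

	PRINT BIN 00111100
	PRINT BIN 01010100

...which prints 60 and then 84 (that is, 32+16+8+4 and 64+16+4).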

These eight numbers are stored in eight locations (bytes) in memory.
Each of these locations has an address. The address of the first byte
(or group of eight digits) is 'USR "P"' (we chose 'P' in (ii) above).
The address of the second byte is 'USR "P"+1', and so on up to the
address 'USR "P"+7'.

USR here is a function to convert a string argument into the address
of the first byte in memory for the corresponding user-defined
graphic. The string argument must be a single character which can be
either the user-defined graphic itself or the corresponding letter (in
upper or lower case). There is another use for USR, when its argument
is a number, which will be dealt with later.

Even if you don't understand this, the following program will define
the character for you...

	 10 FOR n=0 TO 7
	 20 READ row: POKE USR "P"+n, row
	 30 NEXT n
	 40 DATA BIN 00000000
	 50 DATA BIN 00000000
	 60 DATA BIN 00000010
	 70 DATA BIN 00111100
	 80 DATA BIN 01010100
	 90 DATA BIN 00010100
	100 DATA BIN 00010100
	110 DATA BIN 00000000
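
Once the program has been run, the new character can be produced
either by pressing GRAPH and then P, or by using its code. Counting
from 144 for the user-defined graphic A (the code that follows the
last graphics symbol, 143), P is the sixteenth letter and so has the
code 159. As a quick check, try...

	PRINT CHR$ 159

...which should display the pi character rather than the letter P.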

The POKE statement stores a number directly in a memory location,
bypassing the mechanisms normally used by BASIC. The opposite of
POKE is PEEK, and this allows us to look at the contents of a memory
location although it does not actually alter the contents themselves.
PEEK and POKE are described more fully in part 24 of this chapter.
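
As a sketch of PEEK in use, these three lines could be added to the
end of the program above (the line numbers 120 to 140 are simply the
next ones free) to read back the eight bytes that have just been
stored...

	120 FOR n=0 TO 7
	130 PRINT PEEK (USR "P"+n)
	140 NEXT n

When the whole program is run, this prints the numbers 0, 0, 2, 60,
84, 20, 20 and 0, which are the decimal values of the eight BIN rows.
The brackets are needed here - without them, the +3 would apply PEEK
to 'USR "P"' first and only then add n to the result.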

After the user-defined graphics in the character set come the tokens.

You will have noticed that we have not printed out the first 32
characters (codes 0 to 31) - these are control characters. They don't
produce anything printable, but instead are used to control the screen
display or some other function of the +3.

(If you try to print control characters, the +3 displays '?' to show
that it doesn't understand them. Control characters are described
more fully in part 28 of this chapter.)

The three control characters that the screen display uses are 6, 8 and
13 (these will now be explained). On the whole, 'CHR$ 8' is the only
one you are likely to find useful.

'CHR$ 6' prints spaces in exactly the same way as a comma does in a
PRINT statement, for instance...

	PRINT 1; CHR$ 6;2

...does the same as...

	PRINT 1,2

Obviously this is not a very clear way of using it. A more subtle way
is to say...

	LET a$="1"+ CHR$ 6+"2"
	PRINT a$

'CHR$ 8' is 'backspace' - it moves the print position back one place.
Try...

	PRINT "1234"; CHR$ 8;"5"

...which prints out...

	1235

'CHR$ 13' is 'newline' - it moves the print position to the beginning
of the next line.
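
For instance, as a small sketch...

	PRINT "first"; CHR$ 13;"second"

...prints 'first' on one line and 'second' at the start of the line
below it.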

The screen display also uses control codes 16 to 23 - these are
explained in parts 15 and 16 of this chapter (all the codes are listed
in part 28).

Using the codes for the characters we can extend the concept of
'alphanumerical ordering' to cover strings containing any characters,
not just letters. If instead of thinking in terms of the usual
alphabet of 26 letters we use the extended alphabet of 256 characters,
in the same order as their codes, then the principle is exactly the
same. For instance, the following strings are in their 'Spectrum'
ASCII alphabetical order. (Notice the rather odd feature that lower
case letters come after all the capitals; so 'a' comes after 'Z'.
Notice also that spaces are significant.)

	CHR$ 3+"ZOOLOGICAL GARDENS"
	CHR$ 8+"AARDVARK HUNTING"
	" AAAARGH!"
	"(Parenthetical remark)"
	"100"
	"129.95 inc. VAT"
	"AASVOGEL"
	"Aardvark"
	"Elgar, the Regal Lager"
	"PRINT"
	"Zoo"
	"[interpolation]"
	"aardvark"
	"aasvogel"
	"derby"
	"zoo"
	"zoology"

Here is the rule for finding out in which order two strings come.
Start by comparing the first two characters. If they are different,
then one of them has its code less than the other, and the string it
comes from is the earlier (lesser) of the two strings. If they are the
same, then go on to compare the next two characters. If in this
process one of the strings runs out before the other, then that string
is the earlier; otherwise they must be equal.
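
The rule above can be sketched as a short program that compares two
strings one character at a time. It is only an illustration (the
messages and line numbers are our own choice) - in practice the
relations described next do the same job in a single step.

	 10 INPUT a$,b$
	 20 LET n=1
	 30 IF n>LEN a$ OR n>LEN b$ THEN GO TO 70
	 40 IF a$(n)<>b$(n) THEN GO TO 100
	 50 LET n=n+1: GO TO 30
	 70 IF LEN a$<LEN b$ THEN PRINT "First is earlier": STOP
	 80 IF LEN a$>LEN b$ THEN PRINT "Second is earlier": STOP
	 90 PRINT "Equal": STOP
	100 IF CODE a$(n)<CODE b$(n) THEN PRINT "First is earlier": STOP
	110 PRINT "Second is earlier"

Lines 30 to 50 step along both strings until the characters differ or
one string runs out. Lines 70 to 90 deal with a string running out,
and lines 100 and 110 compare the codes of the first two characters
that differ.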

The relations '=', '<', '>', '<=', '>=' and '<>' are used for strings
as well as for numbers: '<' means 'comes before' and '>' means 'comes
after', so that...

	"AA man"<"AARDVARK"
	"AARDVARK">"AA man"

...are both true.

'<=' and '>=' work in the same way as they do for numbers, so that...

	"The same string" <= "The same string"

...is true, but...

	"The same string" < "The same string"

...is false.
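
The +3 gives the value 1 for a relation that is true and 0 for one
that is false, so you can test any of these by printing them. As a
sketch...

	PRINT "AA man"<"AARDVARK"
	PRINT "The same string"<"The same string"

...prints 1 and then 0.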

Experiment on all this using the program here, which inputs two
strings and puts them in order.

	10 INPUT "Type in two strings:",a$,b$
	20 IF a$>b$ THEN LET c$=a$: LET a$=b$: LET b$=c$
	30 PRINT a$;" ";
	40 IF a$<b$ THEN PRINT "<";: GO TO 60
	50 PRINT "=";
	60 PRINT " ";b$
	70 GO TO 10