Objectives: To instill an understanding of how digital computers work from both a hardware and software perspective.
References:
MC9S08QG8 Data Manual, 300 pages, 3Mb pdf
The Central Processing Unit (CPU) is the brain of the computer. It performs the essential task of fetching instructions from program memory and executing these instructions. The CPU contains the Arithmetic/Logic Unit (ALU) which performs the various arithmetic and boolean operations. The design of the ALU gives each computer its distinctive flavour. This is what makes the Intel 486 processor different from the Motorola 68000.
Memory is used for storage of program instructions as well as data. A unit of storage is called a word. Computers have different word lengths, for example, 4, 8, 12, 14, 16, 32 and 64-bit words are in common use today. Microcomputers generally use a word length of 8 bits, or a byte. Memory is organized as a linear array of words, a word being the smallest collection of bits which can be accessed by the CPU. Each word is identified by its address or location in memory. Thus a computer with 64K words of memory (65536) would require a memory address register that is 16 bits wide.
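As a quick check of the arithmetic above, the number of address bits needed for a given memory size can be computed in a few lines of C. This is only an illustrative sketch; the function name `address_bits` is our own and does not come from any Freescale library.

```c
#include <stdint.h>

/* Hypothetical helper: smallest number of address bits that gives
   every one of `words` memory locations a unique address. */
unsigned address_bits(uint32_t words)
{
    unsigned bits = 0;

    /* Each extra address bit doubles the number of addressable words. */
    while (bits < 32 && ((uint32_t)1 << bits) < words) {
        bits++;
    }
    return bits;
}
```

For example, `address_bits(65536)` gives 16, which matches the 16-bit memory address register mentioned above.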
Semiconductor memories have virtually replaced older forms of memory such as magnetic core memory. Because of their high density and low cost, dynamic random-access memory (DRAM) chips are used in modern computers. These have the drawback that they must be refreshed continuously. That is, additional control circuitry is required to access banks of memory periodically, otherwise the electronic charge in each memory cell would be lost. DRAMs are also volatile, i.e., all data is lost after the power is turned off. Static random-access memories (SRAM) are faster than DRAM, use less power and do not require refreshing. SRAMs, like DRAMs, are also volatile.
There are many applications where non-volatile storage is required. Read-only memories (ROM) are used to store program code or information that never changes. To be cost effective, these must be programmed by the manufacturer in large quantities. UV-EPROM (ultra-violet erasable programmable ROM) can be erased and reprogrammed by the user using a special UV lamp and programming equipment. EEPROM (electrically erasable PROM) has the advantage of being erasable and programmable in situ. Flash memory is another form of EEPROM. Because of their higher density, block orientation and longer erase times, flash memories are more suited as replacements for hard disk drives or as removable storage devices such as CompactFlash cards and memory sticks.
Without Input/Output (I/O) capabilities a computer would not be of much use. I/O is any means of getting information into and out of the computer. This could be as simple as switches and lights or more traditional I/O devices such as the keyboard, mouse and video display. Modems, sound cards, analog-to-digital and digital-to-analog converters, serial and parallel ports, timers and counters are all examples of I/O devices.
Storage devices fall into another category of I/O devices. Hard disk drives, floppy disk drives and CD-ROM drives are mass storage devices for keeping large amounts of information. While these may not be perceived as part of the I/O capabilities of the computer, from the hardware perspective they are interfaced to the computer via the I/O bus.
We are familiar with the computer in its most highly visible form as the personal computer (PC) and its aliases such as desk-top or lap-top computer, engineering work-station, business computer, file server and so on. Applications of the PC cover wide areas from personal, scientific, engineering, business and commercial applications to Internet servers, Information Management Systems (e.g. patient records, airline reservations) and industrial process controllers.
And yet there are more computers installed all over the world in areas other than the ones just described. These are the computers embedded in machines, instruments and appliances such as cameras, stereos, TVs, VCRs, microwave ovens, computer printers, copiers, modems, mobile phones, wristwatches, automobiles, test and diagnostic equipment, scientific equipment, health-care instruments and the list goes on and on.
Embedded computers refer to both microprocessors as well as more complex computer systems applied to a specific task. In most cases, the user is unaware of the computational engine under the cover and does not care about the software or operating system being employed. Embedded processors range in size and cost from miniature 4-bit microprocessors costing under $1 to full blown Pentium or Power PC based systems costing in excess of $10,000.
(In July 2004 Freescale Semiconductor was created from the semiconductor division of Motorola.)
This course uses the Freescale MC9S08QG8 as a model microcontroller unit (MCU) because its architecture and programmer's model are relatively simple. The fundamentals are universal and can be applied to any other microprocessor. Moreover, we will promote the idea that bigger does not mean better. Smallness can be beautiful. There are many obvious advantages to being small such as being compact, portable, energy efficient, simple to design and manufacture, less costly and more reliable.
The Freescale MC9S08QG8 microprocessor is a low cost and yet very powerful 8-bit solution to many embedded processor applications. The HC08/HCS08 is a large family of processors available from Freescale in various input/output configurations based on the same CPU core.
Here are some of the highlights of the MC9S08QG8:
The MC9S08QG8/4 is available in 8-pin and 16-pin surface mount packages (SMD) and some versions in dual in-line packages (DIP). The MC9S08QG4 comes with 4Kbytes FLASH and 256 bytes RAM.
Figure 2.2 shows the register model of the HC08 and HCS08 CPU.
Figure 2.2 CPU Registers
The A accumulator is a general-purpose 8-bit register. In general, all mathematical and logical operations will be performed using this register. However, the Freescale family of microcontrollers can perform many operations directly on the first 256 bytes (page zero) of RAM. This is like having an additional 256 registers besides the A accumulator.
The H and X registers together constitute a 16-bit index register. Previous versions of the Motorola 6800 and 6805 families had a single 8-bit X index register. This was found to be somewhat lacking and hence the additional 8-bit H register was added to allow full 16-bit addressing capability. Note that for backward compatibility, some instructions operate only on the X register.
The program counter (PC) contains the address of the next instruction to be executed. Similarly, the stack pointer (SP) contains the address of the next free space in SRAM which is available for temporary storage. (Stacks are explained in Chapter 4). In general, the programmer can consider the handling of the PC and SP as an internal matter and may ignore the two for the time being.
The Condition Code Register (CCR) is an 8-bit register. It contains five status flags (V, H, N, Z and C) and the interrupt mask bit (I), which are used to monitor the results of arithmetic/logic operations and to control certain CPU functions.
Figure 2.3 - Condition Code Register
The Zero (Z) bit is set when the result of the last arithmetic, logical, or data manipulation is zero.
The Negative (N) bit reflects the state of the MSB of the result. For 2's complement notation, the N bit is set if the result is negative.
The Overflow (V) bit is used to indicate if an arithmetic overflow has occurred as a result of the operation.
The Carry (C) bit is used to indicate if a carry from an addition or a borrow from a subtraction has occurred. The C bit is also used in shift and rotate operations.
For the time being you may ignore the functions served by H and I bits.
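To make the meaning of the Z, N, V and C bits concrete, the following C sketch models an 8-bit addition and derives the four flags from the result. The type `add_flags_t` and the function `add8` are hypothetical names used only for illustration. The sketch follows the definitions given above and is not a cycle-accurate model of the CPU.

```c
#include <stdint.h>

/* Hypothetical record of an 8-bit sum and the four flags discussed above. */
typedef struct {
    uint8_t result;
    int z;  /* Zero: result is $00 */
    int n;  /* Negative: MSB of result is 1 */
    int v;  /* Overflow: 2's complement result does not fit in 8 bits */
    int c;  /* Carry: unsigned result does not fit in 8 bits */
} add_flags_t;

add_flags_t add8(uint8_t a, uint8_t m)
{
    add_flags_t f;
    unsigned sum = (unsigned)a + (unsigned)m;   /* 9-bit intermediate sum */

    f.result = (uint8_t)sum;
    f.z = (f.result == 0);
    f.n = (f.result & 0x80) != 0;
    f.c = (sum > 0xFF);
    /* Signed overflow occurs when both operands have the same sign
       and the sign of the result differs from it. */
    f.v = ((~(a ^ m)) & (a ^ f.result) & 0x80) != 0;
    return f;
}
```

Adding $7F and $01 gives $80, so N and V are set (the signed result overflowed) while C is clear. Adding $FF and $01 gives $00 with Z and C set and V clear.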
A computer program is a set of precise instructions stored in memory for the CPU to act on. Instructions on the HC08 vary in length from 1 to 4 bytes, depending on the individual instruction. This is characteristic of a Complex Instruction Set Computer (CISC) as compared to a Reduced Instruction Set Computer (RISC). For example, the Pentium is a CISC while the Power PC is a RISC. How does the CPU of a CISC know how many bytes make up the instruction? The first byte in the sequence dictates the type of instruction and therefore the number of bytes to follow for that instruction. It therefore makes no sense to start executing in the middle of an instruction, because the CPU would misinterpret operand bytes as opcodes.
A machine cycle is the length of time it takes to perform an internal hardware operation. The MC9S08QG8/4 can operate from an internal oscillator or from an external quartz crystal. On power-on reset, the internal oscillator will default to run at 16MHz which is then divided by 4 to give a bus clock of 4MHz. Thus, a machine cycle is 250ns in duration. To fetch and execute an instruction may require a number of machine cycles.
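The timing arithmetic above can be written out as follows. This is a minimal sketch; the function names are assumptions, the bus frequency is assumed to be non-zero, and the cycle count of any particular instruction must be taken from the data manual.

```c
#include <stdint.h>

/* Duration of one machine (bus) cycle in nanoseconds.
   Assumes bus_hz is not zero. */
uint32_t cycle_ns(uint32_t bus_hz)
{
    return 1000000000u / bus_hz;
}

/* Time in nanoseconds for an instruction that needs `cycles` machine cycles. */
uint32_t instruction_ns(uint32_t bus_hz, uint32_t cycles)
{
    return cycles * cycle_ns(bus_hz);
}
```

With the default 16MHz oscillator divided by 4, `cycle_ns(16000000u / 4u)` gives 250, in agreement with the 250ns machine cycle stated above.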
A collection of instructions (i.e. the program) is stored in memory at sequential locations and the CPU fetches and executes each instruction one at a time. The flow of execution is controlled by the CPU with the use of the program counter. This register keeps track of the address of the next byte to be fetched.
Normally, the CPU will fetch instructions from memory in sequential order unless instructed to do otherwise as in the case of a jump or branch instruction. In this case, the PC will be loaded with a new memory address and on the next fetch cycle the flow of program execution will be directed to a different location in memory.
Pieces of data are stored in memory in exactly the same fashion as instructions. The CPU has no way of distinguishing instructions from data. Of course, the programmer knows which locations are used for data storage and will create the program such that data locations are never executed. In the event that this occurs, the program will behave unexpectedly and "crash".
What is the difference between machine language and assembly language? Machine language or machine code is the native language or instructions that the computer executes. Information on a digital computer is encoded using a two-state mechanism, that is, either ON or OFF. Therefore all information is manipulated and stored using a binary number system. Machine language is inherently a binary system.
Assembly language and machine language both represent the same information. They represent the native binary codes which will be entered into the computer and constitute a coherent set of instructions which we call the program. The difference lies in the interpretation by humans.
Machine code is binary. Programming a computer using the binary system of notation is very time consuming, inefficient and prone to errors. By utilizing decimal, octal and hexadecimal notation, it is possible to reduce the amount of data entry and book-keeping and therefore reduce effort and likelihood of errors. For example, the machine code to stop the CPU is 10001110. It is easier to write this as $8E, and even better to write STOP. Using hexadecimal formats or meaningful words is simply a better way for humans to relate to the binary instruction.
Instead of using a numeric value, either written as binary or hexadecimal, to represent an instruction, it is easier to use an acronym or mnemonic. The use of a mnemonic which has a one-to-one equivalence to the machine code and its function is called assembly language programming. The computer does not understand assembly language. Assembly language is a mechanism to assist humans in creating and managing programs in machine code more effectively and efficiently. Thus, programmers write in assembly language. Computers execute machine code.
When a program is written in assembly language, it must be translated into machine code and into a form that can be entered into the memory of the computer. This process is called assembling and can be accomplished manually. However, this is prone to errors and an automated method is preferred. A computer program which translates assembly language programs into machine code is called an assembler.
Simulators and debuggers are other program development tools which assist programmers in testing and debugging assembly language programs.
Assembly language programming is best introduced through the use of simple examples. Let us suppose we wish to find the sum of 12 and 23. In standard algebra we could write:
RESULT = 12 + 23
In HC08 assembly language, we have to instruct the CPU at each step of the operation. Here is a program to do this:
    LDA     #12
    ADD     #23
    STA     RESULT
    STOP
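Here is a brief explanation of the mnemonics and what each line does. LDA #12 loads the constant 12 into accumulator A. ADD #23 adds the constant 23 to the contents of A. STA RESULT stores the contents of A in the memory location named RESULT. STOP halts the CPU.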
The machine codes for this program in hexadecimal notation are as follows:
A6 0C AB 17 B7 00 8E
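To see how the CPU steps through these seven bytes, here is a minimal C sketch of a fetch-and-execute loop. It understands only the four opcodes used in this example ($A6, $AB, $B7 and $8E), keeps the program bytes and the page-zero data in separate arrays for simplicity (on the real MCU they share one address space), and ignores the condition codes. The name `run_program` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Executes a program made only of LDA #imm, ADD #imm, STA dir and STOP.
   `ram` must point to 256 bytes standing in for page zero.
   Returns the number of instructions executed (including STOP),
   or -1 on an unknown opcode or a truncated program. */
int run_program(const uint8_t *code, size_t len, uint8_t *ram)
{
    size_t pc = 0;      /* program counter: index of the next byte to fetch */
    uint8_t a = 0;      /* accumulator A */
    int executed = 0;

    while (pc < len) {
        uint8_t opcode = code[pc++];        /* fetch the opcode byte */
        uint8_t operand;

        if (opcode == 0x8E) {               /* STOP: one byte, no operand */
            return executed + 1;
        }
        if (pc >= len) {                    /* operand byte is missing */
            return -1;
        }
        operand = code[pc++];               /* the other opcodes take one operand byte */

        switch (opcode) {
        case 0xA6: a = operand;                break;  /* LDA #imm */
        case 0xAB: a = (uint8_t)(a + operand); break;  /* ADD #imm */
        case 0xB7: ram[operand] = a;           break;  /* STA dir  */
        default:   return -1;                          /* not in this subset */
        }
        executed++;
    }
    return -1;  /* ran off the end without reaching STOP */
}
```

Running the sketch on the bytes A6 0C AB 17 B7 00 8E executes four instructions and leaves 35 ($23) in location $00 of the `ram` array.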
This program is of little use since we know the answer to be 35, a priori. Instead, it would be more useful to add two numbers whose values may be not fixed, i.e., what we call variables. Let us rephrase our example as:
RESULT = NUM1 + NUM2
The assembly language program to accomplish this may look as follows:
    LDA     NUM1
    ADD     NUM2
    STA     RESULT
    STOP
The difference between the two examples above is the use of the immediate symbol (#) in the first two lines. In the first example, the # specifies that the numeric data is a constant value and this value is found in the byte or pair of bytes following the instruction op-code byte. This is called immediate addressing mode. In the second example, the omission of the # specifies that the symbols NUM1 and NUM2 represent addresses and the numeric data are stored at these memory addresses.
Thus, note the difference between the following two instructions:
    LDA     #4
    LDA     4
The first instruction loads the value 4 into accumulator A. The second loads the contents of memory address 4 into accumulator A.
Misuse of the # symbol is a common source of programming error which the assembler cannot detect. For example, the following instructions add the contents of memory addresses 4 and 5, not the constants 4 and 5:
    LDA     4
    ADD     5
In general, it is best not to use absolute addresses of variables. Instead, let the assembler/compiler assign memory addresses to symbols as it sees fit.
What is the difference between the following statements:
    LDA     #123
    LDA     #$7B
    LDA     #%01111011
The answer is none. 123, $7B and %01111011 are just different notations or representations for us humans to represent the same quantity or value which is One Hundred and Twenty Three. The bit pattern stored on the computer is 01111011 and is always binary.
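The same point can be demonstrated in C. The sketch below converts a number written in any of the three assembler notations into its value. `parse_number` is a hypothetical helper written for this illustration, and it does no error checking.

```c
#include <stdlib.h>

/* Hypothetical helper: interpret a number written in assembler notation.
   A leading '$' means hexadecimal, a leading '%' means binary,
   anything else is taken as decimal. */
long parse_number(const char *text)
{
    if (text[0] == '$') {
        return strtol(text + 1, NULL, 16);
    }
    if (text[0] == '%') {
        return strtol(text + 1, NULL, 2);
    }
    return strtol(text, NULL, 10);
}
```

Calling `parse_number` with "123", "$7B" and "%01111011" returns the same value each time, because the three strings are only different spellings of one bit pattern.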
Figure 2.3 MC9S08QG8 Memory Map
The memory map shows that the first 96 memory addresses ($0000 to $005F) are reserved for the most commonly used internal hardware registers. This leaves 160 RAM locations on page zero, from address $0060 to $00FF, which can be accessed using direct addressing. Eighty high-page registers from $1800 to $184F are used for less frequently accessed internal hardware registers. To access these locations, extended or indexed addressing modes are required.
Program code is normally stored in the FLASH memory area which can be erased and reprogrammed many times.
The complete HC08 instruction set is listed here to familiarize you with the HC08 capabilities. For more information, see the HCS08 Instruction Set Summary. There are six addressing modes which refer to the different ways parameters are accessed. These modes are inherent, immediate, direct, extended, indexed and relative.
In inherent addressing mode, all of the information is contained in the instruction byte. The operands (if any) are registers and no memory reference is required. These are one or two byte instructions. Here is a list of the inherent instructions:
Mathematical Operations | ||
CLRA | Clear A | |
CLRX | Clear X | |
CLRH | Clear H | |
COMA | 1's Complement A | |
COMX | 1's Complement X | |
DAA | Decimal Adjust A | |
DECA | Decrement A | |
DECX | Decrement X | |
INCA | Increment A | |
INCX | Increment X | |
NEGA | 2's Complement A | |
NEGX | 2's Complement X | |
NSA | Nibble Swap A | |
TSTA | Test A | |
TSTX | Test X | |
Shift Operations | ||
ASLA | Arithmetic Shift Left A | |
ASLX | Arithmetic Shift Left X | |
ASRA | Arithmetic Shift Right A | |
ASRX | Arithmetic Shift Right X | |
LSLA | Logical Shift Left A (same as ASLA) | |
LSLX | Logical Shift Left X (same as ASLX) | |
LSRA | Logical Shift Right A | |
LSRX | Logical Shift Right X | |
ROLA | Rotate Left A through Carry | |
ROLX | Rotate Left X through Carry | |
RORA | Rotate Right A through Carry | |
RORX | Rotate Right X through Carry | |
Inter-Register Operations | ||
TAX | Transfer A to X | |
TXA | Transfer X to A | |
TAP | Transfer A to Condition Code Register | |
TPA | Transfer Condition Code Register to A | |
TSX | Transfer (SP) + 1 to H:X | |
TXS | Transfer (H:X) -1 to SP | |
PSHA | Push A onto Stack | |
PSHX | Push X onto Stack | |
PSHH | Push H onto Stack | |
PULA | Pull A from Stack | |
PULX | Pull X from Stack | |
PULH | Pull H from Stack | |
Flag Operations | ||
CLC | Clear Carry | |
CLI | Clear Interrupt Mask | |
SEC | Set Carry | |
SEI | Set Interrupt Mask | |
Miscellaneous Operations | ||
BGND | Enter background debug mode | |
DIV | Unsigned Divide, (H:A)/(X) => A, remainder => H | |
MUL | Unsigned Multiply (X) × (A) => (X:A) | |
NOP | No Operation | |
RTI | Return from Interrupt | |
RTS | Return from Subroutine | |
STOP | Stop Microprocessor | |
SWI | Software Interrupt | |
WAIT | Wait for Interrupt |
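The register usage of MUL and DIV is easy to get wrong, so here is a C sketch of what the two instructions compute according to the table above. The function names are hypothetical. The handling of division by zero and of a quotient too large for A is a simplification: the sketch just returns -1 and leaves the registers unchanged, so consult the reference manual for what the real CPU does in those cases.

```c
#include <stdint.h>

/* MUL: unsigned (X) x (A). The high byte of the 16-bit product
   goes to X and the low byte goes to A. */
void mul8(uint8_t *x, uint8_t *a)
{
    uint16_t product = (uint16_t)((unsigned)*x * (unsigned)*a);

    *x = (uint8_t)(product >> 8);
    *a = (uint8_t)(product & 0xFF);
}

/* DIV: unsigned (H:A) / (X). The quotient goes to A and the
   remainder goes to H. Returns 0 on success, -1 if the divisor
   is zero or the quotient does not fit in 8 bits (simplified). */
int div16by8(uint8_t *h, uint8_t *a, uint8_t x)
{
    uint16_t dividend;
    uint16_t quotient;

    if (x == 0) {
        return -1;
    }
    dividend = (uint16_t)(((unsigned)*h << 8) | *a);
    quotient = (uint16_t)(dividend / x);
    if (quotient > 0xFF) {
        return -1;
    }
    *a = (uint8_t)quotient;
    *h = (uint8_t)(dividend % x);
    return 0;
}
```

For instance, multiplying 200 by 100 gives 20000 ($4E20), so X ends up holding $4E and A holds $20. Dividing $0100 (256) by 10 leaves 25 in A and the remainder 6 in H.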
The HC08 is capable of performing arithmetic and logical operations on memory locations as well as the A and X registers. Thus the programming model is not restricted to just 8-bit entities. This makes implementation of multiple-precision arithmetic possible.
Some 16-bit operations are simplified using the H:X register combination. When performing 16-bit memory transfers, big-endian byte order is used. That is, bytes are stored with the high byte appearing first, followed by the low byte. This is opposite to Intel processors which use little-endian byte order.
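Big-endian byte order can be illustrated with a short C sketch that stores and reloads a 16-bit value one byte at a time, high byte first. Memory is modelled here as a plain 64K byte array, and the function names `store16_be` and `load16_be` are hypothetical.

```c
#include <stdint.h>

/* Store a 16-bit value in big-endian order: high byte at `addr`,
   low byte at `addr + 1`. `mem` is assumed to hold 65536 bytes. */
void store16_be(uint8_t *mem, uint16_t addr, uint16_t value)
{
    mem[addr] = (uint8_t)(value >> 8);
    mem[(uint16_t)(addr + 1)] = (uint8_t)(value & 0xFF);
}

/* Load a 16-bit value stored in big-endian order. */
uint16_t load16_be(const uint8_t *mem, uint16_t addr)
{
    return (uint16_t)((mem[addr] << 8) | mem[(uint16_t)(addr + 1)]);
}
```

Storing $1234 at address $0080 places $12 in location $0080 and $34 in location $0081. A little-endian processor would place the two bytes the other way around.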
In the immediate addressing mode, the actual argument is contained in the one or two bytes immediately following the instruction byte, where the number of bytes must match the size of the register being used. Thus the actual constant value is stored as part of the sequence of bytes that make up the instruction. This mode is selected when the # symbol precedes the argument. Examples:
    LDA     #23
    LDHX    #ONE
    AND     #$F0
In the direct addressing mode (also called page zero addressing) a single byte is used to specify the least significant byte of the 16-bit memory address of the parameter to be accessed. The MSB of this effective address is assumed to be $00. Therefore only addresses $0000 to $00FF are accessible using direct addressing. Instructions using direct addressing are two byte instructions and therefore make more efficient use of machine cycles and memory space. Examples:
    STA     PTAD
    LDX     RESULT
    LSR     NUM
In the extended addressing mode, two bytes are required to specify the full 16-bit effective address of the parameter to be referenced. Hence the full range of address from $0000 to $FFFF can be specified. Examples:
    STA     SOPT1
    LDHX    table
    JSR     output
In indexed addressing mode, the 16-bit H:X index register is used in calculating the effective address. The address contained in the index register H:X can be used as is or an 8-bit or 16-bit unsigned offset can be added to the contents of H:X to form the effective address. Indexed addressing can also be performed using the stack pointer SP as the index register.
    LDA     ,X
    ADD     3,X
    LSL     NAME,X
    COM     4,SP
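The effective address calculations for the direct, extended and indexed modes described above can be summarised in C. The function names are hypothetical, and the sketch assumes that a 16-bit sum which passes $FFFF simply wraps around to $0000.

```c
#include <stdint.h>

/* Direct: one operand byte gives the low byte; the high byte is $00. */
uint16_t ea_direct(uint8_t operand)
{
    return (uint16_t)operand;
}

/* Extended: two operand bytes give the full 16-bit address, high byte first. */
uint16_t ea_extended(uint8_t high, uint8_t low)
{
    return (uint16_t)(((unsigned)high << 8) | low);
}

/* Indexed: the unsigned offset (0 for the no-offset form) is added to H:X.
   Assumption: the sum wraps around within 16 bits. */
uint16_t ea_indexed(uint8_t h, uint8_t x, uint16_t offset)
{
    uint16_t hx = (uint16_t)(((unsigned)h << 8) | x);

    return (uint16_t)(hx + offset);
}
```

With H:X holding $0080, the operand `3,X` refers to address $0083, while `,X` refers to $0080 itself.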
The HC08 does not have an indirect addressing mode. This information is included to round out the discussion on memory addressing modes. Indirect addressing is one of the most powerful and sometimes confusing features available on most computers and yet the concept is fairly simple. With direct (as well as extended) addressing the effective address is specified in the instruction bytes. In indirect addressing mode, the memory location specified by the instruction bytes contains the effective address of the parameter.
In many programming operations, we are not so much concerned about the actual contents of a variable but more about the location of the variable. That is, many times our focus is on the address of a variable and how to manipulate this address. In high level languages such as Pascal and C, structures and pointers rely heavily on the use of indirect addressing. On the HC08, indexed addressing mode is used to implement indirect addressing.
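The idea of implementing indirect addressing with the index register can be sketched in C. A two-byte pointer stored in memory is first loaded into H:X (as LDHX would do) and H:X is then used as the effective address (as LDA ,X would do). Memory is modelled as a 64K byte array and `load_indirect` is a hypothetical name.

```c
#include <stdint.h>

/* Read the byte whose address is stored, high byte first, at `ptr_addr`.
   `mem` is assumed to hold 65536 bytes. */
uint8_t load_indirect(const uint8_t *mem, uint16_t ptr_addr)
{
    /* Step 1, like LDHX ptr_addr: fetch the 16-bit address into H:X. */
    uint16_t hx = (uint16_t)((mem[ptr_addr] << 8) |
                             mem[(uint16_t)(ptr_addr + 1)]);

    /* Step 2, like LDA ,X: use H:X as the effective address. */
    return mem[hx];
}
```

This is the same operation that a C compiler performs when it dereferences a pointer variable.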
Memory-Accumulator Operations | ||
ADC | Add with Carry to A | |
ADD | Add Memory to A | |
AND | AND A with Memory | |
BIT | Bit Test A with Memory | |
CMP | Compare A with Memory | |
CPHX | Compare H:X with 16 bits in Memory | |
CPX | Compare X with Memory | |
EOR | Exclusive OR A with Memory | |
LDA | Load Accumulator A with Memory | |
LDHX | Load Index Register H:X with 16 bits | |
LDX | Load Index Register X with 8 bits | |
ORA | OR A with Memory | |
SBC | Subtract with Carry from A | |
STA | Store Accumulator A | |
STHX | Store H:X to Memory | |
STX | Store Index Register X | |
SUB | Subtract Memory from A | |
Memory-Only Operations | ||
ASL | Arithmetic Shift Left | |
ASR | Arithmetic Shift Right | |
CLR | Clear Memory | |
COM | 1's Complement | |
DEC | Decrement Memory | |
INC | Increment Memory | |
LSL | Logical Shift Left (same as ASL) | |
LSR | Logical Shift Right | |
MOV | Move from Memory to Memory | |
NEG | 2's Complement | |
ROL | Rotate Left | |
ROR | Rotate Right | |
TST | Test for zero or minus |
The relative addressing mode is used only for branch instructions. If the branch condition is true, the 8-bit signed integer following the instruction opcode is added to the current contents of the program counter (PC), which by then points to the next instruction, to form the effective branch address. If the branch is not taken, program execution continues with the next instruction. Examples:
            BNE     main
            BSR     putc
    loop    BRCLR   bit4,flags,loop
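The branch address calculation can be sketched in C as follows. The function name `branch_target` is hypothetical, and `next_pc` stands for the address of the instruction that follows the branch. Because the offset is an 8-bit signed value, a branch can reach from 128 bytes before that address to 127 bytes after it.

```c
#include <stdint.h>

/* Returns the address from which the next instruction will be fetched.
   `rel` is the raw offset byte taken from the instruction. */
uint16_t branch_target(uint16_t next_pc, uint8_t rel, int taken)
{
    int offset;

    if (!taken) {
        return next_pc;             /* fall through to the next instruction */
    }
    /* Interpret the byte as a 2's complement number in the range -128..+127. */
    offset = (rel & 0x80) ? (int)rel - 256 : (int)rel;
    return (uint16_t)(next_pc + offset);
}
```

An offset byte of $FE is -2 in 2's complement. For a two-byte branch instruction this points back to the branch itself, which is how a "branch to self" loop is encoded.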
Branching - Unsigned Arithmetic | ||
BHI | Branch if Higher | |
BHS | Branch if Higher or Same (same as BCC) | |
BLO | Branch if Lower (same as BCS) | |
BLS | Branch if Lower or Same | |
Branching - 2's Complement Signed Arithmetic | ||
BGE | Branch if Greater than or Equal to zero | |
BGT | Branch if Greater Than zero | |
BLE | Branch if Less than or Equal to zero | |
BLT | Branch if Less Than zero | |
General Branching | ||
BCC | Branch if Carry is Clear | |
BCS | Branch if Carry is Set | |
BEQ | Branch if EQual to zero | |
BMI | Branch if MInus | |
BNE | Branch if Not Equal to zero | |
BPL | Branch if PLus | |
BRA | BRanch Always | |
BRN | BRanch Never | |
BVC | Branch if oVerflow is Clear | |
BVS | Branch if oVerflow is Set | |
BSR | Branch to SubRoutine | |
Long Branch | ||
JMP | Jump to new location (16-bit address) | |
JSR | Jump to SubRoutine (16-bit address) | |
(Technically speaking, these are not relative branch instructions but are absolute jumps using extended addressing mode. These two instructions are listed here to complete the list of branch instructions.)
Another feature of the HC08 is the ability to set or clear any individual bit of RAM or I/O register using the BSET and BCLR instructions. Program branching can also take place based on the value of any specified bit using the BRSET and BRCLR instructions.
Bit Set/Clear | ||
BSET n,dir | Bit Set | |
BCLR n,dir | Bit Clear | |
Branching if Bit Set/Clear | ||
BRSET n,dir,rel | Branch if bit n is set | |
BRCLR n,dir,rel | Branch if bit n is clear | |
Examples:
    BSET    0,NUM       ;set bit-0 of NUM
    BCLR    7,PTAD      ;clear bit-7 of PORTA
    BRSET   2,NUM,MAIN  ;branch if bit-2 of NUM is set
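For comparison, here is how the same bit operations look when written with masks in C. This is a sketch only: page zero is modelled as a 256-byte array, the bit number `n` is assumed to be in the range 0 to 7, and the function names are hypothetical.

```c
#include <stdint.h>

/* Like BSET n,dir: set bit n of the byte at direct address `dir`. */
void bit_set(uint8_t *mem, uint8_t dir, unsigned n)
{
    mem[dir] |= (uint8_t)(1u << n);
}

/* Like BCLR n,dir: clear bit n of the byte at direct address `dir`. */
void bit_clear(uint8_t *mem, uint8_t dir, unsigned n)
{
    mem[dir] &= (uint8_t)~(1u << n);
}

/* The condition tested by BRSET n,dir,rel: 1 if bit n is set, else 0.
   BRCLR branches on the opposite condition. */
int bit_is_set(const uint8_t *mem, uint8_t dir, unsigned n)
{
    return (int)((mem[dir] >> n) & 1u);
}
```

Each of these C functions performs a read, a mask operation and a write, whereas BSET and BCLR do the same job in a single instruction without using the A accumulator.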
There are instructions added to the HC08 and HCS08 instruction set which improve on the previous HC05 instruction set. These are primarily instructions added to handle the new H register and instructions added to manage the stack and the stack pointer. Other improvements are the MOV instruction which does not require the A accumulator, compare and branch if equal, CBEQ, and decrement and branch if not zero, DBNZ.
Move | ||
MOV #imm,dir | Move immediate to direct location | |
MOV dir,dir | Move direct to direct | |
MOV dir,X+ | Move direct to location addressed by H:X and increment H:X | |
MOV X+,dir | Move byte from location addressed by H:X to direct and increment H:X | |
Compare and Branch if Equal | ||
CBEQA #imm,rel | Branch if A = imm | |
CBEQX #imm,rel | Branch if X = imm | |
CBEQ dir,rel | Branch if A = byte at direct location | |
CBEQ X+,rel | Branch if A = byte at H:X, post increment H:X | |
CBEQ disp,X+,rel | Branch if A = byte at (H:X + disp), post increment H:X | |
CBEQ disp,SP,rel | Branch if A = byte at (SP + disp) | |
Decrement and Branch if Not Zero | ||
DBNZA rel | Decrement A and branch if A is not zero | |
DBNZX rel | Decrement X and branch if X is not zero | |
DBNZ dir,rel | Decrement byte at location dir and branch if not zero | |
DBNZ X,rel | Decrement byte at H:X and branch if not zero | |
DBNZ disp,X,rel | Decrement byte at (H:X + disp) and branch if not zero | |
DBNZ disp,SP,rel | Decrement byte at (SP + disp) and branch if not zero | |
Load H:X (big-endian byte order, i.e. H first) | ||
LDHX #imm | Load H:X with immediate 16 bits | |
LDHX mem | Load H with (mem) and X with (mem + 1) | |
LDHX ,X | Load H:X with 16 bits from memory at (H:X) | |
LDHX disp,X | Load H:X with 16 bits from (H:X + disp) | |
LDHX disp8,SP | Load H:X with 16 bits from (SP + disp8) | |
Store H:X (big-endian byte order, i.e. H first) | ||
STHX mem | Store H into (mem) and X into (mem + 1) | |
STHX disp8,SP | Store H:X into (SP + disp8) |
Miscellaneous | ||
AIS #imm | Add 8-bit signed value to SP | |
AIX #imm | Add 8-bit signed value to H:X | |
BGND | Enter background debug mode (if ENBDM = 1 in BDC control register) | |
CLRH | Clear H | |
DAA | Decimal Adjust Accumulator | |
DIV | Divide (H:A) by X, result => A, remainder => H | |
MUL | Multiply A by X, result => X:A | |
NSA | Nibble Swap Accumulator | |
PSHA | Push A onto stack | |
PSHH | Push H onto stack | |
PSHX | Push X onto stack | |
PULA | Pull A from stack | |
PULH | Pull H from stack | |
PULX | Pull X from stack | |
RSP | Reset Stack Pointer, SP <= $FF | |
What is an opcode map? Opcode refers to the machine code for each operation or instruction. For example, the opcode for the CLRA instruction is $4F. Since the HCS08 is based on an 8-bit opcode, theoretically, there are 256 possible instructions. The opcode or first byte defines the instruction and dictates how many additional bytes are required for the complete instruction. An opcode map is a 16 x 16 table showing all possible 256 instructions ordered according to the actual opcode in hexadecimal representation.
In practice, when there are more than 256 instructions, the designer of the MCU uses one or more codes out of the set of 256 to create an extension, escape code, or page-2 set of instructions. The HCS08 uses opcode $9E as the page-2 identifier. For efficiency reasons, extensions or page-2 instructions tend to be less often used instructions.
The opcode map also shows, for your convenience, the number of bytes, the number of machine cycles and the addressing mode along with the opcode in hexadecimal representation.
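Locating an instruction in a 16 x 16 opcode map amounts to splitting the opcode byte into its two hexadecimal digits. The C sketch below does this. The names are hypothetical, and whether the high digit selects the column or the row depends on how a particular map is drawn, so check the headings of the map you are using.

```c
#include <stdint.h>

/* Hypothetical record of the two hexadecimal digits of an opcode. */
typedef struct {
    unsigned high_nibble;   /* upper hexadecimal digit, 0..15 */
    unsigned low_nibble;    /* lower hexadecimal digit, 0..15 */
} opcode_cell_t;

opcode_cell_t opcode_cell(uint8_t opcode)
{
    opcode_cell_t cell;

    cell.high_nibble = opcode >> 4;
    cell.low_nibble = opcode & 0x0F;
    return cell;
}

/* 1 if the byte is the page-2 identifier ($9E), otherwise 0. */
int is_page2_prefix(uint8_t opcode)
{
    return opcode == 0x9E;
}
```

For the CLRA opcode $4F, the two digits are 4 and F (15), which identify one cell out of the 256 in the map.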
In assembler, instruction mnemonics are not case sensitive. However, user names and labels are case sensitive in both assembler and C. In assembler, comments may begin with a semicolon or //. In C, single comment lines begin with //. A block of code can be treated as comments if enclosed by /* */.
In C, hexadecimal notation is preceded by 0x. In assembler, hexadecimal notation is preceded by 0x or $.
Examples:
    // you can embed assembler code in a C procedure using the asm { } structure
    asm
    {
        BSET    0,NUM       ; this is a comment in assembler
        lda     $E3         // instructions are not case sensitive
    }
    // this is a comment in C
    SOPT1 = 0x52;
    /*
    In C you can comment out
    a block of code or text
    */
MC9S08QG8 Instruction Set Summary
MC9S08QG8 Data Manual, 300 pages, 3Mb pdf