Chapter 2 Computer Architecture & Machine Language Programming

Objectives: To instill an understanding of how digital computers work from both a hardware and software perspective.

References:

Old Chapter 2 for HC11

MC9S08QG8 Resource page

MC9S08QG8 Data Manual, 300 pages, 3Mb pdf

The essential components of the digital computer are shown in Figure 2.1.

The Central Processing Unit (CPU) is the brain of the computer. It performs the essential task of fetching instructions from program memory and executing these instructions. The CPU contains the Arithmetic/Logic Unit (ALU) which performs the various arithmetic and boolean operations. The design of the ALU gives each computer its distinctive flavour. This is what makes the Intel 486 processor different from the Motorola 68000.

Memory is used for storage of program instructions as well as data. A unit of storage is called a word. Computers have different word lengths, for example, 4, 8, 12, 14, 16, 32 and 64-bit words are in common use today. Microcomputers generally use a word length of 8 bits, or a byte. Memory is organized as a linear array of words, a word being the smallest collection of bits which can be accessed by the CPU. Each word is identified by its address or location in memory. Thus a computer with 64K words of memory (65536) would require a memory address register that is 16 bits wide.
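The relationship between memory size and address width can be checked with a few lines of C. This is only a sketch; the function name address_bits is made up for illustration and is not part of any Freescale tool.

```c
#include <stdint.h>

/* Smallest number of address bits able to select any one of `words`
   memory locations. (Hypothetical helper, for illustration only.) */
unsigned address_bits(uint32_t words)
{
    unsigned bits = 0;

    /* Each extra address line doubles the number of reachable words. */
    while (((uint64_t)1 << bits) < words) {
        bits++;
    }
    return bits;
}
```

Calling address_bits(65536) gives 16, which agrees with the 16-bit memory address register mentioned above.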

Semiconductor memories have virtually replaced older forms of memory such as magnetic core memory. Because of their high density and low cost, dynamic random-access memory (DRAM) chips are used in modern computers. These have the drawback that they must be refreshed continuously. That is, additional control circuitry is required to access banks of memory periodically; otherwise the electronic charge in each memory cell would be lost. DRAM is also volatile, i.e., all data is lost after the power is turned off. Static random-access memories (SRAM) are faster than DRAM, use less power and do not require refreshing. SRAM, like DRAM, is also volatile. There are many applications where non-volatile storage is required. Read-only memories (ROM) are used to store program code or information that never changes. To be cost effective, these must be programmed by the manufacturer in large quantities. UV-EPROM (ultra-violet erasable programmable ROM) can be erased and reprogrammed by the user using a special UV lamp and programming equipment. EEPROM (electrically erasable PROM) has the advantage of being erasable and programmable in situ. Flash memory is another form of EEPROM. Because of their higher density, block orientation and longer erase times, flash devices are more suited as replacements for hard disk drives or as removable storage devices such as CompactFlash™ cards and memory sticks.

Without Input/Output (I/O) capabilities a computer would not be of much use. I/O is any means of getting information into and out of the computer. This could be as simple as switches and lights or more traditional I/O devices such as the keyboard, mouse and video display. Modems, sound cards, analog-to-digital and digital-to-analog converters, serial and parallel ports, timers and counters are all examples of I/O devices.

Storage devices fall into another category of I/O devices. Hard disk drives, floppy disk drives and CD-ROM drives are mass storage devices for keeping large amounts of information. While these may not be perceived as part of the I/O capabilities of the computer, from the hardware perspective they are interfaced to the computer via the I/O bus.

Embedded Computers

We are familiar with the computer in its most highly visible form as the personal computer (PC) and its aliases such as desk-top or lap-top computer, engineering work-station, business computer, file server and so on. Applications of the PC cover wide areas from personal, scientific, engineering, business and commercial applications to Internet servers, Information Management Systems (e.g. patient records, airline reservations) and industrial process controllers.

And yet there are more computers installed all over the world in areas other than the ones just described. These are the computers embedded in machines, instruments and appliances such as cameras, stereos, TVs, VCRs, microwave ovens, computer printers, copiers, modems, mobile phones, wristwatches, automobiles, test and diagnostic equipment, scientific equipment, health-care instruments and the list goes on and on.

Embedded computers refer to both microprocessors as well as more complex computer systems applied to a specific task. In most cases, the user is unaware of the computational engine under the cover and does not care about the software or operating system being employed. Embedded processors range in size and cost from miniature 4-bit microprocessors costing under $1 to full blown Pentium or PowerPC based systems costing in excess of $10,000.

Freescale MC9S08QG8

(In July 2004 Freescale Semiconductor was created from the semiconductor division of Motorola.)

This course uses the Freescale MC9S08QG8 as a model microcontroller unit (MCU) because its architecture and programmer's model are relatively simple. The fundamentals are universal and can be applied to any other microprocessor. Moreover, we will promote the idea that bigger does not mean better. Smallness can be beautiful. There are many obvious advantages to being small such as being compact, portable, energy efficient, simple to design and manufacture, less costly and more reliable.

The Freescale MC9S08QG8 microprocessor is a low cost and yet very powerful 8-bit solution to many embedded processor applications. The HC08/HCS08 is a large family of processors available from Freescale in various input/output configurations based on the same CPU core.

Here are some of the highlights of the MC9S08QG8:

20MHz HCS08 CPU core
8Kbytes FLASH
512 bytes SRAM
Serial Peripheral Interface (SPI)
Serial Communications Interface (SCI)
8-channel 10-bit ADC
8-bit modulo timer
2-channel pulse-width modulation (PWM)
12 lines of general purpose digital input/output
Background debugging capability using a single pin, BGND
Internal 16MHz oscillator
8-pin DIP - MC9S08QG4 only
16-pin DIP - MC9S08QG8 only

The MC9S08QG8/4 is available in 8-pin and 16-pin surface mount packages (SMD) and some versions in dual in-line packages (DIP). The MC9S08QG4 comes with 4Kbytes FLASH and 256 bytes RAM.

HCS08 ALU

Reference: HCS08 CPU

Figure 2.2 shows the register model of the ALU of the HC08 and HCS08 CPU.

Figure 2.2 CPU Registers

The A accumulator is a general-purpose 8-bit register. In general, all mathematical and logical operations will be performed using this register. However, the Freescale family of microcontrollers can perform many operations directly on the first 256 bytes (page zero) of RAM. This is like having an additional 256 registers besides the A accumulator.

The H and X registers together constitute a 16-bit index register. Previous versions of the Motorola 6800 and 6805 families had a single 8-bit X index register. This was found to be somewhat lacking and hence the additional 8-bit H register was added to allow full 16-bit addressing capability. Note that for backward compatibility, some instructions operate only on the X register.

The program counter (PC) contains the address of the next instruction to be executed. Similarly, the stack pointer (SP) contains the address of the next free space in SRAM which is available for temporary storage. (Stacks are explained in Chapter 4.) In general, the programmer can consider the handling of the PC and SP as an internal matter and may ignore the two for the time being.

The Condition Code Register (CCR) is an 8-bit register containing flags and control bits used to monitor the results of arithmetic/logic operations and to control certain CPU functions.

Figure 2.3 - Condition Code Register

The Zero (Z) bit is set when the result of the last arithmetic, logical, or data manipulation is zero.

The Negative (N) bit reflects the state of the MSB of the result. For 2's complement notation, the N bit is set if the result is negative.

The Overflow (V) bit is used to indicate if an arithmetic overflow has occurred as a result of the operation.

The Carry (C) bit is used to indicate if a carry from an addition or a borrow from a subtraction has occurred. The C bit is also used in shift and rotate operations.

For the time being you may ignore the functions served by H and I bits.
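To make the Z, N, V and C definitions concrete, here is a minimal C sketch of how these four flags could be derived for an 8-bit addition. The type add_flags_t and the function add8 are hypothetical names chosen for this illustration; the H and I bits are left out, as in the text.

```c
#include <stdint.h>
#include <stdbool.h>

/* Result of an 8-bit addition together with four condition code flags.
   (Hypothetical type, for illustration only.) */
typedef struct {
    uint8_t result;
    bool z, n, v, c;
} add_flags_t;

add_flags_t add8(uint8_t a, uint8_t m)
{
    add_flags_t f;
    uint16_t wide = (uint16_t)(a + m);   /* keep the ninth bit */

    f.result = (uint8_t)wide;
    f.z = (f.result == 0);               /* Z: result is zero            */
    f.n = (f.result & 0x80) != 0;        /* N: copy of the result's MSB  */
    f.c = wide > 0xFF;                   /* C: carry out of bit 7        */
    /* V: both operands have the same sign but the result's sign differs */
    f.v = ((~(a ^ m)) & (a ^ f.result) & 0x80) != 0;
    return f;
}
```

For example, add8(0x7F, 0x01) yields $80 with N and V set: two positive 2's complement numbers produced a negative result, which is an overflow. By contrast, add8(0xFF, 0x01) yields $00 with Z and C set.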

Machine Language vs Assembly Language

What is a computer program?
What is a computer instruction?
What is machine language or machine code?
What is assembly language?
How does one program the computer in assembly language?

A computer program is a set of precise instructions stored in memory for the CPU to act on. Instructions on the HC08 vary in length from 1 to 4 bytes, depending on the individual instruction. This is characteristic of a Complex Instruction Set Computer (CISC) as compared to a Reduced Instruction Set Computer (RISC). For example, the Pentium is a CISC while the PowerPC is a RISC. How does the CPU of a CISC know how many bytes make up the instruction? The first byte in the sequence dictates the type of instruction and therefore the number of bytes to follow for that instruction. It therefore makes no sense to begin execution in the middle of an instruction: the CPU would misinterpret an operand byte as an opcode and lose synchronization with the program.

A machine cycle is the length of time it takes to perform an internal hardware operation. The MC9S08QG8/4 can operate from an internal oscillator or from an external quartz crystal. On power-on reset, the internal oscillator will default to run at 16MHz which is then divided by 4 to give a bus clock of 4MHz. Thus, a machine cycle is 250ns in duration. To fetch and execute an instruction may require a number of machine cycles.
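The timing arithmetic above can be written out as a short C sketch. The function names are hypothetical, and the cycle count passed to instruction_ns is whatever the instruction set summary lists for a given instruction; no particular instruction is assumed here.

```c
/* Duration of one machine (bus) cycle in nanoseconds, given the
   oscillator frequency in Hz and the divider applied to it.
   (Hypothetical helper, for illustration only.) */
unsigned long bus_cycle_ns(unsigned long oscillator_hz, unsigned long divider)
{
    unsigned long bus_hz = oscillator_hz / divider;

    return 1000000000UL / bus_hz;
}

/* Time in nanoseconds taken by an instruction that needs `cycles`
   machine cycles. */
unsigned long instruction_ns(unsigned long cycles,
                             unsigned long oscillator_hz,
                             unsigned long divider)
{
    return cycles * bus_cycle_ns(oscillator_hz, divider);
}
```

With the power-on defaults, bus_cycle_ns(16000000UL, 4) returns 250, and an instruction needing three machine cycles would take 750 ns.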

A collection of instructions (i.e. the program) is stored in memory at sequential locations and the CPU fetches and executes each instruction one at a time. The flow of execution is controlled by the CPU with the use of the program counter. This register keeps track of the address of the next byte to be fetched.

Normally, the CPU will fetch instructions from memory in sequential order unless instructed to do otherwise as in the case of a jump or branch instruction. In this case, the PC will be loaded with a new memory address and on the next fetch cycle the flow of program execution will be directed to a different location in memory.

Pieces of data are stored in memory in exactly the same fashion as instructions. The CPU has no way of distinguishing instructions from data. Of course, the programmer knows which locations are used for data storage and will create the program such that data locations are never executed. If data bytes are ever executed as instructions, the program will behave unexpectedly and "crash".

Machine Language & Assembly Language

What is the difference between machine language and assembly language? Machine language or machine code is the native language or instructions that the computer executes. Information on a digital computer is encoded using a two-state mechanism, that is, either ON or OFF. Therefore all information is manipulated and stored using a binary number system. Machine language is inherently a binary system.

Assembly language and machine language both represent the same information. They represent the native binary codes which will be entered into the computer and constitute a coherent set of instructions which we call the program. The difference lies in the interpretation by humans.

Machine code is binary. Programming a computer using the binary system of notation is very time consuming, inefficient and prone to errors. By utilizing decimal, octal and hexadecimal notation, it is possible to reduce the amount of data entry and book-keeping and therefore reduce effort and likelihood of errors. For example, the machine code to stop the CPU is 10001110. It is easier to write this as $8E, and even better to write STOP. Using hexadecimal formats or meaningful words is simply a better way for humans to relate to the binary instruction.

Instead of using a numeric value, either written as binary or hexadecimal, to represent an instruction, it is easier to use an acronym or mnemonic. The use of a mnemonic which has a one-to-one equivalence to the machine code and its function is called assembly language programming. The computer does not understand assembly language. Assembly language is a mechanism to assist humans in creating and managing programs in machine code more effectively and efficiently. Thus, programmers write in assembly language. Computers execute machine code.

When a program is written in assembly language, it must be translated into machine code and into a form that can be entered into the memory of the computer. This process is called assembling and can be accomplished manually. However, this is prone to errors and an automated method is preferred. A computer program which translates assembly language programs into machine code is called an assembler.

Simulators and debuggers are other program development tools which assist programmers in testing and debugging assembly language programs.

Programming Example

Assembly language programming is best introduced through the use of simple examples. Let us suppose we wish to find the sum of 12 and 23. In standard algebra we could write:

RESULT = 12 + 23

In HC08 assembly language, we have to instruct the CPU at each step of the operation. Here is a program to do this:

LDA #12

ADD #23

STA RESULT

STOP

Here is a brief explanation of the mnemonics and what each line does.

Load accumulator A with 12.
Add 23 to A, leaving the result in A.
Store the contents of accumulator A to a location called RESULT. For the moment let us assume that the address of RESULT is $00.
Stop execution of the program.

The machine codes for this program in hexadecimal notation are as follows:

A6 0C AB 17 B7 00 8E
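To see how the CPU would step through these seven bytes, here is a toy interpreter in C that understands only the four opcodes used above. It is a sketch for illustration, not a model of the real CPU: the function name run_example is made up, condition code flags are ignored, and memory is just a 256-byte array standing in for page zero.

```c
#include <stdint.h>
#include <stddef.h>

/* Executes a program made of the opcodes $A6 (LDA #imm), $AB (ADD #imm),
   $B7 (STA dir) and $8E (STOP). `ram` must hold at least 256 bytes.
   Returns the value of A when STOP is reached, or -1 if an unknown
   opcode is met or the program runs off the end of `code`. */
int run_example(const uint8_t *code, size_t len, uint8_t *ram)
{
    uint8_t a = 0;      /* accumulator A */
    size_t pc = 0;      /* program counter, as an index into code[] */

    while (pc < len) {
        uint8_t opcode = code[pc++];     /* fetch the first byte */

        switch (opcode) {
        case 0xA6:                       /* LDA #imm: 2 bytes */
            if (pc >= len) return -1;
            a = code[pc++];
            break;
        case 0xAB:                       /* ADD #imm: 2 bytes */
            if (pc >= len) return -1;
            a = (uint8_t)(a + code[pc++]);
            break;
        case 0xB7:                       /* STA dir: 2 bytes */
            if (pc >= len) return -1;
            ram[code[pc++]] = a;
            break;
        case 0x8E:                       /* STOP: 1 byte */
            return a;
        default:
            return -1;
        }
    }
    return -1;
}
```

Running it on the bytes A6 0C AB 17 B7 00 8E leaves 35 ($23) in A and in ram[0], the location assumed for RESULT. Note how the opcode byte alone decides how many further bytes are consumed.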

This program is of little use since we know the answer to be 35, a priori. Instead, it would be more useful to add two numbers whose values may not be fixed, i.e., what we call variables. Let us rephrase our example as:

RESULT = NUM1 + NUM2

The assembly language program to accomplish this may look as follows:

LDA NUM1

ADD NUM2

STA RESULT

STOP

The difference between the two examples above is the use of the immediate symbol (#) in the first two lines. In the first example, the # specifies that the numeric data is a constant value and this value is found in the byte or pair of bytes following the instruction op-code byte. This is called immediate addressing mode. In the second example, the omission of the # specifies that the symbols NUM1 and NUM2 represent addresses and the numeric data are stored at these memory addresses.

Thus, note the difference between the following two instructions:

LDA #4

LDA 4

The first instruction loads the value 4 into accumulator A. The second loads the contents of memory address 4 into accumulator A.

Misuse of the # symbol is a common source of programming error which the assembler cannot detect.

If we know the addresses of NUM1 and NUM2 to be 4 and 5 respectively, we could write the following:

LDA 4

ADD 5

In general, it is best not to use absolute addresses of variables. Instead, let the assembler/compiler assign memory addresses to symbols as it sees fit.

What is the difference between the following statements:

LDA #123

LDA #$7B

LDA #%01111011

The answer is none.

123, $7B and %01111011 are just different notations for us humans to represent the same quantity or value, which is one hundred and twenty-three. The bit pattern stored on the computer is 01111011 and is always binary.
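The same point can be demonstrated in C with the standard library function strtol, which converts a string of digits in a given base to a value. The wrapper name parse_numeral is hypothetical; the $ and % prefixes belong to the assembler's notation and are not passed to strtol.

```c
#include <stdlib.h>

/* Converts a numeral written in the given base (2, 10 or 16 here)
   into the value it represents. (Hypothetical wrapper around strtol.) */
long parse_numeral(const char *digits, int base)
{
    return strtol(digits, NULL, base);
}
```

parse_numeral("123", 10), parse_numeral("7B", 16) and parse_numeral("01111011", 2) all return the same value.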

MC9S08QG8 Memory Model

Figure 2.4 MC9S08QG8 Memory Map

The memory map shows that the first 96 memory addresses ($0000 to $005F) are reserved for the most commonly used internal hardware registers. This leaves 160 locations on page zero, from address $0060 to $00FF, for RAM that can be accessed using direct addressing. Eighty high page registers from $1800 to $184F are for less often used internal hardware registers. To access these locations, extended or indexed addressing modes are required.

Program code is normally stored in the FLASH memory area which can be erased and reprogrammed many times.

Instruction Summary

The complete HC08 instruction set is listed here to familiarize you with the HC08 capabilities. For more information, see the HCS08 Instruction Set Summary. There are six addressing modes which refer to the different ways parameters are accessed. These modes are as follows:

Inherent Addressing
Immediate Addressing
Direct Addressing
Extended Addressing
Indexed Addressing
Relative Addressing

Inherent Addressing

In inherent addressing mode, all of the information is contained in the instruction byte. The operands (if any) are registers and no memory reference is required. These are one or two byte instructions. Here is a list of the inherent instructions:

Mathematical Operations

CLRA Clear A

CLRX Clear X

CLRH Clear H

COMA 1's Complement A

COMX 1's Complement X

DAA Decimal Adjust A

DECA Decrement A

DECX Decrement X

INCA Increment A

INCX Increment X

NEGA 2's Complement A

NEGX 2's Complement X

NSA Nibble Swap A

TSTA Test A

TSTX Test X

Shift Operations

ASLA Arithmetic Shift Left A

ASLX Arithmetic Shift Left X

ASRA Arithmetic Shift Right A

ASRX Arithmetic Shift Right X

LSLA Logical Shift Left A (same as ASLA)

LSLX Logical Shift Left X (same as ASLX)

LSRA Logical Shift Right A

LSRX Logical Shift Right X

ROLA Rotate Left A through Carry

ROLX Rotate Left X through Carry

RORA Rotate Right A through Carry

RORX Rotate Right X through Carry

Inter-Register Operations

TAX Transfer A to X

TXA Transfer X to A

TAP Transfer A to Condition Code Register

TPA Transfer Condition Code Register to A

TSX Transfer (SP) + 1 to H:X

TXS Transfer (H:X) - 1 to SP

PSHA Push A onto Stack

PSHX Push X onto Stack

PSHH Push H onto Stack

PULA Pull A from Stack

PULX Pull X from Stack

PULH Pull H from Stack

Flag Operations

CLC Clear Carry

CLI Clear Interrupt Mask

SEC Set Carry

SEI Set Interrupt Mask

Miscellaneous Operations

BGND Enter background debug mode

DIV Unsigned Divide, (H:A)/(X) => A, remainder => H

MUL Unsigned Multiply (X) × (A) => (X:A)

NOP No Operation

RTI Return from Interrupt

RTS Return from Subroutine

STOP Stop Microprocessor

SWI Software Interrupt

WAIT Wait for Interrupt
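The register usage of MUL and DIV in the list above is easy to get wrong, so here is a small C sketch of the arithmetic. The regs_t type and both function names are invented for this illustration. The sketch assumes that a divide by zero or a quotient too large for A leaves the result unusable, so it simply reports failure and leaves the registers alone in that case.

```c
#include <stdint.h>
#include <stdbool.h>

/* The three 8-bit registers involved. (Hypothetical type.) */
typedef struct {
    uint8_t a, x, h;
} regs_t;

/* MUL: (X) x (A) => X:A, high byte of the product in X, low byte in A. */
void mul_x_by_a(regs_t *r)
{
    uint16_t product = (uint16_t)(r->x * r->a);

    r->x = (uint8_t)(product >> 8);
    r->a = (uint8_t)(product & 0xFF);
}

/* DIV: (H:A) / (X) => A, remainder => H.
   Returns false, without changing the registers, when X is zero or the
   quotient does not fit in 8 bits (an assumption of this sketch). */
bool div_ha_by_x(regs_t *r)
{
    uint16_t dividend = (uint16_t)((r->h << 8) | r->a);

    if (r->x == 0 || dividend / r->x > 0xFF) {
        return false;
    }
    uint8_t quotient = (uint8_t)(dividend / r->x);
    uint8_t remainder = (uint8_t)(dividend % r->x);
    r->a = quotient;
    r->h = remainder;
    return true;
}
```

For instance, with A = 200 and X = 100, mul_x_by_a leaves $4E in X and $20 in A, since 20000 = $4E20. With H:A = $03EB (1003) and X = 10, div_ha_by_x leaves 100 in A and 3 in H.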

Memory Operations

The HC08 is capable of performing arithmetic and logical operations on memory locations as well as the A and X registers. Thus the programming model is not restricted to just 8-bit entities. This makes implementation of multiple-precision arithmetic possible.

Some 16-bit operations are simplified using the H:X register combination. When performing 16-bit memory transfers, big-endian byte order is used. That is, bytes are stored with the high byte appearing first, followed by the low byte. This is opposite to Intel processors, which use little-endian byte order.
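Big-endian order can be illustrated with two short C helpers that store and load a 16-bit value in a byte array standing in for memory. The names store16_be and load16_be are hypothetical, and the caller is assumed to make sure that addr + 1 is still inside the array.

```c
#include <stdint.h>

/* Stores a 16-bit value with the high byte at the lower address. */
void store16_be(uint8_t *mem, uint16_t addr, uint16_t value)
{
    mem[addr]     = (uint8_t)(value >> 8);     /* high byte first */
    mem[addr + 1] = (uint8_t)(value & 0xFF);   /* then low byte   */
}

/* Reads back a 16-bit value stored in big-endian order. */
uint16_t load16_be(const uint8_t *mem, uint16_t addr)
{
    return (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
}
```

Storing $1234 at address $80 puts $12 at $80 and $34 at $81. A little-endian machine would place the two bytes the other way round.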

Immediate Addressing

In the immediate addressing mode, the actual argument is contained in the one or two bytes immediately following the instruction byte, where the number of bytes must match the size of the register being used. Thus the actual constant value is stored as part of the sequence of bytes that make up the instruction. This mode is selected when the # symbol precedes the argument. Examples:

LDA #23

LDHX #ONE

AND #$F0

Direct Addressing

In the direct addressing mode (also called page zero addressing) a single byte is used to specify the least significant byte of the 16-bit memory address of the parameter to be accessed. The MSB of this effective address is assumed to be $00. Therefore only addresses $0000 to $00FF are accessible using direct addressing. Instructions using direct addressing are two byte instructions and therefore make more efficient use of machine cycles and memory space. Examples:

STA PTAD

LDX RESULT

LSR NUM

Extended Addressing

In the extended addressing mode, two bytes are required to specify the full 16-bit effective address of the parameter to be referenced. Hence the full range of address from $0000 to $FFFF can be specified. Examples:

STA SOPT1

LDHX table

JSR output
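The difference between direct and extended addressing comes down to how the effective address is built from the operand bytes. A minimal C sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Direct addressing: one operand byte, high byte of the address is $00. */
uint16_t direct_ea(uint8_t operand)
{
    return (uint16_t)operand;
}

/* Extended addressing: two operand bytes, high byte first. */
uint16_t extended_ea(uint8_t operand_hi, uint8_t operand_lo)
{
    return (uint16_t)((operand_hi << 8) | operand_lo);
}
```

direct_ea can only produce addresses $0000 to $00FF, while extended_ea covers the whole range $0000 to $FFFF at the cost of one extra instruction byte.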

Indexed Addressing

In indexed addressing mode, the 16-bit H:X index register is used in calculating the effective address. The address contained in the index register H:X can be used as is or an 8-bit or 16-bit unsigned offset can be added to the contents of H:X to form the effective address. Indexed addressing can also be performed using the stack pointer SP as the index register.

LDA ,X

ADD 3,X

LSL NAME,X

COM 4,SP
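The effective address calculation for the H:X forms above can be sketched in C as follows. The function name indexed_ea is made up for this illustration; an offset of zero corresponds to the no-offset form such as LDA ,X.

```c
#include <stdint.h>

/* Effective address for indexed addressing: the 16-bit value H:X plus
   an unsigned offset (0 for the no-offset form). */
uint16_t indexed_ea(uint8_t h, uint8_t x, uint16_t offset)
{
    uint16_t hx = (uint16_t)((h << 8) | x);   /* H is the high byte */

    return (uint16_t)(hx + offset);           /* truncated to 16 bits */
}
```

With H:X = $0060, the operand of ADD 3,X would be fetched from indexed_ea(0x00, 0x60, 3), which is $0063.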

Indirect Addressing

The HC08 does not have an indirect addressing mode. This information is included to round out the discussion on memory addressing modes. Indirect addressing is one of the most powerful and sometimes confusing features available on most computers and yet the concept is fairly simple. With direct (as well as extended) addressing the effective address is specified in the instruction bytes. In indirect addressing mode, the memory location specified by the instruction bytes contains the effective address of the parameter.

In many programming operations, we are not so much concerned about the actual contents of a variable but more about the location of the variable. That is, many times our focus is on the address of a variable and how to manipulate this address. In high level languages such as Pascal and C, structures and pointers rely heavily on the use of indirect addressing. On the HC08, indexed addressing mode is used to implement indirect addressing.
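The idea of using the index register to obtain the effect of indirect addressing can be sketched in C. The function below mimics a two-step sequence, loading H:X from a pointer variable and then loading A through H:X; the name load_indirect and the byte array standing in for memory are assumptions of this illustration.

```c
#include <stdint.h>

/* Reads the byte whose 16-bit address is stored (big-endian) at
   ptr_addr and ptr_addr + 1. */
uint8_t load_indirect(const uint8_t *mem, uint16_t ptr_addr)
{
    /* Step 1, like LDHX ptr_addr: fetch the pointer into H:X. */
    uint16_t hx = (uint16_t)((mem[ptr_addr] << 8) | mem[ptr_addr + 1]);

    /* Step 2, like LDA ,X: fetch the data that H:X points to. */
    return mem[hx];
}
```

If locations $80 and $81 hold $00 and $90, and location $0090 holds $2A, then load_indirect(mem, 0x80) returns $2A. This is what dereferencing a pointer in C amounts to.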

Memory-Accumulator Operations

ADC Add with Carry to A

ADD Add Memory to A

AND AND A with Memory

BIT Bit Test A with Memory

CMP Compare A with Memory

CPHX Compare H:X with 16 bits in Memory

CPX Compare X with Memory

EOR Exclusive OR A with Memory

LDA Load Accumulator A with Memory

LDHX Load Index Register H:X with 16 bits

LDX Load Index Register X with 8 bits

ORA OR Accumulator A

SBC Subtract with Carry from A

STA Store Accumulator A

STHX Store H:X to Memory

STX Store Index Register X

SUB Subtract Memory from A

Memory-Only Operations

ASL Arithmetic Shift Left

ASR Arithmetic Shift Right

CLR Clear Memory

COM 1's Complement

DEC Decrement Memory

INC Increment Memory

LSL Logical Shift Left (same as ASL)

LSR Logical Shift Right

MOV Move from Memory to Memory

NEG 2's Complement

ROL Rotate Left

ROR Rotate Right

TST Test for zero or minus

Relative Addressing

The relative addressing mode is used only for branch instructions. If the branch condition is true, the 8-bit signed integer following the instruction opcode is added to the current contents of the program counter (PC) to form the effective branch address. If the branch is not taken, program execution continues with the next instruction. Examples:

BNE main

BSR putc

loop BRCLR bit4,flags,loop
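The branch address calculation can be sketched in C. The sketch assumes that the PC has already advanced to the instruction following the branch when the offset is added; the function name branch_target and its parameter names are hypothetical.

```c
#include <stdint.h>

/* pc_after:    address of the instruction that follows the branch
   offset_byte: the 8-bit signed offset stored in the instruction
   taken:       non-zero if the branch condition is true */
uint16_t branch_target(uint16_t pc_after, uint8_t offset_byte, int taken)
{
    /* Sign-extend the offset: $80..$FF mean -128..-1. */
    int rel = (offset_byte & 0x80) ? (int)offset_byte - 256
                                   : (int)offset_byte;

    if (!taken) {
        return pc_after;                  /* fall through */
    }
    return (uint16_t)(pc_after + rel);    /* truncated to 16 bits */
}
```

An offset byte of $FE is -2, so a taken branch whose following instruction is at $E010 goes to $E00E. Because the offset is only 8 bits, a branch can reach no further than 128 bytes back or 127 bytes forward; anything beyond that needs JMP or JSR.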

Branching - Unsigned Arithmetic

BHI Branch if Higher

BHS Branch if Higher or Same (same as BCC)

BLO Branch if Lower (same as BCS)

BLS Branch if Lower or Same

Branching - 2's Complement Signed Arithmetic

BGE Branch if Greater than or Equal to zero

BGT Branch if Greater Than zero

BLE Branch if Less than or Equal to zero

BLT Branch if Less Than zero

General Branching

BCC Branch if Carry is Clear

BCS Branch if Carry is Set

BEQ Branch if EQual to zero

BMI Branch if MInus

BNE Branch if Not Equal to zero

BPL Branch if PLus

BRA BRanch Always

BRN BRanch Never

BVC Branch if oVerflow is Clear

BVS Branch if oVerflow is Set

BSR Branch to SubRoutine

Long Branch

JMP Jump to new location (16-bit address)

JSR Jump to SubRoutine (16-bit address)

(Technically speaking, these are not relative branch instructions but are absolute jumps using extended addressing mode. These two instructions are listed here to complete the list of branch instructions.)

Bit Operations

Another feature of the HC08 is the ability to set or clear any individual bit of RAM or I/O register using the BSET and BCLR instructions. Program branching can also take place based on the value of any specified bit using the BRSET and BRCLR instructions.

Bit Set/Clear

BSET n,dir Bit Set

BCLR n,dir Bit Clear

Branching if Bit Set/Clear

BRSET n,dir,rel Branch if bit n is set

BRCLR n,dir,rel Branch if bit n is clear

Examples:

BSET 0,NUM ;set bit-0 of NUM

BCLR 7,PTAD ;clear bit-7 of PORTA

BRSET 2,NUM,MAIN ;branch if bit-2 of NUM is set
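For comparison, the same bit manipulations are usually written in C with masks. The three helpers below are a sketch with made-up names; the bit number n is assumed to be in the range 0 to 7.

```c
#include <stdint.h>

/* Like BSET n,dir: set bit n of the byte. */
void bit_set(uint8_t *byte, unsigned n)
{
    *byte |= (uint8_t)(1u << n);
}

/* Like BCLR n,dir: clear bit n of the byte. */
void bit_clear(uint8_t *byte, unsigned n)
{
    *byte &= (uint8_t)~(1u << n);
}

/* The test made by BRSET/BRCLR: returns 1 if bit n is set, else 0. */
int bit_is_set(uint8_t byte, unsigned n)
{
    return (int)((byte >> n) & 1u);
}
```

Here bit_set(&num, 0) corresponds to BSET 0,NUM, and a C statement such as if (bit_is_set(num, 2)) plays the role of BRSET 2,NUM,MAIN.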

HC(S)08 Additional Instructions

There are instructions added to the HC08 and HCS08 instruction set which improve on the previous HC05 instruction set. These are primarily instructions added to handle the new H register and instructions added to manage the stack and the stack pointer. Other improvements are the MOV instruction which does not require the A accumulator, compare and branch if equal, CBEQ, and decrement and branch if not zero, DBNZ.

Move
	MOV #imm,dir	Move immediate to direct location
	MOV dir,dir	Move direct to direct
	MOV dir,X+	Move direct to location addressed by H:X and increment H:X
	MOV X+,dir	Move byte from location addressed by H:X to direct and increment H:X

Compare and Branch if Equal
	CBEQA #imm,rel	Branch if A = imm
	CBEQX #imm,rel	Branch if X = imm
	CBEQ dir,rel	Branch if A = byte at direct location
	CBEQ X+,rel	Branch if A = byte at H:X, post increment H:X
	CBEQ disp,X+,rel	Branch if A = byte at (H:X + disp), post increment H:X
	CBEQ disp,SP,rel	Branch if A = byte at (SP + disp)

Decrement and Branch if Not Zero
	DBNZA rel	Decrement A and branch if A is not zero
	DBNZX rel	Decrement X and branch if X is not zero
	DBNZ dir,rel	Decrement byte at location dir and branch if not zero
	DBNZ X,rel	Decrement byte at H:X and branch if not zero
	DBNZ disp,X,rel	Decrement byte at (H:X + disp) and branch if not zero
	DBNZ disp,SP,rel	Decrement byte at (SP + disp) and branch if not zero

Load H:X (big-endian byte order, i.e. H first)
	LDHX #imm	Load H:X with immediate 16 bits
	LDHX mem	Load H with (mem) and X with (mem + 1)
	LDHX ,X	Load H:X with 16 bits from memory at (H:X)
	LDHX disp,X	Load H:X with 16 bits from (H:X + disp)
	LDHX disp8,SP	Load H:X with 16 bits from (SP + disp8)

Store H:X (big-endian byte order, i.e. H first)
	STHX mem	Store H into (mem) and X into (mem + 1)
	STHX disp8,SP	Store H:X into (SP + disp8)

Miscellaneous
	AIS #imm	Add 8-bit signed value to SP
	AIX #imm	Add 8-bit signed value to H:X
	BGND	Enter background debug mode (if ENBDM = 1 in BDC control register)
	CLRH	Clear H
	DAA	Decimal Adjust Accumulator
	DIV	Divide (H:A) by X, result => A, remainder => H
	MUL	Multiply A by X, result => X:A
	NSA	Nibble Swap Accumulator
	PSHA	Push A onto stack
	PSHH	Push H onto stack
	PSHX	Push X onto stack
	PULA	Pull A from stack
	PULH	Pull H from stack
	PULX	Pull X from stack
	RSP	Reset Stack Pointer, SP <= $FF

Instruction Opcode Map

What is an opcode map? Opcode refers to the machine code for each operation or instruction. For example, the opcode for the CLRA instruction is $4F. Since the HCS08 is based on an 8-bit opcode, theoretically, there are 256 possible instructions. The opcode or first byte defines the instruction and dictates how many additional bytes are required for the complete instruction. An opcode map is a 16 x 16 table showing all possible 256 instructions ordered according to the actual opcode in hexadecimal representation.

In practice, when there are more than 256 instructions, the designer of the MCU uses one or more codes out of the set of 256 to create an extension, escape code, or page-2 set of instructions. The HCS08 uses opcode $9E as the page-2 identifier. For efficiency reasons, extensions or page-2 instructions tend to be less often used instructions.

For your convenience, the opcode map also shows the number of bytes, the number of machine cycles and the addressing mode along with the opcode in hexadecimal representation.

HCS08 Opcode Map

Basic Syntax Rules

In assembler, instruction mnemonics are not case sensitive. However, user names and labels are case sensitive in both assembler and C. In assembler, comments may begin with a semicolon or //. In C, single comment lines begin with //. A block of code can be treated as comments if enclosed by /* */.

In C, hexadecimal notation is preceded by 0x. In assembler, hexadecimal notation is preceded by 0x or $.