SWEG4102 Embedded System — Final Exam Study Notes
Addis Ababa Science & Technology University | Dept. of Software Engineering
Source: Marilyn Wolf, Computers as Components: Principles of Embedded Computing System Design, 4th ed., Morgan Kaufmann, 2016 (the course textbook)
HOW TO READ THESE NOTES
The four focusing points from your instructor map directly onto the Wolf textbook as follows:
| Focusing Point | Wolf Chapter(s) | Course Week |
|---|---|---|
| A — Duty Cycle & Signal Frequency | Ch.4 (Platforms), Ch.13 concept | Week 13 |
| B — Addressing Modes | Ch.2 (Instruction Sets), Sections 2.3–2.5 | Weeks 3, 6 |
| C — C Programming | Ch.2, Ch.3, Ch.5 | Week 7 |
| D — All Chapters | Ch.1–5 + interfaces | Weeks 1–13 |
All section references below point to Wolf unless otherwise stated.
PART A: DUTY CYCLE AND SIGNAL FREQUENCY
Wolf Ch.4 platform concepts; also connects to Wolf Ch.1 Section 1.2.2 (power/energy constraints)
A.1 Why PWM Exists in Embedded Systems
Wolf establishes in Section 1.2.2 that power and energy are primary design constraints in embedded systems: "Power consumption directly affects the cost of the hardware, because a larger power supply may be necessary. Energy consumption affects battery life." PWM is the embedded engineer's answer to this constraint — it controls power delivery to a load using only digital signals, with no heat-wasting resistors and no expensive DAC hardware.
A microcontroller output pin can only be fully ON (e.g., 5V) or fully OFF (0V). PWM solves the need for variable voltage by switching rapidly between these states. Because loads like motors and LEDs respond to average power (not instantaneous switching), controlling the ratio of ON-time to OFF-time controls the effective output.
A.2 Core Definitions and Formulas
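Period (T): The fixed total duration of one complete ON + OFF cycle: $T = t_{ON} + t_{OFF}$.
Frequency (f): Number of complete cycles per second: $f = 1/T$ (T in seconds gives f in Hz).
ON-time (t_ON): Duration the signal is HIGH within one period.
OFF-time (t_OFF): Duration the signal is LOW within one period: $t_{OFF} = T - t_{ON}$.
Duty Cycle (D): The fraction of each period that the signal is HIGH: $D = \frac{t_{ON}}{T} \times 100\%$.
Average Output Voltage: $V_{avg} = D \times V_{supply}$, with D as a fraction and the signal switching between 0 V and V_supply.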
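The definitions above can be sketched as three small C functions. The function names are illustrative (they are not from Wolf or the course material), and the duty cycle is kept as a fraction between 0.0 and 1.0 rather than a percentage.

```c
/* Frequency in Hz from the period in seconds: f = 1 / T. */
double pwm_frequency_hz(double period_s) {
    return 1.0 / period_s;
}

/* Duty cycle as a fraction (0.0 to 1.0): D = t_ON / T. */
double pwm_duty(double t_on_s, double period_s) {
    return t_on_s / period_s;
}

/* Average output voltage: V_avg = D * V_supply. */
double pwm_average_voltage(double duty, double v_supply) {
    return duty * v_supply;
}
```

Multiply the result of `pwm_duty` by 100 to get the percentage form used in exam answers.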
A.3 Fully Worked Calculations
Example 1 — All three values from basics:
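A PWM signal has T = 20 ms and t_ON = 5 ms. Supply = 5V.
- $f = 1/T = 1/(0.020\ \text{s}) = 50\ \text{Hz}$
- $D = t_{ON}/T = 5\ \text{ms} / 20\ \text{ms} = 0.25 = 25\%$
- $V_{avg} = D \times V_{supply} = 0.25 \times 5\ \text{V} = 1.25\ \text{V}$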
Example 2 — Finding t_ON from frequency and duty cycle:
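Motor PWM: f = 50 Hz, D = 80%, V_supply = 12V
- $T = 1/f = 1/(50\ \text{Hz}) = 20\ \text{ms}$
- $t_{ON} = D \times T = 0.80 \times 20\ \text{ms} = 16\ \text{ms}$
- $t_{OFF} = T - t_{ON} = 4\ \text{ms}$
- $V_{avg} = 0.80 \times 12\ \text{V} = 9.6\ \text{V}$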
Example 3 — Reverse: finding duty cycle from voltages:
You want 3V average from a 5V supply. What duty cycle?
- $D = V_{avg}/V_{supply} = 3\ \text{V} / 5\ \text{V} = 0.60 = 60\%$
Example 4 — Servo motor (angle control via pulse width):
Servo period = 20 ms. 1 ms pulse = 0°, 2 ms pulse = 180°. Find duty cycle for 90°.
- Pulse for 90° = midpoint = 1.5 ms
- $D = 1.5\ \text{ms} / 20\ \text{ms} = 0.075 = 7.5\%$
Example 5 — Arduino analogWrite:
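analogWrite(9, 191) on a 5V Arduino. What is V_avg?
- Scale: 0 → 0%, 255 → 100%
- $D = 191/255 \approx 74.9\%$
- $V_{avg} \approx 0.749 \times 5\ \text{V} \approx 3.75\ \text{V}$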
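Examples 4 and 5 both reduce to a linear mapping, which a short C sketch can make concrete. The function names are made up for illustration, and the servo mapping assumes exactly the 1 ms to 2 ms range stated in Example 4 (real servos vary).

```c
/* Servo pulse width in ms for an angle in degrees, assuming the linear
   mapping from Example 4: 1 ms at 0 degrees, 2 ms at 180 degrees. */
double servo_pulse_ms(double angle_deg) {
    return 1.0 + angle_deg / 180.0;
}

/* Average voltage for an 8-bit analogWrite-style value (0..255),
   where 255 corresponds to 100% duty cycle. */
double analog_write_voltage(int value, double v_supply) {
    return (value / 255.0) * v_supply;
}
```

Dividing the servo pulse width by the 20 ms period gives the duty cycle asked for in Example 4.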
A.4 PWM Applications
Wolf Section 1.2.2 identifies motor control as a key embedded use case, citing the BMW ABS/ASC+T as a real example (Design Example 1.1). The textbook notes that motor control requires "complicated filtering functions to optimize performance while minimizing pollution and fuel utilization" — PWM is the fundamental technique enabling this.
| Application | PWM Mechanism | Why It Works |
|---|---|---|
| DC Motor Speed | Duty cycle → average voltage → RPM | Motor inertia averages the pulses |
| LED Brightness | Duty cycle → average light intensity | Eye averages flicker above ~50 Hz |
| Servo Positioning | Pulse width (not duty cycle) → angle | 1 ms = 0°, 1.5 ms = 90°, 2 ms = 180° |
| DC-DC Converters | PWM regulates output voltage | >90% efficiency vs. linear regulators |
| Audio/Buzzer | Rapid duty cycle variation → sound | Acts as 1-bit DAC with filtering |
Arduino Uno PWM pins: 3, 5, 6, 9, 10, 11 (marked ~ on the board). Function: analogWrite(pin, value), with value from 0 to 255.
A.5 Key Relationships Table
| Change | Effect |
|---|---|
| Duty cycle ↑ | Average voltage ↑, motor faster, LED brighter |
| Duty cycle ↓ | Average voltage ↓, motor slower, LED dimmer |
| t_ON = T (D = 100%) | Fully ON — same as direct DC |
| t_ON = 0 (D = 0%) | Fully OFF |
| Period ↓ (frequency ↑) | Faster switching; load response smoother |
PART B: ADDRESSING MODES
Wolf Chapter 2 — Instruction Sets, Sections 2.3 (ARM), 2.4 (PIC16F), 2.5 (C55x)
B.1 What Addressing Modes Are and Why They Matter
Wolf Section 2.2.1 defines an instruction set as the "interface between software modules and the underlying hardware." A central part of that interface is the addressing mode — the mechanism by which an instruction locates its operand (the data it needs to work on).
Wolf explicitly lists addressing modes as one of the key instruction set characteristics alongside fixed vs. variable length, numbers of operands, and types of operations (Wolf §2.2.1, p.57). Understanding addressing modes is critical because:
- They determine execution speed — register access is far faster than memory access
- They determine code flexibility — indirect modes enable arrays and pointers
- They directly connect to C programming: the C pointer dereference (*ptr) is indirect addressing at the machine level
Wolf covers addressing modes across three real processors used in your course: ARM (§2.3), PIC16F (§2.4), and TI C55x (§2.5).
B.2 Addressing Modes in the ARM Processor
Wolf Section 2.3.2, pp.65–70
The ARM is a load-store architecture (Wolf §2.3.2): "arithmetic and logical operations cannot be performed directly on memory locations… ARM is a load-store architecture — data operands must first be loaded into the CPU and then stored back to main memory." This means addressing modes are most relevant to LDR (load) and STR (store) instructions.
Mode 1: Register Addressing
The operand is in a named register. No memory access needed.
ADD r0, r1, r2 ; r0 = r1 + r2 — all operands are registers
MOV r0, r1 ; copy r1 into r0
Wolf §2.3.2: "The basic form of a data instruction is simple: ADD r0,r1,r2 — this instruction sets register r0 to the sum of the values stored in r1 and r2."
ARM has 16 general-purpose registers r0–r15, where r15 is also the PC (Wolf §2.3.2, p.63).
- Speed: Fastest — entirely within CPU, zero memory access
- Use: Arithmetic, temporary values, loop counters
Mode 2: Immediate Addressing
A constant value is embedded directly in the instruction.
ADD r0, r1, #2 ; r0 = r1 + 2 — #2 is immediate
MOV r0, #100 ; load constant 100 into r0
Wolf §2.3.2: "In addition to specifying registers as sources for operands, instructions may also provide immediate operands, which encode a constant value directly in the instruction. For example, ADD r0,r1,#2 sets r0 to r1+2."
- Speed: Very fast — constant is fetched with the instruction
- Use: Loading constants, setting up counters, simple arithmetic with fixed values
- Limitation: ARM immediate values have a limited encoding range (an 8-bit value rotated right by an even number of bit positions), so not every 32-bit constant fits in one instruction
Mode 3: Register-Indirect Addressing
A register holds the memory address of the operand — the register "points to" the data.
LDR r0, [r1] ; load r0 from the address contained in r1
STR r0, [r1] ; store r0 to the address contained in r1
Wolf §2.3.2: "In register-indirect addressing, the value stored in the register is used as the address to be fetched from memory; the result of that fetch is the desired operand value. Thus, if we set r1 = 0x100, the instruction LDR r0,[r1] sets r0 to the value of memory location 0x100." (Wolf p.67, Figure 2.14)
- Speed: One memory access. The address is read from r1 inside the CPU, then the operand is fetched from that memory location, so it is slower than register or immediate addressing
- Use: Accessing variables through pointers (C's *ptr), dynamic memory
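The link between register-indirect addressing and C pointers can be sketched as follows. The function names are illustrative; the pointer parameter plays the role of r1, and the dereference corresponds to the memory access performed by LDR or STR.

```c
#include <stdint.h>

/* Model of LDR r0,[r1]: r1 holds an address; the load reads through it. */
uint32_t load_indirect(const uint32_t *r1) {
    return *r1;
}

/* Model of STR r0,[r1]: store the value at the address held in r1. */
void store_indirect(uint32_t *r1, uint32_t r0) {
    *r1 = r0;
}
```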
Mode 4: Base-Plus-Offset Addressing (Indexed)
Effective address = base register + constant offset. Used extensively in the ARM.
LDR r0, [r1, #16] ; load from address (r1 + 16)
LDR r0, [r1, -r2] ; load from address (r1 - r2)
LDR r0, [fp, #-24] ; load from frame pointer minus 24
Wolf §2.3.2: "LDR r0,[r1,#16] loads r0 with the value stored at location r1 + 16. Here, r1 is referred to as the base and the immediate value the offset." (Wolf p.69)
Wolf also shows how the compiler uses this for C variable access: "a is at −24, b at −28, c at −32, and x at −36" from the frame pointer — giving instructions like ldr r2, [fp, #-24] (Wolf p.69, Example 2.2).
Autoindexing (pre-index with update):
LDR r0, [r1, #16]! ; r1 = r1+16 FIRST, then load from new r1
Wolf §2.3.2: "The ! operator causes the base register to be updated with the computed address so that it can be used again later." (Wolf p.69)
Postindexing:
LDR r0, [r1], #16 ; load from r1 FIRST, then r1 = r1+16
Wolf §2.3.2: "Postindexing does not perform the offset calculation until after the fetch has been performed." (Wolf p.70)
- Use: Array traversal, stack operations, struct member access
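The difference between plain base-plus-offset, autoindexing, and postindexing is easiest to see by tracking what happens to the base. The sketch below models the base register as a C pointer to 32-bit words, so an ARM byte offset of 16 corresponds to a word offset of 4. The function names are hypothetical.

```c
#include <stdint.h>

/* Base plus offset, like LDR r0,[r1,#16]: the base is left unchanged. */
uint32_t load_offset(const uint32_t *base, int word_offset) {
    return base[word_offset];
}

/* Autoindexing, like LDR r0,[r1,#16]!: update the base first, then load. */
uint32_t load_preindex(const uint32_t **base, int word_offset) {
    *base += word_offset;
    return **base;
}

/* Postindexing, like LDR r0,[r1],#16: load first, then update the base. */
uint32_t load_postindex(const uint32_t **base, int word_offset) {
    uint32_t value = **base;
    *base += word_offset;
    return value;
}
```

In C terms, postindexing with a one-element step is the familiar `*p++`, and autoindexing is `*++p`.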
Mode 5: PC-Relative Addressing (Relative)
The branch target is calculated as PC + offset. All ARM branches use this mode.
B #100 ; branch to PC + 400 bytes (offset in words, multiplied by 4)
BEQ label ; branch to label if zero flag set
Wolf §2.3.3: "Branches are PC-relative — the branch specifies the offset from the current PC value to the branch target. The offset is in words, but because the ARM is byte-addressable, the offset is multiplied by four to form a byte address. Thus, the instruction B #100 will add 400 to the current PC value." (Wolf p.70)
ARM also uses PC-relative addressing to load addresses: ADR r1, FOO computes r1 = PC + distance_to_FOO (Wolf §2.3.2, Figure 2.15).
- Use: All branch and jump instructions; address loading via ADR
Mode 6: Branch-and-Link (Subroutine Calls)
Wolf §2.3.3 describes procedure calls using BL (Branch and Link):
BL foo ; save PC into r14 (link register), branch to foo
MOV r15, r14 ; return: restore PC from link register
Wolf: "The branch-and-link instruction BL foo will perform a branch and link to the code starting at location foo. Before branching it stores the current PC value in r14. Thus, to return from a procedure, you simply move the value of r14 to r15." (Wolf p.74)
B.3 Addressing Modes in the PIC16F
Wolf Section 2.4, pp.77–81
The PIC16F uses a Harvard architecture with separate program and data memories (Wolf §2.4.1). Wolf §2.4.2 describes its addressing modes:
- f = general-purpose register file (data memory location)
- W = accumulator register
- k = literal constant (immediate)
- b = bit address within a register
Immediate (Literal):
MOVLW #50H ; move literal 50H into W (accumulator)
ADDLW #10 ; add literal 10 to W
Register (File Register):
MOVWF f ; move W to file register f
ADDWF f, d ; add W and f; d=0 stores result in W, d=1 stores in f
Indirect Addressing via INDF/FSR: Wolf §2.4.2: "Indirect addressing is controlled by the INDF and FSR registers. INDF is not a physical register. Any access to INDF causes an indirect load through the file select register FSR. FSR can be modified as a standard register. Reading from INDF uses the value of FSR as a pointer to the location to be accessed." (Wolf p.79)
MOVWF FSR ; FSR = address of target
MOVF INDF, W ; W = memory[FSR] — indirect read
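As a rough mental model, the INDF/FSR pair behaves like an index into an array of data memory. The C sketch below simulates this; the array size and all names are assumptions for illustration and do not reflect the real PIC16F memory map or banking.

```c
#include <stdint.h>

#define DATA_SIZE 128                 /* illustrative size, not a PIC16F figure */

static uint8_t data_mem[DATA_SIZE];   /* simulated file registers */
static uint8_t fsr;                   /* simulated FSR: holds the target address */

/* Reading INDF: fetch the byte that FSR points to. */
uint8_t read_indf(void) {
    return data_mem[fsr % DATA_SIZE];
}

/* Writing INDF: store a byte at the location FSR points to. */
void write_indf(uint8_t value) {
    data_mem[fsr % DATA_SIZE] = value;
}
```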
B.4 Addressing Modes in the TI C55x DSP
Wolf Section 2.5.2, pp.84–88
The C55x has three main addressing modes (Wolf §2.5.2, p.84):
1. Absolute Addressing — address supplied directly in instruction:
- k16 absolute: 16-bit value combined with DPH register → 23-bit data address
- k23 absolute: full 23-bit data address
- I/O absolute: port(#1234), a 16-bit I/O space address
Wolf: "Absolute addressing supplies an address in the instruction." (Wolf §2.5.2)
2. Direct Addressing — supplies an offset from a base register:
- DP addressing: A_DP = DPH[22:15] | (DP + Doffset), used to access data pages
- SP addressing: A_SP = SPH[22:15] | (SP + Soffset), used to access stack values
- PDP addressing: A_PDP = PDP[15:6] | PDPoffset, used to access I/O pages
Wolf: "Direct addressing supplies an offset." (Wolf §2.5.2)
3. Indirect Addressing — register as pointer:
- AR indirect: auxiliary register (AR0–AR7) holds the address. The AR can be auto-updated after access (e.g., *AR0+ adds 1 after access)
- Dual AR indirect: two simultaneous data accesses
- CDP indirect: coefficient data pointer for DSP filter coefficients
- Circular addressing: AR-based access that wraps around a buffer — critical for DSP algorithms
Wolf: "AR indirect addressing uses an auxiliary register to point to data… This mode may update the value of the AR register. Updates are specified by modifiers to the register identifier, such as adding + after the register name." (Wolf §2.5.2, p.88)
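Circular addressing can be imitated in C with a modulo on the index. This is only a software sketch of the idea; on the C55x the wrap-around is done by the addressing hardware. The buffer length and the names below are illustrative assumptions.

```c
#include <stddef.h>

#define BUF_LEN 4   /* illustrative buffer length */

typedef struct {
    int    data[BUF_LEN];
    size_t index;            /* plays the role of the auxiliary register */
} circ_buf;

/* Store a sample, then advance the index with wrap-around,
   similar in spirit to a post-incremented circular AR access. */
void circ_put(circ_buf *b, int sample) {
    b->data[b->index] = sample;
    b->index = (b->index + 1) % BUF_LEN;
}
```

After BUF_LEN writes the index returns to 0, so the newest sample overwrites the oldest, which is the behavior an FIR filter delay line needs.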
B.5 Universal Addressing Mode Summary
| Mode | Where is Operand? | ARM Example | Speed | Best Use |
|---|---|---|---|---|
| Register | In CPU register | ADD r0,r1,r2 | Fastest | Arithmetic, temporaries |
| Immediate | In instruction itself | ADD r0,r1,#5 | Very fast | Constants, initialization |
| Indirect | At address in register | LDR r0,[r1] | Moderate | Pointers, dynamic access |
| Base+Offset | At base reg + offset | LDR r0,[r1,#16] | Moderate | Arrays, structs, stack frames |
| PC-Relative | At PC + offset | B label | Fast | Branches, address loading |
| Direct (C55x) | At offset from DP/SP | DP addressing | Fast | Fixed data pages, stack |
| Absolute (C55x) | Fixed address in instr | k23 | Fast | Memory-mapped registers |
| Circular (C55x) | AR with wrap-around | *AR0+ with BK | Fast | DSP filter buffers |
B.6 Effective Address Calculations
Base+Offset (ARM):
r1 = 0x200, instruction: LDR r0, [r1, #0x10]
- EA = 0x200 + 0x10 = 0x210
Register-Indirect (ARM):
r1 = 0x100, instruction: LDR r0, [r1]
- EA = 0x100 (Wolf Figure 2.14)
DP Direct (C55x):
DPH = 0x40, DP = 0x200, Doffset = 0x05
- EA = 0x40[22:15] | (0x200 + 0x05) = 0x40_0205 (23-bit address)
PC-Relative (ARM):
PC = 0x1000, offset = #100 (in words)
- EA = 0x1000 + (100 × 4) = 0x1000 + 0x190 = 0x1190
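The ARM effective-address calculations above can be checked with a couple of helper functions. This is a sketch with hypothetical names; the PC-relative version follows the simplified rule used in these notes (word offset times 4 added to the PC) and ignores pipeline effects on the PC value.

```c
#include <stdint.h>

/* Base plus offset: EA = base + offset (both in bytes). */
uint32_t ea_base_offset(uint32_t base, int32_t offset) {
    return base + (uint32_t)offset;
}

/* PC-relative branch target: word offset times 4, added to the PC. */
uint32_t ea_pc_relative(uint32_t pc, int32_t word_offset) {
    return pc + (uint32_t)(word_offset * 4);
}
```

Negative offsets work too, which is how frame-pointer accesses such as [fp, #-24] reach local variables below the frame pointer.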
PART C: C PROGRAMMING
Wolf Chapter 2 (assembly context), Chapter 3 (I/O in C), Chapter 5 (program design)
C.1 Why C Is the Embedded Language of Choice
Wolf explains this in Section 1.2.3 (Why use microprocessors?): programmability is crucial because it "allows program design to be separated (at least to some extent) from design of the hardware on which programs will be run." C specifically earns its place because it sits close to the hardware while remaining portable.
Wolf Section 3.2.2 demonstrates this directly using C's peek and poke functions to access hardware registers through pointers — something higher-level languages cannot do cleanly:
int peek(char *location) {
return *location; /* dereference pointer = read hardware register */
}
void poke(char *location, char newval) {
(*location) = newval; /* write to hardware register */
}
"How can we directly write I/O devices in a high-level language such as C? We can use pointers to manipulate addresses of I/O devices." (Wolf §3.2.2, p.103)
C.2 Program Structure
/* 1. Include processor-specific header */
#include <c8051F020.h> /* defines SFR addresses, bit names */
/* 2. Global declarations — visible to all functions */
unsigned char counter;
int speed;
/* 3. Function prototypes */
void delay_ms(int ms);
unsigned char read_sensor(void);
/* 4. Main function */
void main(void) {
/* Local variables */
int temp;
/* Initialization */
P1 = 0x00; /* clear Port 1 */
/* Embedded systems typically run forever */
while(1) {
/* Application logic */
}
}
Wolf §3.2.3 (Busy-Wait I/O) shows the standard embedded infinite loop pattern:
while (TRUE) { /* perform operation forever */
while (peek(IN_STATUS) == 0); /* wait until ready */
achar = (char)peek(IN_DATA); /* read input */
poke(OUT_DATA, achar);
poke(OUT_STATUS, 1); /* start output */
while (peek(OUT_STATUS) != 0); /* wait until done */
}
(Wolf Example 3.3, p.104)
C.3 Data Types
| Type | Bits | Unsigned Range | Signed Range | Embedded Use |
|---|---|---|---|---|
| unsigned char | 8 | 0–255 | n/a | Port values, most hardware registers |
| char | 8 | n/a | −128–127 | Signed 8-bit data |
| unsigned int | 16 | 0–65535 | n/a | Counters, addresses |
| int | 16 | n/a | −32768–32767 | General integers |
| long | 32 | n/a | ±2 billion | Timing values, large counters |
| sbit | 1 | 0–1 | n/a | Individual pin control (8051) |
Rule: Always use unsigned for hardware register values. Storing 0xFF in a signed char gives −1, not 255.
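The rule can be demonstrated with two one-line conversions. The function names are illustrative. Note the assumption: converting 0xFF to signed char gives −1 on the two's-complement targets used in practice, but the C standard left this conversion implementation-defined before C23.

```c
/* Reinterpret a raw register byte as a signed value.
   Yields -1 for 0xFF on two's-complement targets. */
int as_signed(unsigned char raw) {
    return (signed char)raw;
}

/* Keep the raw register byte as an unsigned value (0..255). */
int as_unsigned(unsigned char raw) {
    return raw;
}
```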
C.4 Operators
Arithmetic:
a + b a - b a * b a / b a % b /* modulo: 7%3 = 1 */
Relational and Equality:
a > b a >= b a < b a <= b
a == b /* EQUAL — two equals signs. Using = instead of == is the #1 bug */
a != b /* not equal */
Logical:
a && b /* AND: both must be true */
a || b /* OR: at least one true */
!a /* NOT: flip true/false */
Bitwise — Essential for embedded hardware control:
Wolf §2.3.2 shows bitwise operations in the ARM instruction set: AND, ORR (OR), EOR (XOR), BIC (bit clear). These translate directly to C:
P1 = P1 | 0x01; /* SET bit 0 — OR with mask */
P1 = P1 & 0xFE; /* CLEAR bit 0 — AND with inverted mask (0xFE = 1111 1110) */
P1 = P1 ^ 0x01; /* TOGGLE bit 0 — XOR with mask */
~P1 /* INVERT all bits */
a << n /* left shift: multiply by 2^n (as long as no bits overflow) */
a >> n /* right shift: divide by 2^n (for unsigned or non-negative values) */
Wolf ARM instruction BIC r0,r1,r2: "sets r0 to r1 AND NOT r2. This instruction uses the second source operand as a mask: Where a bit in the mask is 1, the corresponding bit in the first source operand is cleared." (Wolf §2.3.2, p.66) — this is the hardware equivalent of C's & with an inverted mask.
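The set, clear, and toggle idioms generalize to any bit position by building the mask with a shift. The helpers below are a sketch with made-up names; they return the new value rather than writing a real port, so they can be tried on any machine.

```c
#include <stdint.h>

/* Each helper returns the new register value; bit is 0..7. */
uint8_t set_bit(uint8_t reg, unsigned bit) {
    return (uint8_t)(reg | (1u << bit));
}

uint8_t clear_bit(uint8_t reg, unsigned bit) {
    return (uint8_t)(reg & ~(1u << bit));
}

uint8_t toggle_bit(uint8_t reg, unsigned bit) {
    return (uint8_t)(reg ^ (1u << bit));
}

/* C equivalent of ARM BIC r0,r1,r2: clear every bit of value set in mask. */
uint32_t bic(uint32_t value, uint32_t mask) {
    return value & ~mask;
}
```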
C.5 Control Flow
if / else if / else:
if (temperature > 100) {
fan = 1;
heater = 0;
} else if (temperature > 50) {
fan = 0;
} else {
heater = 1;
}
while loop — condition checked BEFORE body executes:
Wolf Example 3.3 uses the while loop as the standard I/O pattern:
while (peek(OUT_STATUS) != 0); /* busy-wait: loop until device ready */
Infinite loop (standard in embedded systems — CPU must always be doing something):
while (1) {
do_task();
}
for loop — use when number of iterations is known:
for (int i = 0; i < 8; i++) {
send_bit((data >> i) & 1); /* shift and send each bit */
}
do…while — body executes AT LEAST once, condition checked after:
do {
value = read_ADC();
} while (value < 0); /* retry until valid reading */
switch:
switch (command) {
case 0x01: motor_forward(); break;
case 0x02: motor_reverse(); break;
case 0xFF: motor_stop(); break;
default: error_handler(); break; /* always include default */
}
Critical: End each case with a break statement unless fall-through is intended. The compiler does not require it, but without it execution "falls through" into the next case.
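Fall-through is easy to see by counting how many case bodies run. The function below is a contrived sketch written for this purpose (it is not from the textbook): case 1 deliberately omits its break.

```c
/* Counts how many case bodies run for a given command.
   Case 1 has no break, so it continues into case 2. */
int cases_executed(int command) {
    int count = 0;
    switch (command) {
    case 1:
        count++;
        /* fall through */
    case 2:
        count++;
        break;
    default:
        break;
    }
    return count;
}
```

With command equal to 1, both bodies run; in the motor example this kind of mistake would call motor_forward() and then motor_reverse().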
C.6 Functions and Procedure Calls
Wolf §2.3.3 covers procedure calls in depth for the ARM. The key mechanism is the stack frame:
"The compiler passes parameters and return variables in a block of memory known as a frame. The frame is also used to allocate local variables. The stack elements are frames. A stack pointer (sp) defines the end of the current frame, while a frame pointer (fp) defines the end of the last frame." (Wolf §2.3.3, p.75)
In C:
/* Declaration */
int add(int a, int b);
/* Definition */
int add(int a, int b) {
return a + b; /* a and b are local to this call (held in registers or on the stack) */
}
/* Call */
int result = add(3, 4); /* parameters passed; return value received */
ARM procedure call standard (Wolf §2.3.3): r0–r3 pass the first four parameters; r0 holds the return value; r11 = frame pointer; r13 = stack pointer.
Nested calls use a stack (Wolf Figure 2.17): f1() calls f2() which calls f3(). Each call pushes a new frame; each return pops it. This is exactly why saving registers on a software stack (the answer to Exam Q.29) works for nested interrupts.
C.7 Memory-Mapped I/O in C
Wolf §3.2.2 shows the standard embedded pattern for accessing hardware:
#define OUT_CHAR 0x1000 /* output device character register */
#define OUT_STATUS 0x1001 /* output device status register */
/* Busy-wait write: */
char *mystring = "Hello";
char *current_char = mystring;
while (*current_char != '\0') {
poke(OUT_CHAR, *current_char); /* send character */
while (peek(OUT_STATUS) != 0); /* wait for device */
current_char++; /* advance pointer */
}
(Wolf Example 3.2, p.103–104)
Interrupt-driven version (Wolf Example 3.4, p.106):
void input_handler() { /* ISR — runs on interrupt */
achar = peek(IN_DATA); /* read character */
gotchar = TRUE; /* signal main program */
poke(IN_STATUS, 0); /* reset device status */
}
main() {
while (TRUE) {
if (gotchar) { /* check flag set by ISR */
poke(OUT_DATA, achar);
poke(OUT_STATUS, 1);
gotchar = FALSE;
}
}
}
The key insight Wolf establishes: interrupt-driven I/O is more efficient than busy-wait because "the CPU does nothing but test the device status while the I/O transaction is in progress" in the busy-wait case (Wolf §3.2.4, p.104).
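The excerpt above does not show how gotchar and achar are declared. A common practice, assumed here and not taken from Wolf, is to declare variables shared between a handler and the main loop as volatile so the compiler re-reads them on every test. The sketch simulates the handler with an ordinary function call; all names are hypothetical.

```c
#include <stdbool.h>

/* Shared between the handler and the main loop. volatile tells the
   compiler the value can change outside normal program flow. */
static volatile bool gotchar = false;
static volatile char achar;

/* Stand-in for the ISR; on real hardware the interrupt would invoke it. */
void input_handler_sim(char received) {
    achar = received;
    gotchar = true;
}

/* One pass of the main loop: returns the character if one arrived,
   or -1 if nothing is pending. */
int poll_once(void) {
    if (gotchar) {
        gotchar = false;
        return (unsigned char)achar;
    }
    return -1;
}
```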
C.8 Assembly vs C — The Textbook Comparison
Wolf Example 2.2 (p.69) generates ARM assembly from C statements. For x = (a + b) - c:
ldr r2, [fp, #-24] ; load a
ldr r3, [fp, #-28] ; load b
add r2, r2, r3 ; a + b
ldr r3, [fp, #-32] ; load c
rsb r3, r3, r2 ; (a+b) - c using reverse subtract
str r3, [fp, #-36] ; store into x
This is 6 lines of assembly for one C statement. For y = a*(b+c):
ldr r2, [fp, #-28]
ldr r3, [fp, #-32]
add r2, r2, r3
ldr r3, [fp, #-24]
mul r3, r2, r3
str r3, [fp, #-40]
And for z = (a << 2) | (b & 15):
ldr r3, [fp, #-24]
mov r2, r3, asl #2 ; shift left 2 = multiply by 4
ldr r3, [fp, #-28]
and r3, r3, #15 ; mask lower 4 bits
orr r3, r2, r3 ; bitwise OR
str r3, [fp, #-44]
Inline assembly in C (connects to Exam Q.20): when you need exact assembly inside a C program:
void precise_delay(void) {
__asm__("NOP"); /* one clock cycle — inline assembly */
__asm__("NOP");
}
PART D: ALL CHAPTERS IN DETAIL
Wolf Chapters 1–5; organized by course week
D.1 Chapter 1 — Embedded Computing Foundations
Wolf Ch.1 | Course Weeks 1–2 | CLO 1
Definition and Characteristics
Wolf §1.2 defines an embedded system: "Loosely defined, it is any device that includes a programmable computer but is not itself intended to be a general-purpose computer. Thus, a PC is not itself an embedded computing system. But a fax machine or a clock built from a microprocessor is an embedded computing system." (Wolf p.2)
Why Use Microprocessors? (Wolf §1.2.3)
Wolf gives three reasons:
- Efficiency: "Microprocessors execute programs very efficiently. Modern RISC processors can execute one instruction per clock cycle most of the time."
- Optimization: "Microprocessor manufacturers spend a great deal of money to make their CPUs run very fast… Few products can justify the dozens or hundreds of computer architects and VLSI designers customarily employed."
- Programmability: "Programmability makes it easier to design families of products. In many cases, high-end products can be created simply by adding code without changing the hardware." (Wolf §1.2.3, p.6–7)
Key Embedded System Constraints (Wolf §1.2.2)
Wolf identifies these simultaneous design constraints:
- Complex algorithms — e.g., engine control filtering (Wolf p.4)
- User interfaces — GPS moving maps (Wolf p.5)
- Real-time deadlines — "If the data are not ready by a certain deadline, the system breaks." (Wolf p.5)
- Multirate — audio and video at different rates, synchronized (Wolf p.5)
- Manufacturing cost — chip selection, memory amount (Wolf p.5)
- Power and energy — "Power consumption directly affects the cost of the hardware… Energy consumption affects battery life." (Wolf p.5)
Cyber-Physical Systems (Wolf §1.2.4)
Wolf defines: "A cyber-physical system is one that combines physical devices, known as the plant, with computers that control the plant." (Wolf p.7) The BMW ABS/ASC+T example (Wolf Design Example 1.1) shows a real cyber-physical system with wheel sensors, hydraulic pumps, and two microprocessors on separate circuit boards.
Safety and Security (Wolf §1.2.5)
Wolf establishes: "Security and safety cannot be bolted on — they must be baked in." (Wolf p.9)
- Security = preventing malicious attacks (data integrity, privacy)
- Safety = controlling energy release/physical behavior
- Together: "A poorly designed car can allow an attacker to install software in the car and take over operation of that car." (Wolf p.9)
Key cryptographic concepts Wolf covers (§1.2.5):
- AES — secret-key block cipher (128-bit blocks, 128/192/256-bit keys)
- RSA — public-key algorithm (private + public key pair)
- SHA-3 — cryptographic hash function
- Digital signatures — combines private key signing + public key verification
Design Process (Wolf §1.3)
Wolf describes a hierarchical, top-down process with bottom-up feedback (Wolf Figure 1.1):
Requirements → Specification → Architecture → Components → System Integration
- Requirements (Wolf §1.3.1): Informal customer description. Includes functional (what it does) and non-functional (performance, cost, power, size) requirements. Wolf's requirements form [Figure 1.2]: Name, Purpose, Inputs, Outputs, Functions, Performance, Manufacturing Cost, Power, Physical Size.
- Specification (Wolf §1.3.2): "The specification is more precise — it serves as the contract between the customer and the architects." Must be unambiguous and verifiable.
- Architecture (Wolf §1.3.3): Describes HOW the system is structured — hardware and software block diagrams (Wolf Figure 1.4).
- System Integration (Wolf §1.3.5): "Bugs are typically found during system integration… debugging facilities for embedded systems are usually much more limited than what you would find on desktop systems." (Wolf p.22)
D.2 Chapter 2 — Instruction Sets and Processor Architectures
Wolf Ch.2 | Course Weeks 3, 6 | CLO 2
Computer Architecture Taxonomy (Wolf §2.2.1)
Von Neumann vs Harvard:
Wolf: "A Harvard architecture provides higher memory bandwidth; not making data and memory compete for the same port also makes it easier to move the data at the proper times." (Wolf §2.2.1, p.57)
| Feature | Von Neumann | Harvard |
|---|---|---|
| Buses | Single shared bus | Separate instruction + data buses |
| Speed | Bottleneck at bus | Parallel access possible |
| Used in | ARM7, early systems | ARM Cortex-M3, PIC16F, C55x |
Wolf notes: "ARM7 is a von Neumann architecture machine, while ARM Cortex-M3 uses a Harvard architecture." (Wolf §2.3.1, p.63)
RISC vs CISC (Wolf §2.2.1):
"CISC machines provided a variety of instructions that may perform very complex tasks… RISC computers tended to provide somewhat fewer and simpler instructions. RISC machines generally use load/store instruction sets — operations cannot be performed directly on memory locations, only on registers." (Wolf p.57)
| Feature | RISC | CISC |
|---|---|---|
| Instructions | Few, simple, uniform | Many, complex, variable |
| Cycles/instruction | ~1 (pipelined) | Variable |
| Memory access | Load/store only | Any instruction |
| Examples (Wolf) | ARM, C55x, PIC16F | x86 family |
Processor types (Wolf §2.2.1):
- Single-issue: one instruction at a time
- Superscalar: hardware finds parallel instructions at runtime — expensive, high power
- VLIW: compiler identifies parallel instructions — efficient for DSP (Wolf §2.2.3): "Because the processor does not have to analyze data dependencies at run time, VLIW processors are smaller and consume less power than superscalar processors." (Wolf p.61)
ARM Processor (Wolf §2.3)
- 16 general-purpose registers r0–r15; r15 = Program Counter
- CPSR (Current Program Status Register): N (negative), Z (zero), C (carry), V (overflow) flags
- Word = 32 bits; byte-addressable; supports little-endian and big-endian (Wolf §2.3.1)
- Conditional execution: ANY ARM instruction can execute conditionally based on CPSR flags — unlike most architectures where only branches are conditional
ARM data instructions (Wolf §2.3.2, Figure 2.10):
| Instruction | Operation |
|---|---|
| ADD, ADC | Add, Add with carry |
| SUB, SBC | Subtract, Subtract with carry |
| MUL, MLA | Multiply, Multiply-accumulate |
| AND, ORR, EOR | Bitwise AND, OR, XOR |
| BIC | Bit clear (AND NOT) |
| MOV, MVN | Move, Move negated |
| LSL, LSR, ASR, ROR | Shift and rotate |
| CMP, CMN, TST, TEQ | Compare (set flags only, no result) |
| LDR, STR | Load/Store word |
| LDRB, STRB | Load/Store byte |
PIC16F (Wolf §2.4)
- Harvard architecture, 8-bit word, 14-bit instructions (Wolf §2.4.1)
- Up to 8192 words flash program memory, 368 bytes SRAM data memory
- Accumulator (W register): most operations use W as implicit source/destination
- Special function registers (SFRs) in lowest 32 locations of each bank for I/O control
- Indirect addressing via INDF/FSR register pair (Wolf §2.4.2)
TI C55x DSP (Wolf §2.5)
- Accumulator architecture (Wolf §2.5): "many arithmetic operations are of the form accumulator = operand + accumulator"
- Four 40-bit accumulators: AC0–AC3 (40 bits including 8 guard bits for DSP precision)
- Eight auxiliary registers AR0–AR7 for indirect addressing and circular buffers
- 24-bit address space (16 MB), byte-addressable program space, word-addressable data space
- Three addressing modes: Absolute, Direct, Indirect (Wolf §2.5.2)
- Supports circular addressing for FIR filter buffers — standard DSP requirement
D.3 Chapter 3 — CPUs: I/O, Interrupts, Memory
Wolf Ch.3 | Course Weeks 4–5 | CLO 2
I/O Device Structure (Wolf §3.2.1)
Wolf describes the standard I/O device interface (Figure 3.1): "Devices typically have several registers: Data registers hold values that are treated as data by the device. Status registers provide information about the device's operation, such as whether the current transaction has completed." (Wolf p.100)
Memory-mapped I/O (Wolf §3.2.2): "Memory-mapped I/O provides addresses for the registers in each I/O device. Programs use the CPU's normal read and write instructions to communicate with the devices." (Wolf p.102) No special I/O instructions needed — standard LDR/STR on ARM.
Three I/O Methods
1. Busy-Wait (Polling) (Wolf §3.2.3):
Wolf: "Busy-wait I/O is extremely inefficient — the CPU does nothing but test the device status while the I/O transaction is in progress." (Wolf §3.2.4, p.104)
while (peek(OUT_STATUS) != 0); /* spin until device ready */
2. Interrupt-Driven (Wolf §3.2.4):
"The interrupt mechanism allows devices to signal the CPU and to force execution of a particular piece of code. When an interrupt occurs, the program counter's value is changed to point to an interrupt handler routine." (Wolf §3.2.4, p.105)
Interrupt signals (Wolf Figure 3.2):
- Device asserts interrupt request when it wants service
- CPU asserts interrupt acknowledge when ready to handle it
Interrupt mechanism steps (Wolf §3.2.4):
- Device asserts interrupt request
- CPU finishes current instruction
- CPU saves PC (and flags/registers) to stack
- CPU reads Interrupt Vector Table (IVT) → gets ISR address
- CPU jumps to ISR
- ISR executes, handles device
- ISR executes RETI → CPU restores PC from stack
- Original program resumes
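The vector-table lookup in the steps above can be modeled in C as an array of function pointers. This is a desktop simulation under stated assumptions: the table size, the handler, and all names are invented for illustration, and the ordinary C call and return stand in for the hardware saving and restoring the PC.

```c
#define NUM_VECTORS 4                 /* illustrative table size */

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];   /* simulated interrupt vector table */
static int   serviced_count;              /* how many times the handler ran */

/* Hypothetical handler used only for this sketch. */
void uart_isr(void) {
    serviced_count++;
}

/* Install a handler for an interrupt number. */
void install_isr(int irq, isr_t handler) {
    if (irq >= 0 && irq < NUM_VECTORS) {
        vector_table[irq] = handler;
    }
}

/* Simulated dispatch: look up the vector and call the handler. */
void dispatch_interrupt(int irq) {
    if (irq >= 0 && irq < NUM_VECTORS && vector_table[irq]) {
        vector_table[irq]();
    }
}
```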
3. DMA (Direct Memory Access) — transfers data between device and memory without CPU involvement. Used for large bulk transfers.
UART (Wolf Application Example 3.1)
Wolf covers the 8251 UART in detail (§3.2.1): "The UART is programmable for a variety of transmission and reception parameters… Every character starts with a start bit (a 0) and a stop bit (a 1). The start bit allows the receiver to recognize the start of a new character." (Wolf p.101)
UART configuration parameters: baud rate, data bits (5/6/7/8), parity (none/odd/even), stop bits (1/1.5/2). Example: 9600 8N1 = 9600 baud, 8 data bits, No parity, 1 stop bit.
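The configuration parameters determine how many bits go on the wire per character and therefore how long each character takes. A minimal sketch, with illustrative function names and whole-number stop bits only:

```c
/* Bits on the wire per character: 1 start + data + optional parity + stop. */
int uart_frame_bits(int data_bits, int parity_bits, int stop_bits) {
    return 1 + data_bits + parity_bits + stop_bits;
}

/* Time to send one character, in microseconds. */
double uart_char_time_us(int baud, int frame_bits) {
    return 1e6 * frame_bits / baud;
}
```

For 9600 8N1 the frame is 10 bits, so one character takes a little over 1 ms and the link carries at most 960 characters per second.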
Supervisor Mode, Exceptions, Traps (Wolf §3.3)
- Supervisor mode (Wolf §3.3.1): privileged CPU mode for OS and critical operations. ARM uses the SWI instruction to enter: "SWI causes the CPU to go into supervisor mode and sets the PC to 0x08." (Wolf p.117)
- Exception (Wolf §3.3.2): internally detected error (division by zero, illegal memory access, undefined instruction). "Resets, undefined instructions, and illegal memory accesses are other typical examples of exceptions." (Wolf p.118)
- Trap (Wolf §3.3.3): software-generated interrupt. "A trap, also known as a software interrupt, is an instruction that explicitly generates an exception condition. The most common use of a trap is to enter supervisor mode." (Wolf p.118)
The Trap Handler (connects to Exam Q.17): Wolf §3.3.3 — the handler that receives control when a trap occurs, determines the cause (via opcode embedded in SWI), and dispatches to the appropriate service routine.
Caches (Wolf §3.5.1)
Wolf §3.5.1: "A cache is a small, fast memory that holds copies of some of the contents of main memory. Because the cache is fast, it provides higher-speed access for the CPU; but because it is small, not all requests can be satisfied by the cache." (Wolf p.121)
Cache performance:
$$t_{avg} = h \cdot t_{cache} + (1 - h) \cdot t_{main}$$
Where h = hit rate, t_cache = cache access time, t_main = main memory access time.
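As a quick numeric check, the average access time weights the cache time by the hit rate h and the main-memory time by the miss rate 1 - h. The helper below is a minimal sketch; the function name and the sample numbers are assumptions, not values from Wolf.

```c
#include <assert.h>

/* Average memory access time: hits cost t_cache, misses cost t_main. */
double avg_access_time(double h, double t_cache, double t_main)
{
    assert(h >= 0.0 && h <= 1.0);   /* hit rate is a fraction */
    return h * t_cache + (1.0 - h) * t_main;
}
```

With h = 0.75, t_cache = 2 ns and t_main = 10 ns, the result is 0.75 × 2 + 0.25 × 10 = 4 ns.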
Cache miss types (Wolf §3.5.1):
- Compulsory miss: first access to a location — unavoidable
- Capacity miss: working set too large for cache
- Conflict miss: two locations map to same cache block
Cache organization:
- Direct-mapped (Wolf Figure 3.8): each memory address maps to exactly one cache block. Fast, simple, but susceptible to conflict misses.
- Set-associative (Wolf Figure 3.9): n-way — each address can map to n different blocks. Higher hit rate, slightly slower, harder to predict.
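The direct-mapped rule, and the conflict misses it causes, can be illustrated by splitting an address into tag and index. The geometry below (16-byte blocks, 64 blocks) is a hypothetical choice for this sketch and does not come from Wolf's figures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry for this sketch: 16-byte blocks, 64 blocks (1 KiB). */
#define BLOCK_SIZE  16u
#define NUM_BLOCKS  64u

/* Cache block that a byte address maps to in a direct-mapped cache. */
uint32_t cache_index(uint32_t addr)
{
    return (addr / BLOCK_SIZE) % NUM_BLOCKS;
}

/* Tag stored with the block to tell apart addresses sharing an index. */
uint32_t cache_tag(uint32_t addr)
{
    return addr / (BLOCK_SIZE * NUM_BLOCKS);
}

/* Two addresses conflict when they share an index but differ in tag. */
int addresses_conflict(uint32_t a, uint32_t b)
{
    return cache_index(a) == cache_index(b) && cache_tag(a) != cache_tag(b);
}
```

Addresses 0x0040 and 0x0440 are exactly one cache size apart, so they compete for block 4 and evict each other in a direct-mapped cache. An n-way set-associative cache could hold both at once.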
Write policies:
- Write-through: every write updates both cache and main memory — simple, consistent
- Write-back: only write to main memory when block is evicted — fewer writes, more complex
D.4 Digital Logic Foundations
Course Chapter 1 notes | Weeks 1–2 | CLO 1
Boolean Algebra Laws
| Law | AND Form | OR Form |
|---|---|---|
| Identity | A · 1 = A | A + 0 = A |
| Null | A · 0 = 0 | A + 1 = 1 |
| Idempotent | A · A = A | A + A = A |
| Complement | A · A' = 0 | A + A' = 1 |
| Commutative | A·B = B·A | A+B = B+A |
| Associative | (A·B)·C = A·(B·C) | (A+B)+C = A+(B+C) |
| Distributive | A·(B+C) = A·B+A·C | A+(B·C) = (A+B)·(A+C) |
| De Morgan's | (A·B)' = A'+B' | (A+B)' = A'·B' |
De Morgan's Laws are the most exam-critical. They explain NAND = NOT(AND) = OR of inverted inputs; NOR = NOT(OR) = AND of inverted inputs.
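De Morgan's Laws can be checked exhaustively, since two 1-bit inputs give only four cases. The following is a small sketch with hypothetical function names; it compares NAND and NOR against their "inverted inputs" forms for every input combination.

```c
#include <assert.h>
#include <stdint.h>

/* NAND and NOR on single-bit values (0 or 1). */
unsigned nand_gate(unsigned a, unsigned b) { return !(a && b); }
unsigned nor_gate(unsigned a, unsigned b)  { return !(a || b); }

/* Returns 1 if both De Morgan identities hold for every input combination. */
int demorgan_holds(void)
{
    for (unsigned a = 0; a <= 1; a++) {
        for (unsigned b = 0; b <= 1; b++) {
            if (nand_gate(a, b) != (unsigned)(!a || !b)) return 0;  /* (A·B)' = A' + B' */
            if (nor_gate(a, b)  != (unsigned)(!a && !b)) return 0;  /* (A+B)' = A'·B' */
        }
    }
    return 1;
}
```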
XOR Truth Table
| A | B | A⊕B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
Output is 1 when inputs are different. Used in: parity checking, bit toggling, half-adder sum.
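The three uses of XOR named above map directly onto C's `^` operator. The helpers below are an illustrative sketch with hypothetical names, operating on plain values rather than real hardware registers.

```c
#include <assert.h>
#include <stdint.h>

/* Even-parity bit of a byte: XOR of all eight bits (1 if the count of 1s is odd). */
uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);   /* fold the upper nibble onto the lower */
    x ^= (uint8_t)(x >> 2);   /* fold 4 bits down to 2 */
    x ^= (uint8_t)(x >> 1);   /* fold 2 bits down to 1 */
    return (uint8_t)(x & 1u);
}

/* Toggle one bit of a register value by XOR-ing with a single-bit mask. */
uint8_t toggle_bit(uint8_t reg, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(reg ^ (1u << bit));
}

/* Half-adder: sum is XOR, carry is AND. */
void half_adder(unsigned a, unsigned b, unsigned *sum, unsigned *carry)
{
    *sum = a ^ b;
    *carry = a & b;
}
```

Toggling the same bit twice restores the original value, which is why XOR is the usual idiom for blinking an LED pin.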
Counters, Timers, FSMs
Counters: Sequential circuits from flip-flops. n-bit counter → counts 0 to $2^n - 1$.
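Timer interrupt period:
$$T_{timer} = \frac{prescaler \times (TOP + 1)}{f_{clock}}$$
Example: f_clock = 16 MHz, prescaler = 256, TOP = 249:
$$T_{timer} = \frac{256 \times (249 + 1)}{16{,}000{,}000\ \text{Hz}} = 4\ \text{ms}$$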
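The timer example can be checked in integer arithmetic. This sketch assumes the counter runs from 0 through TOP inclusive, so one interrupt period spans TOP + 1 prescaled ticks (as in a clear-on-compare-match mode). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Timer interrupt period in microseconds:
 * period = prescaler * (TOP + 1) / f_clock, with the counter running 0..TOP. */
uint32_t timer_period_us(uint32_t f_clock_hz, uint32_t prescaler, uint32_t top)
{
    assert(f_clock_hz > 0);
    uint64_t cycles = (uint64_t)prescaler * ((uint64_t)top + 1u);  /* clock cycles per interrupt */
    return (uint32_t)(cycles * 1000000u / f_clock_hz);
}
```

For f_clock = 16 MHz, prescaler = 256 and TOP = 249 this gives 64,000 clock cycles per interrupt, i.e. 4000 µs (4 ms, or 250 interrupts per second).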
FSM types:
- Moore: output depends only on current state
- Mealy: output depends on current state AND inputs
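The Moore/Mealy distinction shows up in code as the signature of the output function. The sketch below implements the same hypothetical task (detect two consecutive 1s on an input) both ways; the state names and the task itself are assumptions made for illustration.

```c
#include <assert.h>

/* Moore machine: detects two consecutive 1s; output depends on state only. */
typedef enum { S_NONE, S_ONE, S_TWO } moore_state_t;

moore_state_t moore_next(moore_state_t s, int in)
{
    if (!in) return S_NONE;                 /* any 0 resets the detector */
    return (s == S_NONE) ? S_ONE : S_TWO;   /* count 1s, saturating at two */
}

int moore_output(moore_state_t s) { return s == S_TWO; }

/* Mealy machine for the same task: output depends on state AND current input. */
typedef enum { M_NONE, M_ONE } mealy_state_t;

mealy_state_t mealy_next(mealy_state_t s, int in)
{
    (void)s;                                /* next state depends only on the input here */
    return in ? M_ONE : M_NONE;
}

int mealy_output(mealy_state_t s, int in) { return s == M_ONE && in; }
```

The Mealy version needs one state fewer and reacts in the same step as the second 1 arrives, while the Moore version reports it only after entering `S_TWO`.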
D.5 ADC — Analog-to-Digital Conversion
Course Week 12 | CLO 3
Three Steps (connects to Exam Q.11)
- Sampling — measuring the analog voltage at regular time intervals
- Quantization — mapping to the nearest of $2^n$ discrete digital levels (introduces quantization error ≤ ±0.5 LSB)
- Encoding — converting the quantized level to binary output
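Nyquist-Shannon Theorem: $f_s \geq 2 f_{max}$, where f_s is the sampling rate and f_max is the highest frequency present in the signal. Violating this causes aliasing.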
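ADC Formulas
$$\text{Resolution} = \frac{V_{ref}}{2^n} \qquad V_{in} = \frac{\text{reading} \times V_{ref}}{2^n}$$
Example (Exam Q.6): 10-bit ADC, V_ref = 3.3V:
$$\text{Resolution} = \frac{3.3\ \text{V}}{2^{10}} = \frac{3.3\ \text{V}}{1024} \approx 3.22\ \text{mV per step}$$
Example 2: 10-bit ADC, V_ref = 5V, reading = 512:
$$V_{in} = \frac{512 \times 5\ \text{V}}{1024} = 2.5\ \text{V}$$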
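Both ADC examples can be reproduced with integer arithmetic, which is how such conversions are often written on a microcontroller without floating-point hardware. This is a sketch under the assumption that the divisor is 2^n (1024 for 10 bits), matching these notes. The function names and the choice of millivolt and microvolt units are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Step size (resolution) in microvolts: V_ref / 2^n. */
uint32_t adc_step_uv(uint32_t vref_mv, unsigned bits)
{
    assert(bits > 0 && bits <= 16);
    return (vref_mv * 1000u) >> bits;   /* integer division by 2^n, truncating */
}

/* Input voltage in millivolts for a raw reading: reading * V_ref / 2^n. */
uint32_t adc_to_mv(uint32_t reading, uint32_t vref_mv, unsigned bits)
{
    assert(bits > 0 && bits <= 16);
    assert(reading < (1u << bits));     /* valid codes are 0 .. 2^n - 1 */
    return (reading * vref_mv) >> bits;
}
```

`adc_step_uv(3300, 10)` truncates 3222.65 µV to 3222 µV (about 3.22 mV per step), and `adc_to_mv(512, 5000, 10)` gives 2500 mV, i.e. 2.5 V.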
D.6 Serial Communication
Course Week 11 | CLO 3 | Connects to Exam Q.18
| Feature | UART | SPI | I²C |
|---|---|---|---|
| Wires | 2 (TX, RX) | 4 (MOSI, MISO, SCK, CS) | 2 (SDA, SCL) |
| Synchronous? | No (async) | Yes | Yes |
| Duplex | Full | Full | Half |
| Speed | Low–Medium | High | Low–Medium |
| Topology | Point-to-point | 1 master, N slaves | Multi-master, multi-slave |
SPI is the answer to Exam Q.18 because it is the only one that is both synchronous AND full-duplex. Wolf covers UART in §3.2.1 as the foundational serial communication example.
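Why SPI is full-duplex can be seen by modeling the master and slave as two 8-bit shift registers clocked together. The sketch below assumes MSB-first transfer and ignores clock polarity and phase as well as chip select; `spi_exchange` is a hypothetical name for a pure software model, not a driver.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-clock SPI transfer, MSB first. On every clock the master drives MOSI
 * from its shift register while the slave drives MISO from its own, so the
 * two bytes trade places: full duplex on a shared clock. */
void spi_exchange(uint8_t *master_reg, uint8_t *slave_reg)
{
    assert(master_reg != 0 && slave_reg != 0);
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((*master_reg >> 7) & 1u);  /* master's outgoing bit */
        uint8_t miso = (uint8_t)((*slave_reg  >> 7) & 1u);  /* slave's outgoing bit */
        *master_reg = (uint8_t)((*master_reg << 1) | miso); /* master shifts in MISO */
        *slave_reg  = (uint8_t)((*slave_reg  << 1) | mosi); /* slave shifts in MOSI */
    }
}
```

After eight clocks the master holds the slave's original byte and the slave holds the master's, so every SPI write is simultaneously a read.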
MASTER FORMULA REFERENCE
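| Formula | Calculates |
|---|---|
| $D = \frac{t_{ON}}{T} \times 100\%$ | PWM Duty Cycle |
| $V_{avg} = D \times V_{supply}$ | Average voltage from PWM |
| $f = \frac{1}{T}$ | Frequency from period |
| $T = \frac{1}{f}$ | Period from frequency |
| $t_{OFF} = T - t_{ON}$ | PWM off-time |
| $\text{Resolution} = \frac{V_{ref}}{2^n}$ | ADC resolution per step |
| $V_{in} = \frac{\text{reading} \times V_{ref}}{2^n}$ | Voltage from ADC reading |
| $f_s \geq 2 f_{max}$ | Nyquist sampling criterion |
| $T_{timer} = \frac{prescaler \times (TOP + 1)}{f_{clock}}$ | Timer interrupt period |
| $t_{avg} = h \cdot t_{cache} + (1 - h) \cdot t_{main}$ | Average memory access time |
| EA = Base_Reg + Offset | Base+offset addressing (ARM) |
| EA = PC + (offset × 4) | PC-relative branch (ARM) |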
EXAM STRATEGY NOTES
- For Duty Cycle questions: Identify T and t_ON first. Compute D = t_ON/T × 100%. Then V_avg = V_supply × D (with D as a fraction, not a percentage). All three formulas come from the same two numbers.
- For Addressing Mode questions: Identify the MODE from the instruction syntax. In ARM: `#value` = immediate, `[reg]` = indirect, `[reg, #offset]` = base+offset, `B label` = PC-relative.
- For C code analysis: Trace line by line. Know that `==` tests equality (never write `=` in a condition). Bitwise `&` masks bits; `|` sets bits; `^` toggles bits.
- Wolf's key principle for I/O: Busy-wait wastes CPU; interrupts free the CPU. Use interrupts when events are rare or unpredictable (Wolf §3.2.4).
- Common mark-losing errors:
  - Writing `=` instead of `==` in conditions
  - Using 1023 instead of 1024 in ADC calculations
  - Confusing what is STORED in the IVT (the ISR address) vs the ISR itself
  - Confusing synchronous (SPI, I²C) with asynchronous (UART)
  - Forgetting `break` in switch cases
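The C points in the list above (masking, setting, and toggling bits; `==` versus `=`; `break` in every case) can be practiced on a short sketch like the following. `LED_BIT`, the function names, and the mode-to-duty mapping are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LED_BIT 3u   /* hypothetical bit position used for illustration */

/* Bitwise idioms: | sets, & with an inverted mask clears, ^ toggles, & tests. */
uint8_t led_set(uint8_t reg)    { return (uint8_t)(reg | (1u << LED_BIT)); }
uint8_t led_clear(uint8_t reg)  { return (uint8_t)(reg & ~(1u << LED_BIT)); }
uint8_t led_toggle(uint8_t reg) { return (uint8_t)(reg ^ (1u << LED_BIT)); }
int     led_is_on(uint8_t reg)  { return (reg & (1u << LED_BIT)) != 0; }

/* switch with a break in every case; without break, execution would fall
 * through into the next case and overwrite duty. */
int mode_to_duty_percent(int mode)
{
    int duty = 0;
    switch (mode) {
    case 0:  duty = 0;   break;
    case 1:  duty = 50;  break;
    case 2:  duty = 100; break;
    default: duty = -1;  break;   /* unknown mode */
    }
    return duty;
}
```

When tracing, note that a condition written as `if (mode = 1)` would assign 1 to `mode` and always be true, whereas `if (mode == 1)` compares without changing anything.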