Context Switching
In a multi-tasking operating system such as ours, a task or process executes as if it alone runs on the processor, and has no knowledge or concern about other possible tasks. Each task is suspended and resumed transparently from the point of view of the task, without any record-keeping or maintenance performed by the task itself.
In order to run several tasks concurrently, each task is comprised of code, a sequence of machine instructions to perform, and a context in which the processor executes these instructions. Several tasks can even execute the same code; their contexts distinguish them.
What is a Context?
The context of a task includes all the information necessary to describe the state of the processor between execution of instructions, as well as a history of the execution of the task in the form of a stack. The context does not include information about the other elements of the micro-controller apart from the processor: timers, ports, peripherals, etc.
On the AT90USB1287, the context is comprised of:
- a program stack, which records the nesting of function calls
- the general purpose registers 0 to 31
- the Status Register (SREG), which indicates conditions of the processor
- the Program Counter register (PC), which holds the address of the next machine instruction to be executed
- the Stack Pointer register (SP), which points to the address in SRAM in which the next pushed item is to be placed, which is also the address above the next item to be popped.
Memory & The Stack
The AT90USB1287 uses a Harvard memory architecture to separate data memory (internal SRAM) and the program memory space. The program memory stores the instructions to be executed. The data memory stores the variables used in the program, as well as the program stack.
The above picture represents the layout our data memory after initialization by the C runtime library.
- The .data section lies at the top of memory. It holds static variables which are initialized in the
program. (The meaning of static here also includes top-level variables which are not declared to be
static
.) - The .bss section which follows holds the remainder of the static variables, which are not initialized or are initialized to zero.
- The stack area starts at the end of memory. The SP register initially points to the last address in SRAM and the stack grows "downward" towards the lower addresses as items are pushed on. There is no actual separation of the stack from the other areas--if enough pushes are made, the program will "blow up" the stack and overrun the program data.
- the area following the .bss section is the heap, where memory can be dynamically allocated
as with
malloc()
. We do not use dynamic allocation in our kernel, but the user application might even though it is generally not recommended in embedded systems.
Where is "the" stack?
In the picture above, the stack pointer initially points to the end of memory. But each task is to have its own stack. Where are these to reside?
The key observation is that a stack can reside anywhere in SRAM. So we declare an array of sufficient size to hold one task's stack. Of course, we still risk blowing up stacks as before. The array for the stack can be grouped together with the other context data for the task:
typedef struct { uint8_t stack[256]; uint8_t* sp; } context_type;
Then we set aside the memory for all of the task contexts, including the stacks, with the definition:
static context_type context[3];
The data memory now looks like this,
Within each context we must store all the elements of the task's context: R0 to R31, SREG, PC and an SP. How this is done is described below.
Switching Contexts
Now let us imagine that the task of context[0]
is running, and that the OS needs to suspend
this task and resume the task of context[1]
.
The processor's stack pointer, SP, is currently pointing somewhere inside the stack array in the context.
Somehow, either with an interrupt or with an explicit call to next()
, the context switch is
initiated. In either case, the address of the next instruction to return to is pushed onto the stack. This
takes care of storing the task's PC.
The first few instructions of the ISR or the next()
function push the contents of the
status register SREG and all 32 general purpose registers onto the stack as well.
The last element of the task's context, the current value of SP, is saved in context[0].sp
and next()
is ready to switch to another context.
next() will first change the SP register to point to the top of the stack in next task's context,
using the value saved in context[1].sp
.
Notice that the context[1]
task has been left in a similar state as the context[0]
task.
So to restore this context, the values of the registers must be popped off the new stack in the reverse order.
The final step to restore this context is to restore the program counter from the stack
and resume execution of the task.
This is precisely the action of the assembly instruction ret
.
Now we are now running task 1's code with context[1] stack. The context switch is complete.
Constructing a Context
We described how to resume a suspended task, but how are tasks started in the first place? The trick is to initialize the task's context as if it had been suspended, even though it wasn't.
At the bottom of the stack (the end of the stack array) needs to lie the address to "return" to the first time the task is switched in. We want this to be the first instruction in the task's code. In C, this is just the name of the task's function without parentheses following. The address occupies two bytes and care must be taken to place the bytes in the right order at the end of the stack array.
Above the PC (earlier in the array), we need initial values for the registers and SREG. Since the task is executing as if its function were being called for the first time, the only crucial values are r1 (the "zero" register), which gcc requires to contain the value 0, and the I bit of SREG, which should be 1 to enable interrupts.
Then the stored value for SP for this task must be set to point to the "top" of the stack, 35 bytes from the end of the array.
This setup procedure would be good enough if each task terminated itself properly with a call to
Task_Terminate()
, or else never returns. To handle the case where the user simply lets the task
code function call return
, we need to supply another address even lower in the stack. So we actually
place all the items listed above 2 spots lower in the stack array
(higher in the stack), and at the end of the array we place
the address of Task_Terminate
itself. Then if the function returns, it will begin executing
Task_Terminate()
as if the user had called it.