SE350 Labs

The labs guided us in building a very basic kernel in C responsible for process scheduling, memory management, inter-process communication and I/O handling.

RTX kernel on an STM32 microcontroller.

3 Major deliverables

  1. Multitasking (due Feb 9)
    • A simple OS that supports multi-threading
    • Each thread has own stack space
    • Tasks share the processor cooperatively by yielding
  2. Memory Management (due March 8)
    • Features for memory allocation & deallocation on the heap
  3. Pre-emptive Multitasking (due April 5)
    • Tasks have priorities (deadlines)
    • OS juggles tasks - a time slice per task, earliest deadlines first

Preparations

  • Find group of 4
  • Acquire hardware (required per person)
    • For this lab project you will be using the STM32F401RE or STM32F411RE microcontroller packaged in an STM Nucleo-64 board.
  • Download and install IDE

No standard C library functions may be used in the kernel code, with the exception of printf. In particular, malloc is not permitted; you will be writing your own memory allocator.

Required Header Files

  • common.h: hold common definitions that multiple files may need.
    • For example: the maximum number of tasks your OS can support
  • k_task.h: Functionality and definitions required for running threads
  • k_mem.h: Functionality and definitions for the memory allocator

Deliverable 1 (Multitasking Part 1)

Work with an embedded development kit (STM Nucleo-64) that is equipped with a high-performance, low-power microcontroller (the STM32F401RE or STM32F411RE).

In this project you will create a kernel and develop the multitasking features in your kernel using the scheme of co-operative multitasking and round-robin scheduling. Specifically, you will:

  • Write the fundamental kernel API functions osKernelInit, osCreateTask, osKernelStart, osYield, osTaskInfo, osTaskExit
  • Create the Task Control Block (TCB) data structure and set up relevant OS metadata as per your group’s design
  • Understand how the automatic evaluation code interfaces with your RTX, and run basic public tests to ensure that your implementation conforms to the API specification

Design Considerations

  1. How are you planning on storing your TCBs?
  2. How does the kernel know which task is currently executing?
  3. How does the scheduler know which tasks are available for scheduling?
  4. What data does the kernel need to store and access in order to load and unload tasks?
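One possible answer to questions 1 to 3, as a minimal sketch: a global, statically sized TCB array indexed by tid, an integer recording the current tid, and a round-robin scan for the next ready task. The names tcb_table, current_tid, next_ready_tid, the state values and the MAX_TASKS value are assumptions for illustration, not the group's actual design.

```c
#include <stdint.h>

#define MAX_TASKS 16          /* assumed limit; the real value lives in common.h */

typedef enum { DORMANT = 0, READY, RUNNING } task_state_t;  /* hypothetical states */

typedef struct {
    void (*ptask)(void *args); /* task entry point */
    uint32_t sp;               /* saved PSP for this thread */
    uint8_t tid;               /* index of this TCB in the table */
    uint8_t state;             /* one of task_state_t */
} TCB;

static TCB tcb_table[MAX_TASKS]; /* global, statically allocated TCB storage */
static int current_tid = -1;     /* -1 means no task is running yet */

/* Round-robin: scan forward from the task after the current one and
 * return the first READY tid, or -1 if nothing is ready. */
int next_ready_tid(void)
{
    for (int step = 1; step <= MAX_TASKS; step++) {
        int tid = (current_tid + step + MAX_TASKS) % MAX_TASKS;
        if (tcb_table[tid].state == READY) {
            return tid;
        }
    }
    return -1;
}
```

With a fixed array there is no allocation at task creation, and the tid doubles as the array index, which keeps lookups constant time.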

We make an SVC call whenever we want the OS to handle something.

uint32_t* MSP_INIT_VAL = *(uint32_t**)0x0;

Address 0x0 holds the first entry of the vector table, which is the initial value of the main stack pointer (MSP). In other words, 0x0 stores another memory address, and that address is the pointer we actually want. MSP_INIT_VAL is a pointer to a uint32_t, so 0x0 is a pointer to that pointer, i.e. a double pointer. Therefore we cast 0x0 to uint32_t** and dereference it once to get the address of the start of the MSP stack.
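The same cast-and-dereference can be tried on a desktop machine, where reading address 0x0 would crash. This is only a sketch: fake_main_stack, fake_vector_table and read_initial_msp are made-up names that stand in for the real RAM and vector table on the board.

```c
#include <stdint.h>

/* Stand-in for the main stack; on the board this is real RAM. */
static uint32_t fake_main_stack[64];

/* Stand-in for the vector table at address 0x0: entry 0 holds the
 * initial MSP value (stacks grow down, so it points one past the end). */
static uint32_t *fake_vector_table[1] = { fake_main_stack + 64 };

/* Same cast-and-dereference as *(uint32_t**)0x0, but against a base
 * address that is safe to read on a desktop machine. */
uint32_t *read_initial_msp(const void *vector_table_base)
{
    return *(uint32_t *const *)vector_table_base;
}
```

Calling read_initial_msp(fake_vector_table) returns the top of fake_main_stack, just as the original line returns the top of the real main stack.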

We have the TCB to be called table

k_task.h

  • In the TCB we use U32, U8 and U16
    • We’ve included uint32_t sp; // saved PSP for this thread.
    • RTX_OK and RTX_ERR set to 0 and 1

Load:

Why use PendSV for context switching

After some in depth experimentation, here is a better explanation of the benefits (necessity) of using PendSV:

As you know, context switching involves two aspects: low level manipulation of registers (must be done with assembly); and working with data structures (better done in C). So you have to mix C and assembly. 

The difficulty is that the C compiler does a lot under the hood and uses registers to do so.

E.g. on our system it uses R7 to create and remove the stack frame for each function. This means inside the SVC_Handler_Main function, even if you don’t use any local variables, R7 will be overwritten.

Hence if you try to do the step of saving R4-R11 within the body of SVC_Handler_Main, it will get the incorrect register values.

However, when SVC_Handler_Main exits, the compiler has added code there to restore all the registers it overwrote. Therefore, if you can put your context switching code into something that executes after SVC_Handler_Main exits, it will see the original register values from the yielding thread.

(It is still possible to do a context switch without PendSV. That route is for anyone particularly interested in assembly programming, or in fields where mixing C & assembly is essential, like microbenchmark development.)
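The C side of this scheme only has to mark PendSV as pending; the actual register save and restore then happens in the PendSV handler once the SVC handler has returned. A minimal sketch, assuming a plain variable (fake_icsr) in place of the real ICSR register so it can run off the board; on the target the write would go to SCB->ICSR through CMSIS. The function names are hypothetical.

```c
#include <stdint.h>

#define ICSR_PENDSVSET (1UL << 28)  /* PENDSVSET bit of the ICSR register */

/* On the board this would be the real register (SCB->ICSR in CMSIS);
 * here it is a plain variable so the sketch runs anywhere. */
static volatile uint32_t fake_icsr;

/* Called from C (e.g. inside the SVC handler body) to request a switch.
 * Nothing is switched here; PendSV only becomes pending. */
void request_context_switch(void)
{
    fake_icsr |= ICSR_PENDSVSET;
}

/* Returns 1 if a PendSV request is waiting to be serviced. */
int context_switch_pending(void)
{
    return (fake_icsr & ICSR_PENDSVSET) != 0;
}
```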

LAB 1:

  • We have an array of at most MAX_TASKS TCBs in main memory.

  • How to create: pointer thread_table_size

  • TCB is global

  • We have a stack_base that stays fixed forever, and a new_thread_stack_high

  • We calculate thread_stack_high_limit

  • We walk through the TCB list

  • Each TCB has an sp

  • 0x20 is the minimum size you can allocate to a thread so that it can hold all the registers

  • I need to understand __set_PSP, __asm("SVC #0") and __get_PSP().

In the SVC_Handler.s file, we check the 3rd bit (bit 2) of the link register to tell whether the call came from main (MSP in use) or from a thread (PSP in use).
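The same test written in C, as a sketch; the handler itself does this in assembly, and called_from_thread is a hypothetical helper name. On exception entry the link register holds an EXC_RETURN value, where bit 2 is 0 if the MSP was in use and 1 if the PSP was in use.

```c
#include <stdint.h>

#define EXC_RETURN_SPSEL (1UL << 2)  /* bit 2: 0 = MSP was in use, 1 = PSP */

/* Returns 1 if the exception was taken from code running on the PSP
 * (a thread), 0 if it was running on the MSP (main). */
int called_from_thread(uint32_t lr)
{
    return (lr & EXC_RETURN_SPSEL) != 0;
}
```

For example, 0xFFFFFFFD (thread mode, PSP) gives 1 and 0xFFFFFFF9 (thread mode, MSP) gives 0.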

Subtracting the main stack size from the MSP base

  • The stack grows from high addresses to low
  • The thread stacks start where main_stack ends: MSP base - main_stack_size
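The address arithmetic can be sketched as follows, assuming fixed-size thread stacks laid out one after another below the main stack. The two sizes and the function name are illustrative assumptions, not the values from our code.

```c
#include <stdint.h>

#define MAIN_STACK_SIZE   0x400u  /* assumed size reserved for main */
#define THREAD_STACK_SIZE 0x200u  /* assumed fixed size per thread */

/* Highest address of the stack for thread `index` (0-based).
 * Stacks grow downward, so each thread starts below the previous one. */
uint32_t thread_stack_high(uint32_t msp_base, uint32_t index)
{
    return msp_base - MAIN_STACK_SIZE - index * THREAD_STACK_SIZE;
}
```

With an MSP base of 0x20018000, thread 0 starts at 0x20017C00 and thread 1 at 0x20017A00.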

Failed 4 tests. Because of our stack hard-faulting?

LAB 2:

  • Develop malloc and free functions

    • Enable your kernel & user application to use the heap (used for run-time / dynamic memory needs)
  • Speed performance: just want to gain an appreciation for design considerations toward performance.

    • This field is deep, don’t make it too complicated

Speed performance

  • Priority should be to make it functional (worth up to 80%)
  • Then you can experiment with ideas to speed it up
    • E.g. Lists/pointers to directly reach free / allocated blocks
    • E.g. What about ordering?
    • How to observe speed - simple example:

Linker script symbols:

Testing:

  • A task that is not the owner of the memory tries to free it. Then the actual owner tries to free it.
  • Double free, token 0
  • Robustness test, allocate memory a lot to the heap
  • Allocate the max heap memory
  • test extfrag
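To make the ownership and double-free cases concrete, here is a minimal first-fit allocator sketch with a header per block. It is not our k_mem implementation: mem_init, mem_alloc, mem_dealloc, the explicit tid argument, the header layout and HEAP_SIZE are all assumptions for illustration. It only coalesces with the following block, and it trusts that the pointer passed to mem_dealloc was returned by mem_alloc.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 1024u  /* assumed heap size in bytes */
#define RTX_OK  0
#define RTX_ERR 1

typedef struct header {
    size_t size;          /* payload size in bytes */
    int owner;            /* tid of the allocating task, -1 when free */
    int used;             /* 1 while allocated, 0 when free */
    struct header *next;  /* next block in address order */
} header_t;

static uint64_t heap_words[HEAP_SIZE / 8u];  /* 8-byte aligned backing store */
static header_t *head;

void mem_init(void)
{
    head = (header_t *)heap_words;
    head->size = HEAP_SIZE - sizeof(header_t);
    head->owner = -1;
    head->used = 0;
    head->next = NULL;
}

void *mem_alloc(size_t size, int tid)
{
    if (size == 0u || size > HEAP_SIZE) return NULL;
    size = (size + 7u) & ~(size_t)7u;  /* keep blocks 8-byte aligned */
    for (header_t *b = head; b != NULL; b = b->next) {
        if (b->used || b->size < size) continue;
        /* Split if the remainder can hold a header plus some payload. */
        if (b->size >= size + sizeof(header_t) + 8u) {
            header_t *rest = (header_t *)((unsigned char *)(b + 1) + size);
            rest->size = b->size - size - sizeof(header_t);
            rest->owner = -1;
            rest->used = 0;
            rest->next = b->next;
            b->size = size;
            b->next = rest;
        }
        b->used = 1;
        b->owner = tid;
        return b + 1;  /* payload starts right after the header */
    }
    return NULL;
}

int mem_dealloc(void *ptr, int tid)
{
    if (ptr == NULL) return RTX_ERR;
    header_t *b = (header_t *)ptr - 1;
    if (!b->used || b->owner != tid) return RTX_ERR;  /* double free or not owner */
    b->used = 0;
    b->owner = -1;
    /* Merge with the following free block to limit external fragmentation. */
    if (b->next != NULL && !b->next->used) {
        b->size += sizeof(header_t) + b->next->size;
        b->next = b->next->next;
    }
    return RTX_OK;
}
```

A free by a non-owner and a second free of the same pointer both return RTX_ERR, while the owner's first free returns RTX_OK.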

Lab 3

Goals: Once this deliverable is complete, your RTX will be able to:

  1. Dynamically allocate task stacks based on a desired size
  2. Assign a priority to each task depending on its deadline
  3. Pre-emptively switch a task if it has exhausted its time slice
  4. Allow running tasks to change the priorities (deadlines) of others, potentially triggering a context switch
  5. Allow tasks to sleep for a set period of time, during which they are not scheduled
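Goals 2 and 5 can be sketched together: pick the schedulable task with the earliest deadline, and wake sleeping tasks from a per-millisecond tick. The structure, field names, state values and function names below are assumptions for illustration, not the required API.

```c
#include <stdint.h>

#define MAX_TASKS 16  /* assumed limit */

typedef enum { DORMANT = 0, READY, RUNNING, SLEEPING } state_t;

typedef struct {
    uint8_t  state;        /* one of state_t */
    uint32_t deadline_ms;  /* time remaining until this task's deadline */
    uint32_t sleep_ms;     /* remaining sleep time, only used while SLEEPING */
} edf_tcb_t;

static edf_tcb_t tasks[MAX_TASKS];

/* Earliest deadline first: the READY or RUNNING task with the smallest
 * remaining deadline wins. Ties go to the lower tid. Returns -1 if none. */
int edf_pick(void)
{
    int best = -1;
    for (int tid = 0; tid < MAX_TASKS; tid++) {
        if (tasks[tid].state != READY && tasks[tid].state != RUNNING) {
            continue;
        }
        if (best < 0 || tasks[tid].deadline_ms < tasks[best].deadline_ms) {
            best = tid;
        }
    }
    return best;
}

/* Called once per millisecond: wake tasks whose sleep has elapsed. */
void edf_tick(void)
{
    for (int tid = 0; tid < MAX_TASKS; tid++) {
        if (tasks[tid].state != SLEEPING) {
            continue;
        }
        if (tasks[tid].sleep_ms > 0u) {
            tasks[tid].sleep_ms--;
        }
        if (tasks[tid].sleep_ms == 0u) {
            tasks[tid].state = READY;
        }
    }
}
```

A sleeping task is skipped by edf_pick even if its deadline is the earliest, and becomes a candidate again as soon as its sleep counter reaches zero.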

Extremely important note:

you should be starting this project from working Deliverable 2 code. All of the above points are in addition to what the OS can already do. This is a common point of misunderstanding – it is true that we are implementing pre-emption, but that does not mean that a task cannot also call the co-operative multitasking functions if it needs to. In addition, we are assuming that your memory allocator works!

Background: SysTick

All Cortex M chips have a timer on board called SysTick. SysTick is a timer that can be configured to trigger an interrupt, whose handler is almost always called SysTick_Handler. SysTick works like any other timer you may have encountered: an integer gets loaded into a register, and every clock cycle that integer is decremented. Once the integer reaches zero it is reset and the associated interrupt is triggered. Theoretically one can change the period of the timer by modifying the integer, but the default value of 1ms set for us by STM32CubeIDE is sufficient for this project.
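The relation between the loaded integer and the timer period can be worked out as a small sketch. The helper name and the 84 MHz clock in the example are assumptions; the starter code already configures the real value for us.

```c
#include <stdint.h>

/* SysTick counts from the reload value down to 0, so one period lasts
 * reload + 1 clock cycles. ticks_per_second must be non-zero. */
uint32_t systick_reload_for(uint32_t core_clock_hz, uint32_t ticks_per_second)
{
    return core_clock_hz / ticks_per_second - 1u;
}
```

For an assumed 84 MHz core clock and a 1 ms period (1000 ticks per second), the reload value is 83999.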

Locating SysTick_Handler

Setting up SysTick is quite challenging, but luckily in the starter code you received at the beginning of the term, it is set up for you. Presuming you have not modified the initial setup function calls then somewhere near the top of int main, you will see the following line: SystemClock_Config();

This configures SysTick to trigger an interrupt once every millisecond. Specifically, the interrupt routine that is triggered can be found in the Core->Src->stm32f4xx_it.c file. Open that file and search for a function named SysTick_Handler. It should look like this:

void SysTick_Handler(void)
{
	/* USER CODE BEGIN SysTick_IRQn 0 */
	
	/* USER CODE END SysTick_IRQn 0 */	
	HAL_IncTick();
	/* USER CODE BEGIN SysTick_IRQn 1 */
	
	/* USER CODE END SysTick_IRQn 1 */
}

Any modifications that you make to this function must be written after the line HAL_IncTick(). That function is used internally by various peripherals, for example UART, and if you remove or otherwise interfere with it many things will break!

When working with SysTick_Handler, assume that it is being called in a similar way to osYield – it is a C function that sets up and calls assembly code only when needed.

Suggested Exercise

In order to learn how SysTick works, it is recommended that you attempt a simple exercise. This is best done using a clean copy of the starter code, so that you can be sure that any errors are not due to your RTX interfering.

  1. In main.c, declare a global integer
  2. In int main(), initialize that integer to 1000
  3. Declare that integer as extern in stm32f4xx_it.c
  4. Every time SysTick_Handler triggers, decrement the integer
  5. Whenever the integer is zero, print a short message using printf then reset the integer back to 1000

If everything was done correctly, you have just written a basic task that triggers once per second, rather than once per millisecond. The concept that you learned here is important – your tasks will have different deadlines. It is not possible to use one hardware timer per task, therefore you will need some other way of keeping track of those task deadlines and when they trigger.

// In main.c
int global_counter = 1000;
 
// In int main()
global_counter = 1000;
 
// In stm32f4xx_it.c, at file scope above the SysTick_Handler definition
extern int global_counter;
 
// In stm32f4xx_it.c, within SysTick_Handler
void SysTick_Handler(void)
{
    /* Other code */
    
    HAL_IncTick();
    
    if (global_counter > 0) {
        global_counter--;
        if (global_counter == 0) {
            printf("One second passed.\n");
            global_counter = 1000; // Reset the counter
        }
    }
}
 

Suggested Pre-lab: Pre-emptive Round-Robin Scheduling

Implementing the API for this lab requires you to modify your RTX so that it can handle pre-emptively switching tasks. To get you started, we recommend integrating SysTick into your existing co-operative scheduling framework first. This will require minimal modifications compared to the whole API, and it will give you a good idea of how to proceed with the more difficult parts. Attempt to implement the following functionality:

  1. Every task should have a variable stored in its TCB. This variable should store the number of milliseconds remaining before this task must be switched out. We will refer to this as the task’s “timeslice length”
  2. Using SysTick, modify your RTX so that multiple tasks can loop forever without calling osYield. Whenever the timeslice of the currently running task expires, a context switch should be triggered and the round-robin scheduler should run
  3. Verify that you didn’t break anything by adding in another task that does call osYield as it runs, while the other tasks continue to run for their entire timeslice. It is crucial that this task also has a timeslice length value, and it is possible for it to take so long to run that it does not reach the call to yield during a single period

Point 3 above is crucial, and you need to be very careful. Consider the following scenario: Presume you have three tasks, A, B, and C, of which A and B are scheduled only by timeslice length and C is scheduled only by yielding. A race condition can occur where, in the middle of a context switch triggered by yielding, a second context switch is triggered because of timing out. Although this is a fairly rare occurrence, you must ensure that your OS does not fall victim to this race condition. If this happens, you will almost certainly hard fault, and it will appear randomly and be very hard to debug!
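One way to picture the guard against this race is a flag that makes the tick handler back off while a yield-triggered switch is underway. This is only a sketch under assumptions: the variable and function names and the 5 ms timeslice are made up, and on the real board the same effect might instead come from interrupt priorities or from briefly disabling interrupts.

```c
#include <stdint.h>

#define TIMESLICE_MS 5u  /* assumed timeslice length */

static volatile uint32_t timeslice_left = TIMESLICE_MS; /* for the running task */
static volatile int switch_in_progress;  /* set while a switch is being performed */
static volatile int switch_requested;    /* stands in for a pending PendSV */

/* Body of the per-millisecond tick, to run after HAL_IncTick(). */
void tick_timeslice(void)
{
    if (switch_in_progress) {
        return;  /* a switch is underway: do not start a second one */
    }
    if (timeslice_left > 0u) {
        timeslice_left--;
    }
    if (timeslice_left == 0u) {
        switch_requested = 1;           /* timeslice expired: ask for a switch */
        timeslice_left = TIMESLICE_MS;
    }
}

/* Called at the start of the context switch code. */
void begin_switch(void)
{
    switch_in_progress = 1;
}

/* Called at the end: the incoming task gets a fresh timeslice. */
void end_switch(void)
{
    switch_in_progress = 0;
    switch_requested = 0;
    timeslice_left = TIMESLICE_MS;
}
```

Ticks that arrive between begin_switch and end_switch are ignored for scheduling purposes, so a timeout cannot queue a second switch on top of the one already running.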

Ramp Up Exercise

time interrupt?? Systick_handler??? What do we do inside? Call asm(PendSV)?

We are going to use PendSV

We use EDF or something like that

Actual Lab

When we are in the SVC interrupt, nothing happens. _set_invoked_through_SVN(1) will look at the current task: it checks whether the call comes from main or from a thread and uses that as the owner.

osCreateDeadlineTask_Impl fetch arguments and

The context switch is done in switch(svc_num). Go to k_task.c, in osTaskExit for example. Do the calculation to get where it is.

get_task_to_run gets the running or ready task with the earliest deadline. Then asm(isb) and PendSV is set pending. We go back to SVC_Handler_Main, which returns to the magic number and restores all the registers it used. We go back to SVC_Handler at the end. PendSV triggers immediately. We are now in PendSV_Handler, which sees the original registers. It pushes everything and calls ktask_pendsv to get the next PSP and tid for us. pop and push…

svc does a

c code fix all of its registers. if we queued a context switch

Testing:

main.c

/* USER CODE BEGIN Header */
/**
  ******************************************************************************
  * @file           : main.c
  * @brief          : Main program body
  ******************************************************************************
  * @attention
  *
  * Copyright (c) 2023 STMicroelectronics.
  * All rights reserved.
  *
  * This software is licensed under terms that can be found in the LICENSE file
  * in the root directory of this software component.
  * If no LICENSE file comes with this software, it is provided AS-IS.
  *
  ******************************************************************************
  */
/* USER CODE END Header */
/* Includes ------------------------------------------------------------------*/
#include "main.h"
#include <stdio.h> //You are permitted to use this library, but currently only printf is implemented. Anything else is up to you!
#include "common.h"
#include "k_mem.h"
#include "k_task.h"
 
void taskA(void*);
void taskB(void*);
void taskC(void*);
void taskD(void*);
void taskE(void*);
void taskF(void*);
 
void taskA(void* args) {
  printf("Task A - 1\r\n");
  osSleep(1000);
  printf("Task A - 2\r\n");
 
  // Should preempt A.
  TCB tcb;
  tcb.ptask = &taskC;
  osCreateDeadlineTask(5, 0x200, &tcb);
 
  printf("Task A - 3\r\n");
 
  osTaskExit();
}
 
void taskB(void* args) {
  printf("Task B\r\n");
  osTaskExit();
}
 
void taskC(void* args) {
  printf("Task C\r\n");
  k_mem_debug();
  osTaskExit();
}
 
 
uint32_t test_i = 0;
 
void taskD(void* args) {
  while (1) {
    printf("Task D: %lu\r\n", (unsigned long)test_i); // test_i is uint32_t, so %d does not match
    // osYield();
  }
}
 
void taskE(void* args) {
  while (1) {
    test_i += 1;
    osPeriodYield();
  }
}
 
void taskF(void* args) {
  while (1) {
    printf("F - %lu\r\n", (unsigned long)test_i); // test_i is uint32_t, so %d does not match
    osPeriodYield();
  }
}
 
void taskG(void* args) {
  for (int i = 0; i < 200000; i++) {
    test_i++;
    osYield();
  }
 
  osSetDeadline(5, 1);
 
  while (1) {
    test_i++;
    osYield();
  }
}
 
int* arr = NULL;
 
void taskH(void* args) {
  printf("Task H start.\r\n");
 
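  arr = k_mem_alloc(sizeof(int) * 30);
  if (arr == NULL) {
    // Allocation failed: do not write through a NULL pointer.
    printf("Task H alloc failed.\r\n");
    osTaskExit();
  }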
  for (int i = 0; i < 30; i++) {
    arr[i] = i * i;
  }
 
  printf("Task H sleeping.\r\n");
  osSleep(1000);
  
  int sum = 0;
  for (int i = 0; i < 30; i++) {
    sum += arr[i];
  }
 
  printf("sum=%d\r\n", sum);
  if (k_mem_dealloc(arr) == RTX_OK) {
    printf("dealloc OK\r\n");
  } else {
    printf("dealloc ERR\r\n");
  }
 
  osTaskExit();
}
 
void taskI(void* args) {
  printf("Task I start.\r\n");
  while (1) {
    k_mem_dealloc(arr);
  }
}
 
/**
  * @brief  The application entry point.
  * @retval int
  */
int main(void)
{
 
  /* MCU Configuration: Don't change this or the whole chip won't work!*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
  /* Configure the system clock */
  SystemClock_Config();
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_USART2_UART_Init();
  /* MCU Configuration is now complete. Start writing your code below this line */
 
  printf("Hello!\r\n");
 
  TCB tcb;
 
  osKernelInit();
  k_mem_init();
 
  // tcb.ptask = &taskF;
  // tcb.stack_size = 0x400;
  // osCreateTask(&tcb);
 
  // osSetDeadline(100, tcb.tid);
 
  // tcb.ptask = &taskG;
  // osCreateDeadlineTask(10, 0x400, &tcb);
 
  tcb.ptask = &taskH;
  osCreateDeadlineTask(1000, 0x400, &tcb);
 
  tcb.ptask = &taskI;
  osCreateDeadlineTask(100000, 0x400, &tcb);
 
  osKernelStart();
 
  while (1) {}
}
 

Testing:

  • Memory allocator should work
  • Error conditions to test
  • 3:28 pm, group 1. Preemption: preempt the current task; do not preempt if I am the current task. If called in main, preemption shouldn't happen. tid
    • create a task in main
    • create a task in
    • fail no stack
    • fail if no tid
    • reuse tid
  • osSetDeadline: here is the deadline; set it in main, set it in a task. If set in a task, is it going to preempt or not? Also the case where it is set in main: nothing is preempted.
    • osYield and osSleep and osPeriodYield on top of osSetDeadline
  • osTaskInfo: test a bad tid, a wrong tid, and that a sleeping task returns sleeping. Do it correctly.
  • k_mem: create some task and memory free it.

Task testing:

  • osSetDeadline
  • osYield
  • osSleep
  • osPeriodYield