Homework
E6118 Fall 2025
DUE: Friday 9/26/2025 at 11:59pm EDT
Overview:
In this assignment, you will use a large language model (LLM) of your choice, such as ChatGPT, Claude, or DeepSeek, to name but a few, to build a Linux scheduler.
You will guide the LLM with prompts to create the scheduler, but you are not allowed to provide the LLM directly with any code that will go in your scheduler implementation. All of the code for the scheduler implementation must be generated by the LLM. For example, if the LLM produces buggy code, you may not provide the LLM with the correct code that it should implement instead. However, as part of your prompts, you are free to provide the LLM with code that already exists in the Linux kernel or that the LLM previously generated. You are also free to write your own test code to test the correctness and functionality of what the LLM generates.
You may work individually or in a group of two of your choosing, and you should select your LLM in advance.
Sign up for your chosen LLM here.
To ensure variety, no more than three groups should use the same LLM.
You should submit the following via email to e6118@lists.cs.columbia.edu:
- Implementation patches for your scheduler, formatted in the style of Linux kernel mailing list submissions
- A PDF report describing the generated code, your prompting strategy, evaluation results, and experiences, including your evaluation of the effectiveness of the LLM
- The complete LLM chat log showing how the code was generated
Your scheduler must be implemented on the Linux 6.14 kernel. Run it on Ubuntu 25.04 using VMware. The implementation should work on either x86 or Arm64. Follow the VM setup instructions here, and refer to this guide if you need a refresher on compiling and booting a new kernel.
Programming Problems:
The goal of this assignment is to create a new Linux scheduler: a Weighted Fair Queuing scheduler.
- A Multicore Scheduler
Add a new scheduling policy to the Linux kernel to support weighted fair queuing scheduling. Call this policy WFQ. The algorithm should work as follows:
- The new scheduling policy should serve as the default scheduling policy for all tasks.
- Multicore systems must be fully supported.
- Every task has a 1 tick time slice (quantum).
- Every task has an assigned weight (task_weight), with the default weight being 10.
- When deciding which CPU a task should be assigned to, it should be assigned to the CPU with the least total weight. This means you have to keep track of the total weight of all the tasks running on each CPU (cpu_total_weight). If multiple CPUs have the same total weight, assign the task to the lowest-numbered CPU among them.
- Each CPU has a virtual time which starts at zero. When a task runs on that CPU for some time t, the CPU's virtual time increases by t / cpu_total_weight.
- Each task has a virtual time which is initialized to the CPU's virtual time when it is assigned to a CPU. When a task runs for some time t, the task's virtual time increases by t / task_weight.
- Each task has a virtual finishing time (VFT), which is equal to the task's virtual time plus (duration of 1 tick) / task_weight. The VFT is essentially what the value of a task's virtual time will be after it runs for an additional time slice.
- Each CPU will select the task from its run queue that has the smallest VFT to run next.
- The kernel does not support floating point arithmetic, so all calculations, particularly those involving virtual time, should be done using 64-bit integers, scaled by some scale factor to avoid rounding error.
- The Linux scheduler implements individual scheduling classes corresponding to different scheduling policies. For this assignment, you need to create a new scheduling class, sched_wfq_class, for the WFQ policy, and implement the necessary functions in kernel/sched/wfq.c. You can find some good examples of how to create a scheduling class in kernel/sched/rt.c and kernel/sched/fair.c. Other interesting files that will help you understand how the Linux scheduler works are kernel/sched/core.c, include/linux/sched.h, and include/asm-generic/vmlinux.lds.h. While there is a fair amount of code in these files, a key goal of this assignment is for you to understand how to abstract the scheduler code so that you learn in detail the parts of the scheduler that are crucial for this assignment and ignore the parts that are not.
- Your scheduler should operate alongside the existing Linux scheduler. Therefore, you should add a new scheduling policy, SCHED_WFQ. The value of SCHED_WFQ should be 8. SCHED_WFQ should be made the default scheduler class of init_task.
- Only tasks whose policy is set to SCHED_WFQ should be considered for selection by your new scheduler.
- Tasks using the SCHED_WFQ policy should take priority over tasks using the SCHED_NORMAL policy, but not over tasks using the SCHED_RR or SCHED_FIFO policies.
- Your scheduler must be capable of working on both uniprocessor systems and multicore/multiprocessor systems; you can change the number of cores on your VM for testing. All cores should be utilized on multiprocessor systems.
- Proper synchronization and locking is crucial for a multicore scheduler, but not easy. Pay close attention to the kind of locking used in existing kernel schedulers. This is extremely critical for the next part of the assignment; it is very easy to overlook a key synchronization aspect and cause a deadlock!
- You will want to refrain from immediately making your scheduler the default scheduler for init_task; instead, test by manually configuring tasks to use your policy with sched_setscheduler().
- For a more responsive system, you may want to set the scheduler of kernel threads to be SCHED_WFQ as well (otherwise, SCHED_WFQ tasks can starve the SCHED_NORMAL tasks to a degree). To do this, you can modify kernel/kthread.c and replace SCHED_NORMAL with SCHED_WFQ. It is strongly suggested that you do this to ensure that your VM is responsive enough for the test cases.
- Check out the debugging tips provided below.
- Getting WFQ info
To make it easier to monitor the status of your scheduler, you will implement the following system calls:
#define MAX_CPUS 8 /* We will be testing only on the VMs */
struct wfq_info {
        int num_cpus;
        int nr_running[MAX_CPUS];
        int total_weight[MAX_CPUS];
};
/* This system call will fill the buffer with the required values
* i.e. the total number of CPUs, the number of WFQ processes on each
* CPU and the total weight of WFQ processes on each CPU.
* Return value will also be the total number of CPUs.
* System call number 467.
*/
int get_wfq_info(struct wfq_info *wfq_info);
/* This system call will change the weight for the calling process.
* Only a root user should be able to increase the weight beyond
* the default value of 10. The system call should return an error
* for any weight less than 1.
* System call number 468.
*/
int set_wfq_weight(int weight);
The two system calls should be implemented in kernel/sched/core.c.
- Add load-balancing features to WFQ
You should have a working scheduler by now; make sure you test it before you move on to this part. So far, your scheduler will run a task only on the CPU to which it was originally assigned. Let's change that now! For this part, you will implement two balancing features: periodic load balancing, in which you try to move a task from the heaviest CPU to the lightest one, and idle balancing, in which a CPU attempts to steal a task from another CPU when it has no tasks to run, i.e. when its run queue is empty.
Load balancing is a key feature of the default Linux scheduler (the Completely Fair Scheduler). While CFS follows a rather complicated policy to balance tasks and can move more than one task in a single go, your scheduler will have a simple policy in this regard. Note that while this spec and the Linux kernel refer to the default scheduler as CFS, as of kernel version 6.6 CFS was merged with the Earliest Eligible Virtual Deadline First (EEVDF) scheduler.
Periodic load balancing works as follows:
- In periodic load balancing, a single job from the run queue with the highest total weight should be moved to the run queue with the lowest total weight.
- The job that should be moved is the first eligible job in the run queue that can be moved without causing the imbalance to reverse.
- Jobs that are currently running are not eligible to be moved, and some jobs may have restrictions on which CPUs they can run on.
- Periodic load balancing should be attempted every 500 ms. For example, if CPU 2 performs periodic load balancing (e.g. moves a task from CPU 4 to CPU 1), no CPU should perform periodic load balancing for the next 500 ms.
- You may assume for load balancing that the CPUs receive periodic timer interrupts.
- While there is quite a bit of code in the CFS implementation, look in particular for the functions that determine whether a task is eligible to be moved to another core. Don't expect to move a task every periodic load balancing cycle.
Idle balancing works as follows:
- Idle balancing is when an idle CPU (i.e. a CPU with an empty run queue) pulls one task from another CPU.
- Again, take a look at the CFS implementation to figure out where this takes place and how it is called from kernel/sched/core.c.
- The CPU that you pull a task from should have at least two tasks on its run queue and have the greatest total weight, although the measure of greatest total weight can be an estimate and can reflect measurements taken at somewhat different times on different CPUs.
- You should again ensure that the task you are stealing is eligible to be moved and allowed to run on the idle CPU.
- Investigate and Demo
Demonstrate that your scheduler effectively balances weights across all CPUs. The total weight of the processes running on each CPU should be roughly the same. Start with a CPU-bound test program, such as a while loop, then try running more complex programs that do I/O to show that the scheduler continues to operate correctly. You should include a test program test/test.c that calls the system calls, as well as a test/Makefile that builds the program in the same directory, resulting in an executable named test. Experiment with different weights and see what impact they have.
You should provide a complete set of results that describe your experiments. Your results and any explanations should be put into the PDF report. The writeup should be of sufficient detail for someone else to be able to repeat your experiments.
Debugging Tips
The following are some debugging tips you may find useful as you create your new scheduling class.
- Before testing, take a snapshot of your VM if you have not done so already. That way, if your kernel crashes or becomes unresponsive because of a scheduler malfunction, you can restore the state of the VM prior to the problem.
- It is possible that your kernel gets stuck at boot time, preventing you from reading debug messages via dmesg. In this scenario, you may find it helpful to redirect your debug messages to a file on your host machine:
  - Right click on your VM and click "Settings".
  - Under "Hardware", click "Add" and create a Serial Port.
  - Under "Device", click on your new Serial Port.
  - Click on "Use output file", and specify the file on your host machine to which you would like to dump the kernel log.
  - Turn on your VM, move to your kernel in the GRUB menu, and press e.
  - Move toward the bottom until you find a line that looks like (linux /boot/vmlinuz-6.14.0-...).
  - At the end of this line, replace quiet with console=ttyS0 (try console=ttyS1 if this doesn't work).
  - Hit F10 to boot your kernel. The kernel log should be written to the file on your host machine that you specified earlier.
- Although Linux in theory allows you to define new scheduling classes, Linux makes assumptions in various places that the only scheduling classes are the ones that have already been defined. You will need to change those places in the kernel code to allow your kernel to boot with your scheduling class, to allow you to assign tasks to your scheduling class, and to allow your scheduling class to be used to pick the next task to run.
- You may find it helpful to use ps, top, or htop to view information such as the CPU usage and scheduling policies of your tasks. These commands rely on execution time statistics reported by certain scheduler functions in the kernel; as a result, bugs in your scheduling class could cause these commands to report inaccurate information. This cuts two ways. First, it is possible that your scheduler is working properly in terms of selecting the right tasks to execute for the right amount of time, but your calculation of execution time statistics in the kernel is wrong, so these commands report that your tasks are not running at all when in fact they are. Second, it is possible that your scheduler is not working properly, yet these tools report that tasks are using your scheduling class when in fact they are not. As a result, you should not rely exclusively on these tools to determine whether your scheduling class is working properly. For example, when you make your scheduling class the default, the fact that user-level tools claim that all tasks are using it does not necessarily mean that this is the case. Instead, to ensure that your scheduling class functions are actually being used, you might add a printk to a function like your class's enqueue_task() and verify that it appears in the dmesg output. Make sure that you do not submit your code with such debugging printk statements, as they can cause issues if invoked too frequently.