Linux wait queues have to handle two problems:

(1) lost wake-up. e.g. (left column: the sleeper, right column: the waker)

    add_wait_queue
                                              wake_up // sets task runnable
    set_current_state(TASK_UNINTERRUPTIBLE)
    schedule() // sleeps forever!

this problem occurs in general whenever we need to check a condition and then sleep. the check and the sleep need to be atomic with respect to the wake-up.

fix 1: re-check the condition after setting the state (works as long as the waker updates the condition before calling wake_up)

    check condition
    add_wait_queue
                                              wake_up // sets task runnable
    set_current_state(TASK_UNINTERRUPTIBLE)
    if (condition) // check if we still need to sleep
        schedule()

fix 2: use prepare_to_wait and finish_wait

    open question: why, in prepare_to_wait, does set_current_state have to come after the wait queue add?

fix 3: re-order set_current_state and add_wait_queue, so that a wake-up arriving once we are on the queue undoes the state change before schedule()

    set_current_state(TASK_UNINTERRUPTIBLE)
    add_wait_queue
                                              wake_up // sets task runnable
    schedule()

fix 4: use our own lock (the waker has to take the same lock around wake_up)

    spin_lock(l);
    add_wait_queue();
    set_current_state(TASK_UNINTERRUPTIBLE);
    spin_unlock(l);

(2) unfortunate preemption. using fix 3 as the example:

    set_current_state(TASK_UNINTERRUPTIBLE)
    // preempted here, sleeps forever!
    add_wait_queue
                                              wake_up // sets task runnable
    schedule()

if there is a preemption right after we set the current state to not runnable, but before we get onto the wait queue, then we appear to be in trouble: nobody can wake a task that is not on the wait queue. however, Linux distinguishes whether a schedule() call was caused by preemption or made voluntarily. if it was caused by preemption, schedule() doesn't take the current task off the runqueue.

relevant code:

    preempt_schedule_irq(), called when returning from an interrupt, sets the PREEMPT_ACTIVE flag

    in schedule():

        if (prev->state != TASK_RUNNING && !(preempt_count() & PREEMPT_ACTIVE)) {
            ...
            deactivate_task(prev, q);
        }
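
the interleavings above can be replayed in a small userspace model. this is only a sketch under stated assumptions, not kernel code: `struct task`, `enum step` and `sleeps_forever()` are hypothetical names made up for illustration, the model tracks a single sleeper and a single wake_up, and the `preempt_aware` parameter stands in for the PREEMPT_ACTIVE check in schedule().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-threaded model of one sleeper; illustrative only, not kernel code. */
enum task_state { RUNNING, UNINTERRUPTIBLE };

/* One event in an interleaving. WAKE_UP is performed by another task. */
enum step { ADD_WAIT_QUEUE, SET_STATE, SCHEDULE, PREEMPT, WAKE_UP };

struct task {
    enum task_state state;
    bool on_wait_queue;   /* has add_wait_queue run yet? */
    bool on_runqueue;     /* false: will not run again until woken */
};

/* Take the task off the runqueue if it has marked itself not runnable. */
static void deactivate_if_sleeping(struct task *t)
{
    if (t->state != RUNNING)
        t->on_runqueue = false;
}

/*
 * Replay the steps in order and report whether the sleeper ends up off the
 * runqueue with nobody left to wake it. preempt_aware models (as an
 * assumption of this sketch) the rule that a preemption never deactivates
 * the task, whatever its state.
 */
bool sleeps_forever(const enum step *steps, size_t n, bool preempt_aware)
{
    struct task t = { RUNNING, false, true };

    assert(steps != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (steps[i] == WAKE_UP) {
            /* The waker only finds tasks already on the wait queue. */
            if (t.on_wait_queue) {
                t.state = RUNNING;
                t.on_runqueue = true;
            }
            continue;
        }

        if (!t.on_runqueue)
            continue;   /* a descheduled sleeper executes nothing */

        switch (steps[i]) {
        case ADD_WAIT_QUEUE:
            t.on_wait_queue = true;
            break;
        case SET_STATE:
            t.state = UNINTERRUPTIBLE;
            break;
        case SCHEDULE:              /* voluntary: may deactivate */
            deactivate_if_sleeping(&t);
            break;
        case PREEMPT:               /* involuntary */
            if (!preempt_aware)
                deactivate_if_sleeping(&t);
            break;
        default:
            break;
        }
    }
    return !t.on_runqueue;
}
```

in this model, replaying ADD_WAIT_QUEUE, WAKE_UP, SET_STATE, SCHEDULE returns true (the lost wake-up), while the fix 3 order SET_STATE, ADD_WAIT_QUEUE, WAKE_UP, SCHEDULE returns false. replaying SET_STATE, PREEMPT, ADD_WAIT_QUEUE, WAKE_UP, SCHEDULE returns true with `preempt_aware` set to false and false with it set to true, which is the difference the PREEMPT_ACTIVE check makes.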