Linux STREAMS (LiS)


LiS SMP Implementation

Introduction

Beginning with LiS-2.12, LiS makes aggressive use of multiple CPUs in SMP kernels. It is useful for the STREAMS programmer to have some insight into this design in order to know whether, or which, locking techniques must be used in driver code.


Contents

  CPU Scheduling
  Queue Locking
  Service Procedure Context
  Scheduling Statistics

CPU Scheduling

LiS starts up a kernel thread for each CPU on the system. In the output of a ps display each thread will show as a process with a name such as "LiS-2.12:0". This notation means that an LiS kernel thread is running and is bound to CPU 0 (":0").

LiS maintains a single global list of queues whose service procedures need to be run. A queue is place into this list by calling the function qenable, whether directly or indirectly. A given queue can only be in this list once. The read queue and write queue of a queue-pair are considered two different queues for scheduling purposes and both can be scheduled simultaneously.

At "certain points in time" LiS performs an operation that makes a decision concerning the manner in which service procedures are to be invoked via the list of scheduled queues. There are several factors which influence this decision.

  • If the decision is being made just prior to LiS exiting back to user mode from a system call, if the LiS kernel thread for this CPU is inactive and if there are few enough entries in the list of scheduled queues then LiS calls the routine that processes queues directly without waking up any of its kernel threads.
  • If the decision is being made from an interrupt routine then the queue processing routine is not to be called directly.
  • If the decision is being made from a call on the routine qenable then the queue processing routine is not to be called directly.
  • The number of queues in the scheduling list affects the decision making process.
  • The number of LiS kernel threads running affects the decision making process.
  • Whether or not LiS is executing a system call affects the decision making process. In this case LiS may defer any queue processing action until the system call is about to exit.

In the case that the queue processing routine does not get called directly, LiS needs to decide whether to wake up a kernel thread process or whether to defer queue processing until an LiS system call is about to exit.

LiS tries to enlist one CPU for every four queues that are scheduled. This number is based on considerations of CPU loading and average queue lengths from queueing theory. If the number of CPUs currently processing queues is not enough to meet this target then the scheduling process seeks to enlist more CPUs until the number of CPUs is sufficient to meet this target value. Of course, sometimes there are simply too may queues schedule for processing for the number of available CPUs. In that case, all available CPUs run their queue processing threads.

When making the decision as to whether or not to wake up a kernel thread, LiS gives precedence to the CPU that it is running on. If the scheduling algorithm is called from a point just prior to executing back to the user, and if the kernel thread for the active CPU is sleeping, then LiS will simply call the queue processing routine without waking up the kernel thread. This saves the wakeup and context switch overhead.

The routine that actually runs the queues removes one element at a time from the list of scheduled queues and calls the service procedure pointed to by the queue. The routine continues until the list of scheduled queues is empty. Thus, when a kernel thread is actively processing queues, and if the number of scheduled queues does not exceed the estimated capacity of the running threads, it is quite efficient to simply add a queue to the list and let the already running threads process them in due course.

Back to Top


Queue Locking

The queue_t structure in LiS contains a spin lock that is used by LiS to ensure that service procedures are not reentered for the same queue. This lock is not to be used by driver code.

When the LiS queue running routine removes a queue from the list of scheduled queues it acquires this lock prior to calling the service procedure.

LiS also acquires this lock when calling the put procedure associated with a queue. Thus, execution of the put and service procedure are excluded for the same queue.

In a multi-CPU environment, it can happen that one CPU is calling the put procedure while a second CPU is calling the service procedure for the same queue. In this case, one or the other spins until the first CPU finishes the operation and releases the spin lock.

When LiS is about to call the put procedure of a queue from the put or service procedure of a neighboring queue (because the driver called the putnext function), it continues to hold the lock for the calling queue while acquiring the lock for the destination queue. The locks are acquired sequentially as the chain of putnext calls traverse the stream. The locks are released in reverse order as the put procedures return. This has the effect of incrementally locking the entire stream as messages are passed from one module to another.

This behavior is only of interest when modules are I_PUSHed on top of a driver. Otherwise, it is just the stream head write queue and the driver write queue that need to be locked (or other pairwise combinations such as the driver read queue and stream head read queue, or queues involving multiplexors).

The lock that LiS uses has the effect of excluding multiple entries from different threads into the put or service procedure for a given queue. The other queue in the queue pair is unaffected by this locking. Therefore, if there are data structures shared between the read and write put and service procedures of a driver or module, it is up to the driver writer to protect these structures with spin locks.

Back to Top


Service Procedure Context

Due to the manner in which service procedures are called, sometimes from the LiS queue runner threads and sometimes from a "borrowed" system call, service procedures may or may not have some user context present when they run. Service procedures should always assume that there is no user context. Even in the cases where there is some user context, the identity of the user process is unpredictable.

LiS does, however, maintain a copy of the credentials of the process that opened the stream when it calls service procedures on the stream. LiS saves the user and group identifiers plus the capability masks (credentials) of the running process in the stream head structure at the time that the STREAMS file is opened. These identifiers are restored to the task structure before calling a service procedure on that stream.

When calling put procedures, however, no such identity restoration occurs. So the credentials in place when a driver or module put procedure is invoked are those of the invoking entity. Because the queue runner theads always begin driver entry with a call to the service procedure, entries into the put procedures of subsequent drivers will have the credentials of the stream whose service procedure was called in the first instance. When a driver's put procedure is entered from a system call the credentials will be that of the user process which issued the system call.

Back to Top


Scheduling Statistics

LiS gathers statistics on its queue scheduling algorithm. They can be printed out with the command streams -S. The output looks like the following.

N-CPUs N-Qrunners N-Running N-Requested
     2          2         0          0

CPU   Qsched-Cnts Qsched-ISR Svc-Q-Cnts Qrun-Cnts Active Thread-PID
  0     540752204  175753842  459587537 239611835      0        857
  1     540683832  175833424  459150290 239672683      0        858

The fields have the following meanings.

  N-CPUs Number of CPUs on the system.
  N-Qrunners Number of queue runner kernel threads. These are the processes that appear as LiS-2.12:0 in a ps display.
  N-Running Number of qeuue runner threads that are currently active.
  N-Requested The number of queues that are in the list of scheduled queues.
  CPU The remainder of the statistics are kept on a per-CPU basis.
  Qsched-Cnts This is the number times the routine (lis_setqsched) that decides whether or not to wake up a queue runner process or to directly process scheduled queues has been called. This routine is called whenever a queue is added to the scheduling list. The counter reflects which CPU made the call to the routine.
  Qsched-ISR The number of times lis_setqsched was called from an interrupt routine and from which CPU.
  Svc-Q-Cnts The number of callouts to service procedures on a per-CPU basis.
  Qrun-Cnts The number of times the routine (queurun) that removes queues from the schedule list was called. This routine does not return until the queue scheduling list is empty. It can be running on multiple CPUs simultaneously. It is typically called from the queue runner threads but can also be called from an LiS system call either just prior to returning to the user or just prior to sleeping on some event such as the arrival of messages at the stream head.
  Active Displays as 0 or 1 depending upon whether there is a queue runner thread running on the particular CPU at the time that the statistics were sampled.
  Thread-PID The process id of the queue runner thread assaigned to eachCPU.

Back to Top