Changeset fbe6b65 in mainline for kernel/generic/src/synch/rcu.c


Ignore:
Timestamp:
2012-11-19T00:36:55Z (12 years ago)
Author:
Adam Hraska <adam.hraska+hos@…>
Branches:
lfn, master, serial, ticket/834-toolchain-update, topic/msim-upgrade, topic/simplify-dev-export
Children:
618d02a
Parents:
089c23d
Message:

rcu: Gave a high-level overview/description of the rcu algorithms.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • kernel/generic/src/synch/rcu.c

    r089c23d rfbe6b65  
    3535 * @file
    3636 * @brief Preemptible read-copy update. Usable from interrupt handlers.
     37 *
     38 * @par Podzimek-preempt-RCU (RCU_PREEMPT_PODZIMEK)
     39 *
     40 * Podzimek-preempt-RCU is a preemptible variant of Podzimek's non-preemptible
     41 * RCU algorithm [1, 2]. Grace period (GP) detection is centralized into a
     42 * single detector thread. The detector requests that each cpu announces
     43 * that it passed a quiescent state (QS), ie a state when the cpu is
     44 * outside of an rcu reader section (CS). Cpus check for QSs during context
     45 * switches and when entering and exiting rcu reader sections. Once all
     46 * cpus announce a QS and if there were no threads preempted in a CS, the
     47 * GP ends.
     48 *
     49 * The detector increments the global GP counter, _rcu_cur_gp, in order
     50 * to start a new GP. Readers notice the new GP by comparing the changed
     51 * _rcu_cur_gp to a locally stored value last_seen_gp which denotes the
     52 * the last GP number for which the cpu noted an explicit QS (and issued
     53 * a memory barrier). Readers check for the change in the outer-most
     54 * (ie not nested) rcu_read_lock()/unlock() as these functions represent
     55 * a QS. The reader first executes a memory barrier (MB) in order to contain
     56 * memory references within a CS (and to make changes made by writers
     57 * visible in the CS following rcu_read_lock()). Next, the reader notes
     58 * that it reached a QS by updating the cpu local last_seen_gp to the
     59 * global GP counter, _rcu_cur_gp. Cache coherency eventually makes
     60 * the updated last_seen_gp visible to the detector cpu, much like it
     61 * delivered the changed _rcu_cur_gp to all cpus.
     62 *
     63 * The detector waits a while after starting a GP and then reads each
     64 * cpu's last_seen_gp to see if it reached a QS. If a cpu did not record
     65 * a QS (might be a long running thread without an RCU reader CS; or cache
     66 * coherency has yet to make the most current last_seen_gp visible to
     67 * the detector; or the cpu is still in a CS) the cpu is interrupted
     68 * via an IPI. If the IPI handler finds the cpu still in a CS, it instructs
     69 * the cpu to notify the detector that it had exited the CS via a semaphore.
     70 * The detector then waits on the semaphore for any cpus to exit their
     71 * CSs. Lastly, it waits for the last reader preempted in a CS to
     72 * exit its CS if there were any and signals the end of the GP to
     73 * separate reclaimer threads wired to each cpu. Reclaimers then
     74 * execute the callbacks queued on each of the cpus.
     75 *
     76 *
     77 * @par A-RCU algorithm (RCU_PREEMPT_A)
     78 *
     79 * A-RCU is based on the user space rcu algorithm in [3] utilizing signals
     80 * (urcu) and Podzimek's rcu [1]. Like in Podzimek's rcu, callbacks are
     81 * executed by cpu-bound reclaimer threads. There is however no dedicated
     82 * detector thread and the reclaimers take on the responsibilities of the
     83 * detector when they need to start a new GP. A new GP is again announced
     84 * and acknowledged with _rcu_cur_gp and the cpu local last_seen_gp. Unlike
     85 * Podzimek's rcu, cpus check explicitly for QS only during context switches.
     86 * Like in urcu, rcu_read_lock()/unlock() only maintain the nesting count
     87 * and never issue any memory barriers. This makes rcu_read_lock()/unlock()
     88 * simple and fast.
     89 *
     90 * If a new callback is queued for a reclaimer and no GP is in progress,
     91 * the reclaimer takes on the role of a detector. The detector increments
     92 * _rcu_cur_gp in order to start a new GP. It waits a while to give cpus
     93 * a chance to switch a context (a natural QS). Then, it examines each
     94 * non-idle cpu that has yet to pass a QS via an IPI. The IPI handler
     95 * sees the most current _rcu_cur_gp and last_seen_gp and notes a QS
     96 * with a memory barrier and an update to last_seen_gp. If the handler
     97 * finds the cpu in a CS it does nothing and let the detector poll/interrupt
     98 * the cpu again after a short sleep.
     99 *
     100 * @par Caveats
     101 *
     102 * last_seen_gp and _rcu_cur_gp are always 64bit variables and they
     103 * are read non-atomically on 32bit machines. Reading a clobbered
     104 * value of last_seen_gp or _rcu_cur_gp or writing a clobbered value
     105 * of _rcu_cur_gp to last_seen_gp will at worst force the detector
     106 * to unnecessarily interrupt a cpu. Interrupting a cpu makes the
     107 * correct value of _rcu_cur_gp visible to the cpu and correctly
     108 * resets last_seen_gp in both algorithms.
     109 *
     110 *
     111 *
     112 * [1] Read-copy-update for opensolaris,
     113 *     2010, Podzimek
     114 *     https://andrej.podzimek.org/thesis.pdf
     115 *
     116 * [2] (podzimek-rcu) implementation file "rcu.patch"
     117 *     http://d3s.mff.cuni.cz/projects/operating_systems/rcu/rcu.patch
     118 *
     119 * [3] User-level implementations of read-copy update,
     120 *     2012, appendix
     121 *     http://www.rdrop.com/users/paulmck/RCU/urcu-supp-accepted.2011.08.30a.pdf
     122 *
    37123 */
    38124 
     
    9921078                 * As a result, both the detector and this cpu will believe
    9931079                 * this cpu has not yet passed a QS although it really did.
     1080                 *
     1081                 * Reloading _rcu_cur_gp is benign, because it cannot change
     1082                 * until this cpu acknowledges it passed a QS by writing to
     1083                 * last_seen_gp. Since interrupts are disabled, only this
     1084                 * code may to so (IPIs won't get through).
    9941085                 */
    9951086                CPU->rcu.last_seen_gp = _rcu_cur_gp;
     
    10551146        ipl_t ipl = interrupts_disable();
    10561147       
     1148        /* todo: Replace with cpu-local atomics to be NMI-safe */
    10571149        if (THE->rcu_nesting == RCU_WAS_PREEMPTED) {
    10581150                THE->rcu_nesting = 0;
     1151                /*
     1152                 * NMI handlers are never preempted but may call rm_preempted_reader()
     1153                 * if a NMI occurred in _rcu_preempted_unlock() of a preempted thread.
     1154                 * rm_preempted_reader() will not deadlock because none of the locks
     1155                 * it uses are locked in this case.
     1156                 */
    10591157                rm_preempted_reader();
    10601158        }
Note: See TracChangeset for help on using the changeset viewer.