Changeset 1b20da0 in mainline for kernel/generic/src/synch/rcu.c


Timestamp: 2018-02-28T17:52:03Z (7 years ago)
Author: Jiří Zárevúcky <zarevucky.jiri@…>
Branches: lfn, master, serial, ticket/834-toolchain-update, topic/msim-upgrade, topic/simplify-dev-export
Children: 3061bc1
Parents: df6ded8
git-author: Jiří Zárevúcky <zarevucky.jiri@…> (2018-02-28 17:26:03)
git-committer: Jiří Zárevúcky <zarevucky.jiri@…> (2018-02-28 17:52:03)
Message:

style: Remove trailing whitespace on non-empty lines, in certain file types.

Command used: tools/srepl '\([^[:space:]]\)\s\+$' '\1' -- *.c *.h *.py *.sh *.s *.S *.ag

File: 1 edited
  • kernel/generic/src/synch/rcu.c

    rdf6ded8 r1b20da0  
    3535 * @file
    3636 * @brief Preemptible read-copy update. Usable from interrupt handlers.
    37  * 
     37 *
    3838 * @par Podzimek-preempt-RCU (RCU_PREEMPT_PODZIMEK)
    39  * 
     39 *
    4040 * Podzimek-preempt-RCU is a preemptible variant of Podzimek's non-preemptible
    4141 * RCU algorithm [1, 2]. Grace period (GP) detection is centralized into a
     
    4343 * that it passed a quiescent state (QS), ie a state when the cpu is
    4444 * outside of an rcu reader section (CS). Cpus check for QSs during context
    45  * switches and when entering and exiting rcu reader sections. Once all 
    46  * cpus announce a QS and if there were no threads preempted in a CS, the 
     45 * switches and when entering and exiting rcu reader sections. Once all
     46 * cpus announce a QS and if there were no threads preempted in a CS, the
    4747 * GP ends.
    48  * 
    49  * The detector increments the global GP counter, _rcu_cur_gp, in order 
    50  * to start a new GP. Readers notice the new GP by comparing the changed 
     48 *
     49 * The detector increments the global GP counter, _rcu_cur_gp, in order
     50 * to start a new GP. Readers notice the new GP by comparing the changed
    5151 * _rcu_cur_gp to a locally stored value last_seen_gp which denotes the
    5252 * the last GP number for which the cpu noted an explicit QS (and issued
    5353 * a memory barrier). Readers check for the change in the outer-most
    54  * (ie not nested) rcu_read_lock()/unlock() as these functions represent 
    55  * a QS. The reader first executes a memory barrier (MB) in order to contain 
    56  * memory references within a CS (and to make changes made by writers 
    57  * visible in the CS following rcu_read_lock()). Next, the reader notes 
     54 * (ie not nested) rcu_read_lock()/unlock() as these functions represent
     55 * a QS. The reader first executes a memory barrier (MB) in order to contain
     56 * memory references within a CS (and to make changes made by writers
     57 * visible in the CS following rcu_read_lock()). Next, the reader notes
    5858 * that it reached a QS by updating the cpu local last_seen_gp to the
    5959 * global GP counter, _rcu_cur_gp. Cache coherency eventually makes
    6060 * the updated last_seen_gp visible to the detector cpu, much like it
    6161 * delivered the changed _rcu_cur_gp to all cpus.
    62  * 
    63  * The detector waits a while after starting a GP and then reads each 
    64  * cpu's last_seen_gp to see if it reached a QS. If a cpu did not record 
     62 *
     63 * The detector waits a while after starting a GP and then reads each
     64 * cpu's last_seen_gp to see if it reached a QS. If a cpu did not record
    6565 * a QS (might be a long running thread without an RCU reader CS; or cache
    6666 * coherency has yet to make the most current last_seen_gp visible to
     
    6868 * via an IPI. If the IPI handler finds the cpu still in a CS, it instructs
    6969 * the cpu to notify the detector that it had exited the CS via a semaphore
    70  * (CPU->rcu.is_delaying_gp). 
     70 * (CPU->rcu.is_delaying_gp).
    7171 * The detector then waits on the semaphore for any cpus to exit their
    72  * CSs. Lastly, it waits for the last reader preempted in a CS to 
     72 * CSs. Lastly, it waits for the last reader preempted in a CS to
    7373 * exit its CS if there were any and signals the end of the GP to
    7474 * separate reclaimer threads wired to each cpu. Reclaimers then
    7575 * execute the callbacks queued on each of the cpus.
    76  * 
    77  * 
     76 *
     77 *
    7878 * @par A-RCU algorithm (RCU_PREEMPT_A)
    79  * 
     79 *
    8080 * A-RCU is based on the user space rcu algorithm in [3] utilizing signals
    81  * (urcu) and Podzimek's rcu [1]. Like in Podzimek's rcu, callbacks are 
    82  * executed by cpu-bound reclaimer threads. There is however no dedicated 
    83  * detector thread and the reclaimers take on the responsibilities of the 
    84  * detector when they need to start a new GP. A new GP is again announced 
     81 * (urcu) and Podzimek's rcu [1]. Like in Podzimek's rcu, callbacks are
     82 * executed by cpu-bound reclaimer threads. There is however no dedicated
     83 * detector thread and the reclaimers take on the responsibilities of the
     84 * detector when they need to start a new GP. A new GP is again announced
    8585 * and acknowledged with _rcu_cur_gp and the cpu local last_seen_gp. Unlike
    86  * Podzimek's rcu, cpus check explicitly for QS only during context switches. 
     86 * Podzimek's rcu, cpus check explicitly for QS only during context switches.
    8787 * Like in urcu, rcu_read_lock()/unlock() only maintain the nesting count
    8888 * and never issue any memory barriers. This makes rcu_read_lock()/unlock()
    8989 * simple and fast.
    90  * 
     90 *
    9191 * If a new callback is queued for a reclaimer and no GP is in progress,
    92  * the reclaimer takes on the role of a detector. The detector increments 
    93  * _rcu_cur_gp in order to start a new GP. It waits a while to give cpus 
     92 * the reclaimer takes on the role of a detector. The detector increments
     93 * _rcu_cur_gp in order to start a new GP. It waits a while to give cpus
    9494 * a chance to switch a context (a natural QS). Then, it examines each
    9595 * non-idle cpu that has yet to pass a QS via an IPI. The IPI handler
     
    9898 * finds the cpu in a CS it does nothing and let the detector poll/interrupt
    9999 * the cpu again after a short sleep.
    100  * 
     100 *
    101101 * @par Caveats
    102  * 
     102 *
    103103 * last_seen_gp and _rcu_cur_gp are always 64bit variables and they
    104104 * are read non-atomically on 32bit machines. Reading a clobbered
    105105 * value of last_seen_gp or _rcu_cur_gp or writing a clobbered value
    106106 * of _rcu_cur_gp to last_seen_gp will at worst force the detector
    107  * to unnecessarily interrupt a cpu. Interrupting a cpu makes the 
     107 * to unnecessarily interrupt a cpu. Interrupting a cpu makes the
    108108 * correct value of _rcu_cur_gp visible to the cpu and correctly
    109109 * resets last_seen_gp in both algorithms.
    110  * 
    111  * 
    112  * 
     110 *
     111 *
     112 *
    113113 * [1] Read-copy-update for opensolaris,
    114114 *     2010, Podzimek
    115115 *     https://andrej.podzimek.org/thesis.pdf
    116  * 
     116 *
    117117 * [2] (podzimek-rcu) implementation file "rcu.patch"
    118118 *     http://d3s.mff.cuni.cz/projects/operating_systems/rcu/rcu.patch
    119  * 
     119 *
    120120 * [3] User-level implementations of read-copy update,
    121121 *     2012, appendix
    122122 *     http://www.rdrop.com/users/paulmck/RCU/urcu-supp-accepted.2011.08.30a.pdf
    123  * 
     123 *
    124124 */
    125125
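To make the reader-side protocol described in the header comment above concrete, here is a minimal sketch of the quiescent-state check performed by an outer-most rcu_read_lock()/unlock() in the Podzimek variant. It is not a quote from rcu.c; the helper name is hypothetical and only the fields named in the comment (CPU->rcu.last_seen_gp, _rcu_cur_gp) are assumed.

    /* Hypothetical sketch, Podzimek variant: announce a QS when the outer-most
     * reader lock/unlock notices that the detector started a new GP. */
    static inline void record_qs_if_new_gp(void)
    {
            if (CPU->rcu.last_seen_gp != _rcu_cur_gp) {
                    /* Contain this reader's memory accesses within the CS and
                     * make writers' pre-GP changes visible to it. */
                    memory_barrier();
                    /* Acknowledge the new GP; cache coherency eventually
                     * delivers the updated last_seen_gp to the detector cpu. */
                    CPU->rcu.last_seen_gp = _rcu_cur_gp;
            }
    }

In the A-RCU variant the same functions would only adjust the nesting count and never issue memory barriers, which is what keeps rcu_read_lock()/unlock() simple and fast; explicit QS checks happen only during context switches.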
     
    139139#include <macros.h>
    140140
    141 /* 
    142  * Number of milliseconds to give to preexisting readers to finish 
     141/*
     142 * Number of milliseconds to give to preexisting readers to finish
    143143 * when non-expedited grace period detection is in progress.
    144144 */
    145145#define DETECT_SLEEP_MS    10
    146 /* 
    147  * Max number of pending callbacks in the local cpu's queue before 
     146/*
     147 * Max number of pending callbacks in the local cpu's queue before
    148148 * aggressively expediting the current grace period
    149149 */
     
    159159#define UINT32_MAX_HALF    2147483648U
    160160
    161 /** 
    162  * The current grace period number. Increases monotonically. 
     161/**
     162 * The current grace period number. Increases monotonically.
    163163 * Lock rcu.gp_lock or rcu.preempt_lock to get a current value.
    164164 */
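As the comment says, a consistent value of the 64-bit counter is only guaranteed under rcu.gp_lock or rcu.preempt_lock; a minimal usage sketch of the assumed pattern, not quoted from the file:

    spinlock_lock(&rcu.gp_lock);
    rcu_gp_t cur_gp = _rcu_cur_gp;  /* multiword read is consistent under the lock */
    spinlock_unlock(&rcu.gp_lock);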
     
    171171        /** Reclaimers use to notify the detector to accelerate GP detection. */
    172172        condvar_t expedite_now;
    173         /** 
     173        /**
    174174         * Protects: req_gp_end_cnt, req_expedited_cnt, completed_gp, _rcu_cur_gp;
    175175         * or: completed_gp, _rcu_cur_gp
     
    177177        SPINLOCK_DECLARE(gp_lock);
    178178        /**
    179          * The number of the most recently completed grace period. At most 
    180          * one behind _rcu_cur_gp. If equal to _rcu_cur_gp, a grace period 
     179         * The number of the most recently completed grace period. At most
     180         * one behind _rcu_cur_gp. If equal to _rcu_cur_gp, a grace period
    181181         * detection is not in progress and the detector is idle.
    182182         */
     
    189189        /** Reader that have been preempted and might delay the next grace period.*/
    190190        list_t next_preempted;
    191         /** 
    192          * The detector is waiting for the last preempted reader 
    193          * in cur_preempted to announce that it exited its reader 
     191        /**
     192         * The detector is waiting for the last preempted reader
     193         * in cur_preempted to announce that it exited its reader
    194194         * section by up()ing remaining_readers.
    195195         */
     
    198198#ifdef RCU_PREEMPT_A
    199199       
    200         /** 
    201          * The detector waits on this semaphore for any preempted readers 
     200        /**
     201         * The detector waits on this semaphore for any preempted readers
    202202         * delaying the grace period once all cpus pass a quiescent state.
    203203         */
     
    212212        /** Number of consecutive grace periods to detect quickly and aggressively.*/
    213213        size_t req_expedited_cnt;
    214         /** 
     214        /**
    215215         * Number of cpus with readers that are delaying the current GP.
    216216         * They will up() remaining_readers.
    217217         */
    218218        atomic_t delaying_cpu_cnt;
    219         /** 
     219        /**
    220220         * The detector waits on this semaphore for any readers delaying the GP.
    221          * 
    222          * Each of the cpus with readers that are delaying the current GP 
    223          * must up() this sema once they reach a quiescent state. If there 
    224          * are any readers in cur_preempted (ie preempted preexisting) and 
     221         *
     222         * Each of the cpus with readers that are delaying the current GP
     223         * must up() this sema once they reach a quiescent state. If there
     224         * are any readers in cur_preempted (ie preempted preexisting) and
    225225         * they are already delaying GP detection, the last to unlock its
    226226         * reader section must up() this sema once.
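A hedged sketch of how a detector could drain this semaphore once the delaying cpus have been counted; the names mirror the fields documented above, but the function name is invented and the real control flow in rcu.c may differ (the atomic_get() usage is likewise an assumption):

    /* Illustrative only: each delaying cpu (or the last preempted reader)
     * up()s remaining_readers exactly once. Returns false if interrupted. */
    static bool drain_delaying_cpus(void)
    {
            size_t delaying = atomic_get(&rcu.delaying_cpu_cnt);
            for (size_t i = 0; i < delaying; ++i) {
                    if (!semaphore_down_interruptable(&rcu.remaining_readers))
                            return false;
            }
            return true;
    }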
     
    252252static void start_reclaimers(void);
    253253static void synch_complete(rcu_item_t *rcu_item);
    254 static inline void rcu_call_impl(bool expedite, rcu_item_t *rcu_item, 
     254static inline void rcu_call_impl(bool expedite, rcu_item_t *rcu_item,
    255255        rcu_func_t func);
    256256static void add_barrier_cb(void *arg);
     
    396396
    397397
    398 /** Cleans up global RCU resources and stops dispatching callbacks. 
    399  * 
     398/** Cleans up global RCU resources and stops dispatching callbacks.
     399 *
    400400 * Call when shutting down the kernel. Outstanding callbacks will
    401401 * not be processed. Instead they will linger forever.
     
    444444                snprintf(name, THREAD_NAME_BUFLEN - 1, "rcu-rec/%u", cpu_id);
    445445               
    446                 cpus[cpu_id].rcu.reclaimer_thr = 
     446                cpus[cpu_id].rcu.reclaimer_thr =
    447447                        thread_create(reclaimer, NULL, TASK, THREAD_FLAG_NONE, name);
    448448
    449                 if (!cpus[cpu_id].rcu.reclaimer_thr) 
     449                if (!cpus[cpu_id].rcu.reclaimer_thr)
    450450                        panic("Failed to create RCU reclaimer thread on cpu%u.", cpu_id);
    451451
     
    460460static void start_detector(void)
    461461{
    462         rcu.detector_thr = 
     462        rcu.detector_thr =
    463463                thread_create(detector, NULL, TASK, THREAD_FLAG_NONE, "rcu-det");
    464464       
    465         if (!rcu.detector_thr) 
     465        if (!rcu.detector_thr)
    466466                panic("Failed to create RCU detector thread.");
    467467       
     
    479479}
    480480
    481 /** Unlocks the local reader section using the given nesting count. 
    482  * 
    483  * Preemption or interrupts must be disabled. 
    484  * 
    485  * @param pnesting_cnt Either &CPU->rcu.tmp_nesting_cnt or 
     481/** Unlocks the local reader section using the given nesting count.
     482 *
     483 * Preemption or interrupts must be disabled.
     484 *
     485 * @param pnesting_cnt Either &CPU->rcu.tmp_nesting_cnt or
    486486 *           THREAD->rcu.nesting_cnt.
    487487 */
     
    493493                _rcu_record_qs();
    494494               
    495                 /* 
    496                  * The thread was preempted while in a critical section or 
    497                  * the detector is eagerly waiting for this cpu's reader 
    498                  * to finish. 
    499                  * 
     495                /*
     496                 * The thread was preempted while in a critical section or
     497                 * the detector is eagerly waiting for this cpu's reader
     498                 * to finish.
     499                 *
    500500                 * Note that THREAD may be NULL in scheduler() and not just during boot.
    501501                 */
     
    518518         */
    519519       
    520         /* 
     520        /*
    521521         * If the detector is eagerly waiting for this cpu's reader to unlock,
    522522         * notify it that the reader did so.
     
    566566        assert(!rcu_read_locked());
    567567       
    568         synch_item_t completion; 
     568        synch_item_t completion;
    569569
    570570        waitq_initialize(&completion.wq);
     
    584584void rcu_barrier(void)
    585585{
    586         /* 
     586        /*
    587587         * Serialize rcu_barrier() calls so we don't overwrite cpu.barrier_item
    588588         * currently in use by rcu_barrier().
     
    590590        mutex_lock(&rcu.barrier_mtx);
    591591       
    592         /* 
     592        /*
    593593         * Ensure we queue a barrier callback on all cpus before the already
    594594         * enqueued barrier callbacks start signaling completion.
     
    610610}
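For orientation, a hedged usage sketch of the barrier semantics (the callback and item names are illustrative, not taken from this file): rcu_barrier() returns only after every callback queued before the call, on any cpu, has been executed.

    rcu_call(&item->rcu_item, cleanup_cb);  /* deferred cleanup queued earlier */
    rcu_barrier();                          /* all previously queued callbacks,
                                             * including cleanup_cb, have run */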
    611611
    612 /** Issues a rcu_barrier() callback on the local cpu. 
    613  * 
    614  * Executed with interrupts disabled. 
     612/** Issues a rcu_barrier() callback on the local cpu.
     613 *
     614 * Executed with interrupts disabled.
    615615 */
    616616static void add_barrier_cb(void *arg)
     
    631631}
    632632
    633 /** Adds a callback to invoke after all preexisting readers finish. 
    634  * 
     633/** Adds a callback to invoke after all preexisting readers finish.
     634 *
    635635 * May be called from within interrupt handlers or RCU reader sections.
    636  * 
     636 *
    637637 * @param rcu_item Used by RCU to track the call. Must remain
    638638 *         until the user callback function is entered.
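A minimal, hypothetical usage sketch of this interface (node_t, reclaim_node() and the member_to_inst() conversion are illustrative assumptions): the rcu_item_t is embedded in the protected object and converted back to it inside the callback.

    typedef struct node {
            rcu_item_t rcu_item;  /* must remain valid until the callback is entered */
            int payload;
    } node_t;

    static void reclaim_node(rcu_item_t *item)
    {
            node_t *node = member_to_inst(item, node_t, rcu_item);
            free(node);
    }

    /* After unlinking node from every RCU-visible structure: */
    rcu_call(&node->rcu_item, reclaim_node);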
     
    655655
    656656/** rcu_call() inline-able implementation. See rcu_call() for comments. */
    657 static inline void rcu_call_impl(bool expedite, rcu_item_t *rcu_item, 
     657static inline void rcu_call_impl(bool expedite, rcu_item_t *rcu_item,
    658658        rcu_func_t func)
    659659{
     
    667667        rcu_cpu_data_t *r = &CPU->rcu;
    668668
    669         rcu_item_t **prev_tail 
     669        rcu_item_t **prev_tail
    670670                = local_atomic_exchange(&r->parriving_cbs_tail, &rcu_item->next);
    671671        *prev_tail = rcu_item;
     
    704704{
    705705        assert(THREAD && THREAD->wired);
    706         /* 
    707          * Accessing with interrupts enabled may at worst lead to 
     706        /*
     707         * Accessing with interrupts enabled may at worst lead to
    708708         * a false negative if we race with a local interrupt handler.
    709709         */
     
    740740static bool wait_for_pending_cbs(void)
    741741{
    742         if (!all_cbs_empty()) 
     742        if (!all_cbs_empty())
    743743                return true;
    744744
     
    772772                if (exec_cnt < CRITICAL_THRESHOLD) {
    773773                        exec_cbs(&CPU->rcu.cur_cbs);
    774                         exec_cbs(&CPU->rcu.next_cbs);   
     774                        exec_cbs(&CPU->rcu.next_cbs);
    775775                } else {
    776                         /* 
    777                          * Getting overwhelmed with too many callbacks to run. 
    778                          * Disable preemption in order to prolong our time slice 
     776                        /*
     777                         * Getting overwhelmed with too many callbacks to run.
     778                         * Disable preemption in order to prolong our time slice
    779779                         * and catch up with updaters posting new callbacks.
    780780                         */
    781781                        preemption_disable();
    782782                        exec_cbs(&CPU->rcu.cur_cbs);
    783                         exec_cbs(&CPU->rcu.next_cbs);   
     783                        exec_cbs(&CPU->rcu.next_cbs);
    784784                        preemption_enable();
    785785                }
     
    792792                        exec_cbs(&CPU->rcu.cur_cbs);
    793793                } else {
    794                         /* 
    795                          * Getting overwhelmed with too many callbacks to run. 
    796                          * Disable preemption in order to prolong our time slice 
     794                        /*
     795                         * Getting overwhelmed with too many callbacks to run.
     796                         * Disable preemption in order to prolong our time slice
    797797                         * and catch up with updaters posting new callbacks.
    798798                         */
     
    828828        CPU->rcu.stat_max_cbs = max(arriving_cnt, CPU->rcu.stat_max_cbs);
    829829        if (0 < arriving_cnt) {
    830                 CPU->rcu.stat_avg_cbs = 
     830                CPU->rcu.stat_avg_cbs =
    831831                        (99 * CPU->rcu.stat_avg_cbs + 1 * arriving_cnt) / 100;
    832832        }
     
    834834
    835835/** Prepares another batch of callbacks to dispatch at the nest grace period.
    836  * 
     836 *
    837837 * @return True if the next batch of callbacks must be expedited quickly.
    838838 */
     
    849849        CPU->rcu.arriving_cbs_cnt = 0;
    850850       
    851         /* 
     851        /*
    852852         * Too many callbacks queued. Better speed up the detection
    853853         * or risk exhausting all system memory.
    854854         */
    855855        bool expedite = (EXPEDITE_THRESHOLD < CPU->rcu.next_cbs_cnt)
    856                 || CPU->rcu.expedite_arriving; 
     856                || CPU->rcu.expedite_arriving;
    857857        CPU->rcu.expedite_arriving = false;
    858858
     
    860860        CPU->rcu.next_cbs = CPU->rcu.arriving_cbs;
    861861       
    862         /* 
     862        /*
    863863         * At least one callback arrived. The tail therefore does not point
    864864         * to the head of arriving_cbs and we can safely reset it to NULL.
     
    873873                ACCESS_ONCE(CPU->rcu.parriving_cbs_tail) = &CPU->rcu.arriving_cbs;
    874874        } else {
    875                 /* 
    876                  * arriving_cbs was null and parriving_cbs_tail pointed to it 
     875                /*
     876                 * arriving_cbs was null and parriving_cbs_tail pointed to it
    877877                 * so leave it that way. Note that interrupt handlers may have
    878878                 * added a callback in the meantime so it is not safe to reset
     
    884884        upd_stat_cb_cnts(CPU->rcu.next_cbs_cnt);
    885885       
    886         /* 
    887          * Make changes prior to queuing next_cbs visible to readers. 
     886        /*
     887         * Make changes prior to queuing next_cbs visible to readers.
    888888         * See comment in wait_for_readers().
    889889         */
     
    898898                CPU->rcu.next_cbs_gp = _rcu_cur_gp + 1;
    899899               
    900                 /* 
     900                /*
    901901                 * There are no callbacks to invoke before next_cbs. Instruct
    902902                 * wait_for_cur_cbs_gp() to notify us of the nearest GP end.
    903                  * That could be sooner than next_cbs_gp (if the current GP 
     903                 * That could be sooner than next_cbs_gp (if the current GP
    904904                 * had not yet completed), so we'll create a shorter batch
    905905                 * of callbacks next time around.
     
    907907                if (cur_cbs_empty()) {
    908908                        CPU->rcu.cur_cbs_gp = rcu.completed_gp + 1;
    909                 } 
     909                }
    910910               
    911911                spinlock_unlock(&rcu.gp_lock);
     
    916916        assert(CPU->rcu.cur_cbs_gp <= CPU->rcu.next_cbs_gp);
    917917       
    918         return expedite;       
     918        return expedite;
    919919}
    920920
     
    922922#ifdef RCU_PREEMPT_A
    923923
    924 /** Waits for the grace period associated with callbacks cub_cbs to elapse. 
    925  * 
    926  * @param expedite Instructs the detector to aggressively speed up grace 
     924/** Waits for the grace period associated with callbacks cub_cbs to elapse.
     925 *
     926 * @param expedite Instructs the detector to aggressively speed up grace
    927927 *            period detection without any delay.
    928  * @param completed_gp Returns the most recent completed grace period 
     928 * @param completed_gp Returns the most recent completed grace period
    929929 *            number.
    930930 * @return false if the thread was interrupted and should stop.
     
    951951                        condvar_broadcast(&rcu.gp_ended);
    952952                } else {
    953                         /* GP detection is in progress.*/ 
     953                        /* GP detection is in progress.*/
    954954                       
    955                         if (expedite) 
     955                        if (expedite)
    956956                                condvar_signal(&rcu.expedite_now);
    957957                       
    958958                        /* Wait for the GP to complete. */
    959                         errno_t ret = _condvar_wait_timeout_spinlock(&rcu.gp_ended, &rcu.gp_lock, 
     959                        errno_t ret = _condvar_wait_timeout_spinlock(&rcu.gp_ended, &rcu.gp_lock,
    960960                                SYNCH_NO_TIMEOUT, SYNCH_FLAGS_INTERRUPTIBLE);
    961961                       
    962962                        if (ret == EINTR) {
    963963                                spinlock_unlock(&rcu.gp_lock);
    964                                 return false;                   
     964                                return false;
    965965                        }
    966966                }
     
    984984        while (!cpu_mask_is_none(reader_cpus)) {
    985985                /* Give cpus a chance to context switch (a QS) and batch callbacks. */
    986                 if(!gp_sleep(&expedite)) 
     986                if(!gp_sleep(&expedite))
    987987                        return false;
    988988               
     
    996996        }
    997997       
    998         /* 
     998        /*
    999999         * All cpus have passed through a QS and see the most recent _rcu_cur_gp.
    10001000         * As a result newly preempted readers will associate with next_preempted
     
    10381038               
    10391039        if (locked && !passed_qs) {
    1040                 /* 
     1040                /*
    10411041                 * This cpu has not yet passed a quiescent state during this grace
    10421042                 * period and it is currently in a reader section. We'll have to
     
    10571057        assert(interrupts_disabled());
    10581058
    1059         /* 
    1060          * In order not to worry about NMI seeing rcu_nesting change work 
     1059        /*
     1060         * In order not to worry about NMI seeing rcu_nesting change work
    10611061         * with a local copy.
    10621062         */
    10631063        size_t nesting_cnt = local_atomic_exchange(&THE->rcu_nesting, 0);
    10641064       
    1065         /* 
     1065        /*
    10661066         * Ensures NMIs see .rcu_nesting without the WAS_PREEMPTED mark and
    10671067         * do not accidentally call rm_preempted_reader() from unlock().
     
    10791079
    10801080        if (CPU->rcu.last_seen_gp != _rcu_cur_gp) {
    1081                 /* 
    1082                  * Contain any memory accesses of old readers before announcing a QS. 
     1081                /*
     1082                 * Contain any memory accesses of old readers before announcing a QS.
    10831083                 * Also make changes from the previous GP visible to this cpu.
    1084                  * Moreover it separates writing to last_seen_gp from 
     1084                 * Moreover it separates writing to last_seen_gp from
    10851085                 * note_preempted_reader().
    10861086                 */
    10871087                memory_barrier();
    1088                 /* 
     1088                /*
    10891089                 * The preempted reader has been noted globally. There are therefore
    10901090                 * no readers running on this cpu so this is a quiescent state.
    1091                  * 
    1092                  * Reading the multiword _rcu_cur_gp non-atomically is benign. 
     1091                 *
     1092                 * Reading the multiword _rcu_cur_gp non-atomically is benign.
    10931093                 * At worst, the read value will be different from the actual value.
    10941094                 * As a result, both the detector and this cpu will believe
    10951095                 * this cpu has not yet passed a QS although it really did.
    1096                  * 
     1096                 *
    10971097                 * Reloading _rcu_cur_gp is benign, because it cannot change
    10981098                 * until this cpu acknowledges it passed a QS by writing to
     
    11031103        }
    11041104
    1105         /* 
     1105        /*
    11061106         * Forcefully associate the reclaimer with the highest priority
    11071107         * even if preempted due to its time slice running out.
     
    11091109        if (THREAD == CPU->rcu.reclaimer_thr) {
    11101110                THREAD->priority = -1;
    1111         } 
     1111        }
    11121112       
    11131113        upd_max_cbs_in_slice(CPU->rcu.arriving_cbs_cnt);
     
    11231123}
    11241124
    1125 /** Called from scheduler() when exiting the current thread. 
    1126  * 
     1125/** Called from scheduler() when exiting the current thread.
     1126 *
    11271127 * Preemption or interrupts are disabled and the scheduler() already
    11281128 * switched away from the current thread, calling rcu_after_thread_ran().
     
    11321132        assert(THE->rcu_nesting == 0);
    11331133       
    1134         /* 
    1135          * The thread forgot to exit its reader critical section. 
     1134        /*
     1135         * The thread forgot to exit its reader critical section.
    11361136         * It is a bug, but rather than letting the entire system lock up
    1137          * forcefully leave the reader section. The thread is not holding 
     1137         * forcefully leave the reader section. The thread is not holding
    11381138         * any references anyway since it is exiting so it is safe.
    11391139         */
     
    11621162        size_t prev = local_atomic_exchange(&THE->rcu_nesting, 0);
    11631163        if (prev == RCU_WAS_PREEMPTED) {
    1164                 /* 
     1164                /*
    11651165                 * NMI handlers are never preempted but may call rm_preempted_reader()
    11661166                 * if a NMI occurred in _rcu_preempted_unlock() of a preempted thread.
     
    11681168                 * in _rcu_preempted_unlock() is: an IPI/sample_local_cpu() and
    11691169                 * the initial part of rcu_after_thread_ran().
    1170                  * 
     1170                 *
    11711171                 * rm_preempted_reader() will not deadlock because none of the locks
    11721172                 * it uses are locked in this case. Neither _rcu_preempted_unlock()
     
    11801180#elif defined(RCU_PREEMPT_PODZIMEK)
    11811181
    1182 /** Waits for the grace period associated with callbacks cub_cbs to elapse. 
    1183  * 
    1184  * @param expedite Instructs the detector to aggressively speed up grace 
     1182/** Waits for the grace period associated with callbacks cub_cbs to elapse.
     1183 *
     1184 * @param expedite Instructs the detector to aggressively speed up grace
    11851185 *            period detection without any delay.
    1186  * @param completed_gp Returns the most recent completed grace period 
     1186 * @param completed_gp Returns the most recent completed grace period
    11871187 *            number.
    11881188 * @return false if the thread was interrupted and should stop.
     
    11901190static bool wait_for_cur_cbs_gp_end(bool expedite, rcu_gp_t *completed_gp)
    11911191{
    1192         /* 
     1192        /*
    11931193         * Use a possibly outdated version of completed_gp to bypass checking
    11941194         * with the lock.
    1195          * 
    1196          * Note that loading and storing rcu.completed_gp is not atomic 
    1197          * (it is 64bit wide). Reading a clobbered value that is less than 
    1198          * rcu.completed_gp is harmless - we'll recheck with a lock. The 
    1199          * only way to read a clobbered value that is greater than the actual 
    1200          * value is if the detector increases the higher-order word first and 
    1201          * then decreases the lower-order word (or we see stores in that order), 
    1202          * eg when incrementing from 2^32 - 1 to 2^32. The loaded value 
    1203          * suddenly jumps by 2^32. It would take hours for such an increase 
    1204          * to occur so it is safe to discard the value. We allow increases 
     1195         *
     1196         * Note that loading and storing rcu.completed_gp is not atomic
     1197         * (it is 64bit wide). Reading a clobbered value that is less than
     1198         * rcu.completed_gp is harmless - we'll recheck with a lock. The
     1199         * only way to read a clobbered value that is greater than the actual
     1200         * value is if the detector increases the higher-order word first and
     1201         * then decreases the lower-order word (or we see stores in that order),
     1202         * eg when incrementing from 2^32 - 1 to 2^32. The loaded value
     1203         * suddenly jumps by 2^32. It would take hours for such an increase
     1204         * to occur so it is safe to discard the value. We allow increases
    12051205         * of up to half the maximum to generously accommodate for loading an
    12061206         * outdated lower word.
    12071207         */
    12081208        rcu_gp_t compl_gp = ACCESS_ONCE(rcu.completed_gp);
    1209         if (CPU->rcu.cur_cbs_gp <= compl_gp 
     1209        if (CPU->rcu.cur_cbs_gp <= compl_gp
    12101210                && compl_gp <= CPU->rcu.cur_cbs_gp + UINT32_MAX_HALF) {
    12111211                *completed_gp = compl_gp;
     
    12241224        assert(_rcu_cur_gp <= CPU->rcu.cur_cbs_gp);
    12251225       
    1226         /* 
    1227          * Notify the detector of how many GP ends we intend to wait for, so 
     1226        /*
     1227         * Notify the detector of how many GP ends we intend to wait for, so
    12281228         * it can avoid going to sleep unnecessarily. Optimistically assume
    12291229         * new callbacks will arrive while we're waiting; hence +1.
     
    12321232        req_detection(remaining_gp_ends + (arriving_cbs_empty() ? 0 : 1));
    12331233       
    1234         /* 
    1235          * Ask the detector to speed up GP detection if there are too many 
     1234        /*
     1235         * Ask the detector to speed up GP detection if there are too many
    12361236         * pending callbacks and other reclaimers have not already done so.
    12371237         */
    12381238        if (expedite) {
    1239                 if(0 == rcu.req_expedited_cnt) 
     1239                if(0 == rcu.req_expedited_cnt)
    12401240                        condvar_signal(&rcu.expedite_now);
    12411241               
    1242                 /* 
    1243                  * Expedite only cub_cbs. If there really is a surge of callbacks 
     1242                /*
     1243                 * Expedite only cub_cbs. If there really is a surge of callbacks
    12441244                 * the arriving batch will expedite the GP for the huge number
    12451245                 * of callbacks currently in next_cbs
     
    12521252       
    12531253        *completed_gp = rcu.completed_gp;
    1254         spinlock_unlock(&rcu.gp_lock); 
     1254        spinlock_unlock(&rcu.gp_lock);
    12551255       
    12561256        if (!interrupted)
     
    12691269        /* Wait until wait_on_gp ends. */
    12701270        while (rcu.completed_gp < wait_on_gp && !interrupted) {
    1271                 int ret = _condvar_wait_timeout_spinlock(&rcu.gp_ended, &rcu.gp_lock, 
     1271                int ret = _condvar_wait_timeout_spinlock(&rcu.gp_ended, &rcu.gp_lock,
    12721272                        SYNCH_NO_TIMEOUT, SYNCH_FLAGS_INTERRUPTIBLE);
    12731273                interrupted = (ret == EINTR);
     
    12981298       
    12991299        while (wait_for_detect_req()) {
    1300                 /* 
     1300                /*
    13011301                 * Announce new GP started. Readers start lazily acknowledging that
    13021302                 * they passed a QS.
     
    13061306                spinlock_unlock(&rcu.gp_lock);
    13071307               
    1308                 if (!wait_for_readers()) 
     1308                if (!wait_for_readers())
    13091309                        goto unlocked_out;
    13101310               
     
    13291329       
    13301330        while (0 == rcu.req_gp_end_cnt && !interrupted) {
    1331                 int ret = _condvar_wait_timeout_spinlock(&rcu.req_gp_changed, 
     1331                int ret = _condvar_wait_timeout_spinlock(&rcu.req_gp_changed,
    13321332                        &rcu.gp_lock, SYNCH_NO_TIMEOUT, SYNCH_FLAGS_INTERRUPTIBLE);
    13331333               
     
    13571357        cpu_mask_active(reading_cpus);
    13581358
    1359         /* 
    1360          * Give readers time to pass through a QS. Also, batch arriving 
     1359        /*
     1360         * Give readers time to pass through a QS. Also, batch arriving
    13611361         * callbacks in order to amortize detection overhead.
    13621362         */
     
    14171417}
    14181418
    1419 /** Invoked on a cpu delaying grace period detection. 
    1420  * 
    1421  * Induces a quiescent state for the cpu or it instructs remaining 
     1419/** Invoked on a cpu delaying grace period detection.
     1420 *
     1421 * Induces a quiescent state for the cpu or it instructs remaining
    14221422 * readers to notify the detector once they finish.
    14231423 */
     
    14321432                if (0 < CPU->rcu.nesting_cnt) {
    14331433                        assert(!CPU->idle);
    1434                         /* 
    1435                          * Note to notify the detector from rcu_read_unlock(). 
    1436                          * 
     1434                        /*
     1435                         * Note to notify the detector from rcu_read_unlock().
     1436                         *
    14371437                         * ACCESS_ONCE ensures the compiler writes to is_delaying_gp
    14381438                         * only after it determines that we are in a reader CS.
     
    14431443                        atomic_inc(&rcu.delaying_cpu_cnt);
    14441444                } else {
    1445                         /* 
    1446                          * The cpu did not enter any rcu reader sections since 
     1445                        /*
     1446                         * The cpu did not enter any rcu reader sections since
    14471447                         * the start of the current GP. Record a quiescent state.
    1448                          * 
     1448                         *
    14491449                         * Or, we interrupted rcu_read_unlock_impl() right before
    1450                          * it recorded a QS. Record a QS for it. The memory barrier 
    1451                          * contains the reader section's mem accesses before 
     1450                         * it recorded a QS. Record a QS for it. The memory barrier
     1451                         * contains the reader section's mem accesses before
    14521452                         * updating last_seen_gp.
    1453                          * 
     1453                         *
    14541454                         * Or, we interrupted rcu_read_lock() right after it recorded
    14551455                         * a QS for the previous GP but before it got a chance to
     
    14611461                }
    14621462        } else {
    1463                 /* 
    1464                  * This cpu already acknowledged that it had passed through 
    1465                  * a quiescent state since the start of cur_gp. 
     1463                /*
     1464                 * This cpu already acknowledged that it had passed through
     1465                 * a quiescent state since the start of cur_gp.
    14661466                 */
    14671467        }
    14681468       
    1469         /* 
     1469        /*
    14701470         * smp_call() makes sure any changes propagate back to the caller.
    14711471         * In particular, it makes the most current last_seen_gp visible
     
    14951495        assert(interrupts_disabled());
    14961496
    1497         /* 
     1497        /*
    14981498         * Prevent NMI handlers from interfering. The detector will be notified
    1499          * in this function if CPU->rcu.is_delaying_gp. The current thread is 
     1499         * in this function if CPU->rcu.is_delaying_gp. The current thread is
    15001500         * no longer running so there is nothing else to signal to the detector.
    15011501         */
    15021502        CPU->rcu.signal_unlock = false;
    1503         /* 
    1504          * Separates clearing of .signal_unlock from accesses to 
     1503        /*
     1504         * Separates clearing of .signal_unlock from accesses to
    15051505         * THREAD->rcu.was_preempted and CPU->rcu.nesting_cnt.
    15061506         */
     
    15161516        }
    15171517       
    1518         /* 
     1518        /*
    15191519         * The preempted reader has been noted globally. There are therefore
    15201520         * no readers running on this cpu so this is a quiescent state.
     
    15221522        _rcu_record_qs();
    15231523
    1524         /* 
    1525          * Interrupt handlers might use RCU while idle in scheduler(). 
    1526          * The preempted reader has been noted globally, so the handlers 
     1524        /*
     1525         * Interrupt handlers might use RCU while idle in scheduler().
     1526         * The preempted reader has been noted globally, so the handlers
    15271527         * may now start announcing quiescent states.
    15281528         */
    15291529        CPU->rcu.nesting_cnt = 0;
    15301530       
    1531         /* 
    1532          * This cpu is holding up the current GP. Let the detector know 
    1533          * it has just passed a quiescent state. 
    1534          * 
    1535          * The detector waits separately for preempted readers, so we have 
     1531        /*
     1532         * This cpu is holding up the current GP. Let the detector know
     1533         * it has just passed a quiescent state.
     1534         *
     1535         * The detector waits separately for preempted readers, so we have
    15361536         * to notify the detector even if we have just preempted a reader.
    15371537         */
     
    15411541        }
    15421542
    1543         /* 
     1543        /*
    15441544         * Forcefully associate the detector with the highest priority
    15451545         * even if preempted due to its time slice running out.
    1546          * 
     1546         *
    15471547         * todo: Replace with strict scheduler priority classes.
    15481548         */
    15491549        if (THREAD == rcu.detector_thr) {
    15501550                THREAD->priority = -1;
    1551         } 
     1551        }
    15521552        else if (THREAD == CPU->rcu.reclaimer_thr) {
    15531553                THREAD->priority = -1;
    1554         } 
     1554        }
    15551555       
    15561556        upd_max_cbs_in_slice(CPU->rcu.arriving_cbs_cnt);
     
    15661566        CPU->rcu.nesting_cnt = THREAD->rcu.nesting_cnt;
    15671567       
    1568         /* 
     1568        /*
    15691569         * Ensures NMI see the proper nesting count before .signal_unlock.
    15701570         * Otherwise the NMI may incorrectly signal that a preempted reader
     
    15731573        compiler_barrier();
    15741574       
    1575         /* 
    1576          * In the unlikely event that a NMI occurs between the loading of the 
    1577          * variables and setting signal_unlock, the NMI handler may invoke 
     1575        /*
     1576         * In the unlikely event that a NMI occurs between the loading of the
     1577         * variables and setting signal_unlock, the NMI handler may invoke
    15781578         * rcu_read_unlock() and clear signal_unlock. In that case we will
    15791579         * incorrectly overwrite signal_unlock from false to true. This event
    1580          * is benign and the next rcu_read_unlock() will at worst 
     1580         * is benign and the next rcu_read_unlock() will at worst
    15811581         * needlessly invoke _rcu_signal_unlock().
    15821582         */
     
    15841584}
    15851585
    1586 /** Called from scheduler() when exiting the current thread. 
    1587  * 
     1586/** Called from scheduler() when exiting the current thread.
     1587 *
    15881588 * Preemption or interrupts are disabled and the scheduler() already
    15891589 * switched away from the current thread, calling rcu_after_thread_ran().
     
    15951595        assert(PREEMPTION_DISABLED || interrupts_disabled());
    15961596       
    1597         /* 
    1598          * The thread forgot to exit its reader critical section. 
     1597        /*
     1598         * The thread forgot to exit its reader critical section.
    15991599         * It is a bug, but rather than letting the entire system lock up
    1600          * forcefully leave the reader section. The thread is not holding 
     1600         * forcefully leave the reader section. The thread is not holding
    16011601         * any references anyway since it is exiting so it is safe.
    16021602         */
     
    16231623        ++_rcu_cur_gp;
    16241624       
    1625         /* 
     1625        /*
    16261626         * Readers preempted before the start of this GP (next_preempted)
    1627          * are preexisting readers now that a GP started and will hold up 
     1627         * are preexisting readers now that a GP started and will hold up
    16281628         * the current GP until they exit their reader sections.
    1629          * 
    1630          * Preempted readers from the previous GP have finished so 
    1631          * cur_preempted is empty, but see comment in _rcu_record_qs(). 
     1629         *
     1630         * Preempted readers from the previous GP have finished so
     1631         * cur_preempted is empty, but see comment in _rcu_record_qs().
    16321632         */
    16331633        list_concat(&rcu.cur_preempted, &rcu.next_preempted);
     
    16421642{
    16431643        /*
    1644          * Ensure the announcement of the start of a new GP (ie up-to-date 
    1645          * cur_gp) propagates to cpus that are just coming out of idle 
     1644         * Ensure the announcement of the start of a new GP (ie up-to-date
     1645         * cur_gp) propagates to cpus that are just coming out of idle
    16461646         * mode before we sample their idle state flag.
    1647          * 
     1647         *
    16481648         * Cpus guarantee that after they set CPU->idle = true they will not
    16491649         * execute any RCU reader sections without first setting idle to
     
    16601660         * on the previously idle cpu -- again thanks to issuing a memory
    16611661         * barrier after returning from idle mode.
    1662          * 
     1662         *
    16631663         * idle -> non-idle cpu      | detector      | reclaimer
    16641664         * ------------------------------------------------------
    16651665         * rcu reader 1              |               | rcu_call()
    16661666         * MB X                      |               |
    1667          * idle = true               |               | rcu_call() 
    1668          * (no rcu readers allowed ) |               | MB A in advance_cbs() 
     1667         * idle = true               |               | rcu_call()
     1668         * (no rcu readers allowed ) |               | MB A in advance_cbs()
    16691669         * MB Y                      | (...)         | (...)
    1670          * (no rcu readers allowed)  |               | MB B in advance_cbs() 
     1670         * (no rcu readers allowed)  |               | MB B in advance_cbs()
    16711671         * idle = false              | ++cur_gp      |
    16721672         * (no rcu readers allowed)  | MB C          |
    16731673         * MB Z                      | signal gp_end |
    16741674         * rcu reader 2              |               | exec_cur_cbs()
    1675          * 
    1676          * 
     1675         *
     1676         *
    16771677         * MB Y orders visibility of changes to idle for detector's sake.
    1678          * 
    1679          * MB Z pairs up with MB C. The cpu making a transition from idle 
     1678         *
     1679         * MB Z pairs up with MB C. The cpu making a transition from idle
    16801680         * will see the most current value of cur_gp and will not attempt
    16811681         * to notify the detector even if preempted during this GP.
    1682          * 
     1682         *
    16831683         * MB Z pairs up with MB A from the previous batch. Updaters' changes
    1684          * are visible to reader 2 even when the detector thinks the cpu is idle 
     1684         * are visible to reader 2 even when the detector thinks the cpu is idle
    16851685         * but it is not anymore.
    1686          * 
     1686         *
    16871687         * MB X pairs up with MB B. Late mem accesses of reader 1 are contained
    1688          * and visible before idling and before any callbacks are executed 
     1688         * and visible before idling and before any callbacks are executed
    16891689         * by reclaimers.
    1690          * 
     1690         *
    16911691         * In summary, the detector does not know of or wait for reader 2, but
    16921692         * it does not have to since it is a new reader that will not access
     
    16961696       
    16971697        cpu_mask_for_each(*cpu_mask, cpu_id) {
    1698                 /* 
    1699                  * The cpu already checked for and passed through a quiescent 
     1698                /*
     1699                 * The cpu already checked for and passed through a quiescent
    17001700                 * state since the beginning of this GP.
    1701                  * 
    1702                  * _rcu_cur_gp is modified by local detector thread only. 
    1703                  * Therefore, it is up-to-date even without a lock. 
    1704                  * 
     1701                 *
     1702                 * _rcu_cur_gp is modified by local detector thread only.
     1703                 * Therefore, it is up-to-date even without a lock.
     1704                 *
    17051705                 * cpu.last_seen_gp may not be up-to-date. At worst, we will
    1706                  * unnecessarily sample its last_seen_gp with a smp_call. 
     1706                 * unnecessarily sample its last_seen_gp with a smp_call.
    17071707                 */
    17081708                bool cpu_acked_gp = (cpus[cpu_id].rcu.last_seen_gp == _rcu_cur_gp);
     
    17501750                list_append(&THREAD->rcu.preempt_link, &rcu.cur_preempted);
    17511751        } else {
    1752                 /* 
     1752                /*
    17531753                 * The reader started after the GP started and this cpu
    17541754                 * already noted a quiescent state. We might block the next GP.
     
    17741774        bool last_removed = now_empty && !prev_empty;
    17751775
    1776         /* 
    1777          * Preempted readers are blocking the detector and 
    1778          * this was the last reader blocking the current GP. 
     1776        /*
     1777         * Preempted readers are blocking the detector and
     1778         * this was the last reader blocking the current GP.
    17791779         */
    17801780        if (last_removed && rcu.preempt_blocking_det) {
     
    18011801               
    18021802                return semaphore_down_interruptable(&rcu.remaining_readers);
    1803         }       
     1803        }
    18041804       
    18051805        return true;
     
    18211821void rcu_print_stat(void)
    18221822{
    1823         /* 
    1824          * Don't take locks. Worst case is we get out-dated values. 
    1825          * CPU local values are updated without any locks, so there 
     1823        /*
     1824         * Don't take locks. Worst case is we get out-dated values.
     1825         * CPU local values are updated without any locks, so there
    18261826         * are no locks to lock in order to get up-to-date values.
    18271827         */
     
    18341834       
    18351835        printf("Config: expedite_threshold=%d, critical_threshold=%d,"
    1836                 " detect_sleep=%dms, %s\n",     
     1836                " detect_sleep=%dms, %s\n",
    18371837                EXPEDITE_THRESHOLD, CRITICAL_THRESHOLD, DETECT_SLEEP_MS, algo);
    18381838        printf("Completed GPs: %" PRIu64 "\n", rcu.completed_gp);
    18391839        printf("Expedited GPs: %zu\n", rcu.stat_expedited_cnt);
    1840         printf("Delayed GPs:   %zu (cpus w/ still running readers after gp sleep)\n", 
     1840        printf("Delayed GPs:   %zu (cpus w/ still running readers after gp sleep)\n",
    18411841                rcu.stat_delayed_cnt);
    18421842        printf("Preempt blocked GPs: %zu (waited for preempted readers; "