Important Announcement
PubHTML5 Scheduled Server Maintenance on (GMT) Sunday, June 26th, 2:00 am - 8:00 am.
PubHTML5 site will be inoperative during the times indicated!

Home Explore The Art of Multiprocessor Programming

The Art of Multiprocessor Programming

Published by Willington Island, 2021-08-23 09:42:55

Description: The Art of Multiprocessor Programming, Second Edition, provides users with an authoritative guide to multicore programming. This updated edition introduces higher level software development skills relative to those needed for efficient single-core programming, and includes comprehensive coverage of the new principles, algorithms, and tools necessary for effective multiprocessor programming. The book is an ideal resource for students and professionals alike who will benefit from its thorough coverage of key multiprocessor programming issues.
Features new exercises developed for instructors using the text, with more algorithms, new examples, and other updates throughout the book
Presents the fundamentals of programming multiple threads for accessing shared memory
Explores mainstream concurrent data structures and the key elements of their design, as well as synchronization techniques, from simple locks to transactional memory systems

Search

Read the Text Version

542 Index Blocking lock implementation, 56, 147, 184 Cache granularity, 524 Blocking operation, 56 Cache hit, 153, 523 Cache line, 475, 524 dequeue, 64 Cache miss, 153, 523 Blocking progress condition, see Progress condition Cache state Blocking synchronization, 184 Boolean register, see Register exclusive, 152, 476, 525 Bottleneck invalid, 153, 476, 525 modified, 476, 525 sequential, 254, 265 shared, 152, 476, 525 synchronization, 298 Cache-coherent NUMA (cc-NUMA) system, 524 Bouncer, 44, 45 Call, method, see Method call Bounded pool, see Pool Capacity (of hash table), see Hash set Bounded queue, see Queue CAS, 530 Bounded timestamp, see Timestamp Checkpoint, 477 Bounded wait-free property, see Wait-freedom Chicken sexing, 75 Bounded work stealing deque, see Work stealing CircularArray, 395–398 Class, 52 deque universal, 129 Bounded-range priority queue, see Priority queue Clean double collect, 92 BoundedDEQue, 391–394 Closed addressing, see Hashing BoundedQueue, 231–234 Cluster ID, 167 Bucket (of hash table), see Hash set Cluster (in NUMA system), 166 BucketList, 318, 319 Cluster lock, 168 Buffer, 229 Clustering algorithm, 406 ClusterLocal, 167 write, see Write buffer CMPXCHG instruction, 119, 530 Bus, 152, 475, 522 Coarse-grained synchronization, 201, 206, 308 CoarseHashSet, 308, 309 controller, 522 CoarseList, 206, 207 snooping, 153, 476, 522 Coherence protocol, see Cache coherence protocol Busy (for lock), see Lock Cohort, 168 Busy-waiting, see Spinning Cohort detection, lock supports, 169 Cohort lock, 168 C CohortDetectionLock, 169 Collect operation, 92 C++ Collisions, see Hashing notify_all(), 512 Combining, software, see Software combining notify_one(), 512 Combining status, 267 std::atomic, see std::atomic Combining tree barrier, see Barrier std::condition_variable, 511 CombiningTree, 266–268, 275 std::lock_guard, 510 Common2 register, see Register std::mutex, 509 Communication, transient vs. persistent, 9 std::recursive_mutex, 509 Comparator, 293 std::shared_mutex, 509 lambda expression, see Lambda expression std::unique_lock_wrapper, 510 synchronous, 293 thread, 508 Compare-and-swap (CAS), 116, 119, 529 vs. Java, 451 CompareAndSet(), 116, 119, 120, 506 wait(), 511 Comparison network, 293 isomorphic to balancing network, 294 C++ memory model, 512 synchronous, 293 Cache, 152, 475, 521, 523 complete(H ), 61 Complete method call, see Method call direct-mapped, 477, 524 Compositionality, 57, 63 dirty, 528 problems, 471 eviction, 476 Computability, 1 fully associative, 524 hit ratio, 523 replacement policy, 524 set associative, 524 Cache coherence, transactional, 476 Cache coherence protocol, 152, 475, 476, 524 MESI, 476, 524, 525

Index 543 Computation as a dag, 384 performance, 287 Computational model, 75 periodic, 285, 286 Concrete representation, 204 pipelining, 287 Concurrent algorithm, 1 saturate, 287 Concurrent program, 1 software, 280, 286 Condition, 185, 186 Covering state, 39 Condition, 185, 190, 232 Covers, thread, 40 Condition variable, see Condition Critical section, 22, 500 Critical state, 106 C++, 510 Cuckoo hashing Conflict, thread, 66 concurrent, 324 Consensus, 104 sequential, 323 Consensus, 103, 104 striped concurrent, 329 binary, 104 D solves n-thread, 104 wait-free, 104 Dag, see Directed acyclic graph Consensus number, 103, 105, 129 Data parallelism, 405 Common2 register, 118 Data race, 469 compareAndSet, 119, 120, 129 Data structure, dual, see Dual data structure multiple assignment, 113, 115 Data-parallel algorithm, 405 queue, 110, 111 Deadlock, 24, 29, 468, 500 register, 107, 108 RMW register, 117 vs. livelock, 29 Consensus object, 104 Deadlock-free method, 67 Consensus protocol, 104, 107, 109, 110, 115, 119 Deadlock-free mutual exclusion, see Mutual randomized, 126 ConsensusProtocol, 109, 122 exclusion Consistency Deadlock-freedom, 8, 24, 25, 67, 68 external, see External consistency Decision value, 105 quiescent, see Quiescent consistency Dependent progress condition, see Progress sequential, see Sequential consistency Construction condition register, see Register constructions Deque, 389 universal, see Universal construction Consumer, 229, 230, 244 work stealing, see Work stealing deque Contention, 154, 265, 524 Deschedule (a thread), 522 Contention management, 66, 482 Descriptor, 486, 488 Context switch, 522 Deterministic object, 131 Convoying, 467 Diffracting tree, 288 Coordination protocol, 6 DiffractingBalancer, 291, 292 Correctness, 49 DiffractingTree, 291, 292 Correctness condition Dining philosophers problem, 15 linearizability, see Linearizability Direct-mapped cache, see Cache nonblocking, 56 Directed acyclic graph (dag), 384 quiescent consistency, see Quiescent Disjoint-access-parallel, 305 Dissemination barrier, see Barrier consistency Distributed coordination patterns, 265, 298 sequential consistency, see Sequential Distributed structure, 298 Dominates, in precedence graph, 37 consistency Doorway section, 33, 35, 42 Counter, 5, 22, 360 Double-checked locking, 504, 528 Counter, 23 Double-ended queue, see Deque Counter, bounded method, 360 Driver, 199 Counting network, 276, 279, 298 Dual data structure, 244, 246 DualStack, incorrect, 262 bitonic, 279, 280 contention, 289 E diffracting tree, 288 logarithmic depth, 288 Elimination, 254, 288

544 Index EliminationArray, 254, 255, 257–260 ForkJoinPool, 379 EliminationBackoffStack, 254, 255, 258–260 Free list, 240, 241 Embarrassingly parallel, 13 Free (a lock), see Lock EmptyException, 50 Freedom from interference, 204, 212, 213 Epoch-based memory management, see Memory Fulfill (a reservation), see Reservation FullException, 50 management Functional interface, Java, 409 Equivalent histories, see History Functional programming, 414 Event, 21, 60 G invocation, see Invocation precedence relation, 21 Garbage collection, 240, 454, 455 response, see Response reliance on, 204, 452 Eviction, see Cache Exchanger, 255, 290 GetAndAdd, 116 Exclusion GetAndIncrement, 116 -exclusion, see -exclusion problem GetAndSet, 116, 151 mutual, see Mutual exclusion Ghost variable, see Auxiliary variable Exclusive cache state, see Cache state Global lock, 168 Execute in isolation, 66 Global state, 39 Executor service, 379 Greedy schedule, see Schedule Explicit speculation, problems, 468 Exponential back-off, see Back-off H Extensible hashing, see Hashing External consistency, 59 Hand-over-hand locking, 208–210 Hardware transactional memory, see Transactional F memory Failed method call, see Method call Hash code (of object), 202, 305 Fair schedule, see Schedule Hash function, 305 Fairness, 33, 192, 230 Hash set, 305 False sharing, 158, 159, 526, 527 Fast path, 178 bucket, 305 FastPath, 44 bucket threshold, 307 Fault-tolerance, 9 capacity, 306 Fence, see Memory barrier coarse-grained, 308 Fetch-and-add, 116 fine-grained, 310 Fetch-and-increment, 116 global threshold, 307 FibTask, 384 incremental resizing, 315 FifoReadWriteLock, 192, 193 lock-free, 315, 322 Filter, 30, 31, 43 refinable, 311 Filter (a stream), see Stream resizing, 306, 307, 311, 315, 329, 331 Filter lock, 30 split-ordered, 317 final field, Java, 506 striped, 310 Final state, 105 table, 305 finally clause (of try block), 23 Hash table, see Hash set Fine-grained synchronization, 201, 207, 310, 355 Hashing, 305–334 FineGrainedHeap, 365–371 closed addressing, 305 FineList, 208–211 collisions, 305 First-come-first-served property, 33 cuckoo, see Cuckoo hashing First-in-first-out (FIFO), 230 extensible, 306 Flag, 7, 8 open addressing, 305, 323, 331 Flag principle, 7 probe set, 324 Flag protocol, 7 Hazard pointer, 458, 460, 461 Flaky, 43 Head (of queue), 230 Fork-join pool, 379 Heap, 363 Fork-join task, 379 concurrent, 365 Forking (a task), 379 sequential, 363 HeapNode, FineGrainedHeap, 366, 367 Helping, 90, 93, 134, 238, 239

Index 545 Hierarchical lock, 166–170 lambda expression, see Lambda expression High contention, see Contention notify(), 501 History, 60 notifyAll, 501 sleep(), 502 equivalent, 61 synchronization event, 505 legal, 61 synchronized, 500 linearizable, 62 thread, 497 sequential, 61 ThreadLocalRandom, 503 well-formed, 61 volatile, see volatile Hit, cache, see Cache hit vs. C++, 451 Hit ratio, see Cache wait(), 500, 501 Hold (a lock), see Lock yield(), 502 Hotspot, 265 Java memory model, 150, 504 HWQueue, 73, 250 fundamental property, 504 Hybrid transactional memory, see Transactional Joining (a task), 384 memory K I Kernel, 388 KMeans, 406 Implementation blocking, see Blocking implementation MapReduce-based, 411 vs. interface, 57 stream-based, 417–419 Implementing atomic register, cannot use mutual L exclusion, 77 -exclusion problem, 44 In-place sorting, see Sorting Lambda expression Inconsistent state, lock, see Lock Increment requires two distinct operations, 5 Java, 409 Incremental resizing, see Hash set comparator, 423 Independent progress condition, see Progress interfering, 420 stateful, 420 condition Index distribution mechanism, quiescently Last-in-first-out (LIFO), 230, 251 Latency, 265, 385 consistent counter as, 60 Inherent parallelism, 377 vs. throughput, 267 Initial state, 105 Layer, 286 Inner class, 190, 502 LAYER network, 286 Lazy computation, see Stream anonymous, 498 Lazy method, 238, 239 Interconnect, 521, 522 Lazy synchronization, 164, 201, 215, 225, 337 Interface, vs. implementation, 57 LazyList, 216–220, 337 Interference, freedom from, see Freedom from LazySkipList, 337–342, 344, 345, 355 Legal history, 61 interference Legal object history, 61 Interrupt, 9, 186 LFUniversal, 132 InterruptedException, 186 Linear speedup, 386 Interval, 22 Linearizability, 61, 205, 359 precedence relation, 22 compositionality of, 63 method call, 61 nonblocking, 59, 63 Invalid cache state, see Cache state not guaranteed on real systems, 149 Invariant, 204 vs. quiescent consistency, 60 Invocation, 53, 60, 80, 130 vs. sequential consistency, 59 matching response, 60 Linearizability, 58, 59 IQueue, 72 Linearizable history, see History Isolation, execute in, 66 Linearization (of a history), 62 Linearization point, 58, 205, 341, 343, 348 J Linearized (at a point), method call, 58, 205 Java final field, 506 functional interface, see Functional interface

546 Index Linked list, 202, 231 SimpleReentrantLock, 194, 195 List, 306 spinning, see Spinning lock implementation List-based set implementation, 202 TASLock, 151, 519–521, 526, 527 Livelock, 29 TOLock, 164, 165 Livelock, 29 TTASLock, 151, 154, 519–521, 526, 527 Liveness, 2 Lock striping, 310, 329 Load balancing, see Balancing, workload Lock-free hash set, see Hash set Load operation, 39, 76 Lock-free method, 65, 103, 205, 355 Load-linked and store-conditional (LL/SC), 530 Lock-free object, 66 Local spinning, 153, 526 Lock-free queue, see Queue Local state, 39 Lock-free stack, see Stack Local thread, 166 Lock-free universal construction, see Universal Lock, 23, 184 Lock construction Lock-freedom, 65, 67 acquire, 24, 183, 500 back-off, see Back-off lock vs. wait-freedom, 66 busy, 24 LockedQueue, 188 cluster, see Cluster lock LockFreeExchanger, 255, 256 cohort, see Cohort lock LockFreeHashSet, 319–321 doorway section, 33, 34 LockFreeList, 220, 345 first-come-first-served, 33–35 LockFreeQueue, 236–238 free, 24 LockFreeQueueRecycle, 243 global, see Global lock LockFreeSkipList, 345–347, 349–355, 371, 372 hierarchical, see Hierarchical lock LockFreeStack, 252–254 hold, 24, 183 Locking, problems, 467 idiom for use, 23 LockOne, 25 inconsistent state, 39 Lockout-freedom, 24 queue, see Queue lock LockTwo, 26, 27 readers–writers, see Readers–writers lock Log, in universal construction, 131 reentrant, see Reentrant lock Logarithmic search structure, 335 release, 24, 184, 500 Logical bucket (of lock-free hash set), 315 sequence, see Sequence lock Logical removal, 216, 220, 221, 337, 338, 347, 372 spin, see Spin lock LongStream, 423 starvation-free, 27, 28, 32, 35 Lost wakeup, 187, 501 supports cohort detection, 169 Low contention, see Contention test-and-set, see Test-and-set lock Lower bound, number of locations, 39 test-and-test-and-set, see Test-and-test-and-set M lock thread-oblivious, see Thread-oblivious lock (m, n)-assignment problem, 113 timeout, 164, 172 M-valued register, see Register Lock cohorting, 167 Main memory, 521, 523 Lock coupling, 208 Map (a stream), see Stream Lock elision, transactional, 478 Mapper, 408 Lock implementation BackoffLock, 155 KMeans, 411 Bakery, 34 WordCount, 410, 411 blocking, see Blocking lock implementation Mapper task, 407 CLHLock, 159, 160 Mapper thread, 406 CohortBackoffMCSLock, 170, 171 MapReduce, 408, 409 CohortDetectionMCSLock, 171 implementation, 411, CohortLock, 170 MapReduce, 406–408 413 CompositeFastPathLock, 179, 180, 182 Mark bit, 221 CompositeLock, 173–176, 179 Matching invocation and response, 60 HBOLock, 168 Matrix, 379, 380 MCSLock, 162, 170, 171 Matrix multiplication, 377, 378, 381 MatrixAddTask, 381, 382 MatrixMulTask, 381, 383

Index 547 MatrixMultiply, 425 Multi-reader single-writer (MRSW) register, see MatrixVector, 423 Register Maximal progress, 66, 68 Maximum speedup, 13 MultiConsensus, 114 MemManager, 455, 458, 460, 461 Multicore, 1, 527 Memory barrier, 149, 529 Multiple assignment object, 113 Memory consistency, relaxed, 528 Multiple assignment problem, 113 Memory consistency model, 64 Multiprogrammed environment, 390 Memory contention, see Contention Multithreaded architecture, 528 Memory controller, 152 Mutual exclusion, 5, 10, 21–47, 147–182 Memory fence, see Memory barrier Memory management, 240, 454 deadlock-free, 39 properties of, 8 automatic, see Garbage collection delegating reclamation, 455 N epoch-based, 462 manual, 453 Network protecting memory, 455, 456 balancing, see Balancing network Memory model comparison, see Comparison network C++, 512 counting, see Counting network Java, see Java memory model sorting, see Sorting network Memory model, see Memory consistency model switching, see Switching network Memory reclamation, see Memory management Merger, 281, 282 Node MERGER network, 279, 283, 284 BoundedQueue, 231, 232 MESI protocol, see Cache coherence protocol C++ list, 456 Method, 52 CombiningTree, 268, 271–273 deadlock-free, see Deadlock-free method LazySkipList, 339, 340 Java, 497 LFUniversal, 131 lazy, see Lazy method list-based sets, 202 lock-free, see Lock-free method LockFreeQueue, 237, 238 obstruction-free, see Obstruction-free method LockFreeStack, 252, 253 partial, see Partial method PrioritySkipList, 372 starvation-free, see Starvation-free method StaticTreeBarrier, 438, 439 synchronous, see Synchronous method SynchronousDualQueue, 246, 247 total, see Total method TreeBarrier, 436, 437 wait-free, see Wait-free method Universal, 131 Method call, 53, 80 complete, 61 Nonblocking algorithms, problems, 470 interval of, 61 Nonblocking correctness condition, see Correctness linearization point, see Linearization point overlapping, 61 condition pending, 54, 61 linearizability, 59 precedence relation, 62, 80 Nonblocking method, 56 successful, 202 Nonblocking object implementation, 56 unsuccessful (or failed), 202 Nonblocking operation, 56 Minimal progress, 66, 68 Nonblocking progress condition, see Progress Miss, cache, see Cache miss MMThread, 378 condition Modular system design, 57 Nonblocking synchronization, 201, 220, 225, 315 Modularity, 183 Monitor, 183, 187, 244, 498, 516, 517 price of, 225 Move, thread, 93 Nontrivial register, 117 Multi-reader multi-writer (MRMW) register, see Nonuniform memory access system, see NUMA Register system Notification time, 432 notify(), Java, 501 notify_all(), C++, 512 notifyAll(), Java, 501 notify_one(), C++, 512 NUMA architecture, cacheless, 161, 163 NUMA system, 166, 522

548 Index O Performance (of multiprocessor programs), analyzing, 2 Object, 52 consensus, see Consensus object Periodic, 287 deterministic, see Deterministic object PERIODIC network, 285, 288 Java, 497 Periodic counting network, see Counting network lock-free, see Lock-free object Persistent communication, 9 multiple assignment, see Multiple assignment Peterson, 27, 28, 148 object obstruction-free, see Obstruction-free object generalized, 30 sequential, see Sequential object Peterson lock, generalized, 44 shared concurrent, 76 Peterson’s algorithm, 27, 43 universal, see Universal object PhasedCuckooHashSet, 324–326 wait-free, see Wait-free object Physical removal, 216, 220–222, 338, 347, 372 Pipelining, 287, 298 Object history, legal, 61 Polynomial, 402 Object implementation, wait-free, see Wait-free Pool, 229 Pool, 265 object Object state, 52 bounded, 229 Object subhistory, see Subhistory fork-join, see Fork-join pool Obliterating, 41 thread, see Thread pool Obstruction-free method, 66 unbounded, 229 Obstruction-free object, 66 Postcondition, 52 Obstruction-free snapshot, see Snapshot PQueue, 359 Obstruction-freedom, 66, 68 Precedence graph, timestamps, see Timestamp ODDEVEN network, 294, 295 Precedence relation OneBit, 47 events, see Event Open addressing, see Hashing intervals, see Interval Operation, blocking, see Blocking operation method call, see Method call Operation, nonblocking, see Nonblocking operation Precondition, 52 Optimistic synchronization, 201, 211, 355 Predecessor task, 384 Prefix, 443 coarse-grained, 215 Priority inversion, 467 OptimisticList, 212–215 Priority queue, 359 Order bounded-range, 359 heap-based, 363 program, see Program order range, 360 real-time, see Real-time order skiplist-based, 371 write, 80 unbounded-range, 359, 363, 371 Orec, see Ownership record PrioritySkipList, 371–374 Overlapping method calls, see Method call Prism, 290 Overwriting, 41 Probabilistic data structure, 336 Ownership record (orec), 485 Probe set, see Hashing Process, 387 P Processor, 387, 390, 521, 522 Producer, 229, 230, 244 Padding, 158, 159 Producer–consumer problem, 9, 10 Parallel prefix computation, 443 Producer–consumer property, 10 Parallel sorting, 293 Program correctness, 2 Parallelism, 1, 386 Program order, 54 Progress condition, 64–68, 205 analyzing, 385 blocking, 56, 67 data, see Data parallelism deadlock-freedom, see Deadlock-freedom inherent, see Inherent parallelism dependent, 68 ParallelStream, 419–421 guarantees maximal progress, 68 Partial method, 61, 229, 230 guarantees minimal progress, 68 Partial order (strict), 61 independent, 67 Partial queue, see Queue PeekableStack, 141 Pending method call, see Method call Performance measure, 265

Index 549 lock-freedom, see Lock-freedom RangePolicy, 258, 259 nonblocking, 56, 65, 66 RateLimiter, 200 obstruction-freedom, see Obstruction-freedom Read lock, 189 starvation-freedom, see Starvation-freedom Read–modify–write operation, 5, 116 wait-freedom, see Wait-freedom Read–modify–write register, see Register Progress guarantee, see Progress condition Read–write register, see Register Protocol Reader, 189 cache coherence, see Cache coherence protocol Readers–writers lock, 189–192 consensus, see Consensus protocol Readers–writers problem, 11, 12 coordination, see Coordination protocol ReadWriteLock, 189 Protocol state, 105 Ready (node), 388 Real-time order, 56, 61 Q Rebalancing, 363 Recursive split-ordering, 315 QNode RecursiveAction, 379, 381 CLHLock, 160 RecursiveTask, 379 CompositeLock, 174 RecursiveWordCount, 421 MCSLock, 162 RecursiveWordCountTask, 421, 422 TOLock, 164 Redo log, 487 Reduce (a stream), see Stream Queue, 125, 187 Reducer, 408 Queue, 230 KMeans, 411 bounded, 230, 231 WordCount, 410, 412 consensus number, 110, 111 Reducer task, 407 lock-free, 236 Reducer thread, 406 partial, 230 Reentrant lock, 187, 194 priority, see Priority queue ReentrantLock, 50, 187, 505 single-enqueuer/single-dequeuer, 52 Reference counting, 454 synchronous, 244 Refinable hash set, see Hash set total, 235 RefinableCuckooHashSet, 331–333 unbounded, 235, 236 RefinableHashSet, 312, 313 Queue implementation Register, 77 LockBasedQueue, 50 Register, 54, 76 WaitFreeQueue, 52 1-regular, 102 Queue lock, 156, 170, 171 atomic, 77, 85, 87, 90, 105, 106 timeout, 163 atomic MRMW, 90 QueueConsensus, 110 atomic MRSW, 87, 90 Quicksort, 297 atomic SRSW, 85, 87 Quiescent balancer, see Balancer Boolean, 76, 84 Quiescent balancing network, see Balancing Common2, 117, 118 M-valued, 77, 84 network multi-reader multi-writer (MRMW), 90 Quiescent consistency, 59, 359, 361, 363, 373, 374 multi-reader single-writer (MRSW), 78, 79, compositionality of, 60 82–84, 87, 90, 92 does not preserve program order, 60 nontrivial, 117 nonblocking, 60 read–modify–write, 116 pools and counters, 276 read–write, 76 vs. linearizability, 60 regular, 79, 83–85 vs. sequential consistency, 60 regular Boolean MRSW, 83 Quiescent object, 59 regular M-valued MRSW, 84 regular MRSW, 79 R regular SRSW, 79, 85 safe, 78, 82, 83 r-bounded waiting, 42 safe Boolean MRSW, 83 Random number generator, importance of safe MRSW, 78, 82 thread-local, 257, 503 Randomization, 297, 298, 503 Randomized consensus protocol, see Consensus protocol

550 Index safe SRSW, 78 SeqObject, 130 single-reader single-writer (SRSW), 78, 79, 85, SeqSnapshot, 93 Sequence lock, 468 87 Sequential bottleneck, see Bottleneck wait-free, 91 Sequential consistency, 55, 56 wraparound, 102 Register constructions, 81–91 nonblocking, 56 Regular Boolean MRSW register, see Register not compositional, 57 Regular M-valued MRSW register, see Register not guaranteed on real systems, 149 Regular MRSW register, see Register vs. linearizability, 59 Regular register, see Register vs. quiescent consistency, 60 Regular SRSW register, see Register Sequential consistency, 53–58 RegularBooleanMRSWRegister, 83 Sequential heap, see Heap RegularMRSWRegister, 84 Sequential history, see History Release (a lock), see Lock Sequential object, 130, 134 Release (a semaphore), 196 Sequential skiplist, see Skiplist Remote thread, 166 Sequential specification, 53, 61 Rendezvous, 230, 244, 298 SequentialHeap, 364, 365 Representation invariant, 204 Sequentially consistent, modern systems are not, 64 Reservation, 244 SequentialRegister, 77 fulfill, 245, 247 Set, 202 Resource-acquisition-is-initialization (RAII) idiom, SetAgree, 123 Shared cache state, see Cache state 510 Shared concurrent object, 76 Response, 53, 60, 80, 130 Shared counter, 4 Response, matching invocation, 60 Shared-memory computational model, 75 RevBarrier, 447 Shared-memory multiprocessors, 1 Reverse tree barrier, see Barrier Side effect, 52 RMW operation, see Read–modify–write operation SimpleBarrier, incorrect, 432, 433 RMWConsensus, 117 SimpleLinear, 360 Robustness, 275 SimpleReadWriteLock, 190, 191 Rooms, 198, 263 SimpleSnapshot, 94 Runnable, Java, 497, 498 SimpleTDBarrier, 441 Runs solo, thread, 107 SimpleTree, 361 Single-enqueuer/single-dequeuer queue, see Queue S Single-reader single-writer register, see Register SkipList, 335, 336 Safe Boolean MRSW register, see Register Skiplist, 335, 371 Safe MRSW register, see Register balancing property, 336 Safe register, see Register level, 335 Safe SRSW register, see Register probabilistic, 336 SafeBooleanMRSWRegister, 82 sequential, 335 Safety, 2 shortcut, 336 Sample sorting, 293, 296 Skiplist property, 337 not maintained, 346 generalization of quicksort, 297 SkipQueue, 371, 373, 374 randomization, 297 lock-free, 374 Saturate counting network, see Counting network quiescently consistent, 373, 374 Scan operation, 92 sleep(), Java, 502 Schedule, 65 Smoothing network, 299 fair, 66 Snapshot, 92 greedy, 388 Snapshot Scheduler, 65, 387, 388 atomic, 81, 92, 113 Semaphore, 196 obstruction-free, 92 Semaphore, 194 wait-free, 93 Sense reversing barrier, see Barrier Snooping, see Bus SenseBarrier, 433, 434 Sentinel node, 203, 232, 235, 238, 246, 247, 317, 336

Index 551 Software bitonic counting network, see Counting StickyBit, 123 network Store buffer, see Write buffer Store operation, 39, 76 Software combining, 266 Stream, 414 Software periodic counting network, see Counting Stream, 416 network aggregate operation, 415, 416 Software transaction, see Transactional memory filter, 407 Solo, thread runs, 107 intermediate operation, 415 Sorting, 292 lazy computation, 415 map, 407 in-place, 295 reducer, 407 Sorting network, 293 terminal operation, 415 Stream programming, 406, 414 bitonic, 295 StripedCuckooHashSet, 329, 330 isomorphic to counting network, 294 StripedHashSet, 310 synchronous, 295 Striping, lock, see Lock striping Span law, 386 Subhistory, 61 Span (parallelism analysis), 386 object, 61 Specification, sequential, see Sequential thread, 61 Successful method call, see Method call specification Successor task, 384 Speedup, 12, 386 Supplier, 409 Spin lock, 147 Switching network, 302 Spinning, 526 adding network, 302 Symmetric multiprocessing (SMP), 522 local, see Local spinning Synchronization Spinning lock implementation, 56, 147, 184 blocking, see Blocking synchronization Split-ordering, recursive, see Recursive coarse-grained, see Coarse-grained split-ordering synchronization SRSW register, see Register fine-grained, see Fine-grained synchronization SSSP (single-source shortest-path), 426 lazy, see Lazy synchronization Stack, 251 nonblocking, see Nonblocking synchronization optimistic, see Optimistic synchronization incorrect, 263, 264 Synchronization event, Java, 505 Stack, 251 Synchronization primitive, 103, 529, 530 relative power, 103 lock-free, 251 synchronized block or method, Java, 187, 500, 505 unbounded, 251 Synchronous method, 229, 230 Stamp, 242, 391 Synchronous queue, see Queue StampedSnap, 95, 96 SynchronousDualQueue, 246, 247 StampedValue, 86 SynchronousQueue, 244, 245 Starvation, 24, 66 System state, 39 Starvation-free lock, see Lock Starvation-free method, 67 T Starvation-freedom, 8, 10, 24, 25, 67, 68 State Tail (of queue), 230 bivalent, see Bivalent Task, 377, 390 object, see Object state TDBarrier, 440, 441 protocol, see Protocol state Team consensus, 124 univalent, see Univalent Termination detection barrier, see Barrier Static tree barrier, see Barrier Test-and-set instruction, 150 StaticTreeBarrier, 438, 439 Test-and-set lock, 150, 519 Status, combining, see Combining status Test-and-test-and-set lock, 151, 519 std::atomic, C++, 452, 508, 512 Thief, work stealing, 389 std::condition_variable, C++, 511 std::lock_guard, C++, 510 std::mutex, C++, 509 std::recursive_mutex, C++, 509 std::shared_mutex, C++, 509 std::thread, C++, 508 std::unique_lock_wrapper, C++, 510 Step property, see Balancing network

552 Index Thread Transactional programming, 467, 472 C#, 514, 515 data structures, 493 Java, 498, 499 hardware support, 475 self-abort, 483 Thread, 1, 76, 387, 390, 521, 522 C++, 508 Transient communication, 9 C#, 514 TREE network, 289 conflict, 66 Tree barrier, see Barrier Java, 497 Tree width, 266 join, 498, 508 TreeBarrier, 435–437 local, 166 TreeNode, 361, 362 mapper, 406 TurnArbiter, 171 reducer, 406 TwoThreadLockFreeQueue, 249 remote, 166 runs solo, 107 U start, 498 worker, 377 Unbounded pool, see Pool yield, see Yielding Unbounded queue, see Queue Unbounded stack, see Stack Thread as state machine, 21 Unbounded work stealing deque, see Work stealing Thread ID, 25 Thread pool, 377 deque Unbounded-range priority queue, see Priority queue advantages, 377 UnboundedDEQue, 395–399 Thread subhistory, see Subhistory UnboundedQueue, 236 Thread-local variable, 157, 240, 241, 502, 513, 517 UnboundedResizeLockFreeHashSet, 334 Thread-oblivious lock, 169 Undo log, 486 ThreadContext, 458, 460, 461 Univalent, 105 ThreadID, 3, 25 Universal, 135, 138 Universal class, 130 C#, 517 Universal construction, 129, 131, 134 Java, 502, 503 thread_local, C++, 513 lock-free, 130, 132, 133 ThreadLocal, Java, 502 log, 131 ThreadLocalRandom, Java, 503 wait-free, 134–136 ThreadStart, C#, 514, 515 Universal object, 129, 130 ThreadStatic, C#, 517 Unsuccessful method call, see Method call Throughput, 265 vs. latency, 267 V Timeout, 163, 172 Timestamp, 36, 46 Valence, 105 Timestamp, 76 Validation, 211, 212, 214, 487 bounded, 35 precedence graph, 36 value-based, 490, 491 Timestamp system, 46 Variable, auxiliary, see Auxiliary variable Timestamping, 85, 90 Version number, 173 TimestampSystem, 36 Victim, work stealing, 389 TLE, see Transactional lock elision volatile, omitted declarations, 26 Total method, 56, 61, 229, 230 volatile array, special attention required, 506 Total order, 62 volatile field, Java, 150, 451, 506 Total queue, see Queue volatile keyword, Java, 64, 529 TourBarrier, 445 Tournament tree barrier, see Barrier W Transaction descriptor, 485 Transactional lock elision (TLE), 478 wait() Transactional memory, 481 C++, 511 hardware, 494 Java, 500, 501 hybrid, 492, 493 software, 483, 484 Wait-free consensus, 104 Wait-free method, 65, 205, 215, 220, 225, 339, 348, 355 Wait-free object, 65, 103

Index 553 Wait-free object implementation, 78 Work distribution, 389 Wait-free register, see Register, wait-free Work law, 386 Wait-free snapshot, see Snapshot Work (parallelism analysis), 385 Wait-free universal construction, see Universal Work stealing, 389 Work stealing deque, 390–399 construction Wait-freedom, 65, 67 bounded, 391 unbounded, 395 bounded, 33 Worker, 199, 377, 379 vs. lock-freedom, 66 Worker thread, 377 Waiting, 8 WorkSharingThread, 400 r-bounded, 42 WorkStealingThread, 390, 440, 442 Waiting filter, 301 Wraparound register, see Register Wakeup, lost, see Lost wakeup Write absorption, 528 Well founded ordered set, 203 Write buffer, 149, 528 Well-formed history, see History Write lock, 189 Well-formed use of locks, 23 Write order, 80 WFSnapshot, 96 WriteOnceRegister, 101 Width (of a tree), 266 Writer, 189 Window, 223 WordCount, 405, 406 Y MapReduce-based, 410 stream-based, 416, 417 yield(), Java, 502 Work dealing, 389, 397 Yielding (a processor), 390, 502


Like this book? You can publish your book online for free in a few minutes!
Create your own flipbook