### CAUSAL MEMORY: DEFINITION, IMPLEMENTATION AND PROGRAMMING

Presented by Yoav Kaempfer

# DEFINITIONS

## DEFINITIONS

Memory – A finite set of locations

Operations on memory: • Read Operation –  $r_i(x)v$ • Write Operation -  $w_i(x)v$ 

System – A finite set of processes  $\mathcal{P}=\{p_1,\ldots,p_n\}$  interacting via shared memory

## DEFINITIONS — CONT.

Local history of a process  $p_i$  – A sequence of operations denoted  $L_i$ .

History – A collection of local histories denoted 
$$H = < L_1, \dots, L_n >$$

If operation  $o_1$  precedes operation  $o_2$  in  $L_i$ , we write:  $o_1 \xrightarrow[i]{} o_2$ 

# DEFINITIONS — CONT. 2

A serialization S of a set of operations A:

- A linear sequence
- $\hfill \mbox{Containing exactly the operations of } A$
- Each read operation returns the most recent value written to the location (initial value  $\perp$ )

S respects order  $\rightarrow$  if  $o_1 \rightarrow o_2$  implies  $o_1$  precedes  $o_2$  in S

 $A_{i+w}^{H}$  is the set of all operations in  $L_{i}$  and all write operations in H

## **DEFINITIONS — EXAMPLE**

| $p_1$     | $p_2$       |
|-----------|-------------|
| $w_1(x)0$ | $w_2(x)1$   |
| $r_1(x)0$ | $r_{2}(x)0$ |

#### Formally:

$$L_{1} = (w_{1}(x)0, r_{1}(x)0), w_{1}(x)0 \xrightarrow{\rightarrow}{1} r_{1}(x)0, A_{1+w}^{H} = \{w_{1}(x)0, r_{1}(x)0, w_{2}(x)1\}$$
$$L_{2} = (w_{2}(x)1, r_{2}(x)0), w_{2}(x)1 \xrightarrow{\rightarrow}{2} r_{2}(x)0, A_{2+w}^{H} = \{w_{1}(x)0, w_{2}(x)1, r_{2}(x)0\}$$

A possible serialization of H:

$$S = w_2(x) \mathbf{1}, w_1(x) \mathbf{0}, r_1(x) \mathbf{0}, r_2(x) \mathbf{0}$$
  
$$w_1(x) \mathbf{0}$$
  
$$w_2(x) \mathbf{1}$$
  
$$w_2(x) \mathbf{0}$$
  
$$w_2(x) \mathbf{1}$$

## **CONSISTENT MEMORY**

A memory is said to be X consistent if all histories permitted by it are X consistent

Thus, a program execution on an X consistent memory can always be described by some X consistent history H

# **SEQUENTIAL CONSISTENCY (SC)**

Definition: There is a serialization S of the history H that respects all program orders  $\xrightarrow{i}$ 

The processes cannot tell they are not using a single memory

|                               | $p_1$     | $p_2$          |
|-------------------------------|-----------|----------------|
| Example:                      | $w_1(x)0$ | $r_2(x) \perp$ |
| $S = r_2(x) \perp, w_2(x) 1,$ | $r_1(x)0$ | $w_2(x)1$      |
| $w_1(x)0, r_2(x)0, r_1(x)0$   |           | $r_{2}(x)0$    |

#### **Costly to implement**

# PIPELINED RAM (PRAM)

Definition: For each process  $p_i$  there is a serialization  $S_i$  of  $A_{i+w}^H$  that respects all program orders  $\xrightarrow{j}$ 

Each process sees only the writes of other processes in program order

|              | $p_1$                             | $p_2$                             |
|--------------|-----------------------------------|-----------------------------------|
| Example:     | $w_1(x)0$                         | $w_2(x)1$                         |
|              | $r_1(x)$ 1                        | $r_{2}(x)0$                       |
| Hard to code | $S_1 = w_1(x)0, w_2(x)1, r_1(x)1$ | $S_2 = w_2(x)1, w_1(x)0, r_2(x)0$ |

### CAUSAL MEMORY

### WRITE-INTO ORDER

Associates a write operation with each read operation (except reads of initial value  $\perp$ )

Formally, a write-into order  $\mapsto$  on a history H is a relation such that:

- If  $o_1 \mapsto o_2$ , then there are x and v s.t.  $o_1 = w(x)v$  and  $o_2 = r(x)v$
- For any operation  $o_2$ , there is at most one  $o_1$  s.t.  $o_1 \mapsto o_2$
- If  $o_2 = r(x)v$  and there is no  $o_1$  s.t.  $o_1 \mapsto o_2$ , then  $v = \perp$

A history H may have more than one write-into orders

## **CAUSALITY ORDER**

The transitive closure of the union of all  $\xrightarrow{i}_{i}$  and  $\mapsto$ 

Alternatively,  $o_1 \rightsquigarrow o_2$  if and only if at least one of the cases holds:

• 
$$o_1 \xrightarrow{i} o_2$$
  
•  $o_1 \mapsto o_2$   
• There exists  $o'$  s.t.  $o_1 \dashrightarrow o' \dashrightarrow o_2$ 

If ----> is cyclic, then it is not a causality order

A history H may have more than one causality orders

## EXAMPLE



There are two possible write-into orders, each corresponding to a causality order:



# CAUSAL MEMORY (CM)

A history H is causal if there exists a causality order  $\dashrightarrow$  such that: For each process  $p_i$ , there is a serialization  $S_i$  of  $A_{i+w}^H$  that respects  $\dashrightarrow$ 

Each process sees the writes of the other processes in the same causality order

Strictly weaker than SC but strictly stronger than PRAM

### CM < SC



The history is not SC, but it is CM:



Serializations:  $S_1 = w_1(x)0, w_2(x)1, r_1(x)1$  $S_2 = w_2(x)1, w_1(x)0, r_2(x)0$ 

## $\mathsf{PRAM} < \mathsf{CM}$

| $p_1$     | $p_2$       | $p_3$       |
|-----------|-------------|-------------|
| $w_1(x)0$ | $r_{2}(x)1$ | $r_{3}(y)2$ |
| $w_1(x)1$ | $w_2(y)2$   | $r_{3}(x)0$ |

The history is PRAM, but it is not CM; There is only one possible causality order:



And there is no possible serialization for  $p_3$ 

### MORE CM EXAMPLES





#### Serializations:

$$S_{1} = w_{1}(x)1, w_{3}(y)1$$
  

$$S_{2} = w_{1}(x)1, r_{2}(x)1, r_{2}(y) \perp, w_{3}(y)1$$
  

$$S_{3} = w_{1}(x)1, w_{3}(y)1$$
  

$$S_{4} = w_{3}(y)1, r_{4}(y)1, r_{4}(x) \perp, w_{1}(x)1$$

# MORE CM EXAMPLES — CONT.

| $p_1$      | $p_2$       |
|------------|-------------|
| $w_1(x)$ 1 | $w_2(y)1$   |
| $w_1(y)2$  | $w_2(x)^2$  |
| $r_1(x)$ 1 | $r_2(x)1$   |
| $r_1(y)1$  | $r_{2}(y)1$ |

#### Serializations:

 $S_1 = w_1(x)1, w_1(y)2, r_1(x)1, w_2(y)1, r_1(y)1, w_2(x)2$  $S_2 = w_2(y)1, w_2(x)2, w_1(x)1, r_2(x)1, r_2(y)1, w_1(y)2$ 



## IMPLEMENTATION

## DATA STRUCTURES

Each process holds:

- ${}^{\bullet}M$  A private copy of the shared memory  ${\mathcal M}$
- t A vector clock (an integer array of size n)
- OutQueue a FIFO queue of outgoing messages
- InQueue a priority queue of incoming messages, ordered by "timestamp"

Two vector clocks (timestamps) can be compared element wise: • If each element in  $t_1$  is less or equal to its correspondence in  $t_2$  then  $t_1 \leq t_2$ • If  $t_1 \leq t_2$  and  $t_1 \neq t_2$  then  $t_1 < t_2$ 

### $\leq$ is transitive

## IMPLEMENTATION

/\* Initialization: \*/ for each  $x \in \mathcal{M}$  do  $M[x] := \bot$ for j := 1 to n do t[j] := 0  $OutQueue := \langle \rangle$   $InQueue := \langle \rangle$ /\* Read action: to read from x \*/ return(M[x])

/\* Write action: to write v to x \*/ t[i] := t[i] + 1 M[x] := venqueue  $\langle i, x, v, t \rangle$  to OutQueue /\* Add  $r_i(x)$ \* to  $L_i$  and  $S_i$  \*/

/\* Add  $w_i(x)v$  to  $L_i$  and  $S_i$  \*/

## IMPLEMENTATION — CONT.

/\* Send action: executed infinitely often \*/

if  $OutQueue \neq \langle \rangle$  then let A be some nonempty prefix of OutQueueremove A from OutQueuesend A to all others

/\* Receive action: upon receipt of A from  $p_j$  \*/ foreach  $\langle j, x, v, s \rangle \in A$ enqueue  $\langle j, x, v, s \rangle$  to InQueue

/\* Apply action: executed infinitely often \*/

if  $InQueue \neq \langle \rangle$  then let  $\langle j, x, v, s \rangle$  be head of InQueueif  $s[k] \leq t[k]$  for all  $k \neq j$  and s[j] = t[j] + 1 then remove  $\langle j, x, v, s \rangle$  from InQueue t[j] := s[j]M[x] := v /\* .

/\* Add  $w_j(x)v$  to  $S_i$  \*/

## EXAMPLES

| Positive example:   | $     p_1     w_1(x)0     r_1(x)1 $ |                           | $\frac{p_2}{v_2(x)1}$ $r_2(x)0$ |
|---------------------|-------------------------------------|---------------------------|---------------------------------|
| Negative example: – | $     p_1     w_1(x)0     w_1(x)1 $ | $p_2$ $r_2(x)1$ $w_2(y)2$ | $p_3$<br>$r_3(y)2$<br>$r_3(x)0$ |

**Reminder:** Data structures are *M*, *t*, *InQueue*, *OutQueue* 

Write-tuple is  $\langle i, x, v, t \rangle$ 

## **IMPLEMENTATION NOTES**

Correctness is proved in the paper

Assuming that local computation is negligible with respect to message delays, d is the worst-case message delay, R is the worst-case time for a read and W is the worst-case time for a write: In SC:  $R + W \ge d$ In CM: R = W = 0

Implementation requires reliability, which can be dropped in exchange for inefficiency

### PROGRAMMING

### PROGRAMMING

Two classes of programs are shown, which can be written assuming SC and run correctly on CM

The above statement is proven rigorously in the paper, we will sketch a proof for a special case

This implies improved performance with little coding hassle

## **CONCURRENT-WRITE FREE PROGRAMS**

If neither  $o_1 \dashrightarrow o_2$  nor  $o_2 \dashrightarrow o_1$  hold,  $o_1$  and  $o_2$  are said to be concurrent with respect to  $\dashrightarrow$ 

A program  $\Pi$  is concurrent-write free, if for all histories H of  $\Pi$ : • For all causality orders  $\dashrightarrow$  of H:

• If H has a serialization that respects  $\rightsquigarrow$  (H is SC), then it has no two concurrent write operations with respect to  $\rightsquigarrow$ 

"No two writes can occur interchangeably assuming SC"

### **CONCURRENT-WRITE FREE PROGRAM EXAMPLE**

x, y, and z are shared variables, initially 0; a, b, c, and d are local variables

| <b>process</b> $p_1$ : | <b>process</b> $p_2$ : | process $p_3$ : |
|------------------------|------------------------|-----------------|
| x := 1                 | $\mathbf{repeat}$      | b := y;         |
| y := 1                 | a := y                 | repeat          |
|                        | until $a = 1$          | c := z          |
|                        | z := 1                 | until $c = 1$   |
|                        |                        | d := x          |

## **PROOF SKETCH**

**Theorem (4):** If  $\Pi$  is concurrent-write free, then all histories of  $\Pi$  with causal memory are sequentially consistent

### **Proof sketch (finite case):**

- Let H be a finite causal history of  $\Pi$  and  $\dashrightarrow$  its causality order
- •Using structural induction, prove that H is concurrent-write free with respect to  $\dashrightarrow$  and has a serialization that respects  $\dashrightarrow$

## PROOF SKETCH — CONT.

- -Let H be a finite causal history of  $\Pi$
- •Assume towards contradiction concurrent writes  $W_1$ ,  $W_2$
- •Let H' be the (proper) prefix of H excluding  $w_1$ ,  $w_2$
- ${}^{\bullet}H'$  has serialization S' because of induction assumption
- •Let  $\widehat{H}$  be H' where  $w_1$  and  $w_2$  are added to the right processes
- • $S'w_1w_2$  is a serialization of  $\widehat{H}$ , but  $\Pi$  is concurrent-write free **CONTRADICTION**
- => H is concurrent-write free with respect to  $\rightarrow$

## PROOF SKETCH - CONT. 2

- •Let H be a finite causal history of  $\Pi$
- •Let  $S_i$  be the serialization of  $A_{i+w}^H$  that respects  $\rightsquigarrow$
- •Let  $\implies$  be the transitive closure of  $\rightsquigarrow$  union with:
  - ${}^{\bullet}o_1 \Rightarrow o_2$  if  $o_1$  is a read by  $p_i, o_2$  is a write and  $o_1$  precedes  $o_2$  in  $S_i$
- ■⇒ is acyclic, so choose *o* s.t. there is no o' s.t.  $o \Rightarrow o'$
- $\overline{S}$  is a serialization of H o because of induction assumption
- • $\overline{S}$ ; *o* is a serialization of H

## DATA-RACE FREE PROGRAMS

A program  $\Pi$  is data-race free, if for all histories H of  $\Pi$ :

- For all causality orders  $\dashrightarrow$  of H:
  - If H has a serialization that respects we (H is SC), then it has no two operations that:
    - Both access the same location
    - At least one is a write
    - They are concurrent with respect to -----

"No two operations (which are not both reads) can access the same location interchangeably assuming SC"

### DATA-RACE FREE PROGRAM EXAMPLE

#### Data-Race Free but Not Concurrent-Write Free!

process  $p_0$ : while not done for i := 1 to nawait(complete[i] = 1) for i := 1 to ncomplete[i] := 0 for i := 1 to nawait(changed[i] = 1) done := converged(A, x, b) for i := 1 to nchanged[i] := 0

### process $p_i$ : while not done $t[i] := \left(b[i] - \sum_{j=1}^{i-1} A[i, j] x[j] - \sum_{j=i+1}^{n} A[i, j] x[j]\right) / A[i, i]$ complete[i] := 1 await(complete[i] = 0) x[i] := t[i] changed[i] := 1await(changed[i]) = 0

### DATA-RACE FREE PROGRAM NEGATIVE EXAMPLE

x, y, and z are shared variables, initially 0; a, b, c, and d are local variables

| process $p_1$ :                  | process $p_2$ : | process $p_3$ : |
|----------------------------------|-----------------|-----------------|
| x := 1                           | repeat          | b := y;         |
| y := 1                           | a := y          | repeat          |
|                                  | until $a = 1$   | c := z          |
| <b>Concurrent-Write Free But</b> | z := 1          | until $c = 1$   |
| Not Data-Race Free!              |                 | d := x          |

## **CM SYNCHRONIZATION**

CM can improve performance for DRF programs with synchronization, e.g. busy-wait **await** statements

Mutual exclusion cannot be realized with CM without cooperation, thus **semaphores** can be added

Although blocking, CM with synchronization primitives can be faster in practice than SC

### SUMMARY

We have seen:

General consistency related definitions

The definition of Causal Memory

Two classes of programs which can be programmed for SC and run on CM

### CM is faster than SC, but more easily programmable than PRAM

