「SF-PLF」13 References

Programming Language Foundations - Typing Mutable References

Posted by Hux on March 13, 2019

Hux: this chapter is very similar to TAPL - ch13 References But under a “formal verification” concept, it’s more interesting and practical and push you to think about it!

computational effects - “side effects” of computation - impure features

  • assign to mutable variables (reference cells, arrays, mutable record fields, etc.)
  • perform input and output to files, displays, or network connections;
  • make non-local transfers of control via exceptions, jumps, or continuations;
  • engage in inter-process synchronization and communication

The main extension will be dealing explicitly with a

  • store (or heap) and
  • pointers (or reference) that name store locations, or address

interesting refinement: type preservation

Definition

forms of assignments:

  • rare : Gallina - No
  • some : ML family - Explicit reference and dereference
  • most : C family - Implicit …

For formal study, use ML’s model.

Syntax

Types & Terms

1
2
3
4
5
6
7
8
9
10
11
12
T ::= 
    | Nat
    | Unit
    | T  T
    | Ref T

t ::= 
    | ...                Terms
    | ref t              allocation
    | !t                 dereference
    | t := t             assignment
    | l                  location
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Inductive ty : Type :=
  | Nat : ty
  | Unit : ty
  | Arrow : ty  ty  ty
  | Ref : ty  ty.

Inductive tm : Type :=
  (* STLC with numbers: *)
  ...
  (* New terms: *)
  | unit : tm
  | ref : tm  tm
  | deref : tm  tm
  | assign : tm  tm  tm
  | loc : nat  tm.         (** 这里表示 l 的方式是 wrap 一个 nat as loc **)

Typing

1
2
3
4
5
6
7
8
9
10
11
12
                       Gamma |- t1 : T1
                   ------------------------                         (T_Ref)
                   Gamma |- ref t1 : Ref T1

                    Gamma |- t1 : Ref T11
                    ---------------------                         (T_Deref)
                      Gamma |- !t1 : T11

                    Gamma |- t1 : Ref T11
                      Gamma |- t2 : T11
                   ------------------------                      (T_Assign)
                   Gamma |- t1 := t2 : Unit

Values and Substitution

1
2
3
4
Inductive value : tm  Prop :=
  ...
  | v_unit :     value unit
  | v_loc  : l, value (loc l).  (* <-- 注意这里是一个 Π (l:nat) . value (loc l) *)
1
2
3
4
5
6
7
8
9
Fixpoint subst (x:string) (s:tm) (t:tm) : tm :=
  match t with
  ...
  | unit          t
  | ref t1        ref (subst x s t1)
  | deref t1      deref (subst x s t1)
  | assign t1 t2  assign (subst x s t1) (subst x s t2)
  | loc _         t
  end.

Pragmatics

Side Effects and Sequencing

1
r:=succ(!r); !r

can be desugar to

1
(\x:Unit. !r) (r:=succ(!r)).

then we can write some “imperative programming”

1
2
3
4
r:=succ(!r); 
r:=succ(!r); 
r:=succ(!r); 
!r

References and Aliasing

shared reference brings _shared state

1
2
3
4
let r = ref 5 in
let s = r in
s := 82;
(!r)+1

Shared State

thunks as methods

1
2
3
4
5
6
7
8
9
    let c = ref 0 in
    let incc = \_:Unit. (c := succ (!c); !c) in
    let decc = \_:Unit. (c := pred (!c); !c) in (
      incc unit; 
      incc unit;          -- in real PL: the concrete syntax is `incc()`
      decc unit
    )

Objects

constructor and encapsulation!

1
2
3
4
5
6
7
8
9
    newcounter =
      \_:Unit.            -- add `(self, init_val)` would make it more "real"
        let c = ref 0 in  -- private and only accessible via closure (特权方法)
        let incc = \_:Unit. (c := succ (!c); !c) in
        let decc = \_:Unit. (c := pred (!c); !c) in
        { i=incc, 
          d=decc  }       -- return a "record", or "struct", or "object"!
          

References to Compound Types (e.g. Function Type)

Previously, we use closure to represent map, with functional update 这里的”数组” (这个到底算不算数组估计都有争议,虽然的确提供了 index 但是这个显然是 O(n) 都不知道算不算 random access… 并不是 in-place update 里面的数据的,仅仅是一个 ref 包住的 map 而已 (仅仅是多了可以 shared

其实或许 list (ref nat) 也可以表达数组? 反正都是 O(n) 每次都 linear search 也一样……

1
2
3
4
5
6
7
    newarray = \_:Unit. ref (\n:Nat.0)
    lookup = \a:NatArray. \n:Nat. (!a) n   
    update = \a:NatArray. \m:Nat. \v:Nat.
               let oldf = !a in
               a := (\n:Nat. if equal m n then v else oldf n);

Null References

nullptr!

Deref a nullptr:

  • exception in Java/C#
  • insecure in C/C++ <– violate memory safety!!
1
2
3
4
    type Option T   = Unit + T
    type Nullable T = Option (Ref T)

Why is Option outside? think about C, nullptr is A special const location, like Unit (None in terms of datacon) here.

Garbage Collection

last issue: store de-allocation

w/o GC, extremely difficult to achieve type safety…if a primitive for “explicit deallocation” provided one can easily create dangling reference i.e. references -> deleted

One type-unsafe example: (pseudo code)

1
2
3
4
5
6
7
   a : Ref Nat = ref 1;       -- alloc loc 0
   free(a);                   -- free  loc 0
   b : Ref Bool = ref True;   -- alloc loc 0
   
   a := !a + 1                -- BOOM!

Operational Semantics

Locations

what should be the values of type Ref T?

ref allocate some memory/storage!

run-time store is essentially big array of bytes. different datatype need to allocate different size of space (region)

we think store as array of values, abstracting away different size of different values we use the word location here to prevent from modeling pointer arithmetic, which is un-trackable by most type system

location n is float doesn’t tell you anything about location n+4

Stores

we defined replace as Fixpoint since it’s computational and easier. The consequence is it has to be total.

Reduction

Typing

typing context:

1
Definition context := partial_map ty.

Store typings

why not just make a context a map of pair? we don’t want to complicate the dynamics of language, and this store typing is only for type check.

The Typing Relation

Properties

Well-Typed Stores

Extending Store Typings

Preservation, Finally

Substitution Lemma

Assignment Preserves Store Typing

Weakening for Stores

Preservation!

Progress

References and Nontermination