The abstract data type of hash tables provides the server’s internal representation of the directory. It provides the usual get and set operations, but also actions that extract a range of key as a shard, and incorporate a shard into the table. Here is the interface of hash_table:

include collections

module hash_table(key,value,shard) = {

    action set(k:key.t, v:value)
    action get(k:key.t) returns (v:value)
    action extract_(lo:key.iter.t,hi:key.iter.t) returns(res:shard.t)
    action incorporate(s:shard.t)

    function hash(X:key.t) : value

    object spec = {
        after init {
            hash(X) := 0
        }

        before set {
            hash(k) := v
        }

        after get {
            assert v = hash(k)
        }

        after extract_ {
            assert shard.lo(res) = lo;
            assert shard.hi(res) = hi;
            assert key.iter.between(lo,X,hi)-> shard.value(res,X) = hash(X)
        }

        before incorporate(s:shard.t) {
            assert shard.valid(s);
            hash(K) := shard.value(s,K) 
                       if key.iter.between(shard.lo(s),K,shard.hi(s))
                       else hash(K)
        }
    }

The interface maintains an abstract map hash that represents the contents of the table. Initially, all keys map to the background value 0. The set and get actions have the usual semantics. The extract_ action returns a shard with lower bound lo and upper bound hi, such that keys in the interval [lo,hi) match the abstract table hash. Notice the use of the representation function shard.value and also the use of between to test whether a key X is in the interval [lo,hi). Using IVy’s method notation, we can also write key.iter.between(lo,X,hi) as lo.between(X,hi).

As an aside, extract_ has an underscore since it conflicts with an IVy keyword.

The incorporate action sets all the keys in the interval [lo,hi) to match the shard s. Because of the way shard.value is defined, this means that keys not present in the shard’s array are set to 0.

Implementation

The implementation of hash_table actually uses an ordered map rather than a hash table. This is because we need to be able to efficiently iterate over an interval of keys to extract a shard. Ordered maps are provided by the ordered_map module in collections. Here is the relevant part of the interface:

module ordered_map(key,value) = {

    action set(nkey:key.t,v:value)
    action get(k:key.t) returns (v:value)
    action erase(lo:key.iter.t,hi:key.iter.t)
    action lub(it:key.iter.t) returns (res:key.iter.t)
    action next(it:key.iter.t) returns (res:key.iter.t)

    relation contains(K:key.t)
    relation maps(K:key.t,V:value)
    ...

The set and get actions are as usual. The erase action removes an interval [lo,hi) of keys. The lub action returns an iterator to the least key in the set greater than or equal to it, or end if there is none. The next action is similar but returns the least key greater than it. These operations are implemented in the standard library using the map template in the C++ standard template library. This makes set, get, lub and next O(log n), while erase is O(m + log n), where m is the number of keys in the interval.

The module provides two abstract predicates, contains and maps. The contains predicate indicates whether a key is in the map, while maps(K,V) indicates that key K maps to value V. The maps relation is injective, but not total. That is, a key maps to zero or one values.

Get and set

The ordered map representing the table is called tab. We just call the actions of tab to implement get and set:

object impl = {
    
    instance tab : ordered_map(key,value)

    implement set {
        call tab.set(k,v)
    }

    implement get {
        v := tab.get(k)
    }
    ...

Extract

To implement extract_, we iterate over the interval [lo,hi) in the map, recording the key/value pairs in a shard:

implement extract_ {
    res.kv := shard.kvt.empty;
    var idx := tab.lub(lo);
    while idx < hi
    ...
    {
        var k := idx.val;
        res.kv := res.kv.append_pair(k,tab.get(k,0));
        idx := tab.next(idx)
    };                
    res.lo := lo;
    res.hi := hi
}

We start be setting the shard’s array to an empty array. We set our iterator idx to point to the least key (if any) greater than or equal to lo. Then, while idx is less than hi, we append the key/value pair pointed to by idx to the array and move idx to the next key in the map. Finally, we set the lo and hi fields of the shard.

To prove correctness of this implementation, we must decorate the loop with several invariants that allow IVy to prove the post-condition of extract_. These are inserted in place of ... above:

invariant lo <= idx & (idx < hi -> tab.contains(idx.val))
invariant lo.between(X,idx) & tab.contains(X) -> 
               exists I. (res.key_at(I,X) & tab.maps(X,res.value_at(I)))
invariant res.key_at(I,X) -> lo.between(X,idx) & tab.contains(X)
invariant shard.valid(res)

The first is fairly obvious. We don’t state that idx <= hi since this may be false when we enter the loop (if no keys are in the interval). Because we are iterating over tab, we know that idx is always present in tab, except if we are at the end of the iteration.

The second two invariants say that that the contents of the shard match the map up to the current index idx. That is, any key in the map we have already seen must occur in some position I in the key/value array, and this position must contain the expected value. Conversley, any key in the shard must be a key we have already seen.

Finally, we need to establish the representation invariant of shard. That is, we need to maintain the invariant that no key occurs twice in the shard.

Notice the use of the key_at relation here. We never have to use a function that looks up the key in a given position. Rather, we say there exists a position having a given key. This means our functions are always from keys to positions rather than the other way around, avoiding a function cycle. When we do look up the key in a position in the code, we use the method get_key. This implicitly guarantees that there is a key in the given position, but only in that one position. Thus, all functions from positions to keys are hidden in the implementation of keyval. This is a very typical idiom in IVy: if there is a function cycle, we break it by segragating the functions in different directions into different isolates.

Incorporate

This operation is more or less the reverse of extract. We loop over the key/value pairs in a shard, inserting them in the map. First, though, we must erase any keys in the interval [lo,hi), since the specification requires that keys not present in the shard be removed. Here is the implementation:

implement incorporate(s:shard.t) {
    var lo := s.lo;
    var hi := s.hi;
    call tab.erase(lo,hi);
    var pos:shard.index.t  := 0;
    while pos < s.kv.end
    ...
    {
        var k := s.kv.get_key(pos);
        var d := s.kv.get_value(pos);
        if lo.between(k,hi) & d ~= 0{
            call tab.set(k,d)
        };                        
        pos := pos.next
    }
}        

Notice that in the loop, we use between to test whether each key is actually in the shard’s interval. As an alternative, we could have stated in the shard representation invariant that no keys are outside the interval. There is a slight optimization: if a key has the background value 0, we don’t add it to tab.

Also notice that we left out the invariants of the loop. Here they are:

invariant 0 <= pos & pos <= s.kv.end

invariant lo.between(X,hi) & s.value(X) = 0 -> ~tab.contains(X)
invariant lo.between(X,hi) & 0 <= Y & Y < pos & s.key_at(Y,X) & s.value(X) ~= 0
                 -> tab.contains(X) & tab.maps(X,s.value(X))
invariant ~lo.between(X,hi) -> spec.tab_invar(X,Y)

invariant tab.maps(X,Y) & tab.maps(X,Z) -> Y = Z & tab.contains(X)

Yikes. Let’s take them in groups. The first standard: the loop index pos ranges from 0 up to the end of the array. The next three state the correct contents of the map tab at iteration pos. Basically, for keys in the shard’s interval [lo,hi) the map should match the content of the shard up to position pos. This is stated by the second and third invariant. There is the subtlety that zero values are not added to the map. The fourth invariant says that keys outside [lo,hi) should match the abstract map hash. This is stated using the predicate tab_invar(X,Y), which states that the two maps match for key X and value Y:

function tab_invar(X,Y) =
  (tab.contains(X) & tab.maps(X,Y) -> hash(X) = Y)
  & (~tab.contains(X) -> hash(X) = 0)
  & (tab.contains(X) -> tab.maps(X,hash(X)))

Finally, the last invariant is just injectivity of the map. This is really an object invariant of tab, but we have to state it here since tab is modified by the loop. Some day, IVy will have implicit object invariants and this won’t be needed.

To prove our implementation is correct, we need one invariant conjecture:

conjecture spec.tab_invar(X,Y)

This says that the concrete map tab matches the abstract map hash.

Verifying the table implementation

Before using hash_table in our protocol, it’s a good idea to try verifying it and maybe even testing it a bit so we’re fairly confident the specification is what we want.

Here’s a simple program to do the test:

include shard
include table
include key

type value

instance shard : table_shard(key,value)
instance tab : hash_table(key,value,shard)

export tab.set
export tab.get
export tab.extract_
export tab.incorporate

isolate iso_tab = tab with shard,key
isolate iso_key = key

extract iso_impl = tab,shard,key

First, we need the types key, shard and value. We use an uninterpreted type for value and our existing modules for key and shard. We create a table object tab and export its actions. Then we prove tab using the shard and key types. We also have to prove key since it has properties to discharge.

Let’s try verifiying:

$ ivy_check table_test.ivy
Checking isolate iso_key...
Checking isolate iso_tab...
trying ext:tab.extract_...
checking consecution...
trying ext:tab.get...
checking consecution...
trying ext:tab.incorporate...
checking consecution...
trying ext:tab.set...
checking consecution...
Checking isolate key.iso...
trying ext:key.iter.end...
checking consecution...
Checking isolate shard.iso_index...
trying ext:shard.index.next...
checking consecution...
OK

Life is good. Of course it didn’t work out that way the first time. The bugs in the implementations, specifications and invariants have to be worked out by examining counterexamples. It is a useful exercise to try removing some invariants to see the counterexamples.

Testing

Let’s try running a few manual tests:

$ make table_test
ivy_to_cpp target=repl isolate=iso_impl table_test.ivy
g++ -g -o table_test table_test.cpp
$ ./table_test
> tab.set(13,42)
> tab.get(13)
42
> tab.get(14)
0
> tab.extract_({is_end:false,val:11},{is_end:false,val:15})
{lo:{is_end:0,val:11},hi:{is_end:0,val:15},kv:[{p_key:13,p_value:42}]}
> tab.set(17,666)
> tab.extract_({is_end:false,val:11},{is_end:false,val:19})
{lo:{is_end:0,val:11},hi:{is_end:0,val:19},kv:[{p_key:13,p_value:42},{p_key:17,p_value:666}]}
> tab.extract_({is_end:false,val:11},{is_end:false,val:14})
{lo:{is_end:0,val:11},hi:{is_end:0,val:14},kv:[{p_key:13,p_value:42}]}
> tab.extract_({is_end:false,val:11},{is_end:false,val:13})
{lo:{is_end:0,val:11},hi:{is_end:0,val:13},kv:[]}
...

This exercise is useful before implementing on top of hash_table, since we aren’t sure at this point if the specification is right.