6.5840 Lab 2: Key/Value Server
“In this lab you will build a key/value server for a single machine that ensures that each operation is executed exactly once despite network failures and that the operations are linearizable.”
There are two main challenges in this lab. The first is to ensure that each operation is executed exactly once despite possible retries. The second one is to optimize the memory usage for the server: there are fairly specific test cases for memory usage.
General Approach
Question for the reader: the RPC is implemented over a reliable TCP connection, so what’s the point of retrying in the first place? Why not increasing the timeout instead?
The client will retry the operation forever until ck.server.Call
returns true
:
for {
if ok := ck.server.Call("KVServer.Get", &args, &reply); ok {
return reply.Value
}
}
It’s possible that the server didn’t receive a request; or the server received a request, produced a response but the client failed to receive it. In both cases, the client will retry the operation. For the first case, the server does not need to do anything. For the second case, the server needs to make sure that the subsequent requests do not further modify the state, and also produce a correct response.
For a Get
operation, the server can always safely return the current state of the KV store, as it does not modify the state. For a Put
operation, the server needs to check if the request has already been processed, and if so, ignore the request. For an Append
operation, the server also needs to check if the request has already been processed, on top of that, it also needs to store the response value of the processed request, and return it for all subsequent retries as it cannot derive it from a current state.
Another important property we can leverage is that the client will only send requests sequentially. This means that the server only needs to keep track of the last request processed per client, and safely ignore all previous requests.
Data Model
To identify individual requests, we need to assign a unique identifier ArgsId
to each one of them. To identify clients, we also need a unique identifier ClerkId
for each of them. The request identifier really only needs 1 bit of information, but for convenience, we use a uint8
here. There can be an aribtrary number of clients, so we use a uint64
for the client identifier. We initialize the client identifier to a random number using nrand
to avoid conflicts. We initialize the request identifier to 0 and increment it for each request. Unsigned intergers wrap around safely when the overflow. We define the following data structure:
type ClerkId int64
type ArgsId uint8
type Clerk struct {
server *labrpc.ClientEnd
id ClerkId
nextArgsId ArgsId
}
For each request, we attach the client identifier and the request identifier to it:
type PutAppendArgs struct {
ClerkId ClerkId
ArgsId ArgsId
Key string
Value string
}
type GetArgs struct {
ClerkId ClerkId
ArgsId ArgsId
Key string
}
Now, for the server state, we need:
- A mutex to synchronize access to the state.
- A map to maintain the key-value store.
- A cache to de-duplicate requests.
type Response struct {
argsId ArgsId
value string
}
type KVServer struct {
mu sync.Mutex
cache map[ClerkId]Response
kv map[string]string
}
We use the Response
struct to store the request identifier and the response value for the last request processed by the server for a given client. A Get
request clears the cache for that client, a Put
request updates the argsId
and sets the value
to the empty string, and Append
request updates the argsId
and sets the value
to the value before the append operation.
This concludes Lab 2.
If you find this post helpful, please consider sponsoring.
Sponsor