CS 144 Checkpoint 4 - Interoperating in the world
“This checkpoint is about testing your TCP implementation in the real world and measuring the long-term statistics of a particular Internet path.”
Using your TCP implementation in the real world
If you have a correct implementation and passed all the previous tests, you might not need to write any code for this task. However, if you are not using the standard recommended development environment, you might run into strange environment issues. I was using an arm64 Ubuntu 23.10 devcontainer on my M1 Pro MacBook, and the TCP packets I received somehow all had incorrect checksums. It took me quite a while to figure out the problem; eventually, I managed to pass the test on a VPS without changing any code.
There is a lot of supporting code provided for this checkpoint. I find it helpful to understand the whole codebase from end (IP) to end (Byte Stream).
Let’s get started!
The TUN Device
The TUN device is a virtual network device provided by the kernel that allows us to send and receive IP packets. We can create a TUN device by
# Create a TUN device
sudo ip tuntap add mode tun user pcloud dev tun144
# Add an IP address to the TUN device
sudo ip addr add "169.254.144.1/24" dev tun144
# Bring up the TUN device
sudo ip link set dev tun144 up
# Add a route to the TUN device
sudo ip route change "169.254.144.0/24" dev tun144 rto_min 10ms
Check out scripts/tun.sh for more details.
We can then get the file descriptor of the TUN device by opening the /dev/net/tun file, and we can adjust its settings using the ioctl system call. Here is the implementation in tun.cc:
static constexpr const char* CLONEDEV = "/dev/net/tun";
TunTapFD::TunTapFD( const string& devname, const bool is_tun )
: FileDescriptor( ::CheckSystemCall( "open", open( CLONEDEV, O_RDWR | O_CLOEXEC ) ) )
{
struct ifreq tun_req {};
tun_req.ifr_flags = static_cast<int16_t>( ( is_tun ? IFF_TUN : IFF_TAP ) | IFF_NO_PI ); // no packetinfo
// copy devname to ifr_name, making sure to null terminate
strncpy( static_cast<char*>( tun_req.ifr_name ), devname.data(), IFNAMSIZ - 1 );
tun_req.ifr_name[IFNAMSIZ - 1] = '\0';
CheckSystemCall( "ioctl", ioctl( fd_num(), TUNSETIFF, static_cast<void*>( &tun_req ) ) );
}
Now, we can send IP packets by writing to the TUN device, and receive IP packets by reading from the TUN device.
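To make this concrete, here is a minimal standalone sketch (independent of minnow’s helper classes, error handling omitted) that attaches to the tun144 device and reads one raw IP packet:
#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
int main()
{
  // Open the clone device and attach it to the existing tun144 interface.
  const int fd = open( "/dev/net/tun", O_RDWR );
  struct ifreq req {};
  req.ifr_flags = IFF_TUN | IFF_NO_PI; // raw IP packets, no packet-info header
  strncpy( req.ifr_name, "tun144", IFNAMSIZ - 1 );
  ioctl( fd, TUNSETIFF, &req );
  // Each read() returns exactly one IP packet (blocking until one arrives);
  // a write() of a well-formed IP packet would inject one into the kernel.
  char buf[2048];
  const ssize_t n = read( fd, buf, sizeof( buf ) );
  printf( "received a %zd-byte IP packet\n", n );
  close( fd );
}
Running this and then pinging any address in 169.254.144.0/24 (say, ping 169.254.144.10) should make an ICMP packet show up in the read.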
TCP to IP
The TUN device only works with IP packets, so we need to wrap each TCP segment in an IP packet before sending it down to the TUN device.
class TCPOverIPv4OverTunFdAdapter : public TCPOverIPv4Adapter
{
//! Creates an IPv4 datagram from a TCP segment and writes it to the TUN device
void write( const TCPMessage& seg ) { _tun.write( serialize( wrap_tcp_in_ip( seg ) ) ); }
};
The wrap_tcp_in_ip function sets up the IP header and calculates the checksum for the TCP segment.
//! Takes a TCP segment, sets port numbers as necessary, and wraps it in an IPv4 datagram
//! \param[in] seg is the TCP segment to convert
InternetDatagram TCPOverIPv4Adapter::wrap_tcp_in_ip( const TCPMessage& msg )
{
TCPSegment seg { .message = msg };
// set the port numbers in the TCP segment
seg.udinfo.src_port = config().source.port();
seg.udinfo.dst_port = config().destination.port();
// create an Internet Datagram and set its addresses and length
InternetDatagram ip_dgram;
ip_dgram.header.src = config().source.ipv4_numeric();
ip_dgram.header.dst = config().destination.ipv4_numeric();
ip_dgram.header.len = ip_dgram.header.hlen * 4 + 20 /* tcp header len */ + seg.message.sender.payload.size();
// set payload, calculating TCP checksum using information from IP header
seg.compute_checksum( ip_dgram.header.pseudo_checksum() );
ip_dgram.header.compute_checksum();
ip_dgram.payload = serialize( seg );
return ip_dgram;
}
IP to TCP
When we read a received IP packet from the TUN device, we need to unwrap the IP packet and extract the TCP segment.
class TCPOverIPv4OverTunFdAdapter : public TCPOverIPv4Adapter
{
optional<TCPMessage> read()
{
vector<string> strs( 2 );
strs.front().resize( IPv4Header::LENGTH );
_tun.read( strs );
InternetDatagram ip_dgram;
const vector<string> buffers = { strs.at( 0 ), strs.at( 1 ) };
if ( parse( ip_dgram, buffers ) ) {
return unwrap_tcp_in_ip( ip_dgram );
}
return {};
}
};
The unwrap_tcp_in_ip function attempts to parse a TCP segment from the IP datagram’s payload. If this succeeds, it then checks that the received segment is related to the current connection. When a TCP connection has been established, this means checking that the source and destination ports in the TCP header are correct. You can find the implementation in tcp_over_ip.cc.
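The gist is a chain of checks, each bailing out with an empty optional. Below is a condensed sketch of that logic (the listening-mode cases are omitted, and the parse call taking a pseudo-checksum argument is a paraphrase of minnow’s parsing helpers):
optional<TCPMessage> unwrap_tcp_in_ip_sketch( const InternetDatagram& ip_dgram )
{
  // Is the datagram addressed to us, and does it come from our peer?
  if ( ip_dgram.header.dst != config().source.ipv4_numeric()
       or ip_dgram.header.src != config().destination.ipv4_numeric() ) {
    return {};
  }
  // Does the datagram claim to carry a TCP payload at all?
  if ( ip_dgram.header.proto != IPv4Header::PROTO_TCP ) {
    return {};
  }
  // Parse the TCP segment, verifying its checksum against the IP pseudo-header.
  TCPSegment seg;
  if ( not parse( seg, ip_dgram.payload, ip_dgram.header.pseudo_checksum() ) ) {
    return {};
  }
  // Is the segment for this connection’s ports?
  if ( seg.udinfo.dst_port != config().source.port()
       or seg.udinfo.src_port != config().destination.port() ) {
    return {};
  }
  return seg.message;
}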
Foreground Thread
In minnow, all the network operations are done in a separate background thread. The foreground thread (the “main thread”) runs the application logic: it connects or listens, and writes to and reads from a reliable data stream using the public interface of the tcp_minnow_socket class:
template<TCPDatagramAdapter AdaptT>
class TCPMinnowSocket
{
public:
//! Construct from the interface that the TCPPeer thread will use to read and write datagrams
explicit TCPMinnowSocket( AdaptT&& datagram_interface );
//! Close socket, and wait for TCPPeer to finish
//! \note Calling this function is only advisable if the socket has reached EOF,
//! or else may wait forever for remote peer to close the TCP connection.
void wait_until_closed();
//! Connect using the specified configurations; blocks until connect succeeds or fails
void connect( const TCPConfig& c_tcp, const FdAdapterConfig& c_ad );
//! Listen and accept using the specified configurations; blocks until accept succeeds or fails
void listen_and_accept( const TCPConfig& c_tcp, const FdAdapterConfig& c_ad );
// Inherited from FileDescriptor
// Read into `buffer`
void read( std::string& buffer );
void read( std::vector<std::string>& buffers );
// Inherited from FileDescriptor
// Attempt to write a buffer
// returns number of bytes written
size_t write( std::string_view buffer );
size_t write( const std::vector<std::string_view>& buffers );
size_t write( const std::vector<std::string>& buffers );
//! When a connected socket is destructed, it will send a RST
~TCPMinnowSocket();
//! \name
//! This object cannot be safely moved or copied, since it is in use by two threads simultaneously
//!@{
TCPMinnowSocket( const TCPMinnowSocket& ) = delete;
TCPMinnowSocket( TCPMinnowSocket&& ) = delete;
TCPMinnowSocket& operator=( const TCPMinnowSocket& ) = delete;
TCPMinnowSocket& operator=( TCPMinnowSocket&& ) = delete;
//!@}
//! \name
//! Some methods of the parent Socket wouldn't work as expected on the TCP socket, so delete them
//!@{
void bind( const Address& address ) = delete;
Address local_address() const = delete;
void set_reuseaddr() = delete;
//!@}
// Return peer address from underlying datagram adapter
const Address& peer_address() const { return _datagram_adapter.config().destination; }
};
In practice, the TCPDatagramAdapter AdaptT is the TCPOverIPv4OverTunFdAdapter we discussed earlier, which is responsible for reading and writing IP packets to the TUN device.
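The concept itself boils down to two requirements: the adapter must be able to write a TCPMessage and to read one back. A sketch of what it might look like (reconstructed from how the socket uses its adapter; the real definition lives in the minnow headers):
#include <concepts>
#include <optional>
template<class T>
concept TCPDatagramAdapter = requires( T t, TCPMessage msg ) {
  // send a TCP message out to the underlying medium...
  { t.write( msg ) } -> std::same_as<void>;
  // ...and receive one back, or nothing if the next packet wasn't for us
  { t.read() } -> std::same_as<std::optional<TCPMessage>>;
};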
Background Thread
The background thread takes care of the back-end tasks that the kernel would perform for a TCPSocket
: reading and parsing datagrams from the wire, filtering out segments unrelated to the connection, etc.
class TCPPeer {
void receive( TCPMessage msg, const TransmitFunction& transmit )
{
if ( not active() ) {
return;
}
// Record time in case this peer has to linger after streams finish.
time_of_last_receipt_ = cumulative_time_;
// If SenderMessage occupies a sequence number, make sure to reply.
need_send_ |= ( msg.sender.sequence_length() > 0 );
// If SenderMessage is a "keep-alive" (with intentionally invalid seqno), make sure to reply.
// (N.B. orthodox TCP rules require a reply on any unacceptable segment.)
const auto our_ackno = receiver_.send().ackno;
need_send_ |= ( our_ackno.has_value() and msg.sender.seqno + 1 == our_ackno.value() );
// Did the inbound stream finish before the outbound stream? If so, no need to linger after streams finish.
if ( receiver_.writer().is_closed() and not sender_.reader().is_finished() ) {
linger_after_streams_finish_ = false;
}
// Give incoming TCPSenderMessage to receiver.
receiver_.receive( std::move( msg.sender ) );
// Give incoming TCPReceiverMessage to sender.
sender_.receive( msg.receiver );
// Send reply if needed.
if ( need_send_ ) {
send( sender_.make_empty_message(), transmit );
}
}
};
The receive function is called when a new TCP message arrives. First the receiver processes the incoming sender message; then the sender processes the incoming receiver message; finally, a reply is sent if needed. The transmit function is just the TCPOverIPv4OverTunFdAdapter::write function we saw earlier.
void send( const TCPSenderMessage& sender_message, const TransmitFunction& transmit )
{
TCPMessage msg { sender_message, receiver_.send() };
transmit( std::move( msg ) );
need_send_ = false;
}
How do the main thread and the background thread communicate? They use the socketpair syscall to create a pair of connected Unix-domain sockets. The syscall returns two file descriptors, one for each end of the socket, and the two threads communicate by writing to and reading from these file descriptors.
//! \brief Call [socketpair](\ref man2::socketpair) and return connected Unix-domain sockets of specified type
//! \param[in] type is the type of AF_UNIX sockets to create (e.g., SOCK_SEQPACKET)
//! \returns a std::pair of connected sockets
template<std::derived_from<Socket> SocketType>
inline std::pair<SocketType, SocketType> socket_pair_helper( int domain, int type, int protocol = 0 )
{
std::array<int, 2> fds {};
CheckSystemCall( "socketpair", ::socketpair( domain, type, protocol, fds.data() ) );
return { SocketType { FileDescriptor { fds[0] } }, SocketType { FileDescriptor { fds[1] } } };
}
//! \param[in] datagram_interface is the underlying interface (e.g. to UDP, IP, or Ethernet)
template<TCPDatagramAdapter AdaptT>
TCPMinnowSocket<AdaptT>::TCPMinnowSocket( AdaptT&& datagram_interface )
: TCPMinnowSocket( socket_pair_helper<LocalStreamSocket>( AF_UNIX, SOCK_STREAM ),
std::move( datagram_interface ) )
{}
The Event Loop
Who calls the TCPPeer::receive function? The background thread runs an event loop. The poll syscall is used to wait for events on the TUN device and the socketpair. When a new packet is received, or new data is sent from the main thread, the TCPPeer class is notified. Check out the EventLoop::wait_next_event function inside eventloop.cc, and TCPMinnowSocket<AdaptT>::_initialize_TCP inside tcp_minnow_socket_impl.hh, for more details. Essentially, there are three events to handle (sketched after the list):
- Incoming datagram received (needs to be given to the TCPPeer::receive method)
- Outbound bytes received from the main thread via a write() call (need to be read from the socketpair and given to TCPPeer)
- Incoming bytes reassembled by the Reassembler (need to be read from the inbound_stream and written to the socketpair back to the main thread)
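A rough sketch of how these three rules might be registered (paraphrased: add_rule( name, fd, direction, callback ) approximates minnow’s EventLoop API, and _thread_data stands for the socketpair end owned by the background thread):
// 1. Incoming datagram on the TUN device: hand it to TCPPeer::receive.
_eventloop.add_rule( "receive segment", _datagram_adapter.fd(), Direction::In, [&] {
  if ( auto msg = _datagram_adapter.read() ) {
    _tcp->receive( std::move( *msg ), transmit );
  }
} );
// 2. Outbound bytes from the main thread: drain the socketpair into the
//    sender's outbound byte stream, then let the sender transmit.
_eventloop.add_rule( "push bytes", _thread_data, Direction::In, [&] {
  std::string data;
  _thread_data.read( data );
  // (sketch: the real code pushes `data` into the outbound stream first)
  _tcp->push( transmit );
} );
// 3. Reassembled inbound bytes: copy them back to the main thread's socketpair.
_eventloop.add_rule( "read bytes", _thread_data, Direction::Out, [&] {
  Reader& inbound = _tcp->inbound_reader();
  const size_t n = _thread_data.write( inbound.peek() );
  inbound.pop( n ); // only pop what was actually written
} );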
Webget
Now let’s put everything together: what’s happening behind these few lines of code?
auto socket = CS144TCPSocket();
socket.connect( { host, "http"s } );
socket.write( std::format( "GET {} HTTP/1.1\r\n"s, path ) );
socket.write( std::format( "Host: {}\r\n"s, host ) );
socket.write( std::format( "Connection: close\r\n\r\n"s ) );
string buffer;
while ( !socket.eof() ) {
socket.read( buffer );
cout << buffer;
}
socket.wait_until_closed();
socket.connect starts the three-way handshake with the server: it sends the SYN packet and waits for the SYN-ACK packet.
std::cerr << "DEBUG: minnow connecting to " << c_ad.destination.to_string() << "...\n";
if ( not _tcp.has_value() ) {
throw std::runtime_error( "TCPPeer not successfully initialized" );
}
_tcp->push( [&]( auto x ) { _datagram_adapter.write( x ); } );
if ( _tcp->sender().sequence_numbers_in_flight() != 1 ) {
throw std::runtime_error( "After TCPConnection::connect(), expected sequence_numbers_in_flight() == 1" );
}
_tcp_loop( [&] { return _tcp->sender().sequence_numbers_in_flight() == 1; } );
if ( _tcp->inbound_reader().has_error() ) {
std::cerr << "DEBUG: minnow error on connecting to " << c_ad.destination.to_string() << ".\n";
} else {
std::cerr << "DEBUG: minnow successfully connected to " << c_ad.destination.to_string() << ".\n";
}
If the connection is successful, it launches the background thread to handle the event loop.
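Both connect and the background thread drive the connection through a _tcp_loop helper: roughly, a loop around the event loop that also feeds elapsed time into tick, so the retransmission timer we built in the earlier checkpoints keeps running. A paraphrased sketch (timestamp_ms and TCP_TICK_MS are assumed names for minnow’s timing helpers):
template<TCPDatagramAdapter AdaptT>
void TCPMinnowSocket<AdaptT>::_tcp_loop( const std::function<bool()>& condition )
{
  auto base_time = timestamp_ms();
  while ( condition() ) {
    // Wait (up to TCP_TICK_MS) for a TUN-device or socketpair event; poll() inside.
    if ( _eventloop->wait_next_event( TCP_TICK_MS ) == EventLoop::Result::Exit ) {
      break;
    }
    if ( _tcp->active() ) {
      // Tell the TCPPeer how much time has passed so it can retransmit on timeout.
      const auto next_time = timestamp_ms();
      _tcp->tick( next_time - base_time, [&]( auto x ) { _datagram_adapter.write( x ); } );
      base_time = next_time;
    }
  }
}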
Every time we call socket.write, the main thread writes the data to the socketpair; the background thread reads the data from the socketpair and then sends it to the TUN device. Every time a packet is received from the TUN device, the TCP sender and TCP receiver we implemented are called to handle the packet.
Every time we call socket.read, the background thread sends the reassembled data back to the main thread via the socketpair.
Finally, when we call socket.wait_until_closed, the main thread shuts down the socketpair and joins the background thread.
template<TCPDatagramAdapter AdaptT>
void TCPMinnowSocket<AdaptT>::wait_until_closed()
{
shutdown( SHUT_RDWR );
if ( _tcp_thread.joinable() ) {
std::cerr << "DEBUG: minnow waiting for clean shutdown... ";
_tcp_thread.join();
std::cerr << "done.\n";
}
}
That’s it! Hopefully this helps you debug your TCP implementation in the real world.
This concludes Checkpoint 4.
If you find this post helpful, please consider sponsoring.