TCP/IP is widely used both in the Internet as well as in data centers. The protocol makes very few assumptions about the underlying network and provides useful guarantees such as reliable transmission, in-order delivery, or control flow. The price for this functionality is complexity, latency, and computational overhead, which is especially pronounced in software implementations. While for Internet communication this is acceptable, the overhead is too high in data centers. In this paper, we explore how to optimize a TCP/IP stack running on an FPGA for data center applications with an emphasis on data processing (e.g., key value stores). Using a key-value store and a low-latency consensus protocol implemented on an FPGA as an example of the requirements that arise in data centers, we provide an extensive analysis of the overheads of TCP/IP and the solutions that can be adopted to minimize such an overhead. The proposed optimized TCP/IP stack minimizes tail latencies (a key metric in distributed data processing) and is efficiently implemented so as to be able to share the FPGA with application logic.