1. OSI Reference Model and TCP/IP Reference Model
As shown in the figure, the OSI reference model consists of 7 layers, from top to bottom: application, presentation, session, transport, network, data link, and physical. The TCP/IP model is a simplified counterpart with 4 layers: application, transport, internet, and network access (link). Both models are layered, and in both each layer communicates logically with its peer layer on the remote host. The main difference is that the TCP/IP model is more concise: its application layer covers the OSI application, presentation, and session layers, and its network access layer covers the OSI data link and physical layers. Functionally the two models differ little, since both describe how two or more endpoints communicate.
2. Which layer does TCP communication belong to in the network model?
TCP (Transmission Control Protocol) is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. Whether in the OSI reference model or the TCP/IP reference model, TCP belongs to the transport layer. TCP is specifically designed to provide reliable end-to-end byte-stream transmission over unreliable internetworks.
3. How to understand connection-oriented, reliable, and byte-stream?
- Connection-oriented: A TCP connection is point-to-point, between exactly two endpoints. Unlike UDP, TCP cannot send one message to multiple hosts at once, so it does not support one-to-many scenarios.
- Reliable: However the network path changes, TCP uses acknowledgments and retransmission so that data reaches the receiver, or the sender learns of the failure if the connection breaks.
- Byte-stream: TCP treats application data as a continuous stream of bytes without message boundaries, so a message of any size can be sent, and the receiver may see it split up or merged with neighboring messages. The stream is ordered: if earlier bytes have not arrived, later bytes that have already been received are not passed to the application layer. TCP also automatically discards duplicate segments.
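Because a byte stream carries no message boundaries, applications usually add their own framing on top of TCP. The following is a minimal sketch of one common approach, assuming a 4-byte big-endian length prefix; the helper names `frame` and `deframe` are hypothetical and chosen only for illustration.

```python
import struct


def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message


def deframe(buffer: bytes):
    """Split received bytes into complete messages.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # the rest of this message is still in flight
        messages.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return messages, buffer
```

The receiver keeps the leftover bytes and prepends them to whatever the next read returns, so a message that arrives in several pieces is still reassembled correctly.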
4. Why is the TCP protocol needed?
Because the IP layer is unreliable: it does not guarantee that packets are delivered, delivered in order, or delivered intact. If reliable delivery is required, it has to be provided by TCP at the transport layer.
5. Differences and connections between TCP and UDP?
TCP and UDP both belong to the transport layer protocols. The differences are as follows:
- Connection mechanism:
  - TCP is a connection-oriented transport layer protocol.
  - UDP does not require a connection.
- Service objects:
  - TCP provides one-to-one communication.
  - UDP supports one-to-one, one-to-many, and many-to-many communication.
- Reliability:
  - TCP guarantees that data is delivered without loss or duplication, and in order.
  - UDP makes a best effort to deliver data and does not guarantee delivery.
- Congestion control and flow control:
  - TCP has congestion control and flow control mechanisms.
  - UDP has neither congestion control nor flow control.
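The difference in connection mechanism can be seen directly at the socket level. The sketch below, which assumes a loopback address and uses hypothetical function names, sends a UDP datagram without any connection setup and shows that a TCP socket refuses to send before a connection exists.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback without establishing a connection."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())  # no connect() needed
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_send_without_connect(payload: bytes) -> bool:
    """Try to send on a TCP socket that was never connected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.sendall(payload)
        return True
    except OSError:
        return False  # TCP requires an established connection first
    finally:
        sock.close()
```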
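6. TCP Header Analysis
The TCP header occupies at least 20 bytes (up to 60 bytes with options). The fixed 20-byte part contains the following fields, in order:
- Source port number (16 bits) and destination port number (16 bits).
- Sequence number (32 bits).
- Acknowledgment number (32 bits).
- Data offset (4 bits), which gives the header length in 32-bit words, followed by reserved bits.
- Control bits, including URG, ACK, PSH, RST, SYN, and FIN.
- Window size (16 bits).
- Checksum (16 bits) and urgent pointer (16 bits).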
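To make the header layout concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's `struct` module. The function name `parse_tcp_header` and the returned dictionary keys are assumptions made for this example, and options beyond the first 20 bytes are ignored.

```python
import struct

# Classic control bits and their masks in the low bits of the flags field.
FLAG_BITS = [("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
             ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20)]


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header at the start of a segment."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The data offset counts 32-bit words, so multiply by 4 for bytes.
        "header_length": (offset_flags >> 12) * 4,
        "flags": [name for name, bit in FLAG_BITS if offset_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }
```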
7. Briefly explain the three-way handshake of TCP
- Both the server and the client start in the CLOSED state.
- The server listens on a port and enters the LISTEN state.
- The client sends a SYN packet with SYN=1 and seq=x, and enters the SYN_SENT state.
- The server replies with a SYN+ACK packet with SYN=1, ACK=1, seq=y, and ack=x+1, and enters the SYN_RCVD state.
- The client replies with an ACK packet with ACK=1 and ack=y+1, and enters the ESTABLISHED state. The server enters the ESTABLISHED state when it receives this ACK.
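The sequence and acknowledgment numbers exchanged in these steps can be worked through in a few lines. This is only a sketch of the arithmetic, not a network implementation; the function name and the dictionary layout are hypothetical.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a handshake for the given initial
    sequence numbers (x = client_isn, y = server_isn)."""
    syn = {"from": "client", "flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_SPACE,  # the SYN consumed one number
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]
```

For example, with x = 100 and y = 300 the server acknowledges 101 and the client acknowledges 301.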
8. Briefly explain the four-way handshake of TCP
Both the server and the client are in the ESTABLISHED state. The steps below assume that the client closes first.
- The client initiates the closing of the connection by sending a FIN packet and enters the FIN_WAIT_1 state.
- The server replies with an ACK packet and enters the CLOSE_WAIT state.
- After receiving the ACK from the server, the client enters the FIN_WAIT_2 state.
- After processing the remaining data, the server sends a FIN packet to the client and enters the LAST_ACK state.
- The client replies with an ACK packet and enters the TIME_WAIT state.
- After receiving the ACK from the client, the server enters the CLOSED state, which completes the closure on the server side.
- After a waiting period of 2MSL, the client automatically enters the CLOSED state, which completes the closure on the client side.
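The state changes above can be summarized as a small transition table. The sketch below models only the normal close path described in this section (no simultaneous close); the event names such as `send_fin` are hypothetical labels chosen for the example.

```python
# (current state, event) -> next state, for the normal close path only.
TRANSITIONS = {
    # Actively closing side (the client in the steps above)
    ("ESTABLISHED", "send_fin"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",   # replies with the final ACK
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Passively closing side (the server in the steps above)
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # replies with an ACK
    ("CLOSE_WAIT", "send_fin"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(state: str, events: list) -> list:
    """Apply events in order and return every state that was visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path
```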
9. Why is the TCP handshake exactly three times?
The three-way handshake prevents stale (historical) connections from being established, avoids wasting resources on both sides, and lets both sides synchronize their initial sequence numbers. Sequence numbers let the receiver discard duplicates, detect lost data, and deliver data in order.
Reasons for not using a "two-way handshake" or a "four-way handshake":
- Two-way handshake: It cannot prevent stale connections from being established, which wastes resources on both sides, and it cannot reliably synchronize both sequence numbers because the server's initial sequence number is never acknowledged.
- Four-way handshake: Three steps are already the minimum needed to establish a connection reliably, so additional steps are unnecessary.
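How the third step filters out a stale connection can be sketched as a single check on the client side. Assume the client sent an old SYN that was delayed in the network, then sent a new SYN with a different initial sequence number. The function name below is hypothetical, and the sketch covers only this one decision.

```python
SEQ_SPACE = 2 ** 32


def client_reply_to_syn_ack(current_isn: int, syn_ack_ack: int) -> str:
    """Decide how a client in SYN_SENT answers an incoming SYN+ACK.

    current_isn is the sequence number of the SYN the client sent most
    recently; syn_ack_ack is the ack field of the SYN+ACK it received.
    """
    if syn_ack_ack == (current_isn + 1) % SEQ_SPACE:
        return "ACK"  # the server answered the current SYN
    return "RST"      # the server answered a stale SYN: abort that attempt
```

With only two steps the server would consider the connection established as soon as it answered the stale SYN, before the client had any chance to reject it.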
10. Why does closing a TCP connection take four steps?
Looking back at how both parties send FIN packets during connection termination explains why four steps are needed:
- When the client sends a FIN, it only means that the client will no longer send data; it can still receive data.
- When the server receives the client's FIN, it first replies with an ACK. The server may still have data to process and send. Only when it has no more data to send does it send its own FIN to indicate that it agrees to close the connection.
- Because the server usually needs time to finish processing and sending data, its ACK and FIN are usually sent separately, which results in one more step than the three-way handshake.
11. What is the TIME_WAIT state in the four-way handshake?
First, note that only the party that actively closes the connection enters the TIME_WAIT state.
The TIME_WAIT state exists mainly for two reasons:
- To prevent delayed segments from the old connection, still in transit in the network, from being accepted by a new connection that reuses the same addresses and ports.
- To ensure that the passively closing party can close correctly, that is, to make sure the last ACK reaches it. If that ACK is lost, the passive party retransmits its FIN, and the party in TIME_WAIT can acknowledge it again.
12. Why is the TIME_WAIT time set to 2MSL?
MSL (Maximum Segment Lifetime) is the maximum time a segment can exist in the network. If the last ACK does not reach the passively closing party, that party times out and retransmits its FIN, and the actively closing party answers with a new ACK. The lost ACK can take up to 1 MSL in one direction and the retransmitted FIN up to 1 MSL in the other, so waiting 2MSL covers the round trip.
The 2MSL timer starts when the client receives the FIN and sends the ACK. If the client receives a retransmitted FIN from the server during TIME_WAIT, the timer is restarted. On Linux, the 2MSL wait is 60 seconds.
13. What is TCP's keep-alive mechanism?
The keep-alive mechanism defines a time period during which if there is no activity related to the connection, TCP keep-alive will start to work. It sends a probe packet at regular intervals, which contains very little data. If several consecutive probe packets do not receive a response, the current TCP connection is considered dead, and the kernel of the system notifies the upper-layer application of the error.
14. What to do if a client fails after establishing a connection?
If keep-alive is enabled on the connection, this situation is handled by the TCP keep-alive mechanism, which is controlled by three parameters: the keep-alive time, the interval between probes, and the number of probes. With the Linux defaults (7200 seconds, 75 seconds, and 9 probes), if the client suddenly fails, the server needs 7200 + 75 × 9 = 7875 seconds, or 2 hours, 11 minutes, and 15 seconds, to determine that the connection is dead. These parameters can be set manually.
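The arithmetic and the per-socket settings can be sketched as follows. `SO_KEEPALIVE` is the standard option that switches keep-alive on. The options `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` are platform-dependent, so this sketch only sets them where Python's `socket` module exposes them. The function names are hypothetical.

```python
import socket


def keepalive_detection_seconds(idle: int = 7200, interval: int = 75,
                                probes: int = 9) -> int:
    """Worst-case time to detect a dead peer: idle time plus all probes."""
    return idle + interval * probes


def enable_keepalive(sock: socket.socket, idle: int = 7200,
                     interval: int = 75, probes: int = 9) -> None:
    """Turn on keep-alive and, where supported, override its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are not available on every platform.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)
```

Calling `keepalive_detection_seconds(60, 10, 3)` shows how shorter settings reduce the detection time to 90 seconds.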
15. What is the relationship between TCP/IP protocol and Socket?
A common explanation of the relationship between Socket and the TCP/IP protocol found online goes as follows:
TCP/IP is just a protocol stack. Like the internal mechanisms of an operating system, it must be concretely implemented and must expose an interface to the outside. Just as the operating system provides standard programming interfaces such as Win32, TCP/IP needs to provide an interface for programmers who develop network applications, and that interface is the Socket programming interface.
Therefore, Socket and TCP/IP are not inherently tied to each other. The Socket programming interface was designed so that it could also be used with other network protocols. Socket simply makes the TCP/IP protocol stack convenient to use: it abstracts TCP/IP into a set of basic functions, such as Send and Listen.
16. What is a SYN attack?
TCP connection establishment requires a three-way handshake. Suppose an attacker forges a large number of SYN packets with different source IP addresses in a short period of time. For each SYN, the server creates a connection in the SYN_RCVD state and replies with SYN+ACK, but the final ACK never arrives. Over time, the server's SYN queue (the queue of half-open connections) fills up, and the server can no longer serve normal users.
17. How to prevent SYN attacks?
There are two common ways to mitigate SYN attacks:
- Tune the kernel parameters that control the queue sizes and what happens when a queue is full. For example, once the queue is full, the server can reply to new SYN packets with RST and drop the connection.
- Enable SYN cookies. Once the SYN queue is full, new SYN packets are not placed in the queue. Instead, the server computes a cookie value, sends it to the client as the sequence number of the SYN+ACK, and later uses it to verify that the client's ACK is legitimate.
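The cookie idea can be illustrated with a deliberately simplified sketch. It is not the algorithm used by any particular kernel: real implementations also encode information such as the MSS in the cookie. Here the cookie is an HMAC over the connection addresses, the client's initial sequence number, and a coarse time counter. The secret, the function names, and the parameters are all assumptions made for the example.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical server-side secret
SEQ_SPACE = 2 ** 32


def make_cookie(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                client_isn: int, counter: int) -> int:
    """Derive a 32-bit value to use as the server's initial sequence number."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{counter}"
    digest = hmac.new(SECRET, msg.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 ack_seq: int, ack_number: int, counter: int) -> bool:
    """Check the final ACK without any state stored for the connection.

    The client's ISN is recovered as ack_seq - 1, and a legitimate ACK
    acknowledges cookie + 1.
    """
    client_isn = (ack_seq - 1) % SEQ_SPACE
    cookie = make_cookie(src_ip, src_port, dst_ip, dst_port,
                         client_isn, counter)
    return ack_number == (cookie + 1) % SEQ_SPACE
```

Because the server can recompute the cookie from the ACK alone, it does not need to keep an entry in the SYN queue while it waits for the handshake to complete.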
18. TCP Server Socket Programming Flow
- Server initializes the socket and obtains the file descriptor.
- Server calls Bind to bind to the IP address and port.
- Server calls Listen to start listening.
- Server calls Accept to take an established client connection from the queue of completed connections.
- Server sends messages to the client using Send.
- Server receives messages from the client using Receive.
19. TCP Client Socket Programming Flow
- Client initializes the socket and obtains the file descriptor.
- Client calls Connect to connect to the server.
- After the connection is established, the client sends messages to the server using Send.
- Client receives messages from the server using Receive.
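The server and client flows can be tried together on one machine. The sketch below runs a one-shot server in a background thread on the loopback interface and has the client send a message and read the reply. The function names, the uppercase reply, and the use of `shutdown` to mark the end of the request are assumptions made for this example, not part of any required protocol.

```python
import socket
import threading


def recv_all(sock: socket.socket) -> bytes:
    """Read until the peer closes its sending direction."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def serve_once(listener: socket.socket) -> None:
    """Accept one client, read its request, reply in uppercase."""
    conn, _addr = listener.accept()       # take a completed connection
    with conn:
        request = recv_all(conn)          # Receive
        conn.sendall(request.upper())     # Send


def echo_upper_demo(message: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    listener.bind(("127.0.0.1", 0))       # Bind (port 0: any free port)
    listener.listen(5)                    # Listen
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket
    client.settimeout(5.0)
    try:
        client.connect(listener.getsockname())  # Connect
        client.sendall(message)                 # Send
        client.shutdown(socket.SHUT_WR)         # done sending, can still read
        return recv_all(client)                 # Receive
    finally:
        client.close()
        server.join(timeout=5.0)
        listener.close()
```

Calling `shutdown(socket.SHUT_WR)` sends a FIN while keeping the receiving direction open, which is the half-closed situation described in the section on closing a connection.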
20. What does the backlog parameter mean in Listen?
The Linux kernel maintains two queues:
- Incomplete connection queue (SYN queue): holds connections in the SYN_RCVD state, created when a SYN is received.
- Completed connection queue (Accept queue): holds connections in the ESTABLISHED state that have completed the three-way handshake and are waiting to be taken by Accept.
Since Linux kernel 2.2, the backlog parameter specifies the length of the Accept queue, that is, the queue of completed connections. The effective value is capped by the net.core.somaxconn kernel parameter.
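On Linux, a backlog larger than the system-wide limit `net.core.somaxconn` is silently reduced to that limit. The following sketch shows that relationship; the helper names are hypothetical, and reading `/proc/sys/net/core/somaxconn` only works on Linux, so a fallback value is returned elsewhere.

```python
def read_somaxconn(path: str = "/proc/sys/net/core/somaxconn",
                   default=None):
    """Return the system-wide accept queue limit, or default if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


def effective_backlog(requested: int, somaxconn: int) -> int:
    """The accept queue length is the smaller of the two values."""
    return min(requested, somaxconn)
```

For example, a server that calls Listen with a backlog of 1024 on a system where somaxconn is 128 gets an Accept queue length of 128.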