Discussion:
tcp: smoother receiver autotuning
Dave Taht
2017-12-11 02:52:28 UTC
One of the hacks in the Android world has been to limit the receive
window. The patches Eric just submitted might have some impact on the
results reported here, years ago:

https://pdfs.semanticscholar.org/b293/ec57821a27bfb96d15cd11d8141e04610153.pdf


---------- Forwarded message ----------
From: Eric Dumazet <***@google.com>
Date: Sun, Dec 10, 2017 at 5:55 PM
Subject: [PATCH net-next 3/3] tcp: smoother receiver autotuning
To: "David S . Miller" <***@davemloft.net>, Neal Cardwell
<***@google.com>, Yuchung Cheng <***@google.com>, Soheil
Hassas Yeganeh <***@google.com>, Wei Wang <***@google.com>,
Priyaranjan Jha <***@google.com>
Cc: netdev <***@vger.kernel.org>, Eric Dumazet
<***@google.com>, Eric Dumazet <***@gmail.com>


Back in linux-3.13 (commit b0983d3c9b13 ("tcp: fix dynamic right sizing"))
I addressed the pressing issues we had with receiver autotuning.

But DRS suffers from extra latencies caused by drifts in
rcv_rtt_est.rtt_us. One common problem happens during slow start:
the apparent RTT measured by the receiver can be inflated by ~50%
at the end of a packet train.

Also, a single drop can delay read() calls by one RTT, meaning
tcp_rcv_space_adjust() can be called one RTT too late.

By replacing the tri-modal heuristic with a continuous function,
we can offset the effects of not growing 'at the optimal time'.

The curve of the function matches the prior behavior exactly at the
25% and 50% increase points.
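
For intuition, here is a standalone user-space sketch (not kernel code;
the sample values for rcvq_space.space and advmss are made up) comparing
the old tri-modal heuristic with the new continuous formula. It assumes
copied >= space, as in the growth case the heuristic was written for. At
exactly 25% and 50% growth the two agree; in between, the new formula
grows smoothly instead of not at all.

#include <stdint.h>
#include <stdio.h>

/* Old tri-modal heuristic from tcp_rcv_space_adjust() */
static uint64_t old_rcvwin(uint64_t copied, uint64_t space, uint64_t advmss)
{
	uint64_t rcvwin = (copied << 1) + 16 * advmss;

	if (copied >= space + (space >> 2)) {		/* >= 25% growth */
		if (copied >= space + (space >> 1))	/* >= 50% growth */
			rcvwin <<= 1;			/* rcvwin = 2 * rcvwin */
		else
			rcvwin += rcvwin >> 1;		/* rcvwin = 1.5 * rcvwin */
	}
	return rcvwin;
}

/* New continuous formula: rcvwin *= (1 + 2 * growth_ratio) */
static uint64_t new_rcvwin(uint64_t copied, uint64_t space, uint64_t advmss)
{
	uint64_t rcvwin = (copied << 1) + 16 * advmss;
	/* the kernel uses do_div() here for 64-bit division */
	uint64_t grow = rcvwin * (copied - space) / space;

	return rcvwin + (grow << 1);
}

int main(void)
{
	uint64_t space = 100000, advmss = 1460;
	uint64_t pcts[] = { 0, 10, 25, 50 };

	for (int i = 0; i < 4; i++) {
		uint64_t copied = space + space * pcts[i] / 100;

		printf("+%2llu%%: old=%llu new=%llu\n",
		       (unsigned long long)pcts[i],
		       (unsigned long long)old_rcvwin(copied, space, advmss),
		       (unsigned long long)new_rcvwin(copied, space, advmss));
	}
	return 0;
}

At +10% the old code would not grow the window at all, while the new
formula already grants a 20% increase, which is the "smoother" part of
the change.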

Cost of the added multiply/divide is small, considering a TCP flow
typically runs this part of the code only a few times in its lifetime.

I tested this patch on a 100 ms RTT / 1% loss link, with 100 runs
of (netperf -l 5), and got an average throughput of 4600 Mbit
instead of 1700 Mbit.

Signed-off-by: Eric Dumazet <***@google.com>
Acked-by: Soheil Hassas Yeganeh <***@google.com>
Acked-by: Wei Wang <***@google.com>
---
net/ipv4/tcp_input.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2900e58738cde0ad1ab4a034b6300876ac276edb..fefb46c16de7b1da76443f714a3f42faacca708d
100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -601,26 +601,17 @@ void tcp_rcv_space_adjust(struct sock *sk)
 	if (sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf &&
 	    !(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
 		int rcvmem, rcvbuf;
-		u64 rcvwin;
+		u64 rcvwin, grow;
 
 		/* minimal window to cope with packet losses, assuming
 		 * steady state. Add some cushion because of small variations.
 		 */
 		rcvwin = ((u64)copied << 1) + 16 * tp->advmss;
 
-		/* If rate increased by 25%,
-		 *	assume slow start, rcvwin = 3 * copied
-		 * If rate increased by 50%,
-		 *	assume sender can use 2x growth, rcvwin = 4 * copied
-		 */
-		if (copied >=
-		    tp->rcvq_space.space + (tp->rcvq_space.space >> 2)) {
-			if (copied >=
-			    tp->rcvq_space.space + (tp->rcvq_space.space >> 1))
-				rcvwin <<= 1;
-			else
-				rcvwin += (rcvwin >> 1);
-		}
+		/* Accommodate for sender rate increase (eg. slow start) */
+		grow = rcvwin * (copied - tp->rcvq_space.space);
+		do_div(grow, tp->rcvq_space.space);
+		rcvwin += (grow << 1);
 
 		rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
 		while (tcp_win_from_space(sk, rcvmem) < tp->advmss)
--
2.15.1.424.g9478a66081-goog



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619