Discussion:
Fwd: [Bug 1436945] Re: devel: consider fq_codel as the default qdisc for networking
Jan Ceuleers
2018-05-24 15:38:34 UTC
Took 3 years after Dave approached them, but Ubuntu is finally adopting
fq_codel as the default qdisc.


-------- Forwarded Message --------
Subject: [Bug 1436945] Re: devel: consider fq_codel as the default qdisc
for networking
Date: Thu, 24 May 2018 14:50:09 -0000
From: Laurent Bonnaud <***@laposte.net>
Reply-To: Bug 1436945 <***@bugs.launchpad.net>
To: ***@computer.org

I also see fq_codel used as default:

# cat /proc/sys/net/core/default_qdisc
fq_codel

# ip addr
[...]
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state
UP group default qlen 1000
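
For anyone whose kernel or distribution doesn't do this yet, it can be
set by hand; a rough sketch (the sysctl.d file name is arbitrary and
eth0 is a placeholder):

# apply immediately, and persist across reboots
sysctl -w net.core.default_qdisc=fq_codel
echo 'net.core.default_qdisc = fq_codel' > /etc/sysctl.d/90-fq_codel.conf

# the default only applies to qdiscs created afterwards, so replace the
# root qdisc on existing interfaces (or just reboot)
tc qdisc replace dev eth0 root fq_codel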


** Changed in: linux (Ubuntu)
Status: Confirmed => Fix Released

--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1436945

Title:
devel: consider fq_codel as the default qdisc for networking

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1436945/+subscriptions
Rich Brown
2018-05-24 16:31:55 UTC
> On May 24, 2018, at 12:00 PM, bloat-***@lists.bufferbloat.net wrote:
>
> Took 3 years after Dave approached them, but Ubuntu is finally adopting
> fq_codel as the default qdisc.

And I was sorry to have missed the SIXTH anniversary of fq_codel on Monday, 14 May 2018.

I have always been aware that engineering projects have inertia, but it's fascinating to see how that inertia plays out in the real world, even when the technology has such a potential to benefit everyone.

Happy Sixth Birthday! Take a moment to pat yourselves on the back.

Rich
Jonathan Morton
2018-06-05 17:24:22 UTC
> On 5 Jun, 2018, at 6:10 pm, Jonathan Foulkes <***@jonathanfoulkes.com> wrote:
>
> Jonathan, in the past the recommendation was for NOECN on egress if capacity <4Mbps. Is that still the case in light of this?

I would always use ECN, no exceptions - unless the sender is using a TCP congestion control algorithm that doesn't support it (e.g. BBR currently). That's true for both fq_codel and Cake.

With ECN, codel's action doesn't drop packets; it only gets the sender to shrink its congestion window.
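
In knob terms that amounts to roughly the following (eth0 is a placeholder; fq_codel marks ECN-capable flows by default anyway):

# end hosts: 1 = request and accept ECN; the usual default of 2 only
# accepts it when the peer asks
sysctl -w net.ipv4.tcp_ecn=1

# qdisc side: fq_codel takes ecn / noecn; noecn was the old low-bandwidth
# advice, but I'd leave marking on
tc qdisc replace dev eth0 root fq_codel ecn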

- Jonathan Morton
Mario Hock
2018-06-06 08:55:27 UTC
On 06.06.2018 at 10:15, Sebastian Moeller wrote:
> Well, sending a packet incurs serialization delay for all queued-up packets, so not sending a packet reduces the delay for all packets that are sent by exactly the serialization delay. If egress bandwidth is precious (so when it is congested and low in comparison with the amount of data that should be sent), resorting to congestion signaling by dropping seems okay to me, as that immediately frees up a "TX-slot" for another flow.

If the packet is dropped and the "TX-slot" is freed up, two things can
happen:

1. The next packet belongs to the same flow. In this case, a TCP flow
has no benefit because head-of-line blocking occurs until the packet is
retransmitted. (This might be different for loss-tolerant
latency-sensitive UDP traffic, though.)

2. The next packet belongs to another flow. Obviously, this flow would
benefit. However, the decision about which flow should be served next
should be made by the scheduler, not by the dropper. (In the case of
scheduler/dropper combinations, such as fq_codel.)

Best, Mario Hock
Jonathan Morton
2018-06-05 07:49:30 UTC
> On 5 Jun, 2018, at 10:44 am, Mario Hock <***@kit.edu> wrote:
>
> Just to make sure that I got your answer correctly. The benefit for endsystems comes from the "fq" (flow queuing) part, not from the "codel" part of fq_codel?

That's a fair characterisation, yes.

In fact, even for middleboxes, the "flow isolation" semantics of FQ have the most impact on reducing inter-flow induced latency. The "codel" part (AQM) helps with intra-flow latency, which is usually much less important once flow isolation is in place, but is still worth having.
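
The two mechanisms can also be seen separately in the qdisc statistics, e.g. (interface name is a placeholder):

tc -s qdisc show dev eth0
# the codel/AQM side shows up in the "dropped" and "ecn_mark" counters,
# the flow-queuing side in "new_flow_count" and the new/old flow lists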

- Jonathan Morton
Mario Hock
2018-06-05 11:01:57 UTC
On 05.06.2018 at 09:49, Jonathan Morton wrote:
>> On 5 Jun, 2018, at 10:44 am, Mario Hock <***@kit.edu> wrote:
>>
>> Just to make sure that I got your answer correctly. The benefit for endsystems comes from the "fq" (flow queuing) part, not from the "codel" part of fq_codel?
>
> That's a fair characterisation, yes.
>
> In fact, even for middleboxes, the "flow isolation" semantics of FQ have the most impact on reducing inter-flow induced latency. The "codel" part (AQM) helps with intra-flow latency, which is usually much less important once flow isolation is in place, but is still worth having.

Thanks for the confirmation.

A potential drawback of using the codel part (of fq_codel) in the
endsystems is that it can already cause packet drops at the sender.

I could actually confirm this assumption with a very simple experiment
consisting of two servers connected over a 1Gbit/s link and 100 parallel
flows (iperf3). With fq_codel I had 5,000-10,000 retransmissions within
60s. With fq (or pfifo_fast) no packets were dropped. (I presume either
"TCP small queues" or backpressure keeps the queues from overflowing.)

Also, ping times (delays for short flows) were similar with fq and
fq_codel (mostly <= 1ms).
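
A rough sketch of how such a run can be set up (the address and
interface below are placeholders, not necessarily what was used here):

tc qdisc replace dev eth0 root fq_codel   # repeat with: fq, pfifo_fast
iperf3 -s                                 # on the receiving host
iperf3 -c 192.0.2.1 -P 100 -t 60          # 100 parallel flows for 60 s
# iperf3's client summary lists retransmissions per stream; system-wide:
nstat -az TcpRetransSegs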

Is there any advantage of using fq_codel over fq at the endsystems?

Mario Hock
Dave Taht
2018-08-16 21:08:31 UTC
On Tue, Jun 5, 2018 at 12:49 AM Jonathan Morton <***@gmail.com> wrote:
>
> > On 5 Jun, 2018, at 10:44 am, Mario Hock <***@kit.edu> wrote:
> >
> > Just to make sure that I got your answer correctly. The benefit for endsystems comes from the "fq" (flow queuing) part, not from the "codel" part of fq_codel?
>
> That's a fair characterisation, yes.
>
> In fact, even for middleboxes, the "flow isolation" semantics of FQ have the most impact on reducing inter-flow induced latency. The "codel" part (AQM) helps with intra-flow latency, which is usually much less important once flow isolation is in place, but is still worth having.

So, Jonathan, this portion of the debate leaked over into
https://github.com/systemd/systemd/issues/9725

And I lost a great deal of hair over it. The codel portion is well
worth it on "end-systems".

>
> - Jonathan Morton
>
> _______________________________________________
> Bloat mailing list
> ***@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Mikael Abrahamsson
2018-06-06 04:14:54 UTC
On Tue, 5 Jun 2018, Jonas Mårtensson wrote:

>> What about PLPMTU? Do you think they might tweak that too?
>>
>> net.ipv4.tcp_mtu_probing=2
>> (despite name, applies to IPv6 too)
>
>
> Maybe, suggest it on their github. But I would maybe propose instead
> net.ipv4.tcp_mtu_probing=1.

MTU probing would be awesome. I am a great fan of PLPMTU and this
should be default-on everywhere in all protocols.
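
For reference, the knob and its values (the sysctl.d file name is
arbitrary):

# 0 = off, 1 = probe only after a black hole is detected, 2 = always probe
sysctl -w net.ipv4.tcp_mtu_probing=1
echo 'net.ipv4.tcp_mtu_probing = 1' > /etc/sysctl.d/90-plpmtud.conf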

--
Mikael Abrahamsson email: ***@swm.pp.se
Simon Iremonger (bufferbloat)
2018-06-07 12:56:55 UTC
>> What about PLPMTU? Do you think they might tweak that too?
>> net.ipv4.tcp_mtu_probing=2
>> (despite name, applies to IPv6 too)
>
> Maybe, suggest it on their github. But I would maybe propose instead
> net.ipv4.tcp_mtu_probing=1.


OK, this needs to become *organized* now...

What about putting explanations of the above into the
bufferbloat-wiki?

https://github.com/tohojo/bufferbloat-net/


What are the risks?
What are the advantages?

Are there other flags worth changing?
Can somebody who knows more help with checking on the state of BBR
congestion control? (A quick starting point is sketched below.)

Which of these changes depend on particular Linux versions, and which
MUST NOT be applied without a particular kernel version?
etc...
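
As a quick starting point for the BBR question (these are the standard
upstream knob and module names, as far as I know):

# what the running kernel offers and currently uses
sysctl net.ipv4.tcp_available_congestion_control
sysctl net.ipv4.tcp_congestion_control

# BBR needs the tcp_bbr module (kernel 4.9+); before 4.13 it also wants
# the fq qdisc for pacing
modprobe tcp_bbr
sysctl -w net.ipv4.tcp_congestion_control=bbr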


Can somebody help make pull requests archiving "old" areas of the
bufferbloat wiki, e.g. old changes that aren't relevant any more given
Linux 3.2+ and so forth?

https://github.com/tohojo/bufferbloat-net/



--Simon
Jonathan Morton
2018-06-05 19:31:30 UTC
> On 5 Jun, 2018, at 9:34 pm, Sebastian Moeller <***@gmx.de> wrote:
>
> The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…

Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.

> …and every packet being transferred will increase the queued packets delay by its serialization delay.

This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.

May I remind you that Cake never drops the last packet in a flow subqueue due to AQM action, but may still apply an ECN mark to it. That's because dropping a tail packet carries a risk of incurring an RTO before retransmission occurs, rather than "only" an RTT delay. Both RTO and RTT are always greater than the serialisation delay of a single packet.

Which is why ECN remains valuable even on very low-bandwidth links.

- Jonathan Morton
Sebastian Moeller
2018-06-06 06:53:05 UTC
Hi Jonathan,



> On Jun 5, 2018, at 21:31, Jonathan Morton <***@gmail.com> wrote:
>
>> On 5 Jun, 2018, at 9:34 pm, Sebastian Moeller <***@gmx.de> wrote:
>>
>> The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…
>
> Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.

Well, but look at it that way: the longer the traversal path after the cake instance, the higher the probability that the packet gets dropped by a later hop. So on ingress we have in all likelihood already passed the main bottleneck (but beware of the local WLAN quality), while on egress most of the path is still ahead of us.

>
>> …and every packet being transferred will increase the queued packets delay by its serialization delay.
>
> This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.

I am not sure that I am getting your point; at 0.5 Mbps every full-MTU packet will hog the line for 20+ milliseconds, so all other flows will incur at least that 20+ ms additional latency; this is independent of the inter- or intra-flow perspective, no?
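
(For the record, the arithmetic behind that number:)

# one full-MTU packet at 0.5 Mbit/s:
# 1500 byte * 8 bit/byte = 12000 bit; 12000 bit / 500000 bit/s = 24 ms
echo 'scale=3; 1500*8*1000 / 500000' | bc    # -> 24.000 (ms)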

>
> May I remind you that Cake never drops the last packet in a flow subqueue due to AQM action, but may still apply an ECN mark to it.

I believe this not dropping is close to codel's behavior?

> That's because dropping a tail packet carries a risk of incurring an RTO before retransmission occurs, rather than "only" an RTT delay. Both RTO and RTT are always greater than the serialisation delay of a single packet.

Thanks for the elaboration; clever! But dropping a packet will instantaneously free bandwidth for other flows, independent of whether the sender has already realized that fact; sure, the flow with the dropped packet will not recover from the loss as smoothly as it would with ECN signaling, but that is not the vantage point from which I am looking at the issue here.

>
> Which is why ECN remains valuable even on very low-bandwidth links.

Well, I guess I should revisit that and try to get some data at low bandwidths, but my hunch still is that
>
> - Jonathan Morton
>
Jonathan Morton
2018-06-06 13:04:59 UTC
>>> The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…
>>
>> Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.
>
> Well, but look at it that way, the longer the traversal path after the cake instance the higher the probability that the packet gets dropped by a later hop.

That's only true in case Cake is not at the bottleneck, in which case it will only have a transient queue and AQM will disengage anyway. (This assumes you're using an ack-clocked protocol, which TCP is.)

>>> …and every packet being transferred will increase the queued packets delay by its serialization delay.
>>
>> This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.
>
> I am not sure that I am getting your point…

Evidently. You've been following Cake development for how long, now? This is basic stuff.

> …at 0.5 Mbps every full-MTU packet will hog the line for 20+ milliseconds, so all other flows will incur at least that 20+ ms additional latency; this is independent of the inter- or intra-flow perspective, no?

At the point where the AQM drop decision is made, Cake (and fq_codel) has already decided which flow to service. On a bulk flow, most packets are the same size (a full MTU), and even if the packet delivered is the last one presently in the queue, probably another one will arrive by the time it is next serviced - so the effect of the *flow's* presence remains even into the foreseeable future.

So there is no effect on other flows' latency, only subsequent packets in the same flow - and the flow is always hurt by dropping packets, rather than marking them.

- Jonathan Morton
Dave Taht
2018-06-12 06:39:09 UTC
"So there is no effect on other flows' latency, only subsequent
packets in the same flow - and the flow is always hurt by dropping
packets, rather than marking them."

Disagree. The flow being dropped from will reduce its rate in an rtt,
reducing the latency impact on other flows.

I regard an ideal queue length as 1 packet or aggregate, as "showing"
all flows the closest thing to the real path rtt. You want to store
packets in the path, not buffers.

ecn has mass. It is trivial to demonstrate an ecn marked flow starving
out a non-ecn flow, at low rates.
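
A sketch of one such setup (rates, addresses and interface are
placeholders; fq_codel behind a shaper works for this as well as cake):

tc qdisc replace dev eth0 root cake bandwidth 1mbit   # low-rate bottleneck
sysctl -w net.ipv4.tcp_ecn=1   # on the first sending host
sysctl -w net.ipv4.tcp_ecn=0   # on the second sending host
# run one long flow from each host through the bottleneck and compare
iperf3 -c 192.0.2.1 -t 120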

On Wed, Jun 6, 2018 at 6:04 AM, Jonathan Morton <***@gmail.com> wrote:
>>>> The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…
>>>
>>> Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.
>>
>> Well, but look at it that way, the longer the traversal path after the cake instance the higher the probability that the packet gets dropped by a later hop.
>
> That's only true in case Cake is not at the bottleneck, in which case it will only have a transient queue and AQM will disengage anyway. (This assumes you're using an ack-clocked protocol, which TCP is.)
>
>>>> …and every packet being transferred will increase the queued packets delay by its serialization delay.
>>>
>>> This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.
>>
>> I am not sure that I am getting your point…
>
> Evidently. You've been following Cake development for how long, now? This is basic stuff.
>
>> …at 0.5 Mbps every full-MTU packet will hog the line for 20+ milliseconds, so all other flows will incur at least that 20+ ms additional latency; this is independent of the inter- or intra-flow perspective, no?
>
> At the point where the AQM drop decision is made, Cake (and fq_codel) has already decided which flow to service. On a bulk flow, most packets are the same size (a full MTU), and even if the packet delivered is the last one presently in the queue, probably another one will arrive by the time it is next serviced - so the effect of the *flow's* presence remains even into the foreseeable future.
>
> So there is no effect on other flows' latency, only subsequent packets in the same flow - and the flow is always hurt by dropping packets, rather than marking them.
>
> - Jonathan Morton
>
> _______________________________________________
> Bloat mailing list
> ***@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Dave Taht
2018-06-12 06:47:26 UTC
As for the tail loss/RTO problem: it doesn't happen unless we are
already in a drop state for a queue, it doesn't happen very often, and
when it does, it seems like a good idea to me to back off that
thoroughly in the face of so much congestion.

fq_codel originally never dropped the last packet in the queue, which
led to a worst-case latency of 1024 * MTU at the link bandwidth. That
got fixed and I'm happy with the result. I honestly don't know what
cake does anymore, except that Jonathan rarely tests at real RTTs,
where the amount of data in the pipe is a lot more than what's sane to
have queued, whereas I almost always test with realistic path delays.

It would be good to resolve this debate in some direction one day,
perhaps by measuring utilization > 0 on a wide range of tests.
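
One possible way to gather that kind of data (a sketch; the netperf
server name is a placeholder):

# flent's rrul test, once with ECN and once without, same path each time
sysctl -w net.ipv4.tcp_ecn=1
flent rrul -l 60 -H netperf.example.org -t ecn-on
sysctl -w net.ipv4.tcp_ecn=0
flent rrul -l 60 -H netperf.example.org -t ecn-off
# compare throughput and latency across the resulting *.flent.gz files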

On Mon, Jun 11, 2018 at 11:39 PM, Dave Taht <***@gmail.com> wrote:
> "So there is no effect on other flows' latency, only subsequent
> packets in the same flow - and the flow is always hurt by dropping
> packets, rather than marking them."
>
> Disagree. The flow being dropped from will reduce its rate in an rtt,
> reducing the latency impact on other flows.
>
> I regard an ideal queue length as 1 packet or aggregate, as "showing"
> all flows the closest thing to the real path rtt. You want to store
> packets in the path, not buffers.
>
> ecn has mass. It is trivial to demonstrate an ecn marked flow starving
> out a non-ecn flow, at low rates.
>
> On Wed, Jun 6, 2018 at 6:04 AM, Jonathan Morton <***@gmail.com> wrote:
>>>>> The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…
>>>>
>>>> Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.
>>>
>>> Well, but look at it that way, the longer the traversal path after the cake instance the higher the probability that the packet gets dropped by a later hop.
>>
>> That's only true in case Cake is not at the bottleneck, in which case it will only have a transient queue and AQM will disengage anyway. (This assumes you're using an ack-clocked protocol, which TCP is.)
>>
>>>>> …and every packet being transferred will increase the queued packets delay by its serialization delay.
>>>>
>>>> This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.
>>>
>>> I am not sure that I am getting your point…
>>
>> Evidently. You've been following Cake development for how long, now? This is basic stuff.
>>
>>> …at 0.5 Mbps every full-MTU packet will hog the line for 20+ milliseconds, so all other flows will incur at least that 20+ ms additional latency; this is independent of the inter- or intra-flow perspective, no?
>>
>> At the point where the AQM drop decision is made, Cake (and fq_codel) has already decided which flow to service. On a bulk flow, most packets are the same size (a full MTU), and even if the packet delivered is the last one presently in the queue, probably another one will arrive by the time it is next serviced - so the effect of the *flow's* presence remains even into the foreseeable future.
>>
>> So there is no effect on other flows' latency, only subsequent packets in the same flow - and the flow is always hurt by dropping packets, rather than marking them.
>>
>> - Jonathan Morton
>>
>> _______________________________________________
>> Bloat mailing list
>> ***@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>
>
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
Dave Taht
2018-08-11 19:17:01 UTC
In revisiting this old thread, in light of this,

https://github.com/systemd/systemd/issues/9725

and my test results of cake with and without ecn under big loads... I
feel as though I'm becoming a pariah for favoring queue length
management by dropping packets! On bufferbloat.net! cake used to drop
ecn-marked packets at overload, and I'm seeing enormous differences in
queue depth with and without ecn (on one test at 100mbit, 10ms queues
vs 30ms) - more details later.

Now, some of this is that cubic tcp is just way too aggressive and
perhaps some mods to it have arrived in the last 5 years that make it
even worse... so I'm going to go do a bit of testing with osx's
implementation
in particular. The ecn responses laid out in the original rfc were
against reno, a sawtooth, against iw2, and I also think that cwnd is
not decreasing enough nowadays in the first place.
Dave Taht
2018-08-13 22:29:22 UTC
On Tue, Jun 5, 2018 at 12:31 PM Jonathan Morton <***@gmail.com> wrote:
>
> > On 5 Jun, 2018, at 9:34 pm, Sebastian Moeller <***@gmx.de> wrote:
> >
> > The rationale for that decision still is valid, at low bandwidth every opportunity to send a packet matters…
>
> Yes, which is why the DRR++ algorithm is used to carefully choose which flow to send a packet from.
>
> > …and every packet being transferred will increase the queued packets delay by its serialization delay.
>
> This is trivially true, but has no effect whatsoever on inter-flow induced latency, only intra-flow delay, which is already managed adequately well by an ECN-aware sender.
>
> May I remind you that Cake never drops the last packet in a flow subqueue due to AQM action, but may still apply an ECN mark to it. That's because dropping a tail packet carries a risk of incurring an RTO before retransmission occurs, rather than "only" an RTT delay. Both RTO and RTT are always greater than the serialisation delay of a single packet.
>
> Which is why ECN remains valuable even on very low-bandwidth links.

I guess everybody knows at this point that I'm not a big fan of ecn.
I'd done a bit of work on making
"drop and mark" work in earlier versions of cake and completely missed
that it had got ripped out until a month or two back.

I'd like to point at this bit of codel, where it can, does, and will
do bulk dropping, and increase the drop schedule, even drop the last
packet in that queue, while overloaded, in an attempt to get things
back to the real rtt.

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/include/net/codel_impl.h#n176

Years ago, even on simple traffic, codel would spend 30% of its time
here. On the kinds of massive overloads for the path Roland has done,
it wouldn't surprise me if it was > 90%.

With ecn'd traffic, in this bit of code, we do not drain the excess
packets, nor do we increase the mark rate as frantically. I've always
felt this was a major flaw in codel's ecn handling, and have tried to
fix it in various ways.

Even pie starts dropping ecn-marked packets when the drop probability exceeds 10%.

> - Jonathan Morton
>
> _______________________________________________
> Bloat mailing list
> ***@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat



--

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619