Discussion:
Steam's download CDNs - breaking bufferbloat and inbound policers
Jonathan Morton
2017-04-27 04:27:29 UTC
> I'm looking for any comments on Steam's game distribution download system, specifically how it defeats every bufferbloat management system I've used.
> It seems to push past inbound policers, exceeding them by about 40%. In practice, you must police Steam traffic to half your line rate before enough capacity remains to avoid packet loss, latency spikes, and so on. Obviously, that is too much bandwidth to reserve for practical use.
> Without any inbound control, you can expect very heavy packet loss and jitter. With fq_codel or sfq and the usually recommended 15% taken off the line rate, performance improves, but small flows, ping, and the like remain unacceptable.
> The behavior can be observed by downloading any free game on their platform. I'm trying to figure out how they've accomplished this and how to mitigate it. It runs 20 HTTP connections simultaneously, which is normally not an issue (20 simultaneous web downloads perform well under fq_codel with 15% of the bandwidth held in reserve).
Have you tried using Cake in its new ingress mode? It counts dropped packets against the shaper, so the load those packets already imposed on the link upstream of the shaper is accounted for. This is probably impossible to implement with a separate shaper and AQM (e.g. HTB + fq_codel).
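
To make the accounting difference concrete, here is a rough steady-state sketch. It is a toy model, not Cake's actual algorithm: it assumes the senders keep the shaper saturated and that the AQM drops a fixed fraction of arriving packets. The function name `link_load` and the numbers are purely illustrative.

```python
def link_load(shaper_rate: float, drop_fraction: float, count_drops: bool):
    """Return (delivered_rate, upstream_load) in the units of shaper_rate.

    Toy steady-state model (an assumption, not Cake's real code):
    - senders always offer enough traffic to keep the shaper busy;
    - the AQM drops `drop_fraction` of the packets that arrive.
    """
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    if count_drops:
        # Ingress-style accounting: dropped packets are charged to the shaper,
        # so everything that crossed the upstream link fits within shaper_rate.
        upstream = shaper_rate
        delivered = shaper_rate * (1.0 - drop_fraction)
    else:
        # Conventional accounting: only forwarded packets are charged, so the
        # dropped packets are extra load that the upstream link still carried.
        delivered = shaper_rate
        upstream = shaper_rate / (1.0 - drop_fraction)
    return delivered, upstream


if __name__ == "__main__":
    # Hypothetical 100 Mbit/s line shaped to 85 Mbit/s, with 20% of arrivals dropped.
    for mode in (False, True):
        delivered, upstream = link_load(85.0, 0.2, count_drops=mode)
        print(f"count_drops={mode}: delivered={delivered:.1f} upstream={upstream:.1f}")
```

In this model, with drops uncounted the upstream load works out to about 106 on a shaper set to 85, which is more than a 100-unit line can carry, so the real bottleneck queue fills anyway. With drops counted, the upstream load stays at 85 at the cost of lower goodput.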

I’ve also found that Steam responds well to ECN, at least on my local instances, so you should turn ECN on fully on your end-hosts. I would dearly love to see ECN on by default, but so far only Apple (of all vendors!) has shown that level of courage.
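
For Linux end-hosts, a small sketch like the following can report whether ECN is fully on. It assumes the standard `net.ipv4.tcp_ecn` sysctl, where 1 means ECN is both requested and accepted; other operating systems use different knobs. The helper names are illustrative only.

```python
from pathlib import Path

# Meanings of the Linux net.ipv4.tcp_ecn sysctl values.
TCP_ECN_MODES = {
    0: "disabled: ECN is neither requested nor accepted",
    1: "full: ECN is requested on outgoing connections and accepted on incoming ones",
    2: "passive: ECN is accepted on incoming connections but not requested on outgoing ones",
}


def describe_tcp_ecn(value: int) -> str:
    """Translate a tcp_ecn sysctl value into a human-readable description."""
    return TCP_ECN_MODES.get(value, f"unknown value {value}")


def read_tcp_ecn(path: str = "/proc/sys/net/ipv4/tcp_ecn"):
    """Return the current tcp_ecn value, or None if it cannot be read (e.g. not Linux)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_tcp_ecn()
    if current is None:
        print("tcp_ecn sysctl not available on this host")
    else:
        print(f"tcp_ecn={current}: {describe_tcp_ecn(current)}")
```

If this reports anything other than the "full" mode, setting the sysctl to 1 (as root) is what turning ECN on fully means on Linux.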

- Jonathan Morton
Tristan Seligmann
2017-04-27 18:11:22 UTC
> The behavior can be observed by downloading any free game on their
> platform. I'm trying to figure out how they've accomplished this and how to
> mitigate it. It runs 20 HTTP connections simultaneously, which is normally
> not an issue (20 simultaneous web downloads perform well under fq_codel
> with 15% of the bandwidth held in reserve).
I think it starts new connections quite often as it hops between content
delivery nodes to find the fastest ones; perhaps this contributes to the problem?