What causes the randomness of internet speeds, even on Ethernet?
There are many factors that come into play in internet connections, almost too many. The fact is, we have been studying the internet for as long as it has existed, and multiple papers are published every year on traffic statistics, models, case studies of specific use cases, new protocols and protocol enhancements, census data, performance analyses, client characteristics, etc. Since our networks constantly change, their behaviour changes too. The traffic characteristics of DSL users today may not be the same in 5-10 years, and have changed a lot over the last 10-15.
Given the question is so broad, there are a lot of things to be said, so I'll stick with the most well-established ones.
- Typical DSL clients’ web traffic has a few interesting characteristics:
- your upstream bandwidth limits your downstream throughput (Charzinski, 2000)
- for some ISPs, there is an observed drop in performance early in the morning and late in the evening, and performance variability increases for all ISPs during peak hours (Sundaresan et al., 2011)
- last-mile latency and jitter are lower downstream than upstream, and losses are usually bursty - if you lose one packet, it's more likely you'll lose the next one, too
- excessive buffering tends to increase latency under load (a phenomenon known as "bufferbloat"), while insufficient buffering tends to increase jitter and loss
- there is no single best ISP for everyone
- The principal culprit behind the "randomness", or to put it differently the exhibited negative impact, is self-similarity. Self-similarity is the property of a time series (in this case) exhibiting the same characteristics at varying scales - for example, a series that shows bursty behaviour across a wide range of time scales. Surprisingly, this was shown for Ethernet as far back as the early 1990s by Leland et al., and Crovella & Bestavros showed its adverse effects on HTTP traffic.
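To make the self-similarity point concrete, here's a rough illustrative sketch (not a rigorous estimator): a variance-time check comparing independent noise against a crude heavy-tailed on/off series standing in for bursty traffic. The traffic model and all parameters are made up for illustration.

```python
import random

# Crude variance-time check: aggregate each series over blocks of m
# samples and watch how the variance of the block means decays. For
# independent noise it falls roughly like 1/m; for a bursty,
# heavy-tailed on/off series (a stand-in for self-similar traffic)
# it decays much more slowly.

def aggregate_variance(series, m):
    """Variance of the series averaged over non-overlapping blocks of m."""
    blocks = [sum(series[i:i + m]) / m for i in range(0, len(series) - m + 1, m)]
    mean = sum(blocks) / len(blocks)
    return sum((b - mean) ** 2 for b in blocks) / len(blocks)

random.seed(0)
N = 100_000
smooth = [random.random() for _ in range(N)]     # independent noise

bursty, on = [], False                           # heavy-tailed on/off bursts
while len(bursty) < N:
    length = min(int(random.paretovariate(1.2)) + 1, 5_000)
    bursty.extend([1.0 if on else 0.0] * length)
    on = not on
bursty = bursty[:N]

for m in (1, 10, 100):
    print(m, aggregate_variance(smooth, m), aggregate_variance(bursty, m))
```

The bursty series keeps a large share of its variance even after heavy aggregation, which is exactly the "bursty at every scale" behaviour the papers describe.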
I will stop here, as I realise I’ve spent about an hour trying to summarise this stuff in my head and only typed a handful about it, and I’ve left out huge amounts of literature on wireless & mobile networks, medium differences, caching, and network churn. If you have any specific questions ask away and I’ll do my best to answer or at least point you to relevant literature.
One factor is that when you are on the Internet you are typically using TCP. TCP continually tries to send more data. When the maximum is reached and the network is full, a loss event occurs. When it does, TCP cuts the amount of data it is trying to send: depending on the phase, it either drops back to almost nothing and grows exponentially (slow start), or cuts its window in half and grows linearly (congestion avoidance). This happens continually, so the speed always fluctuates.
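The sawtooth this produces can be sketched in a few lines. This is a toy model, not real TCP: an assumed link capacity plus the additive-increase/multiplicative-decrease rule only.

```python
# Toy AIMD model (not real TCP): the send window grows by 1 segment
# per round trip until it exceeds an assumed link capacity, then a
# loss event halves it, producing the familiar sawtooth.

CAPACITY = 100   # assumed link capacity, in segments per round trip

def aimd(rounds):
    cwnd, history = 1, []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > CAPACITY:              # queue overflowed: loss event
            cwnd = max(1, cwnd // 2)     # multiplicative decrease
        else:
            cwnd += 1                    # additive increase
    return history

trace = aimd(300)
# After the initial ramp, the window oscillates between about half of
# capacity and just past it, never settling at a single rate.
print(min(trace[150:]), max(trace[150:]))   # 50 101
```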
this is a good starting point: https://en.wikipedia.org/wiki/Network_congestion
congestion is probably the leading cause of the ‘randomness’ of network speeds. others have brought it up but didn’t really explain it too well.
network hardware has limitations just like any other hardware; there’s only a certain amount of traffic it can handle.
think of it like a funnel. you are capable of producing liquid at a rate of a liter per second. the funnel may be able to handle 10 liters per second. if you’re the only one using it, sweet, everything works great. but then 10 other people also start using the funnel at a liter per second. now the funnel is trying to handle 11 liters per second and things start backing up.
this is the basic idea behind distributed denial of service attacks. you have a bunch of connections that can produce liquid at a certain rate and you get enough of them to send everything at a single funnel until it gets backed up and eventually fails.
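the funnel analogy maps directly onto a toy queue simulation (numbers taken from the analogy, nothing more):

```python
# Toy queue simulation of the funnel analogy: each user pours
# 1 liter/second into a funnel that drains 10 liters/second.

def backlog(users, drain_rate=10.0, seconds=60):
    """Liters left queued in the funnel after `seconds` of pouring."""
    queued = 0.0
    for _ in range(seconds):
        queued += users * 1.0                    # everyone pours 1 L
        queued = max(0.0, queued - drain_rate)   # funnel drains what it can
    return queued

print(backlog(10))   # 0.0  -> the funnel keeps up
print(backlog(11))   # 60.0 -> backlog grows 1 L every second
```

at 10 users the funnel empties every second; at 11 the backlog grows without bound, which is exactly what congestion (or a DDoS) looks like.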
There are going to be many factors involved. The main ones are going to be the medium the signal is travelling through, the route, and network congestion.
Say you want to watch a YouTube video and the video won't load fast enough. Let's first assume your internet bandwidth is more than capable of handling the data.
The location of that video could be on the other side of the world. When a request comes from your computer to view that video it starts a complicated series of events. ***Very simplified version - a signal goes from your router to the ISP, from the ISP to their other routers, from those routers to other ISPs' routers, then to undersea cables, then to an international ISP, then through their network, then through other networks they have data agreements with, and finally to the server that is hosting the video. And that's just to start the process. To watch the entire YouTube video this exchange must happen back and forth, at close to the speed of light, until all the data has been received.
The cable you just used to watch that video could be thousands of miles long. It could pass through Turkey, which currently has a fibre break and is re-routing the traffic, causing congestion. It may have rained somewhere along those thousands of miles and filled one of the pits with water (a pit is a junction where the street cables meet), a ship's anchor may have just cut the undersea cable, a foreign submarine could be tapping that undersea cable and weakening the signal, etc. etc. etc. As you can imagine, there is a virtually unlimited number of things that can go wrong. Another big one which doesn't get much mention is human error. A lot of the slowness of your internet could be caused by somebody within your ISP pressing the wrong button on the keyboard. Oops, somebody within your ISP just tried to migrate a bunch of services to a shiny new piece of hardware and it didn't work out. Trust me, that happens all the time.
On the ISP side, utilization rates are making a comeback. In the early 90s, it was a big deal but tapered off with advances in data transfer. Now with all the streaming going on coupled with population growth, carrier over-utilization is a thing again. I work in management for a cable company. It doesn’t happen too often (maybe once every couple of months), but when it does…it’s catastrophic.
Congestion!!!
ISPs oversubscribe customers to any particular pipe all the time.
Let's say your apartment building has 40 apartments and the handoff to your building is only a 1 Gbps port. Your ISP wants to charge customers as much as possible, so they offer bandwidth tiers so large that you will likely never hit the maximum threshold. So you pay $100+ for 100 Mbps and boast to your friends about the alpha speeds you're getting. But realistically you won't come close to using even 1/3 of that bandwidth at any given time. You're only 1 apartment out of the 40 in your building, and others are doing the same thing - some are getting 75 Mbps, some 50 or 25 Mbps. Maybe that 1000 Mbps port is nominally sold out at 20 apartments, but when all 20 are online during peak usage times, their collective usage is only 30-50% of the port. The ISP could at this point install another access port at your building to provide bandwidth for the remaining 20 apartments who are still without internet, or they could note that the existing sold-out 1000 Mbps circuit has never passed the 50% utilization mark. So they end up selling the already-sold bandwidth to the rest of the building. It's only when everyone is on at the same time, accessing bandwidth-heavy content, that the 1000 Mbps pipe starts to get congested.
Look at it this way… Suppose the entrance to your apartment building has a door that can allow 10 people to walk through it standing shoulder to shoulder. At most only 3-4 people ever walk out together, but now imagine 40 people wanting to walk out of the building at the same time. The 10-person bottleneck will keep everyone from walking out at once. This is what congestion is. Some people will get out OK, others will have to wait a few seconds before they can get out, and even when they do get out they'll be squeezed in with others trying to pass through that door, slowing them down.
I think you mean rates, not speeds. The speed is near the speed of light, but the rate is the quantity of data you can receive or send over a specific time period.
In order for data to travel over the internet it needs to be broken up into predictable chunks, and labeled for delivery. This is a poor analogy, but think of the previous sentence like this:
[In order] - 1/15 destination reddit.com
[ for dat] - 2/15 destination reddit.com
[a to tra] - 3/15 destination reddit.com
[vel over] - 4/15 destination reddit.com
[ the inte] - 5/15 destination reddit.com
[rnet it ] - 6/15 destination reddit.com
… and so on.
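That chunking-and-labelling step can be sketched like this (a toy illustration, not a real protocol; `packetize`, the 8-byte chunk size, and the label format are all made up):

```python
# Toy illustration of packetization: split a message into fixed-size
# chunks, each tagged "sequence/total" plus a destination label.

def packetize(message, destination, chunk=8):
    total = -(-len(message) // chunk)   # ceiling division
    return [
        (message[i:i + chunk], f"{n + 1}/{total}", destination)
        for n, i in enumerate(range(0, len(message), chunk))
    ]

packets = packetize("In order for data to travel over the internet it needs "
                    "to be broken up into predictable chunks", "reddit.com")
print(packets[0])    # ('In order', '1/12', 'reddit.com')
print(len(packets))  # 12
```

Because every chunk carries its own sequence number and destination, the chunks can travel independently and be reassembled in order at the far end.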
All those packages have the same destination address, but the internet doesn’t care which delivery company you use to get them there, which map service they use to look up the reddit.com address, or which roads the trucks that take them there use. Much like any delivery service, there are a lot of variables in how things get done.
Is there a speed limit in Omaha? - This would be analogous to throttling or ‘rate’ limiting.
Is there a traffic jam near Chicago? - This could be a distributed denial of service attack forcing our driver to detour all the way down through Texas because nothing is getting through Chicago.
Are the roads just crappy in Pennsylvania? Yes they are. This could cause your driver to need to drive slower than previously anticipated.
Did the driver lose his map? (bad DNS service) - He might need to spend way longer trying to find out where he is going than you anticipated.
And then, once the package gets all the way to your building, maybe the mail room sucks and the delivery guy sits in line for half an hour waiting for someone to sign for it (server issues).
Let’s say FedEx doesn’t deliver to Miami, so it passes one or all of the packages off to another delivery service for delivery. It doesn’t have control over how they work, so it could be better or worse service, but at least the package gets there.
I think what you’re asking is why would internet have consistently slower rates for all sites and services when plugged into ethernet, and your answer in that case would almost always be this last one. Your local provider is over capacity or just sucks.
However, that answer varies if you’re talking about a single website, or single computer experiencing this on your local network, or any number of other variables.
For a really technical explanation:
Depending on what you use, it’s a combination of all of the following, to a more or lesser degree:
- All data transmission, at its lowest level, is subject to a degree of random noise and related degradation. The world is analog. Most transmission schemes use retransmission of data and alter the data rate on the fly to keep the link stable under varying noise conditions.
- A lot of communication protocols have inherent randomness (e.g. to prevent collisions when 2 users are on the same channel) and/or speed fluctuation (TCP sliding window, etc.).
- As soon as multiple users share one medium/data pipe, it needs to be multiplexed. This multiplexing is always rather coarse (never 1 bit at a time, usually 1 packet (64-1500 bytes) at a time).
- Both client and server perform most tasks sequentially. This is related to the aforementioned multiplexing, but on a system level. For example, if your Ethernet controller tells your CPU "feed me data", the CPU might respond immediately, or take a little longer because it was doing some other higher-priority task.
One good way to reduce this randomness in your home network is QoS, or Quality of Service. It's a method where the router/switch you're using limits the rate to something just below your connection's maximum, so that queues never build up and there's headroom for TCP's control traffic (the metadata of the actual transmission). This way everything gets a steady, stable speed and there's room for the devices on the network to keep communicating with each other alongside the data streams.
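One common way shapers implement this kind of rate limit is a token bucket. Here's a hedged sketch: the class, rates, and burst size are made up for illustration, not taken from any real router.

```python
# Illustrative token bucket: tokens drip in at the shaped rate and
# each packet spends tokens, capping average throughput at the shaped
# rate while still allowing short bursts.

class TokenBucket:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8         # refill rate, bytes per second
        self.capacity = burst_bytes      # maximum burst allowance
        self.tokens = burst_bytes

    def tick(self, seconds):
        """Refill tokens for elapsed wall-clock time."""
        self.tokens = min(self.capacity, self.tokens + self.rate * seconds)

    def allow(self, packet_bytes):
        """True if the packet may be sent now; otherwise it must wait."""
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False

# Shape to 80 Mbit/s on a nominal 100 Mbit/s link.
shaper = TokenBucket(rate_bps=80_000_000, burst_bytes=15_000)
print(shaper.allow(1500))   # True: the burst allowance covers it
```

Packets that don't fit simply wait for the bucket to refill, which is what converts bursty demand into a steady rate.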
Number one reason is peak period load. Basically, the more people using the Internet at one point in time in one area, the more bogged down that area gets. Most people do similar things at similar times, therefore it bogs down the local infrastructure. There are many other factors but this is literally 95% of the reason. Sure, packet loss and large outages account for some rerouting issues, but it is mostly just overuse of local infrastructure.
For example, when you look at your contract for your cable Internet you will see "speeds up to 100 Mbps!!!". In the fine print it will say "speeds of 5 Mbps guaranteed… shhhhhh". This is because you may have one local blade servicing 200 homes with a max of 5000 Mbps. Most of the time it can serve every home 100 Mbps, but at 7pm, when every home has 3+ people streaming video, playing games, and downloading files, you will start to see bottlenecks.
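The back-of-the-envelope math behind that fine print, using the figures above:

```python
# One blade serving 200 homes, each sold "up to 100 Mbps",
# behind a 5000 Mbps uplink.

homes = 200
sold_per_home = 100     # Mbps advertised per home
uplink = 5000           # Mbps the blade can actually move

total_sold = homes * sold_per_home
print(total_sold)              # 20000 Mbps sold against 5000 available
print(total_sold / uplink)     # 4.0x oversubscription
print(uplink / homes)          # 25.0 Mbps each if every home maxes out
```

The ISP is betting that homes never all peak at once; the 5 Mbps guarantee is simply a number low enough that even a worst-case evening can honour it.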
Because the whole system that gets data from your computer to a server and back again is so complex, there are many many opportunities for small things to affect your speed in unpredictable ways. Alone, most of these things are not random (an exception would be a cosmic ray hitting a router and causing it to drop one of your packets!) but the outcome is chaos.
Along the dozen or so links between you and a website, there might be one experiencing abnormally high load due to maintenance running on one of the routers, which consumes a lot of its processing power. This will cause it to drop some of your packets, which means your connection gets slower. This is not random; the maintenance may even have been planned, but you can't see it happening and it is compounded by a thousand other effects like weather, physical damage, heat, normal load from other users and so on. To the end user, it looks random.
However, a good-quality internet connection should not look that random, because generally most places that experience these variations have more capacity than your link between your computer and your ISP. That means that if a single routing point is facing abnormally high load, it will only reduce your speed by a tiny amount. If your speed varies hugely, it means something is wrong.
An internet connection isn't just your connection. Sure, you have a pipe to your ISP's neighborhood box, but from there everyone on the ISP uses the same "pipe" to connect out to the world. And then it flows out on even more shared "pipes" to its intended destination, and when communication is sent back it flows back on shared pipes once again.
The internet uses something called "packet switching", where all the links/lines (copper, fiber, RF, etc.) are always on, and data from all sources and destinations is broken up into pieces (packets and frames) and sent shuffled together, almost like playing cards, to share those links. The system was invented during the Cold War to replace the circuit-based phone system, in an effort to ensure command and control of military/government assets if parts of the phone/communication system were destroyed. Phone, fax, etc. connections were built as an electrical circuit. When the Pentagon kept an open circuit to a command center in another region of the country, if a city/exchange it passed through got nuked, the circuit would be broken and it could take minutes or hours to rebuild the circuit around the damage to restore communication. If you instead break the communication down into digital packets and send it out on a packet-switched network, the switching and routing protocols automatically find new routes for the data if the previous best path becomes unavailable, ensuring messages can reach their destinations.
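The "route around damage" idea can be sketched with a toy graph and breadth-first search. The topology and node names here are invented, and real routing protocols (OSPF, BGP) are far more sophisticated:

```python
from collections import deque

# Toy packet-switching illustration: find a route, then take a node
# down and watch traffic fall back to another path automatically.

def find_path(graph, start, goal, down=frozenset()):
    """Shortest hop-count path from start to goal avoiding down nodes."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen and nxt not in down:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None   # destination unreachable

net = {
    "pentagon": ["chicago", "atlanta"],
    "chicago":  ["pentagon", "seattle"],
    "atlanta":  ["pentagon", "dallas"],
    "dallas":   ["atlanta", "seattle"],
    "seattle":  ["chicago", "dallas"],
}
print(find_path(net, "pentagon", "seattle"))                    # via chicago
print(find_path(net, "pentagon", "seattle", down={"chicago"}))  # reroutes via dallas
```

The fallback route is longer, and that extra hop count is one reason throughput and latency shift when the network reroutes around trouble.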
As this government network evolved (ARPANET/CSNET/NSFNET) it became evident that it could be opened to the general public, and Al Gore sponsored the High Performance Computing Act of 1991, helping create the public internet we have today. Now, instead of ensuring nuclear missiles can be launched after an attack, it is used to view porn and order pizza. Periods of high traffic can be predicted, as humans are creatures of habit. But it can be random when the traffic peaks on a specific link or region, so the latency and throughput available from one destination to the next can vary from minute to minute or second to second outside of "peak hours" and traffic-triggering events.
Nobody really answered OP’s question, just a lot of responses about ISPs, congestion, QoS, etc.
Let’s say we have an internet of two computers, connected by Ethernet through a router or switch, and they communicate with each other and nothing else. There will be almost no randomness in speed. The link will operate at whatever the transfer limit of the slowest device is pretty much non-stop.
So basically, there’s no randomness. There are issues that occur with large scale networks just like any other large network. They are all predictable and explainable, but not random.
Depending on what you define by random speeds (some sites take longer to load than others? Download speeds are uneven?), there are several possible reasons:
- Not every server has its total resources available for you at any time, which means it may not be able to serve you at your full last-mile speed. For example, if a site has a lot of visitors, its resources (bandwidth, CPU, memory and so on) may be at capacity, so it appears slow to you.
The same can be said when you're downloading something from a server which gets a lot of use.
- There may be congestion on the network (anywhere from the server to you, including your ISP/last-mile network), so packets take longer to reach you or are lost in transit, meaning either retransmission is needed or congestion control kicks in (part of TCP).
- Your computer may be overwhelmed, so it reduces the window size (flow control).
- Traffic shaping policies on the part of your ISP.
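On the flow-control point, the advertised window caps throughput through the classic bandwidth-delay relation. A quick illustrative calculation (the window and round-trip values are just examples):

```python
# Bandwidth-delay bound: a sender with W bytes of window can have at
# most one window in flight per round trip, so throughput is capped
# at W / RTT regardless of raw link speed.

def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput in bytes per second."""
    return window_bytes / rtt_seconds

# A classic 64 KiB window over a 50 ms round trip:
mbps = max_throughput(65536, 0.050) * 8 / 1e6
print(round(mbps, 1))   # 10.5 -> ~10.5 Mbit/s, however fast the link
```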
All the other answers are correct, but here's the simplest way it can be put: upstream traffic! There are a lot of factors that could play into what's making you fluctuate, but people in your home (local network), your neighborhood, your city, your state, etc., all the way up to the world, could be using more or less of what you are trying to use at the same time! That may be the same website, the same wifi router in your home, the same fibre line buried under your street, etc.
The reasons for the randomness will depend on the type of connection. Some people have already described some issues with tcp.
I’ll go a little more low level here.
So you have the operating system kernel (think of this as a piece of software which allows all of the components of your computer to communicate).
There's also CPU cache - think of this as extremely fast memory that is very limited (high-end CPUs these days will only have about 40 MB). To give you a sense of the difference, it takes 1 clock cycle to access something in level 1 cache, whereas it can take up to 100 clock cycles to access something in main memory.
If you're running a basic setup, your system is likely experiencing a context switch on each packet received. This is basically when your CPU switches between two processes: it moves the existing state from cache to main memory and then fetches the other process's state from main memory into cache. This may also happen as a result of another process needing to do something on that same core.
The kernel will also move your process across different cores as it sees fit. Further, if you are on a multi-socket (more than 1 CPU) machine, the kernel may move it across physical CPUs. All of this creates overhead.
You may also be running a blocking process - this is when everything is halted until some event happens (a packet receive, in this case).
Again assuming a basic setup, whenever a packet hits your network card, the packet is copied by your kernel and then put into main memory.
Some things you can do to increase speed and reduce “randomness”:
Run a non-blocking process, pin the process to a specific core, give the process higher priority, move different flows to specific cores/queues, and use zero-copy if your card supports it.
With all of that you can probably get the standard deviation of your latency into single-digit microseconds.
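A minimal sketch of two of those knobs (core pinning and a non-blocking socket). This assumes Linux-specific APIs and is purely illustrative, not a tuned low-latency setup:

```python
import os
import socket

# Pin this process to CPU core 0 (Linux-specific scheduler API), so
# the kernel stops migrating it between cores.
os.sched_setaffinity(0, {0})

# A non-blocking UDP socket: recvfrom returns immediately instead of
# parking the process in the kernel until a packet arrives.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))
sock.setblocking(False)

try:
    data, addr = sock.recvfrom(2048)
except BlockingIOError:
    # Nothing queued yet; a real fast path would spin here or use epoll.
    print("no packet waiting; did not block")
sock.close()
```

A production low-latency receiver would combine this with higher scheduling priority and per-flow queue steering, as described above.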
The simpler answer has to do with traffic. Not unlike automobile traffic, some routes to/from certain places (websites) have more or less traffic but only a finite amount of bandwidth to provide it over. That’s why people host things on places like Amazon, Google, Akamai, Rackspace, etc. because their bandwidth is far larger than that of what a business might purchase to host things on their own network.
As others have already mentioned, congestion "causes the randomness of speeds on the internet". On a local Ethernet network that is not experiencing congestion, the speed should be consistent. Some factors that may cause inconsistent speeds on a local network include:
- using the wrong category of cable
- exceeding the maximum cable length
- an improperly terminated cable
- interference (aka noise) from external devices