Introduction
After many months of research, my own internal testing, and study of protocols such as TCP/IP, this article and the links included give a very deep analysis of "why ESO lags and what must be done to fix it." In my day job I'm a network administrator with over 10 years of experience in Cisco, Linux, Windows, and BSD based administration.
Some of my past articles on this site include:
PSA-Please Don't Use Port Forwarding for ESO
This post will be very technical in nature, and I HIGHLY recommend reading every article I link before posting comments, as they are required to understand the deep inner workings of the network, especially on the TCP side of things.
This post will primarily deal with the PC version of ESO.
The console versions of the game are very different, as Xbox Live and PSN have different requirements for how network connections and protocols work. I don't own either of those consoles, so I can't check whether they use the same protocols the PC version does.
What I'm typing is "trying" to break this down into simple terms that someone without a degree or long background in networking and protocols can understand, as the entire intent of this post is educational.
That being said, let's get started.
What Protocol does ESO use on PC
ESO mostly uses the TCP protocol on PC. As can be seen in the ESO Ports knowledge base article, ESO uses the following ports on PC:
- TCP / UDP Ports 24100 through 24131
- TCP / UDP Ports 24500 through 24507
- TCP / UDP Ports 24300 through 24331
- TCP Port 80
- TCP Port 433
ESO doesn't appear to use UDP on PC from what I can see, and appears to rely solely on TCP for its connections. The consoles may be different, as Xbox Live and PSN are more closed environments than PC and may have different network requirements.
Why is the use of TCP important?
The TCP protocol has network congestion control algorithms built into it. It was designed this way to keep a network from simply falling over under load. Why is this important?
It's important because of packet loss.
What is Packet Loss
Packet loss is when a packet sent from your network to a remote network does not reach its destination. It can be caused by a variety of factors: an overloaded router, a bad network cable or NIC, or a server that is overloaded and simply can't service any more requests (receive any more packets), so the packets are simply dropped.
Why is Packet Loss important to TCP
By default TCP assumes that packet loss is caused by "congestion". This means that if your client doesn't receive an acknowledgement within a set amount of time (the retransmission timeout), TCP considers the packet lost, cuts throughput, and works out which packet must be resent, all while keeping throughput reduced until it catches up. This means you can have spikes of 1000+ ms latency that last much longer than they would if another protocol, such as a custom UDP implementation with latency mitigation algorithms, were used. This is why engines such as Unreal Engine 3 do so well in the online gaming world.
A single lost packet is enough to first cause a 1000+ ms delay and then lengthen the subsequent round trips, which only slowly return to normal as TCP ramps the allowed throughput back up.
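To put rough numbers on that slow ramp, here is a minimal sketch of the classic "additive increase, multiplicative decrease" behaviour of TCP congestion avoidance. It is a toy model under my own assumptions: the window grows by one segment per round trip and is halved on a loss. The function name and the starting window are made up for illustration, and real stacks (and whatever ESO's servers actually run) use more sophisticated algorithms.

```python
def simulate_cwnd(rounds, loss_rounds, start=10.0):
    """Return the congestion window (in segments) after each round trip.

    Toy AIMD model (an assumption, not any specific TCP stack):
    +1 segment per loss-free round trip, window halved on a loss.
    """
    cwnd = start
    history = []
    for rtt in range(rounds):
        if rtt in loss_rounds:
            cwnd = max(1.0, cwnd / 2)  # multiplicative decrease on loss
        else:
            cwnd += 1.0  # additive increase per round trip
        history.append(cwnd)
    return history


# One loss in round trip 3 of 12.
history = simulate_cwnd(rounds=12, loss_rounds={3})
print(history)
```

In this toy run the window drops from 13 to 6.5 segments on the single loss and needs seven more round trips to get back above 13, which is the "only slowly return to normal" effect described above.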
What does the ESO ping indicator actually show when it's red?
The in-game ping indicator shows response time. When the indicator turns red, you don't have a 999+ ping; what you have is packet loss. This means the server simply can't handle any more of the requests being sent to it, and all it can do is drop the packet. Imagine overloading a Cisco router to 100% CPU utilization by flooding it with packets and then trying to reach its admin interface over the network...good luck...it simply can't handle any more, and packets are just dropped as there are no resources left to issue a response of any kind.
This is what happens in ESO and PVP when lag gets really bad...as more and more players push buttons for skills (sending requests to the server), the server simply can't handle any more. Since your skills won't fire until the client gets a response from the server, when things are really laggy you simply can't get anything to work...the 999+ red ping indicator is telling you about massive packet loss.
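The overload picture can be sketched as a server with a fixed-size request queue. This is only an illustration: the function name, the tick model, and all of the numbers are hypothetical, and I have no knowledge of how ZOS's servers actually queue requests.

```python
def simulate_overload(arrivals_per_tick, capacity_per_tick, queue_limit, ticks):
    """Count served and dropped requests for a server with a bounded queue.

    Hypothetical model: each tick, new requests are queued until the queue
    is full (the rest are dropped), then the server handles up to
    capacity_per_tick of the queued requests.
    """
    queued = served = dropped = 0
    for _ in range(ticks):
        accepted = min(arrivals_per_tick, queue_limit - queued)
        dropped += arrivals_per_tick - accepted  # no room left: packet is dropped
        queued += accepted
        handled = min(capacity_per_tick, queued)
        served += handled
        queued -= handled
    return served, dropped, queued


# Quiet night: 80 requests per tick against a capacity of 100.
print(simulate_overload(80, 100, 200, 10))
# Big PVP fight: 150 requests per tick against the same capacity.
print(simulate_overload(150, 100, 200, 10))
```

Under capacity nothing is dropped. Once arrivals exceed capacity the queue fills within a couple of ticks, and from then on every excess request is lost no matter how long the fight goes on.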
No amount of "LOS checks" and "changing ability animations" is going to fix this issue.
What will Fix ESO lag issues for good?
Removing the dependence on TCP, rewriting the game's network code on top of a RUDP (Reliable UDP) protocol, and layering in latency mitigation algorithms.
Example:
With TCP, the lag may never dissipate once it starts, as it actually requires more bandwidth to catch up. This is why restarting the connection corrects the issue.
A perfect example of this is when TCP gets a backlog stuck in its buffer. Say each packet contains the location of every other player on the map. When you are over bandwidth, losing such a packet is actually a massive advantage. With TCP you still need to read and process out-of-date information when suffering from packet loss. With UDP, the newest packet you receive always contains the most up-to-date information.
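A small sketch of that difference, assuming every packet carries a sequence number and a full snapshot of player positions. Both function names are my own, and the "latest state" receiver is a generic technique for UDP-based games, not something ESO is known to use.

```python
def in_order_delivery(arrivals):
    """TCP-like receiver: data is released only in sequence order.

    arrivals is a list of sequence numbers in the order they arrive.
    Returns, for each arrival, the packets the game can read at that moment.
    """
    next_expected = 0
    buffered = set()
    released = []
    for seq in arrivals:
        buffered.add(seq)
        batch = []
        while next_expected in buffered:  # release everything up to the next gap
            buffered.remove(next_expected)
            batch.append(next_expected)
            next_expected += 1
        released.append(batch)
    return released


def latest_state_delivery(arrivals):
    """UDP-style receiver: use any packet newer than the last one, drop stale ones."""
    newest = -1
    released = []
    for seq in arrivals:
        if seq > newest:
            newest = seq
            released.append([seq])
        else:
            released.append([])  # older than what we already have: discard
    return released


# Packet 1 is lost and its retransmission arrives after packets 2, 3 and 4.
arrivals = [0, 2, 3, 4, 1]
print(in_order_delivery(arrivals))
print(latest_state_delivery(arrivals))
```

With in-order delivery the game sees nothing new while packets 2, 3 and 4 wait behind the gap, then has to chew through four snapshots at once, three of them already stale. The latest-state receiver uses each snapshot as it arrives and simply throws the late retransmission away.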
The second part of this simple example is how you catch up again. The reason you're probably suffering from packet loss is drops caused by queues overflowing on a router because the link is overloaded. To recover, you now need to send the original data again plus the new data to catch up, so you are at a serious disadvantage.
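A back-of-the-envelope way to see why catching up can take so long: the backlog can only drain through whatever bandwidth is left over after the new data has been sent. The function name and the figures below are made up for illustration.

```python
import math


def seconds_to_catch_up(backlog_bytes, link_bytes_per_s, new_data_bytes_per_s):
    """Time needed to drain a backlog while new data keeps arriving.

    Only the spare capacity (link rate minus new data rate) is available
    for the backlog. With no spare capacity the backlog never drains.
    """
    spare = link_bytes_per_s - new_data_bytes_per_s
    if spare <= 0:
        return math.inf
    return backlog_bytes / spare


# 50 kB backlog on a 100 kB/s link that is already carrying 90 kB/s of new data.
print(seconds_to_catch_up(50_000, 100_000, 90_000))
# The same backlog when new data already fills the whole link.
print(seconds_to_catch_up(50_000, 100_000, 100_000))
```

In the first case the backlog takes 5 seconds to clear; in the second it never clears, which matches the lag that only goes away when the connection is restarted.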
As you can see, one lost or dropped packet just snowballs with TCP, leading to the 999+ lag you see in PVP and some PVE instances, due to the lack of algorithms (outside of standard TCP congestion controls) to address packet loss.
TCP is a Layer 4 protocol in the OSI Model. It's designed to treat latency problems as common network issues (meaning congestion controls), not to handle the issues associated with games.
The first step to fixing the lag issues with ESO is to rewrite the netcode using RUDP (Reliable UDP). This would allow ZOS to write their own latency control logic into the custom protocol, making it a mix of Layer 4 (UDP) with custom code built on top of it to handle these issues (Layer 7 functionality).
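To make "Reliable UDP" a bit more concrete, here is a minimal sketch of the bookkeeping such a protocol needs on the sending side. Everything in it is an assumption for illustration: the 3-byte header layout, the flag value, and the class name are invented, and there are no real sockets involved. The point is that the game, not the transport, decides which messages are worth resending.

```python
import struct

# Hypothetical header: 16-bit sequence number + 8-bit flags, network byte order.
HEADER = struct.Struct("!HB")
FLAG_RELIABLE = 0x01


def pack_packet(seq, reliable, payload):
    """Prefix the payload with the hypothetical header."""
    flags = FLAG_RELIABLE if reliable else 0
    return HEADER.pack(seq, flags) + payload


def unpack_packet(data):
    """Split a packet back into (seq, reliable, payload)."""
    seq, flags = HEADER.unpack_from(data)
    return seq, bool(flags & FLAG_RELIABLE), data[HEADER.size:]


class ReliableSender:
    """Tracks which reliable messages still need an acknowledgement."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = {}  # seq -> payload, reliable messages only

    def send(self, payload, reliable):
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % 65536  # wrap the 16-bit counter
        if reliable:
            self.unacked[seq] = payload  # keep a copy until it is acknowledged
        return pack_packet(seq, reliable, payload)

    def on_ack(self, acked_seq):
        self.unacked.pop(acked_seq, None)

    def pending_resends(self):
        return sorted(self.unacked)


sender = ReliableSender()
sender.send(b"position update", reliable=False)  # stale soon, never resent
packet = sender.send(b"skill cast", reliable=True)  # must arrive
print(unpack_packet(packet), sender.pending_resends())
sender.on_ack(1)
print(sender.pending_resends())
```

A position update that goes missing is simply replaced by the next one, while a skill cast stays in the resend list until the other side acknowledges it.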
The best example of the benefit: say one packet is dropped. Instead of having to resend 2 packets, the server would send you back one that contains all the data you need, which by itself would theoretically reduce stress on their systems by 50%.
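One way a custom protocol could do this is to build each update from everything that has changed since the last tick the client acknowledged. The sketch below is hypothetical: the function name, the tick numbers, and the data layout are mine, and the 50% figure above is not something this code measures.

```python
def build_update(history, last_acked_tick):
    """Merge every change newer than the client's last acknowledged tick.

    history maps tick number -> {entity: position} changes for that tick.
    Newer values overwrite older ones, so superseded positions are never
    resent.
    """
    merged = {}
    for tick in sorted(history):
        if tick > last_acked_tick:
            merged.update(history[tick])
    return merged


history = {
    1: {"playerA": (0, 0)},
    2: {"playerA": (1, 0), "playerB": (5, 5)},
    3: {"playerA": (2, 0)},
}

# The client acknowledged tick 1 and the tick 2 packet was lost.
print(build_update(history, last_acked_tick=1))
```

The client receives a single packet with playerA's newest position and playerB's position from the lost tick, instead of a resend of tick 2 followed by tick 3.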
Conclusion
I think the first step towards fixing ESO lag issues lies in the netcode and its reliance on TCP. I think a custom RUDP (Reliable UDP) implementation would go a long way towards reducing server overhead and making the game lag significantly less in high stress situations. TCP simply isn't the best choice for a game the scale of ESO, and this becomes woefully apparent on the PVP side of the game. FPS issues and the like on the client side can be addressed later down the pipe, and may not even need to be addressed at all if the underlying network code is moved to a RUDP implementation that lets it scale significantly higher than what can be achieved with TCP. Such a move would also give ZOS far more granular error and packet loss control options, allowing them to write truly robust netcode that can handle a large player base at scale.
Good luck, and thank you for your time.
Useful References
https://en.wikipedia.org/wiki/List_of_network_protocols_(OSI_model)#Layer_7_.28Application_Layer.29
https://en.wikipedia.org/wiki/OSI_model#Layer_4:_Transport_Layer
http://www.freebsd.org/cgi/man.cgi?query=polling
https://1024monkeys.wordpress.com/
https://1024monkeys.wordpress.com/2014/04/08/udp-vs-tcp-a-follow-up/