A serious question to those that know: Reason for such long server maintenance times?

dtm_samuraib16_ESO
✭✭✭✭
As per the question: why do these maintenances take so long nowadays?
Back in the days, simply restarting a server was enough, "back in 10 minutes" it was.
Now, 4+ hours...
Can someone explain please?

BTW: surf's up.
Earthdawn Game Master Role Play Quotes by me:
"If it looks like a bear, if it feels like a bear, smells and tastes like a bear, then be VERY aware, it could be something ENTIRELY different..."
"Be careful what you wish for, you might get plenty of it..."
  • DRXHarbinger
    ✭✭✭✭✭
    ✭✭
    Google IT app or hub rooms. Their equipment is a bit more complex than your router at home and far larger. Their app rooms are most likely larger than your house.

    They are most likely laid out in parallel rows with a terminal for each rack, and one person doing the job will take a good few hours to defrag it.
    PC Master Race

    1001CP
    8 Flawless Toons, all Classes.
    Master Angler
    Dro-M'artha Destroyer (at last)
    Tamriel Hero
    Grand Overlord
    Every Skyshard
    Down With BOP!
  • svartorn
    ✭✭✭✭✭
    Eh?

    I don't remember it ever being 10 minutes for any game ever. It's always been a couple hours to most of the day.
  • TheRealMcKoy
    ✭✭
    Perhaps they are also implementing infrastructure security patches, firewall and malware prevention and mitigation improvements...and if Microsoft has anything to do with them, there's your problem right there. They are probably running several maintenance scripts to keep file systems and user accounts clean, and are possibly downloading the system and performance logs they accumulate for use in guiding future patching. They may also be doing some validation testing before opening up to the masses because, believe it or not, they really would rather avoid botched go-lives or end users having a bad experience because something was misconfigured. Just my two cents.
  • Reorx_Holybeard
    ✭✭✭✭✭
    ESO maintenance has always taken 3-4 hours typically, sometimes much longer for serious maintenance. Besides the patch notes, they don't typically tell us everything that goes on behind the scenes, nor do we have many, if any, details on their server infrastructure.

    As a server admin I can only tell you my experience in general when I do server updates/maintenance. It can involve things like:
    • Backups first in case something bad happens.
    • Copy files/data (takes a while when there is GBs of it).
    • Upgrade one or more pieces of software. This can be a one line command or multiple steps.
    • Update databases. This can be a one line command but can take a while depending on the size and complexity of the update.
    • Compiling/building software. I would expect this occurs before the actual maintenance for ESO however.
    • Testing/double-checking that everything updated correctly and isn't broken.
    • Fixing things which did break during the update.
    • Rebooting and checking logs for any issues.

    So multiply that by 100 or 1,000, depending on how many servers they have running. I would expect they have some sort of infrastructure to test updates beforehand and push them to multiple servers. This doesn't eliminate the need to test the updates or deal with issues as they come up. I can tell you that fixing issues can easily turn a 10-minute maintenance job into a many-hour ordeal.
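
    To put rough numbers on the steps above, here is a minimal sketch of how per-server steps add up across a fleet. Every duration, the batch size, and the `window_minutes` helper are made-up assumptions for illustration only; nothing here reflects ZeniMax's actual setup.

```python
import math

# Hypothetical per-server step durations in minutes. None of these numbers
# come from ZeniMax; they only illustrate how the steps add up.
STEP_MINUTES = {
    "backup": 20,
    "copy_files": 15,
    "upgrade_software": 10,
    "update_database": 25,
    "test": 15,
    "reboot_and_check_logs": 10,
}


def window_minutes(servers, parallel, fix_minutes=0):
    """Estimate a maintenance window when servers are updated in batches.

    servers: total number of servers to update
    parallel: how many servers can be updated at the same time
    fix_minutes: extra time spent fixing whatever broke along the way
    """
    if servers < 0 or parallel < 1:
        raise ValueError("servers must be >= 0 and parallel must be >= 1")
    per_server = sum(STEP_MINUTES.values())
    batches = math.ceil(servers / parallel)
    return batches * per_server + fix_minutes


if __name__ == "__main__":
    for servers, parallel in [(1, 1), (100, 50), (1000, 250)]:
        total = window_minutes(servers, parallel)
        print(f"{servers} servers, {parallel} at a time: {total / 60:.1f} hours")
```

    Even with heavy parallelism, the window is bounded below by the number of batches times the per-server time, and any time spent fixing breakage is added on top.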
    Edited by Reorx_Holybeard on May 4, 2016 3:10PM
    Reorx Holybeard -- NA/PC
    Founder/Admin of www.uesp.net -- UESP ESO Guilds
    Creator of the "Best" ESO Build Editor
    I'm on a quest to build the world's toughest USB drive!
  • TastesAllColors
    ✭✭✭
    All computer programs are basically a database coupled with core program routines. When changes are made to the central game logic the integrity of the database must be verified. There are a lot of player characters in the database so it takes a long time to check them all.
  • Cronopoly
    ✭✭✭✭✭
    I'll answer, as I have 21 years of enterprise IT experience in a world-class data center that powered a top-5 retailer worldwide. This is just an example, as I do not know ZeniMax's architecture.

    In any AAA MMO system you have several enterprise-class components that are responsible for your session running smoothly.

    Firewalls: Enterprise systems must use them for protected DMZs (demilitarized zones, as they are called :) ), which protect all network traffic within, incoming and outgoing. The firewalls control data coming in and out by port and protocol to keep out unwanted traffic, hackers, etc. We used separate firewall zones for initial web ingress of traffic, a credit card zone, a database zone, and application zones, so that each type of data was protected separately. It is so much easier to put rules around it that way, though not simple. I'll skip the talk of state tables and NATs getting full...

    Most enterprise-level firewalls are configured in pairs with a crossover cable to make sure that if one side fails, the other can take the traffic seamlessly without users knowing a failure occurred. (Redundancy is huge.)

    Firewall maintenance windows for vendor updates (security etc.) are required unless you want to be hacked sooner rather than later. This can sometimes take the firewall team hours, as they do one firewall pair at a time. Firewalls are most finicky when updating and sometimes require rebooting several times to become stable. And a leading vendor (Check Point) being in Israel makes this a pain if you have to contact them across time zones.

    Application Servers / HTTP Servers: Many applications use app servers to host the user's session. In the case of an MMO, these might hold the state of a user's session, with mirrored variables for the user's location, stats, all actions, etc.

    App servers need required security and O/S patches, which can take hours to apply (shutdown, patching, startup). They are also servers that get "dirty" over time, especially with code that leverages JVMs (Java), which can get corrupted over time and have frequent memory problems. Spelled: pain in the butt. Required weekly reboots are common to prevent a lot of problems. With an MMO there are many, many servers to reboot and double-check that they come back up clean, as users are pretty unforgiving if something is not fully working after maintenance...

    Network: Typically the most stable part of an enterprise once set up correctly. However, more and more exploits target Internet-facing routers with new vulnerabilities that, you guessed it, have to be security/firmware patched.

    ACE/VPN concentrators: Commonly used to provide a secure tunnel of connected users from the Internet to datacenter backend systems. These must always get patched like everything else and care taken to deprecate old protocols that over time always have vulnerabilities pop up. TLS etc...

    One network component that does have a higher failure rate (and more time spent fixing it) is the load balancer / content switch. These can take more maintenance, as they are typically set up to take a network request targeting a single IP address and route traffic (round robin etc.) to multiple backend servers. The pain these cause during longer maintenance windows, due to network team members not being thorough or being overworked, cannot be overstated. I always cringed when I saw a change control for a CSS (content switch).
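
    For anyone unfamiliar with the round-robin routing mentioned above, here is a minimal sketch. The `RoundRobinBalancer` class and the backend names are hypothetical; real load balancers also handle health checks, session persistence, and failover, none of which is shown here.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def route(self, request_id):
        """Return (request_id, backend) for the next backend in rotation."""
        return request_id, next(self._rotation)


if __name__ == "__main__":
    # Hypothetical backend names behind one virtual IP address.
    balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    for request_id in range(5):
        print(balancer.route(request_id))
```

    If the backend list is wrong after a change (a server left out, or a dead one left in), every Nth request is affected, which is one reason these changes need careful checking.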


    Database servers: I'll state this simply, as I've gone down the wabbit hole too far already considering this is a gaming forum and not "unpaid consulting hour" :) Obviously the several databases in an MMO handle millions of read and write transactions. Databases need maintenance in order to optimize paths to the disk and shorten read and write times. Database administrators can run stats, reorgs, create new indexes, or refresh indexes to keep your data access optimal.
    Simply put, on large databases this can take HOURS. Nothing is free here, and it's typically not optional unless you want your system response time to slow to a crawl, and in an MMO that would kill your application for all users...
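
    As a small-scale illustration of the stats and index maintenance described above, here is a sketch using SQLite from Python's standard library. SQLite is only a stand-in chosen because it runs anywhere; the `characters` table, the index name, and both helper functions are assumptions for the example, not anything known about ESO's databases.

```python
import sqlite3


def build_demo_db(rows=1000):
    """Create an in-memory database with a hypothetical characters table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT, zone TEXT)"
    )
    conn.executemany(
        "INSERT INTO characters (name, zone) VALUES (?, ?)",
        [(f"char{i}", f"zone{i % 10}") for i in range(rows)],
    )
    conn.commit()
    return conn


def run_maintenance(conn):
    """Create an index, rebuild indexes, refresh stats; return the stats rows."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_characters_zone ON characters(zone)"
    )
    conn.execute("REINDEX")  # rebuild all indexes from scratch
    conn.execute("ANALYZE")  # refresh the statistics used by the query planner
    conn.commit()
    return conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()


if __name__ == "__main__":
    for row in run_maintenance(build_demo_db()):
        print(row)
```

    On a thousand rows this is instant. The same kind of work on production-sized tables has to read every row and rewrite every index, which is where the hours go.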

    You have the same level of complexity with other related systems:

    Storage Farms: Servers are typically connected to their remote NAS or SAN hard drives (EMC etc) which require periodic maintenance as well.

    Backup and Recovery: Servers need to be backed up just in case of catastrophic failure, which does occur. Many backups need to take place when users are not on the system. This can sometimes take hours, depending on how much local storage the O/S, application, and data use on the server. I can see this occurring first during a maintenance window.

    TL;DR - Running an MMO requires mandatory maintenance to keep everything running and "secure". No free lunch.

    And I would be remiss if I didn't state that it takes dedicated people to do all this work right. My personal opinion is that IT is the backbone of any modern company, and most times no one praises or hears about them unless something goes wrong. I wish they were valued more and staffed appropriately across more US companies.

    Edited by Cronopoly on May 4, 2016 3:57PM
  • Nestor
    ✭✭✭✭✭
    ✭✭✭✭✭
    And I would be remiss if I didn't state that it takes dedicated people to do all this work right. My personal opinion is that IT is the backbone of any modern company, and most times no one praises or hears about them unless something goes wrong. I wish they were valued more and staffed appropriately across more US companies.

    As an IT guy myself (Network Architect/Sales Engineer), I completely agree. Especially when you tell management how much it will cost to do it right, then management tells you to do it for 50% of that cost, then management screams when it does not function as expected.

    Edited by Nestor on May 4, 2016 4:15PM
    Enjoy the game, life is what you really want to be worried about.

    PakKat "Everything was going well, until I died"
    Gary Gravestink "I am glad you died, I needed the help"

  • dtm_samuraib16_ESO
    ✭✭✭✭
    svartorn wrote: »
    Eh?

    I don't remember it ever being 10 minutes for any game ever. It's always been a couple hours to most of the day.
    Freelancer, Unreal Tournament, the first 2 Quakes, ...
    I said back in the days...
  • dtm_samuraib16_ESO
    ✭✭✭✭
    Thank you to those that answered.
  • Nestor
    ✭✭✭✭✭
    ✭✭✭✭✭
    svartorn wrote: »
    Eh?

    I don't remember it ever being 10 minutes for any game ever. It's always been a couple hours to most of the day.
    Freelancer, Unreal Tournament, the first 2 Quakes, ...
    I said back in the days...

    That was probably back when they had one server, or a group of single servers where people logged on to a specific one. A single server can be rebooted in 10 to 20 minutes. Server farms? Servers in an AWS hosting environment? Lots of pieces in play there. I am sure they don't just hit the power button to bring things online either. They probably have to bring up one system, then the next, and so on.
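
    A minimal sketch of what "one system, then the next" can look like when each component has to wait for the things it depends on. The component names and the dependency map are purely hypothetical, and `graphlib` needs Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Hypothetical components; each one maps to the components it depends on.
DEPENDS_ON = {
    "storage": [],
    "network": [],
    "firewalls": ["network"],
    "database": ["storage", "network"],
    "app_servers": ["database", "firewalls"],
    "load_balancers": ["app_servers"],
}


def startup_order(depends_on):
    """Return an order in which every component starts after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, component in enumerate(startup_order(DEPENDS_ON), start=1):
        print(f"{step}. start {component} and wait for it to report healthy")
```

    The total startup time is then at least the sum of the waits along the longest dependency chain, not the time of a single reboot.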