There is NO technical reason why you should have to take the entire service offline to perform maintenance.
What sources would you like? I'm a little rusty, having been out of the server administration field for the last two years, but I'm sure I can dig up something.
The storage array is irrelevant here, as you don't typically EVER have to take a storage array offline for weekly maintenance; disk drives are hot-swappable. Weekly maintenance is really just another term for software patch installation, with the option of swapping any failing hardware if needed, which 9 times out of 10 it is not: patches only.
Authentication services are handled either off site or by a separate authentication server, and they do not need to be interrupted by weekly software maintenance windows either. I'm most familiar with AD, but I know there are others out there. Regardless, weekly maintenance is rarely, if ever, performed on user credential data (which is what authentication servers deal with, as opposed to your game data, which lives in your profile on the storage array), so the authentication service has no bearing on my point about 99.9% service uptime.
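To make that split concrete, here is a minimal sketch, assuming a purely hypothetical auth endpoint (the URL, the token header, and the session_is_valid helper are my own invention, not anything ZOS actually runs), of a game host validating a session against a separate authentication service. Patching or rebooting the game host never goes anywhere near the credential store:

```python
import urllib.error
import urllib.request

# Hypothetical endpoint for a standalone authentication service. The point is
# only that it lives on its own box (or off site), so patching game hosts
# never touches credential data.
AUTH_SERVICE_URL = "https://auth.example.internal/validate"

def session_is_valid(session_token: str) -> bool:
    """Ask the separate auth service whether a session token is still good."""
    req = urllib.request.Request(
        AUTH_SERVICE_URL,
        headers={"Authorization": f"Bearer {session_token}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=2) as resp:
            return resp.status == 200
    except urllib.error.URLError:
        # Auth service unreachable: fail closed on this host. Its credential
        # store was never part of this host's patch cycle in the first place.
        return False
```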
The only thing really in focus here is pulling down the actual host boxes to install the new patches and updates. That can be handled on a separate set of host machines in a separate pool; once they are updated, traffic can be routed to them through the load balancer, and the old pool can then be taken offline for updating the next time around. At worst it should require asking users to log out and back in to get routed to the new live pool. We did it all the time in our terminal services business when we had to roll out software updates and patches, or when one of the hosts had to be brought offline for hardware maintenance.
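A rough sketch of that pool-swap idea, assuming an invented LoadBalancer class and made-up host names (real load balancers do this through configuration rather than code, and I obviously have no visibility into ZOS's actual setup):

```python
# Minimal sketch of the two-pool (blue/green) swap described above. The pool
# names, patch_host(), and this LoadBalancer class are all invented for the
# example.
from dataclasses import dataclass

def patch_host(host: str) -> None:
    print(f"patching {host} while it carries no player traffic")

@dataclass
class LoadBalancer:
    live_pool: list[str]      # hosts currently taking player traffic
    standby_pool: list[str]   # idle hosts, safe to patch at any time

    def patch_standby(self) -> None:
        for host in self.standby_pool:
            patch_host(host)

    def swap_pools(self) -> None:
        # Route new sessions to the freshly patched pool; the old pool drains
        # as players log out and back in, then becomes next cycle's standby.
        self.live_pool, self.standby_pool = self.standby_pool, self.live_pool

lb = LoadBalancer(live_pool=["game01", "game02"], standby_pool=["game03", "game04"])
lb.patch_standby()   # patch while idle -- players never notice
lb.swap_pools()      # flip traffic; the service never fully goes dark
```

The point is simply that there is never a moment when every host is down at once; the worst case for players is the forced re-login when the pools flip.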
The datacenter is also irrelevant. I would be HIGHLY surprised if they did not use a reputable datacenter for their servers (such as Peak10), where 99.9% uptime is guaranteed. It's not even a factor in maintenance windows anyway, so I'm not sure why you even brought it up.
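For anyone wondering what those percentages actually allow, the downtime budget is simple arithmetic (nothing here is specific to ZOS or Peak10):

```python
# Downtime allowed per year and per month for common uptime targets.
HOURS_PER_YEAR = 24 * 365  # 8760

for uptime in (0.999, 0.9995, 0.9999):
    down_hours_per_year = HOURS_PER_YEAR * (1 - uptime)
    down_minutes_per_month = down_hours_per_year * 60 / 12
    print(f"{uptime:.2%} uptime -> {down_hours_per_year:.2f} h/year "
          f"(~{down_minutes_per_month:.0f} min/month)")

# 99.90% uptime -> 8.76 h/year (~44 min/month)
# 99.95% uptime -> 4.38 h/year (~22 min/month)
# 99.99% uptime -> 0.88 h/year (~4 min/month)
```

Those budgets are the yardstick the 99.9% figure above refers to.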
Also, hardware cost is almost negligible these days. Yes, servers can run upwards of $10-20k for solid specs capable of hosting hundreds of thousands of concurrent users, but they can be leased or acquired with a financed monthly payment, so the business does not have to shell out the full cost up front.
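To put a very rough number on the financing point, using the standard amortization formula with made-up terms (the 36-month term and 8% APR below are assumptions for illustration, not a quote from any vendor):

```python
# Standard loan amortization: payment = P * r / (1 - (1 + r) ** -n)
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    r = annual_rate / 12
    return principal * r / (1 - (1 + r) ** -months)

# A hypothetical $20k host financed over 36 months at 8% APR (made-up terms).
print(f"${monthly_payment(20_000, 0.08, 36):,.2f}/month")  # roughly $627/month
```

So even standing up a second pool of hosts is a monthly operating cost rather than a one-off capital hit.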
Seriously, what's wrong with people? This guy gives a perfectly plausible (sounding, at least) explanation of how maintenance could be done without the servers coming down, and the only replies are "has to be done, needed for performance... blah blah". Does anyone have a SOLID answer as to why his/her solution of a backup server isn't possible???
Not trolling, but seriously curious... is it actually possible to do this???
There is NO technical reason why you should have to take the entire service offline to perform maintenance. Unless you're running everything and everyone off a single machine, in which case, shame on you.
There really is no technical reason why any service, including those vastly larger, far more complex, and with many more customers than a computer game, needs to be taken offline for maintenance. That is why global enterprise services such as Office 365, Azure, AWS, vCloud etc. all offer out-of-the-box uptimes of 99.95%, and more if desired. If those services went down for patching, those businesses would not be the global heavyweights that they are.
However, it is all about the money, honey. The more availability you want, the more money you need to invest. This is just a game; no one will die if it goes down for patching, so I guess ZOS do not see the requirement to invest in high availability. It annoys us, but the only way to change that is to vote with our wallets and thereby exert a sufficient penalty to drive a business case for more resilience.
But without knowing the ZOS profit margin, it is hard to know whether such a business case might also drive an increase in the sub fee and/or an increase in the prevalence of utter dreck in the Crown Store to fund such resilience.
anitajoneb17_ESO wrote: »
But without knowing the ZOS profit margin, it is hard to know whether such a business case might also drive an increase in the sub fee and/or an increase in the prevalence of utter dreck in the Crown Store to fund such resilience.
You are right that it comes down to the money.
You are not right to compare with services provided by big companies. They have, as you say, a VERY LARGE customer base, and therefore a much higher income. They can afford the costs of a duplicate infrastructure - which ZOS cannot.
If we insisted on 99.99% (virtually 100%) uptime, we would need to pay much more for playing the game than we currently do.
Conversely, if we insisted on 100% uptime and stopped playing/paying because of the downtimes, then the game probably wouldn't exist, because there would be no sustainable business model for it.
So yes, it's entirely possible to make any service resilient; you just need to chuck money at it. But as ZOS are small fry in enterprise terms, they do not have that sort of money and would end up passing the cost on to us, no doubt by "virtue" of more Crown Store garbage. Therefore we either accept that the game is run on a shoestring, or we defect to another game, no doubt run on an equally threadbare shoestring.
I've worked in software development for global banks for quite a few years. They have money, yet they always take systems down to patch, and it always takes several hours.