The Gold Road Chapter – which includes the Scribing system – and Update 42 are now available to test on the PTS! You can read the latest patch notes here: https://forums.elderscrollsonline.com/en/discussion/656454/

Upcoming PC NA Datacenter Hardware Replacement

  • Soldier224
    I think Bethesda/ZOS are in the USA, so they give the NA server preferential treatment. Nothing more. I can't remember an American company ever giving an EU server priority.
    You have to be realistic - Ninefingers Logan (First Law Trilogy)
    RP Guide: https://forums.elderscrollsonline.com/de/discussion/431297/rp-guide-aus-persoenlicher-sicht-was-ist-rp
    For all of Tamriel's burglars, or those who want to become one:
    https://forums.elderscrollsonline.com/de/discussion/313750/diebestouren-guide-effektiver-diebstahl-in-teso (outdated)
    Overview of the Dunmer Houses (contains interpretations / open for discussion):
    https://forums.elderscrollsonline.com/de/discussion/481389/die-haeuser-der-dunmer-in-der-zeit-von-eso-haus-hlaalu-redoran-telvanni

  • Elsonso
    Soldier224 wrote: »
    I think Bethesda/ZOS are in the USA, so they give the NA Server an preferential treatment. Not more. I cant remember that a American Company gives an EU Server a priority.

    :neutral: I am pretty sure that someone already figured out why the PC NA server is first. The hardware is the oldest.

    ZOS will never confirm or deny this, though.
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • Shagreth
    Hopefully the EU upgrades will come within this year.
  • TechMaybeHic
    I'm baffled by how replacing 2012 servers years later will not provide a performance improvement. Is this the equivalent of replacing a Windows 7 gaming PC with a Chromebook?

    Why?
  • Elsonso
    I'm baffled by how replacing 2012 servers years later will not provide performance improvement. Is this equivalent of replacing a windows 7 gaming PC with a Chromebook?

    Why?

    There is a logical reason why upgrading the hardware, which is basically a CPU upgrade, won't improve performance: the server may not be CPU bound. If it is network bound, meaning the server is waiting on other servers (database, storage, etc.) on the megaserver network, upgrading the CPU may not provide much of a performance boost. It is like upgrading your car from a Geo Metro to a Ferrari. You might be able to go faster, but if the Geo was already going the speed limit, you won't get there any faster. Just more fashionably. :smile:

    Whether that is the actual reason... no idea... and we will likely never know. ¯\_(ツ)_/¯
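    To put rough numbers on the Geo-vs-Ferrari point (these timings are completely made up for illustration, not measured ESO figures), here is a quick back-of-envelope:

    ```python
    # Illustrative only: invented request timings, not measured ESO numbers.
    compute_ms = 2.0   # time the zone server spends on CPU per request
    wait_ms = 18.0     # time spent waiting on database/storage/other servers

    def request_time(cpu_speedup: float) -> float:
        """Total request time if the CPU gets `cpu_speedup` times faster."""
        return compute_ms / cpu_speedup + wait_ms

    baseline = request_time(1.0)       # 20.0 ms
    with_new_cpu = request_time(4.0)   # 18.5 ms
    print(f"{baseline:.1f} ms -> {with_new_cpu:.1f} ms "
          f"({(1 - with_new_cpu / baseline) * 100:.0f}% faster overall)")
    ```

    With those invented numbers, a CPU four times faster trims the request by less than 10%, because most of the time is spent waiting on the rest of the megaserver network rather than computing.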
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • TechMaybeHic
    Elsonso wrote: »
    I'm baffled by how replacing 2012 servers years later will not provide performance improvement. Is this equivalent of replacing a windows 7 gaming PC with a Chromebook?

    Why?

    There is a logical reason why upgrading the hardware, which is basically a CPU upgrade, won't improve performance. That is if the server is not CPU bound. If it is network bound, meaning the server is waiting on other servers (database, storage, etc) on the megaserver network, upgrading the CPU may not have that much of a performance boost. It is like upgrading your car from a Geo Metro to a Ferrari. You might be able to go faster, but if the Geo was already going the speed limit, you won't get there any faster. Just more fashionably. :smile:

    Whether that is the actual reason... no idea... and we will likely never know. ¯\_(ツ)_/¯

    It should be a major CPU and memory upgrade from servers dating to 2012. They may not be updating their storage, though, but that raises the question of why that is not being refreshed 10 years later as well. 2012 puts the hardware at around when the game was being made, but it's been a long time.

    Then again, maybe when it comes to gaming it's common practice to just run everything until it collapses, half expecting the game to die off long before that.
  • Gaeliannas
    I'm baffled by how replacing 2012 servers years later will not provide performance improvement. Is this equivalent of replacing a windows 7 gaming PC with a Chromebook?

    Why?

    It does make you wonder, right? The most serious issue the game has right now is the abysmal performance, which buying the right hardware could have gone a long way towards addressing, but we get this instead? It is like they think software and coding can fix everything, and that taking a year or more to fix that mess is somehow acceptable to their paying customers. And based on their track record with coding, you would have thought they would have at least tried another avenue instead of putting all their eggs in that basket, you know, just in case?
  • Gaeliannas
    Elsonso wrote: »
    I'm baffled by how replacing 2012 servers years later will not provide performance improvement. Is this equivalent of replacing a windows 7 gaming PC with a Chromebook?

    Why?

    There is a logical reason why upgrading the hardware, which is basically a CPU upgrade, won't improve performance. That is if the server is not CPU bound. If it is network bound, meaning the server is waiting on other servers (database, storage, etc) on the megaserver network, upgrading the CPU may not have that much of a performance boost. It is like upgrading your car from a Geo Metro to a Ferrari. You might be able to go faster, but if the Geo was already going the speed limit, you won't get there any faster. Just more fashionably. :smile:

    Whether that is the actual reason... no idea... and we will likely never know. ¯\_(ツ)_/¯

    Matt did not say they were just replacing servers, he said *all hardware in their datacenter*. And if replacing your storage, networking, servers, etc. doesn't give a performance increase over 2012-*era* hardware, you are doing something incredibly wrong, or have cheaped out so hard that it is questionable whether whoever made that decision should be making those sorts of decisions at all.

    Since 2012:
    CPUs are faster and have more cores
    Server bus architecture is better/faster
    Memory is faster and servers hold more of it
    Flash storage costs about what spinning disks did in 2012
    10Gb networking is the standard, 1Gb is out (and it doesn't need to be fiber now)

    And let's not forget, a lot of customers have rock-solid 1Gb+ Internet at home now, and they expect performance. Blaming customers' Internet for game issues no longer flies.
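    For a rough sense of scale (ballpark, generic figures, not ZOS's actual gear), this is the kind of back-of-envelope comparison behind that list:

    ```python
    # Ballpark, order-of-magnitude figures only -- not ZOS's actual hardware.
    payload_mb = 100.0  # hypothetical chunk of state shuffled between servers

    def transfer_ms(link_gbps: float) -> float:
        """Ideal wire time for the payload on a link of the given speed."""
        return payload_mb * 8 / (link_gbps * 1000) * 1000

    print(f"1 GbE : {transfer_ms(1):7.1f} ms")   # ~800 ms
    print(f"10 GbE: {transfer_ms(10):7.1f} ms")  # ~80 ms

    # Random I/O is where flash pulls away from 2012-era spinning disks:
    hdd_iops, nvme_iops = 150, 500_000           # typical ballpark values
    print(f"random reads: ~{nvme_iops // hdd_iops}x more IOPS from NVMe")
    ```

    The exact numbers don't matter much; the point is that wire time and especially random I/O are orders of magnitude better than typical 2012-era kit.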

    Edited by Gaeliannas on May 1, 2022 6:35PM
  • FlopsyPrince
    Gaeliannas wrote: »

    Matt did not say they were just replacing servers, he said *All hardware n their datacenter*. And if replacing your storage, networking, servers, etc. doesn't give a performance increase over 2012 *era* hardware, you are doing something incredibly wrong, or have cheaped out so hard, it is questionable if whomever made that decision, should be making those sorts of decisions at all.

    Since 2012:
    CPU's faster & more cores
    Server bus architecture is better/faster
    Memory is faster & servers hold more of it
    Flash storage costs about what spinning disks did in 2012
    10gb network is the standard, 1gb is out (and it doesn't need to be fiber now)

    And lets not forget, a lot of customers have rock solid 1gb+ Internet at home now, and expect performance. Blaming customers Internet for game issues, no longer fly's.

    Not necessarily. You can't just flip a switch and change how systems interact and use each other. Let's say they were running CORBA (a very old distributed-object technology, if I am remembering right). They couldn't immediately switch to modern technologies just by replacing hardware. The new hardware might not even run the old "standard" technology well, or at all!

    This is why smart companies pay attention to these things over time. They don't wait until things completely break.
    PC
    PS4/PS5
  • Elsonso
    Gaeliannas wrote: »
    Since 2012:
    CPU's faster & more cores
    Server bus architecture is better/faster
    Memory is faster & servers hold more of it
    Flash storage costs about what spinning disks did in 2012
    10gb network is the standard, 1gb is out (and it doesn't need to be fiber now)

    Yes, they can upgrade the network to 10GbE, database servers with beefier CPUs and more RAM, and they can upgrade the NAS with beefier CPUs and faster storage, but the difference is probably going to be on the order of milliseconds, at best, and no one player is going to see a significant change. My expectation is that the benefit here is that the server can handle more players, not that any one player is going to see radically better performance. To that end, all players should see better performance when more players are logged in, not that we know when this happens, or how many more players we are talking about.

    After all of that, I think their megaserver will still be network bound. :smile:
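    A toy queueing model makes the "more players, not lower ping" point concrete. The capacities below are invented; the shape of the curve is what matters:

    ```python
    # Toy M/M/1 queue with invented numbers, just to illustrate the capacity point.
    def avg_latency_ms(arrival_per_s: float, service_per_s: float) -> float:
        """Mean time in system for an M/M/1 queue: 1 / (mu - lambda)."""
        if arrival_per_s >= service_per_s:
            return float("inf")
        return 1000.0 / (service_per_s - arrival_per_s)

    old_cap, new_cap = 1000.0, 1500.0        # requests/s a zone server can handle

    for load in (200.0, 900.0, 990.0):       # requests/s actually arriving
        print(f"load {load:4.0f}/s: old {avg_latency_ms(load, old_cap):6.1f} ms, "
              f"new {avg_latency_ms(load, new_cap):6.1f} ms")
    ```

    At light load the upgrade buys a millisecond or two; the real win is that latency stops blowing up until a much higher player count.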
    Gaeliannas wrote: »
    Matt did not say they were just replacing servers, he said *All hardware n their datacenter*.

    This is one time where I am not taking what Matt says literally. I usually do this, but this time some of what he says does not add up to a complete Happy Meal. :smile: I figure it is a press release, more focused on the broad picture of what they are doing and less focused on accuracy down to the details.

    He might have said they are replacing all hardware in their datacenter, but I am fairly sure that is not exactly what is actually happening. Not by my understanding of the words "hardware" and "datacenter". (Have you ever seen a datacenter?) They are replacing something, and by any measure a lot of something, and that is all I think the statement is meant to convey.
    Edited by Elsonso on May 1, 2022 9:54PM
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • FlopsyPrince
    Elsonso wrote: »

    Yes, they can upgrade the network to 10GbE, database servers with beefier CPUs and more RAM, and they can upgrade the NAS with beefier CPUs and faster storage, but the difference is probably going to be on the order of milliseconds, at best, and no one player is going to see a significant change. My expectation is that the benefit here is that the server can handle more players, not that any one player is going to see radically better performance. To that end, all players should see better performance when more players are logged in, not that we know when this happens, or how many more players we are talking about.

    After all of that, I think their megaserver will still be network bound. :smile:
    Gaeliannas wrote: »
    Matt did not say they were just replacing servers, he said *All hardware n their datacenter*.

    This is one time where I am not taking what Matt says literally. I usually do this, but this time some of what he says does not add up to a complete Happy Meal. :smile: I figure it is a press release, more focused on the broad picture of what they are doing and less focused on accuracy down to the details.

    He might have said they are replacing all hardware in their datacenter, but I am sure that is exactly not what is actually happening. Not by my understanding of the words "hardware" and "datacenter". (Have you ever seen a datacenter? I work next to one) They are replacing something, and by measure, a lot of something, and that is all I think the statement is meant to convey.

    They are probably not replacing all the rack hardware, cabling, etc.

    I would just create a new data center and migrate there if I was really going to do that. Replacing things in place though can be quite challenging!
    PC
    PS4/PS5
  • FlopsyPrince
    And replacing things is prone to mistakes: fiber is quite fragile in this usage, from what I read several years back. It can break easily, and fully verifying that it has been connected properly could be challenging.

    The more I read here, the more I think a complete rewrite of the game would have value. I expect they would scrap the game before doing that. I can't think of an example of another game that had such an overhaul. (One Tamriel, prior to my time, revamped the game content, not the framework supporting it.)
    Edited by FlopsyPrince on May 1, 2022 10:03PM
    PC
    PS4/PS5
  • Gaeliannas
    Elsonso wrote: »

    Yes, they can upgrade the network to 10GbE, database servers with beefier CPUs and more RAM, and they can upgrade the NAS with beefier CPUs and faster storage, but the difference is probably going to be on the order of milliseconds, at best, and no one player is going to see a significant change. My expectation is that the benefit here is that the server can handle more players, not that any one player is going to see radically better performance. To that end, all players should see better performance when more players are logged in, not that we know when this happens, or how many more players we are talking about.

    After all of that, I think their megaserver will still be network bound. :smile:
    Gaeliannas wrote: »
    Matt did not say they were just replacing servers, he said *All hardware n their datacenter*.

    This is one time where I am not taking what Matt says literally. I usually do this, but this time some of what he says does not add up to a complete Happy Meal. :smile: I figure it is a press release, more focused on the broad picture of what they are doing and less focused on accuracy down to the details.

    He might have said they are replacing all hardware in their datacenter, but I am sure that is exactly not what is actually happening. Not by my understanding of the words "hardware" and "datacenter". (Have you ever seen a datacenter? I work next to one) They are replacing something, and by measure, a lot of something, and that is all I think the statement is meant to convey.

    Try looking at it this way. Every zone, instance, etc. is a virtual server, and each VMware host runs X number of them based on the specs of the instances it is running; some obviously take more resources to run than others. You ran out of resources long ago and have been piling new stuff in anyway by lowering the amount of resources available to each instance, which affects its ability to run smoothly. This would also overtask the host itself, as it would no longer be able to keep up; there is a finite amount of processing you can ask any server to do before processes become delayed to a noticeable degree.

    Again obviously, you prioritize some zones over others, like Vivec and Mournhold, to keep the appearance of halfway decent performance in the zones where a ton of players congregate. You pick zones that are unpopulated or that you don't care about (Cyrodiil, anyone?) and reallocate those resources elsewhere, to the point they barely run at all, except a couple of times a year during an event, when it "magically" works again (to the layman) because resources were allocated back to it. Then along comes this last event, where it stayed horrible, probably because we are so resource-thin at this point that there was nothing left to move around.

    That would also explain why trials and other PvE activities are now experiencing what has been happening to Cyrodiil for years: too much load and not enough resources to handle it. I was predicting a total game meltdown, or near to it, with the release of High Isle, so it was either an unlikely coincidence, or they finally figured it out as well after the fallout from the DLC they dropped last month and decided to scramble and get the new equipment installed.

    Honestly though, I still believe we will see a return to somewhat decent performance after they install the upgrades.

    And BTW, my theory about the optimizations they are working on: while they may have a bit to do with unscrambling the code, it is a lot more likely they are trying to figure out how to spin instances up and down better and migrate them around, so they can reallocate unused resources more effectively across the entire datacenter and not just within a cluster. Once they are done, I expect to see players automatically migrated off numerous low-pop instances and combined into one new or existing instance, to keep a few players from holding an entire instance open. At least that is what I would be working on.
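    Something like this consolidation pass is what I have in mind; obviously a hypothetical sketch, not ZOS's actual scheduler:

    ```python
    # Hypothetical sketch: combine low-population instances of the same zone
    # so a handful of players don't pin a whole instance open.
    def consolidate(pop_per_instance: list[int], cap: int = 100,
                    low_water: int = 15) -> list[int]:
        """Greedily merge instances below `low_water` players, respecting `cap`."""
        keep = [p for p in pop_per_instance if p >= low_water]
        small = sorted(p for p in pop_per_instance if p < low_water)
        merged: list[int] = []
        for p in small:
            for i, m in enumerate(merged):
                if m + p <= cap:      # fold these players into an instance with room
                    merged[i] += p
                    break
            else:
                merged.append(p)      # no room anywhere, so keep a new instance
        return keep + merged

    print(consolidate([97, 60, 12, 9, 4, 3]))   # -> [97, 60, 28]
    ```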

    Edited by Gaeliannas on May 1, 2022 10:24PM
  • FlopsyPrince
    Gaeliannas wrote: »
    Try looking at it this way. Every zone, instance, etc is a virtual server, each VMWare host runs X amount of them based on the specs of the instance they are running, some obviously take more resources to run than others. You ran out resources long ago and have been piling new stuff in anyways by lowering the amount of resources available to each instance, and affecting its ability to run smoothly, this would also overtask the host itself as it would no longer be able to keep up, as there is a finite amount of processing you can ask any server to do before processes become delayed to a noticeable point.

    Once again obviously, your prioritize some zones over others, like Vivec, Mournhold, etc.. to keep the appearance of halfway decent performance in those zones where a ton of players congregate. You pick zones that are unpopulated or don't care about (Cyrodiil anyone?) and reallocate those resources elsewhere, to the point they barely run at all, except a couple times a year during an event, where it "magically" works again (to the layman), because they allocated resources back to it. Then along comes this last event, where it stayed horrible, probably because we are so resource thin at this point, there was nothing left to move around.

    Which now explains why trials and other PVE activities are also experiencing what has been happening to Cyrodiil for years, too much load and not enough resources to handle it. I was predicting a total game meltdown or near to it with the release of High Isle, so it was either an unlikely coincidence or they finally figured it out as well after the fallout from the DLC they dropped last month, and decided to scramble and get the new equipment installed finally.

    Honestly though, I still believe we will see a return to somewhat decent performance after they install the upgrades.

    And BTW, my theory about the optimizations they are working on, while it may have a bit to do with unscrambling their code, it is a lot more likely they are trying to figure out how to spin instances up/down better and migrate them around, so they can reallocate unused resources more effectively. Once they are done, I expect to see players migrated off numerous low pop instances and combined into one new or existing instance automatically in order to keep a few players from holding an entire instance open. At least that is what I would be working on.

    Being able to add and remove instances "on the fly" would most likely go a long way toward mitigating problems here.

    I have not looked into VM instances much lately, but using them may still pose the challenge of connecting things together. I was working with a team on that problem more than 15 years ago.
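    The "connecting things" part is usually handled these days with some kind of service registry plus heartbeats. A minimal sketch of the idea, with hypothetical names and addresses, nothing ESO-specific:

    ```python
    # Minimal, hypothetical service-registry sketch: instances heartbeat in,
    # clients look them up, and stale entries simply stop resolving.
    import time

    class Registry:
        def __init__(self, ttl_s: float = 10.0):
            self.ttl_s = ttl_s
            self._entries: dict[str, tuple[str, float]] = {}  # name -> (addr, last_beat)

        def heartbeat(self, name: str, addr: str) -> None:
            """Instances call this periodically to stay discoverable."""
            self._entries[name] = (addr, time.monotonic())

        def lookup(self, name: str) -> str | None:
            """Return the address if the instance has beaten recently, else None."""
            entry = self._entries.get(name)
            if entry and time.monotonic() - entry[1] < self.ttl_s:
                return entry[0]
            return None

    reg = Registry()
    reg.heartbeat("vivec-city-03", "10.0.4.17:7777")   # made-up name and address
    print(reg.lookup("vivec-city-03"))
    ```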

    PC
    PS4/PS5
  • Elsonso
    They are probably not replacing all the rack hardware, cabling, etc.

    I would just create a new data center and migrate there if I was really going to do that. Replacing things in place though can be quite challenging!

    Actually, this is what confuses me.

    I mean, it is certainly possible that they are replacing _all_ of the hardware that they have in the datacenter, which would include racks, enclosures, server systems, NAS, power supplies, power cables, network cables, network switches, for all 6 megaservers across their (presumably) two datacenters.

    That is an expensive option, out of several options, available to them.

    Then again, this maintenance only lasts for 10 hours, and PC NA isn't some gaming computer sitting on Mom's basement floor. :smile: It is entirely possible that they actually built a new megaserver, staged it next to the old one, and will spend the 10 hour maintenance copying all of the live data over, bringing it up, and testing it. ( <---- EDIT: ZOS.. pics or it didn't happen :smile: )

    Upgrading in place would take more than 10 hours to complete for _all_ the hardware, I would think, so if they are doing that, they are definitely not replacing _all_ of the hardware.

    So, my expectation is that either they built a new system, or they are replacing select parts of the existing system. The latter is the more cost effective route.
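    Back-of-envelope on the copy-over idea, with an assumed data size and link speed since ZOS has published neither:

    ```python
    # Back-of-envelope only: the data size, link speed, and efficiency are guesses.
    data_tb = 20.0     # assumed size of the live character/world data
    link_gbps = 10.0   # assumed copy link, e.g. a single 10 GbE pipe
    efficiency = 0.7   # protocol overhead, verification passes, etc.

    seconds = data_tb * 8e12 / (link_gbps * 1e9 * efficiency)
    print(f"~{seconds / 3600:.1f} hours to copy {data_tb:.0f} TB")   # ~6.3 hours
    ```

    So a build-it-next-door-and-copy cutover does fit inside a 10 hour window, unless the live data set is far larger than that guess.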



    Edited by Elsonso on May 1, 2022 10:36PM
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • Gaeliannas
    Being able to add and remove instances "on the fly" would go a long ways toward mitigating problems here most likely.

    I have not looked into VM instances much lately, but using them may still have the challenge of connecting things. I was working with a team on that problem more than 15 years ago.

    It is tricky but quite doable; you just need to code your stuff right. I helped build out a datacenter a few years ago for a financial company, and we spun virtual servers up and down constantly, but each server was assigned a single task. We then used load balancing to bring the new servers into rotation, or drop them when servers spun down, to keep the website able to handle millions of transactions flawlessly without having to dedicate huge amounts of compute power to individual parts of the application. So it no longer mattered whether there was a buying spree, a selling spree, investments, or fund reallocations going on; it all just worked, and the end user only saw a highly performant website regardless of what they were doing.

    Oh, and it took a lot less hardware to do it this way, and it was almost infinitely scalable by tossing a new server, switch, array, etc. into the mix.
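    The control loop behind that kind of setup can be embarrassingly simple. A sketch with invented capacity and utilization targets:

    ```python
    # Illustrative autoscaling rule, loosely in the spirit described above;
    # the per-instance capacity and target utilization are invented.
    import math

    def desired_instances(requests_per_s: float,
                          per_instance_capacity: float = 500.0,
                          target_util: float = 0.6) -> int:
        """How many single-task instances to keep running for this load."""
        return max(1, math.ceil(requests_per_s / (per_instance_capacity * target_util)))

    for load in (120.0, 900.0, 4000.0):
        print(f"{load:6.0f} req/s -> {desired_instances(load)} instances")
    ```

    The load balancer just needs to be told which instances exist right now; the rest is bookkeeping.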

    Edited by Gaeliannas on May 1, 2022 10:56PM
  • Hurbster
    And of course
    Gaeliannas wrote: »
    Oh, and it took a lot less hardware to do it this way, and was almost infinitely scalable by tossing a new server, switch, array, etc. into the mix.

    And the code for pretty much everything in ESO is the worst spaghetti coding, which must make everything seem like bug-testing by repeatedly smashing your head on the keyboard, IMO. Or trying to pvp in the last 5 years or so.
    So they raised the floor and lowered the ceiling. Except the ceiling has spikes in it now and the floor is also lava.
  • Salix_alba
    New Hamsters!!! :o
  • Gaeliannas
    Hurbster wrote: »
    And the code for pretty much everything in ESO is the worst spaghetti coding, which must make everything seem like bug-testing by repeatedly smashing your head on the keyboard, IMO. Or trying to pvp in the last 5 years or so.

    While it does seem that way, it is just an assumption. It could be they have the cleanest code in the world, but an extremely poorly architected system that isn't modular at all, and takes a huge amount of resources to perform even the most minor of transactions, which walk all over each other constantly.
  • Elsonso
    Gaeliannas wrote: »
    While it does seem that way, it is just an assumption. It could be they have the cleanest code in the world, but an extremely poorly architected system that isn't modular at all, and takes a huge amount of resources to perform even the most minor of transactions, which walk all over each other constantly.

    I wouldn't say "poorly architected", but I would say not designed for the game that it became.

    Maybe they were short sighted, or never really thought it would be as popular as it is, but it sounds like they made architectural decisions that ended up being... sub-optimal. :innocent: Thus... a rewrite of parts of the server code. I also think that it is the core reason behind AwA, cold storage, and probably more.
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • Gaeliannas
    Elsonso wrote: »
    Gaeliannas wrote: »
    While it does seem that way, it is just an assumption. It could be they have the cleanest code in the world, but an extremely poorly architected system that isn't modular at all, and takes a huge amount of resources to perform even the most minor of transactions, which walk all over each other constantly.

    I wouldn't say "poorly architected", but I would say not designed for the game that it became.

    Maybe they were short sighted, or never really thought it would be as popular as it is, but it sounds like they made architectural decisions that ended up being... sub-optimal. :innocent: Thus... a rewrite of parts of the server code. I also think that it is the core reason behind AwA, cold storage, and probably more.

    Yup, I am sure as well. It is just funny that so many keep tossing "spaghetti code" around, when it could literally be any number of other, just as impactful, issues behind the scenes. None of which we are privy to.

    EDIT: On a side note, it may not even have been a developer or architect who made the decision. It is just as likely, if not more so, that the business ran their numbers, did their research, and came to the conclusion that ESO would never have more than, say, 15K concurrent users, and that there was no need to waste money designing or building a system that could scale past that. In defense of coders, most that I have met over the years have been pretty good at what they do, write good code, and are proud of their work, but they have to work within the limitations imposed upon them by the business and the tools they are given to work with.
    Edited by Gaeliannas on May 2, 2022 12:52AM
  • JoeCapricorn
    Wow, 2012-era hardware?

    Yeah, there should be a dramatic improvement in server-side stuff. I was running ESO on a computer I had built in 2014. In 2014 it was a BEAST, but in 2022 the CPU just couldn't keep up all the time, and the liquid cooler eventually started to give out a bit. Eight full years, though; not too shabby!

    But now I am cranking with an 8-core processor that can hit 5.2 GHz. Even with my old computer's video card (an RTX 2070), I haven't found a game that I can't run at absolutely bonkers settings (Cyberpunk 2077 at full settings with ray tracing still averages about 45 to 50 fps at 1080p). And much of that is down to being relieved of the CPU bottleneck.
    I simp for vampire lords and Glemyos Wildhorn
  • coletas


    Not exactly. In studios you design the whole architecture first and build the tooling later. That's why it all began in 2007. Some coders are responsible for those limitations, like those who still choose MySQL for a game with 10 million players...

    Everything you code for a business has to be scalable. If not, you will hit the rocks sooner or later. Wasn't it Napoleon who said, "If I only have an hour for a battle, I spend 55 minutes planning it and 5 on the assault"?

    And now they are going to repeat it for the new AAA title. Hiring! Hiring! MySQL and go... what could go wrong? Lol.

    Matt, go to MS, ask them for professionals (not gurus like now, the ones with big mouths who only know how to hide their errors) and infrastructure (including Postgres, MSSQL, Oracle... in any order) and start the new project on stable legs.
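    On what "scalable" tends to mean on the database side, the usual move is to shard by account so boxes can be added as the player count grows. A generic sketch, not a claim about how ESO's database is actually laid out:

    ```python
    # Generic hash-sharding sketch; shard names and count are hypothetical.
    import hashlib

    SHARDS = [f"db-shard-{i:02d}" for i in range(8)]

    def shard_for(account_id: str) -> str:
        """Route an account to a shard by hashing its ID so load spreads evenly."""
        digest = hashlib.sha1(account_id.encode()).digest()
        return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]

    print(shard_for("@Elsonso"))   # the same account always lands on the same shard
    ```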
    Edited by coletas on May 2, 2022 10:24AM
  • FlopsyPrince
    Elsonso wrote: »

    Actually, this is what confuses me.

    I mean, it is certainly possible that they are replacing _all_ of the hardware that they have in the datacenter, which would include racks, enclosures, server systems, NAS, power supplies, power cables, network cables, network switches, for all 6 megaservers across their (presumably) two datacenters.

    That is an expensive option, out of several options, available to them.

    Then again, this maintenance only lasts for 10 hours, and PC NA isn't some gaming computer sitting on Mom's basement floor. :smile: It is entirely possible that they actually built a new megaserver, staged it next to the old one, and will spend the 10 hour maintenance copying all of the live data over, bringing it up, and testing it. ( <---- EDIT: ZOS.. pics or it didn't happen :smile: )

    Upgrading in place would take more than 10 hours to complete for _all_ the hardware, I would think, so if they are doing that, they are definitely not replacing _all_ of the hardware.

    So, my expectation is that either they built a new system, or they are replacing select parts of the existing system. The latter is the more cost effective route.

    Why would they need to replace the racks (just metal) themselves? I would expect the power supplies have been replaced a few times already; few survive that many years of continuous use.

    PC
    PS4/PS5
  • FlopsyPrince
    Gaeliannas wrote: »
    It is tricky but quite doable, you just need to code your stuff right.

    Many do not code their systems correctly for this, at least many didn't in 2012 and earlier. That is likely a challenge here if I had to bet.

    PC
    PS4/PS5
  • TechMaybeHic
    So I hear they said it's not new/more hardware. Did they just buy the same model they had before? Something sitting in a stockroom somewhere? Because I'm sure hardware that old is EOL.
  • Tornaad
    When I first read this, I replaced the word "data center" with the word "server" and started doing a quick search online to see how long server hardware lasts. Thankfully, after a few minutes of getting a lot of conflicting information, I realized what I had done and then searched for how long datacenter hardware lasts. One of the top results I found claims that data center hardware will last between 10 and 15 years, which means that ZOS is in the average range for replacing their hardware.
    https://info.pcxcorp.com/blog/when-is-the-right-time-to-expand-your-companys-data-center-design
  • Gaeliannas
    Zuboko wrote: »
    When I first read this I replaced word data center with the word server. And started to do a quick search on line to see how long server hardware lasts. Thankfully, after a few minutes of getting a lot of conflicting information, I realized what I did and then started to do a search for how long datacenter hardware lasts, and one of the top results I found claims that data center hardware will last between 10 to 15 years. Which means that Zos is in the average range for replacing their hardware.
    https://info.pcxcorp.com/blog/when-is-the-right-time-to-expand-your-companys-data-center-design

    You looked up when to rebuild a data center, which is the building where ZOS puts all their hardware. You need to look up the average lifespan of servers, storage arrays, and network equipment, which is considerably shorter, like 2/3 shorter for servers. Historically, somewhere in the 5-7 year range it becomes less expensive to replace your equipment than to renew the maintenance contracts on it, because even the manufacturer knows it is going to break constantly after that point. Yes, it can last much longer with a lot of part replacements, but it is usually a poor choice to push it that far.

    Here is a start:
    https://www.promax.com/blog/how-long-do-servers-last
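    The renew-versus-replace math is simple enough to sketch; the dollar figures below are made up, but the crossover shape is the point:

    ```python
    # Made-up costs, purely to show the renew-vs-replace break-even idea.
    replace_cost = 12_000.0        # new server, per box
    old_maint_per_year = 3_500.0   # extended support on the aging box
    new_maint_per_year = 800.0     # warranty-covered support on the new box

    for years in (2, 3, 4, 5):
        keep_old = old_maint_per_year * years
        buy_new = replace_cost + new_maint_per_year * years
        cheaper = "replace" if buy_new < keep_old else "keep old"
        print(f"{years} yr: keep ${keep_old:,.0f} vs replace ${buy_new:,.0f} -> {cheaper}")
    ```

    With those invented numbers the crossover lands right around the five-year mark, which matches the 5-7 year rule of thumb.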
    Edited by Gaeliannas on May 2, 2022 1:24PM
  • Elsonso
    Gaeliannas wrote: »

    You looked up when to rebuild a data center, which is the building where ZOS puts all their hardware. You need to look up the average lifespan of servers, storage arrays and network equipment, which is considerably shorter, like 2/3 shorter for servers. Historically somewhere in the 5-7 year range it becomes less expensive to replace your equipment than to renew the maintenance contracts on it, because even the manufacturer knows it is going to constantly break after that point. Yes, it can last much longer with a lot of part replacements, but is usually a poor choice to push it this far.

    There are companies out there that pay for extended maintenance on hardware, long after that hardware is no longer for sale, just because it is cheaper to buy a maintenance contract than it is to buy new hardware and migrate to it. The latter being the real cost, and risk, for some of these companies.
    PC NA/EU: @Elsonso
    XBox EU/NA: @ElsonsoJannus
    X/Twitter: ElsonsoJannus
  • Gaeliannas
    Elsonso wrote: »

    There are companies out there that pay for extended maintenance on hardware, long after that hardware is no longer for sale, just because it is cheaper to buy a maintenance contract than it is to buy new hardware and migrate to it. The latter being the real cost, and risk, for some of these companies.

    Yeah, it becomes a total roll of the dice after about 5 years, and your hardware architecture plays a big part as well. If you have enough redundancy built in, the risk is lower; if you don't, you are probably shooting yourself in the foot. Then the shortsighted also just look at the contract-versus-buy numbers, never considering the business impact and possible loss of revenue and customers from having an unstable platform. I worked on one system where the entire thing was spec'd for six nines, because the cost of 10 minutes of downtime was pushing a million dollars. Obviously that isn't ESO, as they get paid up front and through recurring subs, so their loss is just in crown sales and customer frustration, and the players here seem very forgiving, so I suspect that isn't a huge factor for them either.
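    For reference, the downtime budget behind those availability targets is just arithmetic:

    ```python
    # How much downtime per year each availability tier allows.
    SECONDS_PER_YEAR = 365.25 * 24 * 3600

    for nines in (3, 4, 5, 6):
        availability = 1 - 10 ** -nines
        allowed_s = SECONDS_PER_YEAR * (1 - availability)
        print(f"{nines} nines ({availability * 100:.4f}%): "
              f"{allowed_s / 60:.1f} min/year of downtime")
    ```

    Six nines works out to roughly half a minute per year, which is why a system where 10 minutes of downtime costs a million dollars gets spec'd that way.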