quite bad desync occurring

edo660

New Member
hi guys,

have noticed since updating to 1.7.5.1 and reality that our servers are experiencing huge amounts of desync every few minutes if there are more than around 10 or so people on, e.g. players teleporting around the place (not hacks; you observe them running in one direction, then they are pulled back to somewhere else), zombies not spawning for some time then suddenly all appearing, etc...

unsure if this is a result of 1.7.5.1 or something with reality, but it certainly wasn't like this with 1.7.4/bliss. running latest (101480) beta; the only changes made were installing the new 1.7.5.1 dayz files and moving over to reality from bliss.

has anyone else seen this happening?

server is dual quad-core xeon, 16gb ram... so resources are not a problem i should think. 50 slot server, but this occurs even with 25 on.

thanks all.
 
I have no issues on my test server with 30+ users online, is your mysql local to the host or on a remote host?
 
Make sure MySQL isn't struggling. Check for pool errors. My MySQL server is busy and I occasionally get these. I don't know where to check in the MySQL logs, but this is the error I get back from my websites when it happens:

MySqlClient.MySqlException (0x80004005): error connecting: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.

There is a fix for this, but I haven't looked much more into it as scheduling a server restart fixed it. It happens very rarely and coincides with higher latency, which is one of the reasons I may be moving my hosting.

I have noticed zombies lagging, where they slide around, which I think has more to do with my server's connectivity issues, which is why I have kept the slots lower. Think Broadband do a great free broadband quality monitor if you think this may be an issue: http://www.thinkbroadband.com/ping

I would have thought that this type of MySQL error would show up in the server console or .rpt file. Are you seeing "unable to connect to MySQL", in orange, in the server console?
 
We can report the same. Server performance overall seems to be worse than before the update.
However I have to add that our servers are constantly hammered - they reach 50/50 a few minutes after start and keep at this level until the next restart (every 3 hours).

MySQL Server, Disk and Network I/O are definitely not the bottleneck here. Servers are running on 3930K CPUs so this should also be pretty sufficient.
Amount of desync does not seem to be directly related to the server runtime.
 
I have no issues on my test server with 30+ users online, is your mysql local to the host or on a remote host?

mysql is on a remote server, 100mb between them (same switch) - has never been an issue before though. I increased the max packet size on the config files to see if that helped (was 1400, now 2048) but that hasn't helped. Next, I'm probably going to try a fresh rebuild of the server files...

Make sure MySQL isn't struggling. Check for pool errors. My MySQL server is busy and I occasionally get these. I don't know where to check in the MySQL logs but I get this error back from my websites when this happens.

Didn't think of that, but have checked out mysql - no errors, and it isn't struggling at all - the only real db that this particular sql server is running is for dayz in fact.
 
actually, i wouldn't mind comparing people's basic.cfg... below is mine, which worked just fine previously, but i'm wondering if it could be tweaked more? have read a few threads about it, but none since 1.7.5.1

Code:
MinBandwidth=83886080;
MaxBandwidth=125829120;
MaxMsgSend=92;
MaxSizeGuaranteed=128;
MaxSizeNonguaranteed=64;
MinErrorToSendNear=0.029999999;
MinErrorToSend=0.0019999994;
MaxCustomFileSize=0;
Windowed=0;
adapter=-1;
3D_Performance=1;
Resolution_Bpp=32;
class sockets
{
    maxPacketSize=2048;
};
serverLongitude=153;
serverLatitude=-27;
serverLongitudeAuto=153;
serverLatitudeAuto=-27;
 
Mine's the default from the reality files, I believe

Code:
MinBandwidth=104857600;
MaxBandwidth=1073741824;
MaxMsgSend=256;
MaxSizeNonguaranteed=256;
MinErrorToSendNear=0.029999999;
MinErrorToSend=0.003;
MaxCustomFileSize=0;
Windowed=0;
adapter=-1;
3D_Performance=1;
Resolution_Bpp=32;
class sockets
{
    maxPacketSize=1400;
};
serverLongitude=9;
serverLatitude=51;
serverLongitudeAuto=9;
serverLatitudeAuto=51;

Btw
Code:
class sockets
{
    maxPacketSize=2048;
};
I found that 2048 would cause issues for some players connecting to the server.
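For anyone wanting to compare two basic.cfg files like the ones posted above, here is a rough sketch in Python. It only understands simple `key=value;` lines (it flattens `class sockets` rather than parsing classes properly), and the helper names `parse_basic_cfg` and `diff_cfg` are made up for this example. The two inputs are trimmed copies of the configs in this thread.

```python
import re

def parse_basic_cfg(text):
    """Collect every `key=value;` pair into a flat dict of strings.

    Nested classes are not modelled: `maxPacketSize` inside
    `class sockets { ... }` simply ends up as a top-level key.
    """
    return {m.group(1): m.group(2).strip()
            for m in re.finditer(r"(\w+)\s*=\s*([^;]+);", text)}

def diff_cfg(a, b):
    """Map each key whose value differs (or is missing on one side) to (a, b)."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

# Trimmed copies of the two configs posted in the thread.
first = parse_basic_cfg("""
MaxMsgSend=92;
MaxSizeGuaranteed=128;
MaxSizeNonguaranteed=64;
class sockets
{
    maxPacketSize=2048;
};
""")
second = parse_basic_cfg("""
MaxMsgSend=256;
MaxSizeNonguaranteed=256;
class sockets
{
    maxPacketSize=1400;
};
""")

for key, (left, right) in diff_cfg(first, second).items():
    print(f"{key}: {left} -> {right}")
```

A value of `None` on one side means that config does not set the key at all, as with MaxSizeGuaranteed in the second config.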
 
Ok, this is a pretty big thing here and the cause of a lot of headaches for servers. Most likely the issue is this config option. The formula below shows how much bandwidth your server is taking. Min bandwidth and max bandwidth do not matter as much as the four items in it. My server tool has this built into the interface already.

In addition, maxPacketSize should not exceed the MTU of the line you are connecting to. The default for internet traffic is 1500, and this setting should be below that. You're specifying 2048 bytes per packet, which gets stripped to 1500 (the max); since this protocol does not send acks, the systems just error out and fail to get the desired packet. Not all packets will be sent at this size, so some will actually make it through properly. As the packets get larger you will have more problems, as more of the packets fail.

(((Server FPS * NumPlayers) * MaxSizeGuaranteed) * MaxMsgSend)

@Edd660 your server would consume 29440000 bps with 50 players while running at 50 FPS.
@Karl your server will consume 163840000 bps based on your settings.
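The formula above is easy to check with a few lines of Python. This is only a sketch of the arithmetic as posted, in the units the post quotes; `estimated_bandwidth` is a made-up name. The second config in the thread does not list MaxSizeGuaranteed, so the value 256 used for it here is an assumption, chosen because it reproduces the quoted figure.

```python
def estimated_bandwidth(server_fps, num_players, max_size_guaranteed, max_msg_send):
    """Server FPS * NumPlayers * MaxSizeGuaranteed * MaxMsgSend, as posted."""
    return server_fps * num_players * max_size_guaranteed * max_msg_send

# First config: MaxSizeGuaranteed=128, MaxMsgSend=92, 50 players at 50 FPS.
first = estimated_bandwidth(50, 50, 128, 92)

# Second config: MaxMsgSend=256. MaxSizeGuaranteed is not set in that file;
# 256 is an ASSUMPTION that happens to match the figure quoted in the post.
second = estimated_bandwidth(50, 50, 256, 256)

LINK = 100_000_000  # a 100 Mbps link, for comparison

for name, value in (("first", first), ("second", second)):
    print(f"{name}: {value} ({value / LINK:.0%} of a 100 Mbps link)")
```

Since the result scales linearly with every factor, halving MaxMsgSend or MaxSizeGuaranteed halves the estimate.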




 
Thanks for that info - I'll try tweaking those settings (and changing back to 1500 max packet, since it's only 1500 mtu on the network). We're running on a 100Mbps IP link (that's actual available bandwidth, not just the connection speed). Before updating to 1.7.5.1, we were only peaking at 15-20Mbps (couple of servers, total of around 80-90 players in peak times). Based on your calculations, we should have plenty of bandwidth available...
 
1500 is too high for the max packet size, as you have to subtract at least 8 bytes for header data. Different network components can lower the maximum size even further. So setting the max packet size to 1500 will cause fragmentation whenever a packet is 1500 bytes or larger.

You can determine the MTU the following way:

Code:
ping www.google.com -f -l 1492

If it reports that the packet needs to be fragmented, the size is too high.
Reduce the number until the pings go through; then you have the maximum size a packet can be without being fragmented.
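Note that the number given to `ping -l` is the ICMP payload only, so the path MTU is that number plus the IPv4 and ICMP headers. A small sketch of the arithmetic, assuming IPv4 without options; the function names are made up for illustration, and whether the game's maxPacketSize counts any headers is not something this thread establishes.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header
UDP_HEADER = 8    # bytes, UDP header

def path_mtu_from_ping(largest_ok_payload):
    """MTU implied by the largest `ping -f -l <size>` that is not fragmented."""
    return largest_ok_payload + IP_HEADER + ICMP_HEADER

def max_udp_payload(mtu):
    """Largest UDP payload that fits into one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - UDP_HEADER

# Example: the largest ping that goes through is 1472 bytes.
mtu = path_mtu_from_ping(1472)
print(mtu, max_udp_payload(mtu))
```

So a passing size of 1472 corresponds to a standard 1500-byte MTU, and anything the server sends in a single UDP datagram above 1472 bytes would be fragmented on that path.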

Regarding FPS: Even on high end machines FPS drop to 3-5 at 50 Players.
 
I don't really understand much of this, can someone help me?
I have two 60-slot servers which are usually full several hours a day.
My server FPS ranges from around 1 to 4, and the max size from that ping test was 1472.

What settings are optimal for this?
 
Here is my config, it has zero to no lag and typically runs between 30-50ish FPS with a full server.

Code:
MinBandwidth=10000000;
MaxBandwidth=40000000;
MaxMsgSend=26;
MaxSizeNonguaranteed=256;
MinErrorToSendNear=0.050000001;
MinErrorToSend=0.001;
MaxCustomFileSize=0;
MaxSizeGuaranteed=512;
Windowed=0;
adapter=-1;
3D_Performance=1;
Resolution_Bpp=32;
class sockets
{
    maxPacketSize=1400;
};
 
Running DayZ? I find that hard to believe. We are talking about server FPS taken from the FPS Debug loop in the RPT, right?
The reason for the low FPS on DayZ servers is that the AI thread (which handles the zombies serverside) is single-threaded and was not built to scale to hundreds of zombies on a server.
This is why ALL DayZ servers I have seen run with low FPS.
 

30 users online during this


Code:
18:37:52 "DISCONNECT: swiggy-diggy-dagger (40722438) Object: 2620a040# 1064189: man_survivor.p3d, _characterID: 248"
18:37:53 "WRITE: "["PASS"]""
18:37:53 Client: Remote object 54:13 not found
18:37:53 Client: Remote object 2:501 not found
18:38:13 "READ/WRITE: "["PASS",[2013,2,5,22,31]]""
18:38:43 "DEBUG FPS  : 49.8442"
18:39:02 "DISCONNECT: Lagrange (80422598) Object: any, _characterID: any"
18:39:02 "WRITE: "["PASS"]""
18:39:02 Client: Remote object 2:511 not found
18:39:02 Client: Remote object 58:13 not found
18:39:53 "DISCONNECT: Gingersnaps (63133062) Object: 2c58a040# 1064350: man_survivor.p3d, _characterID: 257"
18:39:53 "WRITE: "["PASS"]""
18:39:53 Client: Remote object 2:503 not found
18:40:25 "DISCONNECT: Ammo (95700038) Object: 2c50c080# 1064233: ghillie_overall.p3d, _characterID: 125"
18:40:25 "WRITE: "["PASS"]""
18:40:25 Client: Remote object 55:13 not found
18:40:25 Client: Remote object 2:505 not found
18:41:44 "DEBUG FPS  : 50.3145"
18:43:13 "READ/WRITE: "["PASS",[2013,2,5,22,36]]""
18:44:45 "DEBUG FPS  : 50.3145"
18:47:46 "DEBUG FPS  : 50.1567"
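To compare FPS between servers without eyeballing the RPT, the "DEBUG FPS" lines can be pulled out with a short script. This is a sketch that assumes the line format shown in the excerpt above (timestamp, then a quoted `DEBUG FPS  : value`); the name `debug_fps` is made up, and the sample is a few lines copied from that excerpt.

```python
import re

# Matches lines such as: 18:38:43 "DEBUG FPS  : 49.8442"
FPS_LINE = re.compile(r'^(\d{2}:\d{2}:\d{2}) "DEBUG FPS\s*:\s*([\d.]+)"')

def debug_fps(lines):
    """Return (timestamp, fps) tuples for every DEBUG FPS line."""
    readings = []
    for line in lines:
        match = FPS_LINE.match(line.strip())
        if match:
            readings.append((match.group(1), float(match.group(2))))
    return readings

sample = '''18:38:43 "DEBUG FPS  : 49.8442"
18:39:02 "DISCONNECT: Lagrange (80422598) Object: any, _characterID: any"
18:41:44 "DEBUG FPS  : 50.3145"
18:44:45 "DEBUG FPS  : 50.3145"
18:47:46 "DEBUG FPS  : 50.1567"'''.splitlines()

readings = debug_fps(sample)
values = [fps for _, fps in readings]
print(f"samples={len(values)} min={min(values):.2f} "
      f"mean={sum(values) / len(values):.2f}")
```

For a real file, pass the lines of the RPT (for example from `open(path, errors="replace")`) instead of `sample`.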
 
This comes as quite a surprise.
Could you share a full RPT from your server? I'd like to compare a few things.

Which NIC are you using?
Which CPU?
Which Operating System?
Any special TCP/IP Tweaks OS side?
What ping limit do you have configured?

Also: would it be possible that you verify the output by logging in as admin ingame and running #monitor 1?
 

I am running this on a Windows Server 2008 R2 x64 machine with 8 GB RAM and 8 CPUs, running as a VM on an ESX 5 box. The ESX box is a PowerEdge 1950 with 24 GB RAM and dual Xeon E5410 processors. The ESX boxes connect to a PowerEdge 2950 that houses the enterprise storage iSCSI disks for redundancy purposes.

I have 5 network cards in each ESX box, 1 connects directly to the cable modem, the other 4 connect to the gigabit switch for local network and iSCSI traffic (again enterprise class). The enterprise storage has 8 nics in it to allow fast iSCSI traffic between both ESX hosts.

I run on a Comcast business class tier 3 connection 50/10.

I just killed all of the servers (there are 7 DayZ servers running on the box) a little while ago to update beta versions, so there is no CPU load at the moment.

In the report file you will see the lowest it drops is to around 29 FPS, except in the beginning as it's spinning up the server at 7 FPS. The rest of the time it is running around 50. You can also see the hacker come in and clear the server :(

Report File



 
more red chains since the update than I have ever seen.

This update moved more stuff server side than previously. In order to pitch a tent your client now sends 3 event handlers.

The reason for the low FPS on DayZ servers is that the AI thread (which handles the zombies serverside)

Zombies are not handled by the server. They are spawned as Agents and not units, and the entity is bound to the machine that created it. If that player logs out, the server will kill the zombies that their PC was in control of (Killing uncontrolled zombie). The only thing the server does with zombies is sync them between clients.

I am running this on a Windows Server 2008 R2 x64 machine with 8 GB RAM and 8 CPUs, running as a VM on an ESX 5 box. The ESX box is a PowerEdge 1950 with 24 GB RAM and dual Xeon E5410 processors. The ESX boxes connect to a PowerEdge 2950 that houses the enterprise storage iSCSI disks for redundancy purposes.

I have 5 network cards in each ESX box, 1 connects directly to the cable modem, the other 4 connect to the gigabit switch for local network and iSCSI traffic (again enterprise class). The enterprise storage has 8 nics in it to allow fast iSCSI traffic between both ESX hosts.

I run on a Comcast business class tier 3 connection 50/10.

that setup is S e X e
 