EIdSocketError.Host not found - server stops responding


Gargamel
29-05-2006, 17:51
I am the owner of a big public non-commercial TS server, and we are having a nasty problem with it.

The error can be found in server.log:

ERROR,All,TTSUDPSender.Execute, SID: 1 Exception2 EIdSocketError.Host not found.

After this error the server kicks all users and no one can join, although the server process keeps running and is still shown as online on tsviewer.com. We have to restart it manually; only then can people join again.

Here are the details about our server, answering the questions from this topic: http://forum.goteamspeak.com/showthread.php?t=12916&page=13

Can you please post some useful info? Show us:
(a) a process listing when the problem occurs
(b) operating system details, what kernel, what distro
(c) TS server details, version, virtual server count, slot count, how the server is used (tcpquery, webadmin, custom webadmin?)
(d) How often the error occurs, can you reproduce it? If so how?

a - listing of processes :
10751 ? SN 0:01 ./server_linux -PID=tsserver2.pid
10752 ? S 0:00 ./server_linux -PID=tsserver2.pid
10753 ? S 0:26 ./server_linux -PID=tsserver2.pid
10754 ? S 0:55 ./server_linux -PID=tsserver2.pid
10755 ? S 0:58 ./server_linux -PID=tsserver2.pid
10756 ? S 0:00 ./server_linux -PID=tsserver2.pid
10757 ? S 0:00 ./server_linux -PID=tsserver2.pid
10758 ? S 0:00 ./server_linux -PID=tsserver2.pid
10759 ? S 0:00 ./server_linux -PID=tsserver2.pid

So the server is running correctly but it's kicking all users out.

b - Slackware 10.2, kernel 2.6.16.5

Server details: Dual P4 Xeon 2.8GHz with 4GB of RAM, running an SMP kernel.


c - we were using server version 2.0.21.3 and the newest beta 2.0.22.2. We have 150 slots, but the maximum number of users is about 100 in the evenings. We are using the standard query port and webadmin; we also tried disabling webadmin and changing the query port. Number of virtual servers: 1. We are also using SQLite.

d - the error occurs mostly 2-3 times a day (once at noon and again in the evening hours; the number of users does not affect it), but we have also had almost 2 days without it, and days with 6-7 occurrences. We cannot reproduce the error. The server processes keep running, but the server kicks all users out and stops responding to anything. In the advanced logs we can see users joining the server, but after 2 seconds they are disconnected.

List of topics connected to the problem :
http://forum.goteamspeak.com/showthread.php?t=19705
http://forum.teamspeak.org/showthread.php?t=16165&page=2
http://forum.goteamspeak.com/archive/index.php/t-3026.html
http://forum.teamspeak.org/showthread.php?t=16165
http://forum.goteamspeak.com/showthread.php?t=12916

There is no solution to the problem yet, and we need help fast.
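Until the cause is found, the manual restart described above could in principle be automated. The following is a minimal sketch, not a tested tool. It assumes that the Exception2 line quoted above marks the moment the server stops responding, and that every (re)start writes a "Starting server with port" line to server.log, as in the log excerpts later in this thread. The function name needs_restart and the log path are hypothetical, and the restart command itself is left out.

```shell
#!/bin/sh
# Sketch only: decide from server.log whether the TS2 server has hung.
# Assumption: an "Exception2 EIdSocketError.Host not found" line that is
# not followed by a "Starting server with port" line means the server
# stopped responding and has not been restarted since.

needs_restart() {
    # $1 = path to the TeamSpeak server.log
    awk '
        /Starting server with port/                  { hung = 0 }
        /Exception2 EIdSocketError\.Host not found/  { hung = 1 }
        END { exit (hung ? 0 : 1) }
    ' "$1"
}

# Example use from cron (the path is a placeholder):
#   needs_restart /path/to/server.log && echo "TS2 server needs a restart"
```

The function returns success only when the last relevant event in the log is the Exception2 error, so a server that has already been restarted is not restarted again.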

Peter
29-05-2006, 19:50
Can you get a traceroute on the server when it's in an "I don't accept anybody" state?

Gargamel
29-05-2006, 20:45
Traceroute works correctly both to the server and out from it.

Peter
29-05-2006, 21:30
Hmmm ok,

I think I heard Ralf mention this error message once, but I forgot what the story was. We will have to wait for him to comment. Don't expect a response before Tue 6.6.6 though; not that Ralf is that satanic, but he's currently on vacation.

SBHS-Scott
31-05-2006, 00:16
I've been having this problem ever since we switched to the latest dev binary as well.

I have 5 boxes running upwards of 2000 total slots between them, and it happens completely at random on each one. They run various kernel versions and distros.

There is nothing wrong with the servers.... They run (and have run for years) the older binaries just fine. Too bad they are prone to crashing from an exploit.

I think whatever was changed in the latest binary introduced a bug that is causing this.

Gargamel
31-05-2006, 11:12
There is nothing wrong with the servers.... They run (and have run for years) the older binaries just fine. Too bad they are prone to crashing from an exploit.

I think whatever was changed in the latest binary introduced a bug that is causing this.

Thanks Scott. Could you tell me which version we should use to get rid of this nasty bug (we checked 2.0.21.3 and the newest beta 2.0.22.2)? And most importantly, what exploit should we expect when using the older binaries?

I am now trying 2.0.20.1.

SBHS-Scott
31-05-2006, 12:42
2.0.20.1 - the whole server will crash.

Another thing you can try with the latest version is to create another virtual server on your server, and hope that's the one that hangs.

Gargamel
31-05-2006, 14:59
2.0.20.1 is the last version I can swap in directly when taking a binary from the archive - ftp://ftp.freenet.de/pub/4players/teamspeak.org/releases/archive/

2.0.19.16 - replacing server_linux makes the server work, but without the channels directory. That is probably because this old version needs a fresh install. We will do that if necessary, but I need to know exactly which old version did not have this EIdSocketError. Scott, can you remember?

I still believe the devs are going to handle this case and find a solution to this problem.

UPDATE - version 2.0.20.1 tested - error still occurs :(

Tarkin
01-06-2006, 17:16
First of all, I have a similar problem with my TS server, currently running version 2.0.22.1. I use this server only for a small group of friends (maximum of 5 users, usually 2-3 online at the same time). The problem has existed since I installed the TS server and no one could help me with it, but today I found an interesting piece of information (I don't know the URL anymore).

It said that this happened to users on managed servers and could be solved by the administrators changing the firewall configuration. So I took a closer look at my box and realised that the exception occurred at specific times (I must have been blind to overlook this). At those specific moments my system runs a script that sets up the firewall configuration. Because of this I assume that the TS server process crashes on my machine when it cannot deliver a packet (in the short time window it takes to set up the rules) and goes into an undefined state as described above.

I disabled this script and will see if the problem is solved.

For the developers: is it possible that one packet which cannot be delivered causes the whole server to crash? Perhaps only packets which cannot leave the system (firewall, broken network link...) cause this? It shouldn't be so, in my opinion ;-)

SBHS-Scott
02-06-2006, 13:21
Gargamel: Strange, I never get the error with 2.0.20.1.

Tarkin: I saw a page possibly the same one you are referring to, but I've gotten the error even after disabling the firewall on a server.

The error does seem to indicate that it can't contact a remote host though.

Peter
02-06-2006, 13:59
So to sum up what we have so far:

- The error can be triggered by preventing the server from sending packets via a firewall (e.g. by restarting a firewall script)
- For some people the error comes up apparently at random and unrelated to any firewall. It would be nice to have some more information here, but it can be hard to get sometimes...

In any case, it seems the TS2 server overreacts to this error and shuts down instead of emitting a warning and continuing. We will see what Ralf has to say when he returns from his holidays.

FlashMaster
04-06-2006, 04:22
Sorry up front for the lengthy note.

I may have found the root cause of my "Host not found." problem as described here and elsewhere on this forum. It may only apply to my server, but hopefully it serves as a pointer for others at least.

Topology:
I run a private TS2 clanserver on Fedora Core 4.
I have a LAN connecting to a PC gateway which again connects via the ADSL router to the WAN.
I am running the TS server on the PC gateway, and I also run Firestarter on it (all necessary ports forwarded and/or open).
I have two nics: eth0 connects to WAN (router), eth1 connects to LAN.
My TS client connects to the TS server from the WAN as I live in a different location from where the server is.

Problem:
My fellow clanmembers and I are connected to the server for around 2 hours when it shuts down. The tsserver2.pid processes are still running, but the virtual server is unreachable. To get it back up and running, I have to manually restart the server application. The problem only appears when users are logged on, and not when the server is not utilised. It is a reoccurring problem.

I did a comparison between the TS serverlog and the system log, and found that the error coincided with my dhcp daemon restarting:

TS server.log:

---------------------------------------------------------------
-------------- log started at 03-06-06 18:37 -------------
---------------------------------------------------------------
03-06-06 18:37:26,ALL,Info,server, Server init initialized
03-06-06 18:37:26,ALL,Info,server, Server version: 2.0.19.40 Linux
03-06-06 18:37:27,ALL,Info,server, Starting server with port: 8767
03-06-06 18:37:34,ALL,Info,server, Server init finished
03-06-06 18:37:34,WARNING,Info,server, TeamSpeak Server daemon activated
03-06-06 19:21:28,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception1 EIdSocketError.Host not found.
03-06-06 20:17:17,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception2 EIdSocketError.Host not found.
03-06-06 20:18:52,ALL,Info,server, Starting server with port: 8767


Extract from /var/log/messages:

Jun 3 19:21:22 localhost dhclient: DHCPREQUEST on eth0 to 10.0.0.1 port 67
Jun 3 19:21:22 localhost dhclient: DHCPACK from 10.0.0.1
Jun 3 19:21:29 localhost dhcpd: dhcpd shutdown succeeded
Jun 3 19:21:29 localhost dhcpd: Internet Systems Consortium DHCP Server V3.0.2-RedHat
Jun 3 19:21:29 localhost dhcpd: Copyright 2004 Internet Systems Consortium.
Jun 3 19:21:29 localhost dhcpd: All rights reserved.
Jun 3 19:21:29 localhost dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jun 3 19:21:29 localhost dhcpd: Wrote 3 leases to leases file.
Jun 3 19:21:29 localhost dhcpd: Listening on LPF/eth1/XX:XX:XX:XX:XX:XX/192.168.8/24
Jun 3 19:21:29 localhost dhcpd: Sending on LPF/eth1/XX:XX:XX:XX:XX:XX/192.168.8/24
Jun 3 19:21:29 localhost dhcpd:
Jun 3 19:21:29 localhost dhcpd: No subnet declaration for eth0 (10.0.0.6).
Jun 3 19:21:29 localhost dhcpd: ** Ignoring requests on eth0. If this is not what
Jun 3 19:21:29 localhost dhcpd: you want, please write a subnet declaration
Jun 3 19:21:29 localhost dhcpd: in your dhcpd.conf file for the network segment
Jun 3 19:21:29 localhost dhcpd: to which interface eth0 is attached. **
Jun 3 19:21:29 localhost dhcpd:
Jun 3 19:21:29 localhost dhcpd: Sending on Socket/fallback/fallback-net
Jun 3 19:21:29 localhost dhcpd: dhcpd startup succeeded
Jun 3 19:21:29 localhost dhclient: bound to 10.0.0.6 -- renewal in 3342 seconds.


Jun 3 20:17:11 localhost dhclient: DHCPREQUEST on eth0 to 10.0.0.1 port 67
Jun 3 20:17:11 localhost dhclient: DHCPACK from 10.0.0.1
Jun 3 20:17:17 localhost dhcpd: dhcpd shutdown succeeded
Jun 3 20:17:17 localhost dhcpd: Internet Systems Consortium DHCP Server V3.0.2-RedHat
Jun 3 20:17:17 localhost dhcpd: Copyright 2004 Internet Systems Consortium.
Jun 3 20:17:17 localhost dhcpd: All rights reserved.
Jun 3 20:17:17 localhost dhcpd: For info, please visit http://www.isc.org/sw/dhcp/
Jun 3 20:17:17 localhost dhcpd: Wrote 3 leases to leases file.
Jun 3 20:17:17 localhost dhcpd: Listening on LPF/eth1/XX:XX:XX:XX:XX:XX/192.168.8/24
Jun 3 20:17:17 localhost dhcpd: Sending on LPF/eth1/XX:XX:XX:XX:XX:XX/192.168.8/24
Jun 3 20:17:17 localhost dhcpd:
Jun 3 20:17:17 localhost dhcpd: No subnet declaration for eth0 (10.0.0.6).
Jun 3 20:17:17 localhost dhcpd: ** Ignoring requests on eth0. If this is not what
Jun 3 20:17:17 localhost dhcpd: you want, please write a subnet declaration
Jun 3 20:17:17 localhost dhcpd: in your dhcpd.conf file for the network segment
Jun 3 20:17:17 localhost dhcpd: to which interface eth0 is attached. **
Jun 3 20:17:17 localhost dhcpd:
Jun 3 20:17:17 localhost dhcpd: Sending on Socket/fallback/fallback-net
Jun 3 20:17:17 localhost dhcpd: dhcpd startup succeeded
Jun 3 20:17:17 localhost dhclient: bound to 10.0.0.6 -- renewal in 3282 seconds.


As you may notice, the error occurs at the same time as the dhcp daemon shuts down for restart:

First TS error:
03-06-06 19:21:28,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception1 EIdSocketError.Host not found.
dhcpd restart:
Jun 3 19:21:29 localhost dhcpd: dhcpd shutdown succeeded

Second error and subsequent TS crash:
03-06-06 20:17:17,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception2 EIdSocketError.Host not found.
dhcpd restart:
Jun 3 20:17:17 localhost dhcpd: dhcpd shutdown succeeded

Here is another earlier entry from the logs:

First error:
02-06-06 20:31:58,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception1 EIdSocketError.Host not found.
dhcpd restart:
Jun 2 20:31:58 localhost dhcpd: dhcpd shutdown succeeded

Second error and TS crash:
02-06-06 21:29:09,ERROR,All,TTSUDPSender.Execute, SID: 1 Exception2 EIdSocketError.Host not found.
dhcpd restart:
Jun 2 21:29:09 localhost dhcpd: dhcpd shutdown succeeded

Again, the error coincides with the dhcp daemon restarting.
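The manual comparison above can be sketched as a small script. This is only an illustration under stated assumptions: both logs cover the same month, the timestamps look like the excerpts above, and no pair of events straddles midnight. The function name correlate_errors and the default window of 5 seconds are hypothetical choices, not part of TeamSpeak or dhcpd.

```shell
#!/bin/sh
# Sketch only: list TeamSpeak "Host not found" errors that fall within a
# few seconds of a dhcpd restart recorded in the system log.
# Assumptions: same month in both logs, timestamp formats as in the
# excerpts above, no event pair straddling midnight.

correlate_errors() {
    # $1 = TeamSpeak server.log, $2 = syslog file, $3 = window in seconds
    awk -v window="${3:-5}" '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(hms,    t) {
            split(hms, t, ":")
            return t[1] * 3600 + t[2] * 60 + t[3]
        }
        # The syslog file is read first, e.g.
        # "Jun 3 19:21:29 localhost dhcpd: dhcpd shutdown succeeded"
        FILENAME == ARGV[1] {
            if ($0 ~ /dhcpd: dhcpd shutdown succeeded/) {
                n++; day[n] = $2 + 0; at[n] = secs($3); raw[n] = $3
            }
            next
        }
        # Then server.log, e.g.
        # "03-06-06 19:21:28,ERROR,All,TTSUDPSender.Execute, ..."
        /EIdSocketError\.Host not found/ {
            d = substr($1, 1, 2) + 0
            hms = substr($2, 1, 8)
            for (i = 1; i <= n; i++) {
                diff = at[i] - secs(hms)
                if (diff < 0) diff = -diff
                if (day[i] == d && diff <= window)
                    printf "%s %s: TS error, dhcpd restart at %s\n", $1, hms, raw[i]
            }
        }
    ' "$2" "$1"
}

# Example (paths are placeholders):
#   correlate_errors /path/to/server.log /var/log/messages 5
```

Each output line pairs one TeamSpeak error with one dhcpd shutdown on the same day of the month; an error with no nearby restart produces no output, which would point to a different trigger.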

As you can see, I am running TS version 2.0.19.40. I was running 2.0.20.1 before that, but I got this problem, so I decided to test the older v19.40 to see if it was due to a bug in the newer version, but I guess not...

Solution:
In my specific case, I am running two nics - eth0(WAN) and eth1(LAN). I have no subnet declaration for eth0 as this NIC receives DHCP configuration from the ADSL router. I have a subnet declaration for eth1 as this NIC serves as a DHCP provider for the LAN.

To ensure that the dhcp daemon only handled eth1 and did not interfere with eth0 (through which the TS clients connect), I made the following change to /etc/sysconfig/dhcpd (RH/FC):

# vi /etc/sysconfig/dhcpd
DHCPDARGS=eth1

I then did:
# /etc/init.d/dhcpd restart

This way, I am telling the dhcp daemon to provide its service only on eth1 whenever it automatically restarts - it's the equivalent of manually starting the daemon with the interface name as an argument:

# dhcpd eth1

My server has now been running with users continuously logged on for 15 hours without failure, which is a lot more than the 2 hours it lasted previously. I am now testing with v2.0.20.1.

For those who only use one NIC, perhaps there is a DHCP server conflict somewhere, or the dhcp daemon is pointed at the wrong NIC if you have more than one installed in the server. I'm no expert on DHCP or TeamSpeak, so this is only speculation, but I suggest checking your /var/log/messages or equivalent to be sure.

Good luck.

FlashMaster

maggy
05-06-2006, 06:39
i have to agree with the above poster and peter: the problem seems to be centered around the reloading of traffic rules (either by resetting a firewall or via dhcp), and TS should be suppressing this error instead of choking on it. i noticed that this problem started for me once i began using APF (http://www.rfxnetworks.com/apf.php), which regularly reloads its rules. unfortunately on that machine i have no option but to use it, since it is an important box.

as for the inner mechanics of the why, im sorry i cant be of much help there :)

FlashMaster
05-06-2006, 23:26
As an update to my previous post, I tested 2.0.20.1, but the problem came back. Consequently I disabled the dhcpclient on eth0 (WAN) and assigned a permanent IP to this NIC. The server has now been running continuously for 12 hours with members logged in, with no sign of trouble, and no errors in the server.log file.
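For readers on a Red Hat / Fedora style system, a static configuration of the WAN NIC might look roughly like the fragment below. This is only a sketch: the addresses 10.0.0.6 and 10.0.0.1 are taken from the dhclient lines in the log extract earlier in this thread, while the netmask and the use of 10.0.0.1 as gateway are assumptions that must be checked against the actual router setup.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (Red Hat / Fedora style)
# Sketch only. 10.0.0.6 and 10.0.0.1 come from the dhclient log lines
# earlier in this thread; NETMASK and GATEWAY are assumptions.
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static        # no DHCP client on this interface
IPADDR=10.0.0.6
NETMASK=255.255.255.0   # assumed, not stated in the thread
GATEWAY=10.0.0.1        # assumed: the ADSL router that answered DHCP
```

With no DHCP client on eth0 there is no periodic lease renewal on that interface, which is the event the errors coincided with in the logs above.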

Imho I think the TS server is giving up too easily on this error, and it is clear that it is able to continue operating after the first error, but crashes on the second one. As maggy says - is it not possible to have the server suppress this error as it does not seem to be significant?

Thanks

FlashMaster

SBHS-Scott
07-06-2006, 06:39
Well whatever is causing it, it's obvious that it can't contact whatever host it's wanting to, and is waiting for contact to be made before continuing to run (without trying anymore).

Any word on a fix yet?

Peter
07-06-2006, 14:31
Hey,

We just added two new files to our download section[1] (one Windows and one Linux binary). They go by the version number 2.0.22.3 and seem to fix the problem described here. Everybody who has these errors, please try the binary and report back how it goes (especially if it doesn't fix the bug, which I hope won't be the case :P). For those of you who don't know how to use developer releases, just pop the respective binary into the teamspeak2 server folder (overwriting the old binary) and start up.

[1] Download Section: http://www.goteamspeak.com/index.php?page=downloads
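On Linux, the "pop the binary into the server folder" step could be done along these lines. It is a minimal sketch, assuming the server binary is called server_linux as in the process listing at the top of this thread; the function name install_dev_binary and the .bak suffix are hypothetical. Stop the server first, because overwriting a running executable usually fails with "Text file busy".

```shell
#!/bin/sh
# Sketch only: replace server_linux with a downloaded developer binary,
# keeping a backup copy so the change can be rolled back.

install_dev_binary() {
    # $1 = downloaded binary, $2 = TeamSpeak 2 server directory
    new_binary=$1
    server_dir=$2
    target="$server_dir/server_linux"

    [ -f "$new_binary" ] || { echo "no such file: $new_binary" >&2; return 1; }

    # Keep the old binary as server_linux.bak.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi

    cp "$new_binary" "$target" && chmod +x "$target"
}

# Example (paths are placeholders):
#   install_dev_binary /path/to/downloaded/server_linux /path/to/tss2_rc2
```

To roll back, stop the server again and copy server_linux.bak over server_linux.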

maggy
08-06-2006, 20:38
installing now
will start testing immediately

Jahklar
09-06-2006, 20:37
My server has been running for roughly 31 hours now with the new binary and no problem as of yet. Will post if something happens, if I don't post you can assume everything is working nicely. :)

maggy
10-06-2006, 04:47
32 hours and no freezes yet
i will come back in a week or so and if it's still running i think we can say case closed on this one hopefully

Bastian
11-06-2006, 00:10
Thank you for the feedback.

Gargamel
11-06-2006, 09:57
Thank you for the feedback.
We would like to thank the devs for their great work.
My server has been running for almost 3 days without this error.

Jahklar
20-06-2006, 00:21
my server has been running for 11.5 days now with no issues. I would be willing to say this issue has been completely resolved.

Thanks for the fix devs!

Tarkin
26-06-2006, 12:04
My problem seems to be fixed too. The server has been running for several days now without any problems.

thx devs