Categories
Downtimes Problems

Server flatlined four times in not even two hours…

For reasons still unknown, the server that hosts the Jabber services crashed four times in less than two hours. From one second to the next, the ejabberd processes grabbed every resource they could get, and then some: 8 gigs of RAM and 8 gigs of swap, all gone, plus a lot of CPU load. The machine was so overloaded that “top” refreshed only every 5 minutes, and in the end only a hardware reset got the machine to reboot.

For the tech geeks:

top - 19:56:21 up 31 min,  1 user,  load average: 22.86, 13.11, 8.71
Tasks: 240 total,   3 running, 231 sleeping,   0 stopped,   6 zombie
Cpu(s):  1.4%us,  5.8%sy,  0.0%ni, 12.4%id, 80.3%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8190900k total,  8138972k used,    51928k free,      796k buffers
Swap:  8393848k total,  7276916k used,  1116932k free,    42404k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
3239 ejabberd  20   0 15.7g 6.0g 460 S   23 76.7   3:06.65 beam.smp
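
To put the numbers above into perspective, here is a minimal sketch that turns the "Mem:" and "Swap:" lines into percentages. The values are copied from the top output; the function name `percent_used` is just a hypothetical helper for this illustration.

```python
# Values copied from the "Mem:" and "Swap:" lines of the top output above,
# in the "k" units that top reports.
MEM_TOTAL_K = 8190900
MEM_USED_K = 8138972
SWAP_TOTAL_K = 8393848
SWAP_USED_K = 7276916


def percent_used(used_k: int, total_k: int) -> float:
    """Return the used share of a resource as a percentage (hypothetical helper)."""
    if total_k <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used_k / total_k


if __name__ == "__main__":
    print(f"RAM used:  {percent_used(MEM_USED_K, MEM_TOTAL_K):.1f}%")
    print(f"Swap used: {percent_used(SWAP_USED_K, SWAP_TOTAL_K):.1f}%")
```

That works out to roughly 99.4% of RAM and 86.7% of swap in use at the moment of the snapshot.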

We are looking into this issue. It may be a severe bug in ejabberd, or it may be a DoS attack. We don’t know yet.

Categories
Problems Transports

ICQ Transport stable again

We had stability problems with ICQ Transport starting yesterday morning. This was caused by a certain version of libpurple. Issues should be fixed now.

Please note that when a Spectrum transport crashes (as opposed to a “normal” stop or restart), your client may think it is still online. As a result, some contacts appear online even though the client is not actually logged into the transport at that moment. The Spectrum developers are working on this issue. Until then I just hope that the Spectrum transports run stably. ;-)

With the new libpurple, the Gadu-Gadu transport should finally be stable, too. We’ll see how this works out in the near future.

Categories
Downtimes Problems

Problem with Jabber database

Unfortunately there was a major problem with the database for all accounts of the jabber.hot-chilli.net domain (not accounts from other domains, like jabber.hot-chilli.eu).

In the end we decided to restore a backup from 4th/5th of May 2010 (the day of the server move) and had to take the Jabber server down for about 2 hours.

Only the contact lists and contact groups are affected. If you are one of the affected users, you will have to redo every buddy you added or deleted since then.

We really apologize for the trouble caused, especially because the backup is one week old.

The question remains why this morning’s backup of our current database contained just 20 rows of data, missing 150000 (!) other rows. We will take a close look at the backup process.
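
One simple safeguard against this kind of silent truncation would be to compare the row count of a fresh backup with the live database before trusting it. The following is only a minimal sketch of that idea, not our actual backup process; the function name `backup_looks_complete` and the 1% tolerance are assumptions made up for this illustration.

```python
def backup_looks_complete(live_rows: int, backup_rows: int,
                          tolerance: float = 0.01) -> bool:
    """Return True if the backup holds at least (1 - tolerance) of the live rows.

    Hypothetical helper: both row counts must be obtained separately,
    e.g. by counting rows in the live database and in the dump.
    """
    if live_rows < 0 or backup_rows < 0:
        raise ValueError("row counts must be non-negative")
    if live_rows == 0:
        return True  # an empty live table cannot lose rows
    return backup_rows >= live_rows * (1.0 - tolerance)
```

With the numbers from this incident, `backup_looks_complete(150020, 20)` returns `False`, so a check like this could have flagged the bad backup right away.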

Categories
Maintenance Problems Transports

Jabber Disk, SMS Gateway and JMC moved

Finally, Jabber Disk, SMS Gateway and JMC moved to the new server. Jabber Disk and the SMS Gateway are up and running.

Unfortunately JMC is still very unstable. Please be patient, I’m in contact with the programmer and it looks like he is willing to put some time into the code. We’ll see how this works out. I hope that we can achieve some improvement, because this seems to be a popular and widely used feature.

Categories
Downtimes Problems

Network problems for hours

Our provider experienced network problems starting yesterday (05/06/2010) at 2pm CEST. The outages affected a lot of ISPs. Here in Germany, T-Online and Alice worked, while many others such as KabelBW and Strato did not. The severe problems went away at about 6pm, but we still experienced some problems until this morning. According to our server provider the problems are gone now. They were caused by an external attack of more than 50 Gbits.

Categories
Downtimes Problems

6 hour downtime…

The Jabber server has just recovered from a 6-hour downtime, at 9am CEST.

Sorry guys, there was a mistake in the config file, introduced when a new Jabber domain was added to it.

We deeply apologize for the trouble caused.

Categories
Downtimes Maintenance Problems Transports

Server move almost finished

Ok, the server move is almost finished. JMC, jDisk and SMS still have to be moved, that’s gonna happen later today… ;-)

If you experience any other problems, please contact us!

Categories
Downtimes Maintenance Problems

Upcoming Jabber server move

Due to massive hardware problems we will move the Jabber service to a new machine. This will happen soon, possibly as early as this evening or tonight. You won’t be able to reach the services for about an hour. The new IP address will be 178.63.27.18, in case your DNS does not update in time. We apologize for the trouble caused.
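
If you want to check whether your DNS already points to the new machine, a minimal sketch like the following could help. It uses only Python’s standard `socket` module; the host name is an assumption (use whatever your client connects to), and the helper names `resolve_ipv4` and `dns_is_updated` are made up for this illustration.

```python
import socket

NEW_IP = "178.63.27.18"             # new address from the announcement above
HOSTNAME = "jabber.hot-chilli.net"  # assumption: the host your client connects to


def resolve_ipv4(hostname: str) -> set[str]:
    """Return the IPv4 addresses your local resolver currently reports."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is (address, port).
    return {info[4][0] for info in infos}


def dns_is_updated(addresses: set[str], expected_ip: str = NEW_IP) -> bool:
    """Return True if the expected new address is among the resolved ones."""
    return expected_ip in addresses
```

Calling `dns_is_updated(resolve_ipv4(HOSTNAME))` returns `False` as long as your resolver still hands out the old address; in that case you can enter the new IP address manually in your client until DNS catches up.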