Archive for February, 2009

Downtime

We’re down again due to another DDoS attack. We’re doing our best to tackle this issue and will be back ASAP. We regret the inconvenience caused.

EDIT: We’re back again. :)

Good Linux blogs.

I usually bookmark good Linux blogs for further reading. Here is one of them:

http://duartes.org/gustavo/blog/

The above blog covers Linux at a rather advanced level, but it’s still understandable.

Below is a resource any Linux admin will end up consulting at least once in their lifetime:

http://www.cyberciti.biz/

Lighttpd breathes with life.

We have never been great fans of Apache. In fact, many large websites do not use it and opt for alternatives such as Lighty or nginx. We chose Lighty mainly for its speed in serving static content, and for the slight speed increase from running PHP over FastCGI rather than through the mod_php module used by Apache. We also chose Lighty because it’s single-threaded, so it uses less memory and does not have to fork a new process every time a request comes in, which matters especially during a DDoS.
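To illustrate the single-threaded point, here is a minimal Python sketch of one event loop answering several connections without forking or spawning threads. It is not lighttpd's code: the `serve_once` and `demo` names, the `echo:` reply and the use of `socket.socketpair()` to stand in for real clients are all assumptions made for the example.

```python
import selectors
import socket


def serve_once(sel):
    """Run one pass of the event loop; return how many requests were answered."""
    handled = 0
    for key, _events in sel.select(timeout=1):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            # Answer in the same thread that noticed the socket was readable.
            conn.sendall(b"echo:" + data)
            handled += 1
        else:
            # Empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return handled


def demo(n_clients=3):
    """Simulate n_clients connections served by a single thread."""
    sel = selectors.DefaultSelector()
    server_sides, clients = [], []
    for i in range(n_clients):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        client_side.sendall(b"req%d" % i)
        server_sides.append(server_side)
        clients.append(client_side)

    handled = 0
    while handled < n_clients:
        handled += serve_once(sel)

    replies = [c.recv(1024) for c in clients]

    # Clean up every socket we opened.
    for s in server_sides:
        sel.unregister(s)
        s.close()
    for c in clients:
        c.close()
    sel.close()
    return handled, replies


if __name__ == "__main__":
    print(demo())
```

The memory saving comes from keeping one process and one selector for all connections, instead of one worker process per request.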

However, over the past few years, Lighty’s development has looked nearly dead. Though the code is stable for the features we use, it’s still not a good sign to see it not being worked on. Thankfully, the dev(s?) stbuehler has breathed a lot of life back into Lighty’s 1.4.x development.

So a big thank you to him for the updates. At the moment, WBB uses the latest version of lighttpd, 1.4.21. We’re already eyeing up version 1.4.22, as 1.4.21 introduced a few bugs.

500 errors, blank pages and connection interrupted errors fixed.

Yes, those dreaded errors are finally gone. The fault was due to a network stack problem in the kernel.

You may now continue kissing the monitor and moaning “yes, yes, yes!” in front of awkward turtles.

DDoS was thought to be blocked but wasn’t.

I cannot detail how I blocked this DDoS; however, I reversed those efforts by writing one line of the configuration incorrectly. This caused a downtime of roughly 4 hours from 4PM to 9PM (GMT) yesterday, as the DDoS was allowed to hit the site. This has now been fixed, and we’re again successfully blocking the attack.

This weekend’s updates.

We have been negotiating with the DC over the past few days, at a rather slow pace, to get one of our servers connected back to the internet. That server does not have a synchronized clock, which causes massive wait times between posts. This can be fixed if we forward packets from our main port to the other server whenever it requests data from the web, but that takes some time to set up. Hence fixing the IP conflict may be the faster way out of the problem.

At this very moment we’re fixing the search. A bug has been reported where Graveyarded topics show up in normal search results that specifically exclude the Graveyard. The issue was caused by code updates not being fully propagated to the LinkChecker Bot’s code, so the bot did not take new variables into consideration. This has now been fixed.

However, whilst fixing this issue we have had to reindex the whole forum (yes, all 10 million articles), and hence we’re receiving 500 errors, as MySQL cannot cope with the combined load from Sphinx and the site. The recent search upgrade was designed so that this sort of full reindex does not run daily, but fortnightly or weekly at off-peak hours. Performed at the correct time, the reindex will not cause 500 errors.

DDoS

We’ve just suffered and blocked a DDoS attack on our servers. Things are running smoothly again.

Some emergency maintenance.

There’s a reason why we run servers without swap, or at least with swap disabled. We simply hate having a 2GB swap filled up even when there is free memory available. Today we shut down MySQL and a number of servers in order to disable swap.
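The "swap filled up even when there is free memory" situation can be spotted from the kernel's memory counters. Here is a minimal Python sketch, assuming Linux's `/proc/meminfo` text format. The `parse_meminfo` and `swap_report` helpers and the `SAMPLE` figures are made up for illustration and are not readings from our servers.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_report(text):
    """Return swap in use alongside free memory, both in kB."""
    info = parse_meminfo(text)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {"swap_used_kb": swap_used, "mem_free_kb": info.get("MemFree", 0)}


# Hypothetical sample: a 2GB swap that is mostly used while RAM is still free.
SAMPLE = """\
MemTotal:        8174352 kB
MemFree:         1203456 kB
SwapTotal:       2097152 kB
SwapFree:         524288 kB
"""

if __name__ == "__main__":
    print(swap_report(SAMPLE))
```

On a real Linux box, the same functions could be fed the contents of `/proc/meminfo` instead of `SAMPLE`.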

We had downtime of roughly 20 minutes.

The time is currently 7:17pm Tuesday (GMT).

Maintenance #2.

We’re just waiting for some new HDDs to be installed onto the server.

Cacti.

Cacti is one of the world’s most powerful monitoring tools. It uses RRDtool to graph server load, network load, the number of users on WBB, and much more (if you know how to use it). It can be used anywhere a graph is needed, even where the values being logged are in the GB per second range.
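The "RR" in RRDtool stands for round robin: storage has a fixed size, and the oldest samples are overwritten as new ones arrive. The Python sketch below only illustrates that idea. It is not RRDtool's file format or API, and the `RoundRobinArchive` class with its methods is hypothetical.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store of (timestamp, value) samples; the oldest drop off first."""

    def __init__(self, slots):
        # deque with maxlen discards the oldest entry once it is full.
        self.samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        self.samples.append((timestamp, value))

    def average(self):
        """Mean of the values currently retained, or None if empty."""
        if not self.samples:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)


if __name__ == "__main__":
    load = RoundRobinArchive(slots=3)
    for t, v in enumerate([1.0, 2.0, 3.0, 4.0]):
        load.update(t, v)
    print(list(load.samples), load.average())
```

Because the archive never grows, disk usage stays constant however long the server has been monitored.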

WBB will begin to use this tool to graph our statistics in a secret location.

Now if only the installation were much easier.
