What’s in the pipeline.
Boy, have we been busy rolling out updates to many aspects of the community. Here are just some of the things that have been going on behind the scenes.
1. Searching has been improved. The bug that caused Graveyard topics to appear in normal results, even when they were explicitly excluded, has been fixed. This bug was caused by a flaw in the design of the DB schema. There have also been other enhancements to the search backend. One you will not notice is a decrease in IO when indexing: this tweak reduces HDD IO by a factor of two. A bug in the search where the flood limit feature was not working properly has also been fixed.
2. Moderators have been removed from the index page. This is currently only a cosmetic change; the code that drives this feature is still intact. We plan to remove it when we rewrite the index page’s PHP code. The biggest problem with the index page is that it requires numerous loops to generate the page. This is unnecessary, and we will be using better programming techniques to improve its efficiency. There is also a planned rewrite of the expand/collapse feature on the index page. This is mainly to fix the invalid use of array identifiers in HTML; in layman’s terms, it means we’ll fix the expand/collapse issue with the General category.
3. Fixes to Topic Views. This has probably been said before, but we have altered the script that commits topic views to the database. The script now scales better and can handle hundreds of thousands of topics that need to be committed, compared to only a few tens of thousands before. We have also set up a separate memcache process dedicated to buffering topic view updates. This prevents topic view entries from being pushed out of the cache by other incoming data. Memcache has also been upgraded to the latest version.
4. MySQL setting tweaks. One of the extremely useful features of the Percona MySQL binaries (64-bit) is that they include a patch that allows more than one background read/write thread. This is extremely useful on multicore servers such as ours.
MySQL runs as multiple threads within one process, which suits the multicore servers that are common these days. Back when single-core procs were still mainstream, programmers would write programs to run on only one core. However, with multiple cores in today’s procs, threading is needed to scale up performance because each thread can only run on one core at a time. Having multiple read/write threads increases the number of reads/writes that the system can request from the hard drive at once, and this ultimately improves performance, provided that the HDD hasn’t maxed out already (which in our case, it hasn’t).
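To give a feel for why more than one IO thread helps, here is a minimal Python sketch of several worker threads issuing read requests against the same file at once. This is only an illustration of the general idea, not how MySQL or the Percona patch implements it; the names `read_chunk` and `read_all`, the 4096-byte chunk size, and the default of 4 threads are all made up for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096  # bytes per read request (an assumed, page-like unit)


def read_chunk(fd, offset, size=CHUNK):
    # os.pread reads at an explicit offset, so the threads do not
    # fight over a shared file position.
    return os.pread(fd, size, offset)


def read_all(path, io_threads=4):
    """Read a whole file using several threads that issue reads concurrently."""
    size = os.path.getsize(path)
    offsets = range(0, size, CHUNK)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=io_threads) as pool:
            # map() returns results in input order, so the chunks
            # come back in file order even if reads finish out of order.
            return b"".join(pool.map(lambda off: read_chunk(fd, off), offsets))
    finally:
        os.close(fd)
```

With `io_threads=1` the requests go to the disk one at a time; with more threads, several can be outstanding at once, which only pays off while the drive still has headroom.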
Contention between the log buffer and the buffer pool has also been reduced, as detailed here. This also helps MySQL’s performance.
PS. If you’re using the INNODB Plugin version 1.0.3, make sure InnoDB uses the system memory allocator (which is what lets it benefit from tcmalloc when MySQL is linked against it) with:
innodb_use_sys_malloc = 1
PS2. Useful INNODB tweaking resource here.
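Going back to item 3 for a moment, the idea of buffering topic views and committing them in batches can be sketched roughly like this. It is not our actual script: an in-process `Counter` stands in for the dedicated memcache process, and the `TopicViewBuffer` class, its `commit` callback, and the `batch_size` value are hypothetical names chosen for the example.

```python
from collections import Counter


class TopicViewBuffer:
    """Accumulates topic view counts and commits them in batches."""

    def __init__(self, commit, batch_size=1000):
        # commit: hypothetical callable that writes a list of
        # (topic_id, views) pairs to the database in one go.
        self.commit = commit
        self.batch_size = batch_size
        self.pending = Counter()

    def record_view(self, topic_id):
        # Cheap in-memory increment; nothing touches the database here.
        self.pending[topic_id] += 1

    def flush(self):
        """Commit all buffered counts in batches; return the number of topics."""
        items = sorted(self.pending.items())
        # Counts are dropped before committing; a production version
        # would need to handle a commit that fails part-way through.
        self.pending.clear()
        for start in range(0, len(items), self.batch_size):
            self.commit(items[start:start + self.batch_size])
        return len(items)
```

Splitting the flush into fixed-size batches is what keeps a single commit from growing with the number of topics waiting to be written.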
What we hope/wish/want in the future.
One of the biggest issues with a large database and a large userbase is that whenever we restart MySQL, we get hit with a large number of requests, most of which we cannot serve as fast as we normally would. A rough guesstimate is that pages will be generated in 0.5 seconds (compare that to the ~0.1 second page generation we normally get). The culprit is the slow (10K rpm) hard drive: the data must be read from the hard drive into memory before queries can be answered quickly.
What we want/wish/hope for is a feature in MySQL that allows us to choose which tables, and which ranges within them (between which topic_id values, for example), get loaded into RAM before MySQL begins accepting connections. This is useful because it allows us to warm up the cache first, and to warm it up more quickly, because a single read across a range is sequential. With HDDs, sequential reads/writes are much faster than random reads/writes. FYI, random reads/writes are what we perform when the site goes online.
Currently the only option is to create a script that simulates the activity of a normal user to warm up the cache before we bring the site online. But this method is hacky and would require a lot of work to get the timing correct.
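To make the range idea concrete, here is a rough Python sketch that generates range-scan statements in ascending topic_id order, the kind of sequential access we would like MySQL to do for us before it opens up. It is not a built-in MySQL feature and not something we run today: the function `warmup_queries`, the table name `topics`, and the step size are assumptions for illustration, and whether such a statement really pulls the intended pages into memory depends on the indexes MySQL picks.

```python
def warmup_queries(table, key, first_id, last_id, step=10000):
    """Yield range-scan statements covering first_id..last_id in ascending order."""
    for low in range(first_id, last_id + 1, step):
        # Clamp the final range so it does not run past last_id.
        high = min(low + step - 1, last_id)
        yield f"SELECT COUNT(*) FROM {table} WHERE {key} BETWEEN {low} AND {high}"


# Example: walk a hypothetical topics table in ranges of 10 ids.
for statement in warmup_queries("topics", "topic_id", 1, 25, step=10):
    print(statement)
```

Each statement covers the next consecutive block of ids, so the ranges never overlap and are visited in order.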