A WordPress Blog Performance Optimization (Horror) Story
This is a JavaPapers story: how I nosedived into a deep pit and how I am trying to get out of it. This is not to bash anybody, in particular WordPress or the hosting provider, because all I have to blame is me and only me.
javapapers.com runs on WordPress. I have been maintaining this blog for a long time. There are blogs out there that grew wild within a year in traffic and revenue. Mine is not that type; it has been up for many years. It grew slowly, and at one stage it had half a million views per month. That glory was short-lived: the traffic went down quickly, and I didn’t notice it.
I am not a full-time professional blogger; I work on the blog when I get some spare time. By the time I noticed, the traffic had already gone down by 50%. It all happened within a couple of months, and then I sat down to find out what was wrong.
The Advent of New Design
I changed the site design / theme. It was a complete overhaul. The previous theme had a lot of colors and heavy images. For the kind of domain this blog is in, I felt it was not good. I designed a new WordPress theme with very few static images and an emphasis on readability. One important thing is that the new design is completely responsive.
Context-sensitive Navigation Menu
The new design was launched and people poured in comments. They were happy, and so I was happy too. Along with the new design, I brought a new component to the site: a context-sensitive navigation menu. It was a killer feature (that’s what I thought), and some of the readers liked it too.
The theme was launched on 1 April 2014. I should have chosen a better day. The first tsunami wave came within 10 days: the traffic fell by 25%. First mistake: I didn’t know that it had fallen. Generally I do not keep a regular watch on the traffic. After a month came the second wave, and the traffic went down by a further 15%. Then after another month came the third wave, down by a further 10%.
Within three months, the traffic had gone down by 50%. That is when I noticed it. Once I noticed, there were many sleepless nights. 90% of the viewers come from Google search, and there is a disadvantage to that: we cannot tweak something and see the result immediately. Google the King will crawl at its own interval. I did lots and lots of reading, analyzing and fixing.
Gotcha! There was no single problem. There were too many issues, and I kept unearthing them like worms. Then I started fixing them one by one. Following are the problems I had. It looks like this is the superset of issues one can have, and I had them all!
- A site with half million views (per month) on shared hosting.
- Design overhaul and URLs were not preserved.
- Created own WordPress plugin with poor performance.
- Custom theme hooks, filters, shortcodes, etc.
- 1000 Spam comments per day.
- Did not use a better caching plugin.
- Loads of comments in single page without pagination.
- Brute force attack.
- Some more minor issues.
- Internal link structure lost
- Missing pages
- Poor 404 page design
- Too many third-party/social widgets embedded.
- Improper compression and cache-expiry
Daunting list, right? All these struck at the same time! There were two targets: first, to get the site up (it was going down at regular intervals); second, to reduce the response time as much as possible.
It was on MediaTemple grid hosting. Yes, it was pricey. If it were not for MT, I would have faced the issue even earlier. I had only one issue with MT: the support was not good. If my support request was simple (like “where should I look for the log file?”), I got answers. But if I went one step beyond that, that was it. I would get a standard copy-paste template answer: “The problem will be in your WordPress plugin, and we do not have anything to do with it.”
There is nothing to feel annoyed about, because that is how the hosting industry works. I have been with GoDaddy, DreamHost, BlueHost and MediaTemple. This is what we will get anywhere we go! One thing I knew was that we cannot expect them to be our admins. But sometimes we don’t even get our money’s worth of support. When it comes to infrastructure and controls, no doubt MT is good.
So what went wrong here? There was some naughty neighbor in my shared space, and that may have triggered the downfall. Whatever the trigger, we cannot run a website that gets loads and loads of requests on a shared server.
- User views are half million per month
- Thousands of bots crawling daily
- Tons of comment spams
Move the site to VPS or dedicated infrastructure
Following are the options I shortlisted:
- Continue with MediaTemple and upgrade to VPS
- Continue with MediaTemple and upgrade to WordPress Hosting
- Move to DreamHost VPS
- Move to DreamHost DreamPress
- WP Engine
- Digital Ocean Cloud Hosting
Since I had some bitter experience with MT support, I decided to move to some other hosting provider. I know I will get the same juice wherever I go, but I wanted a change. So I was left with four options.
Reviews for WP Engine were very good. It specializes in WordPress hosting and it adds value. But it is priced way beyond my reach. I discussed with the WP Engine sales team and they recommended that I go for their Business plan. WP Engine categorizes its plans based on the views we get. Their Business plan is $249/mo and I cannot afford that for now. So it was ruled out.
Digital Ocean cloud hosting features and pricing are too good to ignore: the promise of SSD disks, a server location of our choice, simplified cloud hosting with an easy interface, and so on. I almost went this way. Then, after going through some of the documentation, I thought of putting this on hold for some time. Though this option is attractive because of price and features, we may have to do a lot of configuration and tweaking to get optimum performance out of the server. This is always true of cloud hosting in comparison with traditional web hosting.
I was too tired at this stage to try this out. My wife runs a blog, and it is a relatively new one. We have planned to move that to Digital Ocean first and learn the tricks of the trade. After that JavaPapers can be moved. If you have read this far, are seriously looking for a hosting provider, and are a do-it-yourself guy, then Digital Ocean is the best place to go.
Now I was left with DreamHost, and it was between the DreamPress plan and VPS. I chose VPS hosting as it provides more freedom to experiment than the DreamPress environment. For example, we can set the RAM limit of our choice. I am planning to come up with a web application that will be part of JavaPapers, and VPS will be the best choice considering that. So I moved to DreamHost VPS and set the memory limit to 300 MB of RAM. Within a week I hit the roof, and DreamHost complained that the memory was not sufficient: either increase it or check the code for bottlenecks.
Performance-wise I had already taken lots of measures and the site was tuned, so the only option was to boost the memory. I increased it to 400 MB of RAM. This also may not be sufficient, and I think I may have to settle somewhere between 500 MB and 800 MB. Let’s see how long it holds. Before moving to VPS, the granted quota in shared hosting was 100 MB of RAM. There we did not have features like these graphs to ascertain our utilization, and the support team was not helpful either. Now it is evident that I was running on low resources.
Website Design Overhaul
JavaPapers was indexed well and was ranking well with Google. When I did the redesign, it was not only the look and feel; it involved changes to the site map too. Though I thought about it, it was not well planned or executed. Once the redesign was done,
- there were many urls missing. No 301 redirects either. Just landed with 404.
- the internal linking structure drastically changed.
The above two points are serious from an SEO point of view. I was introducing a navigation menu and, as part of that, added a custom taxonomy in WordPress. This introduced new URLs that clashed with the URLs of the existing ‘category’ taxonomy. I found that out only after Google complained about them via Google Webmaster Tools.
I modified the previous and next navigation on each page to mimic a tutorial feel instead of a blog. Then I added a breadcrumb feature too. These two together completely changed the link structure of the website. Though this is not a major issue, I need not have done it along with a major redesign.
Following are the lessons from that failure.
- Prepare a comprehensive sitemap before redesign and use it as a reference to verify after.
- Try to minimize the URL changes. Post launch they can be updated gradually. If some URLs are to be changed, remember to plan for 301 redirects using .htaccess.
- Wherever possible, do the redesign and launch in a phased manner. Better to avoid uploading everything in one go. That is, follow iterative design.
- Always keep the old design ready in such a way that the site can be reverted to it in a single click.
- Remember not to introduce any new URL patterns and cause duplicate content issue.
- Check for broken links with tools like Xenu.
- Html head tag is a key area. During redesign there might be changes to this place. Keep a backup of the old html-head area and compare it with post redesign.
- Remember to test in multiple browsers and devices, and to put back the analytics tracking, ads, feed and any other script/code used.
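The first two lessons (compare sitemaps before and after, then plan the 301 redirects) can be sketched roughly as below. This is only an illustration in Python, not what was actually run on JavaPapers: the example.com URLs and the helper names `missing_paths` and `redirect_rules` are made up, and the output lines assume Apache’s `Redirect 301` directive in .htaccess.

```python
from urllib.parse import urlparse


def missing_paths(old_urls, new_urls):
    """Return paths that were in the old sitemap but are absent from the new one."""
    new_paths = {urlparse(u).path.rstrip("/") for u in new_urls}
    missing = []
    for u in old_urls:
        path = urlparse(u).path.rstrip("/")
        if path not in new_paths and path not in missing:
            missing.append(path)
    return missing


def redirect_rules(mapping):
    """Format old-path -> new-path pairs as Apache 'Redirect 301' lines."""
    return [f"Redirect 301 {old} {new}" for old, new in mapping.items()]


# Hypothetical sitemaps taken before and after a redesign.
old = ["http://example.com/java/java-string/", "http://example.com/about/"]
new = ["http://example.com/core-java/java-string/", "http://example.com/about/"]

gone = missing_paths(old, new)
print(gone)
# The new location for each missing path still has to be decided by hand.
print(redirect_rules({gone[0]: "/core-java/java-string"}))
```

Choosing the new target for each missing path is still manual work; the script only makes sure that no old URL is silently forgotten.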
Beware of WordPress Plugins Used
Ah! This has been said many times and I have read about it on numerous occasions. With all that knowledge, I still made this mistake. I felt the need for a good navigation menu on the site, considering that there are tutorials lying low in the hierarchy. I wanted to highlight good tutorials that were written long back and also enable the user to read through them in a flow. I couldn’t find a free WordPress plugin that suited my need, but I found a commercial plugin and bought it for $25. I installed and enabled it. The mistake was made.
Review the Code of Plugins
What should I have done? I should have reviewed the code line by line before installing it. It is not about the money I spent on it; it dragged down the performance of the site. The P3 performance profiling plugin for WordPress helped to assess the response time of each plugin used. P3 is not consistent in reporting the timings, but it gives an overall idea if something is wrong. It raised the alarm and I reviewed the code of the menu plugin. There were even three levels of nested for-loops which could be reduced to a single level with a little thought, and many string manipulations and regex operations that could have been avoided.
WordPress Walker Class Performance
I decided not to repair it, as that would be too expensive; instead we would build from scratch. So we built a WordPress plugin for menu navigation. Though there was an improvement in performance, it was still not up to the mark. The culprit was the WordPress Walker class. We had used it to walk through the custom taxonomy menu tree hierarchy.
Under the hood, the WordPress Walker class uses an event-based model, much like event-driven XML parsing: it fires callbacks as it enters and leaves each level and element of the tree. If we use it at one level, that should be fine. We used multiple levels of nested Walker instances to display the entire hierarchy, and this ate up the memory.
So finally we decided not to use the WordPress Walker class and to write our own implementation similar to it, leaving out the event-based model and writing code custom suited to the purpose. Finally we nailed it. Now you can see a context-sensitive menu in action on the left side of the site. That’s the outcome of the plugin we wrote.
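To give a feel for what “custom suited” code can look like, here is a minimal sketch in Python rather than PHP. It is not the actual plugin: the `(term_id, parent_id, name)` tuples and the `render_menu` name are assumptions made for illustration. It groups the terms by parent in one pass and then renders the tree with plain recursion, without callback objects.

```python
from collections import defaultdict


def render_menu(terms, current_id=None):
    """Render a nested <ul> menu.

    terms: iterable of (term_id, parent_id, name); parent_id 0 means top level.
    Names are assumed to be HTML-safe already (no escaping is done here).
    """
    # Single pass: group the children under their parent.
    children = defaultdict(list)
    for term_id, parent_id, name in terms:
        children[parent_id].append((term_id, name))

    def render(parent_id):
        items = children.get(parent_id)
        if not items:
            return ""
        parts = ["<ul>"]
        for term_id, name in items:
            css = ' class="current"' if term_id == current_id else ""
            parts.append(f"<li{css}>{name}{render(term_id)}</li>")
        parts.append("</ul>")
        return "".join(parts)

    return render(0)


# Hypothetical taxonomy: "Strings" is the page being viewed.
terms = [(1, 0, "Java"), (2, 1, "Strings"), (3, 0, "Android")]
print(render_menu(terms, current_id=2))
```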
- Review code of every plugin you use.
- Try to avoid third-party WordPress plugin as much as possible.
- Enable debug mode in local WordPress environment always.
- Lookout for conflicts between the plugins, especially caching plugins.
- Feed related plugins generally do not behave well with cache plugins, try to choose them with care.
- WordPress performs well with cache plugins. I see WP Super Cache as a good plugin. I had compatibility issues between W3 Total Cache and other plugins I used. WP Super Cache is doing well for me, so I recommend it. Configuring the options of a cache plugin is a tricky job. There is no one-size-fits-all solution. We need to keep tinkering with it to find the best settings for our site. I am mainly referring to the cache-expiry duration.
- So the WordPress plugins I have now are WP Super Cache, Akismet, Google Site Map Generator and Menu Navigation (custom built).
Right from day one, I have been using my own theme. The theme market is thriving and lots of good options are available. We can find good themes for around $50, and there are even some good themes that are available for free.
The main reason I started making my own theme was an individual look for the site. The previous design I had for the site was unique. The present design is common, and we could achieve it with little tweaks to a free theme. But I continued with the passion of building my own theme. It is so easy with the WordPress platform.
Where I went wrong: I had used old WordPress APIs, some of them deprecated. The site was not up to the latest markup. For example, I had it declared as strict XHTML. When I validated it, there were more than 500 errors. There are no hard and fast rules regarding these issues, but it’s better to keep things clean in this area.
Then I changed it to HTML5 and started cleaning up the whole HTML, from removing unnecessary div tags to using proper markup. I should have taken care of this when I did the redesign. I completely focused on the look and forgot about the code. HTML structure and code are as important as the looks of the site. Users see the UI and search engines see the HTML.
functions.php is the place to look for issues in a WordPress theme. Go through every line of it and see if you can avoid a filter or hook. An example of a mistake I made will explain it. The WordPress theme comments template attaches the word “says” after every name in the comments display, like “Matthew says,”. I wanted to remove that word. I searched for ways, found a code snippet in a WordPress forum and used it. It did the job and I simply forgot about it.
When I was looking for bottlenecks in the theme, the function I had used to remove the word “says” turned out to be a performance hog. It is a filter, and how it works is: after the whole response is constructed, it does a regex pattern match on the content, removes the word and returns the result, which becomes the final response to the user. This almost doubled the overall PHP processing duration.
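The difference between the two approaches can be sketched like this. It is a Python illustration, not the PHP snippet from the forum: the `<cite>` markup and both function names are assumptions. The first function rescans the whole page on every request; the second simply never emits the word while building the author line.

```python
import re

# Hypothetical markup: the author name is wrapped in <cite> and followed by " says".
SAYS = re.compile(r"(<cite>[^<]*</cite>) says")


def strip_says_whole_page(html):
    """Post-process the full response: every request pays for a scan of the page."""
    return SAYS.sub(r"\1", html)


def render_comment(author, text, with_says=False):
    """Build the author line without the word in the first place."""
    suffix = " says" if with_says else ""
    return f"<cite>{author}</cite>{suffix}: {text}"


page = render_comment("Matthew", "Nice tutorial", with_says=True)
print(strip_says_whole_page(page))
print(render_comment("Matthew", "Nice tutorial"))
```

Both calls produce the same markup, but only the first one has to touch the entire response, and its cost grows with the size of the page and the number of comments on it.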
So the final decision is to let the word “says” be present in the comments area. Like this, there are places where we can compromise certain things for performance. Try to keep functions.php small and simple. Do not use a lot of hooks, filters and shortcodes.
This is one major area we have to watch. I have Akismet configured and it does a good job of filtering spam comments. How it works is, it contacts the Akismet server to validate each comment and updates the WordPress database.
In May, when I had lots of “500 Internal Server” errors, there were almost 1,000 spam comments daily. This blog has been alive for more than 5 years, but the spam received in the past six months is half of what it got in its lifetime.
Closing the comments altogether is not a solution that adds value. I did an analysis of which articles received the spam comments. Surprisingly, 80% of the spam comments were received on 10% of the posts. I narrowed down those posts and temporarily closed comments on them. There you go, the spam count dropped drastically. It has given temporary relief, and I know it is not a permanent solution. Another way to take the load off our server is to move the comments to a third-party discussion platform like Disqus.
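The kind of analysis described above can be sketched in a few lines. This is a Python illustration with made-up numbers, and `spam_hotspots` is a hypothetical helper: given the post ID of every spam comment, it finds the smallest set of posts that accounts for a given share of the spam.

```python
from collections import Counter


def spam_hotspots(spam_post_ids, total_posts, share=0.8):
    """Return (posts receiving `share` of all spam, fraction of all posts they are)."""
    counts = Counter(spam_post_ids)
    target = share * len(spam_post_ids)
    hot, covered = [], 0
    for post_id, n in counts.most_common():
        if covered >= target:
            break
        hot.append(post_id)
        covered += n
    return hot, len(hot) / total_posts


# Made-up data: 100 spam comments spread over 4 of the blog's 20 posts.
spam = [1] * 50 + [2] * 30 + [3] * 10 + [4] * 10
hot, fraction = spam_hotspots(spam, total_posts=20)
print(hot, fraction)
```

The posts returned in `hot` are the candidates for temporarily closing comments.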
Comment Count Per Page
All I was tackling was to stop the “500 Internal Server” errors. After all the optimizations described above, the problem was not completely fixed, but the errors became much less frequent. I cross-checked the times when the site went down in the Apache error log against the access log. It gave a clue. There were some 15 posts that repeatedly appeared. I found that all those posts had more than 400 comments. Among them, some 3 posts had more than 650 comments.
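A rough sketch of that cross-check is given below, in Python. It assumes the two logs have already been parsed into timestamps and paths (the parsing itself is left out); the 5-second window and the `suspects` name are arbitrary choices for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def suspects(error_times, access_entries, window_seconds=5):
    """Count how often each path was requested shortly before an error."""
    window = timedelta(seconds=window_seconds)
    hits = Counter()
    for err in error_times:
        for when, path in access_entries:
            if err - window <= when <= err:
                hits[path] += 1
    return hits.most_common()


# Made-up log entries.
errors = [datetime(2014, 5, 1, 10, 0, 5), datetime(2014, 5, 1, 11, 0, 5)]
access = [
    (datetime(2014, 5, 1, 10, 0, 3), "/big-post/"),
    (datetime(2014, 5, 1, 10, 30, 0), "/small-post/"),
    (datetime(2014, 5, 1, 11, 0, 4), "/big-post/"),
]
print(suspects(errors, access))
```

Paths that keep showing up just before the errors are the ones worth a closer look.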
All this time I had never worried about that. The obvious solution was to paginate the comments. I set 100 top-level comments per page using the WordPress settings. I took care of the duplicate content issue by having the “canonical” URL in place. By default WordPress creates canonical URLs which do not match what Google expects, and I had to tweak that using a filter.
Strangely, even after comments pagination there was no improvement in the performance. Just FYI, I measure the performance using custom code: I just wrap the code with start and stop times. At times I use the Pingdom tool also.
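The “start and stop time” idea looks roughly like this. The real measurement was done inside the WordPress PHP code; this is only a Python sketch of the same technique, and the `stopwatch` helper and the dummy workload are assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label, results):
    """Record the wall-clock time of the wrapped block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = (time.perf_counter() - start) * 1000.0


timings = {}
with stopwatch("render_comments", timings):
    # Stand-in for the code being measured.
    sum(len(str(i)) for i in range(100_000))
print(timings)
```

A single run is noisy, so it helps to repeat the measurement a few times and compare before and after each change.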
Disqus Comment Platform
I see that most of the heavy blogs are using the Disqus commenting system. Maybe all of them have taken this route consciously. I still had reservations about moving there. Some reviews stated that people do not like Disqus that much. Though blog admins prefer Disqus, users prefer WordPress comments as they are simpler to use.
I thought of a solution to load the comments on click only. When the article is shown, the comments will not be displayed. Just show a comments link at the bottom and, on click, load the comments from the database via AJAX. But I had already spent much time on this area and first wanted to ensure that it was worth investing in. I wanted a quick solution which would add value to the user also.
I started analyzing what kind of comments I had. Oops! Here is the catch. 70% of the comments are appreciation or thanks. Though that is of great value to me, it is of little use to other readers.
So, the solution was to move all those appreciations, thanks, good, nice, +1 and similar comments to a common testimonial page. I retained comments that added value to the content and those that encouraged healthy discussion. I did this only for posts having more than 150 comments. I may have to do this screening for all the posts in future.
Wow! This move nailed it. So now it is evident that the response time grows with the count of comments. For now things are good with respect to comments. I may have to move to Disqus or some external comment platform in future for a permanent solution for spam and performance.
WordPress and Caching
Previously I used HyperCache for caching needs. It was installed years back. When I went HyperCache way, I chose that because of the simple configuration options. But, it came with a performance compromise.
As part of the optimization, I chose W3 Total Cache, and I had compatibility issues with the FeedBurner plugin (which I was using earlier) and the Google sitemap generator. Its configuration is also tedious.
Then I chose WP Super Cache. It works in a fairly straightforward way: it just creates flat HTML files and writes them to disk. Then, to serve a request, no DB calls are made; the HTML file is served, thereby improving the throughput.
The options are also simple; it guides with a “(recommended)” label, and that helped as well. The only place where I had to stop and think was the cache expiry. I experimented with different intervals and have now settled on one day.
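The idea of flat files plus an expiry can be sketched as below. This is not how WP Super Cache is actually implemented; it is a minimal Python illustration, and the `PageCache` class, its file naming and the `render_post` stand-in are all assumptions. The one-day default matches the expiry mentioned above.

```python
import os
import tempfile
import time


class PageCache:
    """Tiny flat-file page cache with a time-to-live (illustration only)."""

    def __init__(self, directory, ttl_seconds=24 * 60 * 60):
        self.directory = directory
        self.ttl = ttl_seconds

    def _file_for(self, url_path):
        # Simplistic mapping from URL path to file name; a real cache would
        # need to avoid name collisions and handle query strings.
        name = url_path.strip("/").replace("/", "_") or "index"
        return os.path.join(self.directory, name + ".html")

    def get(self, url_path, render, now=None):
        """Serve the cached file if it is fresh, else render and store it."""
        now = time.time() if now is None else now
        path = self._file_for(url_path)
        if os.path.exists(path) and now - os.path.getmtime(path) < self.ttl:
            with open(path, encoding="utf-8") as f:
                return f.read()
        html = render()  # the expensive part: PHP plus database in WordPress
        with open(path, "w", encoding="utf-8") as f:
            f.write(html)
        return html


calls = []


def render_post():
    """Stand-in for building a page; records each time it is called."""
    calls.append(1)
    return "<html>post</html>"


with tempfile.TemporaryDirectory() as tmp:
    cache = PageCache(tmp)
    cache.get("/java/java-string/", render_post)
    cache.get("/java/java-string/", render_post)  # fresh, so read from disk
    print(len(calls))
```

A short expiry keeps pages fresh but regenerates them often; a long one saves work but may serve stale comments, which is why the interval needs experimenting.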
WordPress is No Evil
A lot of things are being said about WordPress, performance and quality. My point of view is that it is a generic platform, and we should know how to use it. If we master the WordPress API, we can do wonders with it. There are references to blogs that use WordPress and have more than a million views a day. With the right theme, plugins, configuration and resources, we can make WordPress sing smoothly. It’s all in our hands, and thanks to WordPress.
In the Apache logs, I saw an unusual number of accesses to the WordPress login page. The WordPress login PHP file is no secret. There is a brute-force attack on the page (even at the moment I write this, it’s happening). Then I heard that it’s happening to almost all the WordPress blogs out there. Every minute there are 4 to 5 requests for that page.
Similarly, there is a large number of requests for the XML-RPC page. Looks like people are trying to post articles and help me. These requests are not at the level of a DoS attack. They are within limits, but they consume valuable resources and are dangerous too.
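Spotting this in the logs does not need a plugin. Below is a small Python sketch that counts requests to the login and XML-RPC pages per client IP. It assumes the access log is in Apache’s common/combined format; the sample lines and the `sensitive_hits` name are made up.

```python
import re
from collections import Counter

# Matches the start of a common/combined log line: IP, ident, user, [time], "METHOD URL
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)')


def sensitive_hits(lines, targets=("/wp-login.php", "/xmlrpc.php")):
    """Count requests per (ip, path) for the given target paths."""
    hits = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        ip, _method, url = m.groups()
        path = url.split("?", 1)[0]
        if path in targets:
            hits[(ip, path)] += 1
    return hits


# Made-up sample lines.
sample = [
    '10.0.0.1 - - [01/May/2014:10:00:01 +0000] "POST /wp-login.php HTTP/1.1" 200 1234',
    '10.0.0.1 - - [01/May/2014:10:00:02 +0000] "POST /wp-login.php HTTP/1.1" 200 1234',
    '10.0.0.2 - - [01/May/2014:10:00:03 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
    '10.0.0.3 - - [01/May/2014:10:00:04 +0000] "GET /java/java-string/ HTTP/1.1" 200 9000',
]
print(sensitive_hits(sample).most_common())
```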
WP-Security is a popular plugin. It promises a lot of action. I enabled it for a period. But again we have another problem: the security plugin itself costs performance. Apart from the numerous other things it does, these are the critical ones: rename the WP login page, disable XML-RPC and blacklist misbehaving IPs.
I did not want to use a heavy plugin for these three things. Renaming the login page and disabling XML-RPC are fairly straightforward. Only then did I notice that blacklisting misbehaving IPs is already done at the Apache server level by the confs. But IPs are forged and the attack continues; it’s child’s play for the hackers. So I updated the password to a long and complicated one, like a hash key. It looks like it will save me till I find a better solution.
This is one important area which I ignored. Based on the learning, what I have now achieved through it is,
- added proper 301 redirects for all the missing urls due to menu plugin.
- There is one strange issue: there is an unusual number of hits to valid URLs with the word “undefined” appended to the end. Because of this, the request always lands on the 404 page. The number of such requests is some 5000 per day. In a Google forum, some were referring to it as a defect in a Chrome plugin. I added a redirect for this and that strange issue got fixed.
- Definition of media types and character encodings. This is important in the context of caching and proxy servers.
- Most importantly gzip compression enabled for defined output filter types.
- ETag configuration
- Browser cache and expiry.
- Choose whether to prefix www to the home url.
- If you are using WP Super Cache, remember not to override the configuration settings it has written in the .htaccess. I once did it.
- Similarly, WordPress has a set of mod_rewrite rules; preserve those too.
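For the “undefined” issue in the list above, the redirect logic amounts to the following. The actual fix was a redirect rule on the server; this Python sketch only shows the decision, and it assumes the stray word arrives as a final path segment (as in /some-post/undefined). The function name `redirect_target` is hypothetical.

```python
def redirect_target(path):
    """Return the path to 301-redirect to, or None if no redirect is needed."""
    suffix = "/undefined"
    if path.endswith(suffix):
        # Drop the stray word but keep the trailing slash of the real URL.
        return path[: -len("undefined")]
    return None


print(redirect_target("/java/java-string/undefined"))
print(redirect_target("/java/java-string/"))
```

Matching only a whole trailing segment avoids touching a legitimate URL whose slug merely ends with the same word.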
What is the Result?
- Now the site is up without any 500 Internal Server errors.
- In Webmaster Tools, the time spent downloading a page has dropped to 270 milliseconds. In the worst period it was more than 2000 milliseconds.
- ismyblogworking.com says the fetch time is 145 ms. It was a huge number before optimization.
- Pingdom says the response average is 290 ms. Look at how the response time graph gradually goes down. This happened with each optimization, one by one. All those red lines are downtimes.
- Google PageSpeed gives a 94/100 score for desktop. The mobile speed score is 87/100 and user experience is 99/100. There are some issues still pending that are doable. The plan is to attain a perfect 100 score.
- GTmetrix gives a Page Speed Grade of A with a 93% score and a YSlow Grade of B with a score of 84%.
- webpagetest.org gives a PageSpeed score of 91/100. Presently I do not have any plan to go for a CDN. Caching of static content is done, but there are problems with the CSS media type and it’s not caching.
Tools I Used
- Google Webmaster Tools
- Google Pagespeed / YSlow
- Fiddler Proxy
- Php Performance Profiler
- P3 Performance Profile Plugin
- WP Super Cache Plugin
A lot has happened and I have tried to summarize the events. This is not a detailed how-to article. There are wonderful resources available online specializing in each topic discussed above. This is just a journal, or can be considered a travel diary.
The work is in progress and may take one more month to come to a stable version. But I think I have taken the platform to a level where it can survive without major modifications for at least another two years.
That’s It for Now
In essence, you should know the nuts and bolts. Of course, people are there to help you. But when your story is so unique and you are very deep in *, then only you can save yourself. Luckily my wife is a good PHP and WordPress programmer, and she helped me a lot.
I got some sage advice that I should have hired a programmer with experience in doing all this stuff. Man, that’s not the idea. I am not an expert designer or programmer, but I keep learning and improving day by day. I love the fun of doing it. That’s what it is all about. I do not know what is up there, but I will keep moving. Oh yes, I forgot to mention: Google Analytics is showing large signs of improvement.
This Misc tutorial was added on 22/09/2014.