This is a JavaPapers story: how I nosedived into a deep pit and how I am trying to get out of it. This is not to bash anybody, in particular WordPress or the hosting provider, because all I have to blame is me and only me.
javapapers.com runs on the WordPress platform. I have been maintaining this blog for a long time. There are blogs out there that grew wildly in traffic and revenue within a year. Mine is not that type; it has been up for many years. It grew slowly, and at one stage it had half a million views per month. That glory was short-lived: it went down quickly, and I didn’t even notice.
I am not a full-time professional blogger; I work on it when I get some spare time. By the time I noticed, the traffic had already gone down by 50%. It all happened within a couple of months, and then I sat down to find out what was wrong.
I changed the site design / theme. It was a complete overhaul. The previous theme had a lot of colors and heavy images. For the kind of domain this blog is in, I felt it was not good. I designed a new WordPress theme with very few static images and an emphasis on readability. One important thing was that the new design is completely responsive.
The new design was launched, people poured in comments, they were happy and so I was happy too. Along with the new design, I brought in a new component to the site: a context-sensitive navigation menu. It was a killer feature (that’s what I thought) and some of the readers liked it too.
The theme was launched on 1 April 2014. I should have chosen a better day. The first tsunami wave came within 10 days: the traffic fell by 25%. First mistake: I didn’t know that it had fallen. Generally I do not watch the traffic regularly. After a month came the second wave, and it went down by a further 15%. Then, after another month, the third wave took it down by a further 10%.
Within three months, the traffic had gone down by 50%. That is when I noticed it. Once I noticed, there were many sleepless nights. 90% of the viewers come from Google search. There is a disadvantage to that: we cannot tweak something and see the result immediately. Google the King will crawl at its own interval. I did lots and lots of reading, analyzing and fixing.
Gotcha! There was no single problem. There were too many issues. I kept unearthing them like worms and then started fixing them one by one. Following are the problems I had. It looks like this is the superset of issues one can have, and I had them all!
Daunting list, right? All these struck at the same time! There were two targets: first, to get the site up (it was going down at regular intervals); second, to reduce the response time as much as possible.
It was on MediaTemple grid hosting. Yes, it was pricey. If it was not for MT, I would have faced the issue even earlier. I had only one issue with MT: the support was not good. If my support request was very simple, like “where should I look for the log file?”, I got answers. But if I went one step beyond this, that was it. I would get a standard copy-paste template answer: “The problem will be in your WordPress plugin and we do not have anything to do with it.”
There is nothing to feel annoyed about, because that is how the hosting industry works. I have been with GoDaddy, DreamHost, BlueHost and MediaTemple. This is what we will get anywhere we go! One thing I knew was that we cannot expect them to be our admins. But sometimes we don’t even get our money’s worth of support. When it comes to infrastructure and controls, no doubt MT is good.
So what went wrong here? There was some naughty neighbor in my shared space, and that probably triggered the downfall. Whatever the cause, we cannot run a website that gets loads and loads of requests on a shared server.
Following are the options I shortlisted:
Since I had some bitter experience with MT support, I decided to move to some other hosting provider. I know I will get the same juice wherever I go, but I wanted a change. So I was left with four options.
Reviews for WP Engine were very good. It specializes in WordPress hosting and it adds value. But it is priced way beyond my reach. I discussed it with the WP Engine sales team and they recommended their Business plan. WP Engine categorizes its plans based on the views we get. Their Business plan is $249/mo and I cannot afford that for now. So it was ruled out.
Digital Ocean cloud hosting features and pricing are too good to ignore: the promise of SSD disks, a server location of our choice, simplified cloud hosting with an easy interface, and so on. I almost went this way. Then, going through some of the documentation, I thought of putting this on hold for some time. Though this option is attractive because of its price and features, we may have to do a lot of configuration and tweaking to get optimum performance out of the server. This is generally true of cloud hosting in comparison with traditional web hosting.
I was too tired at this stage to try this out. My wife runs a blog, and it is a relatively new one. We have planned to move that to Digital Ocean and learn the tricks of the trade. After that, JavaPapers can be moved. If you have read this far, are seriously looking for a hosting provider, and are a do-it-yourself guy, then Digital Ocean is the best place to go.
Now I was left with DreamHost, and it was between the DreamPress plan and VPS. I chose VPS hosting as it provides more freedom to experiment than the DreamPress environment. For example, we can set the RAM limit of our choice. I am planning to come up with a web application that will be part of JavaPapers, and VPS will be the best choice considering that. So I moved to DreamHost VPS and set the memory limit to 300 MB of RAM. Within a week I hit the roof, and DreamHost complained that the memory was not sufficient: either increase it or check the code for bottlenecks.
Performance-wise I had already taken lots of measures and the site was tuned, so the only option was to boost the memory. I increased it to 400 MB of RAM. This may also not be sufficient, and I think I may have to settle somewhere between 500 MB and 800 MB. Let’s see how long it holds. Before moving to VPS, the granted quota in shared hosting was 100 MB of RAM. There we did not have features like usage graphs to ascertain our utilization, and the support team was not helpful either. Now it is quite evident that I was running on low resources.
JavaPapers was indexed well and it was ranking well with Google. The redesign was not only about look and feel; it involved changes to the site map as well. Though I thought about it, it was not well planned or executed. Once the redesign was done,
The above two points are serious from an SEO point of view. I was introducing a navigation menu and, as part of that, added a custom taxonomy in WordPress. This introduced new URLs, and they clashed with the URLs of the existing ‘category’ taxonomy. I found that out only after Google complained about them via Google Webmaster Tools.
I modified the previous and next navigation on each page to give a tutorial feel instead of a blog feel. Then I added a breadcrumb feature too. These two together completely changed the link structure of the website. Though this is not a major issue, I need not have done it along with a major redesign.
Following are the learnings from that failure.
Ah! This has been said many times and I have read about it on numerous occasions. Even with all that knowledge, I made this mistake. I felt the need for a good navigation menu on the site, considering that there are tutorials lying low in the hierarchy. I wanted to highlight good tutorials that were written long back and also enable the user to read through them in a flow. I couldn’t find a free WordPress plugin that suited my need. But I found a commercial plugin and bought it for $25. I installed and enabled it. The mistake was made.
What should I have done? I should have reviewed the code line by line before installing it. It is not about the money I spent on it; it dragged down the performance of the site. The WordPress P3 performance profiling plugin helped to assess the response time of each plugin in use. P3 is not consistent in reporting the timings, but it gives an overall idea if something is wrong. It raised the alarm and I reviewed the code of the menu plugin. There were even three levels of nested for-loops which could be reduced to a single level with a little thought, and a lot of string manipulation and regex operations that could have been avoided with better design.
I decided not to repair that plugin, as that would be too expensive; instead we could build one from scratch. So we built a WordPress plugin for menu navigation. Though there was an improvement in performance, it was still not up to the mark. The culprit was the WordPress Walker class. We had used it to walk through the custom taxonomy menu tree hierarchy.
Under the hood, the WordPress Walker class follows an event-based model, much like event-driven XML parsing. If we use it at one level, that should be fine. We used multiple levels of nested walker instances to display the entire hierarchy, and this ate up all the memory.
So finally we decided not to use the WordPress Walker class and to write our own implementation that is similar to it. We left out the event-based parsing model and wrote code that is custom-suited for the purpose. Finally we nailed it. Now you can see a context-sensitive menu in action on the left side of the site. That’s the outcome of the plugin we wrote.
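To make the idea concrete, here is a minimal sketch of rendering a nested menu with one index-building pass and a plain recursive function, with no Walker-style callbacks. It is written in Python purely for illustration and is not the plugin’s actual PHP code; the term data and the names `build_children` and `render_menu` are made up.

```python
from html import escape

# Hypothetical taxonomy terms: term_id -> (parent_id, title); parent 0 means "top level".
TERMS = {
    1: (0, "Java"),
    2: (1, "Collections"),
    3: (1, "Threads"),
    4: (2, "HashMap"),
}


def build_children(terms):
    """Index the flat term list by parent in a single pass."""
    children = {}
    for term_id, (parent_id, _title) in terms.items():
        children.setdefault(parent_id, []).append(term_id)
    return children


def render_menu(terms, children, parent_id=0):
    """Render the subtree below parent_id as nested <ul>/<li> markup."""
    ids = children.get(parent_id, [])
    if not ids:
        return ""
    items = []
    for term_id in ids:
        title = escape(terms[term_id][1])
        items.append("<li>" + title + render_menu(terms, children, term_id) + "</li>")
    return "<ul>" + "".join(items) + "</ul>"


MENU_HTML = render_menu(TERMS, build_children(TERMS))
```

The point of the sketch is only that the whole hierarchy can be produced from one lookup table and one recursive function, instead of nesting several walker instances inside each other.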
Right from day one, I have been using my own theme. The theme market is thriving and lots of good options are available. We can find good themes for around $50, and there are even some good themes that are available for free.
The main reason I started building my own theme was to give the site an individual look. The previous design I had for the site was quite unique. The present design is common, and we could achieve it with a few tweaks to a free theme. But I continued with the passion of building my own theme. It is so easy with the WordPress platform.
Where I went wrong: I had used old WordPress APIs, some of them deprecated. The site was not up to the latest markup standards. For example, I had it declared as strict XHTML. When I validated it, there were more than 500 errors. There are no hard and fast rules regarding these issues, but it’s better to keep things clean in this area.
Then I changed it to HTML5 and started cleaning up the whole HTML, from removing unnecessary div tags to using proper markup. I should have taken care of this when I did the redesign. I completely focused on the look and forgot about the code. HTML structure and code are as important as the looks of the site. Users see the UI, and search engines see the HTML.
functions.php is the place to look for issues in a WordPress theme. Go through every line of it and see if you can avoid a filter or a hook. An example of a mistake I made will explain it. The WordPress theme comments template attaches the word “says” after every name in the comments display, like “Matthew says,”. I wanted to remove that word. I searched for ways, found a code snippet in a WordPress forum and used it. It did the job and I simply forgot about it.
When I was looking for bottlenecks in the theme, the function I had used to remove the word “says” turned out to be a performance hog. It is a filter, and the way it works is that after the whole response is constructed, it does a regex pattern match on the content, removes the word, and returns the result, which eventually becomes the final response to the user. This almost doubled the overall PHP processing time.
So the final decision is to let the word “says” stay in the comments area. Like this, there are places where we can compromise on certain things for performance. Try to keep functions.php small and simple. Do not use a lot of hooks, filters and shortcodes.
This is one major area we have to watch. I have Akismet configured and it does a good job of filtering spam comments. The way it works is that it contacts the Akismet server to validate each comment and updates the WordPress database.
In May, when I had lots of “500 Internal Server” errors, there were almost 1,000 spam comments daily. This blog has been alive for more than 5 years, but the past six months account for half of all the spam it has received in its lifetime.
Closing the comments altogether is not a solution that adds value. I did an analysis of which articles receive the spam comments. Surprisingly, 80% of the spam comments are received on 10% of the posts. I narrowed down those posts and temporarily closed comments on them. There you go, the spam count dropped drastically. It has given temporary relief, and I know it is not a permanent solution. Another way to take the load off our server is to move the comments to a third-party discussion platform like Disqus.
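The kind of analysis described above can be sketched roughly as follows. This is a Python illustration under assumed inputs, not the query I actually ran: it takes a list with one post ID per spam comment (the sample IDs and counts are invented) and returns the most-spammed posts that together account for a given share of the spam.

```python
from collections import Counter


def posts_covering(spam_post_ids, share=0.8):
    """Return the most-spammed post IDs that together hold at least `share` of all spam."""
    counts = Counter(spam_post_ids)
    total = sum(counts.values())
    covered = 0
    selected = []
    for post_id, spam_count in counts.most_common():
        if covered >= share * total:
            break
        selected.append(post_id)
        covered += spam_count
    return selected


# Invented sample: one entry per spam comment, holding the ID of the post it targeted.
SAMPLE = [101] * 50 + [102] * 35 + [103] * 5 + [104] * 5 + [105] * 5
WORST_POSTS = posts_covering(SAMPLE)
```

With the invented sample, two of the five posts already cover more than 80% of the spam, which is the kind of skew that makes closing comments on a handful of posts worthwhile.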
All I was tackling was how to stop the “500 Internal Server” errors. After all the optimizations described above, the problem was not completely fixed, but the errors became much less frequent. I cross-checked the times when the site went down, based on the Apache error log, against the access log. It gave a clue. There were some 15 posts that appeared repeatedly. Then I found out that all those posts had more than 400 comments. Among them, some 3 posts had more than 650 comments.
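A cross-check of this kind can be sketched as below. It is a simplified Python illustration, assuming the two logs have already been parsed into timestamps and request paths; the function name `suspects`, the five-second window and the sample entries are all assumptions, not what I actually used.

```python
from collections import Counter
from datetime import datetime, timedelta


def suspects(error_times, access_entries, window_seconds=5):
    """Count which paths were requested shortly before each logged error."""
    window = timedelta(seconds=window_seconds)
    hits = Counter()
    for error_time in error_times:
        for request_time, path in access_entries:
            if error_time - window <= request_time <= error_time:
                hits[path] += 1
    return hits.most_common()


# Invented sample data standing in for parsed error-log and access-log lines.
ERRORS = [datetime(2014, 5, 1, 10, 0, 5), datetime(2014, 5, 1, 11, 30, 0)]
ACCESS = [
    (datetime(2014, 5, 1, 10, 0, 3), "/post-a/"),
    (datetime(2014, 5, 1, 10, 0, 4), "/post-b/"),
    (datetime(2014, 5, 1, 11, 29, 58), "/post-a/"),
    (datetime(2014, 5, 1, 12, 0, 0), "/post-c/"),
]
RANKED = suspects(ERRORS, ACCESS)
```

Paths that keep showing up just before an error float to the top of the list, and those are the posts worth inspecting first.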
All this time I had never worried about that. The obvious solution was to paginate the comments. Using the WordPress settings, I paginated the comments with 100 top-level comments per page. I took care of the duplicate content issue by having the “canonical” URL in place. By default WordPress creates canonical URLs which do not match what Google expects, and I had to tweak that using a filter.
Strangely, even after comments pagination there was no improvement in performance. Just FYI, I measure the performance using custom code: I simply wrap the code under test with start and stop timestamps. At times I also use the Pingdom tool.
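The start/stop idea is simple enough to show. The following is a small Python sketch of the same technique (my own measurements were done inside the PHP code, so this is an analogy, and the name `timed` is made up): a context manager records how long the wrapped block took.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the wrapped block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("render comments", timings):
    # Stand-in for the real work being measured.
    sum(i * i for i in range(100_000))
```

Collecting the numbers in a dictionary makes it easy to wrap several parts of a page and compare them after one request.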
I see that most of the heavy blogs use the Disqus commenting system. Maybe they have all taken this route consciously. I still had reservations about moving there. Some reviews stated that people do not like Disqus that much. Though blog admins prefer Disqus, users prefer WordPress comments as they are simpler to use.
I thought of a solution that loads the comments only on click. When the article is shown, the comments will not be displayed. There is just a comments link at the bottom, and on click the comments are loaded from the database via AJAX. But I had already spent much time on this area and first wanted to ensure that it was worth investing in. I wanted a quick solution which would also add value for the user.
I started analyzing what kind of comments I had. Oops! Here is the catch. 70% of the comments are appreciation or thanks. Though these are of great value to me, they are of little use to other readers.
So the solution was to move all those appreciations, thanks, good, nice, +1 and similar comments to a common testimonial page. I retained comments that added value to the content and those that encouraged healthy discussion. I did this only for posts with more than 150 comments. I may have to do this screening for all the posts in future.
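If this screening has to be repeated for all posts, a rough first pass could be automated. The sketch below is a hypothetical Python heuristic and not what I used for the manual screening: it flags short comments that contain a praise word and no question mark, so that a human only has to review the flagged ones. The word list and the 12-word limit are arbitrary assumptions.

```python
# Assumed word list; tune it to the comments actually seen on the site.
PRAISE_WORDS = {"thanks", "thank", "nice", "good", "great", "awesome", "+1"}


def is_appreciation(comment, max_words=12):
    """Heuristic: short, contains a praise word, and asks no question."""
    words = [word.strip(".,!?:;").lower() for word in comment.split()]
    words = [word for word in words if word]
    if not words or len(words) > max_words or "?" in comment:
        return False
    return any(word in PRAISE_WORDS for word in words)
```

A heuristic like this will misjudge some comments, so it is better used to build a review queue than to move comments automatically.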
Wow! This move nailed it. So now it is evident that the response time grows with the number of comments. For now, things are good with respect to comments. I may have to move to Disqus or some other external comment platform in future for a permanent solution to spam and performance.
Previously I used HyperCache for my caching needs. It was installed years back. I chose it because of its simple configuration options, but it came with a performance compromise.
As part of the optimization, I tried W3 Total Cache, but I had compatibility issues with the FeedBurner plugin (which I was using earlier) and the Google sitemap generator. Its configuration is also tedious.
Then I chose WP Super Cache. It works in a fairly straightforward way: it creates flat HTML files and writes them to disk. To serve a request, no DB calls are made; the HTML file is served directly, thereby improving the throughput.
The options are also simple; it marks suggested settings with a “(recommended)” label, and that helped as well. The only place where I had to stop and think was the cache expiry. I experimented with different intervals and have now settled on one day.
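The flat-file idea with an expiry can be illustrated in a few lines. This is a generic Python sketch of the technique and not WP Super Cache’s implementation; the `PageCache` class, its method names and the SHA-1 file naming are assumptions made for the example.

```python
import hashlib
import os
import time


class PageCache:
    """Serve rendered pages from flat files until they are older than max_age_seconds."""

    def __init__(self, directory, max_age_seconds=24 * 60 * 60):
        self.directory = directory
        self.max_age_seconds = max_age_seconds
        os.makedirs(directory, exist_ok=True)

    def path_for(self, url):
        name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
        return os.path.join(self.directory, name)

    def get(self, url, render):
        path = self.path_for(url)
        try:
            age = time.time() - os.path.getmtime(path)
        except OSError:
            age = None  # no cached file yet
        if age is not None and age < self.max_age_seconds:
            with open(path, encoding="utf-8") as cached:
                return cached.read()  # cache hit: no rendering, no database
        html = render(url)  # cache miss or expired: do the expensive work once
        with open(path, "w", encoding="utf-8") as cached:
            cached.write(html)
        return html
```

Here `render` stands for whatever expensive function builds the page. A longer expiry means fewer regenerations but staler pages, which is the trade-off behind settling on one day.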
A lot of things are being said about WordPress and its performance and quality. My point of view is that it is a generic platform and we should know how to use it. If we master the WordPress API, we can do wonders with it. There are references to blogs that use WordPress and have more than a million views a day. With the right theme, plugins, configuration and resources we can make WordPress sing smoothly. It’s all in our hands, and thanks to WordPress.
In the Apache logs, I saw an unusual number of requests to the WordPress login page. The WordPress login PHP file is no secret. There is a brute-force attack on the page (even at the moment I write this, it’s happening). Then I heard that it’s happening to almost all the WordPress blogs out there. Every minute there are 4 to 5 requests for that page.
Similarly, there are a large number of requests for the XML-RPC page. It looks like people are trying to post articles and help me. These requests are not at a level that would classify as a DoS attack. They are within limits, but they consume valuable resources and are dangerous too.
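Counting such requests from the access log is easy to script. Below is a minimal Python sketch, assuming the log is in Apache’s common/combined format and that the endpoints of interest are `/wp-login.php` and `/xmlrpc.php`; the sample lines are invented and use documentation-only IP addresses.

```python
import re
from collections import Counter

# Matches the timestamp (down to the minute) and the request path of an Apache log line.
LINE = re.compile(
    r'\[(?P<minute>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]*\] '
    r'"(?:GET|POST|HEAD) (?P<path>\S+)'
)
WATCHED = ("/wp-login.php", "/xmlrpc.php")


def hits_per_minute(lines):
    """Count requests to the watched endpoints, grouped by (minute, path)."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        path = match.group("path").split("?")[0]
        if path in WATCHED:
            counts[(match.group("minute"), path)] += 1
    return counts


# Invented sample lines in Apache common log format.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2014:13:55:36 +0000] "POST /wp-login.php HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2014:13:55:41 +0000] "POST /wp-login.php HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2014:13:55:50 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370',
    '192.0.2.10 - - [10/Oct/2014:13:56:02 +0000] "GET /2014/05/some-post/ HTTP/1.1" 200 18000',
]
LOGIN_HITS = hits_per_minute(SAMPLE_LOG)
```

Running something like this over a day of logs shows quickly whether the rate is a trickle or a flood.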
WP-Security is a popular plugin. It promises a lot of action. I enabled it for a period. But then we have another problem: the security plugin itself costs performance. Apart from the numerous other things it does, these are the critical ones: renaming the WP login page, disabling XML-RPC and blacklisting misbehaving IPs.
I did not want to use a heavy plugin for these three things. Renaming the login page and disabling XML-RPC are fairly straightforward. Only then did I notice that blacklisting misbehaving IPs is already done at the Apache server level through its configuration. But IPs are forged and the attack continues; it’s child’s play for the hackers. So I changed the password to a long and complicated one, like a hash key. It looks like that will save me till I find a better solution.
This is one important thing which I ignored. Based on the learning now what I have achieved through it is,
A lot has happened and I have tried to summarize the events. This is not a detailed how-to article. There are wonderful resources available online specializing in each topic discussed above. This is just a journal, or it can be considered a travel diary.
The work is in progress and may take one more month to come to a stable version. But I think I have taken the platform to a level where it can survive without major modifications for at least another two years.
In essence, you should know the nuts and bolts. Of course, people are there to help you. But when your story is so unique and you are very deep in *, then only you can save yourself. Luckily my wife is a good PHP and WordPress programmer; she helped me a lot.
I got some sage advice that I should have hired a programmer with experience in doing all this stuff. Man, that’s not the idea. I am not an expert designer or programmer. But I keep learning and improving day by day. I love the fun of doing it. That’s what it is all about. I do not know what is up there, but I will keep moving. Oh yes, I forgot to mention: Google Analytics is showing strong signs of improvement.