Blog moved to wordpress on openshift

I moved this blog a while back from Blogger to WordPress. I was looking to move away from Blogger/Blogspot, to something self-hosted. I had come up with the following list to make the move seamless (for me as well as regular visitors):

  • Ability to use custom domains: Since I used blogger’s custom domains feature to redirect the blogger/blogspot links to my domain, I wanted to retain that functionality
  • Make the move seamless to site visitors
  • Preserve links and link structure.  All earlier links, rss feeds, etc., should continue to work as they did with the earlier setup (helps in maintaining search engine rankings)
  • No dependence on 3rd party server/software for leaving comments: Some blogging platforms are simple and minimal; they however end up using other services for comments to blog posts. I didn’t want that: all the content should be on one server, without users needing any sort of registration elsewhere.
  • Easy to manage the software: Shouldn’t be too time-consuming to keep the blog up

Red Hat’s OpenShift PaaS platform had just announced support for domain aliases for applications, so I started looking at what would be involved in moving the blog to their platform.

Read on for my experiences and details on deploying this WordPress blog on OpenShift.

I already had played with OpenShift a bit, and loved their workflow of deploying apps using git. Deploying a wordpress install on OpenShift would mean I wouldn’t have to manage my own servers, operating systems, software updates, etc. It’s all on the stable and secure RHEL platform, with PHP managed by the RHEL team. So all I would need to worry about is just the wordpress installation itself.  As long as I routinely check for security updates to wordpress, and push those updates to the site, I should be doing OK.

So I created a new php-5.3 app using ‘rhc-create-app’. mysql is needed for the database, so I also added an instance to the app with the command

 rhc-ctl-app -e add-mysql-5.1 -a <appname>

To manage the mysql instance, a phpmyadmin cartridge is desirable too:

rhc-ctl-app -e add-phpmyadmin-3.4 -a <appname>

To make sure my custom domain works, let’s add aliases as well:

rhc-ctl-app -c add-alias --alias
rhc-ctl-app -c add-alias --alias

I had used both, log. and www. for the blog, so let’s continue using both so that both domains continue working. Of course, I changed the DNS CNAME entries for www. and log. over to <appname>-<domainname> via my name provider’s site.

Next, using the admin credentials on the mysql db, I created a new db and a new user, and gave the user all permissions on that db.  All this is quite simple using the phpmyadmin interface.

That’s it, all set with the app on OpenShift.

I then went and downloaded the latest wordpress release (3.2.1 then) zip file and extracted the files in a local directory.

Now here’s where I started using the power of git and OpenShift: I created a git repo in the wordpress directory and added all files to it, and made an initial commit. This is my base from where I’ll use wordpress.  New wordpress releases can be copied in this directory, and a new commit will map to the upstream release version. Any modifications to files I make in my wordpress installation (e.g. theme changes) are tracked in another branch in the same directory, with that branch being rebased on top of the latest release (the master branch).
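
The branch layout described above can be sketched roughly as follows. This is only an illustration in a scratch directory: the branch name `custom` and the files `version.txt` and `theme.css` are placeholders I made up, standing in for the real WordPress tree and my local changes.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for the extracted WordPress tree (placeholder).
work=$(mktemp -d)
cd "$work"

git init -q
# Make sure the release branch is called master regardless of git's default.
git symbolic-ref HEAD refs/heads/master
git config user.email "you@example.com"
git config user.name "Example User"

# master: one commit per upstream release.
echo "3.2.1" > version.txt          # stand-in for the release files
git add -A
git commit -q -m "WordPress 3.2.1"

# custom: local modifications (themes etc.) live on their own branch.
git checkout -q -b custom
echo "theme tweak" > theme.css      # stand-in for a local change
git add -A
git commit -q -m "Local theme changes"

# A new upstream release arrives: copy it over master and commit.
git checkout -q master
echo "next-release" > version.txt   # placeholder for the next release
git add -A
git commit -q -m "Next WordPress release"

# Replay the local changes on top of the new release.
git checkout -q custom
git rebase -q master

git log --oneline
```

After the rebase, the `custom` branch holds the new release files plus the local changes as the topmost commit, which is the tree that gets copied into the app.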

With this setup, I can just copy the contents of this directory into my app’s php directory and push the changes to OpenShift. The ‘php’ directory is where all the app code resides. I then added all files in the git repo and committed the result. I then created the wp-config.php file as a copy of the wp-config-sample.php file, modified it to suit my installation, committed the change, and also added the file to the other wordpress directory created in the first step above. I then just pushed the changes, and the app was  live on the cloud and I could get started with wordpress’s wizard-based installation.
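
Creating wp-config.php from the sample could look something like the sketch below. The sample file here is a cut-down stand-in containing only the database lines, and the values `blogdb`, `bloguser` and `change-me` are placeholders, not what I actually used; the real file has more settings (keys, salts, table prefix) that also need attention.

```shell
#!/bin/sh
set -eu
cd "$(mktemp -d)"

# Cut-down stand-in for the wp-config-sample.php shipped with WordPress.
cat > wp-config-sample.php <<'EOF'
<?php
define('DB_NAME', 'database_name_here');
define('DB_USER', 'username_here');
define('DB_PASSWORD', 'password_here');
define('DB_HOST', 'localhost');
EOF

# Start from a copy of the sample, leaving the original untouched.
cp wp-config-sample.php wp-config.php

# Fill in the database settings (placeholder values).
sed -e "s/database_name_here/blogdb/" \
    -e "s/username_here/bloguser/" \
    -e "s/password_here/change-me/" \
    wp-config.php > wp-config.php.tmp
mv wp-config.php.tmp wp-config.php

grep "DB_" wp-config.php
```

The database host also has to point at the mysql cartridge rather than `localhost`; the sketch leaves that line alone since the right value depends on the app.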

Now here’s one oddity of hosting apps on OpenShift: the app directory isn’t writable, or isn’t the place where the app itself can make changes and assume they’d be preserved (I think this is a good thing). Since the app is deployed via git, any content written to the server app directory can be lost on the next git push. For wordpress, this means the ‘uploads’ directory has to be given a place where images, etc., can be uploaded without problems.

The OpenShift people have helpfully given us some environment variables and hooks in the app deployment process, which can be used to do this right.

The default wordpress uploads directory is ‘wp-content/uploads’.  We can continue using this directory, with the following snippet placed in ‘.openshift/action_hooks/build’:

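cd "$app_dir"
# Append the hook body; quoting 'EOF' keeps the local shell from expanding the variables
cat >> .openshift/action_hooks/build <<'EOF'
# Keep uploads in the persistent data directory
if [ ! -d "$OPENSHIFT_DATA_DIR/uploads" ]; then
    mkdir "$OPENSHIFT_DATA_DIR/uploads"
fi

ln -sf "$OPENSHIFT_DATA_DIR/uploads" "$OPENSHIFT_REPO_DIR/php/wp-content/"
EOF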

This ensures the ‘wp-content/uploads’ location is available for wordpress to put stuff into, and it also ensures the content goes into a place where OpenShift will not destroy the data on the next git push.
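
To see why the symlink matters, here is a small local simulation. It assumes nothing about OpenShift itself: the two variables are pointed at temporary directories, `link_uploads` is a hypothetical helper that repeats the hook's steps, and deleting the `php` directory stands in for a redeploy.

```shell
#!/bin/sh
set -eu

# Stand-ins for the directories the platform provides (assumption: on a real
# app these variables are already set and should not be overridden).
OPENSHIFT_DATA_DIR=$(mktemp -d)
OPENSHIFT_REPO_DIR=$(mktemp -d)

# Hypothetical helper doing what the build hook does.
link_uploads() {
    mkdir -p "$OPENSHIFT_DATA_DIR/uploads"
    mkdir -p "$OPENSHIFT_REPO_DIR/php/wp-content"
    ln -sfn "$OPENSHIFT_DATA_DIR/uploads" "$OPENSHIFT_REPO_DIR/php/wp-content/uploads"
}

# First deploy: WordPress writes an upload through the symlink.
link_uploads
echo "image bytes" > "$OPENSHIFT_REPO_DIR/php/wp-content/uploads/photo.txt"

# Simulated redeploy: the repo directory is replaced, the data directory is not.
rm -rf "$OPENSHIFT_REPO_DIR/php"
link_uploads

# The upload is still reachable at the usual wp-content/uploads path.
cat "$OPENSHIFT_REPO_DIR/php/wp-content/uploads/photo.txt"
```

Removing the repo tree only removes the symlink, not the directory it points to, so the file written before the "redeploy" is still there afterwards.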

OK, having done all this, I was now ready to import my older blog posts. I installed the blogger-to-wordpress and livejournal-to-wordpress plugins (well, since I’m doing this, I thought I might as well import my older lj entries), git push’ed them, and did the import from the web interface.

Comments from livejournal entries and some blogger posts didn’t get fetched. I don’t know why that happened. I tried the import a couple more times, but those posts didn’t show up. I just decided to not bother about that; if there was any frequently-visited post, I could always go back and import it by hand. Since I didn’t expect to do any more imports, I removed those plugins and pushed the result again.

There is a blogger-to-wordpress redirect plugin, but that plugin does a lot more than just redirecting: it imports images uploaded to blogger or picasaweb on the blogger posts, generates blogger template to redirect blogger posts to wordpress, maps blogger posts to wordpress posts, etc.  Now most of this functionality is one-time; importing pictures, generating blogger template for redirection, etc., doesn’t need to be present all the time (can’t be too careful with php apps and security). I used the plugin to import all the blogger/picasaweb pictures it could fetch, and removed it as well.

I then enabled wordpress’s custom URL structure, which allows blogger-like post URLs, with the year and month as well as post title in the URL. Enabling this needs .htaccess modifications, which wordpress can’t make directly in our setup (because it can’t write to the app directory).  So I created a new .htaccess file in the php/ dir. in the OpenShift app directory and included the snippet wordpress helpfully tells you it would include if the directory were writable (my code is in the snippet below).

I also took some hints from the blogger-to-wordpress plugin and created a minimal plugin that maps blogger URLs to wordpress URLs, and installed this plugin.

Next up was ensuring the older feeds kept working, and that the contents of the wp config file and directory listings weren’t displayed. I also searched for some wordpress hardening tips, and compiled a fun-looking .htaccess file, snippet included below:

# Disable directory listing
Options -Indexes

<files .htaccess>
    order allow,deny
    deny from all
</files>

<files wp-config.php>
    order allow,deny
    deny from all
</files>

RewriteEngine On
RewriteBase /

# Most of following comes from

# Redirect feeds from labels
RewriteRule feeds/posts/default/-/(.*) category/$1/feed/ [L,R=301]

# Redirect older blogger RSS feeds
RewriteRule rss.xml feed/ [L,R=301]
RewriteCond %{QUERY_STRING} ^alt=rss$
RewriteRule feeds/posts/default feed/? [L,R=301]

# Redirect older blogger ATOM feeds
RewriteRule atom.xml feed/atom/ [L,R=301]
RewriteRule feeds/posts/default feed/atom/ [L,R=301]

# Redirect older blogger comments feeds
RewriteRule feeds/comments/default comments/feed/ [L,R=301]

# Redirect archives
RewriteRule ^([0-9]{4})_([0-9]{1,2})_([0-9]{1,2})_archive\.html$ $1/$2 [L,R=301]

# Redirect labels
RewriteRule ^search/label/(.*)$ category/$1/ [L,R=301]

# This is WP default: makes pretty URLs possible.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
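
The archive and label rules above can be sanity-checked offline with a rough approximation in sed. This is only a sketch: `map_blogger_path` is a made-up helper, it assumes a sed that supports `-E` (GNU and BSD sed do), and it mimics just the pattern substitution, not mod_rewrite's flags or conditions. The paths are given without a leading slash, as the rules see them in a .htaccess file.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: apply the archive and label patterns to a request path.
map_blogger_path() {
    printf '%s\n' "$1" | sed -E \
        -e 's#^([0-9]{4})_([0-9]{1,2})_([0-9]{1,2})_archive\.html$#\1/\2#' \
        -e 's#^search/label/(.*)$#category/\1/#'
}

map_blogger_path "2010_12_01_archive.html"   # prints 2010/12
map_blogger_path "search/label/fedora"       # prints category/fedora/
map_blogger_path "about.html"                # unchanged: no rule matches
```

The example paths are invented for illustration; checking the real redirects still needs a request against the live site.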

I also installed the WP-Piwik and smart-404 plugins. WP-Piwik is a plugin that adds Piwik javascript code to give me a summary of the visits to the site, and the search keywords people use to land on my site. More on Piwik and its setup in a follow-up blog post. Smart-404 shows, on the 404 page, a list of pages whose titles are similar to the one that was requested. I had noticed a few 404 page hits via Piwik.

I’ve enabled the Akismet plugin that comes with the wordpress distribution, and it has flagged over 600 comments as spam so far, with just 2 false-positives. That’s impressive, but I intend to look further into this:

  1. Is there a way to reduce spam comments?
  2. Why do wordpress sites get spammed so much?

What I’ve seen so far is people search for specific terms on the ‘net, land on some post, and put the spam comment. So these are actual humans, not bots. Since they’re investing enough effort into finding blogs and adding comments, spam prevention techniques like CAPTCHAs aren’t going to work all the time. Akismet is working fine so far, so I’ll continue using it, but I’m going to think / search for ways to mitigate spam.

Overall, the move was really painless and done within a weekend; most of the time was spent learning about WordPress and moving the existing posts to the new blog. There were hardly any OpenShift issues; it stayed nicely out of the way, and I really like that about the platform.

I still haven’t figured out a way to map Blogger labels to WordPress Categories/Tags; these are new concepts (to me), and I’ll probably get something done here with some more htaccess trickery.

Fedora Miniconf and

A very delayed post on the Fedora Miniconf and was held on the 15th, 16th and 17th of this month in Bengaluru. I could confirm my attendance only very late, so I missed out on the CfP and a chance at speaking in the main conference, but I managed to get a speaking slot in the Fedora miniconf. Thanks to Rahul for accommodating me at short notice.

One of the main things I was looking forward to was meeting my team-mate Juan Quintela. Though we met recently at the KVM Forum 2010, I was going to use this opportunity to catch him and discuss some of the things I’m working on that overlap with his domain, virtual machine live migration, and get things going.

The other thing was to get to know more people — Fedora users and developers from India who I’ve spoken with on the irc channel but not met, other developers and users of free software from around the world. Add to that a few people who I’ve worked with and not met and also people whose software I use daily and who I want to thank for working on what they do.  It was also nice meeting the old known faces from the IBM LTC in Bengaluru — Balbir Singh, Kamalesh Babulal, Vaidy, Aneesh K. V., et al.

It’s a certainty that there will be users of the virtualization (particularly kvm) stack, and it’s nice to get a feel of how many people are using kvm, in what ways, how well it works for them, and so on. That’s always a motivation.

The Fedora miniconf was on the 16th. The schedules for talks for miniconfs aren’t published by the people, so it was left to us to do our advertising and crowd-pulling. Rahul had listed the speakers and the talks on the Fedora wiki page. I went ahead and took out a few print-outs for the talks and assigned time slots for each talk depending on the suggested length given by the speakers for their talks as well as the slot allotted to the Fedora Project for the miniconf. The print-outs of the schedules were meant to be pasted around the venue to attract attention to the remotest section that was to host the miniconf, Hall C. However, we just ended up keeping the printouts as handouts at the Fedora stall that we set up. The Fedora stall was quite a crowd-puller. And since it was set up on the second day, we didn’t have to compete with the other stalls since they had their share of attendance on the first day.

The other members of the Fedora crowd, Rahul, Saleem, Arun, Shreyank, Aditya, Suchakra, Siddhesh, Neependra, … have written about the Fedora stall and their experiences earlier (and linked to from the Fedora page).

The Fedora miniconf was a great success, going by the attendance and the participation we had. My talk was the first, and I could see we had a full house. I think my talk went quite well. It could have been a little disappointing for people who expected demos, but I wanted to aim this talk at people who had a general sense of using and deploying Fedora virt as well as Fedora on the cloud, and at people who would go and do stuff themselves rather than being given everything on a silver platter. This also resonates with the philosophy of recent years of being a contributor-oriented conference rather than a user-oriented one, so I didn’t mind doing that. Gauging by the response I got after the talk, I believe I was right in doing that. (I even got an email from the CEO of a company saying it was a great talk.)

The other talks from the Fedora miniconf were engaging, and I learnt quite a bit about what the others are up to. Arun’s talk on packaging emacs extensions was entertaining. He connects with the audience; I liked that about him.

Aditya’s talk on Fedora Summer Coding was a good call to students to participate in the free software world via Fedora’s internship programme. He narrated his own experience as a Fedora Project intern, which struck the right chord with the intended audience. I think doing more such talks will get him over the jitters of presenting to a big crowd.

Suchakra’s doing good work on accessing an embedded Linux box via a console inside a browser tab — it’s a very interesting project.

Neependra’s talk was a good walk-through of using tracing commands to see what really happens in the kernel when a userspace program runs. He walked through the ‘mkdir’ command and showed the call trace. This was a good demo. He spoke about the various situations in which tracing tools could be used, not just for debugging, and that should have set people’s thoughts in motion as to how they could get more information on how the system behaves instead of just using a system.

Shreyank’s talk on creating a web tool for managing student projects and the Fedora Summer of Code was interesting as well. It was nice to see the way an actual student project was designed and developed and how it’s going to make future students’ and mentors’ lives easier. This talk should have served as a good introduction to the flow and process students have to go through in applying, starting, reviewing and completing their project.

Apart from the Fedora miniconf, I attended a few sessions in the main conf. James Morris’s keynote on the history of the security subsystem in the Linux kernel was very informative. Rahul’s keynote on the ‘Failures of Fedora’ was totally packed with anecdotes and analyses of the decisions taken by the Fedora project and their impact on the users and developers. Fedora (earlier Red Hat Linux) is one of the oldest distributions around, and any insight into how it functions, and data on what works and what does not, is a great source of information for building engaging communities of users and contributors.

Lennart’s two talks on systemd and the state of surround sound on Linux were not very new to me. However, there were a few bits in there that provided some food for thought.

Juan’s talk on live migration was packed full of experiences in getting qemu to a state where migration works fairly well. He also spoke about all the work that’s left to do. It was totally technical and I think the people who were misguided by it being labelled as a ‘sysadmin’ talk or by the title (expecting to migrate from an older physical machine to a newer physical machine w/o downtime) quickly left the hall. Those who stayed back were either people who work on QEMU/KVM (esp. the folks from the IBM LTC) or people too polite to walk out.

Dimitris Glezos’s talk on building large-scale web applications was a very informative one for me. I’ve never done web programming (except for html, css and a bit of php ages ago), and this was a good intro for me to understand what various web development frameworks there are, their pros and cons, the way to deploy them, the way to structure them, etc. It was evident he put a lot of effort into preparing the slides and the talk, and it was totally worth it.

Danese Cooper’s keynote on the Wikimedia Foundation was an equally informative talk. She spoke on a wide range of topics, including the team that makes up Wikimedia, their servers and datacentres, their load balancing strategy, their backup systems, their editing process, their localisation efforts, their search for a new mirror site in the APAC region, etc. I was interested in one aspect, machine-readable wikipedia content, to which they had a satisfactory answer: they’re migrating to semantic web content and would look at a machine-readable API once they’re done adding semantics to their content.

The other time was spent at the Fedora booth and talking to Juan and the other friends.

The team announced this would be the last, so thanks to them for hanging around so long. To fill the void, we’re going to have to step up and organise a platform for like-minded people from the free/open source software community around here. I’ve been part of organising some events earlier in different capacities, and I’m looking forward to being part of an effort that provides such a platform. There’s a FUDCon being planned for next year in Pune; I’ll be involved in it, and will take things along from there.