I saw Nick Falkner talking about the number of words in his blog, claiming 273,639 words in 343 posts. I was curious about how many words I had in my blog, but careful reading of the WordPress.com forums and support pages indicated that there is no easy way to get that information on a WordPress.com blog, since they don’t install the word-count plugins available from wordpress.org.
I decided to export my blog to my laptop and see if I could find the information in the exported .xml file. The word counts are not explicitly in the XML file as metadata, but the text of the blog posts is. I wrote a very crude script to approximately count the words in the content, getting 573,960 words for 862 published posts and 165 (mostly very short) unpublished drafts. This is most likely an overestimate, as it counts some non-text things (like tags) as words and includes draft posts that I haven’t published. In any event, it looks like my posts are averaging somewhere around 650 words each (573,960/862 ≈ 666 is an upper bound, since the tags and drafts inflate the total), somewhat shorter than Nick’s 800-word average (273,639/343 ≈ 798), but still a bit long.
The script probably should be rewritten to read XML properly and cut out the tags, but it wasn’t worth the trouble for me to learn to use an XML parser for this quick check. If anyone wants to write better code for me, feel free. I was going to include my code here, but since it is full of “<” characters that the WordPress sourcecode tag treats as tags, it is too much hassle to include. Sigh—I really wish WordPress.com would make it easier to post unmangled source code.
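A cleaner version along those lines might look something like the Python sketch below. It is not the crude script described above, just one way to do the count with the standard library's XML parser. It assumes the export is in the usual WordPress WXR layout, with each post in an `item` element carrying `content:encoded`, `wp:status`, and `wp:post_type` children. The function names (`local_name`, `count_words`, `tally`) are made up for illustration. Tag stripping is still a rough regular expression, and WordPress shortcodes in square brackets are still counted as words, so the result remains an estimate.

```python
import html
import re
import sys
import xml.etree.ElementTree as ET

# Namespace of <content:encoded> in RSS-style exports. The wp: namespace
# carries a version number that varies between exports, so wp: fields are
# matched by local name instead (an assumption of this sketch).
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"

TAG_RE = re.compile(r"<[^>]+>")  # crude HTML tag matcher, not a real HTML parser


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on qualified tag names."""
    return tag.rsplit("}", 1)[-1]


def count_words(post_html):
    """Count whitespace-separated words after removing HTML tags and entities."""
    text = html.unescape(TAG_RE.sub(" ", post_html))
    return len(text.split())


def tally(root):
    """Return {status: (number_of_posts, total_words)} for items of type 'post'."""
    totals = {}
    for item in root.iter("item"):
        body, status, post_type = "", "unknown", "post"
        for child in item:
            if child.tag == CONTENT_ENCODED:
                body = child.text or ""
            elif local_name(child.tag) == "status":
                status = child.text or "unknown"
            elif local_name(child.tag) == "post_type":
                post_type = child.text or "post"
        if post_type != "post":
            continue  # skip pages, attachments, and similar items
        posts, words = totals.get(status, (0, 0))
        totals[status] = (posts + 1, words + count_words(body))
    return totals


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python wordcount.py exported-blog.xml
    for status, (posts, words) in sorted(tally(ET.parse(sys.argv[1]).getroot()).items()):
        print(f"{status}: {posts} posts, {words} words, {words / posts:.0f} words/post")
```

Reporting the totals per status keeps published posts (status `publish`) separate from drafts, so the average for published posts does not have to be guessed from a combined total.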