9 December 2009 6 Comments

iRobots.txt SEO


Just a quick note to let you know that I have released my latest plugin called iRobots.txt SEO.

iRobots.txt SEO is an SEO-optimized, secure and customizable robots.txt virtual file creator.

Full details of the plugin can be found at http://markbeljaars.com/plugins/irobotstxt-seo/.

This plugin started life as a selfish need to easily create out-of-the-box SEO optimized robots.txt files for my websites. Since the initial conception, I’ve added features to inhibit specific bots and create customized records. No other robots.txt plugin delivers this level of flexibility.

I’ve also spent a lot of time developing a standardized settings interface. The settings page looks like a standard WordPress Edit New Post page with expandable and retractable sections. Comments on the interface would be appreciated as I am planning on retrofitting Table of Contents Creator with this new style.

Anyway, I’ve blabbered enough. Please give the plugin a try and let me know what you think.

26 November 2009 9 Comments

Pretty Link’s Marketing and SEO Benefits


There are many WordPress plugins that are nice to have, but very few are essential. This is where Pretty Link is different. If you are serious about tracking the performance of your ads, internal and external links or even downloads, then Pretty Link is definitely worth looking at. According to the author, Pretty Link can…

Shrink, track and share any URL on the Internet from your WordPress website

…and what’s more, the shortened URL is prefixed with your website’s domain name. This blows tinyurl.com and bit.ly out of the water and has obvious SEO benefits to boot. Pretty Link was originally born from the need to neaten up those ugly affiliate links that often scare off would-be purchasers. With Pretty Link, you can turn a URL that looks like this http://www.shareasale.com/m-pr.cfm?merchantID=16526&userID=363159&productID=466062304 into one that looks like this http://beginnerchess.org/chesshouse-22.

Once a link has been “prettisized” (I’m sure this is not a real word), it can also be tracked as per the screenshot below. This allows you to tell how often your links are clicked and which links work better at attracting potential buyers than others. This is an extremely powerful tool for all Internet marketers. What’s more, Pretty Links with different names can point to the same actual link. This is useful, for example, if you want to track whether people are clicking on your left side banner ad or the link in your article. With this sort of tracking information you can determine which ad copy attracts the clicks and which products people are generally interested in. You can then adjust the ad copy that doesn’t work and rotate out the products that nobody wants.

[Screenshot: PrettyLink-Hits]

Pretty Link couldn’t be simpler to use. Once installed and activated, expand the Pretty Link Settings group and select Add New. Put the messy URL in the Target URL section and a nice human-readable slug (a tag that contains no spaces, but can contain a dash) in the Pretty Link text box. If your messy URL contains a question mark (a ? denotes the start of URL parameters) then you must ensure that you select the Standard Parameter Forwarding radio button in the Link Options section. To use the new Pretty Link, you simply use it in place of the original link. For example, if you click on either of the two affiliate links given in the introduction you will notice that both take you to the same destination.

[Screenshot: PrettyLink-AddNew]
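To see what the question mark separates, here is a small sketch that pulls the affiliate URL from the introduction apart using Python's standard library. It only illustrates how a URL splits into a path and parameters; it says nothing about how Pretty Link itself handles them.

```python
from urllib.parse import urlsplit, parse_qs

# The messy affiliate URL quoted earlier in this post.
url = ("http://www.shareasale.com/m-pr.cfm"
       "?merchantID=16526&userID=363159&productID=466062304")

parts = urlsplit(url)
params = parse_qs(parts.query)  # everything after the "?" becomes name -> [values]

print(parts.path)            # /m-pr.cfm
print(sorted(params))        # ['merchantID', 'productID', 'userID']
print(params["merchantID"])  # ['16526']
```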

Another great feature of Pretty Link is that it allows you to update the target URL without modifying the Pretty Link slug. This is useful if you want to modify your affiliate link without updating your article or ad copy. This may be the case if you have found a cheaper or better product, for example.
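The idea behind all of this (several slugs sharing one target, a hit count per slug, and retargeting without touching the slug) can be sketched in a few lines of Python. This is not Pretty Link's actual code, just a toy model; the LinkTable class and the two slug names are made up for illustration.

```python
class LinkTable:
    """Hypothetical slug -> target table with a hit counter per slug."""

    def __init__(self):
        self.targets = {}
        self.hits = {}

    def add(self, slug, target):
        # A slug contains no spaces, but may contain a dash.
        if " " in slug:
            raise ValueError("a slug must not contain spaces")
        self.targets[slug] = target
        self.hits.setdefault(slug, 0)

    def retarget(self, slug, new_target):
        # Point an existing slug somewhere new; the slug and its hits are kept.
        if slug not in self.targets:
            raise KeyError(slug)
        self.targets[slug] = new_target

    def follow(self, slug):
        # Count the click, then hand back the URL to redirect to.
        self.hits[slug] += 1
        return self.targets[slug]


affiliate = ("http://www.shareasale.com/m-pr.cfm"
             "?merchantID=16526&userID=363159&productID=466062304")

table = LinkTable()
table.add("chesshouse-banner", affiliate)   # link used in the side banner
table.add("chesshouse-article", affiliate)  # link used in the article text

table.follow("chesshouse-banner")
table.follow("chesshouse-banner")
table.follow("chesshouse-article")
print(table.hits)  # {'chesshouse-banner': 2, 'chesshouse-article': 1}
```

Because every click goes through the slug, the hit counts show which placement earns the clicks, and retarget() swaps the product without editing the article.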

Pretty Link’s SEO benefits include an option to nofollow your Pretty Link, which is useful if the linked URL is not optimized for your keyword. Another benefit is that you can choose the name of your Pretty Link slug, allowing you to add more keywords to your page in the form of links.

17 November 2009 2 Comments

Robots.txt SEO Techniques


This post is a long but important one. I recommend you grab a cup of hot chocolate before you start :)

If you have not heard of the robots.txt file, it is simply a small file located in your website root directory that instructs search engines on what they can and can’t do. Although not strictly enforced, search engine bots will generally respect the rules set out in the robots.txt file. With a properly configured robots.txt file you can, for example, attempt to fend off spam bots, tell Google not to index your images or instruct bots to skip pages that may contain duplicate content.

Bots are pieces of software used by search engine companies, spammers and content accumulators to crawl the internet to find new or modified content. A bot’s job is to follow links on a website, crawling from page to page and site to site. It’s kind of like a Six Degrees of Kevin Bacon thing. Follow enough links and you should eventually find all the content on the net. This is why backlinks are so important. The more backlinks you have, the easier it is for search engines to find your content. There are literally millions of bot instances trawling the net at any one time. The official term for a bot is a user-agent, of which there are thousands. Let’s take Google for example. Google has many different user-agents used to index your site, extract images and videos, find news feeds, find mobile phone content, check your site for AdSense quality and so on. This site details a complete list of known user-agents.

The robots.txt file has been around for ages. It was actually proposed by Martijn Koster in 1994, and remains a staple food for web spiders. For a complete description of the file and its standard notation, visit here. In short, a robots.txt file can restrict specific bots from crawling your entire site or part thereof. To make this possible, every bot identifies itself with a special signature. For example, Google’s index bot is called Googlebot, Bing’s bot is called MSNbot, and Yahoo’s bot is called Yahoo! Slurp.

An entry in the robots.txt file may look like this:

User-agent: Slurp
Allow: /public*/
Disallow: /*_print*.html

Here we are telling the Slurp user agent that it can access all pages located in any directory starting with “public”, and have no access to pages with “_print” in the URI.
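The * wildcard used in these paths is an extension honoured by the big search engines rather than part of the original 1994 standard. As a rough sketch of how such a pattern might be matched (assuming * means "any run of characters", a trailing $ anchors the end of the URL, and everything else is a literal prefix), the two patterns above can be turned into regular expressions. The function names are made up, and real crawlers may differ in the details.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and all other characters are literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and glue them back together with '.*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def pattern_matches(pattern, path):
    # re.match anchors at the start, so the pattern behaves as a prefix rule.
    return robots_pattern_to_regex(pattern).match(path) is not None


print(pattern_matches("/public*/", "/public_html/index.html"))     # True
print(pattern_matches("/public*/", "/private/index.html"))         # False
print(pattern_matches("/*_print*.html", "/docs/page_print.html"))  # True
print(pattern_matches("/*_print*.html", "/docs/page.html"))        # False
```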

Below is a complete robots.txt file for one of my experimental WordPress sites (I’ll post an article explaining what I mean by experimental site another day). Astute readers may note that I am disallowing all user agents from specific directories, while explicitly allowing a few specific Google user agents access to the whole site. A more recent extension to the standard also allows me to list the location of my site map to help search engines find all of my pages.

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /search/*/feed
Disallow: /search/*/*

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Mobile
Allow: /

Sitemap: http://beginnerchess.org/sitemap.xml
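A quick way to sanity-check a file like this is to feed it to Python's standard-library urllib.robotparser and ask what a given bot may fetch. This is only a sketch under a few assumptions: the file is trimmed to its literal rules because that parser does plain prefix matching and does not understand the * wildcards in the /search/ lines, "SomeBot" is a made-up user-agent name, and the page paths are invented.

```python
from urllib.robotparser import RobotFileParser

# A trimmed copy of the file above (literal rules only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content

User-agent: Googlebot-Image
Allow: /

Sitemap: http://beginnerchess.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Ask the parser whether `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, "http://beginnerchess.org" + path)


print(allowed("SomeBot", "/wp-admin/"))                                # False
print(allowed("SomeBot", "/openings/"))                                # True
print(allowed("SomeBot", "/wp-content/uploads/board.png"))             # False
print(allowed("Googlebot-Image", "/wp-content/uploads/board.png"))     # True
```

The last two lines show the effect of the named records: a bot with its own record follows that record and ignores the catch-all * rules.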

Disallowing bots from accessing content not intended for consumption helps keep your site keyword optimized on all pages, thus helping promote your site within the search engine rankings. Say, for example, you have worked hard at optimizing all pages for the keyword “weight gain” and the various long tails. Your work may be watered down in the eyes of the search engine if it is able to crawl your login page, privacy page and contact form.

Some SEO experts also argue that Google punishes young websites in favor of older more established sites. Google apparently uses the Internet Archive (found here) to determine the age of a site. If it cannot find the site in the archive, it apparently assumes the site is a certain age. For this reason, many people actively stop the Internet Archive user-agent from indexing their site. This can be done by including the following lines:

User-agent: ia_archiver
Disallow: /

You may also want to stop image bots from accessing your pictures if you have borrowed non-stock images from other sites. This can be done like so:

User-agent: Googlebot-Image
Disallow: /

Finally, robots.txt can be used to exclude bots from specific pages that display content also available on other sites or pages. It is often argued that Google will punish your rankings for displaying duplicate content. I personally do not see this as a big issue and believe that duplicate content can actually help your site’s ranking in some instances (more about this another day). Anyway, to stop a bot from accessing a specific page, add the following lines:

User-agent: *
Disallow: */my-duplicate-page.html

Note that this is not a fool-proof method. Not every bot respects robots.txt, and if your disallowed page has links to it from another site, search engines may still list its URL even though well-behaved bots will not crawl it.
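The blocking records in this post all share one shape: a User-agent line followed by one or more Disallow lines, with a blank line between records. If you maintain several sites, a tiny generator can save some typing. The sketch below is one way to do it; the function names and the "ExampleSpamBot" user agent are made up for illustration.

```python
def block_record(agent, paths=("/",)):
    """Render one robots.txt record that disallows `paths` for `agent`."""
    lines = [f"User-agent: {agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines)


def build_robots_txt(records, sitemap=None):
    """Join records with blank lines and append an optional Sitemap line."""
    text = "\n\n".join(records)
    if sitemap:
        text += f"\n\nSitemap: {sitemap}"
    return text + "\n"


robots = build_robots_txt(
    [
        block_record("*", ["/wp-admin", "/wp-includes"]),
        block_record("Googlebot-Image"),   # keep the image bot out entirely
        block_record("ExampleSpamBot"),    # hypothetical bot name
    ],
    sitemap="http://beginnerchess.org/sitemap.xml",
)
print(robots)
```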

I could keep going, but I’m sure you are all bored by now. Feel free to comment below or contact me directly if you wish to know more.

Happy roboting.

1 November 2009 2 Comments

A Cautionary Tale of Woe: The Bug That Got Away


Last week I released an updated version of my Table of Contents Creator plugin for WordPress. I pride myself on my coding, fault-finding and testing abilities. That all came crashing down in one single night.

To cut a long story short, the new revision required a helper function to display post entries. The helper function was only called by one function and I therefore elected to include the helper function as a child of that function. This is common practice in many languages as it neatly packages all functions and their helpers in close proximity. The parent function, by the way, is called every time a page is displayed and terminates immediately if that page does not include the site map initiator tag.

Ok. Time passes. It is now 2am. The code is now ready to test. I know that the parent function exits immediately if it doesn’t find the initiator tag. This has previously worked, and I went nowhere near it, so it should continue to work. Right? I check the site map page, try out all the new and old options and prove that the code is working as intended. I quickly synchronize the SVN repository and rush off to bed to get some sleep before the sun comes up.

Next morning I decide to check a large site I know that uses my plugin. There’s no substitute for real world testing. Yeah, it all looks good. At that point I noticed a post in the site map that looked interesting. I clicked on the category link and was presented with the first post in the category and then a big nasty PHP error message. Mmm. I looked at another large site. Mmm. Maybe a coincidence. Let’s look at one more. Oh no!

So what went wrong? Remember that child function? I’ve heard it said that kids can be evil and it turned out to be true in this case. When a blog page, category page, tag page or even a home page is displayed, multiple posts are shown on a single page. WordPress does this by pretending that each post is a mini-page and links all the mini-pages together to form one large page. This means that my parent function is called multiple times. Normally not an issue. But now that the function contains child functions, these child functions are created each time the parent function is called (even if they are not used). Therefore, when the second post is displayed, the page crashes with a duplicate function name error message. The net result was that a small bug in a seemingly unrelated function caused several very large websites to go down.
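The failure is easy to model. Python itself happily rebinds a nested function on every call, so the sketch below fakes a runtime with a single global function table, where declaring the same name twice is a fatal error (PHP reports this as "Cannot redeclare"). The Runtime class and the format_entry helper are made up for illustration; this is not the plugin's code.

```python
class Runtime:
    """Toy model of a runtime with one global function table."""

    def __init__(self):
        self.functions = {}

    def declare(self, name, func):
        # Declaring an existing name is fatal, as with PHP's named functions.
        if name in self.functions:
            raise RuntimeError(f"Cannot redeclare {name}()")
        self.functions[name] = func


def show_post_buggy(rt, title):
    # The helper is declared every time the parent runs.
    rt.declare("format_entry", lambda t: f"* {t}")
    return rt.functions["format_entry"](title)


def show_post_fixed(rt, title):
    # Declare the helper only if it does not exist yet.
    if "format_entry" not in rt.functions:
        rt.declare("format_entry", lambda t: f"* {t}")
    return rt.functions["format_entry"](title)


page = Runtime()
print(show_post_buggy(page, "First post"))       # * First post
try:
    show_post_buggy(page, "Second post")         # second post on the same page
except RuntimeError as err:
    print("page crashed:", err)                  # page crashed: Cannot redeclare format_entry()

page = Runtime()
print([show_post_fixed(page, t) for t in ("First post", "Second post")])
```

A single-post page only ever makes the first call, which is why testing the site map page alone never tripped the error.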

The moral of the story: if you are a website administrator, make sure you run the full suite of tests after every plugin update, no matter how big or small that plugin may be.

