A couple of months ago I wrote about how I'd modified my WordPress 404 page to be a little bit more useful and informative to any reader unlucky enough to encounter it. Amazingly, that article was out of date almost as soon as I'd published it, because I continued to refine and supplement the code I'd described there. I have also added some more functionality to the page for better "possible match" suggestions. So in this article I'm revisiting the custom 404-page to describe the changes I've made since the previous installment.
You see, one thing I realised was that if a link to a missing resource came in from a search-engine, I would have a referrer string that included the keyword(s) or phrase that had been submitted to that search-engine, since they almost all include that data in their URIs. Surely then, I should be exploiting that fact to the benefit of my visitor?
So my 404.php file begins with the standard message that all visitors will see:
<p>Sorry, but <em><?php echo htmlspecialchars($_SERVER['REQUEST_URI']); ?></em> doesn't exist on the <a href="/" title="go to: the home-page..."><strong>Urban Mainframe</strong></a> at this time. If it once existed here then it may have been moved, renamed or deleted.</p>
Then, if there is a referrer, we inform the user that the inbound link is incorrect:
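<?php if (!empty($_SERVER['HTTP_REFERER'])) {
// Escape the referrer before printing it: it is supplied by the visitor's browser
$referer = htmlspecialchars($_SERVER['HTTP_REFERER']);
echo '<p>The link at <em><a href="' . $referer . '">' . $referer . '</a></em> is incorrect.</p>';
} ?>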
This is followed by a little more helpful information. Then we get to the real magic of the page. We want to offer the reader a list of possible matches, based on the URI they have requested, or on the keyword(s) or phrase they submitted to a search-engine. We do this by extracting keywords from the requested URI or the referrer URI and performing an internal, full-text search on our content with those keywords.
We start by breaking the URI of the requested (erroneous) page into its component parts (ditching the domain name and TLD and splitting the remainder of the URI on the "/" character). Each URI slug is then appended to a string called "$keyword", separated by a space. This means that if our search-engine referrer test returns nothing then we still might have a keyword or two to perform our internal search with.
NOTE: The "$keyword" variable will thus always be populated. If there is a match on the external search-engine referrer test, then the "$keyword" variable will be replaced with the results of that test.
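// Drop the leading "/" from the requested URI
$keyword = substr($_SERVER['REQUEST_URI'],1);
// Decode URL-encoded characters such as %20
$keyword = urldecode(stripslashes($keyword));
// Turn each "/" into a space so that every slug becomes a keyword
$keyword = str_replace('/',' ',$keyword);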
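If you'd like to experiment with this step outside of WordPress, the same idea can be wrapped in a small function. This is only a sketch: keywords_from_uri() is a hypothetical name, and it goes a little further than the three lines above by ignoring any query string and by treating hyphens and underscores as word separators too.

```php
<?php
// Hypothetical helper: turns a requested URI into a space-separated keyword string.
function keywords_from_uri($uri) {
    // Keep only the path, so "?page=2" and friends never become keywords
    // (an addition to the original steps)
    $path = urldecode((string) parse_url($uri, PHP_URL_PATH));
    // Split on "/", "-", "_" and whitespace, discarding empty pieces
    $words = preg_split('#[/\-_\s]+#', $path, -1, PREG_SPLIT_NO_EMPTY);
    return implode(' ', $words);
}

echo keywords_from_uri('/2009/01/a-better-404-redux/?page=2');
// prints: 2009 01 a better 404 redux
```

Splitting the slug into separate words is a judgement call on my part; it simply gives the full-text search individual words to work with rather than one long hyphenated string.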
Then we check to see if the referrer is one of the search-engines we know about and, if it is, we extract the keyword(s) or phrase that the search-engine was processing. We then make our "$keyword" variable equal whatever the "query" variable is for a given search-engine. For Google that "query" variable is called "q", for Lycos it's "query", for Yahoo it's "p" and so on.
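$ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
// Split the referrer on "?" and parse the query string into an array.
// (Calling parse_str() without a second argument no longer works in PHP 8
// and lets a referrer overwrite local variables in older versions.)
$s = explode("?",$ref);
$args = array();
if (isset($s[1])) {
    parse_str($s[1], $args);
}
// Pick the name of the "query" variable used by the referring search-engine
$param = '';
if ( preg_match("#(google|msn|live|altavista|alltheweb|scirus)#si", $ref) ) {
    $param = 'q';
} elseif ( preg_match("#(aol|vivisimo|lycos|aliweb)#si", $ref) ) {
    $param = 'query';
} elseif ( strstr($ref,'yahoo') ) {
    $param = 'p';
} elseif ( strstr($ref,'baidu') ) {
    $param = 'wd';
}
// Only replace the URI-based keywords if the search-engine actually sent a phrase
if ( $param != '' && !empty($args[$param]) && is_string($args[$param]) ) {
    $keyword = $args[$param];
}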
For those of you who don't speak PHP there are a couple of things that need a little explanation here. "preg_match" performs a regular expression (RegEx) match against a string, "explode" splits a string on a delimiter (in our case the "?" character that precedes the "query" of a URL) and "parse_str" parses a string into key/value pairs (in our case the string returned by the "explode" function).
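The referrer test can also be written as a lookup table, which makes it easier to add another search-engine later. What follows is a sketch under a few assumptions: keywords_from_referrer() is a hypothetical name, the engine list and "query" variable names are the ones given in this article, and it uses "parse_url" instead of "explode" so that only the host name is matched.

```php
<?php
// Hypothetical helper: returns the search phrase found in a referrer URL,
// or null if the referrer is not a recognised search-engine.
function keywords_from_referrer($ref) {
    // Host fragment => name of that engine's "query" variable
    $engines = array(
        'google' => 'q', 'msn' => 'q', 'live' => 'q',
        'altavista' => 'q', 'alltheweb' => 'q', 'scirus' => 'q',
        'aol' => 'query', 'vivisimo' => 'query', 'lycos' => 'query', 'aliweb' => 'query',
        'yahoo' => 'p', 'baidu' => 'wd',
    );
    $parts = parse_url($ref);
    if (!is_array($parts) || empty($parts['host']) || empty($parts['query'])) {
        return null;
    }
    // Parse the query string into an array rather than into local variables
    $args = array();
    parse_str($parts['query'], $args);
    foreach ($engines as $fragment => $param) {
        // Match against the host only, so "live" in a path cannot trigger a false hit
        if (stripos($parts['host'], $fragment) !== false
            && isset($args[$param]) && is_string($args[$param]) && $args[$param] !== '') {
            return $args[$param];
        }
    }
    return null;
}

echo keywords_from_referrer('http://www.google.com/search?q=better+404+page');
// prints: better 404 page
```

Because the function returns null when nothing is found, the caller can keep the keywords taken from the requested URI as a fallback.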
With our "$keyword" now defined, we then pass this string into our internal search mechanism and hopefully we'll get some "Possible Match(es)" that we can offer the reader:
$limit=10;
$len=25;
$before_title = '<li>';
$after_title = '</li>';
$before_post = '';
$after_post = '';
$show_pass_post = false;
$show_excerpt = false;
global $wpdb, $post;
// Make sure the post is not from the future
$time_difference = get_option('gmt_offset'); // get_settings() is deprecated in current WordPress
$now = gmdate("Y-m-d H:i:s",(time()+($time_difference*3600)));
// Primary SQL query; $wpdb->prepare() quotes and escapes the visitor-supplied keywords
$sql = "SELECT ID, post_title, post_content, "
. "MATCH (post_name, post_content) AGAINST (%s) AS score "
. "FROM $wpdb->posts WHERE "
. "MATCH (post_name, post_content) AGAINST (%s) "
. "AND post_date <= %s "
. "AND post_status IN ('publish', 'static') AND ID != %d ";
// Leave out password-protected posts unless they were explicitly requested
if (!$show_pass_post) { $sql .= "AND post_password = '' "; }
$sql .= "ORDER BY score DESC LIMIT %d";
$sql = $wpdb->prepare($sql, $keyword, $keyword, $now, isset($post->ID) ? $post->ID : 0, $limit);
$results = $wpdb->get_results($sql);
$output = '';
if ($results) {
foreach ($results as $result) {
$title = stripslashes($result->post_title);
$permalink = get_permalink($result->ID);
$post_content = strip_tags($result->post_content);
$post_content = stripslashes($post_content);
$output .= $before_title .'<a href="'. $permalink .'" rel="bookmark" title="Permanent Link: ' . $title . '">' . $title . '</a>' . $after_title;
if ($show_excerpt) {
// split() was removed in PHP 7; explode() does the same job for a plain delimiter
$words = explode(" ",$post_content);
$post_strip = join(" ", array_slice($words,0,$len));
$output .= $before_post . $post_strip . $after_post;
}
}
echo "<h3>Possible Match(es)</h3>";
echo '<ol style="margin-left: 40px;">';
echo $output;
echo "</ol>";
}
For more on the above code please refer to the original source.
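The output stage can also be pulled apart from WordPress and tried on its own. The following is only a sketch: excerpt_words() and render_matches() are hypothetical names, the input is a plain array standing in for the database rows, and escaping with "htmlspecialchars" is my addition rather than part of the listing above.

```php
<?php
// Hypothetical helper: first $len words of a post, with HTML tags removed.
function excerpt_words($content, $len = 25) {
    $words = preg_split('/\s+/', trim(strip_tags($content)), -1, PREG_SPLIT_NO_EMPTY);
    return implode(' ', array_slice($words, 0, $len));
}

// Hypothetical helper: builds the "Possible Match(es)" list from plain arrays
// with 'title', 'url' and 'content' keys.
function render_matches(array $matches, $show_excerpt = false) {
    if (!$matches) {
        return ''; // nothing found, so print no heading at all
    }
    $items = '';
    foreach ($matches as $match) {
        $title = htmlspecialchars($match['title'], ENT_QUOTES);
        $url   = htmlspecialchars($match['url'], ENT_QUOTES);
        $items .= '<li><a href="' . $url . '" rel="bookmark" title="Permanent Link: '
                . $title . '">' . $title . '</a>';
        if ($show_excerpt) {
            $items .= ' ' . htmlspecialchars(excerpt_words($match['content']), ENT_QUOTES);
        }
        $items .= '</li>';
    }
    return '<h3>Possible Match(es)</h3><ol>' . $items . '</ol>';
}

echo render_matches(array(
    array('title' => 'A better 404', 'url' => '/a-better-404/', 'content' => '<p>Some text</p>'),
));
```

Returning a string instead of echoing directly makes the function easy to check in isolation; in the template you would simply echo the result.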
As in my original article, I must stress that in order for the full-text search to work you must have full-text indexing configured on your WordPress database. The following SQL command is all you need to enable this:
ALTER TABLE `wp_posts` ADD FULLTEXT `post_related` (`post_name`, `post_content`);
That's all there is to it! For very little code we can provide a useful and informative 404-page to our visitors. Hopefully very few will see it but, for those that do, we're giving just a few more hints and nudges in the right direction than most websites. Our website is more user-friendly and that's always worth making a little extra effort for.
For your reference, I'm making the full source code of my 404.php file available. If you use it I'd appreciate a credit and link - but that's certainly not obligatory. Now go and fix up your 404-page and let's make the Web a better place!
Last Revision: May 11th, 2009 at 20:18
Short URL: http://wp.me/phEOu-gr
[…] the URI into a search query is very simple. If you would like a more advanced please refer to “A better 404 - Redux” at Urban Mainframe, where Jonathan Hollin expounds on his (downloadable!) 404 page […]
Brilliant idea! Thank you for sharing.
(To be able to run the ALTER TABLE statement in phpMyAdmin, replace the ’ with ‘ - like in the original post.)
This:
if ($_SERVER['HTTP_REFERER']) { echo "The link at " . $_SERVER['HTTP_REFERER'] . " is incorrect."; }
displays the correct “wrong” link, but the link itself is something like that:
http://blog.urbanmainframe.com/2009/01/a-better-404-redux/blog.urbanmainframe.com/2009/01/a-better-404-redux/
Guess you see the problem.
@dieter: Hmmm, I don’t see the problem. Everything seems to be okay here and both the displayed address and the link URL appear to be correct. Can you be more specific?
Excellent.. Thanks for sharing…
You’re welcome Ajay.
@Darkblue: THANK YOU! Your help with the gift of this FREE code is very much appreciated!!
Thank you for all your hard work to assist the WordPress community.
Thank you Matt. It's always nice when someone acknowledges one’s work. I appreciate your feedback.
Well, I do what I can when it comes to designing websites…but I’m definitely not a PHP or Perl ‘programmer’ by any sense of the word.
However, I do know what “things/options” need to be included in a website to make them really operate properly and/or respond to the user in a way that is actually effective … by providing them with useful information, so I do what I can to make that happen. I am fairly good with CSS and XHTML layouts and I’m starting to get PHP fairly well.
Being able to use code such as yours definitely makes my life a lot easier. I am the kind of person that likes to take ‘plugin’ code and insert it directly into the site templates instead of letting it inject code all over my XHTML pages and make them where they won’t validate. I can’t stand programmers that write code that breaks a site’s layout or semantics.
Anyway, your help is appreciated, and I linked back to you. No, sorry, you can’t see the site yet.
I don’t have it presentable enough yet to have my name attached to it, but I will come back and post a link when it’s finished for you and your readers to see.
Thanks again!
@Matt: That’s how we all learn, isn’t it? We use existing code, try to understand it, perhaps modify it a little to better suit our purposes… and before you know it you’re developing entire applications! That’s all I do. I can’t wait to see your website when it’s complete, you must remember to send me a link.
[…] Usable from UX Booth Why You Should Keep an Eye on Your 404 Stats How to Find and Fix 404 Errors A Better 404 - Redux from urbanmainframe.com You can view the previous parts of the Top 50 Web Design Styles Series here: A Showcase of 50 […]
[…] 404 Best Practices from CSS-Tricks Tips for Creating an Informative 404 Error Page Pimp Your 404: Presentation and Functionality from Perishable Press How To Customise Your 404 Page from Sitepoint Turning 404 not found random visitors to blog readers from Tech Snacks 404 Page Management Your Grandma Can Use 404s and WordPress Server Load from alexking.org The Perfect 404 from A List Apart 5 Tips to Make Your 404 Page More Usable from UX Booth Why You Should Keep an Eye on Your 404 Stats How to Find and Fix 404 Errors A Better 404 – Redux from urbanmainframe.com […]