Designing good 404 pages

There is one essential thing that you have to understand about people ending up on your 404 page: they want to be on your site. Once you accept that fact, it stops mattering why they got there, and you start treating the 404 page like any other page on your site.

On any page of your site, the next goal after showing quality content is to retain the surfer on your site and direct him towards pages that can produce value for you (get a sale, get a lead, sign a petition, etc.). The goal of the 404 page is exactly the same; it is just the lack of context that makes it more difficult than on other pages.

But do you really lack context? No one tries to get to a 404 page for the fun of it, and everybody who ends up there had a URL which he assumed was valid. What we need to do is parse the URL and try to extract as much information from it as possible about where the surfer tried to go.

How can we extract context from a URL? For example, let’s look at the recommended URL structure for WordPress sites that use pretty permalinks (virtual directories).

  • A post URL has the structure /year/month/day/title. Possible ways to handle a 404 with a similar structure:
    • If the title matches an existing post at a different location, display a link to that post.
    • If there is a title but no post matches it, suggest a search (or show possible search results) using the WordPress search feature and a Google search limited to your site.
    • If the month is invalid, suggest the archive page that lists posts from that year.
    • If the day is invalid, suggest the archive page that lists posts from that month.
  • An image or other file URL has the structure /wp-content/uploads/year/month/file. Possible ways to handle a 404 with a similar structure:
    • If the file name matches an existing file at a different location, display a link to that file (or the image itself, if it is an image).
    • If the file can’t be found at all, suggest a search in the site’s media library (or show possible search results) using the WordPress search feature and a Google search limited to your site.
  • A category URL has the structure /category/category name 1/optional category name 2. Possible ways to handle a 404 with a similar structure:
    • If category name 2 exists but not under category name 1, display a link to it.
    • If category name 2 doesn’t exist but category name 1 does, display all the subcategories of category name 1.
    • If neither category name 1 nor category name 2 exists, display a list of the categories.
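
To make the idea a bit more concrete, here is a rough sketch in Python (not WordPress code) of how the three URL structures above could be recognized. The function name classify_missing_url, the regular expressions, and the "suggest" labels are all made up for illustration, and the patterns assume the exact permalink layout described in the list; a real site would adapt them to its own settings.

```python
import re
from datetime import date

# Patterns mirror the three URL structures listed above (an assumption about
# the site's permalink settings; adjust them to your own layout).
POST_RE = re.compile(r"^/(\d{4})/(\d{1,2})/(\d{1,2})/([^/]+)/?$")
UPLOAD_RE = re.compile(r"^/wp-content/uploads/(\d{4})/(\d{1,2})/([^/]+)$")
CATEGORY_RE = re.compile(r"^/category/([^/]+)(?:/([^/]+))?/?$")


def classify_missing_url(path):
    """Guess what the visitor was looking for from the path of a 404 URL."""
    m = POST_RE.match(path)
    if m:
        year, month, day = int(m.group(1)), int(m.group(2)), int(m.group(3))
        context = {"kind": "post", "year": year, "title": m.group(4)}
        if not 1 <= month <= 12:
            # Invalid month: the best we can offer is the yearly archive.
            context["suggest"] = "year_archive"
            return context
        context["month"] = month
        try:
            date(year, month, day)  # raises ValueError for e.g. February 31
        except ValueError:
            # Invalid day: fall back to the monthly archive.
            context["suggest"] = "month_archive"
            return context
        context["day"] = day
        # The date looks fine, so the title is the part to search for.
        context["suggest"] = "search_by_title"
        return context

    m = UPLOAD_RE.match(path)
    if m:
        return {
            "kind": "file",
            "year": int(m.group(1)),
            "month": int(m.group(2)),
            "filename": m.group(3),
            "suggest": "search_media",
        }

    m = CATEGORY_RE.match(path)
    if m:
        return {
            "kind": "category",
            "parent": m.group(1),
            "child": m.group(2),  # None when only one category name is given
            "suggest": "category_lookup",
        }

    # No recognizable structure: offer a plain site search.
    return {"kind": "unknown", "suggest": "site_search"}
```

The returned dictionary only describes the context; deciding what to render for each "suggest" value (a link, an archive list, search results) is still up to the 404 template.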

And so on; you probably get the drift by now. My examples are very generic, but knowing the content of your site you might come up with even better ways to seduce the random 404 visitor to stay on your site.
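
The "title exists at a different location" and "show possible search results" cases can be approximated even without a search engine. The sketch below is only an illustration in Python: suggest_posts and the known_posts mapping (slug to current URL) are hypothetical names, and the fuzzy matching comes from the standard library's difflib rather than from anything WordPress provides.

```python
from difflib import get_close_matches


def suggest_posts(requested_slug, known_posts, limit=3):
    """Return URLs of posts whose slug equals or resembles the requested one.

    known_posts is a hypothetical mapping of post slug -> current URL.
    """
    slug = requested_slug.strip("/").lower()
    if slug in known_posts:
        # Exact title match: the post exists, just at a different location.
        return [known_posts[slug]]
    # Otherwise offer the closest slugs, best match first.
    close = get_close_matches(slug, list(known_posts), n=limit, cutoff=0.6)
    return [known_posts[s] for s in close]


if __name__ == "__main__":
    posts = {
        "designing-good-404-pages": "/2012/03/04/designing-good-404-pages/",
        "hello-world": "/2011/01/01/hello-world/",
    }
    print(suggest_posts("designing-good-404-page", posts))
```

The cutoff of 0.6 is difflib's default similarity threshold; raising it gives fewer but safer suggestions.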

It is important to return HTTP 404 when there is no content at the URL; redirecting to another URL is evil

I got depressed when I looked for a WordPress plugin that would let me control what is displayed on my site’s 404 page, as too many plugins suggest redirecting to another page, either automatically or based on some manual configuration.

This is so wrong. 404 (like all of the 4xx codes) indicates a client error, and you should not pretend that there was no error; otherwise the client will never learn and improve.

It is not only about following the letter of the standard; there are practical implications to not following it:

  • A person who bookmarked the URL (or copied it from an email, or configured it on a smartphone) will land on unexpected content without any indication whether it was his fault (he used a wrong URL) or the content has disappeared from the site.
  • Owners of sites that link to your content (an HTML link, or a reference to an img, video, or other resource on your site) will never get an indication from link checkers that their links are bad, and will never fix them to point to the right content.
    The worst scenario is that you, or someone affiliated with you, own that site; it might even be a bad link on the same site.
  • If you use JavaScript-based analytics software (like Google Analytics), it will fail to report the missing URLs as 404s, since the tracking script is never loaded for a URL that was redirected.
  • It effectively creates several valid URLs leading to the same content, which might be good or bad depending on the content of the pages those URLs are linked from and on the latest Google ranking algorithm.

Yes, 404 pages should be more than some pretty and amusing page, but whatever a 404 page displays to the user, it has to keep sending that content with a 404 status code.
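
As a minimal illustration of "helpful content, but still a 404", here is a tiny WSGI application in Python. It is a sketch and not how WordPress does it: the name not_found_app and the HTML are invented for the example. The point is only that the body can be as rich as you like while the status line stays "404 Not Found" and no Location header is sent.

```python
from html import escape


def not_found_app(environ, start_response):
    """Serve a helpful page for a missing URL while keeping the 404 status."""
    path = environ.get("PATH_INFO", "/")
    body = (
        "<h1>Page not found</h1>"
        # Escape the path so a crafted URL cannot inject markup into the page.
        f"<p>Nothing lives at {escape(path)}.</p>"
        '<p><a href="/">Home</a> or try the site search.</p>'
    ).encode("utf-8")
    # The status is 404, not a 301/302 redirect, so browsers, link checkers
    # and crawlers can all tell that the URL is bad.
    start_response(
        "404 Not Found",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, not_found_app).serve_forever() is enough; any URL you request should show the page and report status 404 in the browser's network tools.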