Tip | mark kaplun's blog

Usage of PHP’s json_decode on user input should be considered dangerous

No one expects the spanish inquisition and no developer expects that parsing data by itself can lead to a security weakness, but still PHP is always happy to help you have an interesting life.

So what is the problem with json_decode? the problem is that with a properly crafted JSON structure an attacker can force a very slow parsing of the JSON and hang up the CPU. The weakness comes from the way PHP internally stores arrays in memory that in the normal use case is very fast to access and retrieve data, but the worst case is horrible and json_decode by default tries to create an array from a JSON string, Not sure if the object option of the API is any better.

Mitigation? It doesn’t seem like you can guess anything about the content/structure of a JSON string without actually parsing it therefor it is probably a good idea to check that the length of the input “make sense” before processing it.

Cool tool to debug emails sent from local WAMP/XAMP on windows

I hate myself for not finding the test mail server tool earlier. It is no more then an SMTP server running on the local windows machine that instead of sending the email somewhere else just logs it or opens it in your favourite email client or even text editor.

Could have saved me so much time debugging email related features locally instead of working on a true network connected host.

taming wp_upload_dir to create a directory with the name you want instead of a date

First I guess I need to answer “why” and there are probably two of them

Why do I care that it creates a date based directory even if not needed?
Because I hate to waste CPU cycles if it can be avoided with almost no effort, and I hate to see empty directories when I need to traverse the uploads tree with an FTP software
Why not take the base directory from the value being returned and create the directory by myself?
Because I prefer to trust the core APIs over my own code whenever possible. Core code is tested in real life by millions every day on different platforms and I will probably never test it on more then one. So while creating a directory seems like a very easy thing to do I sill prefer avoid thinking about the possible caveats that might be specific to a specific OS.

The code is actually easy, this is a snippet of something I work on right now


// returns the directory into which the files are stored
function mk_flf_dir() {
  add_filter('upload_dir','mk_flf_upload_dir',10,1);
  $dirinfo = wp_upload_dir();
  remove_filter('upload_dir','mk_flf_upload_dir',10,1);

  return $dirinfo['path'];
}

// override the default directory about to be created by wp_upload_dir
function mk_flf_upload_dir($info) {
  $info['path'] = $info['basedir'].'/fast_logins';
  return $info;
}

The fine point here is to remove the filter after it has done its thing, just because there is a slight chance some other code will want to call wp_upload-dir after your code had run.

301 redirections should be handled in the application, not .htacces

I see many tips and questions about how to redirect a URL with .htaccess rules. On the face of it, it makes a total sense, why should you waste the time to bootstrap your website code (which might include DB access and what not), just to send a redirect response, when the webserver can do it much faster for you?

There are several reasons not to do it in .htaccess

Unless you are redirecting most of the site, the rate of hits on a 301 should be low but the lines containing those roles in the .htaccess file still needs to be read and parsed for every url of the site, even those that serve javascript and CSS if you are using the naive approach in writting the rules. In contrast, your application can check if a redirect is needed only after checking all other possibilities. Each check is slower but the accumulated CPU time spent on this will be lower. This of course depends on your rules and how fast your application determines that there is no match for a URL, and how likely a url is to require a redirect
Statistics gathering. If you do it in .htaccess the only statistical tool you can employ to analyze the redirects is log file and they are rotating and bust a bitch to parse and collect into some better storage system.At the aplication you can simply write the data to a DB, or send an event to google analytics
Your site should be managed from one console, and redirect are more related to a application level configuration then to webserver configuration. It can be very annoying if you write a new post, give it a nice url just to discover that for some reason it is always redirected to some other place without understanding why as your administration software do not know about the htaccess rule, and you probably forgot that it is there (or maybe even someone else put it there).

How to import big wordpress file

For an active wordpress site the content gets bigger with time. Usually tou don’t even notice that until the site becomes slow and you install some caching plugin which makes the site to run even faster and you forget about the whole thing again.

Problem arises when you want to move your content with wordpress’s export and import tools. If you have a lot of content, the exported file which will be generated might be too big to be uploaded to the new server.

The easiest way to solve the problem is to split the exported file into smaller pieces using this splitter tool, and import each one of the generated files.

How to make taxonomy pages appear as result in wordpress search

In addition to many other drawbacks wordpress search has it just can’t search the description associated with a taxonomy (category, tag) or author, so even if the most obvious search search result is the category page, the internal search will never show it.

But there is a way to hack around it if you really have to. All that needs to be done is to have a page with the exact same URL as a taxonomy.

If you which for the category “events” to be searchable, assuming its url is /category/events, all you have to do is to create two pages, one with the slug “category” and a sub page of it with the slug “events” and put the text associated with the category in the “events” page.

The only problem is that the search result will be styled like a page, but this is a small price to pay.

In wordpress, pages can have whatever URL you want them to have

For all content types except pages wordpress uses a system of patterns to identify from the structure of the URL itself which type of content is being accessed. Once identified it can know in which part of the DB it should look for the content associated with the URL.

This is the reason why you usually should have a prefix “directory” in the URL which uniquely identifies your content. If there are two possible interpretations wordpress will match the first that is found.

Pages are different. WordPress kind of assumes that by default all content in the site is pages and the parsing rule for page URLs is “if it is not something else it might be a page”.

This lets you place pages anywhere in the URL structure. Here the question was about having an Event/post_slug URL for posts and have also an Event/Contact URL for a page. To do that you just need to have a page with a slug Events and a page with a slug Contact as its sub page.

As long as there is no post with the slug contact, when wordpress get a Events/Contact URL it tries to find a post in the events category with the slug Contact, and if there is none it will try to find a page with the slag Contact under a page with the slug Events and BINGO.

Two problems with this approach. Neither of them is probably major enough to prevent to use of this technique

For every URL of the structure Events/xxxx where there is no post with the slug xxxx, wordpress will have to make another DB query to check if there is a page with the slug xxxx under the page with the slug “Events”
You always have to remember not to create a post or subcategory of the category “Events” with the slug “Contact”. If you do that you page will not be access and you will not have any warning about that.

Removing query strings (parameters) from URLs

I’m getting annoyed when my browser’s address bar is full because of meaningless parameters that where appended to the “real” URL and which make no sense to me. The problem with the “Junk” is that people copy&paste the URL from the address bar which makes the junk propagate all around the web.

This might even do actual damage if the site owner relies on the parameters to differentiate between sources of traffic.

It is a good thing that one of the features of HTML5 – controlling browser history, can be used to update the address bar to new URL without making a redirect

<script type="text/javascript" charset="utf-8">
  url = the canonical URL for the address
  if (typeof history.replaceState === 'function') { // check html5 functionality support
    data = {dummy:true};
    history.replaceState(data,'',url);
  }
</script>

This will work for modern browsers only, but who realy cares about IE users? they deserve the clutter! 😉