The possible impact of changing WordPress (and PHP) max memory settings on site performance

In the last several days there were several questions on WordPress Answers at Stack Exchange related to out-of-memory errors. Most of them involved a plugin that required more memory to function, and people asked how to change or work around the default memory limit of a PHP process.

My impression from the questions and answers was that people fail to understand why there is a limit at all, and treat it as some bizarre PHP thing that you need to overcome instead of trying to understand it. There is even a plugin, “Change memory limit”, whose description says:

Update the WordPress default memory limit. Never run into the dreaded “allowed memory size of 33554432 bytes exhausted” error again!

To understand why there is a limit you need to understand one of the best hidden secrets of Linux and Windows, one that will surprise most developers: after an application has allocated memory from the OS, it cannot free it back. Yes, when a program calls the free() function, an object destructor or any other deallocation method, the memory is returned to the application’s own free memory pool, from which it may satisfy its next allocation, but it will never be returned to the OS as long as the software is running*.

Since software doesn’t really deallocate, server software that is supposed to run all the time will stay at its peak memory usage once it has reached it. This has to be taken into account when you want to ensure specific performance, given the way Apache works.

Apache in prefork mode basically runs itself several times, where each instance can handle one request at a time. If no instance is free to handle a request, the request has to wait in a queue. The maximal number of concurrent requests the server can process is the number of instances we can run at the same time. Assuming we don’t do any heavily CPU-bound processing, our limitation is the memory that can be allocated to each instance.

And how can we calculate the amount of memory an Apache instance needs? The naive approach is to use the average memory consumption, but once a piece of software has passed its “average” allocation, the memory will not be released. Potentially, an Apache instance running one memory-hungry process can take over all the available memory, leaving none for the other instances, which will probably lead to them failing to handle requests. You might think that you configured your server to handle 10 requests at once, but 9 of them fail.

It is important to understand that once the memory has been allocated, it makes no difference that the instance never again needs all of that memory and handles only small requests. The memory stays attached to the instance for as long as it runs.

And this is why the memory limit exists: to protect the whole server from one faulty piece of code. If you set the limit to 128MB, then you can be assured that at least the rest of the memory is available to the other instances.

So basically the number of Apache instances we can run safely, without the fear of the server suddenly breaking down for no apparent reason, is (amount of memory available on the server) / (max memory limit). The higher the limit, the fewer requests your server can process at the same time, which potentially leads to a less responsive server.
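The division above can be sketched in a few lines of PHP. The function name and the 1 GB figure are assumptions made up for this illustration; plug in whatever your server really has left after the OS and the database take their share.

```php
<?php
// Hypothetical helper: how many prefork instances fit if every one of them
// grows all the way to the PHP memory limit (the worst case described above).
function max_safe_instances(int $availableBytes, int $memoryLimitBytes): int
{
    if ($memoryLimitBytes <= 0) {
        throw new InvalidArgumentException('memory limit must be positive');
    }
    return intdiv($availableBytes, $memoryLimitBytes);
}

$mb = 1024 * 1024;
$available = 1024 * $mb; // assumption: 1 GB is available to Apache

foreach ([32, 64, 128, 256] as $limitMb) {
    printf("limit %3d MB -> %2d instances\n", $limitMb, max_safe_instances($available, $limitMb * $mb));
}
```

Doubling the limit halves the number of requests that can safely be served concurrently, which is the trade-off to keep in mind before raising it.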

Apache can actually be configured to kill instances after they have served a certain number of requests, and by that actually free memory (the MaxRequestsPerChild directive, renamed MaxConnectionsPerChild in Apache 2.4). This will improve the server performance on average, but it also has a cost: the cost of starting a new instance. You should probably always plan for the worst-case scenario and experiment very carefully with relaxing the memory restriction.

Prefork is not the only way to configure Apache. There are also the worker and event configurations, but they require the PHP library you use to be thread safe. Some people claim it actually works for them, but the PHP developers don’t recommend running that way.

And then, if you use FastCGI to execute PHP instead of mod_php, you basically change it from being an Apache problem to being a FastCGI problem. That might actually be better: while FastCGI might hurt the performance of pages generated with PHP, Apache itself will still be able to serve static files.

* Mainly because memory is allocated from the OS in big chunks, and when you dynamically allocate and free memory from those chunks it is very likely that every chunk will still contain some allocated, “live” memory.

Trash emptier WordPress plugin

The trash was introduced in WordPress version 2.9, and the operations of deleting posts (of all post types) and comments were replaced by sending them to the trash. Actual deletion from the DB is done through trash management, which is separate for each post type and for comments. In addition, an automatic process runs every day and deletes all items which have been in the trash for more than 30 days. Read more about the trash feature in the codex.

The plugin has two functions:

  • Provides a way to control the maximal amount of time an item will be kept in the trash before being deleted, as an alternative to defining the EMPTY_TRASH_DAYS constant in your wp-config.php file. You can have the automatic trash emptying performed sooner, or set such a ridiculously long interval that emptying the trash essentially becomes a purely manual operation.
  • Manually empty items from the trash based on the time they have been there.
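For comparison, the wp-config.php alternative mentioned in the first item can be sketched as below. The value 7 is only an example, and trash_cutoff_timestamp() is a hypothetical helper written for this illustration, not something WordPress or the plugin provides.

```php
<?php
// In wp-config.php: keep trashed items for 7 days instead of the default 30.
// The value 7 is only an example.
if (!defined('EMPTY_TRASH_DAYS')) {
    define('EMPTY_TRASH_DAYS', 7);
}

// Illustration only (hypothetical helper): an item is due for deletion
// once it was trashed before the timestamp this function returns.
function trash_cutoff_timestamp(int $now, int $days): int
{
    return $now - $days * 24 * 60 * 60;
}

$cutoff = trash_cutoff_timestamp(time(), EMPTY_TRASH_DAYS);
echo 'Items trashed before ' . gmdate('Y-m-d H:i', $cutoff) . " UTC are due for deletion\n";
```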

Installation

  1. Download it from the repository
  2. Using your favourite FTP software, upload the emptytrash folder into your site’s /wp-content/plugins directory
  3. Go to the plugin management page and activate it.

Usage

  • To manually empty the trash, go to the “Tools” >> “Trash emptier” menu
  • To configure the automatic emptying interval, go to the “Options” >> “Trash emptier” menu

If you find this plugin useful, don’t forget to donate

I will be very surprised if I receive enough donations to cover the cost of the effort of designing, coding, testing and supporting this plugin, but it is a nice sign of appreciation for my work.

Tip to Michael Arrington – the only way to control your data is to host it on your site

Michael Arrington’s brilliant rant against Instagram’s move that cripples photos shared from it to Twitter.

http://techcrunch.com/2012/12/06/they-screwed-us-right-before-they-screwed-us-again-poohead/

Off-topic: the MG disclaimer is hilarious

Localizing/Translating WordPress plugin and theme names and descriptions

A cool small localization feature that WordPress has is the ability to localize/translate the metadata of plugins and themes. If plugin and theme authors actually use it, it will enable localizers to provide a totally localized experience to the users, even on the plugin and theme management pages.

This feature is very easy to activate: you just add two lines to your plugin or theme header block

Text Domain: mytextdomain
Domain Path: /lang_folder

where “Text Domain” (mytextdomain in this example) is the text domain you used for your plugin/theme localization in the _e() and __() calls, and “Domain Path” (here it is /lang_folder) is the directory under your plugin/theme root directory in which the *.mo file resides.
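To make the role of the two header lines concrete, here is a rough sketch of how they can be combined into the location of a translation file. It is not WordPress’s own code: both functions are hypothetical, the plugin directory is invented, and the domain-locale.mo file name is assumed to follow the usual plugin naming.

```php
<?php
// Hypothetical sketch: read two header fields from a plugin file's comment
// block and work out where the translation file is expected to live.
function read_header_field(string $source, string $field): ?string
{
    $pattern = '/^[ \t\/*#@]*' . preg_quote($field, '/') . ':(.*)$/mi';
    if (preg_match($pattern, $source, $m)) {
        return trim($m[1]);
    }
    return null;
}

function expected_mo_path(string $pluginDir, string $source, string $locale): ?string
{
    $domain = read_header_field($source, 'Text Domain');
    $path   = read_header_field($source, 'Domain Path');
    if ($domain === null || $path === null) {
        return null; // header does not opt in to metadata translation
    }
    return rtrim($pluginDir, '/') . '/' . trim($path, '/') . '/' . $domain . '-' . $locale . '.mo';
}

$header = <<<'TXT'
/*
Plugin Name: My plugin
Text Domain: mytextdomain
Domain Path: /lang
*/
TXT;

echo expected_mo_path('/wp-content/plugins/myplugin', $header, 'es_ES'), "\n";
```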

For plugins, you can localize Name, PluginURI, Description, Author, AuthorURI, Version. For themes, you can localize Name, Description, Author, Version, ThemeURI, AuthorURI, Status.

While localizing the Author info is probably not very moral, and localizing the version is probably not very smart, it is possible to localize the PluginURI and ThemeURI so they point to a support/info URL relevant for that locale. In other words, a plugin/theme developer can set up one support page in English and another one in Spanish, and use localization to point the Spanish users to the Spanish support page instead of the English one.

The only problem left is how to put those strings into your *.pot/*.po file.

For the plugin header below

Plugin Name: Plugin name
Description: Plugin description
Author: me
Version: 1.0
Plugin URI: http://example.com/plugin
Text Domain: mytextdomain
Domain Path: /lang

you can add the following code, which enables the localization of only the name and description of the plugin. The location of this snippet in your code is not important, as long as poedit can parse the file and conclude that ‘Plugin name’ and ‘Plugin description’ are translatable strings

$_plugin_header_local = array(
    __('Plugin name', 'mytextdomain'),
    __('Plugin description', 'mytextdomain')
);

There are also other methods to achieve the same effect.

Relevant reading: Jacob Santos and Viper007Bond wrote on this feature.

How to expose dynamic translatable text to translation tools like poedit

The gettext translation framework is the best I have ever used. All you have to do in order to find out which strings in your code require translation, and to create a skeleton translation file, is to have a function which performs the actual translation and keep calling it for each translatable string. Then an automated tool like poedit can be configured to parse your code, find all the strings that are used as a parameter to the function, and make a skeleton file containing them.
If our translation function is __(), then when poedit parses our code and finds __('Something') it generates a file in a format similar to

Original "Something"
Translation ""

Now we just need to fill the translation (and you don’t need to be a coder to do that!) and tell our code to look for translations in this file.

Dynamic text breaks this nice system, as almost by definition dynamic means that we don’t know the exact value at “compilation” time. When poedit scans the code below it doesn’t find any translatable strings.

$a = get_response_from_remote_server($random_value);
print(__($a));
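To see why the scan comes back empty, here is a very rough stand-in for an extraction tool. It is not how poedit is actually implemented; extract_translatable_strings() is a hypothetical function that only looks for string literals passed directly to __().

```php
<?php
// Rough stand-in for what an extraction tool does (not poedit's real
// implementation): collect only string literals passed directly to __().
function extract_translatable_strings(string $source): array
{
    preg_match_all('/\b__\(\s*([\'"])(.*?)\1/', $source, $matches);
    return array_values(array_unique($matches[2]));
}

$dynamic = 'a = get_response_from_remote_server(x); print(__(a));';
$literal = "print(__('Something'));";

var_dump(extract_translatable_strings($dynamic)); // empty array: nothing to translate
var_dump(extract_translatable_strings($literal)); // finds "Something"
```

The tool never runs the code, so a variable passed to __() gives it nothing to record.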

But many times we know in advance that get_response_from_remote_server() can return only a limited set of strings, for example only “apple” and “orange”. Right now, in order for them to be translated correctly, we need to add them manually to the translation file, which makes it much harder to maintain, as you will need to add them again the next time poedit processes your code.

Luckily there is a way to maintain the list of such strings as part of the code, in a way which enables poedit to detect them: just put them in the code.

$dummy = __('Apple');
$dummy = __('Orange');
$a = get_response_from_remote_server($random_value);
print(__($a));

Dummy here is never used, so there are no side effects. The problem with this approach is that we waste CPU time calling a function when we don’t need the value it returns.

An obvious improvement would be

if (false) {
  $dummy = __('Apple');
  $dummy = __('Orange');
}
$a = get_response_from_remote_server($random_value);
print(__($a));

Now we don’t execute the functions, and a compiler might even simply discard that section of the code, resulting in zero impact on performance while maintaining the ability to generate the translation file automatically.

Which leads to the best option: add a source code file which contains the strings, but leave it out of the compilation chain in your makefile, or simply never include it.

$dummy = __('Apple');
$dummy = __('Orange');

Nirvana.

Should you optimize your WordPress MySQL tables? (probably not)

While it looks like a no-brainer (you only need to press one button to optimize, so why not?), the consensus among MySQL experts tends to dismiss optimizing as a way to improve your WordPress performance.

The real question here is not whether optimizing is good or bad, but whether you should set aside time in advance to perform it. And since the site should be offline while the table optimization is done, are the benefits high enough to justify it?

What the optimization does is defragment the files used for the table and rebuild the index. Defragmenting might save some space on your hard disk, but will not impact your site’s performance. The index rebuild can potentially improve performance, but in practice it rarely does, especially for small sites, which are probably 99.9% of the standalone WordPress sites in the world.

For people managing WordPress networks it might be more complicated, as the defragmentation benefits might accumulate to something substantial, but I have a feeling that whatever the benefit is, the time and effort needed to tell your users that their sites will be down will outweigh it.

Maybe this is something that you should do only when you are already performing site maintenance for another reason, such as a version upgrade.

 

Caching with transient options and API in WordPress

From the Transients API codex page:

… offers a simple and standardized way of storing cached data in the database temporarily by giving it a custom name and a timeframe after which it will expire and be deleted

One usage pattern for the Transients API is to cache values you retrieve from a remote server. The overhead of establishing a connection to the remote server, sending a query and waiting for a reply is too big, so we make a concession: instead of being totally up to date but with a site that takes ages to load, we accept being 5 minutes behind but with a usable site. We do it by caching the last result for 5 minutes in a transient option.

So instead of having

echo get_my_latest_video_from_youtube();

We can use

$video = get_transient('latestvideo'); // we might have the value already in our cache, let's retrieve it
if (false !== $video) { // it is there (get_transient returns false when it is missing or expired)
    echo $video;
} else { // nothing in the cache, or the cache has expired
    $video = get_my_latest_video_from_youtube(); // get the video code
    set_transient('latestvideo', $video, 5 * 60); // set the cache with expiry set to 5 minutes in the future
    echo $video;
}

The nice thing about this code is its robustness: it will recover from any event that hurts the cache, and regenerate the info.

It is important to note that this solution does not entirely eliminate the delay in site load caused by accessing the remote data; it just makes it less frequent. For a site which has only 1 visitor every 5 minutes or more we basically haven’t changed anything, as the cache will expire before the next visitor comes. Setting a longer expiration interval gives you more performance value, so you should set it as long as possible without making the displayed data stupidly out of date.
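A small simulation makes the point about visit frequency concrete. It does not touch WordPress at all: count_remote_fetches() is a hypothetical function written for this illustration, and the visit patterns are invented.

```php
<?php
// Hypothetical simulation: count how many visits have to wait for the
// remote server, given visit times (in seconds) and a cache lifetime.
function count_remote_fetches(array $visitTimes, int $ttlSeconds): int
{
    $fetches = 0;
    $expiresAt = null; // nothing cached yet
    sort($visitTimes);
    foreach ($visitTimes as $t) {
        if ($expiresAt === null || $t >= $expiresAt) {
            $fetches++;                    // cache miss: go to the remote server
            $expiresAt = $t + $ttlSeconds; // and cache the result
        }
    }
    return $fetches;
}

$ttl  = 5 * 60;
$busy = range(0, 3599, 10);  // one visit every 10 seconds for an hour
$slow = range(0, 3599, 360); // one visit every 6 minutes for an hour

printf("busy site: %d of %d visits hit the remote server\n", count_remote_fetches($busy, $ttl), count($busy));
printf("slow site: %d of %d visits hit the remote server\n", count_remote_fetches($slow, $ttl), count($slow));
```

On the busy site only 12 of the 360 visits pay the price, while on the slow site every one of the 10 visits does.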

But what can be done if we want all of our users to have a great experience, not only 99% of the time, but 100% of the time? If our interval is long enough, we can prepopulate the cache with a scheduled backend operation.

The following code assumes your interval is 5 hours

// Since we want to avoid the situation in which the cache expires, we have to use a schedule which is shorter than 5 hours.
// This should really be done only on plugin or theme activation; the check below at least avoids scheduling the event twice.
if (!wp_next_scheduled('regenerate_video')) {
    wp_schedule_event(time(), 'hourly', 'regenerate_video');
}

add_action('regenerate_video', 'regenerate_video');

function regenerate_video() {
    $video = get_my_latest_video_from_youtube(); // get the video code
    set_transient('latestvideo', $video, 5 * 60 * 60);
}

This way we prime the cache every hour, and therefore every user gets current enough info. But if the cache practically never expires, couldn’t we get the same result by using the usual options API and storing the cached value as an option? Our code would then look like

// on the front end
$video = get_option('latestvideo');
if (!$video) {
    regenerate_video();
    $video = get_option('latestvideo');
}
echo $video;

// on the backend
// There is no expiry any more, so the schedule alone decides how fresh the data is.
// This should really be done only on plugin or theme activation; the check below at least avoids scheduling the event twice.
if (!wp_next_scheduled('regenerate_video')) {
    wp_schedule_event(time(), 'hourly', 'regenerate_video');
}

add_action('regenerate_video', 'regenerate_video');

function regenerate_video() {
    $video = get_my_latest_video_from_youtube(); // get the video code
    update_option('latestvideo', $video);
}

What we have done here is to practically change the expiry mechanism. Now we have better control over when the data expires.

So which pattern is better, transients or straight options? There is another factor you need to take into account before deciding: the existence of object caching on the site.
Unlike options, transients that have an expiry time are not autoloaded into memory when WordPress starts up. This means that if there is no active object cache on the site, get_transient actually performs an extra DB query, and there is nothing worse for performance than a DB query that could have been avoided, especially when the query happens on the front end.

On the other hand, when object caching is active, transients are not stored in the DB at all, only in the cache. This eliminates the cost of using get_transient and makes the options table smaller, and therefore makes each operation on it (add, change, delete, query) faster.

 

The subtle differences between get_alloptions, wp_load_alloptions and get_option

WordPress code has been very bad to me and made me waste several hours because of a lack of proper documentation :(. All I was trying to do was write a script that does a simple search and replace on the text part of a text widget, and each time I ran the script the widget stopped being displayed.

My code was very simple

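$opts = get_alloptions();
foreach ($opts as $ov => $opt) {
  if (is_text_widget_option($ov)) { // hypothetical check standing in for "it is a text widget"
    $opt = searchandreplace($opt); // $opt is the raw value here, possibly still serialized
    update_option($ov, $opt);
  }
}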

It turns out that get_alloptions is deprecated in favour of wp_load_alloptions (i.e. the codex entry for it is wrong). It does cache the options, but the array it returns holds raw values, which might be serialized, while get_option returns unserialized data. That was the source of my problem, as I was assuming that get_alloptions is just syntactic sugar for calling get_option for each option at once.
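One way a raw serialized value gets damaged by a text replacement can be shown in plain PHP, without WordPress. The array shape below is made up for the illustration and is not the real structure of the text widget option.

```php
<?php
// What is stored in the options table for an array-valued option
// (the array shape here is made up for illustration):
$stored = serialize(['text' => 'hello world']);
echo $stored, "\n";

// Wrong: search and replace on the raw serialized string. The length
// prefix (s:11) no longer matches the new text, so the value is corrupt.
$broken = str_replace('hello', 'goodbye', $stored);
var_dump(@unserialize($broken)); // bool(false)

// Right: work on the unserialized value, then serialize it again.
$value = unserialize($stored);
$value['text'] = str_replace('hello', 'goodbye', $value['text']);
$fixed = serialize($value);
var_dump(unserialize($fixed)['text']); // string(13) "goodbye world"
```

Serialized strings carry their own length, so any replacement that changes the length of the text has to go through unserialize and serialize.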

The working code looks like

$opts = wp_load_alloptions();
foreach ($opts as $ov => $opt) {
  if (is_text_widget_option($ov)) { // hypothetical check standing in for "it is a text widget"
    $opt = get_option($ov); // already cached so no extra DB access
    $opt = searchandreplace($opt);
    update_option($ov, $opt);
  }
}

And then it gets even worse, as I suddenly discover that wp_load_alloptions returns only the autoloaded options, while I want my code to be generic enough to work on options that are not autoloaded as well. So there is basically no alternative but to access the DB directly

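global $wpdb;
$names = $wpdb->get_col("SELECT option_name FROM $wpdb->options"); // plain array of option names
foreach ($names as $ov) {
  if (is_text_widget_option($ov)) { // hypothetical check standing in for "it is a text widget"
    $opt = get_option($ov); // autoloaded options are already cached; the others cost one query each
    $opt = searchandreplace($opt);
    update_option($ov, $opt);
  }
}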