mark | mark kaplun's blog

Why is forcing people to use 8 characters in their password much more secure than forcing 6?

Because it takes the hacker 33% more time to type 12345678 than 123456

Writing a translatable library/framework that is used by wordpress plugins and themes

I got burned on the wordpress stack exchange while answering a question about variables in translatable text, which was really more about adhering to (IMO wrong) theme repository conventions while using third party code which doesn’t follow them. My mistake came from misinterpreting the translation strategy used by the third party code in question (the TGM plugin activation library).

There are two possible translation strategies for such libraries: self contained and integrated.

In the self contained strategy the library is distributed with the translations of its strings, using a specific text domain for them. The theme/plugin author using the library just needs to include it and initialize it.

require_once(dirname(__FILE__).'/useful_lib/useful_lib.php');
useful_lib_init();

Assuming the library expects its translation file to be in the “lang” sub directory of its root it only has to do

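function useful_lib_init() {
  // load_textdomain() expects the path to a .mo file, not a directory;
  // the "useful_lib-{locale}.mo" file naming is an assumption
  load_textdomain('useful_lib',dirname(__FILE__).'/lang/useful_lib-'.get_locale().'.mo');
}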

and use “useful_lib” as its text domain whenever the translation functions ( __(), _e()…) are called.

The advantage of this approach is that whoever uses the library doesn’t need to worry about translating the strings used in the library, and once a translation is updated they can simply include the new version of the library and push an update.

The disadvantage is that you don’t have full control over the translation of your theme/plugin (you can claim it works in language X only if the library supports that language as well), and you have to tell translators that the files residing in the “useful_lib” directory should not be translated, as translating them will have no impact on the result (maybe suggest that the translator contribute the translation to the library instead).

In the integrated strategy you let the theme/plugin determine which text domain to use. The easiest way is by forcing them to DEFINE it.

if (!defined('USEFUL_LIB_TEXTDOMAIN')) {
  // stop early, the library can't translate anything without a text domain
  wp_die('USEFUL_LIB_TEXTDOMAIN has to be defined');
}
.....
// then you have in the code
_e('Welcome',USEFUL_LIB_TEXTDOMAIN);

This way there is some duplicated translation effort between different users of the library, but the end result is totally under the control of the theme/plugin author.

And… there can be a hybrid approach. If no explicit text domain is specified the library uses its own; it just needs to remember to load that text domain

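if (!defined('USEFUL_LIB_TEXTDOMAIN')) {
  define('USEFUL_LIB_TEXTDOMAIN','useful_lib');
  // load_textdomain() expects the path to a .mo file; the file naming is an assumption
  load_textdomain('useful_lib',dirname(__FILE__).'/lang/useful_lib-'.get_locale().'.mo');
}
.....
// then you have in the code
_e('Welcome',USEFUL_LIB_TEXTDOMAIN);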

In my opinion, libraries should follow the “integrated” strategy unless they have a non-trivial amount of translatable strings and the maintainers are willing to maintain the translations as part of the library.

As for the hybrid approach, I don’t see any real life use case for it. If a library can be used as “self contained” it is probably better to use it that way and get automatic translation updates whenever they are released, instead of asking your theme/plugin translators to retranslate just because there was a change in the library.

BTW, my failure was due to the library in that question using the hybrid strategy while I thought it was “self contained”, because I failed to notice the text domain initialization variants. This calls for an article about how wordpress theme and plugin writers misuse the OOP paradigm….

CodeCanyon rejected my plugin because it was too simple…

Preface: I have no problem with CodeCanyon, or with being rejected. The guys running CodeCanyon know much better than I do how to run their business, and if they rejected my plugin they probably thought it would not sell, at least not in an amount worth their trouble.

I submitted my simple google authorship and avatar plugin to CodeCanyon to test whether it is possible to make money from developing and selling general purpose wordpress plugins. I was hoping that with proper coding and documentation I might be able to get some more money every month without having to talk/convince/argue with clients, which is the most stressful part of being a freelancer.

The plugin was designed to be simple (as the name implies 😉 ) in two ways

For the user – simple to use, as I wanted to reduce the amount of time I might need to spend answering questions about how to use it
For me – simple to code and maintain, as it was a test in which I didn’t want to commit too much time because I had no idea what the return on the time investment would be.

Of course simple is too often confused with trivial, and this plugin wasn’t totally trivial as I had to create a proxy server for it to be able to easily access data by using google API.

I estimate that it took me 5 days to write the plugin including research, coding, QA and documentation. I charge at least 50$ per hour for freelance work, and assuming I worked 8 hours a day (5 × 8 × 50$) my time investment into creating the plugin was worth about 2k$. Even before I started coding, when I had just decided on the scope of the plugin, I knew there was little chance that I would sell enough copies to recoup the development effort.

The rejection letter suggested that I add more meat to the plugin. For me that was wrong in two ways

The google API I used accesses publicly available information from the user’s google profile and therefore requires only “read” permission, which I assume users will be more likely to give. The plugin already utilizes every aspect of that specific API and there is just nothing else that can be done with it. Adding functionality from other APIs is possible, but then I will most likely end up with two sets of functionality, each deserving a plugin by itself, forced to live in one plugin just to make it sellable
I already invested 2k$ worth of my time into this, and I’m totally not convinced that if I invest another 2k$ I will have a better chance of earning 4k$ in reasonable time. I lost (a somewhat imaginary) 2k$, there is no point in being in the position of losing 4k$.

The thing is that I might do much better by releasing the plugin under the GPL license into the wordpress plugin repository. Since the documentation can be bare bones and I am not required to support the plugin if I don’t want to, the development cost is lower and I might get better money from donations or requests for modification (realistically neither will happen, but no one guarantees a minimum amount of sales on CodeCanyon either). At the minimum it will increase my reputation as a wordpress developer.

Simple google authorship and avatar plugin

Provides an easy to use interface for site authors to claim google authorship, and use their google profile picture as an avatar.

Main features

Users authenticate their google accounts via the OAuth protocol and the plugin gets the required info using the appropriate google API calls.

Inserts a <link href="User's G+ profile URL" rel="author"> into the head part of the HTML for any content type created by the user. The existence of the link lets google know that the user is the author of the content, so it can display his picture next to search results.

Uses the user’s google profile picture as the avatar image in comments and admin

Back story

User authentication via the OAuth protocol eliminates the need for the user to find out what his google profile URL is, and to copy&paste it without a mistake. An already registered application is used as a proxy for authentication and for using the API, so no extra work is involved. For the user, authentication and gathering the required info are just one click away.

Once authenticated, anyone who can edit the user’s profile can set it to show or hide authorship and use google picture as avatar.

Network considerations

The plugin will work for any sub site in the network for which it was activated (or if the plugin is network activated).

Any change the user makes to his profile will impact all the sites in the network on which the user has posted or commented.

Limitations

The user still needs to add the site to his G+ profile “contributor to” or “profile” links in order for the google authorship to work.
Only the user can authenticate himself, therefore an admin cannot set authorship or avatar on behalf of the user.
In network usage, sub site admins can’t edit other users’ profiles, therefore they will not have any control and only the super admin will be able to change the settings.

Download it!

Installation/Usage instructions

1. Follow the usual procedure of installing and activating a plugin
2. Go to your profile and click the “get info” link under the “google profile” section
3. You will be asked to authenticate yourself to google for an “Identify google user for WordPress” app
4. Once authenticated you should be redirected back to your profile page and see your google profile image displayed in the “google profile” section
5. Don’t forget to edit the “contributor to” URLs in your google profile to include the site
6. Ask all authors who are interested in getting authorship to do steps 2-5.

Gravatar makes it too easy to impersonate a commenter on a wordpress blog

Gravatar is a service which provides a globally recognizable avatar to people who sign up for it. It is used in the comments section of a wordpress site when the site is configured to show the comment author’s avatar next to his comment, which is the default configuration in wordpress.

Gravatar associates an email address with an image. There is a simple algorithm that converts the email address to a url at gravatar.com and if you use the url as the “src” attribute of HTML IMG tag the image is displayed.
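A rough sketch of that conversion, assuming the classic scheme as I understand it (an MD5 hash of the trimmed, lowercased address appended to the avatar URL). The function name gravatarUrl is made up for this example, and it uses Node's crypto module only to compute the hash.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds the avatar URL for an email address.
// Steps: trim whitespace, lowercase, MD5, hex encode.
function gravatarUrl(email, size = 80) {
  const normalized = email.trim().toLowerCase();
  const hash = crypto.createHash('md5').update(normalized).digest('hex');
  // "s" is the requested image size in pixels
  return 'https://www.gravatar.com/avatar/' + hash + '?s=' + size;
}

// Anyone who knows the address can produce the same URL:
console.log(gravatarUrl(' Someone@Example.com '));
```

Note that nothing in this computation requires any proof that the caller owns the address, which is exactly what the rest of this post is about.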

This simple functionality is great for wordpress, since an email address is almost always required in order to post a comment, and for the many other services which require an email address on registration.

The problem is that there is no verification that the email address actually belongs to the commenter. If I know the email of someone that I hate (let’s call him X) I can go and use it on some controversial site (porn, extreme political views, etc), leave a sympathetic comment and then direct people that we both know to surf to that site and learn about the true nature of X. This way X’s reputation might be destroyed without him even knowing about it, and all that just because his picture automatically appears next to a comment identified by his email address.

But aren’t email addresses semi public information, and haven’t they always been? You could always use someone else’s email address to impersonate him, so what is new?
The difference is that usually email addresses were not displayed, because of spam avoidance measures, but the use of gravatar, while not directly exposing the address itself, does expose its owner.

In my opinion a gravatar should not be displayed if there was no verification that its owner actually knows it will appear on your site. For example gravatar is being used on StackExchange, but there the email address is not freely submitted but rather retrieved from services which provide strong identification like google, facebook and twitter. You can probably still impersonate someone if he doesn’t have a registered user at one of those services, but it is harder to do unnoticed.

Update: I opened a ticket asking that wordpress by default show pictures from gravatar only for registered users.

Update 2: Since the ticket did not get any traction, I created a plugin that at least prevents the impersonation of registered users in the comments of a specific site.

rel=”me” and rel=”author” are confusing because they fail to explain where they should point to

The microformats wiki explains rel=”author” as (emphasis mine)

rel="author" is for relating an article or post to a page or site representing its author, typically to give them credit for their work (or portions of it, like books, articles, blog posts etc).

…

The rel="author" attribute indicates that the destination of the link represents the author of the current page (or post).

And the rel=”me” is

XFN 1.1 introduced the “me” rel value which is used to indicate profile equivalence and for identity-consolidation.

rel="me" is used on hyperlinks from one page about a person to other pages about that same person.

…

Thus establishing a bi-directional rel-me link and confirming that the two URLs represent the same person.

At first read the definitions are simple and understandable. The problems arise when trying to implement them, due to the subjective and fluid nature of the terms “profile” and “represent“.

What is a profile, and more importantly what is my profile? Is it just some web page whose title contains the word “profile” and my name? Should it be officially sanctioned as a profile by a big company like google or facebook, or can I make my own? What information makes a page a profile? Can someone else write my profile, is the wikipedia page about me a valid profile to use? Can my profile be generated automatically, can a google search results page for my name serve as my profile?
And why do I need to link my profiles, isn’t it more logical to simply have only one profile if they can be linked? People have more than one profile to show separate sides of their personality to different audiences, for example professional and personal profiles, and linking them runs contrary to the thought process that resulted in the creation of two distinct profiles.

Representation is even harder to understand. My blog represents me, but there is no one page on it that does it by itself. Yes, I wrote an “about me” page, but this is usually the first page being written and one that is almost never updated to reflect any changes. What better represents a practicing book writer, his blog or his official page at his publisher’s site? Should it be a representation that I simply endorse, or does it have to be written by me?

If you have only one site or you participate in only one social network it is probably not too hard to figure out these relationships, but once you have more than one site and participate in more than one network, deciding what is your main profile and organizing the relationships is something you need to put some work into. And what do you get in return for your work? Nothing. Google and the rest of the social search companies get some more data to build their social graph, from which they can make money, and you at best get a small icon of yours next to an excerpt of what you wrote, in a page where they place ads from which they make money.

Right now, the way I see it, the main problem with rel=”author” and rel=”me” is convincing people to care about setting them in a way which is meaningful and consistent. For now google sells its authorship requirements under the implicit promise of SEO improvements, but what if the improvements are not delivered, and what about people who care nothing about SEO?

Right now techcrunch uses rel=”me” to point to its G+ profile (line 1 below), and if techcrunch can’t (or doesn’t want to) handle this correctly, how many site owners will?

<link rel="me" type="text/html" href="http://www.google.com/profiles/techcrunch"/>
<link rel="alternate" type="application/rss+xml" title="TechCrunch RSS Feed" href="http://feedproxy.google.com/TechCrunch" />
<link rel="pingback" href="http://techcrunch.com/xmlrpc.php" />
<link rel="icon" type="image/x-icon" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/images/favicon.ico?m=1357660109g" />
<link rel="shortcut icon" type="image/x-icon" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/images/favicon.ico?m=1357660109g" />
<link rel="stylesheet" id="style-css" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/style.css?m=1357603790g" type="text/css" media="all" />
<link href="https://plus.google.com/103037366582313115962/" rel="publisher" />

Which brings us to fake profiles and false attribution, but this article is TL;DR as it is and there is no point in making it longer.

How many authors are in a blog page? many!

In my opinion, in the discussion around google authorship there is too much emphasis on the authorship of the main content writer, and almost no mention that the content indexed by google is also made up of comments, and they have authors as well.

This is even more obvious in forums and Q&A sites. Who is the author of a page on stackexchange, the one who asked the question or the ones who supplied the answers? It is even more complex on wiki sites.

It feels like while people were rushing to see faces on the search results, hoping for some SEO juice, they haven’t tried to read the text of the spec

For a and area elements, the author keyword indicates that the referenced document provides further information about the author of the nearest article element ancestor of the element defining the hyperlink, if there is one, or of the page as a whole, otherwise.

And article elements are not necessarily the whole content of the page (emphasis mine)

The article element represents a self-contained composition in a document, page, application, or site and that is, in principle, independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a blog entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.
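To make the scoping rule from the two quotes concrete, here is a minimal sketch. It uses plain objects with a parent pointer as a stand-in for DOM nodes, and the function name authorScope and the sample ids are invented for the example.

```javascript
// Returns the nearest <article> ancestor of a rel="author" link, or null
// when the link describes the author of the page as a whole.
function authorScope(node) {
  for (let current = node.parent; current; current = current.parent) {
    if (current.tagName === 'article') return current;
  }
  return null;
}

// A blog post with one comment, each marked up as an article element.
const page = { tagName: 'body', parent: null };
const post = { tagName: 'article', id: 'post', parent: page };
const comment = { tagName: 'article', id: 'comment-1', parent: post };

const pageLink = { tagName: 'a', rel: 'author', parent: page };
const postLink = { tagName: 'a', rel: 'author', parent: post };
const commentLink = { tagName: 'a', rel: 'author', parent: comment };

console.log(authorScope(commentLink).id); // comment-1
console.log(authorScope(postLink).id);    // post
console.log(authorScope(pageLink));       // null
```

In a browser the same lookup should be roughly link.closest('article'). The point is that by the spec a link inside a comment names the author of that comment, not of the post.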

But google right now will not let me claim authorship over my comments. The way you need to configure your profile to claim authorship is just not user friendly enough to do it for every site I comment on, it is just too much work.

Why would I as a site owner wish to let commenters claim authorship of their comments? Because if I have quality commenters, people might come to my site because they follow them.

google fails to understand that authorship is a markup territory and not display territory

A quote from the webmaster tool help page

Hidden markup Make sure that your rel="author" link is not invisible to humans using techniques like display:none or CSS. Broadly speaking, Google won’t display any information that cannot be viewed by humans.

As if there is a way for a human to see the relationship info without viewing the source HTML. It is as if google needs more incoming links into the g+ profile pages to promote them in search results….

But even google understands how stupid this rule can be in practice and allows authorship info to be specified in link tags in the header (which is actually one of the options the HTML5 spec explicitly allows).

Removing query strings (parameters) from URLs

I get annoyed when my browser’s address bar is full of meaningless parameters that were appended to the “real” URL and which make no sense to me. The problem with the “junk” is that people copy&paste the URL from the address bar, which makes the junk propagate all around the web.

This might even do actual damage if the site owner relies on the parameters to differentiate between sources of traffic.

It is a good thing that one of the features of HTML5, controlling browser history, can be used to update the address bar to a new URL without making a redirect

<script type="text/javascript" charset="utf-8">
  // placeholder: put the canonical URL for the address here (must be same origin)
  var url = '/canonical/path/';
  if (typeof history.replaceState === 'function') { // check html5 functionality support
    var data = {dummy:true};
    history.replaceState(data,'',url);
  }
</script>

This will work for modern browsers only, but who really cares about IE users? They deserve the clutter! 😉
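If the canonical URL is not known in advance, a variation is to strip only the parameters you consider junk. This is a sketch under my own assumptions: the helper name stripQueryParams and the list of parameter names are made up for illustration, and it relies on the standard URL class.

```javascript
// Hypothetical helper: returns the URL without the listed query parameters.
function stripQueryParams(href, junkParams) {
  const url = new URL(href);
  for (const name of junkParams) {
    url.searchParams.delete(name);
  }
  return url.toString();
}

// These parameter names are assumptions, adjust them to your own junk.
const JUNK_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign'];

// Only touch the address bar when running in a browser that supports it.
if (typeof window !== 'undefined' && typeof history.replaceState === 'function') {
  const cleaned = stripQueryParams(window.location.href, JUNK_PARAMS);
  if (cleaned !== window.location.href) {
    history.replaceState(null, '', cleaned);
  }
}
```

Parameters that are not on the list stay in place, so links that really depend on a query string keep working.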

WordPress comments suck at authentication

I am sure I will not shock anyone by saying that an email address by itself is not good enough for authentication. It is too easy to fabricate an email address, to create a one time address, or to use someone else’s address, so why exactly do we still use it as an authentication token in wordpress comments?

It is not that getting the email of a commenter is a bad idea, it is just that it is not enough for authentication. What is needed is a way to prove that said email actually belongs to that person. One idea is to send a mail to the email address and ask its owner to confirm the submission of the comment. After verifying the email it will make more sense to get profile data from gravatar with this email address.
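One possible way to implement the confirmation mail, sketched under my own assumptions (this is not something wordpress does): mail the commenter a link carrying an HMAC of the comment id and the address, and approve the comment only when the link comes back with a matching token. The function names and the SECRET value are hypothetical.

```javascript
const crypto = require('crypto');

// Assumption: a secret known only to the site, never sent to the commenter.
const SECRET = 'replace-with-a-site-specific-secret';

// Token to embed in the confirmation link mailed to the address.
function confirmationToken(commentId, email) {
  return crypto.createHmac('sha256', SECRET)
    .update(commentId + ':' + email.trim().toLowerCase())
    .digest('hex');
}

// Checks the token that came back from the clicked confirmation link.
function isValidConfirmation(commentId, email, token) {
  const expected = Buffer.from(confirmationToken(commentId, email), 'hex');
  const given = Buffer.from(String(token), 'hex');
  // compare lengths first, timingSafeEqual throws on a length mismatch
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```

Only someone who can read the mailbox ever sees the token, so a valid confirmation shows that the commenter at least controls the address.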

And there is a different approach that avoids using emails for authentication: use the commenter’s profile on the web. Most commenters have a facebook/google/twitter/tumblr/wordpress.com/flickr account with a profile, so just let them authenticate with those profiles. You can even get an avatar image and maybe a name that you can use to identify them to the readers when displaying the comment.

This does not necessarily work against anonymity, but you will probably be more inclined to approve an authenticated comment than one which is practically anonymous.