CodeCanyon rejected my plugin because it was too simple…

Preface: I have no problem with CodeCanyon, or with being rejected. The guys running CodeCanyon know much better than I do how to run their business, and if they rejected my plugin they probably thought it would not sell, at least not in an amount worth their trouble.

I submitted my simple Google authorship and avatar plugin to CodeCanyon to test whether it is possible to make money from developing and selling general-purpose WordPress plugins. I hoped that with proper coding and documentation I might be able to earn some extra money every month without having to talk to, convince, or argue with clients, which is the most stressful part of being a freelancer.

The plugin was designed to be simple (as the name implies 😉 ) in two ways:

  • For the user – simple to use, as I wanted to reduce the amount of time I might need to spend answering questions about how to use it.
  • For me – simple to code and maintain, as it was a test to which I didn’t want to commit too much time because I had no idea what the return on the time investment would be.

Of course, simple is too often confused with trivial, and this plugin wasn’t totally trivial: I had to create a proxy server so that it could easily access data through the Google API.

I estimate that it took me 5 days to write the plugin, including research, coding, QA and documentation. I charge at least $50 per hour for freelance work, so assuming I worked 8 hours a day, my time investment in creating the plugin was worth about $2k (5 days × 8 hours × $50). Even before I started coding, when I had just decided on the scope of the plugin, I knew there was little chance that I would sell enough copies to recoup the development effort.

The rejection letter suggested that I add more meat to the plugin. For me that was wrong in two ways:

  • The Google API I used accessed publicly available information from the user’s Google profile and therefore required only “read” permission, which I assume users are more likely to give. The plugin already utilized every aspect of that specific API and there is just nothing else that can be done with it. Adding functionality from other APIs is possible, but then I would most likely end up with two sets of functionality, each deserving a plugin of its own, forced to live in one plugin just to make it sellable.
  • I already invested $2k worth of my time in this, and I’m totally not convinced that if I invest another $2k I will have a better chance of earning $4k in a reasonable time. I lost a (somewhat imaginary) $2k; there is no point in putting myself in the position of losing $4k.

The thing is that I might do much better by releasing the plugin under the GPL license in the WordPress plugin repository. Since the documentation can be bare bones and I am not required to support the plugin if I don’t want to, the development cost is lower, and I might get better money from donations or requests for modifications (realistically neither will happen, but no one guarantees a minimum number of sales on CodeCanyon either). At a minimum it will improve my reputation as a WordPress developer.

Gravatar makes it too easy to impersonate a commenter on a WordPress blog

Gravatar is a service which provides a globally recognizable avatar to people who sign up for it. It is used in the comments section of a WordPress site when the site is configured to show the comment author’s avatar next to the comment, which is the default configuration in WordPress.

Gravatar associates an email address with an image. There is a simple algorithm that converts the email address to a URL at gravatar.com, and if you use that URL as the “src” attribute of an HTML IMG tag, the image is displayed.
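To make the “simple algorithm” concrete, here is a minimal Python sketch of the classic scheme as I understand it: trim and lowercase the address, take its MD5 hash, and append the hash to the avatar URL. The function name `gravatar_url` is my own and the email address is a made-up example.

```python
import hashlib


def gravatar_url(email):
    """Build the avatar URL for an email address (illustrative sketch)."""
    # Surrounding whitespace and letter case are ignored by the scheme.
    normalized = email.strip().lower()
    # The hex MD5 digest of the normalized address identifies the image.
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    return "https://www.gravatar.com/avatar/" + digest


# Hypothetical address, used only for illustration.
print(gravatar_url("  Someone@Example.com "))
```

Note that nothing in this computation requires access to the mailbox: anyone who knows the address can produce the same URL.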

This simple functionality is great for WordPress, since an email address is almost always required in order to post a comment, and for the many other services which require an email address on registration.

The problem is that there is no verification that the email address actually belongs to the commenter. If I know the email of someone that I hate (let’s call him X), I can go and use it on some controversial site (porn, extreme political views, etc.), leave a sympathetic comment, and then direct people that we both know to surf to that site and learn about the true nature of X. This way X’s reputation might be destroyed without him even knowing about it, and all that just because his picture automatically appears next to a comment identified by his email address.

But aren’t email addresses semi-public information, and haven’t they always been? You could always use someone else’s email address to impersonate him, so what is new?
The difference is that email addresses were usually not displayed, as a spam avoidance measure, but the use of Gravatar, while not directly exposing the address itself, does expose its owner.

In my opinion gravatars should not be displayed unless there is some verification that the owner actually knows the image will appear on your site. For example, Gravatar is used on StackExchange, but the email address there is not freely submitted; it is retrieved from services which provide strong identification, like Google, Facebook and Twitter. You can probably still impersonate someone if he doesn’t have an account at one of those services, but it is harder to do unnoticed.

 

Update: I opened a ticket proposing that WordPress, by default, show Gravatar pictures only for registered users.

 

Update 2: Since the ticket did not get any traction, I created a plugin that will at least prevent the impersonation of registered users in the comments of a specific site.

rel="me" and rel="author" are confusing because they fail to explain where they should point

The microformats wiki explains rel="author" as (emphasis mine):

rel="author" is for relating an article or post to a page or site representing its author, typically to give them credit for their work (or portions of it, like books, articles, blog posts etc).

The rel="author" attribute indicates that the destination of the link represents the author of the current page (or post).

And rel="me" is described as:

XFN 1.1 introduced the “me” rel value which is used to indicate profile equivalence and for identity-consolidation.

rel="me" is used on hyperlinks from one page about a person to other pages about that same person.

Thus establishing a bi-directional rel-me link and confirming that the two URLs represent the same person.

At first read the definitions are simple and understandable; the problems arise when trying to implement them, due to the subjective and fluid nature of the terms “profile” and “represent”.

What is a profile, and more importantly, what is my profile? Is it just some web page whose title contains the word “profile” and my name? Should it be officially sanctioned as a profile by a big company like Google or Facebook, or can I make my own? What information makes a page a profile? Can someone else write my profile; is the Wikipedia page about me a valid profile to use? Can my profile be generated automatically; can a Google search results page for my name serve as my profile?
And why do I need to link my profiles? Isn’t it more logical to simply have only one profile if they can be linked? People have more than one profile in order to show separate sides of their personality to different audiences, for example professional and personal profiles, and linking them runs contrary to the thought process that resulted in the creation of two distinct profiles.

Representation is even harder to understand. My blog represents me, but there is no one page on it that does so by itself. Yes, I wrote an “about me” page, but this is usually the first page to be written and one that is almost never updated to reflect any changes. What better represents a practicing book writer: his blog or his official page at his publisher’s site? Should it be a representation that I simply endorse, or does it have to be written by me?

If you have only one site or you participate in only one social network, it is probably not too hard to figure out these relationships. But once you have more than one site and participate in more than one network, deciding what your main profile is and organizing the relationships is something you need to put some work into. And what do you get in return for your work? Nothing. Google and the rest of the social search companies get some more data to build their social graph, from which they can make money, and you at best get a small icon of yourself next to an excerpt of what you wrote, on a page where they place ads from which they make money.

The way I see it, the main problem with rel="author" and rel="me" right now is convincing people to care about setting them in a way which is meaningful and consistent. For now Google sells its authorship requirements under the implicit promise of SEO improvements, but what if the improvements are not delivered, and what about people who care nothing about SEO?

Right now TechCrunch uses rel="me" to point to its G+ profile (the first line below), and if TechCrunch can’t (or doesn’t want to) handle this correctly, how many site owners will?

<link rel="me" type="text/html" href="http://www.google.com/profiles/techcrunch"/>
<link rel="alternate" type="application/rss+xml" title="TechCrunch RSS Feed" href="http://feedproxy.google.com/TechCrunch" />
<link rel="pingback" href="http://techcrunch.com/xmlrpc.php" />
<link rel="icon" type="image/x-icon" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/images/favicon.ico?m=1357660109g" />
<link rel="shortcut icon" type="image/x-icon" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/images/favicon.ico?m=1357660109g" />
<link rel="stylesheet" id="style-css" href="http://s2.wp.com/wp-content/themes/vip/tctechcrunch2/style.css?m=1357603790g" type="text/css" media="all" />
<link href="https://plus.google.com/103037366582313115962/" rel="publisher" />

Which brings us to fake profiles and false attribution, but this article is already TL;DR as it is, and there is no point in making it longer.

How many authors are there on a blog page? Many!

In my opinion, the discussion around Google authorship puts too much emphasis on the authorship of the main content writer, with almost no mention that the content indexed by Google is also made up of comments, and those have authors as well.

This is even more obvious in forums and Q&A sites. Who is the author of a page on StackExchange: the one who asked the question or the ones who supplied the answers? It is even more complex on wiki sites.

It feels like while people were rushing to see faces in the search results, hoping for some SEO juice, they didn’t try to read the text of the spec:

For a and area elements, the author keyword indicates that the referenced document provides further information about the author of the nearest article element ancestor of the element defining the hyperlink, if there is one, or of the page as a whole, otherwise.

And article elements are not necessarily the whole content of the page (emphasis mine):

The article element represents a self-contained composition in a document, page, application, or site and that is, in principle, independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a blog entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.
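Read together, the two quotes mean that a single page can carry several author relationships, one per article element. The following Python sketch is my own illustration of that scoping rule, not anything Google or the spec ships: it walks the markup and groups rel="author" links by their nearest article ancestor. The class name, function name and sample markup are all made up.

```python
from html.parser import HTMLParser


class AuthorLinkParser(HTMLParser):
    """Collects rel="author" links, grouped by nearest <article> ancestor."""

    def __init__(self):
        super().__init__()
        self.open_articles = []    # stack of ids for currently open <article> elements
        self.next_article_id = 0
        self.authors = {}          # article id (None = page as a whole) -> list of hrefs

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "article":
            self.open_articles.append(self.next_article_id)
            self.next_article_id += 1
        elif tag in ("a", "area"):
            rel_values = (attributes.get("rel") or "").lower().split()
            href = attributes.get("href")
            if "author" in rel_values and href:
                # Innermost open article wins; otherwise the link is page-wide.
                scope = self.open_articles[-1] if self.open_articles else None
                self.authors.setdefault(scope, []).append(href)

    def handle_endtag(self, tag):
        if tag == "article" and self.open_articles:
            self.open_articles.pop()


def author_links(html):
    parser = AuthorLinkParser()
    parser.feed(html)
    parser.close()
    return parser.authors


# Hypothetical page: a post with one nested comment and a site-wide footer link.
SAMPLE_PAGE = """
<article>
  <a rel="author" href="/authors/post-writer">Post writer</a>
  <p>The blog post itself.</p>
  <article>
    <a rel="author" href="/authors/commenter">Commenter</a>
    <p>A user-submitted comment.</p>
  </article>
</article>
<footer><a rel="author" href="/about">Site owner</a></footer>
"""

print(author_links(SAMPLE_PAGE))
```

For the sample page this yields three separate scopes: the post (id 0), the nested comment (id 1) and the page as a whole (None), each with its own author link.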

But right now Google will not let me claim authorship of my comments :( . The way you need to configure your profile to claim authorship is just not user friendly enough to do for every site I comment on; it is simply too much work.

Why would I, as a site owner, wish to let commenters claim authorship of their comments? Because if I have quality commenters, people might come to my site because they follow them.

Google fails to understand that authorship is markup territory, not display territory

A quote from the Webmaster Tools help page:

Hidden markup: Make sure that your rel="author" link is not invisible to humans using techniques like display:none or CSS. Broadly speaking, Google won’t display any information that cannot be viewed by humans.

As if there were a way for a human to see the relationship info without viewing the HTML source. It is as if Google needs more incoming links to the G+ profile pages to promote them in search results…

But even Google understands how stupid this rule can be in practice, and allows authorship info to be specified in link tags in the header (which is actually one of the options the HTML5 spec explicitly provides).

WordPress comments suck at authentication

I am sure I will not shock anyone by saying that an email address by itself is not good enough for authentication. It is too easy to fabricate an email address, to create a one-time one, or to use someone else’s address, so why exactly do we still use it as an authentication token in WordPress comments?

It is not that getting the email of a commenter is a bad idea; it is just that it is not enough for authentication. What is needed is a way to prove that said email actually belongs to that person. One idea is to send a mail to the address and ask its owner to confirm the submission of the comment. Once the email has been verified, it makes more sense to get profile data from Gravatar with this email address.
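One way the confirmation step could work, sketched in Python under my own assumptions: the site signs the (email, comment id) pair with a secret and mails the resulting token as part of a confirmation link; the comment is treated as verified only when the same token comes back. `SECRET_KEY`, the function names and the sample values are hypothetical, and actually sending the mail is left out.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real site would load it from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def make_confirmation_token(email, comment_id, secret=SECRET_KEY):
    """Token to embed in the confirmation link mailed to the commenter."""
    message = "{}:{}".format(email.strip().lower(), comment_id).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_confirmed(email, comment_id, token, secret=SECRET_KEY):
    """True only if the token matches this exact email and comment."""
    expected = make_confirmation_token(email, comment_id, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, token)


# Illustration with made-up values.
token = make_confirmation_token("someone@example.com", 42)
print(is_confirmed("someone@example.com", 42, token))      # the real owner clicks the link
print(is_confirmed("impostor@example.com", 42, token))     # token does not fit another address
```

Only someone who can read the mailbox ever sees the token, which is exactly the proof of ownership that a typed-in address lacks.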

And there is a different approach that avoids using emails for authentication: use the commenter’s profile on the web. Most commenters have a Facebook/Google/Twitter/Tumblr/WordPress.com/Flickr account with a profile, so just let them authenticate with their profiles. You can even get an avatar image and maybe a name that you can use to identify them to the readers when displaying the comment.

This does not necessarily work against anonymity, but you will probably be more inclined to approve an authenticated comment than one which is practically anonymous.

Almost* all wordpress themes suck at comment form design

A naive person might assume that the most important part of a comment is the content of the comment itself. It is a pity that WordPress theme designers are not naive, and understand that site owners want to know who is commenting as much as they are interested in the content.

For most themes the flow of submitting a comment is as follows:

  1. Enter your name
  2. Enter your e-mail address
  3. Enter your web site
  4. If you still remember what you wanted to write (and still have time to do that), at last you can do it

The ridiculous aspect of this scheme is that steps 1-3 do not ensure that the owner will know who made the comment, as it is just as easy to provide a well-formed email address which doesn’t exist or does not belong to the commenter, and anything goes for name and website.

So steps 1-3 are just obstacles that do not necessarily provide any value. Isn’t it better to let people write comments and then, only if they feel like it, identify themselves? Even without a radical design change that lets people submit a comment and identify themselves only afterwards (Blogger kind of works that way), just emphasizing the comment content by putting it on top can improve the commenting experience.

Comment form should be in this order:

  1. Enter Comment
  2. Enter e-mail (let site owner contact you)
  3. Enter name (you might want to be identified by name if you are a returning commenter, or so people can easily refer to you in the discussion)
  4. Enter your website (only SEO wannabes care about that)

*Almost – just because there might be one or two that I don’t know about which get it right.

Edit: Looks like I’m not alone in going down this road, and there is even code there

WebMatrix is a nice idea by Microsoft, but I spent too much time trying to make MySQL run on my PC and just gave up when the WebMatrix installation failed. I will have to make do with the unpolished interface of WAMP.

Good site search is hard to write

The main problem with writing a good site search is that its users are dumb humans, and that Google has conditioned them to expect that software can actually understand and respond correctly to their incoherent chatter.

Humans who use search will not bother to understand what kind of information is on your site, or what spelling the site uses for the words they are looking for.

They will not bother to check whether you spell color as colour, and will expect you to understand that when they wrote apropriate they actually meant appropriate. And not all users are native English speakers, so sometimes they are totally guessing at the correct spelling.

There are great algorithms like Soundex and Metaphone, but they are mainly geared towards US English users…
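To show what such an algorithm buys you, here is a minimal Python sketch of classic American Soundex, written from my understanding of the standard rules rather than taken from any particular library. It reduces a word to a letter plus three digits, so words that sound alike in English collapse to the same key. The function name and the sample words are mine.

```python
# Consonant groups used by classic American Soundex.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(word):
    """Return the four-character Soundex key of a word (illustrative sketch)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "hw":
            continue                      # h and w do not separate equal codes
        code = _CODES.get(letter, "")     # vowels and y have no code
        if code and code != previous:
            key += code
        previous = code                   # a vowel resets the "same code" check
    return (key + "000")[:4]              # pad or truncate to letter + 3 digits


for pair in [("color", "colour"), ("apropriate", "appropriate")]:
    print(pair, [soundex(w) for w in pair])
```

Both spelling pairs from the text end up with identical keys (C460 and A161), so an index keyed on Soundex would match them. The consonant groups, however, are tuned to English pronunciation, which is exactly the limitation mentioned above.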

No wonder the entry barrier in the search engine market is high. You actually need to employ linguistic experts to develop algorithms that can parse user input in every language in order to compete with Google.