Paul Buchheit: January 2009

Thursday, January 22, 2009

Communicating with code

Some people can sell their ideas with a brilliant speech or a slick powerpoint presentation.

I can't.

Maybe that's why I'm skeptical of ideas that are sold via brilliant speeches and slick powerpoints. Or maybe it's because it's too easy to overlook the messy details, or to get caught up in details that seem very important, but aren't. I also get very bored by endless debate.

We did a lot of things wrong during the 2.5 years of pre-launch Gmail development, but one thing we did very right was to always have live code. The first version of Gmail was literally written in a day. It wasn't very impressive -- all I did was take the Google Groups (Usenet search) code (my previous project) and stuff my email into it -- but it was live and people could use it (to search my mail...). From that day until launch, every new feature went live immediately, and most new ideas were implemented as soon as possible. This resulted in a lot of churn -- we re-wrote the frontend about six times and the backend three times by launch -- but it meant that we had direct experience with all of the features. A lot of features seemed like great ideas, until we tried them. Other things seemed like they would be big problems or very confusing, but once they were in we forgot all about the theoretical problems.

The great thing about this process was that I didn't need to sell anyone on my ideas. I would just write the code, release the feature, and watch the response. Usually, everyone (including me) would end up hating whatever it was (especially my ideas), but we always learned something from the experience, and we were able to quickly move on to other ideas.

The most dramatic example of this process was the creation of content targeted ads (now known as "AdSense", or maybe "AdSense for Content"). The idea of targeting our keyword based ads to arbitrary content on the web had been floating around the company for a long time -- it was "obvious". However, it was also "obviously bad". Most people believed that it would require some kind of fancy artificial intelligence to understand the content well enough to target ads, and even if we had that, nobody would click on the ads. I thought they were probably right.

However, we needed a way for Gmail to make money, and Sanjeev Singh kept talking about using relevant ads, even though it was obviously a "bad idea". I remained skeptical, but thought that it might be a fun experiment, so I connected to the ads database (I assure you, random engineers can no longer do this!), copied out all of the ads+keywords, and did a little bit of sorting and filtering with some unix shell commands. I then hacked up the "adult content" classifier that Matt Cutts and I had written for safe-search, linked that into the Gmail prototype, and then loaded the ads data into the classifier. My change to the classifier (which completely broke its original functionality, but this was a separate code branch) changed it from classifying pages as "adult", to classifying them according to which ad was most relevant. The resulting ad was then displayed in a little box on our Gmail prototype ui. The code was rather ugly and hackish, but more importantly, it only took a few hours to write!

I then released the feature on our unsuspecting userbase of about 100 Googlers, and then went home and went to sleep. The response when I returned the next day was not what I would classify as "positive". Someone may have used the word "blasphemous". I liked the ads though -- they were amusing and often relevant. An email from someone looking for their lost sunglasses got an ad for new sunglasses. The lunch menu had an ad for balsamic vinegar.

More importantly, I wasn't the only one who found the ads surprisingly relevant. Suddenly, content targeted ads switched from being a lowest-priority project (unstaffed, will not do) to being a top priority project, an extremely talented team was formed to build the project, and within maybe six months a live beta was launched. Google's content targeted ads are now a big business with billions of dollars in revenue (I think).

Of course none of the code from my prototype ever made it near the real product (thankfully), but that code did something that fancy arguments couldn't do (at least not my fancy arguments): it showed that the idea and product had real potential.

The point of this story, I think, is that you should consider spending less time talking, and more time prototyping, especially if you're not very good at talking or powerpoint. Your code can be a very persuasive argument.

The other point is that it's important to make prototyping new ideas, especially bad ideas, as fast and easy as possible. This can be especially difficult as a product grows. It was easy for me to stuff random broken features into Gmail when there were only about 100 users and they all worked for Google, but it's not so simple when there are 100 million users.

Fortunately for Gmail, they've recently found a rather clever solution that enables the thousands of Google engineers to add new ui features: Gmail Labs. This is also where Google's "20% time" comes in -- if you want innovation, it's critical that people are able to work on ideas that are unapproved and generally thought to be stupid. The real value of "20%" is not the time, but rather the "license" it gives to work on things that "aren't important". (perhaps I should do a post on "20% time" at some point...)

One of the best ways to enable prototyping and innovation on an established product is through an API. Twitter is possibly the best example of how well this can work. There are thousands of different Twitter clients, with new ones being written every day, and I believe a majority of Twitter messages are entered through one of these third-party clients.

Public APIs enable everyone to experiment with new ideas and create new ways of using your product. This is incredibly powerful because no matter how brilliant you and your coworkers are, there are always going to be smarter people outside of your company.

At FriendFeed, we discovered that our API does more than enable great apps, it also reveals great app developers. Gary and Ben were both writing FriendFeed apps using our API before we hired them. When hiring, you don't have to guess which people are "smart and gets things done", you can simply observe it in the wild :)

In my previous post, I asked people to describe their "ideal FriendFeed". Since then, I've been thinking about ideas for my "ideal FriendFeed". Unfortunately, it's very difficult for me to know how much I like an idea based only on words or mockups -- I really need to try it out. So in the spirit of prototyping, I've used my spare time to write a simple FriendFeed interface that prototypes some of the things I've been thinking about. This interface isn't the "future of FriendFeed", it's just a collection of ideas, some that I like, and some that I don't. One thing that's kind of cool about it (from a prototyping perspective) is that it's written entirely in Javascript running in the web browser -- it's just a single web page that uses FriendFeed's JSON APIs to fetch data. This also means that it's relatively easy for other people to copy and change -- you don't even need a server!

If you'd like to try it out, you can see everyone that I'm subscribed to (assuming their feed is public), or if you are a FriendFeed user, you can see all of your public subscriptions by going to http://paulbuchheit.github.com/xfeed.html#YOUR_NICKNAME_GOES_HERE. The complete source code (which is just several hundred lines of HTML and JS) is here. In this prototype, I'm experimenting with treating entries, comments, and likes all as simple "messages", only showing comments from the user's friends (which can be a little confusing), and putting it all in reverse-chronological order. As I mentioned, this interface isn't the "future of FriendFeed", it's just a collection of ideas that I'm playing with.

If you're interested in prototyping something, feel free to take this code and have your way with it. As always, I'd love to see your prototypes in action!

Tuesday, January 06, 2009

If you're the kind of person who likes to vote...

Now is your opportunity!

FriendFeed was nominated for three "Crunchies". Please vote for us in all three categories:

I can't promise that your vote will end the war, fix the economy, or save the environment (that one is here), but I can promise that your vote might be counted.

Sunday, January 04, 2009

Overnight success takes a long time

For some reason, this weekend has seen a lot of talk about what FriendFeed is/isn't/should be doing (see Louis Gray and others). One person even predicted that we will fail.

I considered writing my own list of complaints about FriendFeed. I think and care about it a lot more than most people, so my list of FriendFeed issues would be a lot longer. I may still do that, but there's something else also worth discussing...

One of the benefits of experience is that it gives some degree of perspective. Of course there's a huge risk of overgeneralizing (someone took a picture!), but with that in mind...

We started working on Gmail in August (or September?) 2001. For a long time, almost everyone disliked it. Some people used it anyway because of the search, but they had endless complaints. Quite a few people thought that we should kill the project, or perhaps "reboot" it as an enterprise product with native client software, not this crazy Javascript stuff. Even when we got to the point of launching it on April 1, 2004 (two and a half years after starting work on it), many people inside of Google were predicting doom. The product was too weird, and nobody wants to change email services. I was told that we would never get a million users.

Once we launched, the response was surprisingly positive, except from the people who hated it for a variety of reasons. Nevertheless, it was frequently described as "niche", and "not used by real people outside of silicon valley".

Now, almost 7 and a half years after we started working on Gmail, I see things like this:

Yahoo and Microsoft have more than 250m users each worldwide for their webmail, according to the comScore research firm, compared to close to 100m for Gmail. But Google's younger service, launched in 2004, has been gaining ground in the US over the past year, with users growing by more than 40 per cent, compared to 2 per cent for Yahoo and a 7 per cent fall in users of Microsoft's webmail.

And that probably isn't counting all of the "Apps for your domain" users. I still have a huge list of complaints about Gmail, by the way.

It would be a huge mistake for me to assume that just because Gmail did eventually take off, the same thing will happen to FriendFeed. They are very different products, and maybe we just got lucky with Gmail.

However, it does give some perspective. Creating an important new product generally takes time. FriendFeed needs to continue changing and improving, just as Gmail did six years ago (there are some screenshots around if you don't believe me). FriendFeed shows a lot of promise, but it's still a "work in progress".

My expectation is that big success takes years, and there aren't many counter-examples (other than YouTube, and they haven't actually gotten to the point of making piles of money yet). Facebook grew very fast, but it's almost 5 years old at this point. Larry and Sergey started working on Google in 1996 -- when I started there in 1999, few people had heard of it yet.

This notion of overnight success is very misleading, and rather harmful. If you're starting something new, expect a long journey. That's no excuse to move slow though. To the contrary, you must move very fast, otherwise you will never arrive, because it's a long journey! This is also why it's important to be frugal -- you don't want to starve to death half the way up the mountain.

Getting back to FriendFeed, I'm always concerned when I hear complaints about the service. However, I'm also encouraged by the complaints, because it means that people care about the product. In fact, they care so much that they write long blog posts about what we should do differently. It's clear that our product isn't quite right and needs to evolve, but the fact that people are giving it so much thought tells me that we are at least headed in roughly the right direction. I would be much more concerned if there were silence and nobody cared about what we are doing -- it would mean that we are "off in the weeds", as they say. Getting this kind of valuable feedback is one of the major benefits of launching early.

If you'd like to contribute (and I hope you do), I'd love to read more of your visions of "the perfect FriendFeed". Describe what would make FriendFeed perfect for YOU, and post it on your blog (or email post@posterous.com if you don't have a blog -- they create them automatically). Feel free to drop or change features in any way you like. Yes, technically you're doing my work for me, but it's mutually beneficial because we'll do our best to create a product that you like, and even if we don't, maybe someone else will (since the concepts are out there for everyone).

Saturday, January 03, 2009

The question is wrong

On "Coding Horror", Jeff Atwood asked this question:

Let's say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?

He then argues that our intuition leads us to the "wrong" answer (50%) instead of the "correct" answer (2/3 or 67%).

However, the question does not include enough information to determine which of these answers is actually correct, so the only truly correct answer is, "I don't know" or "it depends". I skimmed through the comments on the post (there are about a million), and didn't see anyone addressing this issue (though someone probably did). They mostly argued about BG vs GB for some reason.

The question is wrong because it doesn't specify the "algorithm" for posing the question.

If we assume that boys and girls are born with equal probability (50/50, like flipping a coin), then families with two children will have two girls 25% of the time, two boys 25% of the time, and a boy and a girl 50% of the time.

If the algorithm for posing the question is:

1. Choose a random parent that has exactly two children
2. If the parent has two boys, eliminate him and choose another random parent
3. Ask about the odds that the parent has both a boy and a girl

Then we can see that step two eliminated the "two boy" possibility, which leaves the 25% probability of two girls and the 50% probability of both a boy and a girl. Of course probabilities should add up to 100%, so the final probabilities are 25/75 (1/3) for two girls and 50/75 (2/3) for both a boy and girl. This is the "correct" answer described by Jeff, and it occurs because of the elimination performed at step two.
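In the spirit of prototyping, the elimination algorithm can be run directly rather than argued about. This is only a sketch: the function name `simulate_elimination`, the trial count, and the seed are my own choices, and it assumes the 50/50 birth model described above.

```python
import random

def simulate_elimination(trials=100_000, seed=0):
    """First algorithm: redraw whenever the chosen parent has two boys."""
    rng = random.Random(seed)
    mixed = 0
    for _ in range(trials):
        # Steps one and two: keep choosing until the family is not boy-boy.
        while True:
            children = (rng.choice("BG"), rng.choice("BG"))
            if children != ("B", "B"):
                break
        # Step three: does this parent have both a boy and a girl?
        if children[0] != children[1]:
            mixed += 1
    return mixed / trials

print(simulate_elimination())  # an estimate close to 2/3
```

The `while True` loop is the whole story here: it is the only place where families get thrown away, and that is what pushes the answer up from 1/2.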

However, if the algorithm for posing the question was instead:

1. Choose a random parent that has exactly two children
2. Arbitrarily announce the gender of one of the children
3. Ask about the odds that the parent has both a boy and a girl

Now we're back to having a 50% probability of there being both a boy and a girl. The difference is that there was no elimination at step two, and simply announcing the gender of one of the children does not affect the gender of the other child or change the probability distribution.
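The announcement algorithm is just as easy to try out. The sketch below assumes that "arbitrarily announce" means picking one of the two children uniformly at random, which is my reading rather than something the original question states. The name `simulate_announcement` and its parameters are likewise made up for illustration.

```python
import random

def simulate_announcement(trials=100_000, seed=0):
    """Second algorithm: pick a family, announce one child's gender at random."""
    rng = random.Random(seed)
    announced_girl = 0
    mixed_and_announced_girl = 0
    for _ in range(trials):
        children = [rng.choice("BG"), rng.choice("BG")]
        announced = rng.choice(children)  # no family is ever discarded
        if announced == "G":
            announced_girl += 1
            if "B" in children:
                mixed_and_announced_girl += 1
    # Among the times we heard "girl", how often was there also a boy?
    return mixed_and_announced_girl / announced_girl

print(simulate_announcement())  # an estimate close to 1/2
```

Note that nothing is redrawn: every family gets an announcement, and we only look at the runs where the announcement happened to be "girl".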

The problem with the question as originally posed was that it didn't specify which of these algorithms was being used. Were we arbitrarily told about the girl, or was a selective process applied?

By the way, if we're applying a selective process, then 100% is also a possibly correct answer, because at step two we could have eliminated all parents that don't have both a boy and a girl. Likewise, all other probabilities are also potentially correct depending on the algorithm applied.
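To see how much the answer depends on the selection rule, here is a small exact calculation where the rule is a parameter. It is a sketch under my own assumptions: `PRIOR`, `mixed_probability`, and the idea of describing a rule as a per-family "keep" probability are illustrative names and framing, not anything from Jeff's post.

```python
from fractions import Fraction

# Prior over two-child families under the 50/50 assumption.
PRIOR = {"BB": Fraction(1, 4), "BG": Fraction(1, 4),
         "GB": Fraction(1, 4), "GG": Fraction(1, 4)}

def mixed_probability(keep):
    """P(boy and girl) when step two keeps each family with probability keep[family].

    Families missing from `keep` are always eliminated.
    """
    weights = {fam: PRIOR[fam] * Fraction(keep.get(fam, 0)) for fam in PRIOR}
    total = sum(weights.values())
    return (weights["BG"] + weights["GB"]) / total

# Eliminate only two-boy families.
print(mixed_probability({"BG": 1, "GB": 1, "GG": 1}))  # 2/3
# Eliminate everyone who doesn't have both a boy and a girl.
print(mixed_probability({"BG": 1, "GB": 1}))  # 1
# Keep mixed families only half the time.
print(mixed_probability({"BG": Fraction(1, 2), "GB": Fraction(1, 2), "GG": 1}))  # 1/2
```

By varying the keep probabilities you can land on any answer between 0 and 1, which is why the question needs to say how the parent was chosen.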

Update: Surprisingly, some people are still thinking that my second algorithm yields 2/3 instead of 1/2 (see the confused discussion on news.yc). I think part of the reason is that I was somewhat imprecise with the concept of "elimination". The second algorithm does not eliminate any of the families, but if I announce that there is a boy, that does eliminate the possibility of two girls. This is where some people are getting lost and thinking that the boy+girl probability has become 2/3. The catch is that announcing the boy also reduced the boy+girl probability by an equal amount, so the result is still the same (it eliminated either BG or GB, I don't know which, but it doesn't matter).
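For anyone who prefers to see the bookkeeping written out, here is the same argument using Bayes' rule. It is a sketch under the assumptions already stated: the four family types (BB, BG, GB, GG) are equally likely, and in a mixed family the announced child is the boy half the time. The notation, with A standing for "a boy is announced", is mine.

```latex
\begin{align*}
P(A) &= P(BB)\cdot 1 + P(BG)\cdot\tfrac12 + P(GB)\cdot\tfrac12 + P(GG)\cdot 0
      = \tfrac14 + \tfrac18 + \tfrac18 = \tfrac12, \\
P(\text{boy and girl} \mid A)
     &= \frac{P(BG)\cdot\tfrac12 + P(GB)\cdot\tfrac12}{P(A)}
      = \frac{\tfrac18 + \tfrac18}{\tfrac12} = \tfrac12 .
\end{align*}
```

The factors of 1/2 on BG and GB are the "equal amount" mentioned above: the announcement removes GG entirely, but it also removes half of the mixed-family weight.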