Saturday, November 28, 2009

So I finally tried Wave...

Last week, TechCrunch published a story about me not yet trying Google Wave ("Gmail Creator Thinks Email Will Last Forever. And Hasn't Tried Google Wave"). This is apparently unacceptable, or as one commenter put it, "Paul may have been trying to be cool and ironic, but really he should be ashamed for not having tried Wave yet." I'm not sure if this is because I have an obligation to try all new products, or because my views on the longevity of email will seem hopelessly naive once I try Wave, but either way, I mustn't disappoint the good people of TechCrunch :)

The Google Wave About page and video do a good job of summarizing what Wave is and how it works. If you want to learn more about Wave, I would start there and skip this post. That said, here are my thoughts on Wave:

First off, Wave is clever and full of interesting ideas.

Second, comparisons to Facebook and Twitter are nonsensical. If Twitter were CNN Headline News, Google Wave would be Microsoft Office. Wave is less of a social network and more of a productivity tool. It's Google Docs meets Gmail, or as Google puts it, "A wave is equal parts conversation and document. People can communicate and work together with richly formatted text, photos, videos, maps, and more."

Third, although Wave is very promising, it's clear that it still needs some refinement. This is why Google calls it a "preview release". The trouble with innovative new ideas is that not all of them are worth keeping. While developing Gmail, we implemented a lot of features that were either not released, or not released until much later. Some of the most interesting ideas (such as automatic email prioritization) never made it out because we couldn't find simple enough interfaces. Other ideas sounded good, but in practice weren't useful enough to justify the added complexity (such as multiple stars). Other features, such as integrated IM, simply needed more time to get right and were added later. Our approach was somewhat minimal: only include features that had proven to be highly useful, such as the conversation view and search. It's my impression that Wave was released at an earlier stage of development -- they included all of the features, and will likely winnow and refine them as Wave approaches a full launch. The Wave approach can be a little confusing, but it allows for greater public feedback and testing.

From what I've seen, the realtime aspects of Wave are both the most intriguing, and the most problematic. I think the root of the issue is that conversations need to be mostly linear, or else they become incomprehensible. IM and chat work because there is a nice, linear back-and-forth among the participants. Wave puts the conversation into little Gmail-like boxes, but then makes them update in realtime. The result is that people end up responding (in realtime) to things on other parts of the page, and the chronological linkage and flow of the conversation is lost. I suspect it would work better if each box behaved more like a little chat room. A single Wave could contain multiple chats (different sub-topics), but each box would be mostly self-contained and could be read in a linear fashion.

So now that I've tried Wave, do I expect it to kill email? No. The reason that nothing is going to kill email anytime soon is quite simple: email is universal (or as close to it as anything on the Internet). Email has all kinds of problems and I often hate it, but the fact is that it mostly works, and there's a huge amount of experience and infrastructure supporting it. The best we can do is to use email less, and tools like Wave and Docs are a big help here.

I don't know what Google has planned for Wave or Gmail, but if I were them I would continue improving Wave, and then once it's ready for the whole world to use, integrate it into Gmail. Moving Wave into Gmail would give it a huge userbase, and partially address the "email is universal" problem. They could use MIME multi-part to send both a non-Wave, HTML version of the message, and the Wave version. Wave-enabled mail readers would display the live Wave, while older mailers would show the static version along with a link to the live Wave.  
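As a rough illustration of the MIME idea, here is a minimal Python sketch using the standard library's email package. The content type "application/x-wave-sketch", the JSON payload, and the function name are all made up for this example; nothing here reflects an actual Wave or Gmail format.

```python
import json
from email.message import EmailMessage

# Hypothetical content type for the "live" part; not a registered MIME type.
WAVE_SUBTYPE = "x-wave-sketch"


def build_wave_email(sender, recipient, subject, static_html, wave_url):
    """Build a multipart/alternative message with a static HTML part
    and a (hypothetical) wave part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject

    # Static fallback for older mailers: the HTML snapshot plus a link
    # to the live wave.
    html = f'{static_html}\n<p><a href="{wave_url}">Open the live wave</a></p>\n'
    msg.set_content(html, subtype="html")

    # In multipart/alternative, later parts are preferred, so the wave
    # part goes last. The payload format is an assumption.
    wave_payload = json.dumps({"wave_url": wave_url}).encode("utf-8")
    msg.add_alternative(wave_payload, maintype="application", subtype=WAVE_SUBTYPE)
    return msg


if __name__ == "__main__":
    message = build_wave_email(
        "a@example.com", "b@example.com", "Lunch plans",
        "<p>Where should we eat?</p>", "https://wave.example.com/w/1",
    )
    print(message.get_content_type())
```

A mail reader that doesn't recognize the last part falls back to the HTML part, which is the behavior described above.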

Friday, November 20, 2009

Open as in water, the fluid necessary for life

"Open" is a great thing. Everyone likes it. Unfortunately, nobody agrees what open is. There are many meanings, but in general, I think "open" must be the opposite of "closed". In the world of abstract things like software, protocols and society, closed is secret, hidden, or locked.

"Closed" limits our mobility, prevents discovery, and discourages new connections. Imagine being in a building where all of the doors are locked or guarded, and it's difficult to move from room to room or leave. A closed world is one where people are forced to stay in their place, sometimes because of physical constraints, but more commonly because they simply don't know where else to go. A closed world is a giant prison.

In an open world, people are able to see more clearly, and more easily explore new ideas and possibilities. An open world is more fluid -- people and ideas easily flow over boundaries and other borders. This openness is what makes the Internet so powerful. The Internet is melting the world, but in a good way.

Open standards and open source software are important for making technology open and available to everyone, but it's important to remember that open goes beyond tech. Wikipedia makes knowledge open to everyone. Blogs and YouTube make broadcasting and mass communication open to everyone -- news and events that would have been suppressed in the past are now reaching the whole world.

These things have been discussed to death, but there's another "open" that still seems a little frivolous: our lives. We like to joke (or complain) about people who share every boring detail of their lives and thoughts on Facebook or Twitter, but they may be doing something important.

Most of our happiness and productivity comes from the everyday details of our lives: the people we live and work with, the books we read, the hikes we take, the parties we attend, etc. But how do we choose these things? How do we know what to do, and how do we know if we'll like it? The obvious answer is that we do and like whatever the TV tells us to do and like. I'm not certain that's the best answer though.

By sharing more of our own thoughts and lives with the world, we contribute to the global pool of "how to live", and over time we also get contributions back from the world. Think of it as "open source living". This has certainly been my experience with my blog and FriendFeed. Not only do people occasionally say that it has helped them, but I've also met interesting new people and gotten a lot of good leads on new ideas. These are typically small things, but our lives are woven from the small details of everyday living. For example, I saw a good TED talk on "The science of motivation", shared it on FriendFeed, and in the comments Laura Norvig suggested a book called Unconditional Parenting, which turns out to be very good.

The next step is for people to open more of their current activities and plans. This is often referred to as "real-time", but since real-time is also a technical term, we often focus too much on the technical aspect of it. The "real-time" that matters is the human part -- what I'm doing and thinking right now, and my ability to communicate that to the world, right now. We see some of this on Facebook, FriendFeed, and Twitter, and also location-aware apps such as Foursquare, but it's still fairly primitive and fringe. When this activity reaches critical mass, it should be very interesting for society. It dramatically alters the time and growth coefficients in group formation. It enables a much higher degree of serendipity and ad hoc socializing.

The basic pattern of openness is that better access to information and better systems lead to better decisions and better living. This general principle is broadly accepted, but we're just now discovering that it also applies to the minutiae of our lives.

Sharing your boring thoughts and activities may seem narcissistic and self-absorbed at first (I'm still kind of embarrassed about having a blog), but there is virtue and benefit in it. Naturally there will be challenges and fear along the way, but in the long term we're contributing to a more open, fluid society, where people are more able to find happy, productive lives. It also encourages us to be more accepting of others. Everyone is flawed, and the more we see that we aren't alone, the less we need to fear that truth.

People cannot truly live and thrive in a prison -- we require freedom and mobility. This may explain my incomprehensible analogy, "Open as in water, the fluid necessary for life".

Go forth and share.

Tuesday, October 13, 2009

Applied Philosophy, a.k.a. "Hacking"

Every system has two sets of rules: The rules as they are intended or commonly perceived, and the actual rules ("reality"). In most complex systems, the gap between these two sets of rules is huge.

Sometimes we catch a glimpse of the truth, and discover the actual rules of a system. Once the actual rules are known, it may be possible to perform "miracles" -- things which violate the perceived rules.

Hacking is most commonly associated with computers, and people who break into or otherwise subvert computer systems are often called hackers. Although this terminology is occasionally disputed, I think it is essentially correct -- these hackers are discovering the actual rules of the computer systems (e.g. buffer overflows), and using them to circumvent the intended rules of the system (typically access controls). The same is true of the hackers who break DRM or other systems of control.

Writing clever (or sometimes ugly) code is also described as hacking. In this case the hacker is violating the rules of how we expect software to be written. If there's a project that should take months to write, and someone manages to hack it out in a single evening, that's a small miracle, and a major hack. If the result is simple and beautiful because the hacker discovered a better solution, we may describe the hack as "elegant" or "brilliant". If the result is complex and hard to understand (perhaps it violates many layers of abstraction), then we will call it an "ugly hack". Ugly hacks aren't all bad though -- one of my favorite personal hacks was some messy code that demonstrated what would become AdSense (story here), and although the code was quickly discarded, it did its job.

Hacking isn't limited to computers though. Wherever there are systems, there is the potential for hacking, and there are systems everywhere. Our entire reality is systems of systems, all the way down. This includes human relations (see The Game for a very amusing story of people hacking human attraction), health (Seth Roberts has some interesting ideas), sports (Tim Ferriss claims to have hacked the National Chinese Kickboxing championship), and finance ("too big to fail").

We're often told that there are no shortcuts to success -- that it's all a matter of hard work and doing what we're told. The hacking mindset takes the opposite approach: There are always shortcuts and loopholes. For this reason, hacking is sometimes perceived as cheating, or unfair, and it can be. Using social hacks to steal billions of dollars is wrong (see Madoff). On the other hand, automation seems like a great hack -- getting machines to do our work enabled a much higher standard of living, though as always, not everyone sees it that way (the Luddites weren't big fans).

Important new businesses are usually some kind of hack. The established businesses think they understand the system and have set up rules to guard their profits and prevent real competition. New businesses must find a gap in the rules -- something that the established powers either don't see, or don't perceive as important. That was certainly the case with Google: the existing search engines (which thought of themselves as portals) believed that search quality wasn't very important (regular people can't tell the difference), and that search wasn't very valuable anyway, since it sends people away from your site. Google's success came in large part from recognizing that others were wrong on both points.

In fact, the entire process of building a business and having other people and computers do the work for you is a big hack. Nobody ever created a billion dollars through direct physical labor -- it requires some major shortcuts to create that much wealth, and by definition those shortcuts were mostly invisible to others (though many will dispute it after the fact). Startup investing takes this hack to the next level by having other people do the work of building the business, though finding the right people and businesses is not easy.

Not everyone has the hacker mindset (society requires a variety of personalities), but wherever and whenever there were people, there was someone staring into the system, searching for the truth. Some of those people were content to simply find a truth, but others used their discoveries to hack the system, to transform the world. These are the people that created the governments, businesses, religions, and other machines that operate our society, and they necessarily did it by hacking the prior systems. (consider the challenge of establishing a successful new government or religion -- the incumbents won't give up easily)

To discover great hacks, we must always be searching for the true nature of our reality, while acknowledging that we do not currently possess the truth, and never will. Hacking is much bigger and more important than clever bits of code in a computer -- it's how we create the future.

Or at least that's how I see it. Maybe I'll change my mind later.

See also: "The Knack" (and the need to disassemble things)

Sunday, September 13, 2009

Left brain, Right brain, and the other half of the story

In my head, this post and yesterday's post on risk and opportunity are deeply connected, but logically they needed to be split apart.

The theory of the left-brain / right-brain split is that the left hemisphere of our brain handles linear, logical processing (cold logic) while the right hemisphere is more emotional, intuitive, and holistic (evaluating the whole picture instead of considering things one component at a time). Naturally, some people are more left-brain dominant while others are more right-brain dominant. This divide is discussed quite a bit elsewhere -- I recommend starting with the TED talk by Jill Bolte Taylor, a neuroanatomist whose left hemisphere was damaged by a stroke, causing her to become right-brain dominant.

I'm actually somewhat skeptical that the left-brain / right-brain split is as real as people assume. However, it seems to be metaphorically correct, so for my non-surgical purposes, it's "good enough".

To me, one of the most interesting aspects of this right/left divide is that many people seem to identify strongly with one side or the other, and actually despise the other half of their brain (see here for a few examples, and even Jill Taylor seems to be doing it to some extent). This seems kind of dumb. My theory is that both halves of our brain are useful, and that for maximum benefit and happiness, we should learn how to use each half to its maximum potential.

This is where I link in to yesterday's post on Risk and Opportunity. My suggestion was to simultaneously seek big, exciting opportunities ("dream big"), while carefully avoiding unacceptable risks ("don't be stupid"). In my mind, that is the right/left divide.

The left-brain ability to carefully double-check logic and evaluate the risks is very important because it helps to protect us from bad decisions. When we imagine the kind of person who believes things that are obviously false, falls for scams, ends up joining a cult, etc, we probably picture a stereotypically right-brain person.

However, what the left brain has in cold, efficient logic, it lacks in passion and grandiosity.

When I wrote about evaluating risks and opportunities, it was as though we use a logical process when we make decisions, but of course that's not actually true, nor should it be. Our actual decision making is much more emotional (and emotions are just another mental process).

The right-brain utility is in integrating millions of facts (more than the left brain can logically combine) and producing a unified output. However, that output is in the form of an intuition, "gut feeling", or just plain excitement, which can sometimes be difficult to communicate or justify ("it seems like a good idea" isn't always convincing). Nevertheless, these intuitions are crucial for making big conceptual leaps, and ultimately providing direction and meaning in our lives.

So to reformulate yesterday's advice, I think we do best when using our right-brain skills to discover opportunity and excitement, while also engaging our left-brain abilities to avoid disasters, find tactical advantages, and rationalize our actions to the world. Left and Right are both stuck in the same skull, but not by accident -- they actually need each other. (the same could probably be said for politics, but that would be another post)

Coincidentally, I just saw another good TED talk that mentions these right-brain/left-brain issues in the context of managing and incentivizing creative people. It's worth watching.

Saturday, September 12, 2009

Evaluating risk and opportunity (as a human)

Our lives are full of decisions that force us to balance risk and opportunity: should you take that new job, buy that house, invest in that company, swallow that pill, jump off that cliff, etc. How do we decide which risks are smart, and which are dumb? Once we've made our choices, are we willing to accept the consequences?

I think the most common technique is to ask ourselves, "What is the most likely outcome?", and if that outcome is good, then we do it (to the extent that people actually reason through decisions at all). That works well enough for many decisions -- for example, you might believe that the most likely outcome of going to school is that you can get a better job later on, and therefore choose that path. That's the reasoning most people use when going to school, getting a job, buying a house, or making most other "normal" decisions. Since it focuses on the "expected" outcome, people using it often ignore the possible bad outcomes, and when something bad does happen, they may feel bitter or cheated ("I have a degree, now where's my job!?"). For example, most people buying houses a couple of years ago weren't considering the possibility that their new house would lose 20% of its value, and that they would end up owing more than the house was worth.

When advising on startups, I often tell people that they should start with the assumption that the startup will fail and all of their equity will become worthless. Many people have a hard time accepting that fact, and say that they would be unable to stay motivated if they believed such a thing. It seems unfortunate that these people feel the need to lie to themselves in order to stay motivated, but recently I realized that I'm just using a different method of evaluating risks and opportunities.

Instead of asking, "What's the most likely outcome?", I like to ask "What's the worst that could happen?" and "Could it be awesome?". Essentially, instead of evaluating the median outcome, I like to look at the 0.01 percentile and 95th percentile outcomes. In the case of a startup, the worst case outcome is generally that you will lose your entire investment (but learn a lot), and the best case is that you make a large pile of money, create something cool, and learn a lot. (see "Why I'd rather be wrong" for more on this)

Thinking about the best-case outcomes is easy and people do it a lot, which is part of the reason it's often disrespected ("dreamer" isn't usually a compliment). However, too many people ignore the worst case scenario because thinking about bad things is uncomfortable. This is a mistake. This is why we see people killing themselves over investment losses (part of the reason, anyway). They were not planning for the worst case. Thinking about the worst case not only protects us from making dumb mistakes, it also provides an emotional buffer. If I'm comfortable with the worst-case outcome, then I can move without fear and focus my attention on the opportunity.

Considering only the best and worst case outcomes is not perfect of course -- lottery tickets have an acceptable worst case (you lose $1) and a great best case (you win millions), yet they are generally a bad deal. Ideally we would also consider the "expected value" of our decisions, but in practice that's impossible for most real decisions because the world is too complicated and math is hard. If the expected value is available (as it is for lottery tickets), then use it (and don't buy lottery tickets), but otherwise we need some heuristics. Here are some of mine:

  • Will I learn a lot from the experience? (failure can be very educational)
  • Will it make my life more interesting? (a predictable life is a boring life)
  • Is it good for the world? (even if I don't benefit, maybe someone else will)
These things all raise the expected value (in my mind at least), so if they are mostly true, and I'm excited about the best-case outcome, and I'm comfortable with the worst-case outcome, then it's probably a good gamble. (note: I should also point out that when considering the worst-case scenario, it's important to also think about the impact on others. For example, even if you're ok with dying, that outcome may cause unacceptable harm to other people in your life.)
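For the lottery case, where the expected value really is available, the calculation is short. This is only a sketch: the jackpot size and the odds below are invented for illustration, and a real lottery has smaller prizes, taxes, and shared jackpots that this ignores.

```python
def expected_value(outcomes):
    """Sum of probability * net payoff over all outcomes.

    `outcomes` is a list of (probability, net_payoff) pairs whose
    probabilities should add up to 1.
    """
    return sum(probability * payoff for probability, payoff in outcomes)


# Hypothetical lottery: all three numbers are assumptions, not real data.
ticket_price = 1.00
jackpot = 5_000_000.00
p_win = 1 / 14_000_000

ev = expected_value([
    (p_win, jackpot - ticket_price),  # best case: win the jackpot
    (1 - p_win, -ticket_price),       # worst (and typical) case: lose $1
])

print(f"expected value per ticket: ${ev:.2f}")
```

With these made-up numbers the result is negative: the worst case is tolerable and the best case is great, but on average each ticket loses money.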

I've been told that I'm extremely cynical. I've also been told that I'm unreasonably optimistic. Upon reflection, I think I'm ok with being a cynical optimist :)

By the way, here's why I chose the 0.01 percentile outcome when evaluating the worst case: Last year there were 37,261 motor vehicle fatalities in the United States. The population of the United States is 304,059,724, so my odds of getting killed in a car accident are very roughly 1/10,000 per year (of course many of those people were teenagers and alcoholics, so my odds are probably a little better than that, but as a rough estimate it's good). Using this logic, I can largely ignore obscure 1/1,000,000 risks, which are too numerous and difficult to protect against anyway.
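The arithmetic behind that estimate, as a small Python sketch using only the two figures quoted above (the variable names are mine):

```python
# Figures quoted in the post.
fatalities = 37_261
population = 304_059_724

# Crude per-person annual risk: assumes everyone is equally exposed.
annual_risk = fatalities / population
one_in = population / fatalities

print(f"annual risk: {annual_risk:.6f} (about 1 in {one_in:,.0f})")
print(f"as a percentage: {annual_risk * 100:.4f}%")
```

The raw ratio comes out somewhat worse than 1 in 10,000, which is why "very roughly" is doing some work in the estimate.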

Also see The other half of the story

Thursday, June 25, 2009

Collaborative Charity

Donating money to worthwhile causes seems like a good idea, but doing it right requires knowledge, wisdom, intelligence, time, and of course money. I have at most two of those things. The traditional solution is to rely on "experts", but that has its own problems.

One of the great things about the Internet (other than the obvious) is that it enables people to collaborate in new ways, and each contribute little bits of their time and knowledge. Wikipedia is probably the best example of this, but I think it's possible to do much more. I'm not quite sure how to make this work, but I expect that in 10 years we will have much smarter "collective" systems that leverage small bits of time, knowledge, etc from large groups.

This is my first experiment in solving this problem. Actually, in some ways it's my second experiment -- a few months ago I posed a question about the "best use of money", and although it was only meant as a thought experiment, people also provided a lot of specific suggestions. That was rather encouraging.

Now I'm trying it for real -- I have a lot of ideas, but not much time, so I'm starting with the simplest solution that I could find. It may not work, but it should be interesting.

Here's how it works: I'm going to donate a bunch of money, but I want random people on the Internet to decide where it goes.

Here are the rules:

  • The money MUST go to an IRS recognized public charity. No exceptions.
  • Don't contact me. I already don't read the email I have -- I don't need more.
  • I've created a topic on Google Moderator where people can submit and vote on ideas. I've never used Google Moderator, but someone told me that it's good, so hopefully it works :)
  • Ultimately, this is just a recommendation and I may completely ignore the results if they are stupid, so don't bother spamming.
  • I also created a group on FriendFeed where people can submit links and discuss ideas.
  • I'd like to see broad support (from real people, not spam accounts) along with some evidence that it's a good idea, and perhaps endorsements from knowledgeable people.
In terms of which causes I'd like to support, I'd consider anything, but am probably most sympathetic to health, freedom, and education. In terms of solutions, I'm very skeptical of centralization, one-size-fits-all solutions, and people who are certain of the answer. I also prefer to support things that have tangible, objective outcomes (where you could say, "this money was used to purchase X" or "this money was used to fund study Y, which will be published this fall").


Friday, April 17, 2009

Make your site faster and cheaper to operate in one easy step

Is your web server using gzip encoding? Surprisingly, many are not. I just wrote a little script to fetch the 30 external links off news.yc and check if they are using gzip encoding. Only 18 were, which means that the other 12 sites are needlessly slow, and also wasting money on bandwidth.

Check your site here.
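If you'd rather check from the command line, here is a minimal sketch of that kind of test in Python. It is not the script mentioned above; the function names and the User-Agent string are made up. It sends a request that advertises gzip support and looks at the Content-Encoding header of the response.

```python
import sys
import urllib.request


def is_gzip_encoded(content_encoding):
    """Return True if a Content-Encoding header value lists gzip."""
    if not content_encoding:
        return False
    codings = [coding.strip().lower() for coding in content_encoding.split(",")]
    return "gzip" in codings or "x-gzip" in codings


def site_uses_gzip(url, timeout=10):
    """Fetch `url` while advertising gzip support and report whether the
    server compressed the response."""
    request = urllib.request.Request(
        url,
        headers={"Accept-Encoding": "gzip", "User-Agent": "gzip-check-sketch"},
    )
    # urllib does not decompress automatically, so the header is passed
    # through unchanged.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_gzip_encoded(response.headers.get("Content-Encoding"))


if __name__ == "__main__":
    # Usage: python check_gzip.py http://example.com/ ...
    for url in sys.argv[1:]:
        print(url, "gzip" if site_uses_gzip(url) else "NOT gzip")
```

Some servers only compress certain content types or certain clients, so one request is a hint, not proof.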

Some people think gzip is "too slow". It's not. Here's an example (run on my laptop) using data from one of the links on news.ycombinator.com:
$ cat < /tmp/sd.html | wc -c
146117
$ gzip < /tmp/sd.html | wc -c
35481
$ time gzip < /tmp/sd.html >/dev/null
real    0m0.009s
user    0m0.004s
sys     0m0.004s

It took 9ms to compress 146,117 bytes of html (and that includes process creation time, etc), and the compressed data was only about 24% the size of the input. At that rate, compressing 1GB of data would require about 66 seconds of cpu time. Repeating the test with a much larger file yields about 42 sec/GB, so 66 sec is not an unreasonable estimate.
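To repeat this kind of measurement without the shell, here is a rough Python sketch. The sample data is synthetic and highly repetitive, so it will compress far better than a real page; swap in your own HTML to get a meaningful number. Level 6 is used because that is the gzip command's default.

```python
import gzip
import time


def measure_gzip(data, level=6):
    """Compress `data` once and return (ratio, elapsed_seconds, seconds_per_gb)."""
    start = time.perf_counter()
    compressed = gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    ratio = len(compressed) / len(data)
    # Linear extrapolation to 1GB (2**30 bytes), as in the estimate above.
    seconds_per_gb = elapsed * (2**30 / len(data))
    return ratio, elapsed, seconds_per_gb


# Synthetic stand-in for a page of HTML (an assumption, not real site data).
sample = b"<div class='item'><a href='/story?id=12345'>A headline</a></div>\n" * 2000

ratio, elapsed, seconds_per_gb = measure_gzip(sample)
print(f"input: {len(sample)} bytes, compressed to {ratio:.1%} of original")
print(f"elapsed: {elapsed * 1000:.2f} ms, roughly {seconds_per_gb:.0f} sec/GB")
```

A single timing like this is noisy; averaging several runs gives a steadier estimate.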

Inevitably, someone will argue that they can't spare a few ms per page to compress the data, even though it will make their site much more responsive. However, it occurred to me today that thanks to Amazon, it's very easy to compare CPU vs Bandwidth. According to their pricing page, a "small" (single core) instance costs $0.10 / hour, and data transfer out costs $0.17 / GB (though it goes down to $0.10 / GB if you use over 150 TB / month, which you probably don't).

Using these numbers, we can estimate that it would cost $1.88 to gzip 1TB of data on Amazon EC2, and $174 to transfer 1TB of data. If you instead compress your data (and get 4-to-1 compression, which is not unusual for html), the bandwidth will only cost $43.52.

Summary:
with gzip: $1.88 for cpu + $43.52 for bandwidth = $45.40 + happier users

without gzip: $174.00 for bandwidth = $128.60 wasted + less happy users
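The numbers in the summary can be reproduced with a few lines of Python. This sketch uses the prices and the 66 sec/GB figure quoted above, and assumes 1TB = 1024GB, which is what the quoted totals imply.

```python
# Inputs taken from the post.
cpu_seconds_per_gb = 66          # estimated gzip cpu time per GB
instance_cost_per_hour = 0.10    # "small" EC2 instance
transfer_cost_per_gb = 0.17      # data transfer out
compression_ratio = 4            # 4-to-1 compression
gb_per_tb = 1024                 # assumption implied by the quoted totals

cpu_hours = cpu_seconds_per_gb * gb_per_tb / 3600
cpu_cost = cpu_hours * instance_cost_per_hour

bandwidth_uncompressed = gb_per_tb * transfer_cost_per_gb
bandwidth_compressed = bandwidth_uncompressed / compression_ratio

total_with_gzip = cpu_cost + bandwidth_compressed
savings = bandwidth_uncompressed - total_with_gzip

print(f"cpu cost to gzip 1TB:      ${cpu_cost:.2f}")
print(f"bandwidth without gzip:    ${bandwidth_uncompressed:.2f}")
print(f"bandwidth with gzip:       ${bandwidth_compressed:.2f}")
print(f"total with gzip:           ${total_with_gzip:.2f}")
print(f"saved per TB:              ${savings:.2f}")
```

The unrounded savings differ from the summary by a few cents because the post rounds the uncompressed bandwidth cost down to $174 before subtracting.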

The other excuse for not gzipping content is that your webserver doesn't support it for some reason. Fortunately, there's a simple solution: put nginx in front of your servers. That's what we do at FriendFeed, and it works very well (we use a custom, epoll-based python server). Nginx acts as a proxy -- outside requests connect to nginx, and nginx connects to whatever webserver you are already using (and along the way it will compress your response, and do other good stuff).

Thursday, January 22, 2009

Communicating with code

Some people can sell their ideas with a brilliant speech or a slick powerpoint presentation.

I can't.

Maybe that's why I'm skeptical of ideas that are sold via brilliant speeches and slick powerpoints. Or maybe it's because it's too easy to overlook the messy details, or to get caught up in details that seem very important, but aren't. I also get very bored by endless debate.

We did a lot of things wrong during the 2.5 years of pre-launch Gmail development, but one thing we did very right was to always have live code. The first version of Gmail was literally written in a day. It wasn't very impressive -- all I did was take the Google Groups (Usenet search) code (my previous project) and stuff my email into it -- but it was live and people could use it (to search my mail...). From that day until launch, every new feature went live immediately, and most new ideas were implemented as soon as possible. This resulted in a lot of churn -- we re-wrote the frontend about six times and the backend three times by launch -- but it meant that we had direct experience with all of the features. A lot of features seemed like great ideas, until we tried them. Other things seemed like they would be big problems or very confusing, but once they were in we forgot all about the theoretical problems.

The great thing about this process was that I didn't need to sell anyone on my ideas. I would just write the code, release the feature, and watch the response. Usually, everyone (including me) would end up hating whatever it was (especially my ideas), but we always learned something from the experience, and we were able to quickly move on to other ideas.

The most dramatic example of this process was the creation of content targeted ads (now known as "AdSense", or maybe "AdSense for Content"). The idea of targeting our keyword based ads to arbitrary content on the web had been floating around the company for a long time -- it was "obvious". However, it was also "obviously bad". Most people believed that it would require some kind of fancy artificial intelligence to understand the content well enough to target ads, and even if we had that, nobody would click on the ads. I thought they were probably right.

However, we needed a way for Gmail to make money, and Sanjeev Singh kept talking about using relevant ads, even though it was obviously a "bad idea". I remained skeptical, but thought that it might be a fun experiment, so I connected to the ads database (I assure you, random engineers can no longer do this!), copied out all of the ads+keywords, and did a little bit of sorting and filtering with some unix shell commands. I then hacked up the "adult content" classifier that Matt Cutts and I had written for safe-search, linked that into the Gmail prototype, and then loaded the ads data into the classifier. My change to the classifier (which completely broke its original functionality, but this was a separate code branch) changed it from classifying pages as "adult", to classifying them according to which ad was most relevant. The resulting ad was then displayed in a little box on our Gmail prototype ui. The code was rather ugly and hackish, but more importantly, it only took a few hours to write!
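To give a feel for how small such a prototype can be, here is a toy Python sketch that picks the "most relevant" ad for a message by counting keyword overlap. This is not the classifier described above and has nothing to do with how the real system works; the scoring rule, the function names, and the ad data are all invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def most_relevant_ad(message, ads):
    """Return the ad whose keywords overlap most with the message,
    or None if no keyword matches.

    `ads` maps ad text to a list of keywords (made-up data below).
    """
    words = tokenize(message)
    best_ad, best_score = None, 0
    for ad_text, keywords in ads.items():
        score = len(words & {keyword.lower() for keyword in keywords})
        if score > best_score:
            best_ad, best_score = ad_text, score
    return best_ad


# Hypothetical ads and keywords.
ADS = {
    "New designer sunglasses": ["sunglasses", "shades"],
    "Aged balsamic vinegar": ["balsamic", "vinegar", "salad"],
}

if __name__ == "__main__":
    print(most_relevant_ad("Has anyone seen my lost sunglasses?", ADS))
```

Something this crude is obviously not a product, but it is enough to put an ad in a box next to real mail and see how people react.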

I then released the feature on our unsuspecting userbase of about 100 Googlers, and then went home and went to sleep. The response when I returned the next day was not what I would classify as "positive". Someone may have used the word "blasphemous". I liked the ads though -- they were amusing and often relevant. An email from someone looking for their lost sunglasses got an ad for new sunglasses. The lunch menu had an ad for balsamic vinegar.

More importantly, I wasn't the only one who found the ads surprisingly relevant. Suddenly, content targeted ads switched from being a lowest-priority project (unstaffed, will not do) to being a top priority project, an extremely talented team was formed to build the project, and within maybe six months a live beta was launched. Google's content targeted ads are now a big business with billions of dollars in revenue (I think).

Of course none of the code from my prototype ever made it near the real product (thankfully), but that code did something that fancy arguments couldn't do (at least not my fancy arguments): it showed that the idea and product had real potential.

The point of this story, I think, is that you should consider spending less time talking, and more time prototyping, especially if you're not very good at talking or powerpoint. Your code can be a very persuasive argument.

The other point is that it's important to make prototyping new ideas, especially bad ideas, as fast and easy as possible. This can be especially difficult as a product grows. It was easy for me to stuff random broken features into Gmail when there were only about 100 users and they all worked for Google, but it's not so simple when there are 100 million users.

Fortunately for Gmail, they've recently found a rather clever solution that enables the thousands of Google engineers to add new ui features: Gmail Labs. This is also where Google's "20% time" comes in -- if you want innovation, it's critical that people are able to work on ideas that are unapproved and generally thought to be stupid. The real value of "20%" is not the time, but rather the "license" it gives to work on things that "aren't important". (perhaps I should do a post on "20% time" at some point...)

One of the best ways to enable prototyping and innovation on an established product is through an API. Twitter is possibly the best example of how well this can work. There are thousands of different Twitter clients, with new ones being written every day, and I believe a majority of Twitter messages are entered through one of these third-party clients.

Public APIs enable everyone to experiment with new ideas and create new ways of using your product. This is incredibly powerful because no matter how brilliant you and your coworkers are, there are always going to be smarter people outside of your company.

At FriendFeed, we discovered that our API does more than enable great apps, it also reveals great app developers. Gary and Ben were both writing FriendFeed apps using our API before we hired them. When hiring, you don't have to guess which people are "smart and gets things done", you can simply observe it in the wild :)


In my previous post, I asked people to describe their "ideal FriendFeed". Since then, I've been thinking about ideas for my "ideal FriendFeed". Unfortunately, it's very difficult for me to know how much I like an idea based only on words or mockups -- I really need to try it out. So in the spirit of prototyping, I've used my spare time to write a simple FriendFeed interface that prototypes some of the things I've been thinking about. This interface isn't the "future of FriendFeed", it's just a collection of ideas, some that I like, and some that I don't. One thing that's kind of cool about it (from a prototyping perspective) is that it's written entirely in Javascript running in the web browser -- it's just a single web page that uses FriendFeed's JSON APIs to fetch data. This also means that it's relatively easy for other people to copy and change -- you don't even need a server!



If you'd like to try it out, you can see everyone that I'm subscribed to (assuming their feed is public), or if you are a FriendFeed user, you can see all of your public subscriptions by going to http://paulbuchheit.github.com/xfeed.html#YOUR_NICKNAME_GOES_HERE. The complete source code (which is just several hundred lines of HTML and JS) is here. In this prototype, I'm experimenting with treating entries, comments, and likes all as simple "messages", only showing comments from the user's friends (which can be a little confusing), and putting it all in reverse-chronological order. As I mentioned, this interface isn't the "future of FriendFeed", it's just a collection of ideas that I'm playing with.
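The "everything is a message" idea can be sketched in a few lines. The prototype itself is JavaScript; this is a Python sketch of the same transformation, and the field names (user, date, title, body, comments, likes) are assumptions made for illustration, not FriendFeed's actual JSON format.

```python
from datetime import datetime


def to_messages(entries, friends):
    """Flatten entries, comments, and likes into one newest-first list."""
    messages = []
    for entry in entries:
        messages.append({"kind": "entry", "user": entry["user"],
                         "date": entry["date"], "text": entry["title"]})
        for comment in entry.get("comments", []):
            # Only keep comments written by the viewer's friends.
            if comment["user"] in friends:
                messages.append({"kind": "comment", "user": comment["user"],
                                 "date": comment["date"], "text": comment["body"]})
        for like in entry.get("likes", []):
            messages.append({"kind": "like", "user": like["user"],
                             "date": like["date"],
                             "text": "liked: " + entry["title"]})
    # Reverse-chronological order: newest message first.
    messages.sort(key=lambda m: m["date"], reverse=True)
    return messages


# Hypothetical sample data (users and timestamps are made up).
SAMPLE_ENTRIES = [{
    "user": "paul", "date": datetime(2009, 1, 10, 9, 0), "title": "xfeed prototype",
    "comments": [
        {"user": "gary", "date": datetime(2009, 1, 10, 9, 30), "body": "Nice!"},
        {"user": "stranger", "date": datetime(2009, 1, 10, 9, 45), "body": "First!"},
    ],
    "likes": [{"user": "ben", "date": datetime(2009, 1, 10, 10, 0)}],
}]
FRIENDS = {"gary", "ben"}

for message in to_messages(SAMPLE_ENTRIES, FRIENDS):
    print(message["date"], message["kind"], message["user"], message["text"])
```

Note that a comment from a non-friend simply disappears from the stream, which is exactly the part that can be a little confusing.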

If you're interested in prototyping something, feel free to take this code and have your way with it. As always, I'd love to see your prototypes in action!



Tuesday, January 06, 2009

If you're the kind of person who likes to vote...

Now is your opportunity!

FriendFeed was nominated for three "Crunchies". Please vote for us in all three categories:




I can't promise that your vote will end the war, fix the economy, or save the environment (that one is here), but I can promise that your vote might be counted.

Sunday, January 04, 2009

Overnight success takes a long time

For some reason, this weekend has seen a lot of talk about what FriendFeed is/isn't/should be doing (see Louis Gray and others). One person even predicted that we will fail.

I considered writing my own list of complaints about FriendFeed. I think and care about it a lot more than most people, so my list of FriendFeed issues would be a lot longer. I may still do that, but there's something else also worth discussing...

One of the benefits of experience is that it gives some degree of perspective. Of course there's a huge risk of overgeneralizing (someone took a picture!), but with that in mind...

We started working on Gmail in August (or September?) 2001. For a long time, almost everyone disliked it. Some people used it anyway because of the search, but they had endless complaints. Quite a few people thought that we should kill the project, or perhaps "reboot" it as an enterprise product with native client software, not this crazy Javascript stuff. Even when we got to the point of launching it on April 1, 2004 (two and a half years after starting work on it), many people inside of Google were predicting doom. The product was too weird, and nobody wants to change email services. I was told that we would never get a million users.

Once we launched, the response was surprisingly positive, except from the people who hated it for a variety of reasons. Nevertheless, it was frequently described as "niche", and "not used by real people outside of silicon valley".

Now, almost 7 and a half years after we started working on Gmail, I see things like this:

Yahoo and Microsoft have more than 250m users each worldwide for their webmail, according to the comScore research firm, compared to close to 100m for Gmail. But Google's younger service, launched in 2004, has been gaining ground in the US over the past year, with users growing by more than 40 per cent, compared to 2 per cent for Yahoo and a 7 per cent fall in users of Microsoft's webmail.

And that probably isn't counting all of the "Apps for your domain" users. I still have a huge list of complaints about Gmail, by the way.

It would be a huge mistake for me to assume that just because Gmail did eventually take off, then the same thing will happen to FriendFeed. They are very different products, and maybe we just got lucky with Gmail.

However, it does give some perspective. Creating an important new product generally takes time. FriendFeed needs to continue changing and improving, just as Gmail did six years ago (there are some screenshots around if you don't believe me). FriendFeed shows a lot of promise, but it's still a "work in progress".

My expectation is that big success takes years, and there aren't many counter-examples (other than YouTube, and they didn't actually get to the point of making piles of money just yet). Facebook grew very fast, but it's almost 5 years old at this point. Larry and Sergey started working on Google in 1996 -- when I started there in 1999, few people had heard of it yet.

This notion of overnight success is very misleading, and rather harmful. If you're starting something new, expect a long journey. That's no excuse to move slow though. To the contrary, you must move very fast, otherwise you will never arrive, because it's a long journey! This is also why it's important to be frugal -- you don't want to starve to death half the way up the mountain.

Getting back to FriendFeed, I'm always concerned when I hear complaints about the service. However, I'm also encouraged by the complaints, because it means that people care about the product. In fact, they care so much that they write long blog posts about what we should do differently. It's clear that our product isn't quite right and needs to evolve, but the fact that people are giving it so much thought tells me that we are at least headed in roughly the right direction. I would be much more concerned if there were silence and nobody cared about what we are doing -- it would mean that we are "off in the weeds", as they say. Getting this kind of valuable feedback is one of the major benefits of launching early.

If you'd like to contribute (and I hope you do), I'd love to read more of your visions of "the perfect FriendFeed". Describe what would make FriendFeed perfect for YOU, and post it on your blog (or email post@posterous.com if you don't have a blog -- they create them automatically). Feel free to drop or change features in any way you like. Yes, technically you're doing my work for me, but it's mutually beneficial because we'll do our best to create a product that you like, and even if we don't, maybe someone else will (since the concepts are out there for everyone).

Saturday, January 03, 2009

The question is wrong

On "Coding Horror", Jeff Atwood asked this question:

Let's say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?

He then argues that our intuition leads us to the "wrong" answer (50%) instead of the "correct" answer (2/3 or 67%).

However, the question does not include enough information to determine which of these answers is actually correct, so the only truly correct answer is, "I don't know" or "it depends". I skimmed through the comments on the post (there are about a million), and didn't see anyone addressing this issue (though someone probably did). They mostly argued about BG vs GB for some reason.

The question is wrong because it doesn't specify the "algorithm" for posing the question.

If we assume that boys and girls are born with equal probability (50/50, like flipping a coin), then families with two children will have two girls 25% of the time, two boys 25% of the time, and a boy and a girl 50% of the time.

If the algorithm for posing the question is:
  1. Choose a random parent that has exactly two children
  2. If the parent has two boys, eliminate him and choose another random parent
  3. Ask about the odds that the parent has both a boy and a girl
Then we can see that step two eliminated the "two boy" possibility, which leaves the 25% probability of two girls and the 50% probability of both a boy and a girl. Of course probabilities should add up to 100%, so the final probabilities are 25/75 (1/3) for two girls and 50/75 (2/3) for both a boy and girl. This is the "correct" answer described by Jeff, and it occurs because of the elimination performed at step two.

However, if the algorithm for posing the question was instead:
  1. Choose a random parent that has exactly two children
  2. Arbitrarily announce the gender of one of the children
  3. Ask about the odds that the parent has both a boy and a girl
Now we're back to having a 50% probability of there being both a boy and a girl. The difference is that there was no elimination at step two, and simply announcing the gender of one of the children does not affect the gender of the other child or change the probability distribution.
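If you don't trust the arithmetic, the two algorithms are easy to simulate. The sketch below follows the steps literally, assuming (as above) that boys and girls are equally likely; the function names and the number of trials are my own choices for illustration.

```python
import random


def random_family(rng):
    """Two children, each a boy ('B') or a girl ('G') with equal probability."""
    return [rng.choice("BG"), rng.choice("BG")]


def simulate_elimination(trials, rng):
    """First algorithm: skip parents with two boys, then check for a boy and a girl."""
    kept = mixed = 0
    for _ in range(trials):
        family = random_family(rng)
        if family == ["B", "B"]:
            continue  # step 2: eliminate this parent and choose another
        kept += 1
        mixed += family[0] != family[1]
    return mixed / kept


def simulate_announcement(trials, rng):
    """Second algorithm: announce one child's gender at random; nobody is eliminated."""
    mixed = 0
    for _ in range(trials):
        family = random_family(rng)
        _announced = rng.choice(family)  # step 2: does not change the family
        mixed += family[0] != family[1]
    return mixed / trials


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
print(simulate_elimination(100_000, rng))   # should land close to 2/3
print(simulate_announcement(100_000, rng))  # should land close to 1/2
```

The only difference between the two functions is the "continue" in the first one, which is the elimination step.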

The problem with the question as originally posed was that it didn't specify which of these algorithms was being used. Were we arbitrarily told about the girl, or was a selective process applied?

By the way, if we're applying a selective process, then 100% is also a possibly correct answer, because at step two we could have eliminated all parents that don't have both a boy and a girl. Likewise, all other probabilities are also potentially correct depending on the algorithm applied.

Update: Surprisingly, some people are still thinking that my second algorithm yields 2/3 instead of 1/2 (see the confused discussion on news.yc). I think part of the reason is that I was somewhat imprecise with the concept of "elimination". The second algorithm does not eliminate any of the families, but if I announce that there is a boy, that does eliminate the possibility of two girls. This is where some people are getting lost and thinking that the boy+girl probability has become 2/3. The catch is that announcing the boy also reduced the boy+girl probability by an equal amount, so the result is still the same (it eliminated either BG or GB, I don't know which, but it doesn't matter).
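One way to check this exactly (no simulation needed) is to weight each family by how likely it is to produce the announcement. This is a small sketch under the same 50/50 assumption, with the announced child chosen uniformly at random; the function name is made up.

```python
from fractions import Fraction


def posterior_after_announcing(gender):
    """Exact P(family | a randomly chosen child was announced as `gender`)."""
    joint = {}
    for family in ("BB", "BG", "GB", "GG"):
        prior = Fraction(1, 4)
        # Chance that the randomly announced child has the given gender.
        likelihood = Fraction(family.count(gender), 2)
        joint[family] = prior * likelihood
    total = sum(joint.values())
    return {family: weight / total for family, weight in joint.items()}


post = posterior_after_announcing("B")
p_mixed = post["BG"] + post["GB"]
print(post)
print(p_mixed)
```

Announcing a boy wipes out GG entirely, but it also halves the weight of BG and of GB (each would have announced the girl half of the time), while BB keeps its full weight. So boy+girl ends up at 1/2, not 2/3.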

Tuesday, December 30, 2008

communication and collaboration - the big upgrade

The responses to my blog are always a little surprising to me. Yesterday's post didn't have a whole lot of substance, but it did include one good product idea, which is to somehow let other people edit my posts.

Someone on news.yc was not impressed by my idea though:

I was going to disagree with those negative comments below but then read the blog and damn; this guy has a freaking ego to think people would want to edit his ramblings for him in any other way than comical..

Next, I looked at the comments that were on my blog directly. One of the first ones was from Paul Graham, and it said simply:
deniable -> deniability :-)

Well, apparently Paul Graham wants to edit my ramblings, and in a way that would make me look smarter too... I'm pretty sure that he would have just made the correction himself had it been as easy and obvious as leaving a comment, but unfortunately no blog software seems to do that, most especially not Blogger.com.

Last year, someone translated one of my posts into Chinese (and I had Google translate it back).

This all reminds me of one of the blog posts that has been trapped in my head for a long time...

It starts off with something about ants, because my house must have been built on top of a giant anthill or something, because they are continually staging giant invasions and I'm always having to set them on fire or vacuum them up or something. So I'm always thinking about ants, and ants are kind of interesting because, more so than a lot of animals, the individuals are not really viable, and the hive (or colony or whatever) is kind of like a creature of its own (yeah, I know, I'm not the first person to notice this). It even has a short term memory in the form of pheromone trails left on my floor, and I erase those memories with a paper towel and some soapy water. So the ant colony is fairly sophisticated, but each ant's behavior is relatively simple -- they are just following some simple rules and don't really comprehend why or how the colony works. They don't see the "big picture".

And that reminds me of our brains, which are built out of relatively simple neurons. Each neuron simply sums up its inputs, and then generates an output that gets passed along to some other neurons (or something like that, I'm sure it's a huge simplification, but you get the point). Certainly no individual neuron can possibly comprehend what it's doing -- it just cranks along summing up inputs and generating outputs. The magic is in the wires, the connections among the neurons.

Individual humans aren't terribly viable animals either. They almost went extinct not that long ago (100,000 years?). However, since then we've managed to pretty much take over the entire planet and build all kinds of amazing things like airplanes, computers, and burritos. Humans started out kind of similar to other animals (but weaker and less numerous) and then became something fundamentally different. That transition occurred because we are able to communicate and collaborate like nothing else. We can communicate through both time and space. We learn from people who died thousands of years ago on the other side of the planet. Even a survivalist hunter who goes off into the wilderness alone is still relying on all the training and knowledge that was passed on before the journey began.

So in many ways, the human society (or human superorganism) is kind of like the human brain -- the magic is in the connections. Significant advances have occurred when we upgraded the wiring that connects everyone. The inventions of spoken language, written language, and the printing press were all revolutionary because they enabled more sophisticated communication and collaboration.

And now I can ramble on about ants and neurons and stuff, and people all over the world can read it, and digest it, and tell me I'm an idiot, and make their own ideas, and pass them on to other people, and it all happens in a matter of minutes. As much hype and excitement as there has been around the Internet, I think that people may still be misunderestimating its importance. We are literally upgrading the wiring that drives human society.

This is also why I'm excited about things like FriendFeed. The flow of information and influence is rather fundamental to the way our world works. In the past much of that information flow was slow and hierarchical. It had to pass through one of a relatively small number of tightly controlled networks and publishers. But suddenly, the information can come from anywhere, and go anywhere, and it doesn't need anyone's approval. If it's completely random, it won't work any better than a bunch of randomly wired neurons (which I assume isn't very good), but with the right wiring, everyone starts to get the right information for them, and maybe we can stop being so stupid. I'm not yet sure what this new human architecture looks like, but that's what makes it an interesting (and extremely important) problem.

I sometimes think of FriendFeed as a kind of "distributed broadcast channel", but that's just part of the picture. Better collaboration, like having other people edit my blog posts, is another part. It enables each of us to do what we do best, which improves the overall system efficiency and intelligence (and more importantly, I can avoid things that I don't like doing).

Keeping with the brain analogy, it's very likely that we can't even comprehend what's going on. I certainly don't. I'm just a little neuron, summing up my inputs, and then passing the result along to you.

Monday, December 29, 2008

blog, v2

I haven't posted anything here in about eight months, mostly because I've just been very busy, but also because:

  • Blogging is too hard
  • I post a lot of things over on FriendFeed, which is easier, and I'm lazy (and you really should subscribe to my FriendFeed if you find anything I post here at all interesting)
  • I got tired of my blog posts. When I read them, there's something I don't like.
Nevertheless, I sometimes want to share something more than a FriendFeed message or interesting link, so I'm going to give it a second try.

However, I've decided that in the future my posts will be more rambling, and more pointless. I think part of what I don't like about the older posts is that they are sometimes arguing a point or something, but my real point (or my intention, at least) is just to share some kind of idea or thought, not convince anyone of anything. Also, I think this will be a lot easier to write because I can just type a bunch of words and they don't have to fit together in any particular way, and it's also a good excuse to not bother with any editing, so I should be able to crank these things out really fast.

I also have this idea to outsource the writing of my blog posts to someone, ideally everyone. The idea is that I'd write a bunch of stuff and then someone else (maybe wiki-style) would turn it into something coherent and readable. That would save me a lot of time and also provide plausible deniable when I write something that turns out to be especially stupid or offensive. But that's in the future. For now, it will just be a bunch of words that keep going until I get bored or distracted, and then I'll hit "send" :) (I'm also writing these things in Gmail since the blogger interface upsets me)

Wednesday, April 23, 2008

The power of links and the value of global knowledge

Long, long ago, before Google, search engines evaluated and ranked web pages by considering each page in isolation, examining the size of the fonts, the contents of the meta tags, etc. In some cases, it was even possible to "hijack" another site's listings by simply cloning their HTML. Perhaps a few search engines attempted to improve on this with simple tactics such as counting the number of links to a page, but that was generally useless since it's so easy to create "fake" links in order to boost your count.

With PageRank, Google took a very different approach. Instead of considering each page in isolation, they examined the link structure of the entire web and computed a global evaluation of that structure. In other words, they began looking at the entire forest instead of just the individual trees. Google did other things too -- PageRank is just one of many factors, but this general approach of evaluating information in a global context is fundamental to many of the algorithms. These algorithms made it easier for Google to spot which web sites were actually important, and which were just pretenders. Of course Google isn't perfect, and people can still manipulate rankings to some extent, but it was substantially better than the old way, and good enough to form the foundation of what is now a $174 billion company.
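To make the "forest vs. trees" difference concrete, here is a toy sketch of the simplified, textbook form of PageRank (repeatedly passing each page's score along its links). It is certainly not what Google runs in production; the damping value of 0.85 is the commonly cited one, and the tiny "web" below, including every page name, is made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical mini web: "home" and "spam" each have exactly two inbound links.
WEB = {
    "home": ["news", "blog"],
    "news": ["home"],
    "blog": ["home"],
    "farm1": ["spam"],
    "farm2": ["spam"],
    "spam": [],
}

ranks = pagerank(WEB)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

A plain link counter would rate "home" and "spam" equally, since both have two inbound links. The global computation ranks "home" well above "spam", because the pages linking to "spam" have almost no standing of their own.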

Last week I wrote about Facebook gathering similar information about people. By collecting information about people and the links between them, they can start to get a global view of the human "forest". Unfortunately, based on many of the responses, that post wasn't very well written. A lot of people focused on how annoying Facebook applications are (true), how search results limited to your friends would be useless (also true), or other things completely unrelated to my point. A few people mentioned that Facebook hasn't done anything useful with this data, which is actually a good point, but I think that has more to do with Facebook and the newness of the data than it does with the value of the data. After all, the web was around for many years before Google came along and started profitably mining the link structure.

Will Facebook ever do anything useful with the human link data? I have no idea, and it's not particularly important to me. However, I'm confident that SOMEONE will begin mining this data, and that it could ultimately be more valuable than the link data from the web. Facebook is a convenient example because they happen to have a head start on collecting the data, but others might be the first to actually profit from it. Google, in particular, is much better at data mining and also has quite a bit of human link data (from Gmail and Orkut). Microsoft+Yahoo will also have a nice data set, though I doubt that they will know what to do with it. Of course none of this data is perfectly clean and noise-free, but real data never is -- the web certainly isn't.

Thursday, April 17, 2008

Facebook knows who you are, and that's worth more than you think

It's very fashionable to declare that Facebook is an over-hyped fad and will never make any real money, certainly not enough to justify its insane $15 billion valuation. At first glance, it's easy to understand why some people might think it's a toy -- most of the activity there seems to involve biting, poking, and joining groups with funny names.

However, I think that assessment misses out on something very interesting: Facebook is capturing everyone's identity and relationships. Of course there's some noise caused by random friending, but by examining the larger graph as well as other details such as location, affiliations, interactions, and of course explicitly entered relationship details ("how do you know Paul?"), they can get a pretty good idea of which people are actual friends and acquaintances.

The lack of reliable identity information has always been an issue on the web. It's the reason why we don't have a useful directory of email addresses -- everyone in the directory would get bombarded by spam or other unwanted messages, and even if it did exist, how would you know which of the thousands of Adam Smiths is the one that you are looking for? Facebook has already solved this problem for a large fraction of people. It's easy to search for a name and then pick out the right person based on their picture, location, or friends. I get a lot of messages on Facebook, but unlike email, I have yet to receive any spam. That's pretty remarkable.

Perhaps a people directory doesn't seem terribly valuable, but if you can't imagine how to make money from knowing everyone's identity and trust networks, then you aren't being very imaginative. Spam and fraud are two of the biggest problems on the internet, and they are very difficult to stop because it's so easy to create new identities, and we have no good way of differentiating between real identities and fake ones. Even in "real" life, people are able to skip from town to town, defrauding people again and again because to the people in the new town, they have a new and unknown identity.

One of the best examples of this problem on the internet is eBay. If you try to buy or sell something on eBay (especially computers or electronics, apparently), there is a very good chance that someone will try to rip you off -- just search Google for ebay scammers and you will find pages such as "How scammers run rings round eBay" and "eBay Forums: Today's Scams In Progress". eBay has had a relatively solid lock on the auction market due to network effects, but with billions of dollars in profits, a $42 billion market cap, and 10 years of not innovating, I'm willing to bet that won't last. With reliable identity information, most of these fraud schemes would become impractical, which would obviously be a real advantage for an eBay competitor.

What else is highly profitable on the internet? Search. I doubt that anyone will ever beat Google at Google-style search, certainly not Microsoft or Yahoo, even if they do tie their horses together. The only way anyone will create something significantly better than today's Google is if they add a new and important ingredient to the mix. Many people have suggested that demographic information, or perhaps knowing what your friends have searched for will help, but I doubt it. What could work is actual, direct, human involvement by the users. In fact, it's already helping in a very limited form -- Wikipedia pages are written and edited by random people on the internet and they frequently occupy the top spots on Google (and I always click on them). Of course the problem with letting random users edit or reorder the search results is that you will quickly be overwhelmed by spam and fraud. But what if you knew who the users were and which ones you could trust?

Those are just the first few things that come to mind -- the uses of identity information are endless. Of course there's no guarantee that Facebook will actually realize any of this potential -- there were many search engines before Google, and they all fumbled the opportunity they had, but it's important to at least understand the potential for big things.

Update: This post was supposed to be about data more so than Facebook (Facebook just happens to have the data). See this post for a (hopefully) better explanation.