I just slipped and added a misspelled word to the Firefox dictionary. Guess what the process is for undoing that mistake?
Some apps "fix" this problem by prompting with an "are you sure" type dialog before performing the action. That's very annoying, and it doesn't always work since I might hit space or enter or something and accidentally "ok" it. (this happens to me a lot when the dialog pops up while I'm typing)
I really like the solution used in Gmail:
After an action is performed, state what the action was and give an opportunity to undo, but in a non-intrusive way. More people should copy this approach, including Gmail, which needs it on "Send" (this will require adding a short delivery delay, like 10 sec, but it's worth it).
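As a sketch of the pattern (my own illustration, not Gmail's actual implementation): the "send" is queued with a short grace period, and the undo link simply cancels it before it fires.

```python
import threading
import time

class UndoableAction:
    """Run `action` after `delay` seconds unless undo() is called first.

    Illustrative sketch of the delayed-action pattern -- the names and
    structure here are my own, not from any real mail client.
    """
    def __init__(self, action, delay=10.0):
        self._timer = threading.Timer(delay, action)
        self._timer.start()

    def undo(self):
        """Cancel the pending action. Returns True if it hadn't fired yet."""
        was_pending = self._timer.is_alive()
        self._timer.cancel()
        return was_pending

# "Send" a message with a short undo window, then click Undo in time.
sent = []
pending = UndoableAction(lambda: sent.append("message"), delay=0.5)
undone = pending.undo()
time.sleep(0.6)  # wait past the window to prove the send never happened
```

During the grace period the UI would show the non-intrusive "Message sent. Undo" notice; if the user does nothing, the action fires and the notice fades.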
Monday, June 25, 2007
Saturday, June 23, 2007
Quick: Job culture
Here's a good link about the "Job culture":
The cultural assumption of a true free enterprise system would be: “Individuals are responsible for their own lives and labors. They trade as equals, but are beholdin' to nobody.”
Free enterprise isn't anything like big-corporate capitalism. We've been told the two are equivalent, but that's just another bit of cultural brainwashing.
Think about it. Job holders by definition aren't capitalists. Job holders, no matter how well paid they might be, function merely as the servants of capitalists, just as medieval serfs functioned as the servants of lords. They are beholdin'. They function in a climate of diminished responsibility, diminished risk, and diminished reward. A climate of institutional dependency.
...
The daily act of surrendering individual sovereignty – the act of becoming a mere interchangeable cog in a machine – an act we have been conditioned to accept and to call a part of “capitalism” and “free enterprise” when it is not – is the key reason why the present Job Culture is a disaster for freedom.
James Madison, the father of the Bill of Rights, wrote:
“The class of citizens who provide at once their own food and their own raiment, may be viewed as the most truly independent and happy. They are more: They are the best basis of public liberty, and the strongest bulwark of public safety. It follows, that the greater the proportion of this class to the whole society, the more free, the more independent, and the more happy must be the society itself.”
Madison was speaking specifically about independent farmers, but he was also a believer in the independent entrepreneur – and for the same reasons.
Madison (and his like-minded friend Jefferson) knew that people who are self-sufficient in life's basics, who make their own decisions, whose livelihood relies on their own choices rather than someone else's, are less likely to march in lockstep. Independent enterprisers are far more likely to think for themselves, and far more capable of independent action than those whose first aim is to appease institutional gods.
Living in the Job Culture, on the other hand, has conditioned us to take a “someone else will deal with it” mentality. “I'm just doing my job.” “The boss makes the decisions.” “I'm just following orders.” But if someone else is responsible for all the important choices in life, then we, by definition, are not.
It reminds me of my favorite part of Paul Graham's essay on "Why to not not start a startup":
16. A job is the default
This leads us to the last and probably most powerful reason people get regular jobs: it's the default thing to do. Defaults are enormously powerful, precisely because they operate without any conscious choice.
To almost everyone except criminals, it seems an axiom that if you need money, you should get a job. Actually this tradition is not much more than a hundred years old. Before that, the default way to make a living was by farming. It's a bad plan to treat something only a hundred years old as an axiom. By historical standards, that's something that's changing pretty rapidly.
...
Now we look back on medieval peasants and wonder how they stood it. How grim it must have been to till the same fields your whole life with no hope of anything better, under the thumb of lords and priests you had to give all your surplus to and acknowledge as your masters. I wouldn't be surprised if one day people look back on what we consider a normal job in the same way. How grim it would be to commute every day to a cubicle in some soulless office complex, and be told what to do by someone you had to acknowledge as a boss—someone who could call you into their office and say "take a seat," and you'd sit! Imagine having to ask permission to release software to users. Imagine being sad on Sunday afternoons because the weekend was almost over, and tomorrow you'd have to get up and go to work. How did they stand it?
I have a plan to fix this, but it's a little difficult to describe because people have a hard time thinking outside of the serf mindset, the assumptions that underlie the "job culture". It will take some time.
Friday, June 22, 2007
Three types of ideas - bad ones are often the best
Product ideas can be divided into three categories:
- Obviously good ideas that are very difficult to implement. Efficient cold-fusion, flying cars, and a lot of other sci-fi ideas fall into this group.
- Obviously "good" ideas that seem possible but haven't happened yet. Video phones and HDTV were in this category for a long time. I think this happens when people get excited about technology and overestimate the benefits (and possibly underestimate the cost). I just don't care that much about having a video phone.
- "Bad" ideas. Many of these ideas are truly bad, but some of them will in hindsight turn out to have been very good ideas. I put them in the same category because they are difficult to distinguish without the benefit of hindsight. Some examples are the personal computer ("why would anyone need a computer?"), Google ("there are already too many search engines, and besides, search engines don't make money"), and Blogger ("can't you just use Geocities, and besides, are there really that many people with something worth saying?"). More recent (and still controversial) examples include Facebook and Twitter.
I expect that a lot of people will argue with the fact that I've grouped "truly bad" ideas together with "seemed bad but were actually very good" ideas. Everyone thinks that they can tell which ideas are the good ones, but observation suggests that they can't. I do believe that some people are better at picking winners than others, but at best they have maybe 50% accuracy. I've seen a lot of very smart people (such as at Google) be very wrong about these things.
For example, I remember when the "upload videos and put them on the web" incarnation of Google Video was first being developed. Nearly everyone inside of Google, including me, was very skeptical that anything worthwhile would ever be uploaded. The common prediction was that it would all be movies and porn. Of course there was some of that, but the skeptics were completely wrong about the lack of worthwhile content. Uploaded video is one of the most important developments of the past several years. Unfortunately Google Video was burdened with an incredibly bad upload process (it included installing a Windows client to do the upload!), and YouTube, which was started AFTER Google Video launched, took over. I believe that mistake was partially due to the negative expectations.
Here's my point: The best product ideas are often found in the "bad ideas" category!
If you're a super-genius researcher looking to dedicate your life to a really important problem that you may never solve, then the category one "cold fusion" ideas may be a good option. If you've found some clever way to dramatically reduce the cost of implementing a category two idea (like Skype did to the video phone), then that might be a good option. However, the real "low hanging fruit" is likely to be found with the category three "bad ideas". You'll have to deal with annoying skeptics and haters who say that you are wasting your time, but sometimes you'll create something hugely important, or at least moderately successful (and they won't).
Realizing that good ideas and bad ideas are often nearly indistinguishable, there are a few more lessons to be learned:
- Instead of endlessly debating whether an idea is good or not, we should find faster and cheaper ways of testing them. This is one of the reasons why open systems such as the Internet or market economies develop faster than closed systems, such as communism or big companies -- individuals and small groups are able to build new things without getting approval from anyone.
- Your product idea is generally worthless, because to most people it's indistinguishable from other bad ideas. Very few people will want to buy (or steal, or take for free) your idea because they already have their own bad ideas which they like better. (There are other reasons as well, but this is one of them.) The way to make your idea valuable is to demonstrate that it's not bad by building a real product -- this also demonstrates your ability to execute.
Not getting health insurance is a really bad idea
Seriously. If you can possibly afford insurance, then there's no excuse for not having it. I'm especially thinking of young people doing a startup and not buying insurance in order to "save money". That's a really dumb optimization, and suggests that you aren't very good at making trade-offs. A quick search shows that in California you can get a decent plan for $77/month. If you can't find $77/month, then maybe you aren't ready to do a startup.
I'm reminded of this issue right now because I just passed someone in SF lying face down on the pavement, not moving. His bicycle was nearby and I'm guessing that he had just been hit by a car. I hope that he does ok, but it didn't look good. (people were already helping him, and the police were just a few cars back, so I didn't stop)
Even if you are young and healthy, bad things can still happen. My brother was perfectly healthy until he got cancer at age 33. When my daughter was in the hospital, it cost $12,000/day -- she was there for almost 3 months.
The simple fact is that the health care system in the US isn't set up for people who don't have insurance. If something bad happens, you don't want to be one of those people. I'm not looking to debate our system, so please don't bother. Right now, the system is the way it is.
My point is simple: Do what you can to get insurance. Now.
Saturday, June 16, 2007
Quick: Unintended consequences
For some reason, I like reading about unintended consequences. It seems that reality always has more depth and complexity than we would like to admit. This is part of the reason why people who claim to have the answers, generally don't.
This comes to mind because I was just reading a little about the effects of pesticides on plants:
Pesticide residues in plants are regulated to protect human health. They are measured when plants enter the market. However, this procedure does not consider that pesticides modulate secondary metabolism in plants when they are applied. The question arises whether it is conceivable that pesticide-induced changes in the chemical composition of plants influence human health. The examples resveratrol, flavonoids, and furanocoumarins indicate that plant phenolics may have subtle effects on physiologic processes that are relevant to human health. These effects may be beneficial, e.g., due to the inhibition of the oxidation of macromolecules and platelet aggregation or by their pharmacologic properties. Depending on concentration and specific chemical composition, however, plant phenolics may also be toxic, mutagenic, or cancerogenic. The consequences of a modulation of plant phenolics on human health are complex and cannot be predicted with certainty. It may be that the modulation of plant phenolics at the time of application and not the usually low level of pesticide residues at the time of consumption is critical for human health.
Put another way, pesticides can actually alter the nutritional properties of plants by removing environmental stress, stress which would have caused the plants to produce nutrients that may be important to humans. I hadn't considered that.
Friday, June 15, 2007
Thursday, June 14, 2007
Quick: DBE
People sometimes ask what "Don't Be Evil" really means. Eventually I'll write down my thoughts, but for now this excellent post from Seth Godin explains a good part of it:
Let me be really clear, just in case. If you think that the world would be a better place if everyone owned a handgun, then yes, market handguns as hard as you can. If you honestly believe that kids are well served by drinking a dozen spoonfuls of sugar every morning before school, then I may believe you're wrong, but you should go ahead and market your artificially-sweetened juice product. My point is that you have no right to market things you know are harmful or that lead to bad outcomes, regardless of how much you need that job.
Along the way, “just doing my job” has become a mantra for blind marketers who are making short-term mistakes in order to avoid a conflict with the client or the boss. As marketing becomes ever more powerful, this is just untenable. It’s unacceptable.
If you get asked to market something, you’re responsible. You’re responsible for the impacts, the costs, the side effects and the damage. You killed that kid. You poisoned that river. You led to that fight. If you can’t put your name on it, I hope you’ll walk away. If only 10% of us did that, imagine the changes. Imagine how proud you’d be of your work.
There are a lot of worthwhile things in the world -- why waste your life working on something evil?
Wednesday, June 13, 2007
The dogmatic programmer - when software becomes religion
Do you believe that there is only one "right way" to do things? Do you ignore evidence that other solutions may also work? Do you dismiss the possibility that there are reasonable trade-offs to be made between the "right way" and some other way? (do you leave angry comments on my blog?)
Do you think I'm kidding? Read these comments about GET vs POST. There's no question that POST is the "right" way and generally safer, but sometimes it's annoying. Even though GET isn't "supposed" to work, it often can be made to. Don't believe me? Google does billions of dollars a year in GET based transactions in the form of CPC ad clicks (which can cost over $50/click). A lot of other things work this way too, including most non-ajax webmail (for changing read/unread state, at the very least).
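One common way to make a state-changing GET reasonably safe, sketched below (an illustration of mine, not how any real ad system is implemented), is to deduplicate on a one-time token so that reloads, link prefetchers, and replays can't repeat the side effect:

```python
# Sketch: a state-changing GET made safer with one-time tokens, so a
# reload or prefetch can't repeat the side effect. Illustrative only --
# the names here are made up, not any real system's API.

processed_tokens = set()  # in production: a persistent store with expiry
charges = []              # stand-in for the billing system

def charge_advertiser(token):
    charges.append(token)  # hypothetical billing call

def handle_click(params):
    """Handle a GET like /click?token=abc123. Returns True if it charged."""
    token = params.get("token")
    if token is None or token in processed_tokens:
        return False  # missing or already-seen token: safely ignore
    processed_tokens.add(token)
    charge_advertiser(token)
    return True

first = handle_click({"token": "abc123"})
replay = handle_click({"token": "abc123"})  # reload, prefetch, back button
```

The point is that "GET must never change state" can be traded away if you put the effort into recovery instead: the replayed request is simply a no-op.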
If you're writing missile guidance software or something then, please, please, please do things the right way -- it's better safe than sorry, right?
However, if you're writing little web apps, then there's a good chance that you can bend a lot of the rules and produce a better product sooner. I'd rather have something imperfect but useful and popular than something "perfect" but unfinished and unused. Sometimes, it's better sorry than safe.
When is it safe to bend the rules? Just ask yourself, "What's the worst that could happen?" If the worst case isn't too bad, then you're probably ok. Minor bad things will happen no matter what, so it's often better to put your energy into general problem recovery, rather than imagining that you can simply avoid all problems. (this is especially true on the web)
Quick: Vibe
Seth Godin on "the vibe":
Have you ever been at a banquet or in a boutique or at a concert or a meeting or a company where the vibe was incredibly positive?
...
If vibe is so important, why does it sound flaky to worry about it? Who's in charge of the vibe at your place? Could it be better? A lot better?
This is a very real and under-appreciated part of work. It was one of the first things I noticed when I started at Google after working at Intel -- the place just felt so much more alive. Now it's something that I watch for when visiting startups -- my guess is that upbeat and energetic companies will outperform the ones that feel oppressive and hopeless.
Monday, June 11, 2007
Quick: Wolves
A great story from Steve Yegge. It sounds like Steve recently realized that there are wolves at Google. Can you see the wolves in your organization? How are you protecting yourself? (or are you being eaten alive?)
"Have you ever noticed that all our folklore about wolves has certain themes and patterns to it? I've never met a wolf, but I give them fair credit for being wily creatures. I'm a big man, not one to fear a big dog's bite, but I would give a wolf a wide berth, because they're reputed to be so crafty. And crafty, friends, I am not. A craftsman, yes, but I realize now that I am more sheep than wolf.
"The landowner in my tale is a good, affable fellow; I could see it within minutes of meeting him, and I stand by him today still. He's a man with a clear vision and a good heart, and he hired crew members who, for all our talents, know nothing of wolves or wizards.
"The problem with setting up an untended, untrained flock of sheep is that there are wolves aplenty, and as sheep are not adequately prepared to recognize or deal with wolves, they will all eventually be eaten. This seems obvious to us. Nearly as obvious is that this idea applies equally to government, with the minor change that the wolves are savvy politicians and the sheep are incompetent ones. Eventually the sheep are all eaten, so our assumption is generally that the government is filled entirely with wolves — an assumption that has proven to be essentially axiomatic, across time and nation.
"What's far less obvious is that the idea applies outside flocks and governments, with the same consequences. If you set up any organization of saintly do-gooders, with no politics anywhere in sight, then the presence of a single master politician can go unrecognized for long enough for every single sheep to be eaten and replaced with a wolf. At some point the shepherd looks down and perceives that he now has a flock of wolves, and he wonders how it happened. But a shepherd rarely witnesses the process while it's in motion, because the wolves are so sneaky.
"I was being eaten alive on my mansion project, and yet I was utterly oblivious to anything but the need to survive, which due to various odd circumstances required working without sleep and eating marshmallows without respite, apparently until either I was dead or forever happened, whichever came first.
Saturday, June 09, 2007
Email blackmail is unnecessary
From Slashdot:
Under the guise of fighting spam, five of the largest Internet service providers in the U.S. plan to start charging businesses for guaranteed delivery of their e-mails. In other words, with regular service we may or may not deliver your email. If you want it delivered, you will have to pay deluxe.
It's tempting to think that such schemes, although distasteful, are necessary to fight spam. They are not.
A good reputation system can be used to provide reliable delivery to high-volume, non-spamming senders, and it can be done without punishing non-profits, mailing lists, startups, or others who can't afford to pay extra delivery fees.
How does that work? Brad Taylor of Google published a paper about the Gmail reputation system, which is quite effective (pdf or html).
... spammy domains get nicely clustered around a reputation of zero. We know these domains are spammy because if our users disagreed with us, they would unmark spam on them and the reputation would no longer be zero. Nonspam domains are more loosely clustered in the 90-100 range. There ends up being a smattering of domains in between. Many of the messages in the in-between category are from legitimate bulk senders, and the lower than 90 percent reputation is a reflection of less-than-ideal sending practices. But not all bulk senders are the same. Some have very high SPF and DomainKey reputations. For example, eBay's DomainKey reputation is 98.2, enough to be whitelisted by Gmail. This is the "holy grail" of bulk sending, and it is all done without requiring any payment or extra effort on the part of the sender other than just having good mail hygiene.
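A toy version of that idea (a drastic simplification of mine, not the formula from the paper): score each domain by the fraction of its authenticated mail that users treat as nonspam, and whitelist senders above a high threshold.

```python
def domain_reputation(nonspam_count, spam_count):
    """Toy reputation score: percent of a domain's mail kept as nonspam.

    A drastic simplification for illustration -- not the actual Gmail
    formula described in the paper.
    """
    total = nonspam_count + spam_count
    if total == 0:
        return None  # unknown sender: no history to judge by
    return 100.0 * nonspam_count / total

WHITELIST_THRESHOLD = 98.0  # hypothetical cutoff

def is_whitelisted(nonspam_count, spam_count):
    reputation = domain_reputation(nonspam_count, spam_count)
    return reputation is not None and reputation >= WHITELIST_THRESHOLD

# A sender like the eBay example (~98.2) clears the bar; a mixed bulk
# sender sits in between; a pure spammer clusters at zero.
good = is_whitelisted(982, 18)
mixed = domain_reputation(50, 50)
spammer = domain_reputation(0, 1000)
```

The key property is that the score is driven by user feedback rather than payment, which is why it doesn't punish non-profits or startups.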
Unfortunately Google does not currently publish this reputation information, though I'm hopeful that they will eventually. (but I know that the team there is already hard at work on a number of other important projects)
Friday, June 08, 2007
Optimizing everything: some details matter a lot, most don't
Optimization is something that all engineers and product designers (all people, really) should understand. However, their words and behavior suggest that most either don't understand it, or haven't really internalized the lessons of optimization. (and by the way, I'm talking about CS-style "make it better" optimization, not "make it optimal" optimization)
Like everything, optimization is a complex topic, so I'm going to simplify it into three easy steps:
- Find the problems ("profiling")
- Sort the problems according to benefit-of-fixing/cost-of-fixing ("ROI")
- Fix them, in the order of highest benefit/cost to lowest
Most programmers probably think of optimization primarily in terms of CPU profiling, but it really applies to everything. (and in fact, this post is about the general process of optimizing, especially for things other than code -- for specific cases of hardcore game engine optimization or whatever it might be totally wrong)
For example, people who continually "pinch pennies" (perhaps driving around town to save a few cents on gas) but then waste thousands of dollars on something stupid like a fancy car or credit card interest are probably not optimizing their spending very well. They are putting most of their effort into low reward activities (saving a few cents on gas) while losing much more money in other places. Big, dumb, Dilbertesque companies are also notorious for this kind of thing, cutting back on pencils while running Super Bowl commercials, or something.
This is also the kind of activity that I was talking about in my post titled "Wasting time on things that really don't matter". It's not that there is zero difference between two slightly different ways of testing for the empty string, it's just that your product probably has 100,000 more serious problems, and so time spent worrying about the empty string style is time NOT spent fixing those more important problems. Put another way, fix all of the other problems first, then debate the empty string.
Of course you may enjoy debating the empty string (and in fact there are some interesting arguments), and that's fine. The zeroth step in optimization is to decide what you are optimizing for. If you're looking to maximize the amount of time spent on endless debates, or perhaps code "beauty", then maybe the empty string debate really is the highest benefit/cost activity (because the benefit is so high). If, on the other hand, you are trying to maximize the success of your product, then the empty string debate probably has negative value and should be strongly avoided.
The empty string comparison is obviously a small detail, and it's tempting to think that we should not bother with such small details, but that too is a huge mistake. Another important lesson of optimization is that some small fraction of the details will determine the majority of the outcome. This is commonly referred to as the 80/20 or 90/10 rule, stating for example that "90% of the time is spent in 10% of the code", but it could just as easily be 99/1 or something like that (and for code that has already been well optimized, this 80/20 rule may not hold -- it's just a heuristic). In code it may be just a few lines in the "inner loop", or perhaps it's the interest rate on your mortgage ("what's a few percent?"). I once profiled a server at Google and discovered that something like 90% of the time was being spent on a single line of code: a call to sscanf (repeated millions of times) that someone had snuck into the scoring code (that's the 99999/1 rule).
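That kind of hotspot is exactly what a profiler surfaces. A small sketch (a contrived example of mine, not the actual Google server):

```python
import cProfile
import io
import pstats

def parse_score(line):
    """Stand-in for the sscanf-style parsing that dominated the profile."""
    return sum(int(field) for field in line.split(","))

def handle_requests(lines):
    return [parse_score(line) for line in lines]

lines = ["1,2,3,4,5"] * 20000
profiler = cProfile.Profile()
results = profiler.runcall(handle_requests, lines)

buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("tottime").print_stats(3)  # top 3 functions by own time
report = buf.getvalue()
# parse_score should dominate the report -- the 80/20 (or 99/1)
# pattern in action: one function, nearly all of the time.
```

You don't debate which line to optimize; the profile tells you, and everything outside the top few entries is usually not worth touching.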
Determining which details matter a lot (and which don't) is also the key to designing successful products. In the classic Slashdot iPod post from 2001, they summarized the newly released iPod thusly: "No wireless. Less space than a nomad. Lame."
History suggests that those weren't the important details, since it's 2007 and there still isn't an iPod with wireless (not until the 29th, at least), and most iPods have less than the original 5GB of storage. Apple did, however, get a few very important details right: it's easy to get your music onto an iPod, it holds "enough" music (unlike the early flash players), and it fits easily into your pocket (unlike the early disk-based players).
So how do we determine which details are important? Observation. (and some judgment, but I think that's mostly formed via observation) If you're optimizing an application, observe which resources are being exhausted or causing the delay (it's not always CPU -- it's often disk seeks), and then observe where they are being consumed. If you're trying to optimize money, first observe where most of the money goes to and comes from. To optimize a product design, observe which things are most "useful", and which are most painful or annoying (waiting or thinking, usually).
If you get all of the important details right, you'll be so far ahead of everyone else that the other details probably won't matter. But remember, your time is finite, so time spent on unimportant details is time not spent on important details! Most engineers are very detail oriented people, and this is good because they can really focus on the important details, and it is bad because they can really focus on the unimportant details! I suspect that learning to tell the difference will be worth your time.
By the way, to get even more details right, decrease the cost side of the benefit/cost equation so that you can do more things in less time (by finding easier solutions, avoiding endless debates, etc).
Like everything, optimization is a complex topic, so I'm going to simplify it into three easy steps:
- Find the problems ("profiling")
- Sort the problems according to benefit-of-fixing/cost-of-fixing ("ROI")
- Fix them, in the order of highest benefit/cost to lowest
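The three steps above can be sketched in code. This is a minimal illustration, not a real profiler; the `Issue` class, its fields, and the example numbers are all invented for the sketch:

```java
import java.util.*;

public class Roi {
    // A hypothetical "problem found by profiling", with rough estimates
    // of the benefit of fixing it and the cost of fixing it.
    static class Issue {
        final String name;
        final double benefit;
        final double cost;
        Issue(String name, double benefit, double cost) {
            this.name = name;
            this.benefit = benefit;
            this.cost = cost;
        }
        double roi() { return benefit / cost; }
    }

    // Step 2: sort by benefit/cost, highest first. Step 3 is then just
    // walking the returned list in order.
    static List<String> prioritize(List<Issue> issues) {
        return issues.stream()
                .sorted(Comparator.comparingDouble(Issue::roi).reversed())
                .map(i -> i.name)
                .collect(java.util.stream.Collectors.toList());
    }

    public static void main(String[] args) {
        List<Issue> issues = Arrays.asList(
                new Issue("slow disk seeks", 100, 5),     // roi 20
                new Issue("empty string style", 0.1, 1),  // roi 0.1
                new Issue("signup page bug", 50, 2));     // roi 25
        System.out.println(prioritize(issues));
        // → [signup page bug, slow disk seeks, empty string style]
    }
}
```

The hard part in practice is estimating the benefit and cost numbers, which is what the rest of this post is about; the sorting is trivial.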
Most programmers probably think of optimization primarily in terms of CPU profiling, but it really applies to everything. (In fact, this post is about the general process of optimizing, especially for things other than code -- for specific cases like hardcore game-engine optimization it might be totally wrong.)
For example, people who continually "pinch pennies" (perhaps driving around town to save a few cents on gas) but then waste thousands of dollars on something stupid like a fancy car or credit card interest are probably not optimizing their spending very well. They are putting most of their effort into low-reward activities (saving a few cents on gas) while losing much more money in other places. Big, dumb, Dilbertesque companies are also notorious for this kind of thing, cutting back on pencils while running Super Bowl commercials.
This is also the kind of activity that I was talking about in my post titled "Wasting time on things that really don't matter". It's not that there is zero difference between two slightly different ways of testing for the empty string, it's just that your product probably has 100,000 more serious problems, and so time spent worrying about the empty string style is time NOT spent fixing those more important problems. Put another way, fix all of the other problems first, then debate the empty string.
Of course you may enjoy debating the empty string (and in fact there are some interesting arguments), and that's fine. The zeroth step in optimization is to decide what you are optimizing for. If you're looking to maximize the amount of time spent on endless debates, or perhaps code "beauty", then maybe the empty string debate really is the highest benefit/cost activity (because the benefit is so high). If, on the other hand, you are trying to maximize the success of your product, then the empty string debate probably has negative value and should be strongly avoided.
The empty string comparison is obviously a small detail, and it's tempting to think that we should not bother with such small details, but that too is a huge mistake. Another important lesson of optimization is that some small fraction of the details will determine the majority of the outcome. This is commonly referred to as the 80/20 or 90/10 rule, stating for example that "90% of the time is spent in 10% of the code", but it could just as easily be 99/1 or something like that (and for code that has already been well optimized, this 80/20 rule may not hold -- it's just a heuristic). In code it may be just a few lines in the "inner loop", or perhaps it's the interest rate on your mortgage ("what's a few percent?"). I once profiled a server at Google and discovered that something like 90% of the time was being spent on a single line of code: a call to sscanf (repeated millions of times) that someone had snuck into the scoring code (that's the 99999/1 rule).
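To make that last pattern concrete, here is a hypothetical reconstruction of the sscanf-in-the-scoring-loop hotspot (this is not the actual Google code, and the names are invented): an expensive parse repeated inside the inner loop, and the one-line fix of hoisting it out.

```java
public class InnerLoop {
    // Slow: re-parses the same config string once per document scored.
    // This is the shape of the "90% of time on one line" problem.
    static double scoreAllSlow(double[] docs, String weightConfig) {
        double total = 0;
        for (double d : docs) {
            double weight = Double.parseDouble(weightConfig); // parsed N times
            total += d * weight;
        }
        return total;
    }

    // Fast: identical result, with the parse hoisted out of the loop.
    static double scoreAllFast(double[] docs, String weightConfig) {
        double weight = Double.parseDouble(weightConfig); // parsed once
        double total = 0;
        for (double d : docs) {
            total += d * weight;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] docs = {1.0, 2.0, 3.0};
        System.out.println(scoreAllSlow(docs, "0.5")); // 3.0
        System.out.println(scoreAllFast(docs, "0.5")); // 3.0
    }
}
```

With three documents the difference is invisible; repeated millions of times per query, the parse dominates everything else, which is exactly why profiling finds these and code review usually doesn't.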
Determining which details matter a lot (and which don't) is also the key to designing successful products. In the classic slashdot iPod post from 2001, they summarized the newly released iPod thusly: "No wireless. Less space than a nomad. Lame."
History suggests that those weren't the important details, since it's 2007 and there still isn't an iPod with wireless (not until the 29th, at least), and most iPods have less than the original 5GB of storage. Apple did, however, get a few very important details right: it's easy to get your music onto an iPod, it holds "enough" music (unlike the early flash players), and it fits easily into your pocket (unlike the early disk-based players).
So how do we determine which details are important? Observation. (And some judgment, but I think that's mostly formed via observation.) If you're optimizing an application, observe which resources are being exhausted or causing the delay (it's not always CPU -- it's often disk seeks), and then observe where they are being consumed. If you're trying to optimize money, first observe where most of the money goes and where it comes from. To optimize a product design, observe which things are most "useful", and which are most painful or annoying (waiting or thinking, usually).
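In code, "observation" can start as simply as timing each phase instead of guessing. A minimal sketch (the phase names and the empty bodies are placeholders for your own code):

```java
public class Phases {
    // Time a single chunk of work in nanoseconds.
    static long timeIt(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Replace the empty lambdas with your real phases; the point is
        // to measure where the time goes before deciding what to fix.
        long parse   = timeIt(() -> { /* parse input here */ });
        long compute = timeIt(() -> { /* compute here */ });
        long io      = timeIt(() -> { /* write output here */ });
        System.out.printf("parse=%dns compute=%dns io=%dns%n",
                parse, compute, io);
    }
}
```

A real profiler gives far better data, but even this crude version beats intuition, and it works for things (like network round trips) that CPU profilers don't show well.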
If you get all of the important details right, you'll be so far ahead of everyone else that the other details probably won't matter. But remember, your time is finite, so time spent on unimportant details is time not spent on important details! Most engineers are very detail-oriented people, and this is good because they can really focus on the important details, and it is bad because they can really focus on the unimportant details! I suspect that learning to tell the difference will be worth your time.
By the way, to get even more details right, decrease the cost side of the benefit/cost equation so that you can do more things in less time (by finding easier solutions, avoiding endless debates, etc).
Sunday, June 03, 2007
Java running faster than C
Note: A lot of people seem to be taking this post to be the "Ultimate C vs Java shootout". It's not. Performance is a very complex topic. My only real point is this: Java (which used to be slow) has reached the class of "fast languages". For the majority of applications, speed is no longer a valid excuse for using C++ instead of Java.
I just saw this page comparing the performance of several languages on a simple Mandelbrot set generator. His numbers show Java being over twice as slow as C, but then I noticed that he's using an older version of java and only running the test once, which doesn't really give the JVM a chance to show off.
I quickly hacked up the code to run 100 iterations 3 times and then used my standard "go fast" flags (there may be better flags, but I'm lazy). Here are my results:
$ java -server -XX:CompileThreshold=1 Mandelbrot 2>/dev/null
Java Elapsed 2.994
Java Elapsed 1.926
Java Elapsed 1.955
$ gcc -O8 mandelbrot.c
$ ./a.out 2>/dev/null
C Elapsed 2.03
C Elapsed 2.04
C Elapsed 2.05
C still wins on the first iteration, but Java is actually slightly faster on subsequent iterations!
Obviously the results will be different with different code and different machines, but it's clear that the JVM is getting quite fast.
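The repeat-and-time pattern behind those numbers is easy to reproduce. Here's a minimal sketch of such a harness with a stand-in workload (not the actual Mandelbrot code): running the same work three times lets you see the JIT-compiled iterations separately from the first one, which includes interpretation and compilation overhead.

```java
public class Bench {
    // Stand-in numeric workload; substitute the code you actually care about.
    static double workload() {
        double sum = 0;
        for (int i = 1; i <= 1_000_000; i++) {
            sum += 1.0 / i;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int run = 0; run < 3; run++) {
            long start = System.nanoTime();
            double result = workload();
            long elapsed = System.nanoTime() - start;
            // Print the result too, so the JIT can't prove the work is
            // dead and eliminate it entirely.
            System.out.printf("Elapsed %.3f ms (sum=%.6f)%n",
                    elapsed / 1e6, result);
        }
    }
}
```

On a typical JVM the first run is noticeably slower than the later ones; benchmarks that time only a single run are really measuring warm-up, which is the problem with the original comparison.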
This test was run with Java 1.6.0-b105 and gcc 4.1.2 under Linux 2.6.17 under Parallels on my 2.33GHz Core 2 Duo MacBook Pro. Here is the hacked up code: Java and C.
For extra fun, I also tried running the JS test using the Rhino compiler:
$ java -cp rhino1_6R5/js.jar -server -XX:CompileThreshold=1 org.mozilla.javascript.tools.shell.Main -O 9 mandelbrot.js 2>/dev/null
JavaScript Elapsed 21.95
JavaScript Elapsed 17.039
JavaScript Elapsed 17.466
JavaScript Elapsed 17.147
Compiled JS is about 9x slower than C on this test. If CPU speed doubles every 18 months, then JS in 2007 performs like C in 2002.
Update: A few more gcc flags have been suggested. -march=pentium4 helps a little, but it's still slower than Java.
$ gcc -O9 -march=pentium4 mandelbrot2.c
$ ./a.out 2>/dev/null
C Elapsed 1.99
C Elapsed 1.99
C Elapsed 1.99
Adding -ffast-math puts C in the lead, but I'm not sure what the downside is. The gcc man page says, "This option should never be turned on by any -O option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions." That sounds like an optimization that Java might not use.
$ gcc -ffast-math -O9 -march=pentium4 mandelbrot2.c
$ ./a.out 2>/dev/null
C Elapsed 1.66
C Elapsed 1.67
C Elapsed 1.67
Update: Several people have claimed that the performance difference is due to fputs (including the top rated comment on reddit, amusingly). That is not correct, at least not on my computer. I tried replacing the print calls with a trivial function (but with side-effects), and it actually helped the Java more than the C:
C Elapsed 1.88
Java Elapsed 1.554
Many people have pointed out that '-O8' is more than enough 'O' levels. I know, and I don't care -- it's just as good as '-O3' or whatever.
Friday, June 01, 2007
Wasting time on things that really don't matter
It's easy to waste a lot of time debating things that aren't very important. Maybe that's fun if you don't have anything better to do, but when you're actually trying to accomplish something it can be deadly.
This post on "coding horror" is a great example of this phenomenon, or more specifically, the comments are a great example. At least half of the comments are from people who are debating which of these two lines of code are better:
Here's my answer: It's not important!
It's easy to get dragged into these debates because we all have opinions (and the one form really is better!), and of course the more time is spent talking about the issue the more important it seems.
Here's my trick for killing these stupid debates: Let's list all the issues and problems that we are facing right now. Now, where does this issue rank on that list? Is it in the top 10? The top 100? The top 1000? Are we spending more time talking about this one minor issue than we are spending working on any of our top ten issues? (You don't necessarily have to make the list -- just suggest that it exists and then guess where on the list the issue appears)
Sometimes that works. Sometimes it brings a little perspective.
And yes, this misallocation of time (aka "worrying about dumb stuff") is probably a top 10 productivity killer.
Update: Here's a good link (via news.yc) which offers one explanation for why people often spend the most time on the least important issues:
Update Two: Wasting time is fine, and even inane discussions of the best way to compare empty strings can be interesting. My real point is this: if we are actually trying to get something done, then we need to kill this stuff, or it will kill productivity. If, on the other hand, we're ok being in time-wasting-mode, then silly debates are probably harmless enough. I picked on the blog comments above not because there is anything wrong with them per-se (what are blogs for, if not inane debate?), but because it reminded me a lot of the silly debates that I've seen on work mailing lists (where they do cause harm).
This post on "coding horror" is a great example of this phenomenon, or more specifically, the comments are a great example. At least half of the comments are from people who are debating which of these two lines of code are better:
if (s == String.Empty)
if (s == "")
Here's my answer: It's not important!
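For what it's worth, the same debate exists in Java (the disputed lines above are C#; these Java forms are my own), and the three common checks agree on every string, which is part of why it doesn't matter:

```java
public class EmptyCheck {
    // Three interchangeable ways to ask "is this string empty?"
    static boolean viaEquals(String s)  { return s.equals(""); }
    static boolean viaLength(String s)  { return s.length() == 0; }
    static boolean viaIsEmpty(String s) { return s.isEmpty(); } // Java 6+

    public static void main(String[] args) {
        for (String s : new String[] {"", "x", "  "}) {
            System.out.printf("\"%s\": equals=%b length=%b isEmpty=%b%n",
                    s, viaEquals(s), viaLength(s), viaIsEmpty(s));
        }
    }
}
```

Pick whichever one your team already uses and move on to a problem higher up the list.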
It's easy to get dragged into these debates because we all have opinions (and one form really is better!), and of course the more time is spent talking about an issue, the more important it seems.
Here's my trick for killing these stupid debates: Let's list all the issues and problems that we are facing right now. Now, where does this issue rank on that list? Is it in the top 10? The top 100? The top 1000? Are we spending more time talking about this one minor issue than we are spending working on any of our top ten issues? (You don't necessarily have to make the list -- just suggest that it exists and then guess where on the list the issue appears)
Sometimes that works. Sometimes it brings a little perspective.
And yes, this misallocation of time (aka "worrying about dumb stuff") is probably a top 10 productivity killer.
Update: Here's a good link (via news.yc) which offers one explanation for why people often spend the most time on the least important issues:
Parkinson shows how you can go in to the board of directors and get approval for building a multi-million or even billion dollar atomic power plant, but if you want to build a bike shed you will be tangled up in endless discussions.
Parkinson explains that this is because an atomic plant is so vast, so expensive and so complicated that people cannot grasp it, and rather than try, they fall back on the assumption that somebody else checked all the details before it got this far. Richard P. Feynman gives a couple of interesting, and very much to the point, examples relating to Los Alamos in his books.
A bike shed on the other hand. Anyone can build one of those over a weekend, and still have time to watch the game on TV. So no matter how well prepared, no matter how reasonable you are with your proposal, somebody will seize the chance to show that he is doing his job, that he is paying attention, that he is *here*.
Update Two: Wasting time is fine, and even inane discussions of the best way to compare empty strings can be interesting. My real point is this: if we are actually trying to get something done, then we need to kill this stuff, or it will kill productivity. If, on the other hand, we're ok being in time-wasting mode, then silly debates are probably harmless enough. I picked on the blog comments above not because there is anything wrong with them per se (what are blogs for, if not inane debate?), but because they reminded me a lot of the silly debates that I've seen on work mailing lists (where they do cause harm).