Month: August 2005

Application To Be Stupid

The National Highway Traffic Safety Administration has just released a report on the effects seen from the repeal of Florida’s mandatory motorcycle helmet law back in 2000 (summary and CNN report). The effect was pretty much the same as seen in other states that have repealed helmet laws: deaths increased and costs to treat head injuries more than doubled (with $10.5 million charged to charitable and government sources).

Of course, the report just dredges up all the libertarian arguments about how the government shouldn’t interfere with one’s right to be stupid, so long as they aren’t hurting anyone else by their stupidity. That argument has an air of truth to it for me, and as a public service I’d like to propose a simple government form:

Application To Be Stupid

Name: ______________ Date: ______________

Intended stupidity (check one):

[ ] Riding motorcycle without helmet
[ ] Driving without wearing seat belt
[ ] Asserting my second-amendment rights while drunk
[ ] Other (please specify): ___________________

Please read carefully and sign below:

I hereby attest that I am hellbent and determined to be as stupid as possible, as is within my rights as a free-thinking adult, and assert that it is nobody's business to tell me otherwise. I also attest that all of the following are true:

  • Should I sustain injury, I will refuse any and all medical aid offered above and beyond that which would be reasonably required by a more intelligent person. I will wear my Stupid Alert medical bracelet at all times during my activity.
  • I am either not insured, or have filed a stupidity waiver with my insurance company, such that rates will not increase for others due to my stupidity.
  • I have no dependents who rely on my presence or income.
  • I am either not currently employed or my employment is currently a drag on my employer and the economy. No business decisions have been made under the assumption that I would take reasonable precautions for my own life.
  • There is no one who loves me or who would be distraught, depressed, or would otherwise miss me if my stupidity brings about my premature end.
  • I am of sound mind and am fully capable of making a rational decision. I am not currently under the influence of alcohol, drugs, or inordinate libertarianism.

Signature: _____________________________

(Thanks to Judith for the link!)

Chris Schmandt’s book available for free download

Chris Schmandt, the head of MIT Media Lab’s Speech Interface Group, has just made his now out-of-print 1994 book Voice Communications With Computers: Conversational Systems available as a free PDF download from his website.

Chris was one of the readers for my Generals Exams, and naturally this was one of the books on my reading list. It’s 11 years old at this point, but most of the issues he talks about are inherent in speech communications regardless of the technology. Highly recommended.

(Thanks to Thad for the link!)

CVS disposable video camera uncrippled…

With a textbook “give away the razors and sell the blades” strategy, CVS started selling a “one-time-use” video camcorder on June 26th for just $29.99. Buy it, take your movie, and then get a DVD of your movie for just $12.99 at the CVS photo lab.

Just 39 days later, people have figured out how to make it download those movies directly to your own PC over USB.

I don’t know how much these things cost CVS, but they can’t be happy about this obvious development. (No word yet on whether CVS will be taking legal action based on vague “the government should stop anyone from poking holes in our poorly-thought-out business plan” laws…)

Random factoid of the day: 222 years till Universe-sized hard drive

Another back-of-the-envelope calculation, inspired by a comment by my friend Beemer:

Atoms in the Universe: 10^79
Bits on this year’s largest hard drive: 500 GB = 4 x 10^12 bits
Doubling time for hard drives: 1 year

Years before a single hard drive will store 1 bit for every atom in the Universe at current doubling rates: 222

Warning: past performance is not necessarily indicative of future results.
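For anyone who wants to check the envelope, here is a minimal Python sketch of the same arithmetic. The three constants are the figures quoted above; the function name `years_until` is just a label made up for this example. Capacity has to double about 220.6 times, so with a one-year doubling time the answer lands within a year or two of the number in the title.

```python
import math

# Figures quoted in the post above.
ATOMS_IN_UNIVERSE = 1e79
BITS_ON_DRIVE = 4e12          # 500 GB at 8 bits per byte
DOUBLING_TIME_YEARS = 1.0


def years_until(target_bits, current_bits, doubling_time_years):
    """Years of steady doubling needed to grow current_bits to target_bits."""
    doublings = math.log2(target_bits / current_bits)
    return doublings * doubling_time_years


years = years_until(ATOMS_IN_UNIVERSE, BITS_ON_DRIVE, DOUBLING_TIME_YEARS)
print(f"Doublings needed: {years:.1f}")
```

Changing the doubling time to 18 months just scales the answer by 1.5, which is why the warning above matters more than the exact atom count.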

What about a Google cache on my desk?

Yesterday I said that within a decade disk space should be cheap enough to put the entire visible web on your desk for under $1000. I think that’s actually a pretty conservative estimate, since it assumes a 100 KB average page size, up to an order of magnitude higher than some estimates.

Here’s another back-of-the-envelope: let’s say we wanted the equivalent of Google’s webcache on your desktop (that is, all the HTML but no images). Another way to calculate it starts with the fact that the 2003 update to Berkeley’s How Much Info? study estimated that in 2002 the web was only 167 Terabytes total, with only 30 TB as HTML (69 TB when you include images). Assuming 75% compression, that’s just around 8 TB. That same year, an OCLC study calculated that the total number of web pages was only increasing by about 5% per year (with the number of sites actually shrinking, but the number of pages per site growing). That rate had been decreasing ever since the explosion in the mid ’90s, but let’s assume growth became a steady 5% and will stay at that rate for the next few years. (There are a lot of assumptions going on here, but the nice thing about these kinds of curves is that even if my numbers are off by a factor of two somewhere, so long as disk price keeps halving at the same rate that crossover point only changes by one year.)

Now we’ve got two trends, and just need to find the intersection point for the price we want:

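Size of public web is compressed HTML only and assumes 5% growth/year.

| Year | Price of 1 TB disk | Size of public web | Cost to store |
|------|--------------------|--------------------|---------------|
| 2002 |                    | 8 TB               |               |
| 2003 |                    | 8.4 TB             |               |
| 2004 |                    | 8.8 TB             |               |
| 2005 | $500               | 9.25 TB            | $4,625        |
| 2006 | $250               | 9.7 TB             | $2,425        |
| 2007 | $125               | 10.2 TB            | $1,275        |
| 2008 | $62.50             | 10.7 TB            | $670          |
| 2009 | $31.25             | 11.25 TB           | $350          |
| 2010 | $15.63             | 11.8 TB            | $185          |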

So given a few assumptions, we’ll be able to cache all the raw text on the public web for under $1000 (disk cost) within 3 years!
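The two trends can be replayed in a few lines of Python. This is only a sketch under the assumptions already stated: 8 TB of compressed HTML in 2002 growing 5% per year, and a 1 TB disk costing $500 in 2005 and halving in price every year after. The function and parameter names are invented for the example.

```python
def projection(size_tb_2002=8.0, growth=0.05, price_2005=500.0, last_year=2010):
    """Return (year, price_per_tb, size_tb, cost) rows for 2005..last_year."""
    rows = []
    for year in range(2005, last_year + 1):
        size_tb = size_tb_2002 * (1 + growth) ** (year - 2002)
        price_per_tb = price_2005 / 2 ** (year - 2005)
        rows.append((year, price_per_tb, size_tb, price_per_tb * size_tb))
    return rows


def first_year_under(rows, budget):
    """First year in which the storage cost drops below budget, or None."""
    for year, _price, _size, cost in rows:
        if cost < budget:
            return year
    return None


rows = projection()
for year, price, size, cost in rows:
    print(f"{year}  ${price:>7.2f}/TB  {size:5.2f} TB  ${cost:,.0f}")

crossover = first_year_under(rows, 1000)
crossover_if_web_is_double = first_year_under(projection(size_tb_2002=16.0), 1000)
print(crossover, crossover_if_web_is_double)
```

With these inputs the cost first drops under $1000 in 2008, and doubling the starting size to 16 TB only pushes that to 2009, which is the "off by a factor of two costs you one year" point made earlier.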

When do I get the web in my pocket?

Some time ago I asked how much longer before I can have the Web in my pocket. Let’s try a quick back-of-the-envelope calculation:

A paper from January 2005 calculates the publicly indexable Web (the part easily accessible to search engine web-crawlers) as being around 11.5 billion pages. Estimates on average webpage size seem to be all over the map, but let’s figure around 100 KB per page, for a total of around a petabyte (one million Gig) for today’s indexed web. (I’m assuming text and images, but ignoring other media.)

Disk these days is going for less than 50 cents per Gig, so enough disk to store your own personal Google (and then some) costs around $500,000. With compression you can probably cut that in half. The price of disk is also falling by a factor of two every 12 months, so assuming no major jumps or snags in the disk-price curve, in a little less than a decade we can expect to hold the equivalent of today’s indexed web for less than $1000.
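Here is the same estimate as a small Python sketch. It uses the unrounded 11.5 billion pages rather than the rounded petabyte, so the dollar figures come out a bit higher than the ones in the text. The page size, price per Gig, 2:1 compression, and 12-month halving are the assumptions stated above; the variable and function names are made up for illustration.

```python
import math

PAGES = 11.5e9              # publicly indexable pages (January 2005 estimate)
BYTES_PER_PAGE = 100e3      # assumed average page size, text plus images
PRICE_PER_GB = 0.50         # dollars per Gig of disk today
TARGET_BUDGET = 1000.0      # dollars
HALVING_TIME_YEARS = 1.0    # disk price halves every 12 months


def years_until_affordable(cost_today, budget, halving_time_years):
    """Years of steady price halving before cost_today falls to budget."""
    return math.log2(cost_today / budget) * halving_time_years


web_gb = PAGES * BYTES_PER_PAGE / 1e9
cost_today = web_gb * PRICE_PER_GB
cost_compressed = cost_today / 2      # assumes roughly 2:1 compression

wait_raw = years_until_affordable(cost_today, TARGET_BUDGET, HALVING_TIME_YEARS)
wait_compressed = years_until_affordable(cost_compressed, TARGET_BUDGET, HALVING_TIME_YEARS)

print(f"Web size: {web_gb / 1e6:.2f} million Gig")
print(f"Cost today: ${cost_today:,.0f} (${cost_compressed:,.0f} compressed)")
print(f"Years until under ${TARGET_BUDGET:,.0f}: {wait_raw:.1f} raw, {wait_compressed:.1f} compressed")
```

Either way the wait works out to somewhere between eight and a bit over nine years.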

Now of course, in that time the web will continue to grow, so we may no longer be satisfied with our measly petabyte-on-the-desk, but I figure the amount of human-generated Web content has a much slower growth rate than our disk-space curve. The number of web sites actually shrank between 2001 and 2002, and though it now seems to be growing again there’s only so much content that human beings can create in a day. The real question I have is whether in a decade anyone will see having access to the whole web as being all that interesting — I could easily see the majority of people losing interest in the surface web in favor of personal deep-web niches. The only reason I want the whole web in my pocket is because it’s too hard for me to filter out in advance the 99.99% of the web that’ll never be of interest to me — the closer we get to that kind of pruning, the less disk we need and the higher-quality the experience will be.

Update 8/2/05: doing a different back-of-the-envelope estimate leads to being able to store a compressed-HTML cache (no images) on less than $1000 worth of disk within 3 years…

Microsoft giving grants for Personal Lifetime Storage projects

Microsoft Research has announced a Request for Proposals for projects relating to their Digital Memories (Memex) research kit, in the context of “personal lifetime storage.” Microsoft’s inspiration (and probably the inspiration for everyone else working in this area too, at least indirectly) is Vannevar Bush’s 1945 article As We May Think, in which he famously described a kind of personal library-in-a-desk he called the memex:

Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and to coin one at random, “memex” will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

MSR expects to give 6-9 awards to college and university projects, up to a max of $50,000 per award, and recipients would also be given a SenseCam wearable camera and software from the MyLifeBits, VIBE and Phlat research projects at Microsoft Research. Strings are minimal: they expect semiannual progress reports, want the work presented at one or more of their workshops, and expect the project to be either dedicated to the public domain or released under an open license such as the BSD license.