Data Mining 101

Tom Owad over at Applefritter has a nice simple example of the kind of homebrew data aggregation that’s possible with just a little bit of programming knowledge and a home internet connection.

The thing that scares me about data mining is not that super-secret information about me is revealed: my Amazon wish-list doesn’t contain anything I’d be embarrassed or concerned to have seen by my friends or, for that matter, by 99% of the other people in the world. And odds are good that anyone bothering to look me up by name or go to my website will fall into that category. The trouble is that if I pop up in a trolling-expedition at all, it’s much more likely that the troller is among the 1% of people I would be upset to have reading my wish-list. Ed McMahon doesn’t mine the Internet to pick winners of the Publishers Sweepstakes, but over-zealous FBI agents do look for people promoting the wrong politics, companies look for suckers to blast with seemingly perfect-for-you product announcements, con artists look for rich recently-widowed women above a certain age, and pedophiles look for young latch-key kids with their own webcams.

Schneier on Bush’s illegal wiretaps

From Bruce Schneier’s Crypto-Gram, in a post comparing Bush’s recent (and continuing!) wiretapping to Project Shamrock in the 1960s:

Bush’s eavesdropping program was explicitly anticipated in 1978, and made illegal by FISA. There might not have been fax machines, or e-mail, or the Internet, but the NSA did the exact same thing with telegrams.

We can decide as a society that we need to revisit FISA. We can debate the relative merits of police-state surveillance tactics and counterterrorism. We can discuss the prohibitions against spying on American citizens without a warrant, crossing over that abyss that Church warned us about twenty years ago. But the president can’t simply decide that the law doesn’t apply to him.

This issue is not about terrorism. It’s not about intelligence gathering. It’s about the executive branch of the United States ignoring a law, passed by the legislative branch and signed by President Jimmy Carter: a law that directs the judicial branch to monitor eavesdropping on Americans in national security investigations.

It’s not the spying, it’s the illegality.

Personally, I think it’s the illegality and the spying, but in the name of keeping the debate clear I’m happy to keep the two arguments separate.

Annotated blog corpus to be released at WWE 2006

Intelliseek will be releasing a big corpus of spidered and annotated blog posts to attendees at the 3rd Annual Workshop on the Weblogging Ecosystem (held in conjunction with the WWW 2006 Conference in Edinburgh, Scotland):

The data release comprises a complete set of weblog posts for three weeks in July 2005 (on the order of 10M posts from 1M weblogs). This data set has been selected as it spans a period of time during which an event of global significance occurred, namely the London bombings.

The data set includes the full content of the posts plus mark-up. The marked-up fields include: date of posting, time of posting, author name, title of the post, weblog url, permalink, tags/categories, and outlinks classified by type – details may be found here.

Sounds like a great resource for researchers. I’m also amused (in a dark sort of way) by the datashare individual agreement they require people to sign. Essentially they admit that there’s no way they can get copyright clearance from all million or so bloggers they’ve collected, so they just ask everyone to agree to remove any posts if anyone complains, not to use the results for commercial purposes, and not to use the data past the workshop.

Big Brother Down Under

From the Sydney Morning Herald:

Jane, from Coogee, was surprised to find three police on her bus asking to inspect mobile phones. Each took a phone at random and scrolled through messages for five or ten minutes. Everyone obeyed. “The people were perfectly friendly about it,” she said. “I thought it was a bit weird and a breach of privacy. But I didn’t say anything. Nobody did.”

No, it’s not about terrorism, it’s about potential racial violence, but it’s still that nasty abuse-of-rights-in-the-name-of-safety-from-unknown-boogeymen vibe. Of course, such flagrant violations of our rights without a court order could never happen in the US. In the US, we’d never even know they’d read our text messages without a court order until we read about it in the New York Times.

(Thanks to Omri for the link.)

Bowing to pressure, retailers agree to take Lord’s name in vain

Docbug Exclusive — Faced with a potential boycott from right-wing Christian groups, retailers Target and Lowes have agreed to reinstate their long-standing policy of using Christ’s name for cheap commercial gain. The companies were targeted by the American Family Association because they use the word “holiday” instead of “Christmas” in their advertisements and storewide decorations.

Conservative pundits were quick to call the move a victory for those who recognize Christ as an inherent part of the end-of-year buying season. Spokesmen for both companies say they intended no disrespect, and that they plan to institute policies to ensure that religion will be more prominently exploited in the future.

(Update 12/15/05: fixed typo)
