Friday, January 05, 2007

The dirt on Ting-lan

Ting-lan was thoughtful.

She had solved a problem sideways, the way mathematicians sometimes go at it.

The problem had been to match answers that had no questions to questions that had no answers.

Ting-lan smiled. Papa had been no good. If Mama could see her now. She was about to wonder twice but decided this was not the time for Ting-lan Tao. By Ting-lan Tao Ting-lan meant the reverie.

Ting-lan's solution came from Dirt. Dirt is an online celebrity gossip magazine where fans can submit audio comments to podcasts and spill their beans. Not mung beans but all kinds of confession beans. Tittle beans. Tattle beans. Audio beans.

Dirt is powered by Big In Japan. Big In Japan has an ethos for developers. Ting-lan followed the ethos.

In this ethos Ting-lan became a "social samurai". A social samurai would not dream of clustering answers and questions through their feature values without user input.

In Ting-lan's solution a user/interviewer enters the bestmatch algorithm via The Door and becomes part of the program. Ting-lan built The Door using parts from Big In Japan.

There was a movie once where users entered programs -- Tron. Ting-lan's solution was Tron for interviewers.

Who said survey research wasn't cool? Cooler than Google? Yes, Ting-lan thought, Google is boring.


Wednesday, January 03, 2007

Google is boring

Google is boring. Quintura is not.

I enter "self-organizing map". Along with the usual result set, I get a cloud. In the cloud I see my keyword along with other ones sized according to their frequency.
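Sizing keywords by frequency can be sketched in a few lines. This is only an illustration of the general idea, not Quintura's actual method, which I haven't seen. The function name `cloud_sizes`, the point-size range, and the sample keyword counts are all made up for the example.

```python
from collections import Counter


def cloud_sizes(keywords, min_pt=10, max_pt=32):
    """Map each keyword to a font size that grows linearly with its frequency.

    Hypothetical helper: the linear scaling and the 10-32 pt range are
    assumptions for illustration, not anything Quintura documents.
    """
    counts = Counter(keywords)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: round(min_pt + (n - lo) * (max_pt - min_pt) / span)
        for word, n in counts.items()
    }


# Made-up keyword occurrences pulled from an imaginary result set.
keywords = (
    ["self-organizing map"] * 5
    + ["cluster analysis"] * 3
    + ["neural network"] * 1
)
sizes = cloud_sizes(keywords)
print(sizes)
```

The most frequent keyword gets the largest type and the rarest gets the smallest, which is all a frequency-sized cloud needs.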

If I want to add a keyword to the cloud that is not already there, I just double click:

Now my result set is the product of the two keywords. Pretty cool! It gets better. I hover on "cluster analysis" and preview its keywords:

Finally, I select "cluster analysis". And I get a new cloud and a new result set. They are the products of "self-organizing map" and "cluster analysis":

Not bad. Better than not bad. Not boring...

In a recent post at Read/WriteWeb, Alex Iskold wrote otherwise in "The Race to Beat Google". He was somewhat dismissive of the clustering solutions, saying they were too complex to make it in the mainstream. I don't think we want to underestimate clustering when it is wedded with clouds. Indeed, clouds informed by clustering like the ones above can rain on Google's parade. I hear from kids that they always wanted clouds to do the walking. Imagine a swim in clouds that have three dimensions. Instead of "googling", the euphemism will be "I need to check the weather." The rejoinder from Will Smith's wife in Enemy of the State would be: "And just whose weather are you checking?"


Tuesday, January 02, 2007

Velvet has a tangerine day

Velvet was having a tangerine day. On a tangerine day every interview she touched would flow and glow and she was reminded of that song by Talking Heads which right now she couldn't stop hearing even if she wanted to:

I don't know why I love you like I do
All the troubles you put me through
Sixteen candles there on my wall
And here am I the biggest fool of them all

I wanna know that you'll tell me
I love to stay
Take me to the river and drop me in the water
Dip me in the river, drop me in the water
Washing me down, washing me down.

Yep. It was one of those drop me in the water days.

There had been a lot of those days since the retooling. The retooling was when she sent her laptop back to the company. Then, a few days later, she got back a Wii.

It wasn't really a Nintendo. It was still a laptop. But it was also a Wii.

It was a Wii because now when she dragged the external data sources avatar onto the interview and there were still too many unanswered questions, she could shake and bake.

Shake and bake was a new resource. It used something called "connecting the dots" to turn unanswered questions into answered ones. The reason they called it "shake and bake" instead of "connecting the dots" was easy. You didn't activate the shake and bake plan just by dropping it on the interview and its lost sections. After you dropped the plan, it had to be simmered and cooked. That was where the Wii came in.

Velvet would pick up her laptop and move it from side to side and back and forth until there was a fit and the answers covered all the questions.

The company never really explained the Wii except to say that the laptop was now capable of doing "lateral thinking" but the thinking had to be shaped. Hence shake and bake and Wii. And hence visualization. And hence the interviewer as cook.

It was all very natural. Smile.

Velvet wanted to look under the cover of shake and bake just like she would look under the cover at her boyfriend. But she knew this time she wouldn't understand anything she saw. Even so, Velvet couldn't let sleeping dogs lie. The sleeping dog in this case was a tip -- you know, one of those yellow popups that surfaced under a mouseover.

When Velvet moused over the shake and bake plan she got the tip. The tip said "self-organizing map". Sometimes it simply said "SOM". And sometimes there was the video.

Yep. Take me to the river and drop me in the water. It wasn't better than sex but Wii was it fun.


Monday, January 01, 2007

Technorati Charts

Posts that contain "data mining" per day for the last 30 days.
Technorati Chart

Posts that contain "cluster analysis" per day for the last 30 days.
Technorati Chart

Posts that contain "imputation" per day for the last 30 days.
Technorati Chart

Posts that contain "self organizing map" per day for the last 30 days.
Technorati Chart


Sunday, December 31, 2006

Carl's Skinny on Active Participation

Generally speaking, interviewers aren't told a lot by the company. They say that is so the interviewer won't be a source of bias -- something about interviewers shaping respondents subliminally.

Like if I were to wink my eye at the respondent here and there.

It might fill her head with ideas. I hope no one is reading this -- least of all the wife.

In any event Carl had cornered a statistician on the bus where he proceeded to impersonate a programmer and the two of them kicked around the nearest neighbor hotdeck method. This was how an interview that was refused got its data.

The technical term for this is imputation.
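For the curious, here is a rough sketch of what a nearest neighbor hotdeck might look like. It is not the company's code, which Carl never saw. The field names, the Euclidean distance on unscaled values, and the sample records are all assumptions made up for the example.

```python
import math


def nearest_neighbor_hotdeck(recipient, donors, match_keys, impute_keys):
    """Fill the recipient's missing values from the most similar donor.

    Hypothetical sketch: similarity is plain Euclidean distance over
    match_keys, with no scaling or weighting of the variables.
    """
    def distance(donor):
        return math.sqrt(
            sum((recipient[k] - donor[k]) ** 2 for k in match_keys)
        )

    donor = min(donors, key=distance)
    filled = dict(recipient)  # copy so the original record is left untouched
    for key in impute_keys:
        if filled.get(key) is None:
            filled[key] = donor[key]
    return filled, donor


# Made-up completed interviews that can act as donors.
donors = [
    {"id": "A", "age": 34, "household_size": 2, "income": 41000},
    {"id": "B", "age": 61, "household_size": 1, "income": 28000},
    {"id": "C", "age": 45, "household_size": 4, "income": 56000},
]

# A refused interview: the background fields are known, income is not.
refused = {"id": "R", "age": 44, "household_size": 3, "income": None}

filled, donor = nearest_neighbor_hotdeck(
    refused, donors, ["age", "household_size"], ["income"]
)
print(donor["id"], filled["income"])
```

The refused interview borrows its missing answer from whichever completed interview looks most like it on the fields that are known.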

It turns out imputation is cool except when you are collecting longitudinal data. This is when your heartaches begin.

With longitudinal data it is possible to check hotdeck data against a previous interview from the real respondent.

Some people would have said to let sleeping dogs lie but curiosity got the better of the statistician, and now it is possible to wonder about imputed data.

Enter the heartache and the impersonator and the shaper and Carl. That's me.

Carl was feeling bigger every moment.

That's because active participation was impersonation guided by the shaper.

Carl, the impersonator, would get in the mindset of the respondent and answer questions that were flagged. He would answer the questions to the shaper's liking.

This could happen sooner or it could happen later.

You would think that a bot or semi-sentient like the shaper had all the time in the world. But she didn't. Indeed she went wobbly when the impersonator gave a good answer on the first try. And this made Carl blush.