Archive for the ‘Data Panning’ Category
Export Atlas
A while back I posted about my latest Proce55ing experiment, the Export Atlas.
Here’s a version you can actually play with:

In this screen grab, Mexico has been clicked to highlight it. The green number is GDP in trillions of US dollars. The tinier percentage figures written over the edges indicate the percentage of exports going from the node on the closer side of the edge to the node on the farther side of the edge.
In response to Adam’s suggestion, I made the area of the nodes rather than the radii of the nodes proportional to GDP.
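Here is a minimal sketch of the area-versus-radius sizing described above. It is written in Python rather than Processing, and the function name, the `scale` parameter and the two GDP values are made up for illustration; it is not the code behind the applet.

```python
import math

def radius_for_value(value, scale=1.0):
    """Return the radius of a circle whose area equals scale * value."""
    return math.sqrt(scale * value / math.pi)

# Hypothetical GDP figures in trillions of US dollars, for illustration only.
small, large = 1.0, 4.0
r_small = radius_for_value(small)
r_large = radius_for_value(large)

# Quadrupling the value doubles the radius and quadruples the area.
ratio = r_large / r_small
print(ratio)
```

Sizing the radius directly would make the bigger node look sixteen times larger by area, which is why area is the less misleading choice.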
I still do not know what the distinction in this data set between the EU and its member nations is. This could be some residual bookkeeping from pre-EU transactions, I suppose?
I do realize that GDP doesn’t all go into exports so what would make this better would be to get data on the gross exports and size the nodes to that value instead. Ugh. More homework.
Carpentry
I had a disagreement with a friend K regarding what the word “Carpentry” makes people think of.
My answer:

Fig. 1: Carpentry? Trebuchet.
Her answer:

Fig 2: Carpentry? Jesus.
I felt very strongly that the natural reaction to “carpentry” was the construction of trebuchets. She disagreed with equal ferocity, so I decided to do some research on the subject and settle the matter all scientific and civil-like.
Methodology: An instant messaging poll of random people who happened to be online at the time. Sample size = 14 (including the author and K). My findings:
Question: What does the word “Carpentry” make you think of?

Fig 3: Survey Results
An example transcript from data-gathering phase illustrates the carp-pun phenomenon:
Me: trying to settle a disagreement- what does the word “Carpentry” make you think of?
Co-worker: fish
Co-worker: oh wait
Co-worker: making stuff out of wood
Me: I’ll go with the first answer
Co-worker: Carp-entry
Me: oh fuck, that’s friggin awful
Me: well played
Co-worker: mmm
Co-worker: carp-en-try
Co-worker: now that’s aweful
Me: carp-en-try-catch-finally
Co-worker: carp-en-try-catch-fry
Me: that was reeley bad
This tangential matter aside, I was troubled by two things: 1) the fact that the Jesii (that’s the plural of Jesus) were clearly in the lead, and 2) the tiny sample size.
To augment my first-hand research and hopefully reinforce support for my hypothesis, I did what any modern student of random shit like this does: Googled it.
| Search term | Google search results |
| Carpentry | 6,060,000 |
| Carpentry and Jesus | 272,000 |
| Carpentry and Trebuchet | 526 |
Table 1: Google Being Evil, According to Me
These data points imply that 4.49% of pages that mention carpentry also mention Jesus. This is higher than the 0.01% of carpentry pages that mention trebuchet. The word “carpentry” has a stronger statistical association with “jesus” than it does with “trebuchet”.
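The percentages come straight from the hit counts in Table 1. A quick sketch of the arithmetic (the variable and function names are mine):

```python
# Hit counts copied from Table 1.
carpentry = 6_060_000
with_jesus = 272_000
with_trebuchet = 526

def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

jesus_pct = share(with_jesus, carpentry)
trebuchet_pct = share(with_trebuchet, carpentry)

# Two decimal places, matching the figures quoted in the text.
print(f"Jesus: {jesus_pct:.2f}%  Trebuchet: {trebuchet_pct:.2f}%")
```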
I am in the minority according to my own first-hand and secondary research. I concede to K.
Conclusion: The word “Carpentry” does indeed make people think of Jesus more often than it makes them think of Trebuchets.
President’s Day
Since it’s President’s Day, I decided to do something for the Presidents. Yay, my day off and I did this just for you.
Behold:

My latest datapanning installment shows a readability score for State of the Union Speeches, 1975-2005. I chose to use the Flesch-Kincaid Index. It’s designed to indicate the school grade level a reader needs to have reached in order to understand the text.
Scores above 12 are treated as equivalent to 12, since we don’t have 13th grade (here in The States, anyways).
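For the curious, here is a rough sketch of how a Flesch-Kincaid grade level can be computed. The formula is the standard published one; the syllable counter is a crude vowel-group heuristic I am assuming for illustration, so scores will drift from whatever tool produced the chart. The function names and the `cap` parameter are hypothetical.

```python
import re

def count_syllables(word):
    """Rough heuristic: count runs of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

def capped_grade(text, cap=12.0):
    """Treat anything above the cap as the cap, as in the chart."""
    return min(flesch_kincaid_grade(text), cap)

print(capped_grade("The cat sat. The dog ran."))
```

Short sentences of one-syllable words can score below zero, which is a quirk of the formula rather than a bug in the sketch.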
I only went back to 1975 because a) that’s when I was born, and b) I got sick of copying and pasting text from this web site. Actually, I was too lazy to write a scraper for it. Maybe I’ll do that next. I have some other ideas for this set of text files.
I’d like to note that this post is also evidence that I’ll put up charts even if I don’t like the ideas they might support, or the interpretations I think they might lead to. You decide. Was Clinton inarticulate compared to Bush? In fact the very lowest point in verbal complexity over the past 30 years? Or is it really a concern of the speech writers rather than the speaker? Or is simpler speech a sign of empathy for the listener, which requires more complex abstract thought?
That would put me in a quandary because I think the declining trend of Reagan’s score jibes pretty well with his Iran-Contra “I don’t recall,” Alzheimer’s Alibi.
Then again I don’t know jack shit about text analysis so the Flesch-Kincaid Index might be a load of crap. I’m too lazy to do that kind of research.
A Quantitative Look at My Personal Ad
This graphical timeline illustrates my personal ad activity since April 28, 2003:

(click the image for the full-sized version)
In total, I have sent 50 messages and received 65 (all female, according to the profiles). “Winks” may be sent without charge, but you can’t edit the text of the message. It’s used as a way to let another user know that you are interested, and also that you are a cheap ass.
To produce this image, I first scraped the message information from the personal ad site. I then produced an XML file from this data. From there I wrote a Java application using dom4j to query the data with XPath. This allowed me to generate an SVG file, which I viewed with Adobe’s SVG viewer and screen-grabbed to get the images you see here.
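A stripped-down sketch of the XML-to-XPath-to-SVG step looks something like this. It uses Python's standard library instead of Java and dom4j, and the `<message>` schema, attribute names and layout numbers are all assumptions made up for the example, not the real data file.

```python
import xml.etree.ElementTree as ET

# Hypothetical message log; the real file's structure is not shown in the post.
SAMPLE_XML = """
<messages>
  <message direction="sent" day="0"/>
  <message direction="received" day="2"/>
  <message direction="received" day="5"/>
</messages>
"""

def build_timeline_svg(xml_text, px_per_day=10):
    """Query messages by direction with XPath and draw one dot per message."""
    root = ET.fromstring(xml_text)
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width="200", height="40")
    rows = {"sent": 10, "received": 30}  # y position for each row of dots
    for direction, y in rows.items():
        # ElementTree supports a limited XPath subset, including attribute tests.
        for msg in root.findall(f".//message[@direction='{direction}']"):
            x = int(msg.get("day")) * px_per_day + 5
            ET.SubElement(svg, "circle", cx=str(x), cy=str(y), r="3")
    return ET.tostring(svg, encoding="unicode")

svg_text = build_timeline_svg(SAMPLE_XML)
print(svg_text)
```

Saving `svg_text` to a `.svg` file should give something any modern browser can open, no plugin required.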
If the data points (and the fact that I went through this exercise) don’t make it obvious, I am beyond any help that personal ads may provide.
I’d Make a Shitty Neocon
Last night I tried to think like a neocon. I wanted to assimilate my thinking to the ruling regime- can’t beat ‘em so I tried to join ‘em. Yes, I am desperate for some kind of coping mechanism at this point.
Anyways, after putting on my thinking cap, I came up with what I thought would be a wonderful fiscally conservative, plutocratic and democratically corrosive concept: Modify the electoral system to represent dollars instead of people. (I know it sort of does this already, but this is a much more direct link)
Here’s how my system would work: Instead of dividing up electoral votes by population counts from the census, we would divide them up based on Gross State Product (GSP).
%GSP by State, 2000-2001
The more dollar value your state contributes, the more votes it gets. This would begin to close the economically inefficient gap between dollars and political power, and take us another step forward in the Mission to Privatize Everything (except risk; we still need the poor folks to bail us out if/when we fuck up).
So like the geek that I am, I made a spreadsheet to compare the numbers and do a What-If analysis. Here’s the Excel file, if you’re really bored and want to take a look.
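One way to sketch the dollars-to-votes idea in code: hand out a fixed pool of electoral votes in proportion to each state's share of GSP. The rounding rule here (largest remainder) is my assumption, the function name is hypothetical, and the three states and their GSP figures are placeholders, not real data.

```python
def apportion(weights, total_votes=538):
    """Split total_votes across keys in proportion to weights (largest remainder)."""
    total_weight = sum(weights.values())
    quotas = {k: total_votes * w / total_weight for k, w in weights.items()}
    # Everyone gets the whole-number part of their quota first.
    votes = {k: int(q) for k, q in quotas.items()}
    leftover = total_votes - sum(votes.values())
    # Remaining votes go to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - votes[k], reverse=True)
    for k in by_remainder[:leftover]:
        votes[k] += 1
    return votes

# Placeholder GSP figures for three imaginary states.
gsp = {"State A": 50.0, "State B": 30.0, "State C": 20.0}
votes = apportion(gsp, total_votes=10)
print(votes)
```

A real What-If would also have to decide whether to keep a per-state minimum, which this sketch ignores.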
The results are not what I expected.
It should be a slam dunk for a candidate who pledges allegiance to a bank account, right?
Wrong. If the states were represented proportionally to their GSP instead of their populations, Kerry would have taken 290 electoral votes and Bush would have limped away with 248.
That’s why I’d make a shitty neocon. Maybe I should infiltrate their ranks and poison their intellectual culture with my ineptitude.
Three or One
We had a Halloween party Saturday night. I was dressed as a “Zombeatnik” – an undead bohemian beating bongos for BRAINS. I might post some pictures later.
At some point in the evening (apparently), I decided to take a poll. I think this was after I had passed out the first time, and Carol woke me up. I decided to go on a demographic collecting rampage, since there were no brains left.
The poll question was one that Alex has been asking people for a while. It’s called Three or One, and it goes like this. To men: Would you rather have three balls, or one? And would you rather your woman have three breasts, or one? To women: Would you rather your man have three balls or one? And would you rather have three breasts or one?
And here are the results:

Three or One?
The total sample size was 26, with 16 men and 10 women responding. I did not participate.
I made a few notes in the margins next to some responses:
- (HOT) [I believe this was a really hot woman. should have gotten her name/number. oh well]
- (Tour de France) [this was a guy who argued that having one ball would make him a better cyclist]
- (Internal) [this was a woman who wanted her third breast to be internal, which I guess I allowed. Vance wants the third breast on her back, which is convenient for hugging.]
All in the name of Science.
Visual Display of Quantitative Devastation

Iraq Body Count minimum and maximum estimates as of Saturday, 23rd October 2004.
Data from the Iraq Body Count web site (which I had to run through htmlTidy, transform with some XSLT and then finally mangle into a graph with Excel).
I have some more ideas for additional data points, and the minimum and maximum really should be error bars, not series unto themselves. I’m sure there’s a web site with all this stuff already charted, but it would be interesting to see how public opinion polls correlate with the carnage.
My iTunes Valuation Experiment
As I previously mentioned in my WS4AH post, I’m working on some code that calculates the value of a music collection based on the Amazon list price for each album.
My desktop at home has a much smaller collection of music than my iPod (nearly full @20GB) but I think the results are still interesting.
Because each album may have more than one listing (imports, editions with extra tracks and so on) there are multiple prices per album. So what I came up with was a minimum value and maximum value for the entire music collection. ‘Minimum’ being the sum of all minimum listing prices for each album on Amazon and ‘Maximum’ being the sum of all maximum list prices.
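The min/max idea boils down to two sums. A minimal sketch, assuming the listing prices have already been fetched into a dictionary; the album names and prices below are invented for illustration and the Amazon lookup itself is not shown.

```python
def valuation_range(listings):
    """Return (lowest, highest) collection value from per-album listing prices."""
    lowest = sum(min(prices) for prices in listings.values())
    highest = sum(max(prices) for prices in listings.values())
    return lowest, highest

# Hypothetical listing prices per album, in US dollars.
listings = {
    "Album One": [11.98, 18.98, 24.99],  # e.g. standard, import, deluxe
    "Album Two": [13.98],                # only one listing found
}
low, high = valuation_range(listings)
print(f"lowest: ${low:.2f}  highest: ${high:.2f}")
```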
The numbers for my desktop iTunes collection (901 songs, 3.97 GB):
highest valuation: $1618.76
lowest valuation: $904.38
No wonder all those lawyers are interested in this filesharing thing.
Friendster Hack v3

I figured out why all the nodes in v2 were piled up in the center of the applet and fixed that. I made changes to the base TouchGraph code to use the anti-aliasing and transparency effects available with java.awt.Graphics2D (it was using java.awt.Graphics before that). Then I fixed the scraper code (which is C# and regular expressions) to properly interpret the paginated friend lists for users with more than 40 friends. (For an example: navigate from Sean to Gordon to Sameer. Then go get a cup of coffee ;-) As a result of fixing the pagination problem, the XML file is creeping up on 2 MB with all the new nodes. So I switched over to .zip compression and it’s a much friendlier ~127 KB.
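Repetitive XML squashes down nicely, which is why zipping helps so much here. A small sketch of the idea in Python (the applet itself is Java, and the member name and fake graph data below are made up for the example):

```python
import io
import zipfile

def zip_xml(xml_text, member_name="graph.xml"):
    """Compress an XML string into an in-memory .zip archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, xml_text)
    return buffer.getvalue()

# Fake, highly repetitive graph data standing in for the real friend network.
xml_text = "<graph>" + '<node id="x"/>' * 10000 + "</graph>"
zipped = zip_xml(xml_text)
print(len(xml_text), "bytes raw ->", len(zipped), "bytes zipped")
```

The ratio on real data will depend on how repetitive the node and edge markup is, so do not read anything into the numbers this toy input produces.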
The !@#$%@#$%@!#$ images still don’t load, “but it works from inside my IDE”. I hate hearing that, and I hate saying it even more. I’ll get that figured out soon enough. For the time being, the popup windows are gone, and will be replaced by a static upper-left corner location in which all images appear.
I realized I wasn’t counting some valid situations in my ‘Mackness’ report, such as open-marriage status, so that’s next.