I was asked back in May to do some blogging for the megablog Huffington Post. It’s been a bit of a strange summer, and I haven’t written much of relevance for a wide audience. But today I finally wrote something (see the facebook post below) and thought: oh, that could get sent over. So I sent it over.
I’m going to post here, because I don’t feel like creating yet another account…
I don’t quite understand the following:
“you might have a good platform to craft a strategy to make people do what you want them to do.”
I agree that the web, blogs, FB, etc. might be providing a very rich medium for data collection, but I don’t see how this links to manipulation?
well, if you can track and monitor past behaviour linked to certain cues, you can model likely responses to certain cues. so add the cue, spur the response.
in a sense this is exactly what advertising is, right? show em a picture of a hot chick and a beer, and they’ll buy your beer. but in traditional advertising, the links between cue and action are very abstract (we showed this ad on hockey night in canada, and we increased sales by 3%).
whereas on the net, and even more in facebook, you can say, “we showed this ad, and 40% of the people with the following characteristics: like radiohead, are christian, attended mcgill … clicked through.” further, we kept following the facebook activity of those people and 5% of them went on to install our “beer it up” facebook app. and looking at that 5%, we see that 20% of them liked “radiohead AND beethoven.”
so that’s the marketing case. it lets you craft your cues much better to spur action by specifically targeted groups.
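to make that concrete, here’s a toy sketch of what segment-level click-through counting might look like. the profile fields, records and numbers are invented, just to show the mechanics, not how any real ad platform does it:

```python
# Hypothetical sketch of segment-level click-through analysis.
# All profile fields and records below are made up for illustration.
from collections import defaultdict

impressions = [
    # (liked_bands, religion, school, clicked)
    ({"radiohead"}, "christian", "mcgill", True),
    ({"radiohead", "beethoven"}, "christian", "mcgill", True),
    ({"nickelback"}, None, "concordia", False),
    ({"radiohead"}, "christian", "mcgill", False),
]

def segment(record):
    """Bucket a user by the traits the marketer cares about."""
    bands, religion, school, _ = record
    return ("radiohead" in bands, religion == "christian", school == "mcgill")

shown = defaultdict(int)
clicked = defaultdict(int)
for rec in impressions:
    key = segment(rec)
    shown[key] += 1
    clicked[key] += rec[3]  # True counts as 1

for key, n in shown.items():
    print(key, f"click-through rate: {clicked[key] / n:.0%}")
```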
now extrapolating it out from beer, to political decisions, for instance, the process tends to happen in the same way. when such-and-such type of person sees ABC kind of news, will they do Y or Z? what happens when they see DEF? in the “analog” marketing study world there is no way to track this kind of stuff with any accuracy. in the digital marketing world, there is. and so you start to know what happens when you show ABC to type 1 (which is what you want) and DEF to type 2 (which is also what you want).
so in a sense this is not “new”; it’s just that the robustness of the data collected here is so much greater, and so theoretically you should be able to do much more with it.
i once heard a political pollster talking about what they do. those phone surveys on politics in fact have all sorts of “random” questions in there (how much TV do you watch, how many siblings do you have, etc.), and the purpose of those other questions is to *test* how good the pollsters’ predictive modeling is. if you answer X to question 5, the pollster wants to *predict* how you answer question 24.
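concretely, the pollster’s trick looks something like this toy sketch. the survey data is invented, and a real pollster would use far fancier models than a lookup table:

```python
# Toy version of the pollster's trick: use the answer to a "random"
# question (Q5) to predict the answer to the one that matters (Q24).
# The survey data is invented purely to show the mechanics.
from collections import Counter, defaultdict

surveys = [
    {"q5": "lots_of_tv", "q24": "candidate_a"},
    {"q5": "lots_of_tv", "q24": "candidate_a"},
    {"q5": "no_tv",      "q24": "candidate_b"},
    {"q5": "no_tv",      "q24": "candidate_a"},
    {"q5": "lots_of_tv", "q24": "candidate_a"},
]

# Build the rule: for each answer to Q5, the most common answer to Q24.
by_q5 = defaultdict(Counter)
for s in surveys:
    by_q5[s["q5"]][s["q24"]] += 1
rule = {q5: counts.most_common(1)[0][0] for q5, counts in by_q5.items()}

# Check how often the rule gets Q24 right on these respondents.
hits = sum(rule[s["q5"]] == s["q24"] for s in surveys)
print(rule)
print(f"predicted Q24 correctly for {hits}/{len(surveys)} respondents")
```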
so the web, especially facebook, are brilliant datasets to model behaviour, and modeling behaviour lets you predict behaviour, which lets you decide how best to get segment A and segment B to do what you want them to do.
or maybe i am just paranoid. but that doesn’t mean the sons of bitches aren’t out to get me.
So yes, marketers use multivariate regression, profiling, segmentation and other sophisticated data-mining techniques to predict how best to sell their products.
Is the practice evil? Arguably.
Has it been going on forever? Absolutely.
As you point out, the level of detail of the dataset is about to increase, maybe by an order of magnitude or so.
I’d keep in mind the following though:
– We still exert a lot of control over the data that flows into the database. Tomorrow I can be female and 27, next week my favorite movies can include Big Momma’s House 2, and so on and so forth. Remember, on the Internet nobody knows you’re a dog. We can use this to our advantage to invalidate the data-mining algorithms. Noise injection is an irrevocable defense at our disposal (a toy sketch of the idea follows this list), at least until online identity is tied to verifiable identity, which may never happen, or until the stakes for maintaining reliable public information become much higher. When it comes down to it, I’m not sure people take their FB profiles very seriously.
– Aggregate behaviour prediction only takes you so far. This is not the same as “controlling the masses” as you might claim. Each of us will continue (forever I hope) to maintain the right to exert free will, and by doing so we can each individually (or collectively if we so choose) act in ways that are statistically impossible to predict. I come from a background of analytical modeling, and so I know the models are actually pretty weak in the face of emergent scenarios: new product launches, elections, new trends – historical data becomes essentially meaningless pretty quickly when the rules of the game are new, or change drastically.
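For what it’s worth, here is a rough sketch of the kind of noise injection I have in mind; the profile fields and decoy values are pure placeholders:

```python
# Illustrative take on "noise injection": periodically randomize the
# profile fields a data miner would key on. The schema and the decoy
# value lists are hypothetical.
import random

DECOY_AGES = range(18, 65)
DECOY_MOVIES = ["Big Momma's House 2", "Citizen Kane", "Casablanca"]
DECOY_GENDERS = ["female", "male", "unspecified"]

def inject_noise(profile):
    """Return a copy of the profile with the key fields randomized."""
    noisy = dict(profile)
    noisy["age"] = random.choice(list(DECOY_AGES))
    noisy["gender"] = random.choice(DECOY_GENDERS)
    noisy["favorite_movie"] = random.choice(DECOY_MOVIES)
    return noisy

real = {"age": 34, "gender": "male", "favorite_movie": "Miller's Crossing"}
print(inject_noise(real))  # e.g. {'age': 27, 'gender': 'female', ...}
```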
Ultimately, I guess I’m not freaked out because I believe I defy prediction — but maybe Google knew I would end the post that way (or this way).
–> or this way.
I don’t think good or evil has much to do with it. As i’ve said lately about a lot of related things: you might as well be anti-glacier. that is, it’s going to happen.
and actually i think it’s an increase of more than one order of magnitude. or rather, it’s an order of magnitude more collected/collectible “static” data (i like radiohead and miller’s crossing), and a whole new sort of “dynamic” data – i mean data about what i do: i click through 10 friends a day, comment on 5 profiles, mention “Bush + suck” in 3, etc.
and then the next level is correlating static info and dynamic events with certain cues.
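roughly, i mean something like this toy sketch: a static profile record sitting next to a stream of dynamic events, joined back to a cue. all the names and events are invented, just to show the shape of the data:

```python
# "Static" profile data vs a stream of "dynamic" events, joined so you can
# ask what a given user did after seeing a given cue. All invented.
static_profiles = {
    "user_1": {"likes": ["radiohead", "miller's crossing"]},
    "user_2": {"likes": ["beethoven"]},
}

dynamic_events = [
    {"user": "user_1", "action": "shown_ad",  "detail": "beer_it_up"},
    {"user": "user_1", "action": "clicked",   "detail": "beer_it_up"},
    {"user": "user_1", "action": "commented", "detail": "friend_profile"},
    {"user": "user_2", "action": "shown_ad",  "detail": "beer_it_up"},
]

def actions_after_cue(user, cue):
    """Everything a user did after the cue first appears in their event stream."""
    events = [e for e in dynamic_events if e["user"] == user]
    for i, e in enumerate(events):
        if e["action"] == "shown_ad" and e["detail"] == cue:
            return events[i + 1:]
    return []

print(static_profiles["user_1"]["likes"], actions_after_cue("user_1", "beer_it_up"))
```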
so that means we should a) realize it will happen, b) think about it, and c) make our decisions accordingly, e.g. about what we do or don’t put into facebook, or on the web.
re: controlling dataflows – those of us who think about it do control our dataflows; many people don’t think about it at all.
re: identity – in facebook, and on blogs, the identity is somewhat fixed by behaviour. even if the data is bad, a fat man pretending to be a skinny dog day after day will do certain things, and it’s the behaviour tracking that’s interesting.
re: FB profiles…there’s a lot of data in there, and no one i’ve seen (tho i have not looked hard) has false data in their FB account.
re: aggregate prediction/chaos … yeah, one flaw in my postulate is that knowing the system works as it does means we’ll react to it in different ways. but, then again, “everyone” “knows” how the media works in the US, and look where that’s put the world. the point is not so much that we’re about to see zombie mind control, but that the level of sophistication of the tools of marketing & manipulation is about to skyrocket. what that means i don’t know, but let’s be aware of it.
re: changing scenarios: yes. point taken …chris hughes (in comment on post below this one) argues (from Penrose) that human minds cannot be modeled with traditional computers, no matter how powerful, because in fact we are not analytical … but that maybe quantum computers could do the job, because they can model many things at once.
re: defying prediction … that reminds me of the discussions of conflict of interest, when people always say, but *I* can’t be subject to conflict of interest, my heart is pure! my morals unshakable … the point is not about how one individual (you or me) reacts, but how the system reacts.
in any case, i just think we need to think a bit more about what’s going to happen when the web really goes semantic, and all the data we put in is much more trackable. couple that with increased bandwidth and processing power, and you’ll have “better” models – maybe frighteningly better. we’ll see. let’s talk again in 10 years.
The overriding point is valid, and I agree with it: We all need to become more aware of how the data we pour into the Google-sphere can (and will) be used against us; or if not against us, at least in ways we cannot currently predict. It’s a disconcerting realization.
However, my statement around defying prediction is not one I make egocentrically, but rather fundamentally. The conflict of interest analogy doesn’t hold. Ethics/morals/interpretations are wholly subjective. You and I may disagree until we’re blue in the face about whether it’s ethical to download an album. However, free will is incontrovertible: you and I will never disagree on each other’s ability to get up and leave at any moment, for instance. Certainly we could probabilistically model the likelihood of one of us getting up and leaving, but I will take to my grave the comfort that I will forever be able to be that single outlier in the tail… Perhaps that is the only thing to be hopeful for?
i think i agree with you, but again, the point of the data/modeling/predictive/forcing argument is not about how any individual will fare in such a world, but what happens in the system as a whole (i.e. in facebook, or among web citizens generally). your outlier status may be a comfort to you, but if indeed you are an outlier, you are easily ignored by the models, so that they can better focus on shaping the behaviour of the fat part of the curve.
so when thinking about the potential impacts this might have, i’m not talking necessarily about how it will affect you or me but rather how it might be used to model and shape more aggregate behaviour.
tho i take your point about the unknowableness of human nature. but then again, propagandists – machiavelli, goebbels, cheney – have always known that if you scare a population sufficiently, you can get them to do what you want in the name of “security.” for a while anyway. but now we/they have much better data to work with, to refine their messages, cues, etc.
so what is the brave new world we are building?
i’m not saying we shouldn’t build it (because we will, no matter what I say), but that we should think about what it’ll mean. if for no other reason than to buy stocks in the right kind of companies.