BIG DATA, Small data, and a little practice on both (Part I)

The morning of Rockstar’s Class Assembly a couple of days ago, traffic was blocked up. I mean, REALLY blocked up – no buses or cars were getting in, and the taxi queue was getting longer by the minute.

Now, if you have ever tried to cut a taxi queue in HK, you would know what a huge deal this is. One of the few things Rockstar says in Cantonese is “Pai Dui” – which means queue for your turn. “Because any (local) playground we’re at, the kids are always going ‘pai dui, pai dui’…” 😀

Yet we made it to school on time that morning ONLY because this long line of work warriors let us cut in. First, at the risk of the wrath of everyone else along the queue, the two guys right at the front, who must’ve already been waiting for ages, were all “No, no, you can have it”. (I have actually encountered that Wrath Of Other Queue-ers when considering letting a mum-with-stroller into a queue in Wan Chai before… lemme tell you, that wrath is REAL (and thankfully misdirected, because she then realised she was in the wrong place to get to Mong Kok). So Gentlemen At Front Of Queue were performing no mean feat letting us in.) Second, the entire queue, which I, in my knee-jerk gratefulness (and apparent death wish), DARED to make eye contact with… turned out to be incredibly supportive. The kid had Class Assembly! TAKE that taxi!

BLESS these people, ok. I hope they get awesome bonuses/ the markets move in their favour/ their own kids come home with awesome report cards/ they win the lottery over the weekend and don’t need to be at that taxi queue on Monday/ all of the above. 



Started reading Big Data (and eventually Small Data), then went on a real galloping tangent…. 

Big Data likes messy. Like the proverbial Impressionist painting, countless “messy” strokes ultimately come together to present a picture, tell a story, of such depth and breadth and colour. As a student I had a love-hate relationship with systems that weighted 3 hour exams too heavily, because while you can hack them, you also have to make sure that by hook or by crook you perform for 3 hours. Life and ability are so much more than 3 hours. (Yet a considerable amount of time and effort is spent preparing to perform in a 3 hour spurt instead of say, a proverbial distance run…) And so we’ve both exploited* and been run over by the proverbial 3 hour essay systems that carry 90% of your grade.

*Kings used to pull historical exam questions and prepare essays word for word – his set answers were still happily circulating years after he’d left; I used to work the early Saturday morning library shifts no one else wanted because then I could see what books the teachers reserved… and skim the restricted ones we couldn’t check out… But the one we learned too late was to look wayy forward: pull the exam time table for next term and pick courses for the semester based on how spaced apart the exams would be at the end. Lemme tell you, 5 essay papers back-to-back every other day vs 1 paper every 5-7 days makes a huge difference to your cramming ability.

Point is, there are a lot of ways one can erm, enhance seeming abilities when there is virtually no way “everything” can be tested. Improvements in technology however, are then supposed to provide the tools for you to be able to do just that – “test everything”. We couldn’t see the alien patterns in the cornfields if we hadn’t invented flight and couldn’t look at them from the air.

This… (pic from theimaginativeconservative.com)

Or some leaves and freaked out Mel Gibson 😀 (pic from theimaginativeconservative.com)

 

Q: What’s a huge data source that tells people all kinds of things about you?





















From The Girl with the Dragon Tattoo – original Lisbeth Salander cosplay pic from deviantart.com

Title character Salander’s appointed guardian exploits his position to exert power over her, whereupon she retaliates by tasing him and tattooing his crime on his chest. To make sure he doesn’t do anything else to her, she then also tracks his browsing history and warns him that if he tries to get the tattoo removed she’s re-tattooing it… on his forehead (HUGE CAVEAT – NOT a kiddie movie)

The image however is powerful: Ever thought about how much Google can tell about us, via our search history? Knowing what a person (really) wants to know tells you a tremendous amount about them. (At the other end of the search function btw, Google’s real search engine criteria are… secret. Simply because if you could hack those criteria in the same way you could a 3 hour exam and show up tops on Google searches, that’s advertising gold that others pay big bucks for.)

Googled data however isn’t just a goldmine for marketers. It can pick up the next H1N1 outbreak much faster than say, the Centers for Disease Control and Prevention or the various other governing bodies trying to contain an outbreak. Google has stuff on you that opinion polls don’t. How? Because when you have a problem you don’t want to tell even your therapist, say you did something bad and it got tattooed on your chest, you’d Google tattoo removal sites first. Someday, Google’s taking over the world and there’s nuthin’ anyone can do about it.

The first thing you do if you think you’re coming down with something is to Google your symptoms. (Google, btw – known to be a savvy investor in Biotech.)

By the time people make it to the doctor, they’ve probably been knocking symptoms about for a couple of days, maybe a week. When that doesn’t work, they give up on OTC or home remedies and go to the clinic. Doc picks up on something, sends samples for testing, the clinic maybe asks for a second opinion or checks the test again if massive panic is likely to ensue (ironically if massive panic is likely to ensue then it’s probably IMPORTANT and time sensitive), and FINALLY reports the case, a.k.a. the cows will be home before the cat is out of the bag about an outbreak. See how much faster a search engine might pick up on that?

Smallpox, for eg, is the one devastating viral disease that appears to have been eradicated off the face of the earth. After a few decades however, we’ve got so many other bugs to worry about, we don’t inoculate for the ones that no longer exist, right? No one should be searching those symptoms anymore, right? So one hypochondriac, two, might go to the doc. The doc might say “you’re paranoid/ a fruitcake/ you actually have chicken pox etc etc” (a friend’s baby somehow came down with chicken pox despite the vaccinations available, while another pregnant friend who apparently already had chicken pox, but in the States when she was growing up, appeared to have contracted a strain in HK that was so different she didn’t have immunity to it.)

Let’s Google search for apocalypses… If a sudden surge in smallpox symptoms being Googled appears, someone might need to trip an alarm bell somewhere.
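For the curious, here is roughly what “tripping an alarm bell” could look like. This is only a back-of-the-envelope sketch in Python and NOT how Google or any health agency actually does it: the function name `find_surges`, the 7-day window, the threshold of 3 and the daily search counts are all made up for illustration. The idea is just to compare each day’s count of searches for a symptom against the average of the week before it, and flag any day that sits way above the usual wobble.

```python
from statistics import mean, stdev


def find_surges(daily_counts, window=7, threshold=3.0):
    """Return the indices of days whose count is far above the trailing baseline.

    Hypothetical helper for illustration only: a day is flagged when it sits
    more than `threshold` standard deviations above the mean of the previous
    `window` days.
    """
    surges = []
    for day in range(window, len(daily_counts)):
        baseline = daily_counts[day - window:day]
        average = mean(baseline)
        # A perfectly flat week has a spread of zero, so floor it at 1.0
        # to avoid dividing by zero.
        spread = max(stdev(baseline), 1.0)
        if (daily_counts[day] - average) / spread > threshold:
            surges.append(day)
    return surges


# Made-up numbers: searches per day for one symptom phrase.
counts = [4, 5, 3, 4, 6, 5, 4, 5, 4, 30, 60]
print(find_surges(counts))
```

On these invented numbers the quiet days pass unnoticed and only the last two days get flagged. A real system would also have to cope with things like news stories sending everyone to search the same thing at once, which this sketch happily ignores.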

Moving on: So all this stuff about you is technically available. Should individual volition trump data, even if statistics argue otherwise? Should we bring in the Thought Police? Do we put people in jail before a crime, if we can “prove beyond a shadow of a doubt,” that they were going to commit it? To some extent people are already convicted on search histories for chemicals to make bombs, etc etc… But crime is alive and well anyway, because the dark net knows how to conceal its searches. Crime, because of the money element, like water, finds any crack and crevice. You need to change gravity conditions. 

Bearing in mind God only really knows whether people committed a crime or not (regardless what they claim, regardless what a jury and everyone else who reads the papers thinks) and we put people away or let ’em go free in some part on how a case is argued in court, isn’t it Same Difference? Like, there’s another apocalypse right. There.

Algorithms that crunch massive amounts of data will predict the likelihood of:

1) heart attacks (very important for insurance companies to know what to charge)

2) mortgage defaults (my personal favourite – no more irresponsible mis-sells)

3) violent crime (arrest and incarcerate before the murder!)

Not too long ago, computers, like aliens, were a novelty, to be written about in sci-fis of the Evil Computer Takes Over The World sort. (Still looking for How To Avoid Being Captured By Aliens, y’all – it includes pearls of wisdom like Not Standing In Cornfields). Then came the internet and smart phones and……… future crime (Your Cellphone Knows What You Are Going To Do Next Summer) and apocalypses (Zombies Have Smartphones Too!).

Yet none of this is new. What we create is limited only by our imagination. Here’s proof we haven’t really evolved: George Orwell’s 1984 (about the government, or “gah-mun” – if you didn’t get that, it wasn’t meant for you – knowing everything you do) was published in 1949, before Singapore existed ;D , and Philip K. Dick’s Minority Report (I like to call this The Thought Police Movie) was published in 1956 and made into a movie only in 2002.

War of the Worlds is my personal favourite: First written in 1897 by Wells, then made into a radio program so realistic in 1938 that it apparently caused mass panic because people thought aliens were really taking over the world…

Then made into this 2005 movie… (pic from Wikipedia)

Tom Cruise and Steven Spielberg like old books for new movies (:))

pic from wikipedia – Sci Fi novel of olde, featuring original version of Tom Cruise’s Minority Report

The Thought Police Movie came decades after the story:

pic from moviesandamic.wordpress.com

In this movie, the technology of the PreCrime department hinges on – not Big Data crunching, but the abilities of psychic children to predict violent crimes before they happen.

Just 2 setbacks: 1) interpreting the visions is open to manipulation (and btw so is virtually ANY data – the one time I enjoyed my accounting degree was in my final year thesis – the effects of financial statement releases on stock market trading) and 2) said children with psychic ability are kept locked up in the lab so nothing like erm, having a life, interferes with their visions.

(Btw, if you ever wondered how much plain ole’ financial statement reporting (a.k.a. “The Facts”) have to do with predicting future market movements, lemme tell you – you might stand an even-to-better chance using the psychic children 😀 )

The senior vanilla stock traders (of which I’ve never been one) used to love to say that if you put a monkey there to hit “buy” and “sell” randomly you couldn’t go tha-at far wrong. That’s because they were secure in the knowledge that their abilities could not be replicated by algorithms or trained simians – see, the best traders and playmakers were often older, with a very good memory for how people in the market had behaved the last time there were similar conditions. (See? Almost psychic 🙂 Last time Greenspan scratched his nose, what did my counterparts do? Last time a massive stop-loss was triggered, how had the other major players on the other side of my trade reacted?)

“The real revolution is not in the machines used to calculate the data, but in how we use the information…” – summary of Big Data, Kim Hartman

We have cooler stuff today, but that also increases the risk that we get so caught up, bogged down in learning about it – so many studies out there, so many gadgets, so many apps, so many programs – that it becomes easier to lose sight of what we employ the stuff for in the first place. (That’s the real way technology takes over the world. By befuddling it 😀 ) You cannot easily improve on Orwell’s 1984 or Wells’ War of the Worlds, but you can make increasingly better entertainment from the same ideas…. First, radio shows. Then, movies. Then – oh look how much fun Sir Anthony Hopkins had with a Transformers movie.

Deep Blue could beat a human because it was chess. The program was up against a brilliant adult human who sat down to play based on specified terms, a database of chess strategies and rules… Any little kid however, could…. cover the chess board with Cocoa Puffs. If there was any one thing that made me inexplicably better at not sure what I’m doing here it was experiencing the incredible, at times illogical, randomness of our two children being children.

Think that’s fluff I “have” to come up with because Imma “parenting blog”? How did Alexander the Great untie the Gordian knot? Whether you buy the version where he took it apart with the single stroke of a sword, or else slid it off its pole pin, he basically poured Cocoa Puffs on the chess problem. “Changed gravity” in order to solve it. No one said that in the real world outside chess you had to stick to the chess rules. But we see things with definite biases. For eg, if you were given a candle, a match and 2 rings, and tasked with attaching the rings together, would you melt wax over the rings, or would you get at the candle wick and use the string to tie the rings?

Living things find all kinds of out-of-the-box ways, all the time. It’s called Evolution.

It was Pablo Picasso who said, “Every child is an artist. The problem is how to remain an artist once we grow up”

….

 

