
The tragic hero of ‘2001: A Space Odyssey’

This is something I wrote for April 10 but forgot to schedule for publication. Publishing it now…

Since news of the Cambridge Analytica scandal broke last month, many of us have expressed apprehension – often on Facebook itself – that the social networking platform has transformed since its juvenile beginnings into an ugly monster.

Such moral panic is flawed and we ought to know that by now. After all, it’s been 50 years since 2001: A Space Odyssey was released, and 200 since Frankenstein was published – both cultural assets that have withstood the proverbial test of time only because they managed to strike some deep, mostly unknown chord about the human condition, a note that continues to resonate with the passions of a world that likes to believe it has disrupted the course of history itself.

Gary Greenberg, a mental health professional and author, recently wrote that the similarities between Victor Frankenstein’s monster and Facebook were unmistakable except on one count: the absence of a conscience was a bug in the monster, and remains a feature in Facebook. As a result, he wrote, “an invention whose genius lies in its programmed inability to sort the true from the false, opinion from fact, evil from good … is bound to be a remorseless, lumbering beast, one that does nothing other than … aggregate and distribute, and then to stand back and collect the fees.”

However, it is 2001’s HAL 9000 that continues to be an allegory of choice in many ways, not least because it’s an artificial intelligence the likes of which we’re yet to confront in 2018 but have learnt to constantly anticipate. In the film, HAL serves as the onboard computer for an interplanetary spaceship carrying a crew of astronauts to a point near Jupiter, where a mysterious black monolith of alien origin has been spotted. Only HAL knows the real nature of the mission, which in Kafkaesque fashion is never revealed.

Within the logic-rules-all-until-it-doesn’t narrative canon that science fiction writers have abused for decades, HAL is not remarkable. But take him out into space, make sure he knows more than the humans he’s guiding and give him the ability to physically interfere in people’s lives – and you have not a villain waylaid by complicated Boolean algebra but a reflection of human hubris.

2001 was the cosmic extrapolation of Kubrick’s previous production, the madcap romp Dr Strangelove. While the two films differ significantly in the levels of moroseness on display as humankind confronts a threat to its existence, they’re both meditations on how humanity often leads itself towards disaster while believing it’s fixing itself and the world. In fact, in both films, the threat was weapons of mass destruction (WMDs). Kubrick intended for the Star Child in 2001’s closing scenes to unleash nuclear holocaust on Earth – but he changed his mind later and chose to keep the ending open.

This is where HAL has been able to step in, in our public consciousness, as a caution against our over-optimism towards artificial intelligence and a reminder that WMDs can take different forms. Using the tools and methods of ‘Big Data’ and machine learning, machines have defeated human players at chess and Go, solved problems in computer science and helped diagnose some diseases better. There is a long way to go for HAL-like artificial general intelligence, assuming that is even possible.

But in the meantime, we come across examples every week that these machines are nothing like what popular science fiction has taught us to expect. We have found that their algorithms often inherit the biases of their makers, and that their makers often don’t realise this until the issue is called out – or they do but slip it in anyway.

According to (the modified) Tesler’s theorem, “AI is whatever hasn’t been done yet”. When overlaid with optimism of the Silicon Valley variety, AI in our imagination suddenly becomes able to do what we have never been able to do ourselves, even as we assume humans will remain in control. We forget that for AI to be truly AI, its intelligence should be indistinguishable from that of a human – a.k.a. the Turing test. In this situation, why do we expect AI to behave differently than we do?

We shouldn’t, and this is what HAL teaches us. His iconic descent into madness in 2001 reminds us that AI can go wonderfully right but it’s likelier to go wonderfully wrong, if only because of the outcomes that we are not, and have never been, anticipating as a species. In fact, it has been argued that HAL never went mad but only appeared to do so because human expectations of him were untenable – that 2001 was the story of his tragedy.

This is also what makes 2001 all the more memorable: its refusal to abandon the human perspective – noted for its amusing tendency to be tripped up by human will and agency – even as Kubrick and Arthur C. Clarke looked towards the stars for humankind’s salvation.

In the film’s opening scenes, a bunch of apes briefly interacts with a monolith just like the one near Jupiter and quickly develops the ability to use commonplace objects as tools and weapons. The rest is history, so the story suddenly jumps four million years ahead and then 18 months more. As the Tool song goes, “Silly monkeys, give them thumbs, they make a club and beat their brother down.”

In much the same way, HAL recalls the origins of mainstream AI research in the late 1950s at the Massachusetts Institute of Technology (MIT), Cambridge. At the time, the linguist and not-yet-activist Noam Chomsky had reimagined the inner workings of the human brain as those of a computer (specifically, as a “Language Acquisition Device”). According to the anthropologist Chris Knight, this ‘act’ inspired the cognitive scientist Marvin Minsky to wonder if the mind, in the form of software, could be separated from the body, the hardware.

Minsky would later say, “The most important thing about each person is the data, and the programs in the data that are in the brain”. This is chillingly evocative of what Facebook has achieved in 2018: to paraphrase Greenberg, it has enabled data-driven politics by digitising and monetising “a trove of intimate detail about billions of people”.

Minsky founded the AI Lab at MIT in 1959. Less than a decade later, he joined the production team of 2001 as a consultant to design and execute the character called HAL. As much as we’re fond of celebrating the prophetic power of 2001, perhaps the film was able to herald the 21st century as well as it has because we inherited it from many of the men who shaped the 20th, and Kubrick and Clarke simply mapped their visions onto the stars.

Featured image: HAL 9000. Credit: OpenClipart-Vectors/pixabay.

Hey, is anybody watching Facebook?

The Boston Marathon bombings in April 2013 kicked off a flurry of social media activity that was equal parts well-meaning and counterproductive. Users on Facebook and Twitter shared reports, updates and photos of victims, spending little time verifying them before passing them on to thousands of people.

Others on forums like Reddit and 4chan started to zero in on ‘suspects’ in photos of people seen with backpacks. Despite the distress and disruption these activities caused, social media broadly also served to channel grief and help, and became a notable part of the Boston Marathon bombings story.

In our daily lives, these platforms serve as news forums. With each person connected to hundreds of others, information gets strongly magnified, especially once it crosses a certain threshold. They make it easier for everybody to be news-mongers (not journalists). Add to this the idea that using a social network can just as easily be a social performance, and you realize how the sharing of news can also be part of the performance.

Consider Facebook: unlike Twitter, it enables users to share information in a variety of forms – status updates, questions, polls, videos, galleries, pages, groups, etc. – allowing a news item to retain its many facets, and imposing no character limit on what you have to say about it.

Facebook v. Twitter

So you’d think people who want the best updates on breaking news would go to Facebook – and that’s where you might be wrong. ‘Might’ because, on the one hand, Twitter has a lower response time, keeps news very accessible, encourages a more impersonal social media performance and has a high global reach. These reasons have also made Twitter a favorite among researchers who want to study how information behaves on a social network.

On the other hand, almost 30% of the American general population gets its news from Facebook, with Twitter and YouTube at par at about 10% each, if a Pew Research Center technical report is to be believed. Other surveys have also shown that more people from India are on Facebook than on Twitter. At this point, it’d seem careless to ignore the fact that Facebook has 1.28 billion monthly active users from around the world.

A screenshot of Facebook Graph Search.

Since 2013, Facebook has made it easier for users to find news in its pages. In June that year, it introduced hashtags to let users track news updates across various conversations. In September, it debuted Graph Search, making it easier for people to locate topics they wanted to know more about. Even though the platform’s privacy settings stunt the kind of free propagation of information that’s possible on Twitter (only 28% of Facebook users make any of their content publicly available), Facebook’s sheer volume of updates lifts its fraction of public updates to levels comparable with those of Twitter.

Ponnurangam Kumaraguru and Prateek Dewan, from the Indraprastha Institute of Information Technology, New Delhi (IIIT-D), leveraged this to investigate how Facebook and Twitter compared when sharing information on real-world events. Kumaraguru explained his motivation: “Facebook is so famous, especially in India. It’s much bigger in terms of the number of users. Also, having seen so many studies on Twitter, we were curious to know if the same outcomes as from work done on Twitter would hold for Facebook.”

The duo used the social networks’ respective APIs to query for keywords related to 16 events that occurred during 2013. They explain, “Eight out of the 16 events we selected had more than 100,000 posts on both Facebook and Twitter; six of these eight events saw over 1 million tweets.” Their pre-print paper was submitted to arXiv on May 19.
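To get a feel for what such a collection pipeline involves, here is a minimal sketch of keyword-based querying against the two platforms’ search endpoints of that era. The endpoints, parameters and tokens are illustrative assumptions, not the authors’ actual setup, and both APIs have changed substantially since then.

```python
import requests

# Placeholder credentials -- both platforms require authentication.
FB_TOKEN = "YOUR_FACEBOOK_ACCESS_TOKEN"
TW_TOKEN = "YOUR_TWITTER_BEARER_TOKEN"

def search_facebook(keyword):
    """Search public Facebook posts for a keyword (2013-era Graph API,
    since withdrawn; shown for illustration only)."""
    resp = requests.get(
        "https://graph.facebook.com/search",
        params={"q": keyword, "type": "post", "access_token": FB_TOKEN},
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

def search_twitter(keyword):
    """Search recent tweets for a keyword (Twitter REST API v1.1)."""
    resp = requests.get(
        "https://api.twitter.com/1.1/search/tweets.json",
        params={"q": keyword, "count": 100},
        headers={"Authorization": "Bearer " + TW_TOKEN},
    )
    resp.raise_for_status()
    return resp.json().get("statuses", [])

# One manually chosen keyword set per event, as the paper describes.
for keyword in ["boston marathon", "blast", "explosion"]:
    posts, tweets = search_facebook(keyword), search_twitter(keyword)
    print(keyword, len(posts), len(tweets))
```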

An upper hand

In all, they found that an unprecedented event appeared on Facebook after just 11 minutes, whereas on Twitter, according to a 2014 study from the Association for the Advancement of Artificial Intelligence (AAAI), it took over ten times as long. Specifically, after the Boston Marathon bombings, “the first [relevant] Facebook post occurred just 1 minute 13 seconds after the first blast, which was 2 minutes 44 seconds before the first tweet”.
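The latency comparison itself is simple timestamp arithmetic. A toy reconstruction of the Boston figures (the blast time below is made up; only the two offsets come from the paper):

```python
from datetime import datetime, timedelta

blast = datetime(2013, 4, 15, 14, 49, 0)  # illustrative clock time
first_fb_post = blast + timedelta(minutes=1, seconds=13)
first_tweet = first_fb_post + timedelta(minutes=2, seconds=44)

print(first_fb_post - blast)  # 0:01:13 -> Facebook's lead over the blast
print(first_tweet - blast)    # 0:03:57 -> Twitter trails Facebook by 2:44
```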

However, this order-of-magnitude difference could be an artifact of Kumaraguru’s choice of events: the AAAI study, which claims breaking news broke fastest on Twitter during 29 major events, considered only updates on trending topics (and the first update on Twitter, according to it, appeared after two hours).

The data-mining technique could also have offset the time taken for an event to be detected, because it requires the search keywords to be keyed in manually. Finally, the Facebook API is known to be more rigorous than Twitter’s, whose ability to return older tweets is restricted. On the downside, the output from the Facebook API is limited by users’ privacy settings.

Nevertheless, Kumaraguru’s conclusions paint a picture of Facebook being just as resourceful as Twitter for tracking real-world events – especially in India – leaving news discoverability, rather than a lack of content, to take the blame. Three of the 16 chosen events were completely local to India, and all three saw more activity on Facebook than on Twitter.


Even after the duo corrected for URLs shared on both social networks simultaneously (through clients like Buffer and HootSuite) – 0.6% of the total – Facebook had the upper hand not just in primacy but also origin. According to Kumaraguru and Dewan, “2.5% of all URLs shared on Twitter belonged to the facebook.com domain, but only 0.8% of all URLs shared on Facebook belonged to the twitter.com domain.”
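Checks like these reduce to pulling the domain out of every shared URL and counting matches. A minimal sketch, with made-up URL lists standing in for the collected posts:

```python
from urllib.parse import urlparse

def host_of(url):
    """Return a URL's host, with any leading 'www.' dropped."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def fraction_from(urls, domain):
    """Fraction of URLs in a list that point at the given domain."""
    hosts = [host_of(u) for u in urls]
    return hosts.count(domain) / len(hosts) if hosts else 0.0

tweet_urls = ["https://www.facebook.com/page/posts/1", "https://example.com/a"]
fb_urls = ["https://twitter.com/user/status/1", "https://example.com/a"]

print(fraction_from(tweet_urls, "facebook.com"))  # facebook.com links shared on Twitter
print(fraction_from(fb_urls, "twitter.com"))      # twitter.com links shared on Facebook
```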

Facebook also seemed qualitatively better: spam was present in only five events, whereas on Twitter it was found in 13. This disparity matters for programs built to filter spam from social media timelines in real time – the sort of service journalists will find very useful.

Kumaraguru and Dewan picked out spam based on differences in sentence styles. This way, they avoided missing spam that was stylistically conventional but irrelevant in terms of content. A machine wouldn’t be able to do this just as well, and in real time, unless it was taught to – in much the same way you teach your Google Mail inbox to sort email automatically.
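The Google Mail analogy points at supervised text classification: hand-label a sample of posts, turn each into textual features and train a model to grade the rest in real time. A minimal sketch with scikit-learn – the toy labels and the naive Bayes baseline are my illustration, not the authors’ actual method:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labelled sample; a real study would hand-label thousands of posts.
posts = [
    "WIN a free iPhone, retweet now!!!",                    # spam
    "Follow back for marathon updates, free followers!!!",  # spam
    "Police confirm two explosions near the finish line",   # legitimate
    "Runners are being rerouted away from Copley Square",   # legitimate
]
labels = ["spam", "spam", "ham", "ham"]

# TF-IDF features feeding a naive Bayes classifier: a standard
# spam-filtering baseline that learns stylistic and lexical cues.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(posts, labels)

print(model.predict(["Retweet to win free tickets!!!"]))  # likely ['spam']
```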

Digital information forensics

A screenshot of TweetCred at work. Image: Screenshot of TweetCred Chrome Extension

Patrick Meier, a self-proclaimed – but reasonably so – pioneer in the emerging field of humanitarian technologies, wrote a blog post on April 28 describing a browser extension called TweetCred, which is just this sort of learning machine. Install it and open Twitter in your browser: above each tweet, you will now see a bar that rates the tweet’s credibility out of 7 points, with 7 denoting the most credible.

If you agree with a rating, you can bolster it with a thumbs-up that appears on hover. If you disagree, you can give the rating a thumbs-down and mark what you think is correct. Meier makes it clear that, in its first avatar, the app is geared toward rating disaster/crisis tweets. A paper describing the app was submitted to arXiv on May 21, co-authored by Kumaraguru, Meier, Aditi Gupta (IIIT-D) and Carlos Castillo (Qatar Computing Research Institute).
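The published description of TweetCred is of a feature-based model trained on labelled tweets; the sketch below only gestures at the idea, with hand-picked features and hand-set weights that are entirely my assumptions:

```python
def features(tweet_text, user):
    """A few simple credibility signals of the kind such systems use."""
    return {
        "has_url": int("http" in tweet_text),
        "exclaims": tweet_text.count("!"),
        "hashtags": tweet_text.count("#"),
        "verified": int(user["verified"]),
        "followers": user["followers"],
    }

def credibility(tweet_text, user):
    """Map weighted signals to TweetCred's 1-7 scale (illustrative weights)."""
    f = features(tweet_text, user)
    raw = (3.5
           + 2.0 * f["verified"]
           + 1.0 * f["has_url"]
           + min(f["followers"], 100_000) / 50_000
           - 0.5 * f["exclaims"]
           - 0.3 * f["hashtags"])
    return max(1, min(7, round(raw)))

user = {"verified": True, "followers": 52_000}
print(credibility("Police confirm explosion: http://t.co/x", user))  # -> 7
```

In the feedback loop the extension offers, each thumbs-up or thumbs-down would be folded back in as fresh training data – which is what makes it a learning machine rather than a fixed filter.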

Between the two papers, a common theme is the origin and development of situational awareness. We stick to Twitter for our breaking news because, although it’s conceptually similar to Facebook, it’s fast and, importantly, cuts to the chase, so to speak. In parallel, we’re also aware that Facebook is similarly equipped to reconstruct the details of an event because of its multimedia options and its timeline. Even if Facebook and Twitter the organizations believe they are designed to accomplish different things, the distinction blurs in the event of a real-world crisis.

“Both these networks spread situational awareness, and both do it fairly quickly, as we found in our analysis,” Kumaraguru said. “We’d like to explore the credibility of content on Facebook next.” But as far as establishing a mechanism to study the impact of Facebook and Twitter on the flow of information is concerned, the authors have exposed a facet of Facebook that Facebook, Inc. itself could help leverage.