A different kind of experiment at CERN

This article, as written by me, appeared in The Hindu on January 24, 2012.

At the Large Hadron Collider (LHC) at CERN, near Geneva, Switzerland, experiments are conducted by many scientists who don’t quite know what they will see, but know how to conduct the experiments that will yield answers to their questions. They accelerate beams of particles called protons to smash into each other, and study the fallout.

There are some other scientists at CERN who know approximately what they will see in experiments, but don’t know how to do the experiment itself. These scientists work with beams of antiparticles. According to the Standard Model, the dominant theoretical framework in particle physics, every particle has a corresponding particle with the same mass and opposite charge, called an anti-particle.

In fact, at the little-known AEgIS experiment, physicists will attempt to produce an entire beam composed of not just anti-particles but anti-atoms by mid-2014.

AEgIS is one of six antimatter experiments at CERN that create antiparticles and anti-atoms in the lab and then study their properties using special techniques. The hope, as Dr. Jeffrey Hangst, the spokesperson for the ALPHA experiment, stated in an email, is “to find out the truth: Do matter and antimatter obey the same laws of physics?”

Spectroscopic and gravitational techniques will be used to make these measurements. They will improve upon "precision measurements of antiprotons and anti-electrons" that "have been carried out in the past without seeing any difference between the particles and their antiparticles at very high sensitivity," as Dr. Michael Doser, AEgIS spokesperson, told this Correspondent via email.

The ALPHA and ATRAP experiments will achieve this by trapping anti-atoms and studying them, while ASACUSA and AEgIS will form atomic beams of anti-atoms. All of them, in any case, will continue testing and upgrading through 2013.

Working principle

Specifically, AEgIS will attempt to measure the interaction between gravity and antimatter by shooting an anti-hydrogen beam horizontally through a vacuum tube and then measuring how much it sags due to the gravitational pull of the Earth, to a precision of 1 per cent.

The experiment is not so simple because preparing anti-hydrogen atoms is difficult. As Dr. Doser explained, “The experiments concentrate on anti-hydrogen because that should be the most sensitive system, as it is not much affected by magnetic or electric fields, contrary to charged anti-particles.”

First, antiprotons are derived from the Antiproton Decelerator (AD), a particle storage ring which “manufactures” the antiparticles at a low energy. At another location, a nanoporous plate is bombarded with anti-electrons, resulting in a highly unstable mixture of both electrons and anti-electrons called positronium (Ps).

The Ps is then excited to a specific energy state by exposure to a 205-nanometre laser, and then to an even higher energy state, called a Rydberg level, using a 1,670-nanometre laser. Lastly, the excited Ps traverses a special chamber called a recombination trap, where it mixes with antiprotons that are controlled by precisely tuned magnetic fields. With some probability, an antiproton will "trap" an anti-electron to form an anti-hydrogen atom.


Before a beam of such anti-hydrogen atoms is generated, however, there are problems to be solved. They involve large electric and magnetic fields to control the speed of and collimate the beams, respectively, and powerful cryogenic systems and ultra-cold vacuums. Thus, Dr. Doser and his colleagues will spend many months making careful changes to the apparatus to ensure these requirements work in tandem by 2014.

While antiparticles were first discovered in 1932, "until recently, it was impossible to measure anything about anti-hydrogen," Dr. Hangst wrote. Thus, the ALPHA and AEgIS experiments at CERN provide a seminal setting for exploring the world of antimatter.

Anti-particles have been used effectively in many diagnostic devices such as PET scanners. Consequently, improvements in our understanding of them feed immediately into medicine. To name an application: Antiprotons hold out the potential of treating tumors more effectively.

In fact, the feasibility of this application is being investigated by the ACE experiment at CERN.

In the words of Dr. Doser: “Without the motivation of attempting this experiment, the experts in the corresponding fields would most likely never have collaborated and might well never have been pushed to solve the related interdisciplinary problems.”

Aaron Swartz is dead.

This article, as written by me and a friend, appeared in The Hindu on January 16, 2013.

In July 2011, Aaron Swartz was indicted by the district of Massachusetts for allegedly stealing more than 4.8 million articles from the online academic literature repository JSTOR via the computer network at the Massachusetts Institute of Technology. He was charged with, among other offences, wire fraud, computer fraud, obtaining information from a protected computer, and criminal forfeiture.

After paying a $100,000 bond for release, he was expected to stand trial in early 2013 to face the charges and, if found guilty, a 35-year prison sentence and $1 million in fines. More than the likelihood of the sentence, however, what rankled him most was that he was labelled a "felon" by his government.

On Friday, January 11, Swartz's fight, against information localisation as well as the label given to him, ended when he hanged himself in his New York apartment. He was only 26. At the time of his death, JSTOR did not intend to press charges and had decided to release 4.5 million of its articles into the public domain. It seems as though this crime had no victims.

But, he was so much more than an alleged thief of intellectual property. His life was a perfect snapshot of the American Dream. But the nature of his demise shows that dreams are not always what they seem.

At the age of 14, Swartz became a co-author of the RSS (RDF Site Summary) 1.0 specification, now a widely used method for subscribing to web content. He went on to attend Stanford University, dropped out, founded a popular social news website and then sold it — leaving him a near-millionaire a few days short of his 20th birthday.

A recurring theme in his life and work, however, was internet freedom and public access to information, which led him to political activism. An activist organisation he founded campaigned heavily against the Stop Online Piracy Act (SOPA) bill, and eventually helped kill it. If passed, SOPA would have affected much of the world's browsing.

At a time that is rife with talk of American decline, Swartz’s life reminds us that for now, the United States still remains the most innovative society on Earth, while his death tells us that it is also a place where envelope pushers discover, sometimes too late, that the line between what is acceptable and what is not is very thin.

The charges that he faced, in the last two years before his death, highlight the misunderstood nature of digital activism — an issue that has lessons for India. For instance, with Section 66A of the Indian IT Act in place, there is little chance of organising an online protest and blackout on par with the one that took place over the SOPA bill.

While civil disobedience and street protests usually carry light penalties, why should Swartz have faced long-term incarceration just because he used a computer instead? In an age of Twitter protests and online blackouts, his death sheds light on the disparities that digital activism is subjected to.

His act of trying to liberate millions of scholarly articles was undoubtedly political activism. But had he undertaken such an act in the physical world, he would have faced only light penalties for trespassing as part of a political protest. One could even argue that MIT encouraged such free exchange of information — it is no secret that its campus network has long been extraordinarily open with minimal security.

What then was the point of the public prosecutors highlighting his intent to profit from stolen property worth “millions of dollars” when Swartz’s only aim was to make them public as a statement on the problems facing the academic publishing industry? After all, any academic would tell you that there is no way to profit off a hoard of scientific literature unless you dammed the flow and then released it per payment.

In fact, JSTOR's decision to not press charges against him came only after they had reclaimed their "stolen" articles — even though Laura Brown, the managing director of JSTOR, had announced in September 2011 that journal content from 500,000 articles would be released for free public viewing and download. In the meantime, Swartz was made to face 13 charges anyway.

Assuming the charges are reasonable at all, his demise will then mean that the gap between those who hold onto information and those who would use it is spanned only by what the government thinks is criminal. That the hammer fell so heavily on someone who tried to bridge this gap is tragic. Worse, long-drawn, expensive court cases are becoming roadblocks on the path towards change, especially when they involve prosecutors incapable of judging the difference between innovation and damage on the digital frontier. It doesn’t help that it also neatly avoids the aura of illegitimacy that imprisoning peaceful activists would have for any government.

Today, Aaron Swartz is dead. All that it took to push a brilliant mind over the edge was a case threatening to wipe out his fortunes and ruin the rest of his life. In the words of Lawrence Lessig, American academic activist, and his former mentor at the Harvard University Edmond J. Safra Centre for Ethics: “Somehow, we need to get beyond the ‘I’m right so I’m right to nuke you’ ethics of our time. That begins with one word: Shame.”

LHC to re-awaken in 2015 with doubled energy, luminosity

This article, as written by me, appeared in The Hindu on January 10, 2013.

After a successful three-year run that saw the discovery of a Higgs-boson-like particle in mid-2012, the Large Hadron Collider (LHC) at CERN, near Geneva, Switzerland, will shut down for 18 months for maintenance and upgrades.

This is the first of three long shutdowns, scheduled for 2013, 2017, and 2022. Physicists and engineers will use these breaks to ramp up one of the most sophisticated experiments in history even further.

According to Mirko Pojer, Engineer In-charge, LHC-operations, most of these changes were planned in 2011. They will largely concern fixing known glitches on the ATLAS and CMS particle-detectors. The collider will receive upgrades to increase its collision energy and frequency.

Presently, the LHC smashes two beams, each composed of precisely spaced bunches of protons, at 3.5-4 tera-electron-volts (TeV) per beam.

By 2015, the beam energy will be pushed up to 6.5-7 TeV per beam. Moreover, bunches that were previously smashed together at intervals of 50 nanoseconds will collide every 25 nanoseconds.

After the upgrades, "in terms of performance, the LHC will deliver twice the luminosity," Dr. Pojer noted in an email to this Correspondent, with reference to the integrated luminosity: the number of collisions the LHC can deliver per unit area, which the detectors can track.

The instantaneous luminosity, which is the luminosity per second, will be increased to 1×10^34 per centimetre-squared per second, ten times greater than before, and well on its way to peaking at 7.73×10^34 per centimetre-squared per second by 2022.
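For a sense of scale, here is a rough back-of-the-envelope sketch in Python. The 10^7 seconds of effective running time per year is my assumption (a commonly used round figure in accelerator physics), not a number from the article:

```python
# Instantaneous luminosity after the upgrades, in cm^-2 s^-1 (from the article).
inst_lumi = 1e34

# Assumed effective running time per year: ~10^7 seconds (my assumption).
seconds_per_year = 1e7

# Integrated luminosity accumulated over a year of running, in cm^-2.
integrated = inst_lumi * seconds_per_year

# Physicists quote this in inverse femtobarns: 1 fb^-1 = 1e39 cm^-2.
in_inverse_femtobarns = integrated / 1e39
print(in_inverse_femtobarns)  # roughly 100 fb^-1 in a year
```

Doubling the instantaneous luminosity doubles this tally, which is what "twice the luminosity" cashes out to in collisions recorded.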

As Steve Myers, CERN’s Director for Accelerators and Technology, announced in December 2012, “More intense beams mean more collisions and a better chance of observing rare phenomena.” One such phenomenon is the appearance of a Higgs-boson-like particle.

The CMS experiment, one of the detectors on the LHC ring, will receive some new pixel sensors, the technology responsible for tracking the paths of colliding particles. To make use of the impending new luminosity regime, an extra layer of these advanced sensors will be inserted around a smaller beam pipe.

If this pilot insertion performs well, CMS will receive the full unit in late 2016.

In the ATLAS experiment, unlike in CMS, which was built with greater luminosities in mind, the pixel sensors are expected to wear out within one year of the upgrades. As an intermediate solution, a new layer of sensors called the B-layer will be inserted within the detector to last until 2018.

Because of the risk of radiation damage due to more numerous collisions, specific neutron shields will be fitted, according to Phil Allport, ATLAS Upgrade Coordinator.

Both ATLAS and CMS will also receive evaporative cooling systems and new superconducting cables to accommodate the higher performance that will be expected of them in 2015. The other experiments, LHCb and ALICE, will also undergo inspections and upgrades to cope with higher luminosity.

An improved failsafe system will be installed and the existing one upgraded to prevent accidents such as the one in 2008.

In that incident, an electrical failure damaged 29 magnets and leaked six tonnes of liquid helium into the tunnel, precipitating an eight-month shutdown.

Generally, as Martin Gastal, CMS Experimental Area Manager, explained via email, “All sub-systems will take the opportunity of this shutdown to replace failing parts and increase performance when possible.”

All these changes have been optimised to fulfil the LHC’s future agenda. This includes studying the properties of the newly discovered particle, and looking for signs of new theories of physics like supersymmetry and higher dimensions.

(Special thanks to Achintya Rao, CMS Experiment.)

There's something wrong with this universe.

I've gone on about natural philosophy, the philosophy of representation, science history, and the importance of interdisciplinary perspectives when studying modern science. There's something that unifies all these ideas, and I wouldn't have thought of it at all had I not spoken to the renowned physicist Dr. George Sterman on January 3.

I was attending the Institute of Mathematical Sciences’ golden jubilee celebrations. A lot of my heroes were there, and believe me when I say my heroes are different from your heroes. I look up to people who are capable of thinking metaphysically, and physicists more than anyone I’ve come to meet are very insightful in that area.

One such physicist is Dr. Ashoke Sen, whose contributions to the controversial area of string theory are nothing short of seminal – if only for how differently it says we can think about our universe and what the math of that would look like. Especially, Sen’s research into tachyon condensation and the phases of string theory is something I’ve been interested in for a while now.

Knowing that George Sterman was around came as a pleasant surprise. Sterman was Sen’s doctoral guide; while Sen’s a string theorist now, his doctoral thesis was in quantum chromodynamics, a field in which the name of Sterman is quite well-known.


When I finally got a chance to speak with Sterman, it was about 5 pm and there were a lot of mosquitoes around. We sat down in the middle of the lawn on a couple of old chairs, and with a perpetual smile on his face that made one of the greatest thinkers of our time look like a kid in a candy store, Sterman jumped right into answering my first question on what he felt about the discovery of a Higgs-like boson.

Where Sheldon Stone was obstinately practical, Sterman was courageously aesthetic. After the (now usual) bit about how the discovery of the boson was a tribute to mathematics and its ability to defy 50 years of staggering theoretical advancements by remaining so consistent, he said, “But let’s think about naturalness for a second…”

The moment he said "naturalness", I knew what he was getting at, but more than anything else, I was glad. Here was a physicist who was still looking at things aesthetically, especially in an era where a lack of money, and by extension a loss of practicality, could really put the brakes on scientific discovery. I mean, it's easy to jump up and down and be excited about having spotted the Higgs, but there are very few who feel free to still not be happy.

In Sterman’s words, uttered while waving his arms about to swat away the swarming mosquitoes while discussing supersymmetry:

There's a reason why so many people felt so confident about supersymmetry. It wasn't just that it's a beautiful theory – which it is – or that it engages and challenges the most mathematically oriented among physicists, but another sense in which it appeared to be necessary. There's this subtle concept that goes by the name of naturalness. Naturalness as it appears in the Standard Model says that if we gave any reasonable estimate of what the mass of the Higgs particle should be, it should by all rights be huge! It should be as heavy as what we call the Planck mass [~10^19 GeV].

Or, as Martinus Veltman put it in an interview to Matthew Chalmers for Nature,

Since the energy of the Higgs is distributed all over the universe, it should contribute to the curvature of space; if you do the calculation, the universe would have to curve to the size of a football.

Naturalness is the idea, in particle physics specifically and in nature generally, that things don't desire to stand out in any way unless something's really messed up. For instance, consider the mass hierarchy problem in physics: Why is the gravitational force so much weaker than the electroweak force? If both are fundamental forces of nature, then where is the massive imbalance coming from?

Formulaically speaking, naturalness is represented by this equation:

h = c\Lambda^{4 - d}

Here, lambda (the mountain) is the cut-off scale, an energy scale at which the theory breaks down, and d is the number of dimensions it acts on – with a maximum of 4. That exponent determines lambda's influence over the naturalness of an entity h. Last, c is a helpful scaling constant that keeps lambda from being too weak or too strong in a given setting.

In other words, a natural constant h must be comparable to other natural constants like it if they're all acting in the same setting.

However, given how the electroweak and gravitational forces – which do act in the same setting (also known as our universe) – differ so tremendously in strength, the values of these constants are, to put it bluntly, coincidental.
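To put a number on that mismatch: taking the Planck mass from Sterman's quote, ~10^19 GeV, against the electroweak scale of roughly 10^2 GeV (the latter figure is my addition, a standard ballpark, not something stated above), the two scales sit some seventeen orders of magnitude apart:

```latex
\frac{M_{\text{Planck}}}{M_{\text{electroweak}}} \sim \frac{10^{19}\ \mathrm{GeV}}{10^{2}\ \mathrm{GeV}} = 10^{17}
```

It is this gulf that a "reasonable estimate" of the Higgs mass is expected to respect, and doesn't.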

Problems such as this “violate” naturalness in a way that defies the phenomenological aesthetic of physics. Yes, I’m aware this sounds like hot air but bear with me. In a universe that contains one stupendously weak force and one stupendously strong force, one theory that’s capable of describing both forces would possess two disturbing characteristics:

1. It would be capable of angering one William of Ockham

2. It would require a dirty trick called fine-tuning

I’ll let you tackle the theories of that William of Ockham and go right on to fine-tuning. In an episode of ‘The Big Bang Theory’, Dr. Sheldon Cooper drinks coffee for what seems like the first time in his life and goes berserk. One of the things he touches upon in a caffeine-induced rant is a notion related to the anthropic principle.

The anthropic principle states that it's not odd that the values of the fundamental constants seem to engender the evolution of life and physical consciousness, because if those values weren't what they are, then a consciousness wouldn't be around to observe them. Starting with the development of the Standard Model of particle physics in the 1960s, it's become known that these constants are really finely balanced in their values.

So, with the anthropic principle providing a philosophical cushioning, like some intellectual fodder to fall back on when thoughts run low, physicists set about trying to find out why the values are what they are. As the Standard Model predicted more particles – with annoying precision – physicists also realised that given the physical environment, the universe would’ve been drastically different even if the values were slightly off.

Now, as discoveries poured in and it became clear that the universe housed two drastically different forces in terms of their strength, researchers felt the need to fine-tune the values of the constants to fit experimental observations. This sometimes necessitated tweaking the constants in such a way that they’d support the coexistence of the gravitational and electroweak forces!

Scientifically speaking, this just sounds pragmatic. But just think aesthetically and you start to see why this practice smells bad: The universe is explicable only if you make extremely small changes to certain numbers, changes you wouldn’t have made if the universe wasn’t concealing something about why there was one malnourished kid and one obese kid.

Doesn’t the asymmetry bother you?

Put another way, as physicist Paul Davies did,

There is now broad agreement among physicists and cosmologists that the Universe is in several respects ‘fine-tuned’ for life. The conclusion is not so much that the Universe is fine-tuned for life; rather it is fine-tuned for the building blocks and environments that life requires.

(On a lighter note: If the universe includes both a plausible anthropic principle and a Paul Davies who is a physicist and is right, then multiple universes are a possibility. I’ll let you work this one out.)

Compare all of this to the desirable idea of naturalness and what Sterman was getting at and you’d see that the world around us isn’t natural in any sense. It’s made up of particles whose properties we’re sure of, of objects whose behaviour we’re sure of, but also of forces whose origins indicate an amount of unnaturalness… as if something outside this universe poked a finger in, stirred up the particulate pea-soup, and left before anyone evolved enough to get a look.

(This blog post first appeared at The Copernican on January 6, 2013.)

The case of the red-haired kids

This blog post first appeared, as written by me, on The Copernican science blog on December 30, 2012.

Seriously, shame on me for not noticing the release of a product named Correlate until December 2012. Correlate by Google was released in May last year and is a tool to see how two different search trends have panned out over a period of time. But instead of letting you pick out searches and compare them, Correlate saves a bit of time by letting you choose one trend and then automatically picks out trends similar to the one you’ve your eye on.

For instance, I used the “Draw” option and drew a straight, gently climbing line from September 19, 2004, to July 24, 2011 (both randomly selected). Next, I chose “India” as the source of search queries for this line to be compared with, and hit “Correlate”. Voila! Google threw up 10 search trends that varied over time just as my line had.


Since I’ve picked only India, the space from which the queries originate remains fixed, making this a temporal trend – a time-based one. If I’d fixed the time – like a particular day, something short enough to not produce strong variations – then it’d have been a spatial trend, something plottable on a map.

Now, there were a lot of numbers on the results page. The 10 trends displayed in fact were ranked according to a particular number “r” displayed against them. The highest ranked result, “free english songs”, had r = 0.7962. The lowest ranked result, “to 3gp converter”, had r = 0.7653.


And as I moused over the chart itself, I saw two numbers, one each against the two trends being tracked. For example, on March 1, 2009, the “Drawn Series” line had a number +0.701, and the “free english songs” line had a number -0.008, against it.


What do these numbers mean?

This is what I want to really discuss, because these numbers have strong implications for how lay people interpret data that appears in the context of some scientific text, like a published paper. Each of these numbers is associated with a particular behaviour of some trend at a specific point. So, instead of looking at it as numbers and shapes on a piece of paper, look at it for what it represents and you'll see so many possibilities coming to life.

The numbers against the trends, +0.701 for “Drawn Series” (my line) and -0.008 for “free english songs” in March ‘09, are the deviations. The deviation is a lovely metric because it sort of presents the local picture in comparison to the global picture, and this perspective is made possible by the simple technique used to evaluate it.

Consider my line. Each of the points on the line has a certain value. Use this information to find their average value. Now, the deviation is how much a point’s value is away from the average value.
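That recipe is short enough to sketch in Python (the trend values below are made up for illustration; they aren't Correlate's actual numbers):

```python
# Hypothetical values for the points on the "Drawn Series" line.
values = [2, 5, 9, 14, 20]

# The average value of the points on the line.
average = sum(values) / len(values)

# Each point's deviation: how far its value sits from the average.
deviations = [v - average for v in values]

print(average)     # 10.0
print(deviations)  # [-8.0, -5.0, -1.0, 4.0, 10.0]
```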

It's like if 11 red-haired kids were made to stand in a line ordered according to the redness of their hair. If the "average" colour around was a perfect orange, then the kid with the "reddest" hair and the kid with the palest-red hair will be the most deviating. Kids with some semblance of orange in their hair-colour will be progressively less deviating until they're past the perfect "orangeness", and the kid with perfectly-orange hair will be completely non-deviating.

So, on March 1, 2009, "Drawn Series" was higher than its average value by 0.701 and "free english songs" was lower than its average value by 0.008. Now, if you're wondering what the units are to measure these numbers: Deviations are dimensionless fractions – which means they're just numbers whose highness or lowness are indications of intensity.

And what’re they fractions of? The value being measured along the trend being tracked.

Now, enter standard deviation. Remember how you found the average value of a point on my line? Well, the standard deviation is the typical size of a deviation – strictly, the root mean square of all the deviations. It's like saying the children fitting a particular demographic are, for instance, 25 per cent smarter on average than other normal kids: the standard deviation is 25 per cent and the individual deviations are similar percentages of the "smartness" being measured.
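Continuing with the same made-up numbers, the standard deviation falls out in a couple of lines:

```python
# The same hypothetical trend values as before.
values = [2, 5, 9, 14, 20]
average = sum(values) / len(values)
deviations = [v - average for v in values]

# Standard deviation: the root mean square of the individual deviations.
variance = sum(d ** 2 for d in deviations) / len(deviations)
std_dev = variance ** 0.5

print(round(std_dev, 2))  # 6.42
```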

So, right now, if you took the bigger picture, you’d see the chart, the standard deviation (the individual deviations if you chose to mouse-over), the average, and that number “r”. The average will indicate the characteristic behaviour of the trend – let’s call it “orange” – the standard deviation will indicate how far off on average a point’s behaviour will be deviating in comparison to “orange” – say, “barely orange”, “bloody”, etc. – and the individual deviations will show how “orange” each point really is.

At this point I must mention that I conveniently oversimplified the example of the red-haired kids to avoid a specific problem. This problem has been quite single-handedly responsible for the news-media wrongly interpreting results from the LHC/CERN on the Higgs search.

In the case of the kids, we assumed that, going down the line, each kid’s hair would get progressively darker. What I left out was how much darker the hair would get with each step.

Let’s look at two different scenarios.

Scenario 1: The hair gets darker by a fixed amount each step.

Let's say the first kid's got hair that's 1 unit of orange, the fifth kid's got 5 units, and the 11th kid's got 11 units. This way, the average "amount of orange" in the lineup is going to be 6 units. The deviation on either side of kid #6 is going to increase/decrease in steps of 1. In fact, from the first to the last, it's going to be 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, and 5. Straight down and then straight up.


Scenario 2: The hair gets darker slowly and then rapidly, also from 1 to 11 units.

In this case, the average is not going to be 6 units. Let's say the "orangeness" this time is 1, 1.5, 2, 2.5, 3, 3.5, 4, 5.5, 7.5, 9.75, and 11 per kid, which brings the average to ~4.6591 units. In turn, the deviations are 3.6591, 3.1591, 2.6591, 2.1591, 1.6591, 1.1591, 0.6591, 0.8409, 2.8409, 5.0909, and 6.3409. In other words, slowly down and then quickly more up.
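Those figures are easy to verify; a quick check using the per-kid orangeness values from the scenario:

```python
# Orangeness per kid, from scenario 2.
orangeness = [1, 1.5, 2, 2.5, 3, 3.5, 4, 5.5, 7.5, 9.75, 11]

average = sum(orangeness) / len(orangeness)
deviations = [abs(o - average) for o in orangeness]

print(round(average, 4))  # 4.6591
print([round(d, 4) for d in deviations])
```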


In the second scenario, we saw how the average got shifted to the left. This is because there were more less-orange kids than more-orange ones. What’s more important is that it didn’t matter if the kids on the right had more more-orange hair than before. That they were fewer in number shifted the weight of the argument away from them!

In much the same way, looking for the Higgs boson from a chart that shows different peaks (number of signature decay events) at different points (energy levels), with taller but fewer peaks to one side and shorter but many more peaks to the other, can be confusing. While more decays could've occurred at discrete energy levels, the Higgs boson is more likely (note: not definitely) to be found within the energy-level where decays occur more frequently (in the chart below, decays are seen to occur more frequently at 118-126 GeV/c² than at 128-138 GeV/c² or 110-117 GeV/c²).

Idea from Prof. Matt Strassler’s blog

If there’s a tall peak where a Higgs isn’t likely to occur, then that’s an outlier, a weirdo who doesn’t fit into the data. It’s probably called an outlier because its deviation from the average could be well outside the permissible deviation from the average.

This also means it’s necessary to pick the average from the right area to identify the right outliers. In the case of the Higgs, if its associated energy-level (mass) is calculated as being an average of all the energy levels at which a decay occurs, then freak occurrences and statistical noise are going to interfere with the calculation. But knowing that some masses of the particle have been eliminated, we can constrain the data to between two energy levels, and then go after the average.
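Here is a toy version of that constraining step. The decay energies are entirely made up for illustration; they are not LHC data:

```python
# Hypothetical energy levels (GeV/c^2) at which candidate decays were recorded.
decay_energies = [110, 118, 119, 121, 124, 124, 125, 125, 126, 137, 145]

# A naive average over everything lets freak events and noise drag it around.
naive_average = sum(decay_energies) / len(decay_energies)

# Constrain the data to the window where decays bunch up, then average.
window = [e for e in decay_energies if 115 <= e <= 130]
constrained_average = sum(window) / len(window)

print(round(naive_average, 2))        # 124.91
print(round(constrained_average, 2))  # 122.75
```

The outliers at 110, 137 and 145 pull the naive average around; restricting to the crowded window gives a steadier estimate.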

So, when an uninformed journalist looks at the data, the taller peaks can catch the eye, even run away with the ball. But look out for the more closely occurring bunches – that’s where all the action is!

If you notice, you’ll also see that there are no events at some energy levels. This is where you should remember that uncertainty cuts both ways. When you’re looking at a peak and thinking “This can’t be it; there’s some frequency of decays to the bottom, too”, you’re acknowledging some uncertainty in your perspective. Why not acknowledge some uncertainty when you’re noticing absent data, too?

While there's a peak at 126 GeV/c², the Higgs weighs between 124-125 GeV/c². We know this now, so when we look at the chart, we know we were right in having been uncertain about the mass of the Higgs being 126 GeV/c². Similarly, why not say "There are no decays at 113 GeV/c², but let me be uncertain and say there could've been a decay there that's escaped this measurement"?

Maybe this idea’s better illustrated with this chart.


There's a noticeable gap between 123 and 125 GeV/c². Just looking at this chart, you're going to think that with peaks on either side of this valley, the Higgs isn't going to be here… but that's just where it is! So, make sure you address uncertainty when you're determining presences as well as absences.

So, now, we're finally ready to address "r", the Pearson correlation coefficient. It's got a formula, and I think you should see it. It's pretty neat.

r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}}

The equation says "Let's see what your Pearson correlation, 'r', is by seeing how much your deviations move together, keeping in mind both your standard deviations."

The numerator is what's called the covariance, and the denominator is basically the product of the standard deviations. X-bar, which is X with a bar atop, is the average value of X – my line – and the same goes for Y-bar, corresponding to Y – "mobile games". Individual points on the lines are denoted with the subscript "i", so the points would be X1, X2, X3, …, and Y1, Y2, Y3, … And "n" in the formula is the size of the sample – the number of days over which we're comparing the two trends.
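The formula translates almost line for line into Python; a minimal sketch (the two series are invented, standing in for the "Drawn Series" and a correlated query):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    x_bar = sum(xs) / n  # X-bar: the average of the X series
    y_bar = sum(ys) / n  # Y-bar: the average of the Y series
    # Numerator: the covariance, summed over all points i = 1..n.
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: the product of the two root-sum-of-squares terms.
    sx = sum((x - x_bar) ** 2 for x in xs) ** 0.5
    sy = sum((y - y_bar) ** 2 for y in ys) ** 0.5
    return covariance / (sx * sy)

# Two made-up trends that climb together in lockstep.
drawn_series = [1, 2, 3, 4, 5]
other_query = [2, 4, 6, 8, 10]
print(pearson_r(drawn_series, other_query))  # very close to 1.0
```

Flip the second series around and the same function returns a value near -1.0, one trend's climbing being the other's descending.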

The Pearson coefficient is called a correlation coefficient, rather than a covariance or deviation coefficient, because it normalises the graph's covariance. Simply put, covariance is a measure of how much the two trends vary together. Once normalised, "r" can have a magnitude as low as 0, which would mean one trend's variation has nothing to do with the other's, and as high as 1, which would mean one trend's variation is inescapably tied with the variation of the other's. Similarly, if "r" is positive, it means that if one trend climbs, the other would climb, too. If "r" is negative, then one trend's climbing would mean the other's descending (In the chart below, between Oct '09 and Jan '10, there's a dip: even during the dive-down, the blue line is on an increasing note – here, the local correlation will be negative).


Apart from being a conveniently defined number, covariance also records a trend’s linearity. In statistics, linearity is a notion that stands by its name: like a straight line, the rise or fall of a trend is uniform. If you divided up the line into thousands of tiny bits and called each one on the right the “cause” and the one on the left the “effect”, then you’d see that linearity means each effect for each cause is either an increase or a decrease by the same amount.

Just like that, if the covariance is a lower positive number, it means one trend's growth is accompanied by the other trend's growth, and in equal measure. If the covariance is a larger positive number, you'd have something like the butterfly effect: one trend moves up by an inch, the other shoots up by a mile. This, you'll notice, is a break from linearity. So if you plotted the covariance at each point as a chart by itself, one look will tell you how the relationship between the two trends varies over time (or space).
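One way to see that time-varying relationship is to correlate the two trends inside a sliding window; a sketch, with invented trends that rise together at first and then diverge:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sx = sum((x - x_bar) ** 2 for x in xs) ** 0.5
    sy = sum((y - y_bar) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Made-up trends: trend_a climbs then falls; trend_b climbs throughout.
trend_a = [1, 2, 3, 4, 5, 4, 3, 2, 1]
trend_b = [2, 3, 4, 5, 6, 7, 8, 9, 10]

# Slide a window along both trends and correlate each local stretch.
window = 4
local_r = [pearson_r(trend_a[i:i + window], trend_b[i:i + window])
           for i in range(len(trend_a) - window + 1)]

print([round(r, 2) for r in local_r])  # [1.0, 1.0, 0.63, -0.63, -1.0, -1.0]
```

The local coefficient swings from +1 to -1 as the first trend turns around: exactly the kind of plot-the-relationship-over-time picture described above.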