29 May 2016

On a few episodes of Numb3rs

I'm watching Numb3rs. As usual with this kind of show, you can find things that look unsound and unrealistic.

You must never forget it is a TV show, after all, even if it pretends to portray the reality of math, how real smart guys are, or whatever.

I've taken notes on a few episodes since season three. Now I dump them here (just to prove this blog isn't dead as Heaven on a Saturday night).

About errors in general, it's also funny/interesting to read other resources, e.g.

Despite these things, the mathematics and topics they hint at are real, and there are so many things worth looking at (too many to be all known by a single man such as Charlie Eppes). E.g. see the following link:

Season 3, episode 3: Provenance

A Pissarro is stolen from a small museum. Don Eppes goes to his brother, who's preparing a lecture with Larry and Amita, and puts a big book in front of him, saying:

we've got a database of art thieves, names, M.O.

Is that “book” the database? The good old way… but they need to do a lot of data entry to run their computer-aided analyses (specifically Amita suggests a quadratic discriminant analysis).

You can see Charlie browsing the book/database…

Later, in the FBI offices, Megan is at the computer digging through the “Art Theft Archives”. Better… Then why did Don give Charlie the book? Hoping for his magic glance?

Then Larry, Amita and Charlie are all together in a room. Larry is looking at a Pissarro painting in a book and they talk a little bit about that, while Charlie is at the blackboard (“Security … 8.3 abc”? “Location” followed by gibberish-vectors…) …

…and Amita at the computer, running the analysis on the Art Theft Database — not the book, of course!

Amita says they've got something. Since the hard work of crunching numbers is done by the computer, what is Charlie doing on the blackboard?

Another frame where Charlie is doing his own calculations, balancing his checkbook — unrelated to the case, but anyway he's crunching numbers instead of a computer or a calculator (a computer spreadsheet would minimize the likelihood of a calculation error!)

This is a recurring oddity, anyway: bunches of digits thrown on the blackboard just because they look cool. They are surely meaningless to the viewer, who thinks they must be meaningful to Charlie, confirming he is a very smart guy.

Season 3, episode 4: The Mole

In this episode there's steganography involved. A Chinese woman (Kim), former Chinese diplomat, is killed (a hit and run, apparently). She's indeed a spy — we'll know it later.

She had a computer with images of girls downloaded from “porn sites” (none of those shown are porn images, of course); Kim is one of those girls.

Charlie understands there's something hidden in the image, as in another case they worked on, but a lot more sophisticated! It seems like every time you find photographs on a laptop, you can reasonably believe there's a secret hidden inside them!

Can you zoom in on this portion of the photograph?

The computer operator obeys and zooms until we see the squared nature of a pixel, and Charlie explains that if you remove some of the squares (pixels?), you can still see the image, and so you can use them to hide something.

Then he talks about the colorblind tests, suggesting a way a message can be hidden in an image.

Everything's fine with this infodump. Summary: hey guys, what if there is information hidden in these pics? Let me check it! (Any clues about why Charlie thought those are not just images?)

Everything's fine, as said, but it reinforces the common “tv-driven” perception of what steganography is: hiding information as images inside an image. It is only one possibility, and the one you can most easily put on a TV show, but it's not the only one…

Photographs are often stored in JPG format. JPG uses a lossy compression algorithm. It means pixels could be scrambled, so to speak, and lost… You wouldn't want to play with those pixels too naively. E.g. let's talk about DCT coefficients, something like this
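To make the point about hiding data (not pictures) inside a picture, here is a minimal sketch of the classic least-significant-bit trick. It assumes a raw, uncompressed list of 8-bit grayscale values; the names `embed`, `extract` and `quantize` are mine, the cover values are made up, and `quantize` is only a crude stand-in for lossy compression, not real JPEG.

```python
def embed(pixels, message):
    """Hide message bytes in the least significant bit of each pixel value."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for this message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite only the lowest bit
    return stego


def extract(pixels, length):
    """Read back `length` bytes from the least significant bits."""
    out = bytearray()
    for b in range(length):
        byte = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            byte = (byte << 1) | (p & 1)
        out.append(byte)
    return bytes(out)


def quantize(pixels, step=8):
    """Crude stand-in for lossy compression: round down to multiples of step."""
    return [(p // step) * step for p in pixels]


cover = [(i * 37) % 256 for i in range(64)]  # made-up grayscale values
stego = embed(cover, b"hi")
```

No pixel changes by more than 1, so the picture looks the same, and `extract(stego, 2)` gives the message back. After `quantize`, the low bits are wiped out and the message is gone, which is why you can't be naive with lossy formats.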

Anyway, let's go on with the show.

The sophisticated algorithm Charlie is running finds that in fact there's something hidden.

I think those are numbers.

Really?! Eppes magic eye!

My algorithm was able to find an image embedded in this region.

Then he sent the JPEG to FBI techs, who removed the “extraneous squares” (pixels?) revealing a hidden code. Which is, don't forget it, still an image — an image you can interpret as letters and numbers, but still an image, anyway.

In another image (portraying Kim herself) there's a Chinese ideogram that in fact hides two names. In the sequence the image is zoomed until you see the “squares”. A “square” (pixel) can't contain smaller pixels. But it's exactly what we see: the image is zoomed so that each pixel appears on screen as a square. That's the unit. There can't be anything smaller than the unit.

But the ideogram appears as if it were made of pixels smaller than the pixels of the image… Which is not possible.

Things get worse: that ideogram is made of even smaller pixels, used to write two names. It's not special mathematics, nor sophisticated image algorithms: it's magic.
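The "nothing smaller than the unit" argument can be sketched with the simplest possible zoom, nearest-neighbour replication. This is only an illustration under my own assumptions (a tiny made-up image as a list of rows, a hypothetical `zoom` function); real viewers may interpolate, but interpolation doesn't add hidden detail either.

```python
def zoom(image, factor):
    """Nearest-neighbour zoom: each pixel becomes a factor x factor block."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


image = [[0, 255],
         [255, 0]]  # made-up 2x2 image
big = zoom(image, 3)
```

Every "square" in `big` is a uniform 3x3 block, and taking one sample per block gives back exactly the original 2x2 image: zooming shows the same four values, just bigger.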

Season 3, episode 7: Blackout

Assuming the intent of the perpetrator is to make the electric grid collapse, Charlie can predict which node will be hit. He did it with fine math, and all sorts of tricks. If the result really predicts where the perpetrator will hit, it would mean the perpetrator knows about that fine math: he studied the electric grid and he's pointing to the critical node.

I would rather assume he is just trying to hit nodes without knowing which one would be critical. Soon he discovers the common area affected by the blackout, so the next hypothesis is that the aim is to black out that specific area. This means access to data that say which node would turn off which area. This is (through corruption) more likely than assuming a perpetrator has done the fine math Charlie did to discover the node that would bring down the whole grid.

On the other hand, if it's something trivial to know, then it wouldn't be Charlie Eppes math magic and anybody could have discovered the critical point. Strangely enough it is a single point of failure: bring it down, and the whole grid goes down. It's strange such a node exists, but assuming it exists, we should also assume the electric company knows it. Then the FBI could have asked the electric company: which node is the most critical one? That is: which node should we protect to prevent a total blackout?

Nobody thinks about it, otherwise Charlie wouldn't have a role in the episode.

It is a little bit stretched, isn't it?
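Just to show what "which node is a single point of failure?" looks like in its most naive form, here is a brute-force sketch: remove each node in turn and check whether the rest is still connected. The toy `grid` below is invented, and plain connectivity is my simplification; a real grid fails by cascading overloads, which this doesn't model at all.

```python
from collections import deque


def is_connected(graph, removed=None):
    """Breadth-first search over the graph, ignoring the `removed` node."""
    nodes = [n for n in graph if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


def critical_nodes(graph):
    """Nodes whose removal splits the network into disconnected pieces."""
    return sorted(n for n in graph if not is_connected(graph, removed=n))


# Invented toy network: two triangles joined by the C-D link.
grid = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}
```

Here `critical_nodes(grid)` returns `['C', 'D']`. Somebody who owns the network map can answer this kind of question without a visiting genius.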

Season 3, episode 8: Hardball

Also in other episodes Charlie takes a look at a bunch of numbers and formulae and then claims it's a complex algorithm, a smart model, a clever piece of math solving a problem, or whatever.

In this episode they find some mathematics in an email. A previous email said

You're a great player, but people are going to find out about the steroids, it's only a matter of time. There's proof, but it doesn't have to get out. Nobody needs to know.

So the math in the email attachment should be a proof the baseball player was using steroids. We can see it on screen:

GJ = 1.4 HP + TMP + .72 AGR + .43 (TMP) (AGR)

HP = HR *  + LDGBR + LBI - HPavg

TMP = 3.6 EJG + $4.7\sqrt{PT}$ + KR - TMPavg

AGR = $1.5 \frac{CS}{SB+1}$ + 2.9 POB + 1.3 HBP + 2.2 TOSEB -.75 PAT - AGRavg

T = (.264)(T+1)(... 1.77 ...?)??

Charlie: This is advanced statistical baseball analysis.

Don: Sabermetrics

Charlie: And this is way advanced. It's not the kind of stuff you gonna find in the box scores of the sporting news.

Baseball enthusiasts have likely seen sabermetrics before. It doesn't look like high math able to prove anything; according to Wikipedia, sabermetrics is “empirical analysis of baseball, especially baseball statistics”. The original definition: “the search for objective knowledge about baseball”.

How objectively solid could it be as a proof?

The equations we saw are fairly simple and use jargon acronyms and magic numbers, which likely are (or should be) coefficients resulting from empirical observations or determined by trial and error.

Can this advanced sabermetrics spot drug use?

I don't know. I mean, I'm gonna have to figure out what these abbreviations and notations mean. [1]

Charlie consults with an expert fantasy baseball player in Larry's office; he says, after a quick glance at the paper, that it is “pretty wild” stuff, with a “cool changepoint detection”.

All this doesn't sound plausible. Fine hard math and statistical analysis and whatever could be hidden there, but the formulae as they appear are pretty simple. Moreover we must ask ourselves how it is possible that the fantasy baseball player spotted so quickly the cool changepoint detection Charlie will work on later.
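For the record, the basic idea of changepoint detection isn't exotic. Here is a minimal sketch of one textbook variant (a single change in the mean, found by least squares). It is my own illustration, not what appears in the episode; `best_changepoint` is a hypothetical name and the numbers are made up.

```python
def best_changepoint(values):
    """Return the split index k (1..n-1) that minimises the summed squared
    deviation from the mean on each side: values[:k] versus values[k:]."""
    def sse(chunk):
        mean = sum(chunk) / len(chunk)
        return sum((x - mean) ** 2 for x in chunk)

    return min(
        range(1, len(values)),
        key=lambda k: sse(values[:k]) + sse(values[k:]),
    )


# Made-up per-season numbers with an obvious jump after the fifth value.
season_stats = [21, 19, 22, 20, 21, 34, 36, 33, 35]
change_at = best_changepoint(season_stats)
```

On this series `change_at` is 5, i.e. the level shifts between the fifth and the sixth value. Whether a jump like that proves anything about steroids is another matter.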

Another incredible fact is that they assume you need a supercomputer to process the data… Given that the mathematics involved is simple, no matter how subtle or hard the theory behind it is, even a common modern desktop computer suffices, and spreadsheet software (e.g. Microsoft Excel) can be enough to do cool statistical analysis. A supercomputer is needed for really hard computations. There's nothing like that here. It isn't necessary, even though they all seem to agree it is.
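To give an idea of how light this is, here is a sketch that evaluates the first on-screen formula for a hundred thousand rows in plain Python. The input values are random placeholders (I have no idea what realistic HP, TMP and AGR values look like), and `gj` is just my name for the function.

```python
import random


def gj(hp, tmp, agr):
    """First on-screen formula: GJ = 1.4 HP + TMP + .72 AGR + .43 (TMP)(AGR)."""
    return 1.4 * hp + tmp + 0.72 * agr + 0.43 * tmp * agr


rng = random.Random(0)  # fixed seed, placeholder data only
rows = [
    (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
    for _ in range(100_000)
]
scores = [gj(hp, tmp, agr) for hp, tmp, agr in rows]
```

It's a handful of multiplications and additions per row, the kind of thing a spreadsheet column does.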

All that work was created by a single young man [2] who surely has no access to a supercomputer. And later he'll make calculations to predict the career of Don Eppes: whether he would have had a chance to play in the major leagues, with and without steroids.

Computer techno-blabla

They have found the dead body of Chris Bronmiller. Then they try to gain access to his email address. We can see a window on a computer screen where we can read:

Search type: Credit Card
Name: Bronmiller, Chris

_email account created at IP 49.010.642.000

>list send history... preparing list

What's this?! Can't they pay a consultant to tell them that such an IP address is impossible (an IPv4 octet can't exceed 255, and here we have 642)?! It takes less than five minutes to check it on the Internet: cheap and easy.

Moreover, what does it mean to create an email account at an IP?!

Maybe IP doesn't stand for Internet Protocol but for something else, and the number is an identifier for the account. Strangely made of four dot-separated numbers. Like IPv4 addresses. You don't buy it, do you?
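The five-minute check can even be automated. A minimal sketch, assuming the usual dotted-decimal IPv4 notation where each of the four parts must be in the range 0..255; `invalid_octets` is a name I made up, and I'm deliberately tolerant of leading zeros (as in "010") so that only the out-of-range part gets flagged.

```python
def invalid_octets(address):
    """Return the dot-separated parts that cannot be IPv4 octets (0..255)."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four parts")
    return [
        p for p in parts
        if not (p.isascii() and p.isdigit()) or int(p) > 255
    ]


on_screen = "49.010.642.000"  # the address shown in the episode
bad_parts = invalid_octets(on_screen)
```

For the on-screen address, `bad_parts` is `['642']`; for something like `192.0.2.1` the list is empty.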

Season 4, episode 4: Thirteen

There's a killer who chooses his victims according to their telephone number: he interprets it according to Gematria.

The overcomplicated explanation of how they're going to anticipate the killer, by discovering the numbers he could be interested in, can be summarized easily with one word, filtering: they have a set of numbers and a set of criteria. They have to filter out those numbers that don't match the criteria.

It's a basic thing. You wouldn't waste the time of a famous and precious mathematician to do it. And you don't need the help of Amita's combinatorics mind. Everybody is impressed by her coin-sorting machine. It's filtering, a trivial selection process.
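How trivial? Here is a sketch of the whole "method". The criterion (digits summing to 18) is an invented placeholder, not the killer's actual rule from the episode, and the phone numbers are made up too.

```python
def digit_sum(number):
    """Sum the decimal digits of a phone number, ignoring separators."""
    return sum(int(ch) for ch in number if ch in "0123456789")


def matches(number, target=18):
    """Placeholder criterion: the digits add up to `target`."""
    return digit_sum(number) == target


phone_numbers = ["555-0102", "555-0134", "555-0111", "555-0199"]
candidates = [n for n in phone_numbers if matches(n)]
```

`candidates` ends up as `['555-0102', '555-0111']`. Swap in whatever criteria you like; it's still a one-line selection over a list.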

  1. Granger is surprised: isn't it standard math stuff? Charlie explains: when mathematicians create new types of analyses, they invent their own notational shorthand. This makes sense, but according to what we've seen, there are no special notations here, just plain common math symbols, plus several variables. I don't think that choosing a name for a variable should be considered notational shorthand. Moreover I would also claim that mathematicians tend to use single-letter symbols, because 1.5ABC usually means 1.5×A×B×C and not 1.5 multiplied by a quantity “called” ABC. Sabermetricians surely have a different idea, since they want to work with acronyms like BsR for (number of “expected”) Base Runs, PERA for Peripheral Earned Run Average, HP for Hitting Power, KB for strikeout reactions, and so on (the last two come from a frame of the episode).

  2. A self-taught one. Charlie is surprised when the youngster reveals that. By the way, we are all self-taught. Nobody can enter your head and change your reasoning or do it for you. There's no real difference between attending several lectures at a university and reading books all alone about a subject we want to learn. Well, actually there's at least one difference: a teacher can explain to you what you haven't understood immediately, and can check whether you've understood things correctly. But that's not strictly necessary (even if surely helpful) to learn anything you want to learn, as long as you can find reliable resources on the subject. The barriers come when, to advance in the topic, you need devices, labs, tools you can't afford.
