Security And Threat Modeling (All Too Briefly)

Today's notes are more of an outline.

Taxis

Today's discussion: Of Taxis and Rainbows. There's also a corresponding Hacker News thread.

(1) Let's talk about the core anonymization issue that the article reported on.

  • What's in the data, and what was anonymized?
  • What did they do to anonymize the data before release?
  • What went wrong? (How was the anonymization circumvented?)
  • Is this attack a generally useful technique? For what?
  • What could have been done differently?
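
To make the "what went wrong" question concrete, here is a minimal sketch of the attack. As the linked article describes, the identifying fields were replaced with their unsalted MD5 hashes. The ID format below (digit, letter, digit, digit) and the sample ID are simplifying assumptions for illustration, not the full set of real formats. The point is only that when the space of possible inputs is small, every hash can be precomputed and looked up.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def md5_hex(text: str) -> str:
    """Unsalted MD5 of an identifier, as a hex string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def candidate_ids():
    """Hypothetical ID format: digit, letter, digit, digit (e.g. '5X55')."""
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        yield f"{d1}{letter}{d2}{d3}"


# Precompute hash -> ID for the entire space: only 10 * 26 * 10 * 10 = 26,000 entries.
lookup = {md5_hex(cid): cid for cid in candidate_ids()}


def deanonymize(hashed_id: str):
    """Return the original ID for a released hash, or None if it is not in the table."""
    return lookup.get(hashed_id)


released = md5_hex("7B42")      # what an "anonymized" record would contain
print(deanonymize(released))    # 7B42
```

One defense this suggests: replace each identifier with a random value that is not derived from the original at all, and keep the mapping private (or discard it).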

(1) Takeaways:

  • True anonymization is hard, and it's difficult if not impossible to undo mistakes.
  • Often the interests of historians, social scientists, journalists and others may be competing with privacy interests. There is not always a single right answer, and there certainly isn't an absolute answer that is right for all such cases.
  • When presented with such a situation in tech, think carefully. Seek advice, if it's available, on both the technical and non-technical angles. Here, the FOIA request was answered in 2 days.

(2) What could an adversary do with this data?

  • What can they do by de-anonymizing the driver and taxi?
  • Don't get distracted by the anonymization issue! What about the passengers?
  • Is there any risk of correlation with other data?
  • Are there possible defenses?

(3) Let's analyze this Hacker News exchange. There were 2 arguments I thought were interesting near the top:

  • "NYC is too dense for reasonable correlation"
  • "nobody who lives in a non-dense part of NYC can afford to take a taxi anyway"

New York City is a lot more than skyscrapers. It includes, say, Staten Island:

Here's a random Google street view:

Take Malte's 2390, Julia's 1952B, and other such courses if you think this sort of thing is interesting. Most importantly, if you think of "social implications" as a separate thing from engineering, stop. It's not always inseparable, but it frequently is. As with much else, there's nuance.

Threat Modeling

There is a small set of slides to accompany today's notes.

What kinds of security threats are there? Are they always technical? How can engineers who aren't security experts (and taking one security class doesn't make you an expert) avoid gaps in their mental model of security?

I'd like to set the stage with two examples: one global and one local.

Example 1: The Parler Data Leak

Some time ago, hackers were able to obtain nearly all data from the Parler website. It turns out that this was technically pretty easy to do, because of a variety of engineering decisions. These included:

  • insecure direct object references: post IDs were consecutive, and so a for loop could reliably and efficiently extract every post on the site.
  • no rate limiting: the above-mentioned for loop could run quickly.
  • failure to scrub geo-location data from uploaded images: the physical location of users was compromised.

Multiple security mistakes, in combination, tend to add up to more than the sum of their individual impacts.
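
Here is a rough sketch of mitigations for the first two mistakes, using only the Python standard library. The names `new_post_id` and `TokenBucket` are made up for this example, and the rate and capacity numbers are arbitrary. Unguessable IDs only make enumeration harder; they don't replace real access checks on anything that is supposed to be private.

```python
import secrets
import time
from typing import Optional


def new_post_id() -> str:
    """Unguessable ID: 128 random bits, URL-safe, instead of a consecutive counter."""
    return secrets.token_urlsafe(16)


class TokenBucket:
    """Allow about `rate` requests per second, with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: int, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# A scraper firing 8 requests at the same instant: only the burst of 5 gets through.
bucket = TokenBucket(rate=1.0, capacity=5, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(8)]
print(burst)  # [True, True, True, True, True, False, False, False]
```

In a real service you would keep one bucket per client (per account or per IP address), and pass no `now` argument so that the monotonic clock is used.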

Example 2: Phishing Email

Here's a screenshot of a real email someone received a couple of years ago.

There was a lot more to follow, carefully crafted to trigger fear and loss aversion in the recipient. The sender claimed that they would toss the (salacious and embarrassing) information they had in exchange for a bit of money... in Bitcoin, of course.

This is an example of a social engineering attack: no technical aspect of a system is compromised, but still its human or institutional facets can be hacked. Social engineering is pretty broad, but other examples include a hacker calling Google to reset your password, or a trojan-horse USB drive being left in a bank parking lot.

Don't Forget the People Involved

Here's a classical experiment in psychology.

Example 1

Suppose you're shown four cards. The deck these cards have been taken from has colors on one side (either yellow or red), and numbers (between 1 and 100) on the other. At first, you only see one side of each. Perhaps you see:

1
2
yellow
red

You're asked to determine if these four cards obey a simple rule: if the card's number is even, then the card's color is red. The question is, which cards do you need to turn over to check the rule?

Example 2

Suppose you're shown four cards. The deck these cards have been taken from has drink orders on one side (either water or beer), and ages (between 1 and 100) on the other. At first, you only see one side of each. Perhaps you see:

beer
water
17
23

You're asked to determine if these four cards obey a simple rule: if the drink order is alcoholic, then the age is no less than 21. The question is, which cards do you need to turn over to check the rule?

Psychological Context

These are called [Wason Selection Tasks](https://en.wikipedia.org/wiki/Wason_selection_task) and have been studied extensively in psychology.

Surprisingly, humans tend to do poorly on the first task, but far better on the second. Why? Because context matters. If you can catch someone outside of their security mindset with an apparently rote task, you have a higher chance of exploiting them.
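
If you want to check your answers to both examples, here is a small brute-force sketch. The idea: a card needs to be turned over exactly when some possible hidden side would break the rule. The helper names are made up for this example, and the list of possible drinks is an assumption.

```python
def cards_to_flip(visible, possible_hidden, breaks_rule):
    """Keep each visible face for which some hidden face would violate the rule."""
    return [
        card for card in visible
        if any(breaks_rule(card, hidden) for hidden in possible_hidden(card))
    ]


# Example 1: "if the number is even, then the color is red".
NUMBERS = range(1, 101)
COLORS = ("yellow", "red")

def hidden_side_1(card):
    # A number card hides a color; a color card hides a number.
    return COLORS if isinstance(card, int) else NUMBERS

def breaks_rule_1(shown, hidden):
    number, color = (shown, hidden) if isinstance(shown, int) else (hidden, shown)
    return number % 2 == 0 and color != "red"

print(cards_to_flip([1, 2, "yellow", "red"], hidden_side_1, breaks_rule_1))
# [2, 'yellow']


# Example 2: "if the drink is alcoholic, then the age is no less than 21".
DRINKS = ("juice", "water", "beer")  # assumed menu; only beer is alcoholic
AGES = range(1, 101)

def hidden_side_2(card):
    # An age card hides a drink; a drink card hides an age.
    return DRINKS if isinstance(card, int) else AGES

def breaks_rule_2(shown, hidden):
    age, drink = (shown, hidden) if isinstance(shown, int) else (hidden, shown)
    return drink == "beer" and age < 21

print(cards_to_flip(["beer", "water", 17, 23], hidden_side_2, breaks_rule_2))
# ['beer', 17]
```

The two tasks have the same logical shape ("if P then Q": check the P card and the not-Q card), which is what makes the gap in human performance interesting.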

Good security involves far more than just the software. But whether we're talking about software or the human brain, bias can be attacked, and the attacker can abuse leaky abstractions.

Aside: The GDPR

You may have heard of the GDPR. On your term projects, it's a good idea to consider how you would comply with the law; we may ask (e.g.) how you would support deletion of a user's data (although we don't expect you to necessarily have time to implement your ideas).

Why should you care about laws like the GDPR? Setting aside ethical and empathetic concerns, if you ever intend to build software that's used in Europe, you ought to be aware of compliance requirements.
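
To get you thinking about the deletion question, here is one possible sketch using SQLite from the Python standard library. The schema (a `users` table and a `posts` table) and the `delete_user` helper are invented for illustration. It shows only the database side: cascading foreign keys make deleting a user also delete the rows that belong to them. A real answer would also have to account for backups, logs, caches, and data shared with third parties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        body TEXT NOT NULL
    );
""")


def delete_user(conn: sqlite3.Connection, user_id: int) -> int:
    """Delete a user; their posts go too via ON DELETE CASCADE. Returns users removed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    return cur.rowcount


def count_posts(conn: sqlite3.Connection, user_id: int) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM posts WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


with conn:
    conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
    conn.execute("INSERT INTO users (id, email) VALUES (2, 'b@example.com')")
    conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'hello')")
    conn.execute("INSERT INTO posts (user_id, body) VALUES (1, 'again')")
    conn.execute("INSERT INTO posts (user_id, body) VALUES (2, 'hi')")

delete_user(conn, 1)
print(count_posts(conn, 1), count_posts(conn, 2))  # 0 1
```

Designing the schema so that every piece of personal data hangs off the user record is what makes this a one-line delete. Retrofitting that later is much harder.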

Threat Modeling

We tend to think of security as being about confidentiality. But that's too limiting. Are there other kinds of security goals you might want?

Here are 6 very broad classes of goal, which together comprise the S.T.R.I.D.E. framework:

Authentication (vs. Spoofing)

Integrity (vs. Tampering)

Non-repudiation (vs. Repudiation)

Confidentiality (vs. Information Disclosure)

Availability (vs. Denial of Service)

Authorization (vs. Elevation of Privilege)
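
If it helps to keep the pairs straight, the six goal/threat pairs above can be written down as a tiny data structure and turned into a per-component checklist. This is just a sketch; the `checklist` helper and the "login page" component are made up for this example.

```python
from enum import Enum


class Threat(Enum):
    """Each STRIDE threat, with the security goal it attacks as its value."""
    SPOOFING = "Authentication"
    TAMPERING = "Integrity"
    REPUDIATION = "Non-repudiation"
    INFORMATION_DISCLOSURE = "Confidentiality"
    DENIAL_OF_SERVICE = "Availability"
    ELEVATION_OF_PRIVILEGE = "Authorization"


def checklist(component: str):
    """One question per STRIDE threat for a given part of your system."""
    return [
        f"{component}: how could {t.name.replace('_', ' ').lower()} "
        f"undermine {t.value.lower()}?"
        for t in Threat
    ]


# The first letters of the threats, in definition order, spell the acronym.
acronym = "".join(t.name[0] for t in Threat)
print(acronym)  # STRIDE

for question in checklist("login page"):
    print(question)
```

Running the checklist against each component of your term project is one cheap way to make sure you have considered more than confidentiality.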

Elevation of Privilege

We're going to get practice with STRIDE using the cards by Adam Shostack (a noted security expert) for a game called Elevation of Privilege. Adam is the author of my favorite book on threat modeling, which is available in the Brown library. I strongly recommend using this book as a guide to mitigating the kinds of threats we'll identify today.

Happily, this semester I have actual hard copies of this game to use in class today. We won't have time to play the game, but I want to use the cards as props to think about potential threats your term projects will face. Everyone should leave with at least one threat in mind.