
Conflicting threat models

As I mentioned in my previous post, we have a long way to go when it comes to information security. I'll be presenting a talk on building secure systems at PyCon 2015 next month, and I hope to blog more about interesting bits of comprehensible security.

I believe strongly in the importance of threat models. A threat model is your idea of what you're protecting against. It may seem obvious that you can't effectively protect anything without knowing what you're protecting it from. Sadly, simply contemplating your threat model puts you ahead of the curve in today's software industry.

Threat models often simply deal with how much effort you're willing to spend to prevent something from happening. In a world with finite resources, we have to make choices. Some threats are unrealistic or prohibitively expensive to defend against. These questions aren't all strictly technical: perhaps some risk is adequately covered by insurance. Perhaps you have a legal or a compliance requirement to do something, even if the result is technically inferior. These questions are also not just about how much you're going to do: different threat models can lead to mutually exclusive resolutions, each a clear security win.

Consider your smartphone. Our phones have a lot of important, private information; it makes sense to protect them. The iPhone 6 provides two options for the lock screen: a passcode and a fingerprint sensor. Passcodes have been around for about as long as smartphones have, while fingerprint sensors are new and exciting. It's clear that either of them is more secure than not protecting your phone at all. But which one is more secure?

Most people instinctively feel the fingerprint sensor is the way to go. Biometric devices feel advanced; up until recently, they only existed in Hollywood. But fingerprints have their share of issues. It's impossible to pick a new key or have separate keys for separate capabilities; you're stuck with the keys you have. A fingerprint is like a password that you involuntarily leave on everything you touch. That said, turning a fingerprint into something that will unlock your iPhone is still out of reach for most attackers.

Passcodes aren't perfect either. People generally pick poor codes: important dates and years are common, but they're typically not kept secret in other contexts. If you know someone's birthday, there's a decent chance you can unlock their phone. At least with a passcode, you have the option of picking a good one. Even if you do, a passcode provides little protection against shoulder surfing. Most people unlock their phone dozens of times per day, and spend most of that day in the presence of other people. Plenty of those people could inconspicuously watch you enter your passcode.

Two options. Neither is perfect. How do you pick one? To make an informed choice, you need to formalize your threat models.

In the United States, under the Fifth Amendment, you don't have to divulge information that might incriminate you. I am not a lawyer, and courts have provided conflicting rulings, but currently it appears that this includes computer passwords. However, a court has ruled that a fingerprint doesn't count as secret information. If you can unlock your phone with your fingerprint, they can force you to unlock it.

If your threat model includes people snooping on you, the fingerprint sensor is superior. If it includes law enforcement, the passcode is superior. So, which do you pick? It depends on your threat model.

Disclaimer: this is an illustration of how threat models can conflict. It is not operational security advice; if it were, I would point out other options. It is not legal advice, which I am not at all qualified to dispense.

We're just getting started

Most conference talks are transactional. The speaker has a point to make. After the presentation, it's "over"; only spoken about in perfect tenses. You've communicated your thoughts, perhaps had a conversation or two, but, mostly, moved on.

I've given talks like these. However, about two years ago, I gave a talk that had a deep impact on my life. That talk was Crypto 101.

Right before the presentation, cryptanalytic research was released that popped RC4. I couldn't have asked for a better setup. It turns out that wasn't just luck: our systemic failure as an industry to take security seriously was bound to catch up with us eventually. Since then, the proverbial piper has been well paid. We've seen a plethora of serious security bugs. Huge corporations have been the victims of attacks costing billions of dollars a pop. As I'm writing this blog post, there's an article on a new TLS attack in my reading list.

It quickly became clear that this wasn't just a one-off thing. I started writing Crypto 101, the book, not too long after giving the talk. We were, unwittingly, at the crest of a wave that's still growing. Projects like PyCA and LibreSSL started fighting tirelessly to make the software we use better. Security talks became a mandatory part of the programming conference food pyramid. My friends Hynek and Ying gave fantastic talks. They, too, got "lucky" with a security bombshell: Heartbleed happened mere days before the conference.

Last week, I presented Crypto 101 again at rax.io, Rackspace's internal conference. It was well received, and I think I provided value for people's time. More than anything, it crystallized where we are. We're not done yet. There's still a huge audience left to reach. Interest in information security has done nothing but grow. With a total of just over 100,000 downloads for the book and about half as many for the recording of the presentation, people are definitely listening. We've made a real impact, and we have people's attention, but we need to keep going.

One of the two talks I'll be giving at PyCon is a more high-level overview of how we can build secure systems. More friends of mine will be talking about TLS there, too. Within Rackspace, I'm focusing on information security. There are awesome things brewing here, and I hope that we can continue the great work we've been doing so far.

We've accomplished a lot, but we're just getting started.

Updated GPG key

This message is also available as a Gist. I have updated Keybase since writing the GPG-signed message below.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

I am cycling my GPG key.

My old key has fingerprint:

D9DC 4315 772F 8E91 DD22 B153 DFD1 3DF7 A8DD 569B

My new key has fingerprint:

45DC 13EB 6A01 21E8 5219 8C09 8763 869B E2B2 663E

(If you're looking for the key ID, that's the last 8 hex characters of
the fingerprint.)

While my new key may superficially seem less secure, since I have gone
from a 4096 bit RSA key to a 3072 bit one, the new one has the
wonderful advantage of living on a smart card.

I have no reason to presume my old key to be compromised.  I have
changed the expiration date of my old key to March 7th of this year. I
have signed the new key with the old one.

I am in the process of updating https://keybase.io/lvh.
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJU54Q5AAoJEISZopctIA+8JqYP/2agk8RHNklaPqQ6JHdK7Rtu
ehtok2X7wcirWAridRK/l0Tfjl5x2lFJitb+rP5X3k30qw2FvoLF9YOvbQBzezR+
ma9S050GPlvs2knmRQb9f53KmjlmC9DLHT40f3BUJtUteH5X8KgEy2YfbThN2B4C
Z6P30w03gqMkOu5vpaUTe6wkTMpMeGfQz240Kwa3N84UkzzAP3dTBOkm1AiHDUeJ
yj4a9zz+qzayVGI0A1W5W8zd4+GK7Pant7I/lRd02jRQHoHtnbgiBm+5PbGvihFp
zdtrd9YDIhWJzo84qSawQCVAuhy+8CGMFqOHBtTo/BV6HklVLUuOdrfy+IwpV9Jh
cj2Cc5AauFYcWYzJkYL9MHj0b6UI4Uxx1OiAq7onBsajaIE97nLbt1j9A4I4Pb4d
7ub6YmTnwA5aLwqPbfl/egX5xKEIXq/TGcVnbpxY65fw4GsG/hJyq5JHrW43ATqX
sSTdnmIbjyw/PQFr+U0ddUfOnbITJKUElZnCami/JnZV6jDUOPY/Kn48nsxF6bk2
UatqaXpR7yQAvzHz9Yl2sZHcMw/TguumqwuYUQWLFUVZJmmc3iunCfFDVD9tiEz1
00M4PZxhIZt8zKDKIb0PSVa46yHt+kSlgtdgwIvvbuZn9TokXdp/n/DXvBkqohQg
De57mY9RnWwt5fy6AWd1
=lzks
-----END PGP SIGNATURE-----

Securing APIs with shims

Imagine that you had a capability URL, except instead of giving you the ability to perform a specific action, it gave you the ability to perform a (limited) set of operations on a third party API, e.g. OpenStack. The capability URL wouldn't just be something you exercise or revoke; it'd be an API endpoint, mostly indistinguishable from the real API. Incoming requests would be inspected, and based on a set of rules, either be rejected or forwarded to the API being shimmed.

Proof of concept

At my day job, we had a programming task that I thought logic programming would be well-suited for. Unfortunately, logic programming is kind of weird and esoteric. Even programmers with otherwise broad experience professed not to be quite sure how it worked, or what to do with it.

Therefore, I used my hack day (a day where we get to hack on random projects) to cook up some cool stuff using logic programming. I demoed the usual suspects (the monkey with the banana, and a sudoku solver), illustrating the difference between the relational nature of the logic programs and the imperative nature of the algorithms you might otherwise write to solve the same problems. Finally, I demoed the aforementioned proxying API shim. The proof of concept, codenamed shimmer, is up on GitHub.

Let's take a look at the handler function, which takes incoming requests and modifies them slightly so they can be passed on:

(defn build-handler
  [target-host target-port]
  (fn [incoming-request]
    (if (match (spy incoming-request))
      (let [modified-request (-> incoming-request
                                 (dissoc :scheme) ;; hack
                                 (assoc :host target-host
                                        :port target-port
                                        :throw-exceptions false))]
        (spy (request (spy modified-request))))
      {:status 403 ;; Forbidden
       :headers {"content-type" "text/plain"}
       :body "Doesn't match!"})))

(Those spy calls are from the excellent timbre library. They make it easy to log values without cluttering up your code; a godsend while developing with some libraries you're not terribly familiar with.)
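
For reference, spy wraps any expression: it logs the form and the value it evaluates to, then returns that value unchanged, so you can sprinkle it around without changing behavior. A tiny example:

(require '[taoensso.timbre :refer [spy]])

(spy (+ 1 2))
;; logs something like "(+ 1 2) => 3" at debug level, and returns 3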

The matching function looks like this:

(defn match
  "Checks if the request is allowed."
  [req]
  (not= (l/run 1 [q]
          (l/conde
           [(l/featurec req {:request-method :get})]
           [(l/featurec req {:request-method :post
                             :headers {"x-some-header"
                                       "the right header value"}})]
           [(l/featurec req {:request-method :post})
            (l/featurec req {:headers {"x-some-header"
                                       "another right header value"}})]))
        '()))
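
To make that concrete, here's roughly how match behaves on a couple of made-up ring-style request maps:

> (match {:request-method :get, :uri "/servers"})
true
> (match {:request-method :post, :headers {"x-some-header" "nope"}})
false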

Future work

Make this thing actually vaguely correct. That means e.g. also inspecting the body for URL references, and changing those to go through the proxy as well.

Start collecting a library of short hand notations for specific API functionality, e.g. if you're proxying an OpenStack API, you should be able to just say you want to allow server creation requests, without having to figure out exactly what those requests look like.
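
The sketch below is one way such a shorthand library could look; the rule name and the request shape are made up for illustration, not actual shimmer or OpenStack API details (it assumes the same [clojure.core.logic :as l] alias as above):

;; Purely hypothetical sketch of a shorthand layer.
(def shorthand-rules
  {:allow-server-create {:request-method :post
                         :uri "/v2.1/servers"}})

(defn shorthand-goal
  "Expand a shorthand keyword into a core.logic goal over a request map."
  [req rule]
  (l/featurec req (shorthand-rules rule)))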

The spec is hard-coded; it should be specified at runtime. That was trickier than I had originally anticipated: the vast majority of core.logic behavior uses macros. While some functionality is fairly easy to port, that's probably a red herring: I don't want to port a gazillion macros. As an example, here's conds, which is just conde as a function (except without support for logical conjunction within each disjunctive set of goals):

(defn ^:private conds
  "Like conde, but a function."
  [goals]
  (if (empty? goals)
    l/fail
    (l/conde [(first goals)]
             [(conds (rest goals))])))

That's not the worst function, but let's just say I see a lot of macroexpand in my future if I'm going to take this seriously.
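
For what it's worth, handing conds a runtime-built seq of goals does behave like conde. A quick REPL sanity check (output from memory, so take the exact ordering with a grain of salt):

> (l/run* [q] (conds [(l/== q :a) (l/== q :b)]))
(:a :b)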

URLs and bodies should be parsed, so that you can write assertions against structured data, or against URL patterns, instead of specific URLs.

If I ever end up letting any of this be a serious part of my day job, I'm going to invest a ton of time improving the documentation for both core.logic and core.typed. They're fantastic projects, but they're harder to get started with than they could be, and that's a shame.

Reverse ungineering

(Title with apologies to Glyph.)

Recently, some friends of mine suggested that "software engineer" is not a good job title. While they are of course free to call their profession whatever they like, I respectfully disagree: I think "engineer" is a perfectly cromulent description of what we do.

This is an opinion piece. Despite our arriving at opposite conclusions, the disagreement is feathery at best.

What if buildings failed as often as software projects?

To illustrate the differences between software development and other engineering disciplines, Glyph compares software to civil engineering.

For example, when it comes to getting things done, we're just not very good:

Most software projects fail; as of 2009, 44% are late, over budget, or out of specification, and an additional 24% are canceled entirely. Only a third of projects succeed according to those criteria: on time, under budget, within specification, and actually completed.

Such shenanigans would never be accepted in a Serious Engineering Discipline, like civil engineering:

Would you want to live in a city where almost a quarter of all the buildings were simply abandoned half-constructed, or fell down during construction? Where almost half of the buildings were missing floors, had rents in the millions of dollars, or both?

I certainly wouldn't.

Computers are terrible, but not quite that bad, as Glyph points out. "Failure" simply means something different for software projects than it does for construction projects. Many of those "failed" software projects were quite successful by other measures; the problem isn't with software projects, it's with applying civil engineering standards to projects that aren't civil engineering.

Software projects aren't civil engineering projects. Attempts to treat them as such have done much more harm than good. That said, that doesn't mean that software development isn't engineering.

Firstly, civil engineering is the outlier here. Other engineering disciplines don't do well according to the civil engineering success yardstick either. The few engineering endeavors that do are usually civil engineering in disguise, such as the construction of nuclear and chemical plants. Rank-and-file projects in most fields of engineering operate a lot more like a software project than the construction of a skyscraper. Projects are late and over budget, often highly experimental in nature, and in many cases also subject to changing requirements. It's true that we just can't plan ahead in software, but we're not the only ones.

Secondly, we may be confounding cause and effect, even if we overlook that not all engineering is civil engineering. Are software projects unable to stick to these standards because it's not engineering, or is civil engineering the only thing that sticks to them because they have no other choice? Conversely, do we fail early and often because we're not engineering, or because, unlike civil engineering projects, we can? [1]

Finally, software has existed for decades; buildings have existed for millennia. Bridges used to collapse all the time. Tacoma Narrows wasn't so long ago. If the tour guide on my trip to Paris is to be believed, one of the bridges there has collapsed four times already.

But this isn't science!

Supposedly, software engineering isn't "real" engineering because, unlike "real" engineering, it is not backed by "real" science or math. This statement is usually paired with a dictionary definition of the word "engineering".

I feel this characterization is incongruent with the daily reality of engineering.

Consider the civil engineer, presumably the engineeringest engineer there is. [2] If you ask me to dimension an I-beam for you, I would:

  • spitball the load,
  • draw a free-body diagram,
  • probably draw a shear and moment diagram,
  • and pick the smallest standard beam that'll do what you want.

If you want to know how far that beam is going to deflect, I'll draw you some conjugate beams. I would also definitely not use the moment-area theorem, even though it wouldn't be too difficult for the reasonable uses of an I-beam.

Once upon a time, someone inflicted a variety of theories on me. Euler-Bernoulli beam theory, for example. Very heavy textbooks with very heavy math. Neither my physical therapist nor my regular one expects me to ever truly recover. Nonetheless, area moments and section moduli are the only way to understand where the I in I-beam comes from.

Nasty math didn't prevent me from dimensioning that I-beam. And I do really mean math, not physics: Euler-Bernoulli is a math hack. You get it by taking Hooke's law and throwing some calculus at it. Hooke's law itself is more math than physics, too: it's a first-order approximation based only on the observation that stuff stretches when you pull it. It's wrong all the time, even for fairly mundane objects like rubber bands. Both theories were put together long before we had materials science. We use them because they (mostly) work, not because they are a consequence of a physical model.
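
For the record, the hack is compact. Hooke's law is a linear relation between stress and strain; grind it through some calculus for a slender beam and out falls Euler-Bernoulli:

\[ \sigma = E\,\varepsilon \quad\Rightarrow\quad \frac{d^{2}}{dx^{2}}\!\left(EI\,\frac{d^{2}w}{dx^{2}}\right) = q(x) \]

where \(w\) is the deflection of the beam and \(q\) the distributed load.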

That was just one example from a single discipline, but it holds more generally, too. I analyze circuits by recognizing subsections. If you show me a piece that looks like a low-pass filter, I don't reach for Maxwell's equations to figure out what that little capacitor is doing. I could certainly derive its behavior that way; in fact, someone made me do that once, and it was quite instructive. But I'm not bothered with the electrodynamics of a capacitor right now; I'm just trying to understand this circuit!

This isn't just how engineers happen to do their jobs in practice, either. Engineering breakthroughs live on both sides of science's cutting edge. When Shockley et al. first managed to get a transistor to work, we didn't really understand what was going on. [3] Carnot was building engines long before anyone realized he had stumbled upon one of the most fundamental properties of the universe. Nobody was doing metaphysics. Sadi wanted a better steam engine.

To me, saying that the I-beam was dimensioned with the help of beam theory is about as far from the truth as saying that a software project was built with the help of category theory. I'm sure that there's some way that that thing I just wrote is a covariant functor and you can co-Yoneda your way to proving a natural isomorphism, but I don't have to care in order to successfully produce some software. It's easy to reduce an applied field to just the application of that field, but that doesn't make it so; especially if we haven't even really figured out the field yet.

So, even if the math and science behind computer engineering is somehow less real than that other math and science, I think that difference is immaterial, and certainly not enough to make us an entirely different profession.

But that isn't art!

Many people smarter than I have made the argument that programming is art, not dissimilar from painting, music or even cooking. I'm inclined to agree: many talented programmers are also very talented artists in other fields. However, I disagree with the idea that those pursuits are art-like while engineering is not, as if engineering were just cold, hard science.

There's a not-so-old adage that science is everything we understand well enough to explain to a computer, and art is everything else. If that's true, there's definitely plenty of art to be found in engineering. (That was a little tongue-in-cheek. Nobody wants to get dragged into a semantic argument about what art is.)

Even with a much narrower view of art, engineers do plenty of it, as I've tried to argue before. Not all engineering calls are direct consequences of relativity, thermodynamics or quantum mechanics. Sometimes, it is really just down to what the engineer finds most palatable. Even civil engineers, the gray predictable stalwarts of our story, care about making beautiful things. The Burj Khalifa wasn't a consequence of a human following an algorithm.

Conclusion

I think the similarities run deep. I hope we don't throw that away essentially just because our field is a little younger. We're all hackers here; and we're all engineers, too.

Footnotes

[1] I suppose this is really analogous to the anthropic principle, except applied to engineering disciplines instead of humans.
[2] I'm using civil engineer here in the strict American sense of person who builds targets, as opposed to the military engineer, who builds weapons. Jokes aside, perhaps this is related to the disagreement. Where I come from, "civil engineer" means "advanced engineering degree", and encompasses many disciplines, including architectural (for lack of better word; I mean the American "civil engineer" here), chemical, electrical, and yes, computer.
[3] While it is very easy to make up a sensible-sounding narrative time line after the fact for the breakthroughs in physics and engineering that eventually made the transistor possible, this ignores the strong disagreements between theoretical predictions and practical measurements of the time. Regardless of their cause, it would be foolish to assume that Shockley just sat down and applied some theory. The theory just wasn't there yet.

On multiplayer turn-based game mechanics

Most classic turn-based games, from chess all the way to Civilization V, are sequential in nature. A player makes a move, then the next player makes a move, and so on. The details can vary, for example:

  • There could be two players, or several. In practice this number stays small, for scaling reasons we'll discuss later.
  • The game could have perfect information, like chess, where all players see a move as soon as it is played. The game could also have imperfect information, like Civilization V, where players see part of a move, but the effects may be obscured by fog of war.
  • The players may play in a consistent order (chess, Civilization V), or in a somewhat random one (D&D's initiative system).

All of those things are more or less orthogonal to the turn system. Players play turns sequentially, so I'm going to call these sequential turn-based games.

Sequential turns make scaling the number of players up difficult. Even with only 8 players, any given player will spend most of their time waiting. While 8 players are a lot for most turn-based games, it's nothing compared to an MMORPG.

An alternative to sequential turn-based play is simultaneous turn-based play. In simultaneous turn-based play all players issue their moves at the same time, and all moves are played out at the same time. The simplest example is rock-paper-scissors, but Diplomacy works the same way. More recently, this system has been explored by the top-down tactical game Frozen Synapse.

While simultaneous turn-based play gets us closer to making massively multiplayer turn-based games feasible by turning a linear scaling problem into a constant time one, we're not quite out of the woods yet.

Consider what happens when a player does not make a move. There are a few reasons that might happen:

  • The player is not playing the game right now.
  • The player has stopped playing the game altogether.
  • The player may be in a hopeless position, where stalling is better than losing. (Stalling may tie up lots of enemy resources.)

If you've ever gotten frustrated with a multiplayer game that has a "ready" system before a game begins, because you had to wait for a player who had disappeared: that is essentially the problem turn-based games face every turn.

There are a number of ways to mitigate this problem. Games can duplicate playing fields. That works for both sequential games like Hero Academy and simultaneous ones like Frozen Synapse. If a player doesn't make a move, that particular instance of the game world doesn't go anywhere; but you can play any number of games simultaneously.

For this strategy to work, the playing fields have to be independent. You don't lose heroes or soldiers because they're stuck on some stale game. The worst possible outcome is that your game statistics don't reflect reality.

That works, but rules out a permanent game world with shared resources. If there's a larger story being told, you would want these worlds to be linked somehow: be it through shared resources, or because they're literally the same game world.

There are a number of creative ways to get out from under that problem, usually by involving wall-clock time. For example, if a player doesn't respond within a fixed amount of time, they may forfeit their turn. Fuel consumption might be based on wall-clock time, not turns. [1] There are a lot of degrees of freedom here. Do you use a global clock, or one local to a particular area?

A global clock is probably simpler, but poses some gameplay challenges. How long is the tick? Too fast, and a player may see their empire annihilated while they're sleeping. Too slow, and the most trivial action takes forever. There isn't necessarily one right answer, either. In an all-out cataclysmic struggle between two superpowers, a complete tactical battle plan may take a long time. Any timescale that isn't frustratingly short for that situation will be frustratingly long for anyone trying to guide their spaceship (or kodo, depending on which universe you're in) across the Barrens.

Local clocks have their own share of difficulties. You still need to answer what happens for anything that isn't in a particular battle; you still need to answer what happens when battles merge or diverge.

I'm currently exploring the shared global clock. In order to mitigate the issues I described, I'm contemplating two ideas:

  • Allow programmable units; a la Screeps, CodeWars...
  • Allow players to plan several turns ahead of time.

These are, of course, not mutually exclusive.

Footnotes

[1] I don't particularly like this, because it "breaks the fourth wall" in a sense. If my engines are still consuming fuel in real time, why can't the enemy fire missiles? Either time is stopped, or it isn't. Sure, games can be abstract, but that feels like an undue inconsistency.

hypercathexis dev notes part 1

hy·per·ca·thex·is, n, pl hy·per·ca·thex·es \-kə-ˈthek-səs, -ka-\: excessive concentration of desire upon a particular object

I'm considering renaming the project to its plural, hypercathexes, because then it can be about a hyper cat in a bunch of hexes.

The only real constraint I started with was that I wanted a simultaneous turn-based space game on a hex grid.

Amit Patel from Red Blob Games has basically the awesomest page about hex grids, and a ton of awesome pages about many other areas of game development. I think it's a fantastic resource for programmers like myself who don't do game dev as a day job, but just want to make a little game on the side.

Simultaneous turn-based means that all players plan their moves simultaneously, and those moves are then executed simultaneously. This has an interesting scaling effect. On the one hand, it clearly scales better to many players, because players "play" simultaneously. On the other hand, you start getting interesting problems, such as clock synchronization. Does the entire world advance with the same tick-tock pattern? What happens when a player does not make a move within the allotted time? If you allow different clocks in the world, does time advance faster if both players submit a move, or do you always wait until the maximum timeout?

I wanted an excuse to play with Om and found Chestnut.

The first version had a working hex grid, but displayed it using offset coordinates. I wanted to work with axial coordinates as much as possible, because they make a lot of the math so much easier. Axial coordinates work together with the "grain" of the hex map:

axial base vectors

The second thing I did was move from individual <img> tags produced by Om to a single <svg> with hexes (<polygon> elements) inside it. This fixed a number of annoying placement issues with CSS. CSS really wants to position things based on their bounding box, not their midpoint. That's great for web pages, not so much for my hex grid: I ran into a number of issues where certain hex borders would be wider than others. Browsers aren't made to draw hex grids based on left/right offsets, I guess...

I started by expressing distances in the SVG in terms of the hex width, which would become my unit. The height would then be \(\sqrt{3}/2\). Then, I realized that I could make my life easier by expressing all x coordinates in terms of a single hex width, and all y coordinates in terms of a single hex height; then I could just scale differently across x and y at the end.
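
In code, that idea looks roughly like this (a sketch with made-up names, assuming flat-topped hexes whose height is \(\sqrt{3}/2\) times their width, as above; not the actual hypercathexis source):

(defn axial->units
  "Axial [q r] -> [x in hex widths, y in hex heights] for flat-topped hexes."
  [[q r]]
  [(* 0.75 q)          ;; adjacent columns overlap by a quarter of a hex width
   (+ r (* 0.5 q))])   ;; every column shifts down by half a hex height

(defn units->pixels
  "Scale x and y independently at the very end."
  [[ux uy] hex-width hex-height]
  [(* ux hex-width) (* uy hex-height)])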

Simple SVG hex grid

Developing in Firefox was mostly painless, but I discovered many discrepancies once I tried it in Chrome. Things that should be the same aren't, particularly when it comes to transforms. For example, Chrome would occasionally literally do the inverse of the scaling it was supposed to:

Differences in scaling behavior across browsers

I guess trying to implement things in browsers was a mistake.

What the heck is a clojure.lang.IFn$LO?

It's no secret that I love Clojure. Like any tool though, it isn't perfect. Today, I was trying to write unit tests that use clojure.core.async/timeout, so I wrote a test double analogous to Twisted's Clock. As I tried to with-redefs it in, I got the most inscrutable error message out: java.lang.ClassCastException: icecap.handlers.delay_test$fake_timeout$timeout__22934 cannot be cast to clojure.lang.IFn$LO.

Wha? I know clojure.lang.IFn, Clojure's function type, but what the heck is a clojure.lang.IFn$LO?

Searching for the term didn't give any particularly useful results. It was clear this happened when I was redeffing the original timeout, so I looked at its documentation:

clojure.core.async/timeout
([msecs])
Returns a channel that will close after msecs

Doesn't look too special to me. What's the type of that thing, anyway? Let's find out:

> (parents (type timeout))
#{clojure.lang.IFn$LO clojure.lang.AFunction}

Aha! So that is actually part of timeout, not something else wonky going on. What does the source say? It's a pretty lame shim:

(defn timeout
  "Returns a channel that will close after msecs"
  [^long msecs]
  (timers/timeout msecs))

I mean, nothing interesting there, just a type hint.

Oh. Wait. That's not just a type hint. long is a primitive. Testing:

> (parents (type (fn [^long x] x)))
#{clojure.lang.IFn$LO clojure.lang.AFunction}

Aha! To avoid boxing JVM primitives, Clojure compiles functions with a primitive type hint to special interfaces. That works for doubles, too:

> (parents (type (fn [^double x] x)))
#{clojure.lang.IFn$DO clojure.lang.AFunction}

And multiple arguments:

> (parents (type (fn [^double x ^double y] x)))
#{clojure.lang.IFn$DDO clojure.lang.AFunction}
> (parents (type (fn [^double x ^long y] x)))
#{clojure.lang.IFn$DLO clojure.lang.AFunction}

Adding the same primitive type hint to my test double fixed it. Success!
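
For concreteness, here's a minimal sketch of that kind of fix; fake-timeout is a made-up stand-in for my actual test double, but the ^long hint is the part that matters, because it makes the fake implement clojure.lang.IFn$LO just like the original:

(require '[clojure.core.async :as async])

(defn fake-timeout
  "Test double for async/timeout. The ^long hint makes it compile to
  clojure.lang.IFn$LO, matching the real thing."
  [^long msecs]
  (async/chan))

(with-redefs [async/timeout fake-timeout]
  (async/timeout 100)) ;; returns our channel, no ClassCastException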

On discussing software security improvements

A common criticism of information security folks is that they tend to advise people to not do any crypto. Through projects like Crypto 101, I've attempted to make a small contribution towards fixing that.

In the open source world, various people often try to improve the security of a project. Because designing secure systems is pretty hard, they often produce flawed proposals. The aforementioned tendency of infosec-conscious people to tell them to stop doing crypto is experienced as unwelcoming, even dismissive. Typically, the only thing that's accomplished is that a lot of feelings get hurt; it seems to only rarely result in improved software.

I think that's quite unfortunate. I think open source is great, and we should be not just welcoming and inclusive, but aiming to produce secure software. Furthermore, even if a flawed proposal is unsalvageable, a clear description of why it is flawed will presumably result in fewer negative interactions. Best case scenario, the issues with a proposal can be discussed and potentially rectified.

In an effort to improve this situation, I'm documenting what I believe to be a useful way to discuss security changes and their tradeoffs. As Zooko has taught me:

Security isn't about perfect versus imperfect or about better versus worse, it's about this attack surface versus that attack surface.

This document aims to be the equivalent of an SSCCE for generic bug reports: a blueprint for making suggestions likely to lead to productive discourse, as long as we can agree that we're trying to produce more secure software, as well as provide a welcoming development environment.

Important points

A good proposal should contain:

  1. A brief description of what you're suggesting.
  2. A description of the attack model you're considering, why the current system does not address this issue, and why the suggested system does address this issue.
  3. A motivation of the attack model. Why is it important that this issue is actually addressed?
  4. How does this change affect the attack surface (i.e. all of the ways an attacker can attempt to attack a system)?
  5. What does the user experience for all users of the system look like? Many cryptosystems fall over because they're simply unusable for most users of the system.

An example

Wul (the widely underestimated language, pronounced /wool/) is a general purpose programming language. It has a package repository, WuPI (the Wul package index, pronounced /woopie/), the de facto standard for distributing and installing Wul software.

WuPI uses TLS as a secure transport. The WuF (Wul foundation, pronounced /woof/), maintains a root certificate, distributed with Wul. Thanks to a well-managed system of intermediary CAs run by a tireless army of volunteers, this means that both package authors and consumers know they're talking to the real WuPI.

Alice is the WuPI BDFL. Bob is a Wul programmer, and would like to improve the security of WuPI.

While consumers and authors know that they're talking to the real WuPI, there is no protection against a malicious WuPI endpoint. (This problem was recently made worse because WuPI introduced a CDN, greatly increasing the number of people who could own a node.) You know that you're talking to something with a WuF-signed certificate (presumably WuPI, provided the WuF has done a good job managing that certificate), but you have no idea if that thing is being honest about the packages it serves you.

Bob believes WuPI could solve this by using GPG signatures.

He starts with a brief description of the suggestion:

I would like to suggest that WuPI grows support for GPG signatures of packages. These signatures would be created when a package author uploads a package. They would optionally be verified when the user downloads a package.

He continues with the attack model being considered:

I believe this would secure WuPI consumers against malicious WuPI endpoints. A malicious WuPI endpoint (assuming it acquires an appropriate certificate) is currently free to deliver whatever packages it wants.

He explains why the current model doesn't address this:

The current system assures authenticity and secrecy of the stream (through TLS), and it ensures that the server authenticates itself with a WuPI/WuF certificate. It does not ensure that the package is what the author uploaded.

He explains why he believes his model does address this:

Because the signatures are produced by the author's GPG key, a malicious WuPI endpoint would not be able to forge them. Therefore, a consumer is sure that a package with a valid signature is indeed from the author.

He explains why this attack model is important:

With the new CDN support, the number of people with access to such a certificate has greatly increased. While I certainly trust all of the volunteers involved, it would be nice if we didn't have to. Furthermore, the software on the servers can always be vulnerable to attack; as a high-value target, it certainly isn't inconceivable that an attacker would use an unknown vulnerability to take over a WuPI endpoint.

He addresses (or believes he addresses) the attack surface:

Because the signatures are optional, the attack surface remains the same.

Finally, he addresses the user experience:

The weak point of this scheme is most likely the user experience, because users historically seem to dislike using GPG.

I am hopeful that this increased value of participating in the GPG web of trust will mean that more people participate.

Alice reviews this, and notes a flaw in the proposal:

This proposal aims to address a security flaw when the WuPI endpoint is malicious by adding signatures. However, a malicious WuPI endpoint can lie by omission, and claim a package was never signed by the author.

Bob now realizes this issue, and suggests an improvement:

This could be rectified if the user insists on a signature for packages they expect to be signed.

As a side note, Alice notes that the attack surface does increase:

This places trust in authors' ability to manage private keys, which has historically been shown to be problematic. That introduces a new attack vector: an attacker can attempt to go after the author's private key.

Regardless of the outcome of this conversation, there actually was a conversation. I believe this to be an improvement over the overall status quo.

Switched to Nikola

I've migrated from Octopress to Nikola.

Nothing personal. Octopress is fine software, but:

  • External packages, like themes, quickly made my installation impossible to understand and manage. What's the difference between javascripts and js, css and style? I don't know.
  • Performance. Nikola builds sites nearly instantly. Even with a moderate amount of pages, I found Octopress too slow. Maybe I just need a better laptop.
  • Python. While I'm sure the Ruby tools are of similar quality to Python's, I know Python's tools. Porting to Nikola was easier than learning what an A-grade Ruby developer installation looks like.

Anyway, I'm on Nikola now. Pretty happy with it. I used nikola-octopress-import. It did most of what I wanted; I contributed some minor PRs so that it would also handle extended date formats (with seconds and timezones) and legacy HTML posts.
