Archive for the ‘Software Engineering’ Category

Bad Hair Day for Kerberos

Friday, December 3rd, 2010 by hartmans

Tuesday, MIT Kerberos had a bad hair day—one of those days where you’re looking through your hair and realize that it’s turned to Medusa’s snakes while you weren’t looking. Apparently, since the introduction of RC4, MIT Kerberos has had significant problems handling checksums. Recall that when Kerberos talks about checksums it’s conflating two things: unkeyed checksums like SHA-1 and message authentication codes like HMAC-SHA1 used with an AES key derivation. The protocol doesn’t have a well defined concept of an unkeyed checksum, although it does have the concept of checksums like CRC32 that ignore their keys and can be modified by an attacker. One way of looking at it is that checksums were over-abstracted and generalized. Around the time that 3DES was introduced, there was a belief that we’d have a generalized mechanism for introducing new crypto systems. By the time RFC 3961 actually got written, we’d realized that we could not abstract things quite as far as we’d done for 3DES. The code however was written as part of adding 3DES support.

There are two major classes of problem. The first is that the 3DES (and I believe AES) checksums don’t actually depend on the crypto system: they’re just HMACs. They do end up needing to perform encryption operations as part of key derivation. However, the code permitted these checksums to be used with any key, not just the kind of key that was intended. In a nice abstract way, the operations of the crypto system associated with the key were used rather than those of the crypto system loosely associated with the checksum. I guess that’s good: feeding a 128-bit key into 3DES might kind of confuse 3DES, which expects a 168-bit key. On the other hand, RC4 has a block size of 1 because it is a stream cipher. For various reasons, that means that regardless of what RC4 key you start with, if you use the 3DES checksum with that key, there are only 256 possible outputs for the HMAC. Sadly, that’s not a lot of work for the attacker. To make matters worse, one of the common interfaces for choosing the right checksum to use was to enumerate through the set of available checksums and pick the first one that would accept the kind of key in question. Unfortunately, 3DES came before RC4, and there are some cases where the wrong checksum would be used.
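The selection problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry, the checksum labels, and both function names are hypothetical and are not taken from the MIT Kerberos source. It contrasts a lookup that takes the first keyed checksum willing to accept any key with one that insists the checksum was designed for the key's crypto system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChecksumType:
    name: str                        # illustrative label, not a real identifier
    intended_enctype: Optional[str]  # None: the checksum ignores its key


# Hypothetical registry, listed in the order the crypto systems were added.
CHECKSUMS = [
    ChecksumType("crc32", None),
    ChecksumType("hmac-sha1-des3", "des3"),
    ChecksumType("hmac-md5-rc4", "rc4"),
    ChecksumType("hmac-sha1-aes", "aes"),
]


def permissive_pick(key_enctype: str) -> Optional[ChecksumType]:
    """Return the first keyed checksum; it 'accepts' any kind of key."""
    for cksum in CHECKSUMS:
        if cksum.intended_enctype is not None:
            return cksum
    return None


def strict_pick(key_enctype: str) -> Optional[ChecksumType]:
    """Return only a checksum designed for this key's crypto system."""
    for cksum in CHECKSUMS:
        if cksum.intended_enctype is not None and cksum.intended_enctype == key_enctype:
            return cksum
    return None


if __name__ == "__main__":
    print(permissive_pick("rc4").name)  # the 3DES checksum, listed first
    print(strict_pick("rc4").name)      # the checksum meant for RC4 keys
```

With an RC4 key, the permissive lookup lands on the 3DES checksum simply because it comes earlier in the list, which is the ordering hazard the paragraph describes.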

Another serious set of problems stems from the handling of unkeyed checksums. It’s important to check and make sure that a received checksum is keyed if you are in a context where an attacker could have modified it. Using an MD5 checksum outside of encrypted text to integrity-protect a message doesn’t make sense. Some of the code was not good about checking this.
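A minimal sketch of that check, using only Python's standard `hashlib` and `hmac` modules. The checksum labels and the functions `compute`, `verify_naive`, and `verify_strict` are hypothetical names chosen for illustration; they do not correspond to any Kerberos API. The point is that a verifier which trusts the checksum type named in the message will accept an unkeyed checksum that anyone can compute.

```python
import hashlib
import hmac

# Illustrative labels only; an unkeyed checksum ignores the key entirely.
UNKEYED = {"sha1": hashlib.sha1}
KEYED = {"hmac-sha1": hashlib.sha1}


def compute(cksumtype: str, key: bytes, data: bytes) -> bytes:
    """Compute a checksum of the named type over data."""
    if cksumtype in UNKEYED:
        return UNKEYED[cksumtype](data).digest()
    if cksumtype in KEYED:
        return hmac.new(key, data, KEYED[cksumtype]).digest()
    raise ValueError(f"unknown checksum type: {cksumtype}")


def verify_naive(cksumtype: str, key: bytes, data: bytes, received: bytes) -> bool:
    """Accept whatever checksum type the sender chose."""
    return hmac.compare_digest(compute(cksumtype, key, data), received)


def verify_strict(cksumtype: str, key: bytes, data: bytes, received: bytes) -> bool:
    """Reject unkeyed checksums before comparing anything."""
    if cksumtype not in KEYED:
        return False
    return hmac.compare_digest(compute(cksumtype, key, data), received)


if __name__ == "__main__":
    secret = b"session key the attacker does not know"
    forged_msg = b"attacker-chosen message"
    forged_cksum = hashlib.sha1(forged_msg).digest()  # needs no key at all
    print(verify_naive("sha1", secret, forged_msg, forged_cksum))
    print(verify_strict("sha1", secret, forged_msg, forged_cksum))
```

In the naive verifier the attacker never needs the key, because the checksum type they selected does not use one.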

What worries me most about this set of issues is how many new vulnerabilities were introduced recently. The set of things you can do with 1.6 based on these errors was significant, but not nearly as impressive as 1.7. A whole new set of attacks were added for the 1.8 release. In my mind, the most serious attack was added for the 1.7 release. A remote attacker can send an integrity-protected GSS-API token using an unkeyed checksum. Since there’s no key the attacker doesn’t need to worry about not knowing it. However the checksum verifies, and the code is happy to go forward.

I think we need to take a close look at how we got here and what went wrong. The fact that multiple subsequent releases made the problem worse makes it clear that we produced a set of APIs where doing the wrong thing is easier than doing the right thing. It seems like there is something important to fix here about our existing APIs and documentation. It might be possible to add tests or things to look for when adding new crypto systems. However, I also think there is an important lesson to take away at a design level. Right now I don’t know what the answers are, but I encourage the community to think closely about this issue.

I’m speaking about MIT Kerberos because I’m familiar with the details there. However it’s my understanding that the entire Kerberos community has been thinking about checksums lately, and MIT is not the only implementation with improvements to make here.

Federated Authentication discussion tonight at 9 PM Pacific

Thursday, March 25th, 2010 by hartmans

The federated authentication bar BOF will be held tonight at 9 PM US Pacific time in the Manhattan room at the IETF 77 meeting. Here is information for participation.

Reading List

Remote Participation

  • Join our audio stream during the session
  • Join our jabber chat room at
  • Join our mailing list

    The Fate of the Three Little Pigs and Little Red Riding Hood

    Friday, October 23rd, 2009 by hartmans

    One of the more annoying aspects to deploying Kerberos and GSS-API is making sure that clients have the correct name for the server they’re talking to. CIFS, the Windows file-sharing protocol, provided the identity of the server to the client. Windows used this to make a few things easier with NTLM but does not use this information with Kerberos.

    I keep finding myself in conversations where someone has the bright idea of making this problem easier by generalizing this mechanism and having the server tell the client its identity. The client can then authenticate to that identity. There’s definitely an implementation advantage: you remove all the complexity of name mapping. The problem, of course, is that it matters what server the client is talking to; the client actually needs to make a decision about how much to trust the server. Authentication to the bad guy is just as bad as the bad guy being able to subvert your authentication to the good guy.

    I was talking recently to another implementor who has similar experience with their customers. Frustrated, I was looking for an analogy simple enough that people could understand the mistake here. I was running through nursery rhymes and other children’s tales in my head until I came to the Three Little Pigs. It’s perfect!
    The pigs do not want the results of the “Let me in,” service from the big bad wolf. There’s no way that the situation could be made better by more authentication. Asking the wolf for his PIV Card (when have you met a big bad wolf who was not a federal contractor?) will not help the pigs decide to let the wolf in. Because the pigs think to consider who they are getting a service from, they decide not to trust it. Of course, physical security is something that the first two pigs should have worked on. However, even their brother would not have been safe if he’d taken an approach similar to the one proposed for GSS-API of asking the bad guy what name to trust before authenticating to see if the bad guy in fact has that name.

    There may be things we can do to make name mapping easier. However we cannot provide security without making a trust decision about who we talk to, and whenever we talk about “who” and trust together, we must consider the security of the mapping.
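The difference between the two approaches can be shown with a toy model. Everything here is hypothetical: the `Server` class, the principal names, and both connect functions are made up for illustration and do not represent GSS-API calls. Each server can authenticate only as the one name it holds credentials for; the question is who picks the name the client authenticates to.

```python
class Server:
    """Toy server that holds credentials for exactly one principal name."""

    def __init__(self, principal: str) -> None:
        self.principal = principal

    def claimed_name(self) -> str:
        # What the server tells the client its identity is.
        return self.principal

    def can_authenticate_as(self, name: str) -> bool:
        # Authentication succeeds only for the name the server really holds.
        return name == self.principal


def connect_trusting_server(server: Server) -> bool:
    """Authenticate to whatever name the server asserts for itself."""
    return server.can_authenticate_as(server.claimed_name())


def connect_with_client_mapping(server: Server, service: str, name_map: dict) -> bool:
    """Authenticate to the name the client maps the intended service to."""
    return server.can_authenticate_as(name_map[service])


if __name__ == "__main__":
    name_map = {"files": "cifs/fileserver.example.org"}  # made-up mapping
    wolf = Server("host/wolf.example.org")
    print(connect_trusting_server(wolf))                         # always succeeds
    print(connect_with_client_mapping(wolf, "files", name_map))  # fails for the wolf
```

Authentication always succeeds in the first function, for the wolf as much as for anyone else, so it tells the client nothing about whether this is the server it meant to reach.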

    Of course, as Little Red Riding Hood could tell you, sometimes it is all about authentication. Sadly, her grandmother was unable to get better than level of assurance 1 for her identity as “Little Red Riding Hood’s grandmother,” and the wolf was able to claim that identity for himself.

    No Bills for Nancy

    Friday, May 29th, 2009 by hartmans

    A while ago I was working with a client in the financial services sector. They had an online banking product. After a software upgrade, they got a confusing customer service call. A customer, we’ll call her Nancy, called up and reported that she could no longer access the online billpay portion of the site. However, she was sure that she had access because her bills kept paying. After some investigation it turned out that what she meant is that when she tried to click on the bill payments section of the site, she received a page indicating that she didn’t have the online bill pay feature. However, her periodic payments were still being deducted from her account. According to the provisioning system, she did have bill pay access, and audit records confirmed her claim that her payments were being made.

    The client was concerned. Taking money out of end-user accounts without giving the user visibility into what was going on or a way to change it was problematic. Besides, she was paying for a service, and the client wanted to provide it. I got involved when the other developers were unable to reproduce the problem in the development environment. When the data set including Nancy’s information was loaded, they could see that she did in fact have access. If someone logged in as her in the development environment, then the bill payment system was available.

    When the second user called with the same problem, the issue was escalated. The problem was not universal, but a small number of users reported trying to use online payments and getting the page indicating they did not have the service even when they did.

    We still were not able to reproduce on the development environment. We were focusing on whether there was some sort of release engineering error that had caused a problem to creep into the production environment. We could easily create a fresh instance of the development code (although not the supporting database) and that didn’t cause the problem to appear. Our system was interpreted and the access control logic was isolated from the operating system or hardware. So, while we acknowledged the possibility of a problem in that area, we didn’t think that development running on Alpha while production was running on newer Sparc hardware would be the issue.

    We looked very closely at the code that decided whether to display the online payments page. We did find a few problems, but none should have caused this bug. For example, the system had a concept of an individual account and a business account. An individual account could be given access to a business account for auditing and dual control reasons. However, we did not support using an individual account to access the payments section of a business account. So, there was logic to make sure that the login ID of the user matched the login ID of the resource being accessed. This check would always return true because it used a numeric comparison rather than a string comparison.

    While going over the situation someone joked that we could just ask people to change their name: the problem only seemed to happen for users named Nancy. I heard this description and looked at the users who had reported the problem. Sure enough, all named Nancy. How do you get a name dependent bug in access control? Surely this was just sampling error.

    What’s special about Nancy? Well, it starts with “nan” as in not-a-number. Surely, though, that shouldn’t affect anything. Unless . . . I logged into a development and production server. Sure enough, on production, “nan+0” evaluated to “nan”. On development, “nan+0” evaluated to “0”. Apparently whatever the interpreter was using to read numbers respected nan on Sparc but not Alpha. Still, how could this be our issue?

    Then I remembered that numeric comparison used in place of the string comparison. You see, there are two cases where using a numeric comparison to compare strings is not true. The first is when the string represents the number zero. The second is when it represents nan. Sure enough, with a two character change and a lot of paperwork, the Nancies gained access to their payments.
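The nan half of this bug can be modeled in a few lines of Python. This is a sketch under assumptions, not the original system: the interpreter is not named in the story, so `lenient_number` is a hypothetical stand-in for a string-to-number conversion that reads a leading numeric prefix and treats anything else as 0, and the `recognizes_nan` flag stands in for the difference between the two platforms.

```python
import math
import re

# Leading numeric prefix, e.g. "12abc" -> 12.0 (an assumed conversion rule).
_NUM_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)")


def lenient_number(text: str, recognizes_nan: bool) -> float:
    """Convert a string to a number the way a lenient interpreter might."""
    if recognizes_nan and text.strip().lower().startswith("nan"):
        return math.nan
    match = _NUM_PREFIX.match(text)
    return float(match.group(0)) if match else 0.0


def numeric_equal(a: str, b: str, recognizes_nan: bool) -> bool:
    """Compare two strings numerically, as the buggy check did."""
    return lenient_number(a, recognizes_nan) == lenient_number(b, recognizes_nan)


if __name__ == "__main__":
    # Different login IDs compare equal numerically: both read as 0.
    print(numeric_equal("alice", "bob", recognizes_nan=True))
    # The same login ID fails to equal itself where nan is recognized.
    print(numeric_equal("nancy", "nancy", recognizes_nan=True))
    # Where nan is not recognized, "nancy" reads as 0 like everyone else.
    print(numeric_equal("nancy", "nancy", recognizes_nan=False))
    # The fix: compare the strings themselves.
    print("nancy" == "nancy")
```

In this model the numeric check passes for nearly everyone because non-numeric login IDs all read as 0, and it fails only where the ID reads as nan, since nan is not equal to itself.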

    I really love the story of this bug, so I’m sharing it here. People have tried to find a moral in “No Bills for Nancy,” over the years. “This shows the value of static type checking!” some have said. “This shows the critical importance of having identical test environments,” others have said. “If you had the right development methodology, this wouldn’t happen!” others have said.

    In some sense, that’s all true. However I’ve found that no matter how good your testing, no matter how good your practices, Nancy is out there lurking, ready to demonstrate that there is some facet of the system that we do not understand. Really, though, Mark Twain had the right answer. It’s a good story; enjoy it for itself.