A Roundtable on BSD, Security, and Quality


And yet another article about OpenBSD. This time from www.ddj.com. :D

It's from the year of our Lord 2001, admittedly, but still. We should simply have it in the forum. ;)

by Jack J. Woehr

Contributing Editor Jack Woehr moderated a roundtable at the recent USENIX Security Symposium 2000. The participants, Theo de Raadt, Todd Miller, Angelos Keromytis, and Warner Losh, discussed several topics, including the evolving distinction between Linux and BSD and the notion that reliability and security are achieved through simplicity.

The Participants:
Theo de Raadt (TdR), OpenBSD principal architect.
Todd Miller (TM), OpenBSD contributor, a system administrator at the University of Colorado at Boulder.
Angelos Keromytis (AK), OpenBSD network stack guru, a graduate student at the University of Pennsylvania.
Warner Losh (WL), FreeBSD contributor, an embedded-systems programmer.
Jack Woehr (DDJ), Dr. Dobb's Journal contributing editor.

DDJ: Some people think the install of OpenBSD should be kept simple, the way it is. Others want greater ease for the users.

TM: The direction I'm going is to have a facility for installing large numbers of machines easily. Right now you have to do attended installs. It would be nice to have a config file, TFTP or NFS mounts, preload a bunch of defaults, and splat out 30 installs at once. Most of the major OSes have this in some form or other. Red Hat Linux Kickstart is more complicated than it needs to be.

We already have a facility for installing arbitrary things in addition to the base sets, although it's not very well documented. OpenBSD 2.8 will be easier to install in some ways. One thing I've been meaning to add for the last couple of years is partition profiles, so you can say, "I want it to look sort of like this," this percentage for disk label a, this for b, and so on.

OpenBSD install basically started out as a prototype that escaped. I haven't really had time to go back and revamp it like I wanted to.

DDJ: How does OpenBSD IPv6 compare to the other free UNIXes?

WL: Itojun-san of the Kame project in Japan seems to be six different people inhabiting one body, as far as his ability to hack [the network protocol stack]. He makes sure that FreeBSD, OpenBSD, NetBSD, and BSDi remain in sync with the main Kame repositories. For the OpenBSD Crypto2000 sort-of-mini-conference, he attended and got no sleep. When his roommate went to sleep, Itojun was hacking. When he woke up in the middle of the night, Itojun was hacking. When he woke up in the morning, Itojun was hacking.

TdR: All the BSDs are at the same level. It's all the same code base. It's the same APIs. OpenBSD has more in the security side of IPv6 than FreeBSD has because that's not done by Kame.

DDJ: What is the difference in security?

AK: The big difference is that we [of OpenBSD] have a very well integrated IPSec stack. Even more so, we're the only free project with any real support for hardware acceleration for crypto.

Security is of particular interest to us. Most developers are convinced that IPSec is going to play a significant role in whatever OpenBSD is used for, either as a workstation or firewall. We see a lot of firewalls already that could be termed "IPSec VPN boxes".

If you're going to have cheap hardware, 486s, or low-end Pentiums acting as firewalls, they need all the help they can get in terms of cryptography processing. We have support for some crypto cards already (http://www.openbsd.org/crypto.html#hardware). We're looking into adding support for public key cryptography to accelerate OpenSSL and the public key operations in IPSec. That's our next big goal.

DDJ: How does IPSec change normal user operation?

AK: Right now, for various reasons, mostly because it's the simplest thing to do, IPSec is used for VPNs to bridge different sites or for telecommuting.

DDJ: But IPSec itself is much richer than that?

AK: The goal is that every single application that uses the network should be able to request security services from the network stack, to have all its data received and transmitted encrypted, no matter whom it is talking to.

DDJ: "The Death of FTP"?

AK: Not the death of FTP. Maybe there's a new command line argument to FTP. Maybe secure is the default setting of the OS. Maybe the user's profile says "FTP in secure mode only." We support some of that right now but it's not very well meshed.

DDJ: So the applications that everyone is using haven't caught up with the facilities offered in IPSec.

AK: Yes.

DDJ: What about the general viability of all the BSDs? Are you going to become the "OS/2" to Linux's "Windows"?

WL: That's a little scary! The BSDs are not in any kind of decline. If anything, the past year has been something of a renaissance. A lot of net companies are using various BSD boxes. People are installing more firewalls and more IDS boxes, and we're seeing that a lot of those are BSD-based. A lot of that is due to the strength of the BSD IP stack.

AK: I suspect the next big explosion for BSD will be embedded systems. We're seeing a lot of vendors that want to have a real operating system in their device. They need to have a file system, even in flash. They want full IP support. They want to download and run executables. Java is one approach, others are taking a more pragmatic and traditional approach. It's no longer feasible for a small company to roll out its own operating system. All the BSDs and Linux have legacy code that already runs.

WL: Furthermore, a lot of embedded systems are based on processors of the past that BSD has already been ported to. MIPS leaps to mind as an example. Support for StrongARM and PPC allows BSD and Linux a foot in the door in such designs.

DDJ: What's the main difference between BSD and Linux?

WL: The strong central source repository. You know what you're building. With Linux, "You need this, and you need this, and get this somewhere else, and today we just discovered that you need these twelve patches." There's no way to keep up with that. It's crazy-making.

With the BSDs, you synchronize to the sources, "make world" or "make build", and you know exactly what's running on your machine. From a security point of view, that's good.

DDJ: What's this BSD merger about?

WL: BSDi bought Walnut Creek CDROM (http://www.cdrom.com). Walnut Creek has been a big FreeBSD supporter for a long time. BSDi is now providing hardware infrastructure and hiring developers to work on FreeBSD.

DDJ: Has there been cross-pollination between the Linux and BSD kernels?

WL: Not a lot, though sometimes one steals ideas. Linux, for instance, stole part of the BSD networking stack. [Pauses.] All of it.

AK: There used to be a huge difference between the Linux kernel on the one hand and the BSD kernels on the other. The main reason was the lack of cohesion, lack of grand vision in Linux. The subsystems were developed independently. The communication between the people who were controlling the development of the Linux kernel was not as close as in the BSDs. That has probably changed recently.

There's also some cross-pollination in the other direction on device drivers. Companies are perhaps more willing to give documentation on their devices to Linux hackers than to bother to deal with the BSDs. So the way to support a new piece of hardware in BSD is to find the Linux driver and use it as documentation for the device.

DDJ: So there's not any real big difference now between Linux and BSD kernels?

AK: One of the recent changes that I know of in the Linux kernel is the threading of the drivers. It's a fairly good idea, I don't know if we want to move there right now, but it's the direction we are going. In the last major kernel update that's the approach I took in the crypto subsystem.

WL: When an interrupt happens in BSD, you raise the processor SPL, no other interrupts can happen, and the driver executes. What Linux is moving to, and what FreeBSD SMP was also moving to, [is that] each driver has its own thread, so that when it is executing it doesn't necessarily block all the interrupts. It becomes more independent and the multiprocessing capabilities become more scalable. You can say, "I'll just run this thread on this CPU and that thread on that CPU doing device driver things" and you don't have the SPL issues you have in a traditional BSD kernel. Solaris is like that, everything is a thread in the Solaris kernel.

DDJ: So is SMP supported in the BSDs these days?

TM: People say SMP when they really mean ASMP or just "MP".

WL: A lot of the early Linux kernels claimed to be SMP but they were only run on one processor. So which SMP do you mean?

TdR: Or, whose idea of SMP?

WL: With FreeBSD 4.x, it uses a big, giant lock. One of the things BSDi is supporting is porting BSDi fine-grained threading and fine-grained locking to FreeBSD, which will make the multiprocessing more symmetric.

TdR: In NetBSD, Bill Sommerfeld has been working on SMP stuff, so ... The cost of buying a dual-processor machine is probably greater than buying a great single-processor machine, and most people get more out of a fast single processor. And machines are coming with multiple processor cores on a single chip.

WL: The main reason FreeBSD put multi-processor support in is that we had several customers at ISPs who had limited rack space and who could get more out of another CPU.

DDJ: OpenBSD strikes me as being an aesthetic revolt against something that happens to operating systems as they become used. When I installed OpenBSD and found it so minimalist, I said, "I have finally found the free Unix that's the closest thing to FORTH."

TdR: Before I did OpenBSD I actually wrote a FORTH compiler, in 1987, that booted diskless on a Sun 3/50 out of the boot ROM.

DDJ: Chuck Moore still rails that programming is vastly too complicated, that it's just job insurance for programmers, that things have to be smaller and simpler.

TdR: Right. As we keep on looking at source code, we find that most people can't write more than twenty lines of code in C. They make mistakes that matter twenty years later, that become security holes, buffer overruns, races, misuses of the API. Misuses of the API is the killer. Calling them and thinking they work one way but they don't. strncpy(), strncat() ... no one knows how they work.

TM: strncpy() and strncat() work differently. If you expect one behavior, you'll be right half the time. Does it matter? The OpenBSD experience shows that, yes, it does matter, and that the people who were guessing guessed wrong half the time.
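
To make the difference concrete, here is a minimal sketch in C. The helper names copy_with_strncpy and append_with_strncat are hypothetical and exist only for illustration. It relies on two documented behaviors: strncpy() leaves the destination unterminated when the source fills it, while strncat() takes the number of characters to append (not the buffer size) and always writes a terminating NUL after them.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dstsize) with strncpy().
 * strncpy() does not NUL-terminate when src is as long as the count or
 * longer, so the caller has to terminate by hand. */
void copy_with_strncpy(char *dst, size_t dstsize, const char *src)
{
    if (dstsize == 0)
        return;
    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';
}

/* Hypothetical helper: append src to dst (capacity dstsize) with strncat().
 * The third argument is the maximum number of characters to append, and
 * strncat() writes a NUL after them, so the room left must be computed
 * first. Assumes dst already holds a NUL-terminated string. */
void append_with_strncat(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);
    if (used + 1 >= dstsize)
        return; /* only the existing terminator fits */
    strncat(dst, src, dstsize - used - 1);
}
```

Passing the same size argument to both functions, as if they shared a convention, is the kind of guess that turns out wrong: with strncat() it can write past the end of the buffer.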

TdR: We went through our source tree and fixed all the strncpy() errors. On the whole, we found about one percent were correct in a 300-megabyte source tree.

DDJ: Give some practical advice to professional programmers. How did you learn to be right?

TdR: The rules were set a long time ago. It's just that someone started paying attention.

WL: Read the man page and make sure you understand it, not just what you think it says. You can also use an alternative API that's harder to use incorrectly, like the strl* routines that Todd did, strlcpy(), strlcat() ...

TdR: They're still not in glibc. They're everywhere else. They're in Solaris. We invented them two years ago. They're showing up in vendor operating systems. We made a convincing argument why these things are necessary. Todd and I wrote a paper on it. He gave a talk on it at a previous USENIX, and I talked about it at a conference in Australia.

TM: They're very simple APIs, a tiny amount of code, it barely makes a paper. You have a consistent API, "I'm writing to here, this is where I'm starting, and this is how much space I have." People can understand that.

Most of the problem people have with strncpy() and strncat() is that they often require pointer arithmetic. People often get this wrong. It was really fun to do an 11-page paper on approximately 11 lines of code!
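
The contract TM describes (destination, source, total size of the destination) can be sketched in a few lines of C. This is an illustrative re-implementation written for this note, not the OpenBSD source. The names sketch_strlcpy and sketch_strlcat are made up so that they cannot clash with a system that already ships strlcpy() and strlcat().

```c
#include <string.h>

/* Illustrative only. Copies at most size-1 bytes of src into dst and
 * always NUL-terminates when size > 0. Returns strlen(src), so a result
 * >= size tells the caller the copy was truncated. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Illustrative only. size is the full size of dst, not the space left.
 * Returns the length of the string the caller tried to create. */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    const char *end = memchr(dst, '\0', size);
    size_t dstlen = end ? (size_t)(end - dst) : size;
    if (dstlen == size)
        return size + strlen(src); /* dst not terminated within size */
    return dstlen + sketch_strlcpy(dst + dstlen, src, size - dstlen);
}
```

Both calls take the same size argument (typically sizeof buf), and truncation is detected the same way in both, by checking whether the return value is >= size. No pointer arithmetic is needed at the call site.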

DDJ: Why bother with this when we now have std::string?

TdR: We're not C++ programmers. There is one program written in C++ in OpenBSD, out of 300 megabytes of source code.

DDJ: Isn't everything going to be written in C++ someday?

TdR: You need a simpler language. The problem we deal with today is that the language is too complicated, and then they want to add C++ to it?

A friend of mine works at a company that writes Microsoft applications. They have a piece of paper they constantly update listing APIs and parts of C++ you're not allowed to use. It's four pages long.

WL: The biggest problem people have with C++ is that they see a feature and feel they have to use it. That's wrong.

TdR: Another problem: one C++ programmer cannot debug another C++ programmer's code!

DDJ: I'm not sure I agree with that, but C++ does allow one to create vastly different kinds of structure. Java has the same problem.

TdR: Even in C, there are stylistic things that people do that cause great problems. In particular this fellow Angelos over here keeps indenting his code differently than the rest of us do. Whenever we look at his code we have to re-orient ourselves to how he does stuff!

TM: When everything is consistent, errors stand out. When you're dealing with security, finding errors is important.

TdR: When we do audits, we do trolls through the source code looking for particular types of errors. When it's consistent, we can go through very quickly.

You have to be pretty particular about automatic indenters. You should run qmail through indent sometime. It doesn't compile afterwards!

WL: It finds bugs in indent. People have a brief period of time where they are focused and they get 80 percent of it right, and that's indent, or whatever. The other 20 percent that's really hard, they don't have the fire for it anymore.

I use emacs to indent. If it has a problem, I've usually made a syntax mistake. I don't use the syntax coloring, though, I find it distracting, and only recently have I worked on machines that run it fast enough.

TdR: The whole thing is creating environments that are conducive to what we do, which is improve quality. Not just making correct code, but taking existing code and fixing it, which is most of what we do.

TM: You look at code from the '80s and before, then you look at the code we write now, there seem to be stylistic things that change over time. Not just one person's thing, but schools of programming. Looking at old code, I can read it, but I can't just understand what it does until I fix this and that and change the formatting. When things look right, problems are obvious. You're not going line by line.

WL: A lot of old-time code on the net was written when the school of thought was, "That will never happen. Why waste my time and a few bytes of memory?" Consequently, that code has more errors than today when memory's cheap, and people say, "Why not check for errors?"

TdR: Nowadays you end up with very large chunks of code that check for errors, and once in a while miss errors, but since the code is now so large, when it fails, it's hard to find out why it failed! It seems there's no happy medium.

WL: I've seen code that tries to determine if getpid() fails. It can't.

AK: On the other hand, how many people check if close() fails? Not many. But since AFS, you actually have to check. There's nothing you can do, but at least you can report it.

TM: With fclose(), there are buffered stdio writes in there, and you want to know. You can at least warn.
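
A minimal sketch of the "at least warn" approach, in C. The helper name write_text_file and its return convention are assumptions made for this example. The point it illustrates is that data written through stdio may still be sitting in the buffer, so a write error can first show up when fclose() flushes.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write text to path and report a failure from any
 * step, including fclose(), where buffered data is flushed.
 * Returns 0 on success, -1 on failure. */
int write_text_file(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        fprintf(stderr, "fopen %s: %s\n", path, strerror(errno));
        return -1;
    }

    int failed = 0;
    if (fputs(text, fp) == EOF) {
        fprintf(stderr, "fputs %s: %s\n", path, strerror(errno));
        failed = 1;
    }

    /* Close exactly once: the stream is gone whether or not this fails,
     * so all that is left to do is report the error. */
    if (fclose(fp) == EOF) {
        fprintf(stderr, "fclose %s: %s\n", path, strerror(errno));
        failed = 1;
    }
    return failed ? -1 : 0;
}
```

On a full disk, for instance, fputs() can appear to succeed while fclose() returns EOF. A caller that ignores the return value never learns the file is incomplete.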

AK: What's to guarantee, though, that next year there won't be some new operating system feature where, if it fails, you have to re-issue close() ...

TdR: That's a slippery slope. You're saying it's okay to change the way the UNIX APIs work.

AK: But close() was expected to fail sometimes, that's why there's a return.

TdR: close() was void before AFS.

TM: But fclose() was not. You see this kind of thing with Perl code. Perl is a nice language, I use it a lot, it's great for whipping up things and some large projects. Perl programmers have a tendency not to check return codes. Perl doesn't make that too easy, because there are some things not obvious that you really should check.

AK: People who program in Perl, like I do, often use Perl because they want something really fast, and then they throw it away. Some of this code escapes and becomes real projects, but the mentality remains, to the author's dismay. We all have horror stories about these programs we wrote years ago, then five years later we get e-mail, "This doesn't work, can you fix it?"

DDJ: Michael Cowlishaw, inventor of Rexx, tells about a program he wrote in 1979. He got e-mail in 1994 from someone who found a bug. The amazing thing was that the program still worked over the various releases of IBM VM/ESA.

TdR: That's because they didn't change the close() call! [general laughter].

WL: close() may be a silly example, but a better example is setuid().

TdR: setuid() was subtly changed by POSIX to add saved UIDs. The result was that, in 1992, Perl's use of setuid() caused a security hole because the system call had changed. The existing binary running on top of a new kernel had the hole. It didn't affect any other programs, but it affected one that mattered.

WL: I saw some programs in FreeBSD ports that had that same bug in them.

TdR: We were very careful. We put seteuid()s in front of every single setuid(). We decided to avoid the problem instead of having to read all those pieces of software. Since that time, we have removed some seteuid()s where we are sure it's safe.

DDJ: But make it safe first and then peel away the layers.

TdR: When we get time.

DDJ: Theo said earlier to me that it's not about security, it's about quality, that if you write software that performs according to its specification, it can't be insecure if you use it in the correct way.

TM: If the specification is a secure specification. Compare rsh and that family of things, which are inherently insecure because they rely on trusting that the IP address at the remote side is good and that the remote side hasn't been holed. There are flaws in the specification that inherently imply insecurity.

TdR: There are flaws in everything. SSL trivially opens you up to a denial-of-service attack, because the attacker can make you chew CPU calculating keys. So does IKE.

DDJ: What's the thing with IKE? There are guys running around here wearing buttons, "I don't like IKE."

TdR: The protocol used for IPSec key negotiation.

DDJ: What's the problem with it? I asked them but I didn't get a clear answer.

TdR: It's an insanely complicated protocol, therefore, the reasons it is broken are insanely complicated.

AK: The problem is 300 pages of specification. The implementation we have of this in OpenBSD is about 36,000 lines of code without the crypto. That's just the protocol. I can't think of one single piece of code that size and start to debug it.

TdR: The Boeing 747 flight control deck has about 30,000 lines of code, separated into 12 independent modules. That's the right way to do things.

DDJ: Systems get larger, but code still has to be broken up so the human mind can grasp it.

TdR: I don't see any magic bullet.

WL: Most of the IDEs don't reduce complexity, they just help you manage it. To have something that works well, you have to reduce complexity.