How open is your source?
August 14, 2006
My name is Juan, and I’d like to talk a bit about some of the things I learned while porting BixAgent to FreeBSD and Mac OS. Since I learned so much, I’ll break it up into a few posts. This one is about FreeBSD and how the free and open philosophy of its community makes it a great OS to work with.
When most people hear the words “open source operating system”, the first thing that comes to mind is usually Linux. It’s kind of interesting when you think about it, because Linux and most of the GNU stuff that usually comes with it aren’t really what I would call open. The license that the code is released under is typically the GPL. It basically says that if you use the code in some new software that you distribute, then your new code must also be released with the same restrictions. The idea behind this is to ensure that no one can compete against an open source project using that project’s own source code.
This is all well and good if you consider some class of developers and organizations to be worthy recipients of the benefits of openness and others to be the sworn enemies of freedom. If you want to develop software without making the source code available, you can’t build it on anything licensed under the GPL. That’s fine, though maybe the name open source is really a misnomer, since to these people the source is most definitely closed. If you’re a regular software company that doesn’t want to release the source to your product, you shouldn’t even be looking at GPL’d source code that’s similar to your product if you can help it. What’s worse is that if you wanted to release your code under a license that’s less restrictive (i.e. more free), you wouldn’t be allowed! There are actually degrees of openness, and the GPL is pretty low on that scale.
So what does all this have to do with anything? Most of FreeBSD is licensed under the BSD license. It pretty much allows you to do whatever you like with the code. That means there’s an amazing resource available to you if you’re working on low level tools such as BixAgent.
When I first started working on the FreeBSD port of BixAgent, I was totally lost. I was directed to look at the man page for sysctl to get started. That’s probably the most useful bit of real documentation that I found, but it’s far from being a complete reference. I’ll share the details of how to get a lot of good stuff from sysctl in a later post.
So about that source code. Let’s say you want to find out how to get the free disk space on a partition. The command that is most commonly used to report this information is df. Assuming you installed the source code with the base system, you can find the source for df very easily. First type which df to find the location of the executable. If your installation is typical, you’ll see that when you run df it uses the binary at /bin/df. The directory tree under /usr/src mirrors that of the base system. For example, the source to the tools in /bin such as df can be found in /usr/src/bin. The source for df can be found at /usr/src/bin/df.
Most of these tools have fairly simple source code, so it is often more useful to look at the source for a tool than it is to try to find out what you want by searching the web. Just by scanning through df.c I see a few things that might be useful for further research. The first thing in the file is the license. You should really read through that just to make sure. The one for this file pretty much says I can do whatever I want as long as I reproduce the copyright notice. Sounds good to me. Shortly after that there’s the list of #include lines. Since this is the only source file for the program, every API function needed should be declared in the headers in this list. If you look at the list, besides the usual stuff there’s <sys/sysctl.h> and <sys/mount.h>. Finding out what those headers contain could give you some big clues. If you continue looking at the source, you’ll notice that it uses a function called getmntinfo. The man page for this says that it returns a statfs structure for each mounted file system. The man page for statfs contains the definition for that structure.
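To make that concrete, here is a minimal sketch of calling getmntinfo yourself. It is not taken from df.c. The function name list_mounts is made up for this example, and the platform guard is my own assumption so that the file still compiles on systems that lack getmntinfo. The headers are the ones the FreeBSD man page lists for the call.

```c
#include <assert.h>
#include <stdio.h>

#if defined(__FreeBSD__) || defined(__APPLE__)
#include <sys/param.h>
#include <sys/ucred.h>
#include <sys/mount.h>
#endif

/*
 * list_mounts is a hypothetical helper, not something from df.c.
 * It prints one line per mounted file system and returns how many
 * it found, or -1 if the information is not available.
 */
int list_mounts(FILE *out)
{
    assert(out != NULL);
#if defined(__FreeBSD__) || defined(__APPLE__)
    struct statfs *mounts;

    /* getmntinfo owns the returned array, so we must not free it. */
    int count = getmntinfo(&mounts, MNT_NOWAIT);
    if (count == 0)
        return -1; /* getmntinfo reports failure by returning 0 */

    for (int i = 0; i < count; i++) {
        fprintf(out, "%s on %s: %llu of %llu blocks free\n",
                mounts[i].f_mntfromname,
                mounts[i].f_mntonname,
                (unsigned long long)mounts[i].f_bfree,
                (unsigned long long)mounts[i].f_blocks);
    }
    return count;
#else
    /* Assumption: no getmntinfo here, so report "unavailable". */
    (void)out;
    return -1;
#endif
}
```

Calling list_mounts(stdout) from main on a FreeBSD box should print one line per mounted file system. MNT_NOWAIT asks for whatever the kernel already has cached rather than refreshing each file system first.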
The most useful fields in the structure for what we’re looking to do are:
uint64_t f_blocks; /* total data blocks in filesystem */
uint64_t f_bfree; /* free blocks in filesystem */
With these data we can find out the number of free blocks and the total number of blocks, and with simple arithmetic we can figure out the number of used blocks. Great! Almost. This is pretty useful, but you might want that to be in bytes rather than blocks. How many bytes are there per block? The man page doesn’t say, but df knows the conversion and we know everything df knows. We are very close. Looking at the source again and searching for f_bfree a few times leads us to some code that uses the f_bsize member of the statfs struct as the block size. I’m pretty sure that I read through the entire man page before and didn’t notice anything like that. If I look at the man page again, I find this:
uint64_t f_bsize; /* filesystem fragment size */
Oh. Fragment size. The field is named f_bsize, which does imply that it could be the block size, but when I read documentation I tend to expect things to be clear. Now, it’s entirely possible and maybe even likely that a fragment and a block are different things. However, if it’s good enough for df, it’s good enough for me. I’m sure that with some amount of research I could find out, but I think it’s pretty safe to assume that the code is correct as a starting point.
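The arithmetic described above can be sketched in a few lines. This is only an illustration: struct fs_counts is a hypothetical stand-in that copies the three statfs fields mentioned in this post so the example compiles anywhere, and the helper names are made up. It follows df in treating f_bsize as the number of bytes per block, which is the same assumption discussed above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the statfs fields discussed in this post. */
struct fs_counts {
    uint64_t f_bsize;  /* bytes per block, as df treats it */
    uint64_t f_blocks; /* total data blocks in filesystem */
    uint64_t f_bfree;  /* free blocks in filesystem */
};

/* Used blocks are simply the total minus the free ones. */
uint64_t used_blocks(const struct fs_counts *fs)
{
    assert(fs != NULL);
    return fs->f_blocks - fs->f_bfree;
}

/* Convert a block count to bytes using f_bsize as the multiplier. */
uint64_t blocks_to_bytes(const struct fs_counts *fs, uint64_t blocks)
{
    assert(fs != NULL);
    return blocks * fs->f_bsize;
}

/* Print used and free space in bytes for one file system. */
void print_usage(const struct fs_counts *fs)
{
    printf("used: %" PRIu64 " bytes, free: %" PRIu64 " bytes\n",
           blocks_to_bytes(fs, used_blocks(fs)),
           blocks_to_bytes(fs, fs->f_bfree));
}
```

On FreeBSD you would fill in the three fields from a real struct statfs, such as one returned by getmntinfo, and then call print_usage on it.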
I believe that reading code is the best way to learn about programming. Using the code that’s available is a great way to learn how things work on your new OS. Plus you have the added benefit of being able to customize the tools that come with your system. If you get good enough at it you might even want to improve the code and give something back. I’d say that this is what real open source is about.
September 28, 2006 at 5:55 pm
How “open” is the GPL? Not very open. It’s free software:
* http://www.gnu.org/philosophy/free-sw.html