UNIX SIG
by Chris Fearnley

A Miscellany

I don't have a good topic for this article, so this month I will sketch out my thoughts on various things.

First, for those of you looking to purchase a Linux CD, my current recommendation leans toward Red Hat. I still think that Debian has several fundamental advantages over Red Hat, but Debian is still "a work in progress" with a few "rough edges". In a few more months, Debian will have migrated to ELF and several of the basic installation problems will be resolved (I hope!). Red Hat is talking with several Debian developers about integrating the two distributions (at least to the extent that .rpm and .deb packages can be installed on each system in a maintainable fashion).

Linus has just released 1.3.49. There are still many bugs in this developmental series of kernels (1.3.xx), but several new features are available. SMP (symmetric multiprocessing) support by Alan Cox is minimally functional; most of the kernel drivers can be selected as loadable modules (in make config); many new drivers are available, including AppleTalk, Token Ring, Arcnet, SMB support for WFW volumes, AX.25 for amateur radio, EQL (serial line load balancing), and Stallion multiport serial card support, among others; there is direct support (in a single source tree) for the Alpha, i386, MIPS, and SPARC processors; and finally, the kernel is now over 400,000 lines of code (for better and for worse). Still, no one knows where the train is headed, but we are all enjoying the ride!

In my efforts to maintain mawk and gawk (two GPL'd implementations of the AWK programming language), I have uncovered some curious differences. This raises the question of whether to make /usr/bin/awk a symbolic link to mawk or to gawk (in either case, some code will work with one implementation and not the other). Now, the 1.3.xx kernel releases use awk to calculate dependencies in C code for the Makefile.
But when mawk is used instead of gawk, gobs and gobs of "no such file or directory" error messages are reported. It turns out that this is due to a peculiar mis-feature in mawk's implementation of getline. This small script will spit out an error message when the file test doesn't exist: { ERRNO = getline dummy < "test" }. Since getline already returns -1 in that case (the script stores it in ERRNO), I don't see any reason for it to spit out anything onto stderr. I've contacted the mawk author and I'm hopeful that there is some way to turn off this error message.

The source distributions of mawk and gawk both contain tests to verify that the program performs adequately after a new build. In the gawk test suite there is this odd script: { gsub(/^[ ^I]*/, "", $0) ; print } where ^I indicates a [TAB] character. Running gawk with this script on the input " ^IThis is a test, this is only a test." produces "This is a test, this is only a test." whereas mawk produces "Thisisatest,thisisonlyatest.". In other words, gawk strips only the leading white space, while mawk strips every run of white space in the line. Of course it is a bit ridiculous to use gsub on a regular expression with a ^ (beginning of line anchor) in it or with an * (zero or more "leftmost longest" matches). As such, I'm not sure which program behaves more responsibly here. But the difference is a bit unsettling.

I have also been compiling Apache 1.0.0 (if I ever get it into a "good" state, I plan on releasing this work to the Debian Project as well). Apache has an option to use the GNU dld (dynamic link editing library) to load its modules at run-time. Since I'm building it for Debian's pre-release ELF system, I might also build a shared library of all its modules (building shared libraries under ELF is trivial: build all of the object files with -fPIC, then when building the shared object use gcc -shared -Wl,-soname,$(LIBSO_MAJOR) -o $(LIBSO) $(OBJS) and it should work).
The question arises which approach is best: dld, ELF shared libraries, or a monolithic Apache binary (since Linux supports shared copy-on-write executables, this isn't at all a bad option either). Now, the givens are that Apache is the only program that can take advantage of any run-time modularity provided by dld and shared libraries, and that there may be dozens of apache-httpd processes in core at any given time. I'm not a compiler expert, so I'm asking: What approach do you think is best?

There have been two meetings of the Philadelphia Linux User's Group (PLUG). PLUG was formed at a PACS meeting, but has no formal ties to PACS. The meetings have been quite fun: hanging out and talkin' shop with Linux users. The third meeting was cancelled due to the big storm. The next meeting is January 9. Look for announcements in phl.announce, or send me e-mail requesting to be on the PLUG mailing list. Many people have suggested we move to a bi-weekly format (we've been trying to meet weekly). What do you think?

Next month at PACS we will have another Q & A session on Unix and Linux.