Those with anti-Microsoft postures might think this will be a missive about Microsoft. It is not. Microsoft will not be mentioned here apart from these two sentences.
The monoculture to which I refer is that of building dependencies upon particular packaging mechanisms in open source tools, or upon specific distributions.
We are trying to build OFED-1.x for OpenSuSE 10.2 in order to provide InfiniBand driver support to a customer’s cluster. OFED supports RPM-based distributions, and only particular versions from Red Hat and SuSE at that. Debian/Ubuntu need not apply.
Most Linux software uses GNU autoconf, which is annoying and complex IMO. Not as bad as some other tools, but just hard enough to make developers think twice about it. You have to work hard with autoconf to get the right arguments together for your scripts. Once this is done, it can do a fairly respectable job of building makefiles for you. If there are ways to automate this and make it easier and less error-prone, by all means, point me towards such a tool.
RPM likes autoconf, or more correctly, the GNU configure script that autoconf generates. RPM is set up to run it.
Ok, I am setting up the dominoes to be knocked down right now.
When building OFED 1.x on OpenSuSE 10.2 I get
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir /var/tmp/OFEDRPM' --define '_prefix /usr' --define 'build_root /var/tmp/OFED' --define 'configure_options --with-ipath_inf-mod --with-ipoib-mod --with-mthca-mod --with-core-mod --with-user_mad-mod --with-user_access-mod --with-addr_trans-mod' --define 'KVERSION 18.104.22.168-0.3-default' --define 'KSRC /lib/modules/22.214.171.124-0.3-default/build' --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' --define 'NETWORK_CONF_DIR /etc/sysconfig/network' --define 'modprobe_update 1' --define 'include_ipoib_conf 1' /root/OFED-1.2-rc2/SRPMS/ofa_kernel-1.2-rc2.src.rpm"
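That is a lot of quoting to get right by hand. If you want to re-run a command like this yourself while debugging, here is a minimal sketch in Python of how the argument list could be assembled from a table of macros. This is not what the OFED installer does internally; the function name `rpmbuild_argv` is my own invention, and only a few of the macros from the failing command are shown. It only prints the command, it does not run it.

```python
import shlex

def rpmbuild_argv(srpm, defines):
    """Build the argument list for `rpmbuild --rebuild` from a dict of macros.

    Hypothetical helper for illustration, not part of OFED or RPM.
    """
    argv = ["rpmbuild", "--rebuild"]
    for name, value in defines.items():
        # Each macro becomes one `--define 'name value'` pair.
        argv += ["--define", f"{name} {value}"]
    argv.append(srpm)
    return argv

# Macro names and paths copied from the failing command; a subset only.
defines = {
    "_topdir": "/var/tmp/OFEDRPM",
    "_prefix": "/usr",
    "build_root": "/var/tmp/OFED",
    "build_kernel_ib": "1",
}
argv = rpmbuild_argv("/root/OFED-1.2-rc2/SRPMS/ofa_kernel-1.2-rc2.src.rpm", defines)

# shlex.join (Python 3.8+) quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Keeping the macros in a table means a kernel version or source path changes in one place, and passing the list straight to `subprocess.run` would sidestep shell quoting entirely.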
Digging into the error log, I find this
gcc -Wp,-MD,/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/.ucma.o.d -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.1.2/include -D__KERNEL__ \
-Iinclude2 -I/usr/src/linux-126.96.36.199-0.3/include \
-include include/linux/autoconf.h \
-include /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include/linux/autoconf.h \
-I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -Werror-implicit-function-declaration -fno-strict-aliasing -fno-common -Os -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -fasynchronous-unwind-tables -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/include -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/ulp/ipoib -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/debug -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/hw/cxgb3/core -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/net/cxgb3 -I/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/net/rds -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ucma)" -D"KBUILD_MODNAME=KBUILD_STR(rdma_ucm)" -c -o /var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/i
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c: In function ‘ucma_init’:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c:1062: error: ‘struct miscdevice’ has no member named ‘class’
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c: In function ‘ucma_cleanup’:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c:1076: error: ‘struct miscdevice’ has no member named ‘class’
make: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.o] Error 1
make: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core] Error 2
make: *** [/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband] Error 2
make: *** [_module_/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2] Error 2
make: *** [modules] Error 2
make: *** [modules] Error 2
make: Leaving directory `/usr/src/linux-188.8.131.52-0.3-obj/x86_64/default’
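Grrrr….. It is a mismatch between what the code thinks is in the structure and what the kernel header thinks is in the structure: ucma.c uses a class member of struct miscdevice that the header in this kernel tree does not define.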
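Finding the two lines that matter in a wall of compiler flags and make noise is tedious. A rough sketch of how one might pull just the compiler errors out of such a log follows. The function name `compiler_errors` is my own, and the regular expression assumes the older gcc format seen in this log (path, line number, then "error:"); newer gcc releases add a column number, which this pattern would not match.

```python
import re

# Matches gcc diagnostics of the form "path:line: error: message".
# Assumption: no column number, as in the gcc 4.1 output from this build.
ERROR_RE = re.compile(r"^(?P<path>[^:\s]+):(?P<line>\d+): error: (?P<msg>.*)$")

def compiler_errors(log_text):
    """Return (path, line_number, message) for each gcc error line in a build log."""
    found = []
    for line in log_text.splitlines():
        m = ERROR_RE.match(line.strip())
        if m:
            found.append((m.group("path"), int(m.group("line")), m.group("msg")))
    return found

# A few lines taken from the log in this post.
sample = """\
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c: In function ‘ucma_init’:
/var/tmp/OFEDRPM/BUILD/ofa_kernel-1.2/drivers/infiniband/core/ucma.c:1062: error: ‘struct miscdevice’ has no member named ‘class’
make: *** [modules] Error 2
"""

for path, lineno, msg in compiler_errors(sample):
    print(f"{path}:{lineno}: {msg}")
```

The "In function" context lines and the make errors are skipped on purpose, so what is left is the file, the line, and the complaint.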
I have been told that it is supported in SLES 10, and RHEL4/5. OpenSuSE need not apply.
Here is why I think this is dangerous. Designing packages for specific distributions reduces their ability to be used outside of the targeted distro. This reduces the utility of the package, as distro-specific peculiarities can impede adoption. Sure, Red Hat and Novell may like that, but it effectively creates a mono (or bi) culture of packages and applications. Worse, it ties the package to specific versions which may not be in use for very good reasons, for example because important security or functionality patches have been applied.
This is problematic at best. Ok, now I am going to mention Microsoft.
There are some aspects of the Microsoft “dictatorship” that are to be admired: a single ABI for MPI, and packages that “should” install and work.
No, I am not advocating the Microsoft model over the open source model. I am just hoping that the open source model actually gets a bit more open.
Update: A kernel update and a build_env.sh change appear to have solved some set of problems. I will put these RPMs up on our download site.
This is, BTW, one of the reasons why I think the OSS model is better. I have the power to investigate and change the build environment (while complaining about it online on this blog), not just to call a support number and hope that someone will deign to look at the problem.