The bug had bitten me again. When I decided to upgrade the ageing computer (9+ years old now) that runs Topspin 2.1 PL6 with an Avance console, I wanted to go for Red Hat Enterprise Linux 6.3. A bit of nostalgia: in the days when SGI workstations roamed the earth, XWINNMR was the software ruling Brukerland, and it had the special requirement that the graphics card support 8-bit colour depth, while computer hardware and operating systems had typically moved on to 24-bit depth. The graphics card that did double duty has lately become so obsolete that its 'mga' driver seemed to be slowing down my system. That is why I decided to move to newer hardware. That, and I also wanted to move away from ATAPI hard drives.
Instead of making life easy by going with RHEL 5, I decided to go with 6.3.
Problem: CCU won't boot.
Ending : happy ending :-)
I will quickly summarize what the matter is.
When both rpcbind and portmap are running, you get an error message saying: No RPC program registered.
Background :
Digging a bit with the Wireshark packet sniffer, we could clearly see that the initial conversation between 'spect' (the CCU11 board) and 'ASP_ST2' (the Linux box) does take place. It proceeds to the point where spect, the diskless client, asks the ASP_ST2 server for the port number on which bootparamd is listening. bootparamd, like nfs or rquotad, belongs to the RPC family of servers. Normally these servers register themselves with a program called portmap, which in turn tells a connecting client such as spect which port to use to talk to a particular server, in this case bootparamd.
I show here a packet in which this request for a port number, from spect (IP 149.236.99.99) to ASP_ST2 (IP 149.236.99.1), is rejected. In the upper half of the window, which summarizes the traffic, note the last line: "Portmap GETPORT Reply Port:0 PROGRAM_NOT_AVAILABLE". Also note the 3rd line, Internet Protocol, which shows who is sending this message to whom: 149.236.99.1 sends this packet to 149.236.99.99.
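To make the exchange in that packet concrete, here is a minimal Python sketch of a portmap version 2 GETPORT call and of decoding the reply. The function names are my own, the constants are the standard ONC RPC ones (program 100000 for the portmapper, 100026 for bootparam), and the sketch assumes the AUTH_NONE credentials that such a lookup normally uses. It is an illustration of the protocol, not what the CCU11 firmware actually runs.

```python
import socket
import struct

PMAP_PROG = 100000        # portmapper program number
PMAP_VERS = 2             # portmap protocol version 2
PMAPPROC_GETPORT = 3      # GETPORT procedure number
BOOTPARAM_PROG = 100026   # bootparam program number, as listed by rpcinfo -p
IPPROTO_UDP = 17


def build_getport_call(xid, prog, vers, proto=IPPROTO_UDP):
    """Return the bytes of an RPC call asking the portmapper for prog's port."""
    header = struct.pack(
        ">10I",
        xid,
        0,                                      # message type: CALL
        2,                                      # RPC protocol version
        PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT,
        0, 0,                                   # credentials: AUTH_NONE, empty
        0, 0,                                   # verifier: AUTH_NONE, empty
    )
    # The trailing port field is ignored by the server in a GETPORT call.
    args = struct.pack(">4I", prog, vers, proto, 0)
    return header + args


def parse_getport_reply(data, xid):
    """Return the port from a GETPORT reply; 0 means 'program not registered'."""
    rxid, mtype, reply_stat = struct.unpack(">3I", data[:12])
    if rxid != xid or mtype != 1 or reply_stat != 0:
        raise ValueError("not an accepted reply to this call")
    _flavor, verf_len = struct.unpack(">2I", data[12:20])
    offset = 20 + ((verf_len + 3) // 4) * 4     # skip the padded verifier body
    accept_stat, port = struct.unpack(">2I", data[offset:offset + 8])
    if accept_stat != 0:
        raise ValueError("call was not executed successfully")
    return port


def getport(host, prog, vers, timeout=2.0):
    """Ask the portmapper on host (UDP port 111) where prog/vers is listening."""
    xid = 0x12345678                            # arbitrary transaction id
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_getport_call(xid, prog, vers), (host, 111))
        data, _ = sock.recvfrom(1024)
    return parse_getport_reply(data, xid)
```

Calling getport("149.236.99.1", BOOTPARAM_PROG, 1) from another machine on the spectrometer network should reproduce what spect sees: a return value of 0 corresponds to the "Port:0 PROGRAM_NOT_AVAILABLE" line in the capture.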
Problem and resolution :
With RHEL 6.3 (RHEL 6 was forked from Fedora 12, with features backported from later releases), the newer program rpcbind is used in place of the conventional portmap. Wikipedia underlines the fact that these two programs are different avatars of the same entity: portmap is the older version and rpcbind the newer one. With RHEL 6.3, for an inexplicable reason, both portmap and rpcbind are installed and turned on by default.
On a system where the portmapper is working properly, you can enter the command rpcinfo -p and get output similar to this :
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 34528 status
100024 1 tcp 52726 status
100011 1 udp 875 rquotad
100011 2 udp 875 rquotad
100011 1 tcp 875 rquotad
100011 2 tcp 875 rquotad
100005 1 udp 52514 mountd
100005 1 tcp 55703 mountd
100005 2 udp 50364 mountd
100005 2 tcp 48481 mountd
100005 3 udp 58813 mountd
100005 3 tcp 53255 mountd
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 2 tcp 2049 nfs_acl
100227 3 tcp 2049 nfs_acl
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100227 2 udp 2049 nfs_acl
100227 3 udp 2049 nfs_acl
100021 1 udp 46281 nlockmgr
100021 3 udp 46281 nlockmgr
100021 4 udp 46281 nlockmgr
100021 1 tcp 46143 nlockmgr
100021 3 tcp 46143 nlockmgr
100021 4 tcp 46143 nlockmgr
100026 1 udp 721 bootparam
Note that the portmapper (whichever implementation it is, rpcbind or portmap) always listens on port 111.
I tried turning off rpcbind using chkconfig and left the older portmap running. The problem remained. But when I turned off portmap and left the newer rpcbind running, I could see the RPC servers registering with the portmapper, as shown in the listing above.
With this setup, CCU11 i.e. spect boots correctly. Now, let us look at the same packet, where ASP_ST2 replies to spect's GETPORT request. As before, in the upper half of the packet image, the last line shows :
Portmap GETPORT Reply Port : 724 Port: 724. Now, if you run rpcinfo -p, you can confirm that port 724 is where bootparamd is listening.
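If you would rather not eyeball the rpcinfo -p listing every time, the check can be scripted. The following is a small sketch under my own assumptions: the function names are invented for illustration, and SAMPLE is just a few lines in the same column layout as the listing above, not a fresh capture.

```python
def parse_rpcinfo(text):
    """Turn the output of `rpcinfo -p` into a list of dicts, one per row."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a registration line.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        entries.append({
            "program": int(fields[0]),
            "version": int(fields[1]),
            "proto": fields[2],
            "port": int(fields[3]),
            "service": fields[4] if len(fields) > 4 else "",
        })
    return entries


def ports_for(entries, service):
    """Return the sorted (proto, port) pairs a service is registered on."""
    return sorted({(e["proto"], e["port"]) for e in entries
                   if e["service"] == service})


# Illustrative input only, in the layout rpcinfo -p prints.
SAMPLE = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   tcp   2049  nfs
    100026    1   udp    721  bootparam
"""

if __name__ == "__main__":
    print(ports_for(parse_rpcinfo(SAMPLE), "bootparam"))
```

On the real machine you would feed it live output instead, for example via subprocess.run(["rpcinfo", "-p"], capture_output=True, text=True).stdout. An empty list for "bootparam" means spect will get the port 0 answer.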
A sidetrack on bootparamd and dhcpd :
bootparamd is the precursor to the dhcp protocol. It points the CCU11 to the /usr/diskless/client file tree; the CCU11 then bootstraps from the code contained therein and a minimal UNIX environment takes shape.
From TS2.1 onwards, installing diskless from the Topspin DVD also automatically installs a dhcpd.conf under the /etc/ directory. A couple of points on that :
- As long as your console does not have the newer IPSO, you don't need the dhcp server to be running. The entire diskless boot happens via bootparamd.
- If you upgrade to a newer console (AVANCE-II or III) that has an IPSO, you need the dhcpd daemon to be running.
- With RHEL 6.3, you should place the Bruker-supplied dhcpd.conf in the /etc/dhcp/ directory, since that is where the startup script expects the conf file. The dhcpd daemon will not start with the conf file in the old /etc/dhcpd.conf location.
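The last point is easy to check mechanically. Here is a short sketch that reports whether dhcpd.conf sits where RHEL 6.3 wants it. The function name and the three status strings are my own invention; only the two paths come from the discussion above.

```python
from pathlib import Path


def dhcpd_conf_status(root="/"):
    """Report where dhcpd.conf lives relative to root.

    Returns "ok" if etc/dhcp/dhcpd.conf exists, "move" if only the old
    etc/dhcpd.conf exists, and "missing" if neither is present.
    """
    root = Path(root)
    new_location = root / "etc" / "dhcp" / "dhcpd.conf"
    old_location = root / "etc" / "dhcpd.conf"
    if new_location.is_file():
        return "ok"
    if old_location.is_file():
        return "move"
    return "missing"


if __name__ == "__main__":
    print(dhcpd_conf_status())
```

The root argument is there only so the function can be tried against a scratch directory; on the spectrometer workstation the default of "/" is what you want.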
Recently, I built a RHEL 5.9 box and ended up with exactly the same symptoms, but the cause turned out to be much simpler. When I ran 'rpcinfo -p', I did not see bootparam anywhere. Running 'chkconfig', I found that bootparamd was not turned on at boot time. Once 'bootparamd' was started, the communication with 'spect' succeeded.
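For completeness, the chkconfig check can be scripted as well. This sketch parses one line of chkconfig --list output. It assumes the usual "name 0:off 1:off ... 6:off" layout in an English locale; the function names and the choice of runlevels 3 and 5 as "boot" are my assumptions, so adjust them to your setup.

```python
def parse_chkconfig_line(line):
    """Split one `chkconfig --list` line into (service, {runlevel: enabled})."""
    fields = line.split()
    name, states = fields[0], {}
    for item in fields[1:]:
        level, _, state = item.partition(":")
        states[int(level)] = (state == "on")
    return name, states


def enabled_at_boot(line, runlevels=(3, 5)):
    """True if the service is switched on in every runlevel of interest."""
    _, states = parse_chkconfig_line(line)
    return all(states.get(level, False) for level in runlevels)


if __name__ == "__main__":
    # Illustrative line only, not captured from a real machine.
    sample = "bootparamd     0:off   1:off   2:off   3:off   4:off   5:off   6:off"
    print(enabled_at_boot(sample))
```

If this reports that bootparamd is off, that matches the symptom described here: bootparam never shows up in rpcinfo -p until the service is started.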