Monday, August 27, 2012

Topspin 2.1 and RHEL 6.3 - how to get out of this bind

The bug had bitten me again. When I decided to replace my ageing computer (9+ years old now) that runs Topspin 2.1 PL6 with an Avance console, I wanted to go with Red Hat Enterprise Linux 6.3. A bit of nostalgia: in the days when SGI workstations roamed the earth, XWINNMR was the software ruling Brukerland, and it had the special requirement that the graphics card support 8-bit colour depth, even as computer hardware and operating systems moved on to 24-bit depth as the norm. The graphics card that did double duty has become so obsolete of late that its 'mga' driver seemed to be causing problems and slowing down my system. That is why I decided to move on to newer hardware. That, and I also wanted to move away from ATAPI hard drives.

Instead of making life easy by going with RHEL 5, I decided to go with 6.3.

Problem : CCU won't boot.

Ending : happy ending :-)   

I will quickly summarize what the matter was.

Background : 

Digging a bit with the wireshark packet sniffer, we could clearly see that the initial conversation between 'spect', which is the CCU11 board, and 'ASP_ST2', which is the Linux box, does take place. It proceeds to the point where spect, the diskless client, asks for the port number on which bootparamd is listening. It tries to get this information from the ASP_ST2 server. bootparamd, like nfs or rquotad, belongs to the RPC family of servers. Normally, these servers register themselves with a program called portmap, which in turn tells a connecting client such as spect which port number it should use to talk to a particular server, in this case bootparamd.
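For readers who want to see what that port lookup looks like on the wire, here is a minimal Python sketch of a portmap version 2 GETPORT call and the parsing of its reply. It is not what the CCU11 actually runs; it only follows the standard ONC RPC message layout. The function names (build_getport_call, parse_getport_reply, query_port) are made up for this illustration, and bootparamd version 1 over UDP is taken from the rpcinfo listing in this post.

```python
import socket
import struct

PMAP_PROG = 100000        # RPC program number of the portmapper itself
PMAP_VERS = 2             # portmap protocol version 2
PMAPPROC_GETPORT = 3      # procedure number of GETPORT
BOOTPARAM_PROG = 100026   # bootparamd, as listed by rpcinfo -p
IPPROTO_UDP = 17


def build_getport_call(xid, prog, vers, proto=IPPROTO_UDP):
    """Return the bytes of a portmap v2 GETPORT call (hypothetical helper)."""
    header = struct.pack(
        ">10I",
        xid,                # transaction id, echoed back in the reply
        0,                  # message type: CALL
        2,                  # ONC RPC protocol version
        PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT,
        0, 0,               # credentials: AUTH_NONE, empty body
        0, 0,               # verifier: AUTH_NONE, empty body
    )
    # GETPORT arguments: program, version, protocol, port (port is ignored)
    args = struct.pack(">4I", prog, vers, proto, 0)
    return header + args


def parse_getport_reply(data):
    """Return (xid, port) from an accepted GETPORT reply; port 0 means 'not registered'."""
    if len(data) < 28:
        raise ValueError("reply too short")
    xid, msg_type, reply_stat, _flavor, verf_len, accept_stat, port = struct.unpack(
        ">7I", data[:28]
    )
    # This sketch only handles the common case of an empty AUTH_NONE verifier.
    if msg_type != 1 or reply_stat != 0 or verf_len != 0 or accept_stat != 0:
        raise ValueError("not a successful RPC reply")
    return xid, port


def query_port(host, prog=BOOTPARAM_PROG, vers=1, timeout=2.0):
    """Ask the portmapper on host (UDP port 111) where prog is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_getport_call(0x1234, prog, vers), (host, 111))
        data, _addr = sock.recvfrom(1024)
    return parse_getport_reply(data)[1]
```

Calling query_port("149.236.99.1") from another machine on the spectrometer subnet would mimic the request that spect sends. A returned port of 0 corresponds to the rejected reply described in this post.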

I show here a packet in which this request for a port number from spect (IP 149.236.99.99) to ASP_ST2 (IP 149.236.99.1) is rejected. In the upper half of the window, which summarizes the traffic, note the last line: "Portmap GETPORT Reply Port:0 PROGRAM_NOT_AVAILABLE". Also note the third line, Internet Protocol, which shows who is sending this message to whom, i.e. 149.236.99.1 sends this packet to 149.236.99.99.


 

Problem and resolution : 

With RHEL 6.3 (which is derived mainly from Fedora 12 and 13), the newer program rpcbind is used in place of the conventional portmap. The corresponding Wikipedia page underlines the fact that these two programs are different avatars of the same entity: portmap is the older version and rpcbind the newer one. With RHEL 6.3, for an inexplicable reason, both portmap and rpcbind are installed and turned on by default.

On a system where the portmapper function is working, you can enter the command rpcinfo -p and get output similar to this :

 program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  34528  status
    100024    1   tcp  52726  status
    100011    1   udp    875  rquotad
    100011    2   udp    875  rquotad
    100011    1   tcp    875  rquotad
    100011    2   tcp    875  rquotad
    100005    1   udp  52514  mountd
    100005    1   tcp  55703  mountd
    100005    2   udp  50364  mountd
    100005    2   tcp  48481  mountd
    100005    3   udp  58813  mountd
    100005    3   tcp  53255  mountd
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100227    2   tcp   2049  nfs_acl
    100227    3   tcp   2049  nfs_acl
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100227    2   udp   2049  nfs_acl
    100227    3   udp   2049  nfs_acl
    100021    1   udp  46281  nlockmgr
    100021    3   udp  46281  nlockmgr
    100021    4   udp  46281  nlockmgr
    100021    1   tcp  46143  nlockmgr
    100021    3   tcp  46143  nlockmgr
    100021    4   tcp  46143  nlockmgr
    100026    1   udp    721  bootparam

Note that the portmapper (whichever implementation it is, i.e. rpcbind or portmap) always listens on port 111.

When both rpcbind and portmap are running, you get an error message saying : No RPC program registered.

I tried turning off rpcbind using chkconfig and left the older portmap running. The problem remained. But when I turned off portmap and left the newer rpcbind running, I could see that the RPC servers were registering with the portmapper, as shown in the listing above.
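If you would rather check this from a script than read the listing by eye, the following Python sketch parses rpcinfo -p output and looks for the bootparam entry. It assumes the five-column layout shown in the listing above and Python 3.7 or newer. The helper names (parse_rpcinfo, find_service, rpcinfo_output) are my own and not part of any Bruker or Red Hat tool.

```python
import subprocess


def parse_rpcinfo(text):
    """Parse `rpcinfo -p` output into (program, version, proto, port, service) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that does not look like a data row.
        if len(fields) not in (4, 5) or not fields[0].isdigit():
            continue
        # The service name column can be empty for unknown program numbers.
        service = fields[4] if len(fields) == 5 else ""
        entries.append((int(fields[0]), int(fields[1]), fields[2], int(fields[3]), service))
    return entries


def find_service(entries, name):
    """Return every registration whose service column equals name."""
    return [entry for entry in entries if entry[4] == name]


def rpcinfo_output(host="localhost"):
    """Run `rpcinfo -p HOST` and return its text output (needs rpcinfo on the PATH)."""
    result = subprocess.run(
        ["rpcinfo", "-p", host], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the spectrometer host, find_service(parse_rpcinfo(rpcinfo_output()), "bootparam") should return a non-empty list once the registration works. An empty list would match the failing case, where the GETPORT reply comes back with port 0.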

With this setup, the CCU11, i.e. spect, boots correctly. Now, let us look at the same packet, where ASP_ST2 is sending a reply to spect for the latter's GETPORT request. As before, in the upper half of the packet image, the last line shows :
Portmap GETPORT Reply Port: 724. Now, if you run rpcinfo -p, you can confirm that port 724 is where bootparamd is listening.


A sidetrack on bootparamd and dhcpd :

bootparamd predates the DHCP protocol. It serves the /usr/diskless/client file tree to the CCU11, which then bootstraps from the code contained therein until a minimal UNIX environment takes shape.

From TS2.1 onwards, installing diskless from the Topspin DVD also automatically installs a dhcpd.conf under the /etc/ directory. A couple of points on that :
  • As long as your console does not have the newer IPSO, you don't need this dhcp server to be running. The entire diskless boot happens via bootparamd.
  • If you happen to upgrade to a newer console, AVANCE-II or III, that has an IPSO, you need the dhcpd daemon to be running.
  • With RHEL 6.3, you should place the Bruker-supplied dhcpd.conf in the /etc/dhcp/ directory, since that is where the startup script expects the conf file. Your dhcpd daemon will not start if the file sits only in the old default location, /etc/dhcpd.conf.

Tuesday, August 7, 2012

Topspin 2.1 Acquisition install and RHEL 6.3

Our previous posts on Topspin 2.1 install issues on a 64-bit system pertained mostly to a data processing workstation. When you include the full acquisition suite, which essentially means the diskless client, you will face a few more obstacles before you are done. These have more to do with the modern EL6.3 version than with the fact that the system is a 64-bit one.

This post offers a few, hopefully helpful, pointers on those.

bootparamd : As of Topspin 2.1, diskless still relies on the bootparamd daemon for uploading to the spect client from the ASP_ST2 server. With Enterprise Linux 6.3, bootparamd is no longer available as an rpm. We have to either copy it from the TS2.1 install DVD and install it, or pull a hopefully fresher version from the web. I normally look on www.rpmfind.net

  • The install program gives this message box : 

You can in principle install the version from the TS2.1 disk, but it is of the RHEL4 flavour. I looked on rpmfind.net and found a version built for CentOS 5.8, which is close enough. The version I pulled as of this writing :   0.17-26.el5_7.1.x86_64.   Please remember that the daemon runs on the 64-bit host, which is why I installed the x86_64 version.

portmap : You will see an error message about the missing rpm for portmap.  Here is the message box :


  • Once again, I pulled the 64-bit CentOS 5.8 version of this rpm and installed it.  By the way, if you try installing this via the GUI and get an error message, simply copy /tmp/portmap.rpm to some other location and use rpm -ivh portmap_xxx.rpm to complete the install.

tftp : Unless you installed it specifically, you will get a message about missing tftp (trivial file transfer protocol). This should be available in the standard RHEL 6.3 repo and you can install it readily.

dhcp : Since the spectrometer host is also the ASP_ST2 server, it needs to run a dhcpd server to provide the static IP to spect in the 149.236.99.0 subnet.  If you have not done so already, you can install the dhcp package from the standard repo.

To check whether the above steps worked for you, start the Topspin install program again and this time install only diskless from the list of available modules. If it completes without any errors, you are ok.
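Before re-running the installer, you can also confirm that the four packages discussed above are present. This is a small Python sketch built around rpm -q, which exits with status 0 when a package is installed. The package names in REQUIRED are my assumption, taken from the names used in this post (for example, the tftp server may be packaged separately as tftp-server on your system), and the helper names are hypothetical.

```python
import subprocess

# Assumed package names, taken from the names used in this post.
REQUIRED = ["bootparamd", "portmap", "tftp", "dhcp"]


def rpm_installed(name):
    """Return True if `rpm -q NAME` exits with status 0 (package installed)."""
    result = subprocess.run(
        ["rpm", "-q", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_packages(names=REQUIRED, is_installed=rpm_installed):
    """Return the subset of names that is_installed reports as absent."""
    return [name for name in names if not is_installed(name)]
```

Calling missing_packages() on the spectrometer host returns an empty list when everything is in place. The is_installed parameter is there only so the logic can be tried out on a machine without rpm.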

eth1 : If, like me, you are building a new system to replace an existing spectrometer host, make sure that you have a second ethernet NIC in the system. If not, the TS install program will complain about not being able to start the eth1 network card.  This is ok, as long as you remember to put that second ethernet card in the system and configure it as eth1, either by hand or by using NetworkManager.

If eth1 is not already configured at the time of the TS2.1 install, the diskless install will complain that the dhcpd daemon cannot start. This is because the /etc/dhcpd.conf file is configured to listen for client traffic on the interface eth1, which is not up and running.  You can see these error messages in /var/log/messages by simply grep-ing for dhcpd.