Chapter 2. Using the Altix UV CMC Software Commands

This chapter describes how to use the chassis manager controllers (CMCs) to power on, manage, and monitor an SGI Altix UV 1000 or UV 100 system.

Connecting to the UV System Controller Network

The console option you choose determines the console type and how the console is connected to the Altix UV 1000 system. Establish either a serial connection or a network (Ethernet) connection to the CMC.

Establish a serial connection

If you have an Altix UV 1000 system and wish to use a serially-connected "dumb terminal", you can connect the terminal via a serial cable to the (DB-9) RS-232-style console port connector on the CMC board of the IRU.

The terminal should be set to the following functional modes:

  • pin 2 - receive

  • pin 3 - transmit

  • pin 5 - ground

  • Baud: 115200

  • Data bits: 8

  • Parity: none

  • Stop bits: 1

  • No flow control
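The terminal settings above can be captured in one place for scripting. The following is a minimal sketch, not part of the SGI software: the dictionary keys are chosen to match the keyword names used by the third-party pyserial package, and the names CMC_CONSOLE_SETTINGS and shorthand are hypothetical.

```python
# Settings from the list above, keyed by the keyword names that the
# third-party pyserial package accepts in serial.Serial(...).
# pyserial is not imported here, so this runs without hardware attached.
CMC_CONSOLE_SETTINGS = {
    "baudrate": 115200,
    "bytesize": 8,      # data bits
    "parity": "N",      # no parity
    "stopbits": 1,
    "xonxoff": False,   # no software flow control
    "rtscts": False,    # no hardware flow control
}


def shorthand(settings):
    """Render the settings in the common '<baud> <bits><parity><stop>' form."""
    return "{baudrate} {bytesize}{parity}{stopbits}".format(**settings)


print(shorthand(CMC_CONSOLE_SETTINGS))  # 115200 8N1
```

With pyserial installed, the same dictionary could be passed as keyword arguments when opening the port that the serial cable is attached to.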

Note that a serial console is generally connected to the first (bottom) IRU in any single rack configuration. For more information, see the “Console Hardware Requirements” section in the SGI Altix UV 1000 System User's Guide.

Establish a Network/Ethernet connection (see SBK port, EXT port, and SMN port in Figure 1-5)

CMCs have their rack and U position set at the factory. The CMC assigns itself IP addresses as follows:

SBK 172.17.<rack>.<slot>

EXT 10.<rack>.<slot>.1

On the system management node (SMN) port, the CMC is configured to request an IP address via dynamic host configuration protocol (DHCP).
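The SBK and EXT address patterns can be expanded for a given CMC as in the following sketch. It assumes that the rack and slot numbers are substituted as plain decimal octets; the function name cmc_addresses is hypothetical and not part of the CMC software.

```python
def cmc_addresses(rack, slot):
    """Return the self-assigned SBK and EXT addresses for one CMC.

    Assumption: <rack> and <slot> are inserted as decimal octets.
    """
    for name, value in (("rack", rack), ("slot", slot)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in one IPv4 octet, got {value}")
    return {
        "SBK": f"172.17.{rack}.{slot}",
        "EXT": f"10.{rack}.{slot}.1",
    }


print(cmc_addresses(1, 1))  # {'SBK': '172.17.1.1', 'EXT': '10.1.1.1'}
```

The SMN port is not covered because its address comes from DHCP rather than from the rack and slot values.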

Either connection, serial or network, will present a login prompt. For more information, see the “Levels of System Control” section in the SGI Altix UV 1000 System User's Guide.

Power on and Booting an Altix UV System from Complete Power Off

To boot an SGI Altix UV 1000 or UV 100 system from complete power off, perform the following steps:

  1. Make sure the power breakers are on.

  2. Establish a serial connection to the CONSOLE port on the CMC (see Figure 1-5 and “Connecting to the UV System Controller Network”), or skip to the next step to use a network connection.

  3. Establish a network connection to the CMC (see “Connecting to the UV System Controller Network”). Use the ssh command to connect to the CMC, similar to the following example:


    Note: This example works only if the /etc/hosts file on the PC that is connected to the CMC (via the network connection) is set up to include the CMCs.


    ssh root@hostname-cmc
    SGI Chassis Manager Controller, Firmware Rev. 0.0.22
    
    CMC:r1i1c> 
    

    Typically, the default password set at the factory is root. After you log in, the CMC prompt appears. CMC:r1i1c refers to rack 1, IRU 1, CMC (see Figure 1-4 and Figure 1-5).

    If the host name is not set up in the PC/workstation's hosts file, you can simply use the IP address of the CMC, as follows:

    ssh root@<IP-ADDRESS>
    

  4. Power up your Altix UV system using the power on command, as follows:

    CMC:r1i1c> power on
    


    Note: You can open a second ssh session to the CMC (ssh root@hostname-cmc) and use the uvcon command to open a console and watch the system power on.


  5. Open a second console to the CMC using the uvcon command to see the system power on, as follows:

    ssh root@hostname-cmc
    SGI Chassis Manager Controller, Firmware Rev. 0.0.22
    
    CMC:r1i1c> uvcon
    uvcon: attempting connection to localhost...
    uvcon: connection to SMN/CMC (localhost) established.
    uvcon: requesting baseio console access at r001i01b00...
    uvcon: tty mode enabled, use 'CTRL-]' 'q' to exit
    uvcon: console access established
    uvcon: CMC <--> BASEIO connection active
    ************************************************
    *******  START OF CACHED CONSOLE OUTPUT  *******
    ************************************************
    
    ******** [20100512.143541] BMC r001i01b10: Cold Reset via NL broadcast reset
    ******** [20100512.143541] BMC r001i01b07: Cold Reset via NL broadcast reset
    ******** [20100512.143540] BMC r001i01b08: Cold Reset via NL broadcast reset
    ******** [20100512.143540] BMC r001i01b12: Cold Reset via NL broadcast reset
    ******** [20100512.143541] BMC r001i01b14: Cold Reset via NL broadcast reset
    ******** [20100512.143541] BMC r001i01b04: Cold Reset via NL
    ...
    


    Note: Use CTRL-] q to exit the console.


  6. Depending upon the size of your system, it can take 5 to 10 minutes for the Altix UV system to power on. When the shell> prompt appears, enter fs0, as follows:

    shell> fs0
    

  7. At the fs0 prompt, enter boot, as follows:

    fs0> boot
    

    The ELILO Linux boot loader is called, various SGI configuration scripts run, and the SUSE Linux Enterprise Server 11 SP1 installation program appears.
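The BMC reset messages that uvcon replays during power on have a regular shape, so they can be split into fields when reviewing a saved console log. The sketch below is illustrative only: it assumes the bracketed stamp is YYYYMMDD.HHMMSS, and the name parse_console_line is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# ******** [20100512.143541] BMC r001i01b10: Cold Reset via NL broadcast reset
LINE = re.compile(r"\*+\s+\[(\d{8}\.\d{6})\]\s+BMC\s+(r\d+i\d+b\d+):\s+(.*)")


def parse_console_line(line):
    """Return (timestamp, bmc_id, message), or None for other console lines."""
    match = LINE.search(line)
    if match is None:
        return None
    # Assumption: the stamp is date.time in YYYYMMDD.HHMMSS form.
    stamp = datetime.strptime(match.group(1), "%Y%m%d.%H%M%S")
    return stamp, match.group(2), match.group(3).strip()


sample = "******** [20100512.143541] BMC r001i01b10: Cold Reset via NL broadcast reset"
print(parse_console_line(sample))
```

Lines that do not match, such as the uvcon status messages, return None so that a caller can skip them.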

Power off an Altix UV System

To power down the Altix UV system, use the power off command, as follows:

CMC:r1i1c> power off
==== r001i01c (PRI) ====

You can use the power status command to check the power status of your system, as follows:

CMC:r1i1c> power status
==== r001i01c (PRI) ====
on: 0, off: 32, unknown: 0, disabled: 0
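The summary line printed by power status can be turned into counts, for example to confirm that every blade is off before servicing. This is a minimal sketch that assumes the line keeps the "name: count" layout shown above; parse_power_status is a hypothetical helper name.

```python
import re


def parse_power_status(line):
    """Turn 'on: 0, off: 32, ...' into a dict of integer counts."""
    return {name: int(count) for name, count in re.findall(r"(\w+):\s*(\d+)", line)}


counts = parse_power_status("on: 0, off: 32, unknown: 0, disabled: 0")
all_off = counts["off"] == sum(counts.values())
print(counts, all_off)
```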

Power NMI to Drop into KDB

To send a nonmaskable interrupt (NMI) signal from the CMC so that the system drops into the kernel debugger (KDB), use the power nmi command, as follows:

CMC:r1i1c> power nmi

Entering kdb (current=0xffff8aa3fe11c040, pid 0) on processor 7 due to NonMaskable Interrupt @ 0xffffffff8100ad42
     r15 = 0x0000000000000000      r14 = 0x0000000000000000
     r13 = 0x0000000000000000      r12 = 0x0000000000000000
      bp = 0xffffffff81927380       bx = 0xffff8ac1ff11dfd8
     r11 = 0xffffffff8101a2c0      r10 = 0xffff88000beefd18
      r9 = 0x00000000ffffffff       r8 = 0x0000000000000000
      ax = 0x0000000000000000       cx = 0x0000000000000000
      dx = 0x0000000000000000       si = 0xffff8ac1ff11dfd8
      di = 0xffffffff81a2b308  orig_ax = 0xffffffffffffffff
      ip = 0xffffffff8100ad42       cs = 0x0000000000000010
   flags = 0x0000000000000246       sp = 0xffff88000bee7ff0
      ss = 0x0000000000000018 &regs = 0xffff88000bee7f58
[7]kdb>
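When a KDB entry is captured to a log, the entry banner and register dump can be parsed for later analysis. The following sketch relies only on the text layout shown above; the helper names parse_kdb_entry and parse_registers are hypothetical.

```python
import re

ENTRY = re.compile(
    r"Entering kdb \(current=(0x[0-9a-fA-F]+), pid (\d+)\) "
    r"on processor (\d+) due to (.+?) @ (0x[0-9a-fA-F]+)"
)
# Register lines look like "ip = 0x...   cs = 0x..."; "&regs" has a leading '&'.
REGISTER = re.compile(r"(&?\w+)\s+=\s+(0x[0-9a-fA-F]+)")


def parse_kdb_entry(line):
    """Extract the processor, reason, and address from the entry banner."""
    match = ENTRY.search(line)
    if match is None:
        return None
    return {
        "current": int(match.group(1), 16),
        "pid": int(match.group(2)),
        "processor": int(match.group(3)),
        "reason": match.group(4),
        "address": int(match.group(5), 16),
    }


def parse_registers(text):
    """Map each register name in a dump to its integer value."""
    return {name: int(value, 16) for name, value in REGISTER.findall(text)}
```

In the dump above, the ip register equals the address reported in the banner, which a script could use as a consistency check.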

Viewing Your System Configuration

To view your system configuration, use the config -v command, as follows:

CMC:r1i1c> config -v

CMCs:            2
        r001i01c UV1000
        r001i02c UV1000

BMCs:           32
        r001i01b00 IP93-BASEIO
        r001i01b01 IP93-DISK
        r001i01b02 IP93-EXTPCIE
        r001i01b03 IP93-EXTPCIE
        r001i01b04 IP93
        r001i01b05 IP93
        r001i01b06 IP93
        r001i01b07 IP93
        r001i01b08 IP93
        r001i01b09 IP93
        r001i01b10 IP93
        r001i01b11 IP93
        r001i01b12 IP93
        r001i01b13 IP93
        r001i01b14 IP93
        r001i01b15 IP93
        r001i02b00 IP93-BASEIO
        r001i02b01 IP93-EXTPCIE
        r001i02b02 IP93-DISK
        r001i02b03 IP93-EXTPCIE
        r001i02b04 IP93-EXTPCIE
        r001i02b05 IP93-EXTPCIE
        r001i02b06 IP93-EXTPCIE
        r001i02b07 IP93-EXTPCIE
        r001i02b08 IP93-INTPCIE
        r001i02b09 IP93-INTPCIE
        r001i02b10 IP93-INTPCIE
        r001i02b11 IP93-INTPCIE
        r001i02b12 IP93-INTPCIE
        r001i02b13 IP93-INTPCIE
        r001i02b14 IP93-INTPCIE
        r001i02b15 IP93-INTPCIE

Partitions:      1
        partition000 BMCs:   32

r001i01b00 refers to rack 1, IRU 1, and blade 0. For a view of the physical layout of an IRU, see Figure 1-1, Figure 1-2, and Figure 1-3.
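Identifiers in the config -v output follow a regular pattern of rack, IRU, and either a blade number or a trailing c for the CMC. The sketch below decodes them; the Location type and parse_location function are hypothetical names used for illustration.

```python
import re
from typing import NamedTuple, Optional


class Location(NamedTuple):
    rack: int
    iru: int
    blade: Optional[int]  # None for a CMC identifier such as r001i01c


# r<rack>i<IRU> followed by b<blade> for a BMC or c for the CMC.
IDENTIFIER = re.compile(r"r(\d+)i(\d+)(?:b(\d+)|c)")


def parse_location(identifier):
    """Decode identifiers such as r001i01b00, r001i01c, or r1i1c."""
    match = IDENTIFIER.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a rack/IRU identifier: {identifier!r}")
    rack, iru, blade = match.groups()
    return Location(int(rack), int(iru), None if blade is None else int(blade))


print(parse_location("r001i01b00"))  # Location(rack=1, iru=1, blade=0)
```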

Finding the CMC IP Address

CMCs have their rack and U position set at the factory. The CMC assigns itself IP addresses as follows:

SBK 172.17.<rack>.<slot>

EXT 10.<rack>.<slot>.1

On the system management node (SMN) port, the CMC is configured to request an IP address via dynamic host configuration protocol (DHCP).

To find the IP address of the CMC, connect a network cable to the SMN jack; the CMC then requests and obtains an address via DHCP. See “Connecting to the UV System Controller Network”.

The IP address and hostname of your system CMC reside in the /etc/sysconfig/ifcfg-eth0 file, as follows:

CMC:r1i1c> cat /etc/sysconfig/ifcfg-eth0
BOOTPROTO=static
IPADDR=137.38.82.88
NETMASK=255.255.255.0
GATEWAY=137.38.82.254
HOSTNAME=uv15-cmc
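A file in this KEY=value form is simple to read from a script. The following sketch parses the text shown above and uses the standard ipaddress module to check that the gateway lies on the same subnet as the CMC address; the function names are hypothetical.

```python
import ipaddress


def parse_ifcfg(text):
    """Parse KEY=value lines, skipping blanks, comments, and other text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def gateway_is_local(settings):
    """Return True if GATEWAY is inside the IPADDR/NETMASK network."""
    interface = ipaddress.ip_interface(f"{settings['IPADDR']}/{settings['NETMASK']}")
    return ipaddress.ip_address(settings["GATEWAY"]) in interface.network


example = """BOOTPROTO=static
IPADDR=137.38.82.88
NETMASK=255.255.255.0
GATEWAY=137.38.82.254
HOSTNAME=uv15-cmc"""
cfg = parse_ifcfg(example)
print(cfg["HOSTNAME"], cfg["IPADDR"], gateway_is_local(cfg))
```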

System Partitioning

A single SGI ProPack for Linux server can be divided into multiple distinct systems, each with its own console, root filesystem, and IP network address. Each of these software-defined groups of processors is a distinct system referred to as a partition. Each partition can be rebooted, loaded with software, powered down, and upgraded independently. The partitions communicate with each other over an SGI NUMAlink connection. Collectively, all of these partitions compose a single, shared-memory cluster.

The following example shows how to use the CMC software to partition a two-rack system containing four IRUs into four distinct systems, use the uvcon command to open a console and boot each partition, and then repartition it back into a single system.


Important: Each partition must have one base I/O blade and one disk blade for booting. r001i01b00 refers to rack 1, IRU 1, and blade 00. r001i01b01 refers to rack 1, IRU 1, and blade 01.

Base I/O and the boot disk are displayed by the config -v command, similar to the following:

r001i01b00 IP93-BASEIO
r001i01b01 IP93-DISK
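The boot requirement can be checked against saved config -v output. The sketch below assumes the three-column form that config -v prints once partitions are defined (identifier, board type, and a tag such as P001), and it treats a partition as bootable when at least one IP93-BASEIO and one IP93-DISK blade are present. The function name is hypothetical.

```python
import re
from collections import defaultdict

REQUIRED = {"IP93-BASEIO", "IP93-DISK"}


def bootable_partitions(config_lines):
    """Map each partition tag to True if it has base I/O and disk blades."""
    boards = defaultdict(set)
    for line in config_lines:
        fields = line.split()
        # Keep only "<id> <board> P<nnn>" lines; skip headers and CMC rows.
        if len(fields) != 3 or not re.fullmatch(r"P\d+", fields[2]):
            continue
        _, board, partition = fields
        boards[partition].add(board)
    return {partition: REQUIRED <= found for partition, found in boards.items()}


lines = [
    "r001i01b00 IP93-BASEIO P001",
    "r001i01b01 IP93-DISK P001",
    "r001i01b03 IP93 P001",
    "r001i02b00 IP93-BASEIO P002",
    "r001i02b03 IP93 P002",
]
print(bootable_partitions(lines))  # {'P001': True, 'P002': False}
```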

  1. Use the hwcfg command to create four system partitions, as follows:

    CMC:r1i1c> hwcfg partition=1 "r1i1b*"
    CMC:r1i1c> hwcfg partition=2 "r1i2b*"
    CMC:r1i1c> hwcfg partition=3 "r2i1b*"
    CMC:r1i1c> hwcfg partition=4 "r2i2b*"
    

  2. Use the config -v command to show the four partitions, as follows:

    CMC:r1i1c> config -v
    
    CMCs:            4
            r001i01c UV1000 SMN
            r001i02c UV1000
            r002i01c UV1000
            r002i02c UV1000
    
    BMCs:           64
            r001i01b00 IP93-BASEIO P001
            r001i01b01 IP93-DISK P001
            r001i01b02 IP93-INTPCIE P001
            r001i01b03 IP93 P001
            r001i01b04 IP93 P001
            r001i01b05 IP93 P001
            r001i01b06 IP93 P001
            r001i01b07 IP93 P001
            r001i01b08 IP93 P001
            r001i01b09 IP93-INTPCIE P001
            r001i01b10 IP93-INTPCIE P001
            r001i01b11 IP93-INTPCIE P001
            r001i01b12 IP93-INTPCIE P001
            r001i01b13 IP93 P001
            r001i01b14 IP93 P001
            r001i01b15 IP93 P001
            r001i02b00 IP93-BASEIO P002
            r001i02b01 IP93-DISK P002
            r001i02b02 IP93-INTPCIE P002
            r001i02b03 IP93 P002
            r001i02b04 IP93 P002
            r001i02b05 IP93 P002
            r001i02b06 IP93 P002
            r001i02b07 IP93 P002
            r001i02b08 IP93 P002
            r001i02b09 IP93 P002
            r001i02b10 IP93 P002
            r001i02b11 IP93 P002
            r001i02b12 IP93 P002
            r001i02b13 IP93 P002
            r001i02b14 IP93 P002
            r001i02b15 IP93 P002
            r002i01b00 IP93-BASEIO P003
            r002i01b01 IP93-DISK P003
            r002i01b02 IP93 P003
            r002i01b03 IP93 P003
            r002i01b04 IP93 P003
            r002i01b05 IP93 P003
            r002i01b06 IP93 P003
            r002i01b07 IP93 P003
            r002i01b08 IP93 P003
            r002i01b09 IP93 P003
            r002i01b10 IP93 P003
            r002i01b11 IP93 P003
            r002i01b12 IP93 P003
            r002i01b13 IP93 P003
            r002i01b14 IP93 P003
            r002i01b15 IP93 P003
            r002i02b00 IP93-BASEIO P004
            r002i02b01 IP93-DISK P004
            r002i02b02 IP93 P004
            r002i02b03 IP93 P004
            r002i02b04 IP93 P004
            r002i02b05 IP93 P004
            r002i02b06 IP93 P004
            r002i02b07 IP93 P004
            r002i02b08 IP93 P004
            r002i02b09 IP93 P004
            r002i02b10 IP93 P004
            r002i02b11 IP93 P004
            r002i02b12 IP93 P004
            r002i02b13 IP93 P004
            r002i02b14 IP93 P004
            r002i02b15 IP93 P004
    
    Partitions:      4
            partition001 BMCs:   16
            partition002 BMCs:   16
            partition003 BMCs:   16
            partition004 BMCs:   16
    

  3. You can also use the hwcfg command to display the four partitions, as follows:

    CMC:r1i1c> hwcfg
    NL5_RATE=5.0
    PARTITION=1 ................................................ 16/64 BMC(s)
    PARTITION=2 ................................................ 16/64 BMC(s)
    PARTITION=3 ................................................ 16/64 BMC(s)
    PARTITION=4 ................................................ 16/64 BMC(s)
    

  4. To reset the system and boot the four partitions, use the following commands:

    CMC:r1i1c> power on
    CMC:r1i1c> power reset "p*"
    


    Note: In the power reset "p*" command above, the quotes are required to prevent shell expansion.


  5. Use the uvcon command to open consoles to each partition and boot the partitions. Open a console to partition one, as follows:

    CMC:r1i1c> uvcon p1
    uvcon: attempting connection to localhost...
    uvcon: connection to SMN/CMC (localhost) established.
    uvcon: requesting baseio console access at partition 1 (r001i01b00)...
    uvcon: tty mode enabled, use 'CTRL-]' 'q' to exit
    uvcon: console access established (OWNER)
    uvcon: CMC <--> BASEIO connection active
    ************************************************
    *******  START OF CACHED CONSOLE OUTPUT  *******
    ************************************************
    
    ******** [20100513.215944] BMC r001i01b15: Cold Reset via NL broadcast reset
    ******** [20100513.215944] BMC r001i01b07: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b13: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b05: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b06: Cold Reset via NL broadcast reset
    ******** [20100513.215946] BMC r001i01b10: Cold Reset via NL broadcast reset
    ******** [20100513.215946] BMC r001i01b09: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b11: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b12: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b04: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b08: Cold Reset via NL broadcast reset
    ******** [20100513.215946] BMC r001i01b02: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b00: Cold Reset via NL broadcast reset
    ******** [20100513.215945] BMC r001i01b14: Cold Reset via NL broadcast reset
    ******** [20100513.215947] BMC r001i01b09: Cold Reset via ICH
    ******** [20100513.215946] BMC r001i01b12: Cold Reset via ICH
    ******** [20100513.215947] BMC r001i01b10: Cold Reset via ICH
    ******** [20100513.215947] BMC r001i01b11: Cold Reset via ICH
    ******** [20100513.215947] BMC r001i01b02: Cold Reset via ICH
    ******** [20100513.215947] BMC r001i01b00: Cold Reset via ICH
    ******** [20100513.215953] BMC r001i01b03: Cold Reset via NL broadcast reset
    ******** [20100513.220011] BMC r001i01b01: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b08: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b07: Cold Reset via NL broadcast reset
    ******** [20100513.220011] BMC r001i01b15: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b06: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b05: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b14: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b13: Cold Reset via NL broadcast reset
    ******** [20100513.220011] BMC r001i01b04: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b03: Cold Reset via NL broadcast reset
    ******** [20100513.220013] BMC r001i01b09: Cold Reset via NL broadcast reset
    ******** [20100513.220013] BMC r001i01b10: Cold Reset via NL broadcast reset
    ******** [20100513.220013] BMC r001i01b11: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b12: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b02: Cold Reset via NL broadcast reset
    ******** [20100513.220012] BMC r001i01b00: Cold Reset via NL broadcast reset
    ******** [20100513.220014] BMC r001i01b09: Cold Reset via ICH
    ******** [20100513.220014] BMC r001i01b10: Cold Reset via ICH
    ******** [20100513.220014] BMC r001i01b11: Cold Reset via ICH
    ******** [20100513.220013] BMC r001i01b12: Cold Reset via ICH
    ******** [20100513.220013] BMC r001i01b02: Cold Reset via ICH
    ******** [20100513.220016] BMC r001i01b00: Cold Reset via ICH
    ******** [20100513.220035] BMC r001i01b14: Cold Reset via NL broadcast reset
    ******** [20100513.220035] BMC r001i01b06: Cold Reset via NL broadcast reset
    ******** [20100513.220034] BMC r001i01b15: Cold Reset via NL broadcast reset
    ******** [20100513.220035] BMC r001i01b05: Cold Reset via NL broadcast reset
    ******** [20100513.220034] BMC r001i01b01: Cold Reset via NL broadcast reset
    ******** [20100513.220035] BMC r001i01b07: Cold Reset via NL broadcast reset
    ...
    Hit [Space] for Boot Menu.
    ELILO boot:
    ...
    


    Note: Use the uvcon command to open consoles on the other three partitions and boot them. The system will then have four single system images.


  6. Use the hwcfg -c partition command to clear the four partitions, as follows:

    CMC:r1i1c> hwcfg -c partition
    PARTITION=0 <PENDING RESET>
    


    Note: This will take several minutes on large systems.


  7. To reset the system and boot it as a single system image (one partition), use the following command:

    CMC:r1i1c> power reset "p*"

For detailed instructions on how to use the UV controller commands to partition a system, see “System Partitioning” in the SGI Altix UV Linux Configuration and Operations Guide.
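For systems with many IRUs, the one-partition-per-IRU hwcfg commands used in this procedure can be generated rather than typed. This sketch only builds the command strings in the form shown in step 1; it does not contact a CMC, and the function name is hypothetical.

```python
def hwcfg_partition_commands(irus):
    """Build one hwcfg command per (rack, iru) pair, numbering from 1."""
    return [
        f'hwcfg partition={number} "r{rack}i{iru}b*"'
        for number, (rack, iru) in enumerate(irus, start=1)
    ]


for command in hwcfg_partition_commands([(1, 1), (1, 2), (2, 1), (2, 2)]):
    print(command)
```

The quotes around each blade pattern are kept in the generated strings because the CMC shell would otherwise expand the asterisk.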
    
    

Upgrading System BIOS

To upgrade the compute blade BIOS, perform the following steps:

  1. From the CMC prompt, enter the following command to show the current PROM level:

    CMC:r1i1c> showbios
    Flashed on Sat May  1 14:14:45 UTC 2010 was bios.latest.fd (20100429_1603)
    

  2. Get the newest PROM image from SupportFolio Online at http://support.sgi.com/

  3. Copy the latest BIOS to a directory on the CMC in /work/bmc/common/. An example directory listing is as follows:

    CMC:r1i1c> ls
    bios.latest.fd flashbios
    

  4. Use the flashbios command to flash the compute blade BIOS, as follows:

    CMC:r1i1c> flashbios
    Using default bios: bios.latest.fd
    Checking processor status on all nodes....
    Done. System is ready for BIOS flash update
    Flashing bios bios.latest.fd (20100429_1603) This will take several minutes.
    ...
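Before flashing, it can help to compare the build stamp reported by showbios with that of the downloaded image. The sketch below assumes the stamp in parentheses is a fixed-width YYYYMMDD_HHMM value, so that plain string comparison orders builds chronologically; the helper names are hypothetical.

```python
import re

SHOWBIOS = re.compile(r"Flashed on (.+?) was (\S+) \((\d{8}_\d{4})\)")


def parse_showbios(line):
    """Extract the flash date, image name, and build stamp from showbios."""
    match = SHOWBIOS.search(line)
    if match is None:
        raise ValueError("unrecognized showbios output")
    return {
        "flashed_on": match.group(1),
        "image": match.group(2),
        "build": match.group(3),
    }


def is_newer(candidate_build, current_build):
    """Assumption: fixed-width stamps sort chronologically as strings."""
    return candidate_build > current_build


current = parse_showbios(
    "Flashed on Sat May  1 14:14:45 UTC 2010 was bios.latest.fd (20100429_1603)"
)
print(current["image"], current["build"])
```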
    

Hyper-Threading on Altix UV 100 or Altix UV 1000 Systems

Threading in a software application splits instructions into multiple streams so that multiple processors can act on them.

Hyper-Threading (HT) Technology, developed by Intel Corporation, provides thread-level parallelism on each processor, resulting in more efficient use of processor resources, higher processing throughput, and improved performance. With HT, one physical CPU can appear as two logical CPUs because it has additional registers that allow two instruction streams to overlap. Separately, a single processor can have dual cores that execute instructions in parallel.
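On a running Linux partition, the difference between logical and physical CPUs can be observed in /proc/cpuinfo. The following sketch counts logical processors and distinct (physical id, core id) pairs; it assumes the x86 field names "processor", "physical id", and "core id", and the function name cpu_topology is hypothetical. When the first number is twice the second, each core presents two logical CPUs.

```python
def cpu_topology(cpuinfo_text):
    """Return (logical_cpus, physical_cores) from /proc/cpuinfo-style text."""
    logical = 0
    cores = set()
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, separator, value = line.partition(":")
            if separator:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)


# Illustrative sample: one socket, two cores, two logical CPUs per core.
SAMPLE = """\
processor\t: 0
physical id\t: 0
core id\t\t: 0

processor\t: 1
physical id\t: 0
core id\t\t: 1

processor\t: 2
physical id\t: 0
core id\t\t: 0

processor\t: 3
physical id\t: 0
core id\t\t: 1
"""

print(cpu_topology(SAMPLE))  # (4, 2)
```

On a live system, the same function could be given the contents of /proc/cpuinfo instead of the sample text.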

For more information about using HT, see “Using Cpusets with Hyper-Threads” in the Linux Resource Administration Guide.