Tuesday, October 7, 2008

MATLAB Distributed Computing Server

Perform MATLAB and Simulink computations on computer clusters and server farms.



MATLAB Distributed Computing Server™ lets users solve computationally and data-intensive problems by executing MATLAB® and Simulink® based applications on a computer cluster.

MATLAB Distributed Computing Server is available for all hardware platforms and operating systems supported by MATLAB and Simulink. It includes a basic scheduler and directly supports Platform LSF®, Microsoft® Windows® Compute Cluster Server, Microsoft Windows HPC Server 2008, Altair PBS Pro®, and TORQUE schedulers. Other schedulers can be integrated using the generic interface API. The product’s dynamic licensing feature frees administrators from managing the license profiles of individual users on the cluster; only a single MATLAB Distributed Computing Server license is required for the cluster.

Users program and prototype applications on their desktops using Parallel Computing Toolbox™ and then scale up to a cluster using MATLAB Distributed Computing Server. The server can also be used to scale up executables and shared libraries generated from parallel MATLAB applications with MATLAB Compiler™.


Tuesday, September 23, 2008

Making portable GridStack 4.1 (Voltaire OFED) drivers.

Remove any previously installed IB RPMs, if present. To do this:
rpm -e kernel-ib-1.0-1 \
dapl-1.2.0-1.x86_64 \
libmthca-1.0.2-1.x86_64 \
libsdp-0.9.0-1.x86_64 \
libibverbs-1.0.3-1.x86_64 \
librdmacm-0.9.0-1.x86_64

lsmod
Then remove all of the remaining "ib_" modules by hand with the "rmmod <modulename>" command.
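If many modules are loaded, a small loop saves typing. This is only a sketch; it assumes every leftover InfiniBand module name starts with the "ib_" prefix, and you may need to run it more than once because of module dependencies:

# remove every loaded ib_* module (rerun if some fail while still in use)
for m in $(lsmod | awk '/^ib_/ {print $1}'); do
    rmmod "$m"
done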


*** If you previously installed OFED IB from this same package, you can run the ./uninstall.sh
script included in the GridStack-4.1.5_9.tgz package instead of the steps above.
The script does the same things (and more) automatically, so you may prefer it.
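Since the script lives inside the tarball, extract it first (the same extraction as step 1 below):

tar -zxvf GridStack-4.1.5_9.tgz
cd GridStack-4.1.5_9
./uninstall.sh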



1. First, obtain the GridStack source code from Voltaire.
Then:
mkdir /home/setup
cp GridStack-4.1.5_9.tgz /home/setup
cd /home/setup
tar -zxvf GridStack-4.1.5_9.tgz
All of the files will be extracted into "/home/setup/GridStack-4.1.5_9".

cd GridStack-4.1.5_9

2. Install the GridStack drivers

./install.sh --make-bin-package

This process takes about 30 minutes; time for coffee or tea, but not a cigarette...
...
INFO: wrote ib0 configuration to /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.129.9
NETWORK=192.168.0.0
NETMASK=255.255.0.0
BROADCAST=192.168.255.255
MTU=2044

Installation finished
Please logout from the shell and login again in order to update your PATH environment variable

3. Finish the driver settings
First, edit the IP settings for IB.
Edit "/etc/sysconfig/network-scripts/ifcfg-ib0" as shown below:

DEVICE=ib0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.129.50.9
NETMASK=255.255.0.0
MTU=2044

Save and reboot the system.

4. The GridStack installation adds an init.d service to the system startup.
After bootup you should see the ib0 device in the ifconfig output, and
the LEDs of the HCA cards should be lit or blinking. Check this...
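You can also confirm that the service is registered for the usual runlevels with chkconfig (standard on Red Hat-style systems; the runlevels it reports may differ on your installation):

chkconfig --list gridstack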

After the reboot, check the state of the connection with ifconfig:
eth0      Link encap:Ethernet  HWaddr 00:19:BB:XX:XX:XX  
          inet addr:10.128.129.9  Bcast:10.128.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:bbff:fe21:b3a8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:177 errors:0 dropped:0 overruns:0 frame:0
          TX packets:148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:16829 (16.4 KiB)  TX bytes:21049 (20.5 KiB)
          Interrupt:169 Memory:f8000000-f8011100 

ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.129.50.9  Bcast:10.129.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:11 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:892 (892.0 b)  TX bytes:384 (384.0 b)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:336 (336.0 b)  TX bytes:336 (336.0 b)

If you see output similar to the above, you have won. Ping a neighbor's IP address, if one is available:
ping 10.129.50.1
PING 10.129.50.1 (10.129.50.1) 56(84) bytes of data.
64 bytes from 10.129.50.1: icmp_seq=0 ttl=64 time=0.094 ms
64 bytes from 10.129.50.1: icmp_seq=1 ttl=64 time=0.057 ms
64 bytes from 10.129.50.1: icmp_seq=2 ttl=64 time=0.064 ms
64 bytes from 10.129.50.1: icmp_seq=3 ttl=64 time=0.056 ms

If you do not see ib0 or cannot ping, the gridstack service may not have started.
Start it manually: /etc/init.d/gridstack start

5. If everything is OK, you can make an image of this system for a
central deployment mechanism such as TFTP.

6. Installing the newly compiled GridStack driver on identical machines.
It is easy: after the GridStack compilation process, a bz2 archive and
its md5 checksum are created automatically. You can find these two files one
level above the source folder. In our example, the two files are waiting there:

ls -al /home/setup
-rw-r--r--   1 root root       88 Nov 23 19:11 GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.md5sum
-rw-r--r--   1 root root 43570798 Nov 23 19:11 GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2

Copy these two files to every IB host on which you plan a GridStack installation.
Unlike the compile step, this installation takes only a few minutes.
Just copy the files to the new machine with scp:

cd /home/setup
scp GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.* root@10.128.129.10:/home

Switch to the target machine's console and type these commands:

cd /home
First, verify the integrity of the bz2 file:
md5sum -c GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.md5sum
GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2: OK

If you see the OK sign, type this:
tar -jxvf GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2

A folder called "GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64" will be created.
cd GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64/
./install.sh

The GridStack binary RPMs will be installed automatically.
Make the ifcfg-ib0 settings as above, reboot, and check for IP connectivity.


7. A bonus tip:
After the GridStack installation, a lot of IB diagnostic tools are available under the
/usr/local/ofed/bin directory. For example, running ./ibv_devinfo gives brief
and useful information about HCA connectivity, board model, firmware level, and so on.

Here is sample output from my machine:
hca_id: mthca0
        fw_ver:                         4.7.400
        node_guid:                      0017:08ff:ffd0:XXXX
        sys_image_guid:                 0017:08ff:ffd0:XXXX
        vendor_id:                      0x1708
        vendor_part_id:                 25208
        hw_ver:                         0xA0
        board_id:                       HP_0060000001
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 29
                        port_lid:               75
                        port_lmc:               0x00

                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 29
                        port_lid:               261
                        port_lmc:               0x00





---=== HCA DDR EXP-D FW upgrade after GridStack 4.1 install ===---

ib-burn -y -i VLT-EXPD -a /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img 

INFO: Using alternative image file /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img
Burning : using fw image file: /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img VSD extention : -vsd1 VLT-EXPD -vsd2 VLT0040010001
    Current FW version on flash:  N/A
    New FW version:               N/A

    Burn image with the following GUIDs:
        Node:      0019bbffff00XXXX
        Port1:     0019bbffff00XXXX
        Port2:     0019bbffff00XXXX
        Sys.Image: 0019bbffff00XXXX

    You are about to replace current PSID in the image file - "VLT0040010001" with a different PSID - "VLT0040010001".
    Note: It is highly recommended not to change the image PSID.

 Do you want to continue ? (y/n) [n] : y

Read and verify Invariant Sector               - OK
Read and verify PPS/SPS on flash               - OK
Burning second    FW image without signatures  - OK  
Restoring second    signature                  - OK  

Note that /usr/local/bin/ib-burn is really a Bash script.
Here is another, lower-level way to burn the HCA card firmware:

lspci -n | grep -i "15b3:6278" | awk '{print $1}'
If you see "13:00.0" as the output, type this (substitute your own PCI address):

mstflint -d 13:00.0 -i /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img -vsd1 "" -psid HP_0060000001 -y burn > /root/hca-fw-ugr.log
This command does not prompt for confirmation; the -y flag answers yes automatically.

To check the firmware currently on the flash, type this:
mstflint -d 13:00.0 q

Wednesday, September 3, 2008

SFS 2.2-1 Client Upgrade with GridStack 4.x

SFS is Hewlett-Packard's parallel file system, based on the open source Lustre file system.
The acronym SFS stands for Scalable File Share. HP also sells SFS20 disk enclosures, which should not be confused with this software.


Here we are upgrading the client packages (RPMs) to a new level.
Change to /home/sfs-iso-loop/client_enabler and run:


./build_SFS_client.sh --no_infiniband --config --allow_root \
/home/sfs-iso-loop/client_enabler/src/x86_64/RHEL4_U3/SFS_client_x86_64_RHEL4_U3.config



cd /home/sfs-iso-loop/client_enabler/output/RPMS/x86_64
rpm -ivh kernel-smp-2.6.9-34.0.2.EL_SFS2.2_1.x86_64.rpm
rpm -ivh kernel-smp-devel-2.6.9-34.0.2.EL_SFS2.2_1.x86_64.rpm

Change /boot/grub/menu.lst to boot from this new kernel.
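For reference, the new entry might look like the sketch below. The root device, kernel arguments, and initrd name here are assumptions; copy them from the existing entries in your own menu.lst, and point the "default" line at the new entry's index:

title SFS 2.2-1 kernel (2.6.9-34.0.2.EL_SFS2.2_1smp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-34.0.2.EL_SFS2.2_1smp ro root=LABEL=/
        initrd /initrd-2.6.9-34.0.2.EL_SFS2.2_1smp.img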



Reboot the machine, and it's showtime...

Wednesday, May 7, 2008

Creating new LDAP Server with OpenLDAP







* 1. Install RHEL 5.1 x86_64 Server

* 2. Install the OpenLDAP server and client RPMs.
    "rpm -qa | grep -i openldap" must show:
    openldap-2.3.27-8
    openldap-servers-2.3.27-8
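    If they are missing, install them from the RHEL media or, assuming a configured repository, with yum:
    yum install openldap openldap-servers openldap-clients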

* 3. Copy /etc/openldap/DB_CONFIG.example to /var/lib/ldap/
and rename it to just DB_CONFIG.
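In commands (the ldap:ldap ownership is an assumption that matches the account slapd runs as on RHEL):

cp /etc/openldap/DB_CONFIG.example /var/lib/ldap/DB_CONFIG
chown ldap:ldap /var/lib/ldap/DB_CONFIG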


* 4. Create or edit /etc/openldap/slapd.conf.
The following lines must be added:

include         /etc/openldap/schema/core.schema
include         /etc/openldap/schema/cosine.schema
include         /etc/openldap/schema/inetorgperson.schema
include         /etc/openldap/schema/nis.schema

allow bind_v2

pidfile         /var/run/openldap/slapd.pid
argsfile        /var/run/openldap/slapd.args

access to attrs=userPassword
    by self write
    by anonymous auth
    by * none
access to *
    by * read

database        bdb
suffix          "dc=uybhm,dc=itu,dc=edu,dc=tr"
rootdn          "cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr"

#rootpw          "This value must be set later"

directory       /var/lib/ldap

index objectClass                       eq,pres
index ou,cn,mail,surname,givenname      eq,pres,sub
index uidNumber,gidNumber,loginShell    eq,pres
index uid,memberUid                     eq,pres,sub
index nisMapName,nisMapEntry            eq,pres,sub


* 5. Create a new Manager password, which will be used later for
top-level LDAP administration tasks:
    slappasswd -h {SSHA} (type a password twice when asked)
Grab the output and paste it into the rootpw line, like this:
rootpw          {SSHA}F/a/QvcnCrWHj7/eyJtWd/HdGtCpqsHt
Change the owner of slapd.conf to ldap:ldap and remove the
"group" and "other" permissions.


* 6. Start the LDAP service and check that it comes up:
/etc/init.d/ldap restart

If you see OK, then jump to the next step.



* 7. Run a query:
ldapsearch -x -b "dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1
# extended LDIF
#
# LDAPv3
# base with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# search result
search: 2
result: 32 No such object

# numResponses: 1

The "32 No such object" result is expected at this point, because the directory is still empty.


* 8. Prepare a base domain record and insert it into the LDAP server.
Save the following as 1.uybhm-domain.record.ldif:
# This is the base record of uybhm.itu.edu.tr
# This record must be added before all other LDIFs

dn: dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: dcObject
objectClass: organization
o: UYBHM Administrators
dc: uybhm

dn: cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr
objectclass: organizationalRole
cn: Manager

# users, uybhm.itu.edu.tr
dn: ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: top
objectClass: organizationalUnit
ou: users

# groups, uybhm.itu.edu.tr
dn: ou=groups,dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: top
objectClass: organizationalUnit
ou: groups



ldapadd -W -x -D "cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1 -f 1.uybhm-domain.record.ldif
Enter LDAP Password:
adding new entry "dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "ou=groups,dc=uybhm,dc=itu,dc=edu,dc=tr"



* 9. Check the result:
ldapsearch -x -b "dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1

# extended LDIF
#
# LDAPv3
# base with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# uybhm.itu.edu.tr
dn: dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: dcObject
objectClass: organization
o: UYBHM Administrators
dc: uybhm

# Manager, uybhm.itu.edu.tr
dn: cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: organizationalRole
cn: Manager

# users, uybhm.itu.edu.tr
dn: ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: top
objectClass: organizationalUnit
ou: users

# groups, uybhm.itu.edu.tr
dn: ou=groups,dc=uybhm,dc=itu,dc=edu,dc=tr
objectClass: top
objectClass: organizationalUnit
ou: groups

# search result
search: 2
result: 0 Success

# numResponses: 5
# numEntries: 4


* 10. Prepare or re-inject the user records:

ldapadd -W -x -D "cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1 -f 2.users.ldif
Enter LDAP Password:
adding new entry "uid=lsfadmin,ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "uid=efadmin,ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "uid=efnobody,ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr"
adding new entry "uid=bench,ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr"

A sample user record file is shown here:

# mahmut.un, users, uybhm.itu.edu.tr
dn: uid=mahmut.un,ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr
uid: mahmut.un
cn: Mahmut UN
objectClass: account
objectClass: posixAccount
objectClass: top
objectClass: shadowAccount
shadowLastChange: 13735
shadowMax: 999999
shadowWarning: 7
uidNumber: 620
gidNumber: 620
homeDirectory: /rs/users/mahmut.un
gecos: Mahmut UN
userPassword: {SSHA}QjoA6jcZmiX92h5uchz7U3uY80eoJulS
loginShell: /bin/bash

* 11. Query and see all of the added records:
ldapsearch -x -b "dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1

If you also want to see the password hashes, you must run the query as Manager, like this:
ldapsearch -W -x -D "cn=Manager,dc=uybhm,dc=itu,dc=edu,dc=tr" -b "dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1
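To fetch a single user instead of the whole tree, add a filter; this example reuses the sample uid from step 10:

ldapsearch -x -b "ou=users,dc=uybhm,dc=itu,dc=edu,dc=tr" -h 127.0.0.1 "(uid=mahmut.un)"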

Sunday, February 10, 2008

Testing the Network Speed: The netcat way

Testing your core network speed is essential for identifying possible bottlenecks.

Here is a practical way to do it:

nc is a golden Linux tool; the name stands for netcat.
On the receiver side:
# nc -l 10.129.50.45 -p6666 > /dev/null

10.129.50.45 is the IP address being listened on.


On the transmitter side;
# dd if=/dev/zero bs=1024k count=1024 | nc 10.129.50.45 6666

At the end of pumping the zeros, "dd" prints the total elapsed time and the throughput.
If it does not, you can put the "time" command in front of dd to get the timing.
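To read the result, divide the bytes sent by the elapsed time. A sketch with hypothetical numbers:

# time dd if=/dev/zero bs=1024k count=1024 | nc 10.129.50.45 6666
real    0m9.2s

1024 MiB in 9.2 seconds is roughly 111 MiB/s, about what a healthy gigabit link delivers after protocol overhead.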
