
Friday, April 16, 2010

New Cray OS Brings ISVs in for a Soft Landing

Cray has never made a big deal about the custom Linux operating system it packages with its XT supercomputing line. In general, companies don't like to tout proprietary OS environments, since they tend to lock custom codes in and third-party ISV applications out. But the third-generation Cray Linux Environment (CLE3) the company has just announced is designed to make elite supercomputing an ISV-friendly experience.

Besides adding compatibility with off-the-shelf ISV codes, which we'll get to in a moment, the newly-minted Cray OS contains a number of other enhancements. In the performance realm, CLE3 increases overall scalability to greater than 500,000 cores (up from 200,000 in CLE2), adds Lustre 1.8 support, and includes some advanced scheduler features. Cray also added a feature called "core specialization," which allows the user to pin a single core on the node to the OS and devote the remainder to application code. According to Cray, on some types of codes this can bump performance by 10 to 20 percent. CLE3 also brings with it some additional reliability features, including NodeKARE, a diagnostic capability that makes sure jobs are running on healthy nodes.
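
The core specialization idea is easy to picture on a stock Linux box. The sketch below is only a generic analogy, not CLE's actual mechanism: it leaves core 0 free for OS housekeeping and confines the application (./my_app is just a placeholder) to the remaining cores using the standard taskset utility.

# Generic illustration of the "core specialization" idea on a 12-core Linux node:
# reserve core 0 for OS noise and run the application on cores 1-11.
# (This is NOT Cray's implementation, just the same concept on plain Linux.)
taskset -c 1-11 ./my_app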

But the biggest new feature added to CLE3 is compatibility with standard HPC codes from independent software vendors (ISVs). This new capability has the potential to open up a much broader market for Cray's flagship XT product line, and further blur the line between proprietary supercomputers and traditional HPC clusters.

Cray has had an on-again off-again relationship with HPC software vendors. Many of the established ISVs in this space grew up alongside Cray Research, and software from companies like CEI, LSTC, SIMULIA, and CD-adapco actually ran on the original Cray Research machines. Over time, these vendors migrated to standard x86 Linux and Windows systems, which became their prime platforms, and dropped products that required customized solutions for supercomputers. Cray left most of the commercial ISVs behind as it focused on high-end HPC and custom applications.


Programming Environment of CLE
The CLE programming environment includes tools designed to complement and enhance each other, resulting in a rich, easy-to-use programming environment that facilitates the development of scalable applications (a minimal build-and-run sketch follows the list).
  • Parallel programming models: MPI, SHMEM, UPC, OpenMP, and Co-Array Fortran within the node
  • MPI 2.0 standard, optimized to take advantage of the scalable interconnect in the Cray XT system
  • Various MPI libraries supported under Cluster Compatibility Mode
  • Optimized C, C++, UPC, Fortran 90, and Fortran 2003 compilers
  • High-performance optimized math libraries, including BLAS, FFTs, LAPACK, ScaLAPACK, SuperLU, and the Cray Scientific Libraries
  • Cray Apprentice2 performance analysis tools 
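
As promised above, here is a hedged sketch of how an MPI code is typically built and launched under CLE. The module name, the cc compiler wrapper, and the aprun launcher are common on Cray XT systems, but exact module names, source file (hello_mpi.c is a placeholder), and queue setup vary by site, so treat this as an illustration rather than a verbatim recipe.

# Select a programming environment (site-specific; PrgEnv-pgi is one common choice)
module load PrgEnv-pgi
# The cc wrapper pulls in the Cray-optimized MPI and math libraries automatically
cc -o hello_mpi hello_mpi.c
# Launch 64 ranks on the compute nodes
aprun -n 64 ./hello_mpi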


(The full version of this article can be obtained from HPCwire's website.)

Monday, March 22, 2010

Moscow State University Supercomputer Has Petaflop Aspirations

The Moscow State University (MSU) supercomputer, Lomonosov, has been selected for a high-performance makeover, with the goal of nearly tripling its current processing power to achieve petaflop-level performance in 2010. T-Platforms, which developed and manufactured the supercomputer, is the odds-on favorite to lead the project.

With a current Linpack mark of 350 teraflops (peak: 420 teraflops), Lomonosov needs to generate an additional 650 teraflops of performance to achieve its goal. No small task. So far, only two computers have broken the Linpack petaflop barrier: Jaguar at Oak Ridge National Lab, which holds the number one position on the TOP500 list, and Roadrunner at Los Alamos National Lab, in the number two spot. Lomonosov ranks 12th on the most recent edition of the TOP500 list and is the largest HPC system in the CIS and Eastern Europe.

Officials at Moscow State University held a meeting this week to establish a budget for the petaflop revamp of the Lomonosov system. According to Russian State Duma Speaker Boris Gryzlov, MSU has prepared a feasibility study on the effectiveness of creating a petaflop supercomputer, and the matter will be brought to the President and the Chairman of the Government for approval.

Total university funding in 2010 from the Russian federal government will amount to 1.5 billion rubles ($51 million). The anticipated cost of increasing the computer's performance to reach petaflop-level is about 900 million rubles, or almost $31 million, according to Moscow State University President Victor Sadovnichy. MSU has already invested 350 million rubles ($12 million) in the Lomonosov system, and the total project cost so far is 1.9 billion rubles ($65 million). MSU is ready to provide up to a quarter of the cost of hardware, said Sadovnichy.

Apparently, the amounts specified to upgrade the system refer only to the procurement and installation of equipment, and do not include system maintenance and electricity costs. Current power requirements are around 5 MW, which, according to Sadovnichy, is comparable to powering a small city.
"Lomonosov" and its predecessor "Chebyshev"  are responsible for many research breakthroughs, including an inhibitor of thrombin (a substance retarding the effect of the main component of blood clotting), as well as the development of urokinase, a possible cancer treatment. In addition to these undertakings, Lomonosov has been kept busy modeling climate processes, factoring large integers to solve cryptographic problems, and calculating the noise in turbulent environments.

The renovation work for transforming Lomonosov into a petaflop system is being put to competitive bid, but it seems likely that T-Platforms will get the contract, since it is the only Russian manufacturer with the know-how to implement such a project. There is also a clear preference for keeping the work in national hands: State Duma Speaker Boris Gryzlov, who backs the creation of a domestic petaflop supercomputer, favors domestic producers and has urged caution against the procurement of foreign goods.
Mikhail Kozhevnikov, commercial director for T-Platforms, has already prepared a bid and decided upon an upgrade path for the petaflop system. The details of the proposed architecture have not been publicly disclosed; however, a good guess would be that they're going to add new nodes based on the Westmere EP Xeon processors Intel just announced.

Specifically, since the current MSU super is based on the T-Blade2 with 2.93 GHz Xeon X5570 processors, it's not unreasonable to think they're bidding T-Blade3 blades using 2.93 GHz Xeon X5670 parts (note: the T-Blade3 doesn't actually exist yet). Since the new Xeons deliver only about 40 percent more computational performance per blade than the existing ones, they'll still need many more servers. Alternatively, they could be thinking about upgrading with the upcoming NVIDIA Fermi GPU server boards, due out in May. That would get them to a petaflop with a lot less hardware. (A dual-socket X5670 server would yield about 250 DP gigaflops; a 4-GPU Fermi server would probably deliver over 2 DP teraflops.)

Russian Prime Minister Vladimir Putin has allocated 1.1 billion rubles ($37 million) to develop supercomputer technologies in Russia, according to a recent APA report, further demonstrating Russia's desire to possess a world-class computer system, one that may be capable of a place among the top 5 of the revered TOP500 list. Barring any unforeseen circumstances, it looks like the Lomonosov upgrade will go forward, and Russia will take its place on the exclusive short-list of petaflop systems. But, in HPC, the final goal is always a moving target, as other groups also race for the coveted petaflops level and beyond.


(This article was sourced from HPCwire.)




Tuesday, March 16, 2010

Intel Ups Performance Ante with Westmere Server Chips

Right on schedule, Intel has launched its Xeon 5600 processors, codenamed "Westmere EP." The 5600 represents the 32nm sequel to the Xeon 5500 (Nehalem EP) for dual-socket servers. Intel is touting better performance and energy efficiency, along with new security features, as the big selling points of the new Xeons.

For the HPC crowd, the performance improvements are the big story. Thanks in large part to the 32nm process technology, Intel was able to incorporate six cores and 12 MB of L3 cache on a single die -- a 50 percent increase compared to the Xeon 5500 parts. According to Intel, that translates into a 20 to 60 percent boost in application performance and 40 percent better performance per watt.

Using the high performance Linpack benchmark, Intel is reporting a 61 percent improvement for a 6-core 5600 compared to its 4-core Xeon 5500 predecessor (146 gigaflops versus 91 gigaflops). You might be wondering how this was accomplished, given that the 5600 comes with only 50 percent more cores and cache. It turns out that Intel's comparison was based on its two top-of-the-line Xeon chips from each processor family. The 146 gigaflops result was delivered by an X5680 processor, which runs at 3.33 GHz and has a TDP of 130 watts, while the 91 gigaflops mark was turned in by the X5570 processor, which runs at 2.93 GHz and has a TDP of 95 watts. Correcting for clock speed, the 5600 Linpack result would be something closer to 128 gigaflops, representing a still-respectable 41 percent boost.
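
The clock-speed correction is just a linear scaling of the X5680 score down to the X5570's frequency; the one-liner below shows the back-of-the-envelope math (a rough estimate only, since Linpack does not scale perfectly with clock speed):

# Scale the X5680 Linpack result (146 gigaflops at 3.33 GHz) to the X5570's 2.93 GHz clock
awk 'BEGIN { printf "%.0f gigaflops\n", 146 * 2.93 / 3.33 }'   # prints "128 gigaflops"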

Intel also reported performance improvements across a range of technical HPC workloads. These include a 20 percent boost on memory bandwidth (using Stream-MP), a 21 percent average improvement with a number of CAE codes, a 44 percent average improvement for life science codes, and a 63 percent improvement using a Black Scholes financial benchmark. These results also reflect the same 3.33/2.93 GHz clock speed bias discussed in the Linpack test, so your mileage may vary.

Looking at the performance per watt metric, the new 5600 chips also have a clear edge. An apples-to-apples comparison of the X5570 (2.93 GHz, 95 watts) and the X5670 (2.93 GHz, 95 watts) has the latter chip delivering 40 percent more performance per watt. That's to be expected, since two extra cores are available on the X5670 to do extra work.

Intel is also offering low-power 40 and 60 watt versions of the 5600 alongside the mainstream 80, 95, and 130 watt offerings. These low-power versions would be especially useful where energy consumption, rather than performance, is the driving factor. For example, a 60 watt L5640 matches the raw performance of a 95 watt X5570, potentially saving 30 percent in power consumption. Intel is even offering a 30 watt L3406, aimed at the single-processor microserver segment. Other power-saving goodies that come with the 5600 include a more efficient Turbo Boost and memory power management facility, automated low power states for six cores, and support for lower power DDR3 memory.

The Xeon 5600 parts are socket-compatible with the 5500 processors and can use the same chipsets, making a smooth upgrade path for system OEMs. Like their 5500 predecessors, the 5600s support DDR3 memory to the tune of three memory channels per socket. Practically speaking, that means two cores share a memory channel when all six cores are running full blast.

The enterprise market will be pleased by the new on-chip security features in the 5600 architecture. First, there are the new AES instructions for accelerating database encryption, whole-disk encryption, and secure Internet transactions. The 5600 also offers what Intel is calling Trusted Execution Technology (TXT). TXT can be used to prevent the insertion of malicious VM software at bootup in a virtualized cloud computing environment.

Although the 5600 family will bring Intel into the mainstream six-core server market, the company is offering new four-core parts as well. In fact, the fastest clock is owned by the X5677, a quad-core processor that tops out at 3.46 GHz. These top-of-the-line four-core versions might find a happy home with many HPC users, in particular where single-threaded application performance is paramount. This would be especially true for workloads that tend to be memory-bound, since in this case more cores might actually drag down performance by incurring processing overhead while waiting for a memory channel to open up.

Intel's marketing strategy for the Xeon 5600 is not that different from its 5500 sales pitch: improved processor efficiencies generate quick payback on the investment. For the 5600, the claim is that you can replace 15 racks of single-core Xeons with a single rack of the new chips, that is, as long as you don't need any more performance. Intel is touting a five-month payback for this performance-neutral upgrade.


On the other hand, if you need 15 times the performance, you can do a 1:1 replacement of your single-core servers and still realize about eight percent in energy savings. But since software support and server warranty costs dominate maintenance expenses, any energy savings might get swallowed up by these other costs.
Intel says it is keeping the prices on the 5600 processors in line with the Xeon 5500s, although the new processor series spans a wider range of offerings. At the low end, you have the L3406, a 30 watt, 2.26 GHz part with four cores and just 4 MB of L3 cache. It goes for just $189. At the top end are the six-core X5680 and the four-core X5677, both of which are offered at $1,663. Prices quoted are in quantities of 1,000.

In conjunction with Intel's launch, a number of HPC OEMs are also announcing new systems based on the Xeon 5600 series. For example, Cray announced that its CX1 line of deskside machines will now come with the new chips. SGI is also incorporating the new Xeons into its portfolio, including the Altix ICE clusters, the InfiniteStorage servers, and the Octane III personal supercomputer. SGI will also use the new chips in its just-announced Origin 400 workgroup blade solution. IBM, HP and Dell are likewise rolling out new x86 servers based on the 5600 processors.

AMD is looking to trump Intel's latest Xeon offerings with its new Magny-Cours Opteron 6100 series processors, which are set to launch at the end of the month. The new Opterons come in 8- and 12-core flavors and are debuting alongside AMD's new G34 chipset. Although the Opterons lack the Hyper-Threading technology of the Xeons, the additional physical cores and a fourth memory channel should make up for this. Also unlike the 5600 architecture, the Opteron 6100 supports both 2-socket and 4-socket systems, giving server makers additional design flexibility. In any case, the x86 rivalry is quickly heating up as the two chipmakers battle for market share in 2010.

(This article was sourced from HPCwire; the original text can be found on their website.)

Monday, March 15, 2010

DEISA PRACE Symposium 2010

DEISA, the Distributed European Infrastructure for Supercomputing Applications, and PRACE, the Partnership for Advanced Computing in Europe, are once again extending invitations to their joint annual science symposium, an important European HPC event: the DEISA PRACE Symposium 2010, which will take place from May 10 to May 12 in Barcelona, Spain.


(Registration and more information can be found on the DEISA website.)

Thursday, March 4, 2010

SC10 is now accepting submissions for its technical program.

SC10, the premier international conference on high-performance computing, networking, storage and analysis, is now accepting submissions for its technical program. The 23rd annual conference in the series, SC10 will take place in New Orleans, Louisiana from November 13-19, 2010. Over 11,000 attendees from industry, academia and government are anticipated.

Drawing on expertise from the international HPC community, SC10 will build on over two decades of success offering a broad spectrum of technical presentations and discussions including rigorously peer-reviewed papers, panels, tutorials, workshops and posters showcasing the latest findings from laboratories and research institutions around the world.

This year, the technical program encourages participants to focus on one of three thrust areas to be featured prominently at the conference: climate simulation, heterogeneous computing and data-intensive computing.

Climate simulation spotlights the tremendous importance of research in global climate change, including HPC-based climate simulation techniques which help scientists understand global warming, climate change and other environmental processes.

SC10’s other thrusts highlight important emerging HPC technologies. Heterogeneous computing covers the technological and research advances in software that are required for accelerator-based computing, which is now occurring on large-scale machines and could propel supercomputing to the exascale level, where machines are capable of running a million trillion calculations per second.

As scientists depend more and more on supercomputing in their research, they are generating massive amounts of data that must be shared, stored and analyzed by teams of remotely located collaborators. This global trend underlines the importance of data-intensive computing, SC10's third main thrust, highlighting research into innovative solutions for managing data across distributed high-performance computing systems, especially hardware and software requirements for effective data transfer.

Submissions for most areas of the SC10 technical program will be accepted beginning March 1. Technical paper abstracts are due April 2 and final papers as well as submissions for Tutorials and the ACM Gordon Bell Prize are due April 5.

Other imminent submission deadlines include: Workshops, which are due April 15, 2010; the Student Cluster Competition, which is due by April 16, 2010; and Panel submissions, which are due April 26, 2010.

All submissions can be made online via: https://submissions.supercomputing.org/

For the entire list of technical program deadlines, visit:
http://sc10.supercomputing.org/?pg=dates.html

For any questions about the Technical program, email: program (at) info.supercomputing (dot) org

About SC10
SC10, sponsored by IEEE Computer Society and ACM (Association for Computing Machinery) offers a complete technical education program and exhibition to showcase the many ways high-performance computing, networking, storage and analysis lead to advances in scientific discovery, research, education and commerce. This premier international conference includes a globally attended technical program, workshops, tutorials, a world class exhibit area, demonstrations and opportunities for hands-on learning. For more information on SC10, please visit http://sc10.supercomputing.org

Tuesday, March 2, 2010

Fixstars Launches Linux for CUDA

Multicore software specialist Fixstars Corporation has released Yellow Dog Enterprise Linux (YDEL) for CUDA, the first commercial Linux distribution for GPU computing. The OS is aimed at HPC customers using NVIDIA GPU hardware to accelerate their vanilla Linux clusters, and is designed to lower the overall cost of system deployment, the idea being to bring these still-exotic systems into the mainstream.

The reality is that the majority of future accelerated HPC deployments are destined to be GPU-based, rather than Cell-based. While Cell had a brief fling with HPC stardom as the processor that powered the first petaflop system -- the Roadrunner supercomputer at Los Alamos National Lab -- IBM has signaled it will not continue development of the Cell architecture for HPC applications. With NVIDIA's steady evolution of its HPC portfolio, propelled by the popularity of its CUDA development environment, general-purpose GPU computing is now positioned to be the most widely used accelerator technology for high performance computing. The upcoming "Fermi" GPU-based boards (Q3 2010) substantially increase the GPU's double precision capability, add error-corrected memory, and include hardware support for C++ features.

Which brings us back to Fixstars. The company's new YDEL for CUDA offering is aimed squarely at filling what it sees as a growing market for turnkey GPU-accelerated HPC on x86 clusters. Up until now, customers either built their own Linux-CUDA environments or relied upon system OEMs to provide the OS integration as part of the system. That might be fine for experimenters and big national labs who love to tweak Linux and don't mind shuffling hardware drivers and OS kernels, but commercial acceptance will necessitate a more traditional model.

One of the challenges is that Red Hat and other commercial Linux distributions are generally tuned for mass-market enterprise applications: large database and Web servers, in particular. In this type of setup, HPC workloads won't run as efficiently as they could. With YDEL, Fixstars modified the Red Hat kernel to support a more supercomputing-like workload. The result, according to Owen Stampflee, Fixstars' Linux Product Manager (and Terra Soft alum), is a 5 to 10 percent performance improvement on HPC apps compared to other commercial Linux distributions.

Fixstars is selling YDEL for CUDA as a typical enterprise distribution, which in this case means the CUDA SDK, hardware drivers, and Linux kernel pieces are bundled together and preconfigured for HPC. A product license includes Fixstars support for both Linux and CUDA. The product contains multiple versions of CUDA, which can be selected at runtime via a setting in a configuration file or an environment variable. In addition, YDEL comes with an Eclipse-based graphical IDE for CUDA programming. To complete the picture, Fixstars also offers end-user training and seminars on CUDA application development.
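
Fixstars hasn't published the mechanics of that runtime switch, but the general idea of choosing among several installed CUDA toolkits with an environment variable can be sketched on any Linux box. The paths and the CUDA_HOME variable below are illustrative assumptions, not YDEL's documented interface:

# Hypothetical illustration: select one of several installed CUDA toolkits at runtime.
# The /usr/local/cuda-* paths and CUDA_HOME convention are assumptions, not YDEL specifics.
export CUDA_HOME=/usr/local/cuda-2.3               # or /usr/local/cuda-3.0, etc.
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}
nvcc --version                                     # should report the selected toolkit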

(This news was summarized from HPCwire; the full text can be found on their site.)

Monday, February 15, 2010

SC10 Conference

The SC Conference is the premier international conference for high performance computing (HPC), networking, storage and analysis. This year the conference will be held in New Orleans, LA, USA, on November 15-18, 2010.

For more info visit sc10.supercomputing.org

Saturday, January 2, 2010

Penguin Computing Announces Release of New Scyld ClusterWare “Hybrid”

Penguin Computing, experts in high performance computing solutions, announced today that Scyld ClusterWare™ "Hybrid", the newest version of its industry-leading cluster management software, will be released in January of 2010. Scyld ClusterWare Hybrid was developed as a solution for Penguin Computing's Scyld customers who want to provision, monitor and manage heterogeneous operating systems from a single point of control.

Scyld ClusterWare Hybrid is a fully integrated cluster management environment that combines Scyld ClusterWare's industry leading diskless single-system-image architecture with a traditional provisioning architecture that deploys an operating environment to local disk. Combining the "best of both worlds," this hybrid approach provides unmatched flexibility and transparency. Compute nodes can still be booted with Scyld ClusterWare and provisioned extremely rapidly, with a minimal memory footprint and guaranteed consistency, or can be provisioned with a complete operating environment to the local hard drive.

With Scyld ClusterWare Hybrid, target operating environments can be dynamically assigned to cluster nodes at start-up time, allowing for the quick re-purposing of systems according to workload and user demand. Once provisioned, systems can be managed from a single node with a single set of commands, accelerating the learning curve for new users and reducing the time spent on system management for system administrators and researchers tasked with cluster management.

More information is available on Penguin Computing's website.

Thursday, December 10, 2009

PRACE is Ready for the Next Phase



PRACE is eligible to apply for a grant under the European Union’s 7th Framework Programme to start the implementation phase.

In October 2009 PRACE demonstrated to a panel of external experts and the European Commission that the project had made “satisfactory progress in all areas” and “that PRACE has the potential to have real impact on the future of European HPC, and the quality and outcome of European research that depends on HPC services”. Two months before the end of the project, it became eligible to apply for a grant of 20 million euros for the implementation phase of the permanent PRACE Research Infrastructure.

The future PRACE Research Infrastructure (RI) will consist of several world-class top-tier centers, managed as a single European entity. The infrastructure to be created by PRACE will form the top level of the European HPC ecosystem. It will offer competent support and a spectrum of system architectures to meet the requirements of different scientific domains and applications. It is expected that the PRACE RI will provide European scientists and technologists with world-class leadership supercomputers with capabilities equal to or better than those available in the USA, Japan, China, India and elsewhere in the world, in order to stay at the forefront of research.

About PRACE:  The Partnership for Advanced Computing in Europe (PRACE) prepares the creation of a persistent pan-European HPC service, consisting of several tier-0 centres providing European researchers with access to capability computers and forming the top level of the European HPC ecosystem. PRACE is a project funded in part by the EU’s 7th Framework Programme (FP7/2007-2013) under grant agreement n° RI-211528.

Wednesday, November 18, 2009

The winner is Jaguar!

The 34th TOP500 List was released November 17th in Portland, Oregon, at the SC09 Conference.

A PDF version of the TOP500 Report distributed during SC09 can be found here.

In its third run to knock the IBM supercomputer nicknamed “Roadrunner” off the top perch on the TOP500 list of supercomputers, the Cray XT5 supercomputer known as Jaguar finally claimed the top spot on the 34th edition of the closely watched list.

Jaguar, which is located at the Department of Energy’s Oak Ridge Leadership Computing Facility and was upgraded earlier this year, posted a 1.75 petaflop/s performance speed running the Linpack benchmark. Jaguar roared ahead with new processors bringing the theoretical peak capability to 2.3 petaflop/s and nearly a quarter of a million cores. One petaflop/s refers to one quadrillion calculations per second.


Kraken, another upgraded Cray XT5 system at the National Institute for Computational Sciences/University of Tennessee, claimed the No. 3 position with a performance of 832 teraflop/s (trillions of calculations per second).

At No. 4 is the most powerful system outside the U.S. -- an IBM BlueGene/P supercomputer located at the Forschungszentrum Juelich (FZJ) in Germany. It achieved 825.5 teraflop/s on the Linpack benchmark and was No. 3 in June 2009.

Rounding out the top 5 positions is the new Tianhe-1 (meaning "River in the Sky") system installed at the National Super Computer Center in Tianjin, China, which will be used to address research problems in petroleum exploration and the simulation of large aircraft designs. The highest-ranked Chinese system ever, Tianhe-1 is a hybrid design with Intel Xeon processors and AMD GPUs used as accelerators. Each node consists of two AMD GPUs attached to two Intel Xeon processors.

Tuesday, November 10, 2009

What to do with an old nuclear silo?

Question: What do you do with a 36-foot-wide by 65-foot-high nuclear-grade silo with two-foot-thick concrete walls?
Answer: An HPC Center!


A supercomputing center in Quebec has transformed a huge concrete silo into the CLUMEQ Colossus, a data center filled with HPC clusters.

The silo, which is 65 feet high with two-foot-thick concrete walls, previously housed a Van de Graaff accelerator dating to the 1960s. It was redesigned to house three floors of server cabinets, arranged so cold air can flow from the outside of the facility through the racks and return via an interior 'hot core'. The construction and operation of the unique facility are detailed in a presentation from CLUMEQ.

Link: http://www.datacenterknowledge.com/archives/2009/12/10/wild-new-design-data-center-in-a-silo/

(This news was sourced from slashdot.com.)

Tuesday, September 23, 2008

Making portable GridStack 4.1 (Voltaire OFED) drivers.

Remove any previously installed IB RPMs. To do this:
rpm -e kernel-ib-1.0-1 \
dapl-1.2.0-1.x86_64 \
libmthca-1.0.2-1.x86_64 \
libsdp-0.9.0-1.x86_64 \
libibverbs-1.0.3-1.x86_64 \
librdmacm-0.9.0-1.x86_64

lsmod
Then remove all of the "ib_" modules by hand with the "rmmod <modulename>" command.
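
If there are many of them, something like the following saves typing; it relies only on standard lsmod/rmmod behavior, though module dependencies may force a second pass (or unloading in reverse dependency order):

# Unload every loaded kernel module whose name starts with "ib_"
lsmod | awk '$1 ~ /^ib_/ {print $1}' | xargs -r -n1 rmmod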


*** If you previously installed OFED IB from the same package, you can run the ./uninstall.sh
script included in the GridStack-4.1.5_9.tgz package instead of the steps above.
The script does the same things (and more) automatically, so you may prefer it.



1. First, obtain the GridStack source code from Voltaire.
Then:
mkdir /home/setup
cp GridStack-4.1.5_9.tgz /home/setup
cd /home/setup
tar -zxvf GridStack-4.1.5_9.tgz
All of the files will be in "/home/setup/GridStack-4.1.5_9"

cd GridStack-4.1.5_9

2. Install the GridStack drivers

./install.sh --make-bin-package

This process takes about 30 minutes.
Time for a coffee or tea, but not a cigarette...
....
.......
..........
INFO: wrote ib0 configuration to /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0 ONBOOT=yes BOOTPROTO=static IPADDR=192.168.129.9 NETWORK=192.168.0.0 NETMASK=255.255.0.0 BROADCAST=192.168.255.255 MTU=2044

Installation finished
Please logout from the shell and login again in order to update your PATH environment variable

3. Finish the driver settings
First, edit the IP settings for IB.
Just edit "/etc/sysconfig/network-scripts/ifcfg-ib0" as below:

DEVICE=ib0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.129.50.9
NETMASK=255.255.0.0
MTU=2044

Save the file and reboot the system.

4. The GridStack installation adds an init.d service to the system startup.
After boot you should see the ib0 device in the ifconfig output, and
the LEDs of the HCA cards should be on or blinking. Check this...

After the reboot, check the state of the connection with ifconfig:
eth0      Link encap:Ethernet  HWaddr 00:19:BB:XX:XX:XX  
          inet addr:10.128.129.9  Bcast:10.128.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:bbff:fe21:b3a8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:177 errors:0 dropped:0 overruns:0 frame:0
          TX packets:148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:16829 (16.4 KiB)  TX bytes:21049 (20.5 KiB)
          Interrupt:169 Memory:f8000000-f8011100 

ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.129.50.9  Bcast:10.129.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:11 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128 
          RX bytes:892 (892.0 b)  TX bytes:384 (384.0 b)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:336 (336.0 b)  TX bytes:336 (336.0 b)

If you see output similar to the above, you have won. Ping a neighbor's IP address if one is available:
ping 10.129.50.1
PING 10.129.50.1 (10.129.50.1) 56(84) bytes of data.
64 bytes from 10.129.50.1: icmp_seq=0 ttl=64 time=0.094 ms
64 bytes from 10.129.50.1: icmp_seq=1 ttl=64 time=0.057 ms
64 bytes from 10.129.50.1: icmp_seq=2 ttl=64 time=0.064 ms
64 bytes from 10.129.50.1: icmp_seq=3 ttl=64 time=0.056 ms

If you do not see ib0 or cannot ping, the gridstack service may not have been started.
Start it manually: /etc/init.d/gridstack start

If everything is OK, you can make an image of this system for a
central deployment mechanism such as TFTP.

6. Installing the newly compiled GridStack driver on identical machines.
It is easy. After the GridStack compilation process, a new bz2 file and
its md5 checksum are created automatically. You can find these two files one
level above the source folder. In our example, the two files are waiting for your attention here:

ls -al /home/setup
-rw-r--r--   1 root root       88 Nov 23 19:11 GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.md5sum
-rw-r--r--   1 root root 43570798 Nov 23 19:11 GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2

Copy these two files to all of the IB hosts on which you plan to install GridStack.
Unlike the previous steps, this installation does not take many minutes.
Just copy the files to the new machine with scp:

cd /home/setup
scp GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2 \
    GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.md5sum \
    root@10.128.129.10:/home

Switch to the console of the target machine and type these commands:

cd /home
First, check the integrity of the bz2 file:
md5sum -c GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.md5sum
GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2: OK

If you see the OK sign, type this:
tar -jxvf GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64.tar.bz2

A folder called "GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64" will be created.
cd GridStack-4.1.5_9-rhas-k2.6.9-42.ELsmp-x86_64/
./install.sh

The GridStack binary RPMs will be installed automatically.
Configure ifcfg-ib0 as above, reboot, and check for IP connectivity.


7. A bonus tip:
After the GridStack installation there are lots of IB diagnostic tools available under the
/usr/local/ofed/bin directory. For example, running ./ibv_devinfo gives brief
and useful information about HCA connectivity, board model, FW level, etc.

Here is sample output from my machine:
hca_id: mthca0
        fw_ver:                         4.7.400
        node_guid:                      0017:08ff:ffd0:XXXX
        sys_image_guid:                 0017:08ff:ffd0:XXXX
        vendor_id:                      0x1708
        vendor_part_id:                 25208
        hw_ver:                         0xA0
        board_id:                       HP_0060000001
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 29
                        port_lid:               75
                        port_lmc:               0x00

                port:   2
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 29
                        port_lid:               261
                        port_lmc:               0x00





---=== HCA DDR EXP-D FW upgrade after GridStack 4.1 install =--------

ib-burn -y -i VLT-EXPD -a /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img 

INFO: Using alternative image file /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img
Burning : using fw image file: /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img VSD extention : -vsd1 VLT-EXPD -vsd2 VLT0040010001
    Current FW version on flash:  N/A
    New FW version:               N/A

    Burn image with the following GUIDs:
        Node:      0019bbffff00XXXX
        Port1:     0019bbffff00XXXX
        Port2:     0019bbffff00XXXX
        Sys.Image: 0019bbffff00XXXX

    You are about to replace current PSID in the image file - "VLT0040010001" with a different PSID - "VLT0040010001".
    Note: It is highly recommended not to change the image PSID.

 Do you want to continue ? (y/n) [n] : y

Read and verify Invariant Sector               - OK
Read and verify PPS/SPS on flash               - OK
Burning second    FW image without signatures  - OK  
Restoring second    signature                  - OK  

Note that /usr/local/bin/ib-burn is really a Bash script.
Here is another, lower-level way to burn the HCA card FW:

lspci -n | grep -i "15b3:6278" | awk '{print $1}'
if you see "13:00.0" as output type this;

mstflint -d 13:00.0 -i /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img -vsd1 "" -psid HP_0060000001 -y burn > /root/hca-fw-ugr.log
This command does not prompt for confirmation.

To check the FW on the flash, type this:
mstflint -d 13:00.0 q
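
The two steps above can be rolled into one small script. This is only a convenience wrapper around the exact same commands; the PCI device ID, image path, and PSID are taken from this example (and a single HCA is assumed), so they will differ on other hardware:

#!/bin/bash
# Find the HCA's PCI address and burn the firmware with mstflint in one pass.
# The device ID 15b3:6278, image path, and PSID match the example above only.
dev=$(lspci -n | grep -i "15b3:6278" | awk '{print $1}')
if [ -n "$dev" ]; then
    mstflint -d "$dev" -i /usr/voltaire/fw/HCA400Ex-D-25208-4_7_6.img \
             -vsd1 "" -psid HP_0060000001 -y burn > /root/hca-fw-ugr.log
    mstflint -d "$dev" q    # query the flash afterwards to confirm the new FW
else
    echo "No matching HCA found" >&2
fi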

Wednesday, September 3, 2008

SFS 2.2-1 Client Upgrade with GridStack 4.x

SFS is Hewlett-Packard's parallel file system, which is based on the open source Lustre file system.
The acronym SFS stands for Scalable File Share. HP also sells SFS20 disk enclosures, which should not be confused with this software.


Here we are upgrading the client packages (RPMs) to a new level.
Change into /home/sfs-iso-loop/client_enabler and run:


./build_SFS_client.sh --no_infiniband --config --allow_root \
/home/sfs-iso-loop/client_enabler/src/x86_64/RHEL4_U3/SFS_client_x86_64_RHEL4_U3.config



cd /home/sfs-iso-loop/client_enabler/output/RPMS/x86_64
rpm -ivh kernel-smp-2.6.9-34.0.2.EL_SFS2.2_1.x86_64.rpm
rpm -ivh kernel-smp-devel-2.6.9-34.0.2.EL_SFS2.2_1.x86_64.rpm

change /boot/grub/menu.lst to boot from this new kernel.
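
What that change looks like depends on the machine. The excerpt below is only a hypothetical example for the SFS kernel installed above; the exact kernel file names, root device, and partition will differ, so check ls /boot and your existing entries before editing:

# /boot/grub/menu.lst excerpt (hypothetical paths -- adjust to your system)
default=0
timeout=5
title Red Hat Enterprise Linux AS (2.6.9-34.0.2.EL_SFS2.2_1smp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-34.0.2.EL_SFS2.2_1smp ro root=/dev/VolGroup00/LogVol00
        initrd /initrd-2.6.9-34.0.2.EL_SFS2.2_1smp.img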



Reboot the machine, and it's showtime...
