A new day for High Performance Computing with SUSE Linux


SUSE has had a long and successful history of supporting High-Performance Computing, with over 50% of the Top 100 HPC systems running on SUSE Linux Enterprise Server (SLES) technology. SUSE is a Platinum sponsor for OpenHPC and SUSE Linux is the foundation for the build and test environment for OpenHPC on ARM.

High Performance Computing has gone through many changes over the past few years. Businesses are increasingly adopting HPC technology to apply sophisticated analysis techniques to business data. At the same time, we see an increased use of artificial intelligence, machine learning, and high-performance data analytics approaches in traditional HPC environments. We also see that Spectre and Meltdown reinforce the need for maintenance and support. All of these changes drive increased customer demand for HPC solutions that are easier to deploy and support.

SUSE recently made some significant changes to our HPC offerings to address these needs.

New life cycle offering for HPC – Extended Service Pack Overlap Support (ESPOS)

HPC environments often have thousands of systems to provide the compute resources necessary for solving complex problems. One of the most challenging aspects of using HPC is managing the software stack running on the cluster. After the software stack, including the underlying operating system, is installed on all the cluster nodes, administrators are reluctant to update or change that stack. The support life of the operating system is important, particularly for systems that are subject to security compliance requirements.

SUSE releases a new Service Pack for a given SUSE Linux release (such as SLES 12) approximately every 12 months. Each Service Pack is supported for approximately 18 months. Those 18 months include a six-month overlap period between a Service Pack and the subsequent Service Pack. After that six months of overlap support, customers will not receive new fixes unless they upgrade to the later Service Pack or purchase Long Term Service Pack Support (LTSS).

SUSE now provides new subscriptions for HPC that include a longer support life for each SUSE Linux Service Pack. This additional support life is called Extended Service Pack Overlap Support (ESPOS). Customers who purchase the SLES for HPC subscription with ESPOS get an additional year of overlap support, for a total of 18 months of overlap instead of the standard six. This gives customers more time to upgrade and can allow a customer to skip an intervening Service Pack completely. SLES 12 SP3 for HPC is the first Service Pack that can be supported for an additional 12 months via ESPOS.

In the generic example below, a customer who purchased the HPC subscription with ESPOS could stay on a Service Pack for up to 30 months (12 months until the next Service Pack plus 18 months of overlap), while continuing to be supported by SUSE.

Generic_SP_Lifecycle_with_ESPOS_Hires

Support during the ESPOS period includes telephone support and fixes for critical system and security issues.

Note that ESPOS is only available for customers purchasing a SLES for HPC  subscription. ESPOS is not available for regular SLES.

New HPC offering: Long Term Service Pack Support for HPC (LTSS for HPC)

Long Term Service Pack Support (LTSS) for HPC provides customers with telephone support and fixes for critical system and security issues for up to three years beyond the end of the normal Service Pack overlap.  LTSS for HPC can be purchased in one-year increments. The new LTSS for HPC can only be used to extend the life of a SLES for HPC subscription and customers must maintain the underlying SLES for HPC subscription in addition to the LTSS.

In summary, you can purchase these subscriptions for HPC clusters:

SLES_for_HPC_Products

SLES for HPC can be purchased for new clusters or when renewing existing clusters. Customers can convert to SLES for HPC from a standard SLES subscription at renewal. Customers must have the same subscription, such as SLES for HPC with ESPOS, for all HPC nodes in a cluster. Customers cannot mix subscriptions that include ESPOS with subscriptions that do not include ESPOS in the same cluster. Similarly, customers must purchase LTSS for HPC for all nodes in the cluster.

For most HPC customers who need a longer support life, SLES for HPC with ESPOS is the most cost-effective approach because it provides up to 30 months of support for a service pack. SLES for HPC with ESPOS is less expensive than purchasing a standard 18-month subscription and adding one year of LTSS.  Customers who need more than 30 months of support can always purchase the add-on LTSS support for the final two years after ESPOS support ends.
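The support windows described above reduce to simple arithmetic. The following is a minimal sketch in Python that uses only the approximate durations quoted in this post. The function name and the way ESPOS and LTSS are combined (LTSS years start when the base coverage ends, and nothing extends past 54 months) are assumptions made for illustration, not contractual terms.

```python
# Approximate figures quoted in this post; not contractual terms.
RELEASE_CADENCE_MONTHS = 12   # a new Service Pack roughly every 12 months
STANDARD_OVERLAP_MONTHS = 6   # overlap included with every subscription
ESPOS_EXTRA_MONTHS = 12       # additional overlap that comes with ESPOS
LTSS_LIMIT_MONTHS = 54        # 18 standard months plus three years of LTSS


def support_months(espos: bool = False, ltss_years: int = 0) -> int:
    """Return the approximate months of support for one Service Pack.

    Assumption: LTSS years begin when the base coverage (with or without
    ESPOS) ends, and coverage never extends past LTSS_LIMIT_MONTHS.
    """
    if ltss_years < 0:
        raise ValueError("ltss_years cannot be negative")
    months = RELEASE_CADENCE_MONTHS + STANDARD_OVERLAP_MONTHS
    if espos:
        months += ESPOS_EXTRA_MONTHS
    months += 12 * ltss_years
    if months > LTSS_LIMIT_MONTHS:
        raise ValueError("combination exceeds the LTSS window")
    return months


if __name__ == "__main__":
    print("Standard:", support_months())
    print("With ESPOS:", support_months(espos=True))
    print("ESPOS plus two years of LTSS:", support_months(espos=True, ltss_years=2))
```

Under these assumptions, a standard subscription yields 18 months, ESPOS yields 30 months, and either path ends at 54 months once LTSS is added.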

Lower prices for SLES for HPC

HPC clusters generally consist of two types of systems: Head Nodes and Compute Nodes. Head Nodes provide the management function for the cluster and typically run tasks such as workload schedulers, input/output management, hosting shared filesystems, login nodes, cluster authentication, etc. Compute Nodes, as the name implies, only provide the processing resources needed for the HPC workloads.

The key attribute of an HPC cluster is that all the systems are focused on performing compute or I/O intensive subtasks to solve a computation task that is larger than any single system can solve.  The SUSE Terms and Conditions define which workloads and configurations are considered “HPC”.

SUSE has a single price strategy for HPC that uses the same product for both HPC Head Nodes and HPC Compute Nodes.  We believe that a simple pricing model that applies to all HPC cluster systems is easier for everyone.

HPC environments might have a thousand systems to provide the resources necessary for solving complex problems. But these clusters are more homogeneous than a random collection of a thousand systems because all of the compute nodes are running identical copies of the operating system and running subtasks of a larger workload. As a result, the support costs for HPC are lower than for general purpose systems.

Operating system vendors traditionally charge less for subscriptions for HPC environments due to these lower support costs and because of the sheer number of systems involved. SUSE provides HPC-specific products such as SLES for HPC or SLES for HPC with ESPOS to deliver SUSE Linux for HPC environments. Because SLES for HPC has unique prerequisites and restrictions, these products cannot be directly purchased by a customer; they must be purchased through a SUSE business partner or through a SUSE direct salesperson.

We also made significant price reductions as part of the overall changes to the SUSE HPC offering. We believe that these price reductions will encourage more organizations running HPC workloads to consider using SUSE Linux. The recent Meltdown and Spectre security issues have reinforced the need for customers to have a strong partner like SUSE that is able to respond to these kinds of problems quickly.

Enhanced product offerings for HPC partners

Most SUSE Linux HPC clusters are delivered by SUSE hardware and solution partners. These partners often provide Level 1 and Level 2 support to customers and only involve SUSE for problems that require our back-end engineers. In recognition of this service, SUSE now provides SLES HPC products to partners that acknowledge that SUSE only needs to provide Level 3 support.

New support for ARM HPC systems

SUSE has supported 64-bit Arm systems since November 2016 with the introduction of SUSE Linux Enterprise Server for Arm (SLES for Arm). Most of the early 64-bit Arm systems were unsuitable for HPC workloads. That changed in late 2017 with the introduction of ARM-based systems intended for HPC environments by HPE and Cray.

These systems, based on Arm chips such as the Cavium ThunderX2 and Qualcomm Centriq 2400, provide unique capabilities for HPC environments. SUSE had already provided general support for these systems in SLES 12 SP3 for ARM in September 2017. SUSE has now recognized this new option for HPC customers by expanding the platform support for SLES 12 for HPC to include X86-64 and ARM AArch64 hardware platforms.

HPC Module continues to be enhanced

SUSE continues to deliver on our commitment to make HPC easier to implement by adding additional packages to the HPC Module. The HPC Module is intended to simplify deployment and management of HPC environments by providing a number of fully supported HPC packages to our SUSE Linux customers.

These packages were built and tested by SUSE and are provided at no additional cost with the SUSE Linux support subscription. All of the packages are open-source and many are based on packages from OpenHPC.  The HPC Module is provided for customers using X86-64 and ARM AArch64 platforms and is available to customers with SLES for HPC and SLES subscriptions.

The module structure allows SUSE to deliver additions and enhancements to HPC packages more frequently than possible via Service Packs. SUSE delivered two releases of the HPC Module in 2017 but we hope to deliver updates to the HPC Module more frequently in 2018.

HPC_Module_Nov_2017

PackageHub for HPC

Not all packages desired by HPC customers are suitable for inclusion in the HPC Module as a supported component of SUSE Linux for HPC. Examples are packages that are not broadly used or that are in an early development stage. SUSE provides easy access to those packages via PackageHub. We currently provide several packages of interest to the HPC community via PackageHub including singularity, robinhood, and clustershell.

Summary

We are confident that the changes we have made to our HPC offering enhance our ability to meet the needs of the evolving HPC community. We look forward to continuing to improve the value provided by SUSE to HPC customers and welcome your comments and feedback.

Posted in ARM Processors, High Performance Computing, SUSE Linux | Leave a comment

POWER9 is here. SUSE is ready!


IBM POWER9 H922 and H924

They are here! IBM announced the POWER9 servers for small enterprises, which will be available later this quarter.

IBM announced six new servers but the IBM POWER Systems H922 and H924 are of particular interest to SAP HANA customers. These two systems were designed and optimized for Linux running demanding applications like SAP HANA.

The H922 is a 2U system with up to nine PCIe slots. The H924 is a 4U system with 11 PCIe slots. Both systems will support up to 4 x 400 GB M.2 form factor NVMe devices. These systems are excellent choices for customers' scale-out SAP HANA workloads and provide the kind of performance and reliability customers expect from IBM and SUSE.

SUSE  already delivered POWER9 support including NVMe enablement in SUSE Linux Enterprise Server 12 Service Pack 3 (SLES 12 SP3), back in September 2017, so you don’t need to worry about getting started with SUSE on POWER9.

SUSE Linux Enterprise 12 SP3 is currently the only Linux that can run in native POWER9 mode on these new servers.

SLES 12 Service Pack 3 also included support for up to 512 TB virtual address space on Power. This ensures memory intensive applications like SAP HANA can fully use the capabilities of these new POWER9 systems.

Because SUSE has thousands of customers running SAP HANA on Power, you benefit from the experience of the market leader for SAP HANA.

SLES for SAP Applications includes many strong features that provide the reliability, performance, and ease of use that customers need when they deploy applications like SAP HANA.

We recently added a new product, Live Patching support for POWER, to provide an even higher level of reliability by allowing customers to avoid planned downtime.

You can manage all of your SUSE Linux on Power environments with SUSE Manager Server, which now runs natively on POWER, so there is no need to mix architectures just to manage your SAP HANA environment.

SLES_for_SAP_Diagram

We have all been waiting a long time for POWER9, and we are very happy that the latest generation of IBM Power systems will be available for our customers in just a few days.

Come talk to us about SUSE solutions for SAP HANA on Power at the IBM Think conference in Las Vegas in March!

Posted in AIX & Power Systems Blogroll, Information Technology, SAP HANA, SUSE Linux, Uncategorized | Leave a comment

SAP HANA on Power feeling a little cramped? 128TB support for SLES 11 SP4 can help


Memory fragmentation can happen in any operating system, but if you are running a memory intensive workload like SAP HANA, fragmentation can put an upper limit on how long the application can be available before needing to be restarted.

Increasing the amount of virtual address space can alleviate this problem by giving the application more address space to work with.

SUSE increased the virtual address space available in SLES 12 Service Pack 3 for POWER to 512TB last year to help with this problem.

Unfortunately, many of our SAP HANA customers on POWER started out on SLES 11 SP4 and will be running on SLES 11 for a long time to come. What about them?

Help has arrived: with the kernel-bigmem-3.0.101-108.7.1 update, we now support up to 128TB virtual address space on SLES 11 SP4. This provides the head room to allow customers to run their memory intensive applications, like SAP HANA, for longer periods between restarts.
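To check whether a given kernel level is at or beyond the update named above, a simplified comparison can be sketched in Python. This is an illustration only: the helper names are hypothetical, the installed version string is something you would supply yourself, and the numeric comparison is a simplification rather than the full RPM version-comparison algorithm.

```python
import re
from typing import Tuple

# Version-release of the kernel-bigmem update named in this post.
REQUIRED = "3.0.101-108.7.1"


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '3.0.101-108.7.1' into (3, 0, 101, 108, 7, 1) for comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def has_128tb_support(installed: str, required: str = REQUIRED) -> bool:
    """Return True if `installed` is at or beyond `required`.

    Simplification: only the numeric fields are compared, left to right.
    """
    return version_key(installed) >= version_key(required)


if __name__ == "__main__":
    # Hypothetical installed levels, for illustration only.
    for candidate in ("3.0.101-107.1", "3.0.101-108.7.1", "3.0.101-108.10.1"):
        print(candidate, has_128tb_support(candidate))
```

Comparing integer tuples avoids the common mistake of comparing version strings lexically, where "108.10" would sort before "108.7".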

This capability complements the support for 32TB of physical memory that was introduced with the bigmem kernel about a year ago (Blog: More memory now available for SAP HANA on SLES 11).
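For a sense of scale, the sizes mentioned in this post can be expressed as address widths. The short Python sketch below assumes that "TB" here means binary terabytes (2**40 bytes); the function name is hypothetical.

```python
def address_bits(size_tb: int) -> int:
    """Return the number of address bits needed to span size_tb terabytes.

    Assumption: one TB is 2**40 bytes.
    """
    size_bytes = size_tb * 2**40
    return (size_bytes - 1).bit_length()


if __name__ == "__main__":
    sizes = (
        ("physical memory, SLES 11 SP4 bigmem", 32),
        ("virtual address space, SLES 11 SP4 bigmem", 128),
        ("virtual address space, SLES 12 SP3", 512),
    )
    for label, size_tb in sizes:
        print(f"{label}: {size_tb} TB needs {address_bits(size_tb)} address bits")
```

Each doubling of the address space adds one bit, so the step from 128TB to 512TB corresponds to two additional address bits.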

For more information, please see the SUSE TID at https://www.suse.com/support/kb/doc/?id=7018408

 

Posted in AIX & Power Systems Blogroll, Information Technology, SAP HANA, SUSE Linux | 1 Comment

Why it is hard to predict the performance impact of Meltdown and Spectre


One of the key questions about patches to mitigate Meltdown and Spectre is “How is this going to impact my performance?”

Olaf Kirch, Distinguished Engineer and VP of Engineering at SUSE, explains why it is so difficult to predict the performance impact of these mitigations and why the only real answer is to do your own benchmarks.

Read his blog here: Meltdown and Spectre Performance
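As a starting point for doing your own benchmarks, here is a minimal Python sketch that times a loop dominated by system calls against a loop that stays in user space. It is not taken from Olaf's blog; the function names and iteration counts are arbitrary choices for illustration. The idea is that the Meltdown mitigation adds cost to transitions between user space and the kernel, so the two loops may respond differently when you run the same script before and after patching. A real evaluation should use your actual workload.

```python
import os
import time


def time_loop(func, iterations: int = 200_000) -> float:
    """Return the seconds taken to call func() `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    return time.perf_counter() - start


def syscall_heavy() -> None:
    """Enter the kernel on every call by querying the current directory."""
    os.stat(".")


def compute_heavy() -> None:
    """Do a small amount of arithmetic without making a system call."""
    sum(range(100))


if __name__ == "__main__":
    print(f"syscall-heavy loop: {time_loop(syscall_heavy):.3f} s")
    print(f"compute-heavy loop: {time_loop(compute_heavy):.3f} s")
```

Run it several times on an otherwise idle system and compare the ratio of the two timings rather than a single absolute number.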

Posted in AIX & Power Systems Blogroll, ARM Processors, High Performance Computing, Information Technology, SUSE Linux | Leave a comment

What will they say about you when you are gone?


Recently I attended the funeral of a colleague, Bill Maron, from my days at IBM. Bill was the quintessential engineer who had overseen many complex projects at IBM. Bill drove the groundbreaking Transaction Processing Performance Council (TPC) benchmarks that helped to catapult the IBM POWER4 UNIX systems to leadership performance and eventual IBM domination of the UNIX market. Bill ran the IBM UNIX performance team for many years, helped to resolve thousands of critical customer performance situations, and drove the benchmarking activities associated with launching multiple generations of IBM POWER servers.

Bill fought cancer for several years but continued his work of tackling competitive bids and addressing critical performance issues. I connected with Bill on one of my last trips at IBM. It was a customer visit for a competitive bid and I didn’t even know that Bill was ill; he just continued on as usual.

The funeral was held in the middle of the day at a funeral home that was not convenient for people coming from IBM. Despite this, his funeral was extremely well attended.

At the end of the service, people were asked to share a few thoughts about Bill. Now, you would think that the remembrances would include lots of things about Bill’s technical accomplishments and war stories about his long career at IBM.

Instead, the comments were about Bill’s unbounded kindness. Over and over we heard about how Bill had been a mentor who sustained careers, how he had guided his people to be better at their jobs, and even simple kindnesses such as swapping seats on airplanes with total strangers in order to make somebody’s day a little better. The remembrances painted a picture of a genuinely kind and thoughtful person.

I was struck by the fact that what people remembered about Bill was not his many technical and professional achievements but the kind word, the gentle push in the right career direction, and the simple acts of kindness.

I didn’t work with Bill closely but when I think of him, I remember most of all his smile and the mischievous twinkle in his eye.

How will people remember you?

Posted in Random Thoughts | 4 Comments

SUSE Linux Enterprise HPC Module: November 2017 Additions and updates


SUSE continues to deliver on our commitment to make HPC easier to implement by adding additional packages to the SUSE Linux Enterprise Server (SLES) HPC Module.

When we introduced the HPC Module to SUSE Linux early in 2017, we laid out a strategy to make High Performance Computing adoption easier by providing a number of fully supported HPC packages to our SUSE Linux customers.

The key value of the HPC Module is to provide commonly used HPC packages as a fully supported component of SUSE Linux. These packages have been built and tested by SUSE and are provided at no additional cost with the SUSE Linux support subscription. All of the packages included in the SLES HPC Module are open-source and many are based on packages from OpenHPC.

SUSE provides the HPC Module for customers using the X86-64 and ARM hardware platforms. Other than a few hardware-specific packages, all the packages are supported on both platforms. If you haven’t tried the HPC module yet, here are instructions on how to access it.

In this release, we added a number of additional packages as well as updates to existing packages. The new packages include several libraries such as fftw, OpenBLAS, and PETSc; I/O packages such as hdf5 and phdf5; performance tools such as mpiP and tau; and many more.

We also updated slurm and pdsh packages to the latest levels.

SUSE Linux HPC Module package levels: conman, cpuid, fftw, hdf5, hwloc, lua-filesystem, lua-lmod, lua-luaterm, lua-luaposix, memkind, mpiP, mrsh, munge, mvapich2, netcdf, netcdf-cxx, netcdf-fortran, numpy, openblas, openmpi, papi, pdsh, petsc, phdf5, powerman, prun, rasdaemon, ScaLAPACK, slurm

SUSE HPC Module package levels November 2017

We also added two packages of interest to HPC customers, robinhood and singularity, to SUSE PackageHub. PackageHub is a SUSE-curated repository for community-supported, open-source packages that provides SUSE customers with easier installation. Help for using PackageHub can be found here.

We hope that our customers will find the HPC Module useful. Let us know how we are doing via comments or email.

Jay

 

 

 

Posted in High Performance Computing, Open Source, SUSE Linux, Uncategorized | Leave a comment

When they build it, SUSE will be ready

SLES 12 Service Pack 3 for ARM is done. Now what?

SUSE has finished up SLES 12 Service Pack 3 and it is available to our customers. Inside SUSE, Service Pack 3 was labeled a “Consolidation” release. This type of release is intended to focus more on stability than on new features. This gives the Engineering team the opportunity to fix lower priority bugs and pay back technical debt introduced by new features in previous releases.  This is particularly true for legacy SUSE platforms like x86-64.

Service Pack 3 gave us the opportunity to make ARM a full member of the SUSE Linux family. We integrated ARM into the normal engineering and test environments that we already use for other platforms. I won’t bore you with all the details, but this infrastructure enablement is necessary for SUSE to provide the foundation for future success with ARM. Ultimately, we made the investments to complete this infrastructure work and ARM is now a regular hardware platform for SUSE Linux.

What’s happened with ARM servers since Service Pack 2

Last year, when SUSE became the first commercial Linux to support 64-bit ARM, we enabled a fairly limited set of ARM System on a Chip (SoC) processors for the server market. Of that group, only the Cavium ThunderX and Applied Micro XGene-2 systems had performance approaching what was needed for traditional server workloads. The other processors enabled in Service Pack 2 were better suited for niche and embedded workloads. But boy, it sure is fun running SUSE Linux Enterprise on the Raspberry Pi!

We have seen a lot of changes in the ARM server industry, with some SoC vendors scaling back their investment in ARM server platforms and other vendors expanding their investment. We have also seen many vendors working on the second generation of ARM server processors, with enough potential performance to become a viable alternative to the legacy systems in most data centers.

Prototype second-generation ARM systems made their first real appearance at the ISC high performance computing conference in Frankfurt in June. Having real hardware on the show floor made the rumors of impending ARM entry into the HPC market much more real. Hints about the expected performance of these systems piqued the interest of potential users. And some independent hardware vendors started revealing future systems based on the next generation of ARM server chips.

SUSE Linux enablement for the next generation of ARM server processors

There’s no question that the market is very interested in the next generation of ARM servers. SUSE has already done its part to enable these servers by including enablement for Cavium ThunderX2, Qualcomm Centriq 2400, MACOM/Applied Micro XGene-3, HiSilicon Hi1616, and other ARM processors in SLES 12 Service Pack 3.

SUSE and our ARM partners have been working a long time to enable enterprise deployments of 64-bit ARM servers, but now it’s a question of “when” rather than “if”.

The time for ARM servers is coming soon. SUSE Linux is ready today.

SLES12SP3_ARM_list

Posted in ARM Processors, High Performance Computing, Information Technology, Open Source, SUSE Linux, Uncategorized | Leave a comment