Windows HPC Server 2008 Head Node Performance Tuning

This white paper focuses on the Windows HPC Server 2008 components that run on the head node and is intended to help you plan, monitor, and tune these components.

The HPC Job Scheduler Service, the HPC Management Service, and Microsoft SQL Server 2005 are the most important components in determining the overall performance and reliability of the cluster. The size of the workload is the largest determining factor in how powerful the head node needs to be. 

All of a computer’s resources (memory, CPU, disk, and network) are vital to the performance of an HPC Server head node in different ways. Insufficient RAM, hard disk capacity, network capacity, or CPU power can each result in sub-optimal performance of the head node. For this white paper, a 250-node cluster was tested internally at Microsoft, and the test results are described.

The primary driver of data growth is the number and size of tasks, especially when those tasks redirect output and error streams to the database instead of to an output file. One important factor in determining how large the database will grow is the clean-up interval of the Job Scheduler. The data growth of the 250-node test cluster is described for further illustration.
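As a rough illustration of how these factors combine, the following sketch estimates the steady-state volume of retained task data. It assumes that retained data scales linearly with the number of tasks and the clean-up interval. The function name, its parameters, and the sample figures are hypothetical and are not measurements from the 250-node test cluster.

```python
def estimate_retained_task_data_mb(jobs_per_day, tasks_per_job,
                                   bytes_per_task, cleanup_interval_days):
    """Back-of-the-envelope estimate of retained task data, in megabytes.

    Assumption (not from the white paper): every task stores roughly
    bytes_per_task in the database, including any redirected output and
    error streams, and data is kept for cleanup_interval_days.
    """
    retained_tasks = jobs_per_day * tasks_per_job * cleanup_interval_days
    return retained_tasks * bytes_per_task / (1024 * 1024)


# Illustrative figures only: 1,000 jobs per day, 10 tasks per job,
# 1 KB stored per task, and a 5-day clean-up interval.
size_mb = estimate_retained_task_data_mb(1000, 10, 1024, 5)
print(f"Estimated retained task data: {size_mb:.1f} MB")
```

Under this simple model, doubling the clean-up interval or the output stored per task doubles the estimate, which is why redirected output streams and the clean-up interval matter so much for database sizing.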

The configuration of the database, specifically the use of Microsoft SQL Server Express versus Microsoft SQL Server Standard, is an essential factor in creating a cluster. The merits of both options are explored, followed by a brief description of a typical maintenance plan and SQL configuration best practices.

Service-oriented architecture (SOA) applications have some special capacity planning requirements because of the particular way that they interact with the cluster and the additional need for Windows Communication Foundation (WCF) Broker nodes in the cluster. Key factors and best practices are explored.

By examining CPU, memory, and disk metrics, you should be able to determine if a Windows HPC Server head node or broker node is hitting a hardware bottleneck and what that bottleneck is. Guidelines for checking the head node and broker node components are described.

A number of best practices are recommended for large clusters, and most of them should also improve the behavior of smaller clusters. The only exceptions to these guidelines are test clusters on which performance and reliability of the head node services may not be critical and smaller clusters where the head node services will not be highly stressed.
