DNN Blog

Oct 26

Posted by: Shaun Walker
10/26/2007

As many of you probably noticed, we have been experiencing some stability issues on dotnetnuke.com over the past couple of days. Every couple of hours the website would seem to "hang" and pages would no longer be served to visitors. In order to get things working again, we would need to restart the website or recycle the application pool manually. Then the site would return to normal operations - at least for a period of time before it would "hang" once again. So what was the problem?

Well the first thing we wanted to verify was that browser requests were actually reaching the server in a consistent manner. We were not getting 404 errors, but the browser would sit there indefinitely trying to load a page. We tried pinging the server and had no problems. We tried tracert and sometimes it appeared to time out just before it reached the server. We are hosted by MaximumASP, so we contacted them to determine if it was potentially a DNS problem. We were informed that their network firewall prevents tracerts from completing - so the behavior we were seeing was accurate. We connected remotely to the server and tried accessing the site locally. Pages would not be served - so this ruled out DNS.

The next thing we identified was that we had performed a number of upgrades to the site this past week. We had upgraded from 4.6.0 to 4.6.2. We had upgraded to the new version of Forums (4.4.3). We had upgraded to the new version of Blog (3.3.1). And we had upgraded the content for the Online Help. Had one of these items caused the problem?

We decided to do some snooping on the web server. Looking at IIS we could see that the site was running fine. Loading perfmon we could see that the box was not overloaded. Next we looked at the Event Logs for the machine. Nothing in the logs stuck out as a red flag. We restarted the IIS service, recycled the app pool, and restarted the app. While the website was up and running, we logged into it as the host user and took a look at the Event Viewer in DotNetNuke. There was nothing which identified a serious problem. We looked at the App_Start events and saw that they were not firing regularly - so we knew the site was not constantly recycling on its own. Still the site would "hang" after a period of time. So what next?

We decided to connect remotely to the SQL Server 2005 server and take a look. Running perfmon we could see that the CPU utilization was pegged at 100%. Basically SQL Server was maxed out and therefore was not returning data to the web application. Restarting the web application would result in SQL Server getting unblocked, settling at <20% CPU utilization, until suddenly it would spike to 100% again. At first we thought there must be a rogue query or transaction which was causing SQL Server to become blocked. We ran SQL Profiler to try to identify the offender. But still no luck. So what next?

We went back to the web server. We opened taskmgr and went to the Processes tab. We sorted the table by the Image Name column and highlighted the w3wp.exe processes. Each of these represents an individual application pool on the server. There were 11 of them. And looking at the Mem Usage column we could tell that the total memory used was > 2.0 GB. So what is the significance of this? Well, if you refer back to my Performance blog:
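
The same tally can be scripted instead of read off Task Manager. The following is a minimal sketch in Python, assuming the CSV output of the Windows tasklist command (for example: tasklist /FI "IMAGENAME eq w3wp.exe" /FO CSV /NH) and a Mem Usage column formatted like "185,432 K". The function name and the sample rows are made up for illustration; they are not figures from the dotnetnuke.com server, and the memory format can differ by locale.

```python
import csv
import io


def total_w3wp_memory_kb(tasklist_csv: str) -> int:
    """Sum the Mem Usage column (in KB) for every w3wp.exe row."""
    total = 0
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Expected columns: Image Name, PID, Session Name, Session#, Mem Usage
        if len(row) < 5 or row[0].lower() != "w3wp.exe":
            continue
        # Mem Usage looks like "185,432 K": keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        total += int(digits)
    return total


# Hypothetical sample rows, not real measurements.
SAMPLE = '''"w3wp.exe","1234","Console","0","185,432 K"
"w3wp.exe","2345","Console","0","210,004 K"
"sqlservr.exe","999","Console","0","900,000 K"
'''

if __name__ == "__main__":
    kb = total_w3wp_memory_kb(SAMPLE)
    print(f"w3wp.exe total: {kb / (1024 * 1024):.2f} GB")
```

Rows for other processes are skipped, so the result reflects only the IIS worker processes.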

http://www.dotnetnuke.com/Community/Blogs/tabid/825/EntryID/1203/Default.aspx

You will remember that a 32-bit Windows box only has 2.0 GB of memory available to all application pools. The memory is allocated equally across the application pools. So with 11 active application pools, the dotnetnuke.com website app pool did not have enough memory available to satisfy its needs - it was starving. How?

The DotNetNuke web application relies on caching to achieve optimal performance. When the application needs data, it makes a call to the database and then stores the result in the ASP.NET cache so that it does not need to call the database on subsequent requests (retrieving data in-process from the cache is far more efficient than retrieving data out-of-process from a database). But this model falls apart when there is not enough memory available. ASP.NET will attempt to insert the data into the cache, but will be unsuccessful because it is full already. As a result, when the application needs to access that same data in the future, it will be forced to go to the database again (and again and again and again...). In our case the database was being hammered so hard that it was spiking the CPU utilization at 100%, blocking SQL Server threads. This meant data was not being passed back to the web application, which, in turn, resulted in the web application not being able to issue a response to the web browser. So it would "hang". So how to fix it?
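
To make the effect concrete, here is a small simulation in Python. It is only a sketch under stated assumptions: the cache is modeled as a tiny least-recently-used (LRU) cache with a fixed item count, which is a simplification and not how the ASP.NET cache is actually implemented, and the "database" is just a counter. All names (BoundedCache, count_database_calls) are hypothetical.

```python
from collections import OrderedDict


class BoundedCache:
    """Tiny LRU cache standing in for a memory-constrained cache (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        return None

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item


def count_database_calls(capacity, keys, rounds):
    """Request every key once per round; count how often we fall through to the database."""
    cache = BoundedCache(capacity)
    db_calls = 0
    for _ in range(rounds):
        for key in keys:
            if cache.get(key) is None:
                db_calls += 1  # cache miss: simulated database query
                cache.put(key, f"row for {key}")
    return db_calls


if __name__ == "__main__":
    keys = list(range(10))
    print("cache fits everything:", count_database_calls(10, keys, 100))
    print("cache half the size:  ", count_database_calls(5, keys, 100))
```

With room for all 10 items, only the first round touches the database (10 calls). With room for only 5, every one of the 1000 requests in this access pattern misses the cache and goes to the database, which is the kind of amplification described above.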

Well, as it turns out, we did not actually need to have 11 application pools on our web server. We had provisioned them that way to provide isolation for various applications, but the reality is that we could group some of the applications together. Consolidating the application pools down to 5 pools resulted in more than a 2X memory allocation increase for the remaining pools. With more memory available, DotNetNuke could cache data efficiently and take the burden off of SQL Server, bringing it back down to <20% CPU utilization.
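
The "more than 2X" figure follows from simple division. A quick check in Python, assuming the equal-split model described in this post (2.0 GB shared evenly across pools); real memory behavior per worker process is more nuanced, so treat this as back-of-the-envelope arithmetic only.

```python
TOTAL_GB = 2.0  # figure quoted in the post for a 32-bit Windows box


def per_pool_gb(pool_count, total_gb=TOTAL_GB):
    """Memory per pool under the simplifying assumption of an equal split."""
    return total_gb / pool_count


before = per_pool_gb(11)
after = per_pool_gb(5)

if __name__ == "__main__":
    print(f"11 pools: {before:.2f} GB each")       # 0.18 GB
    print(f" 5 pools: {after:.2f} GB each")        # 0.40 GB
    print(f"increase: {after / before:.1f}x")      # 2.2x
```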

The moral of the story is that there are some serious complexities in diagnosing ASP.NET site issues. Your journey will take many twists and turns as you follow the trail of evidence. Assuming that the problem is related to the web application will often lead you down the wrong path. More often than not, the problem will be related to your specific server environment. So it is also critical to understand the behavior and constraints of the Windows server environment, as this will allow you to connect the various dots.


9 comment(s) so far...


Re: Troubleshooting Site Issues

Thanks for the write up. More often than not stories like this go untold so only those directly involved learn a lesson. I am sure it will benefit me and others down the road when we least suspect it.

By ccarns on   10/26/2007

Re: Troubleshooting Site Issues

Also, is this related to the 2 GB allocated to system and 2 GB allocated to application by default in Windows? If so there is a /3GB flag you can add to the boot.ini file that allows you to change this. I read it on MSDN when trying to optimize MS Reporting Services: http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx

By ccarns on   10/26/2007

Re: Troubleshooting Site Issues

Whether you have the default 2GB or are using the 3GB switch, you still need to be conscious of the fact that the memory is shared among all the app pools on your server. At some point you may exceed the memory requirements on one of the pools and if you do, you may start to experience strange behavior which does not directly point to a RAM problem.

By sbwalker on   10/26/2007

Re: Troubleshooting Site Issues

The enterprise version of 2003 server will support up to 16gb of RAM.

We have a box running with 8GB using dual xeon processors with the following line in the boot.ini
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /PAE /NoExecute=OptOut

The PAE parameter allows for up to 16 GB with a 32bit OS. See http://support.microsoft.com/kb/283037

By ActiveInternet on   10/27/2007

Re: Troubleshooting Site Issues

It is a common misconception that Windows 2003 servers can be "scaled up" by adding more RAM. This is not really the case. From a web serving perspective, it is not about the actual physical memory on the box, but the Address Space. By default, the address space in 32-bit Windows is 2GB. There is a hack to increase it to 3GB in boot.ini. The /PAE switch described above is a feature of x86 processors that allows for up to 64 Gigabytes (GB) of physical memory to be used in 32-bit systems, given appropriate operating system support. I would be curious if this has any impact on Address Space. And if so, what would be the upper limit: 4GB? This is my favorite whitepaper on Windows 2003/IIS6 configuration: http://www.asp.net/learn/whitepapers/aspnet-and-iis6/

By sbwalker on   10/27/2007

Re: Troubleshooting Site Issues

Hi Shaun... I read your prior performance blog. I'm looking at building a site which should get significant traffic. Memory usage of a DNN install obviously is dependent on what modules are used in a given portal. I have a few questions and perhaps a few insights.

I'd guess that the forums, and specifically searches, are among the DNN home site's largest draw? This might be an area of focus given most DNN interactive sites probably rely heavily on forums (just as is the case in Joomla, Drupal sites etc). Amazon for example uses Drupal to handle its forums which have an enormous load. If your forums are indeed the largest area of load, perhaps searches past a watermark date could be handled differently. For example, the module could be more intelligent about using keywords: keywords supplied by the posters, and perhaps forum threads keeping track of keywords resulting in hits/click-throughs. For searches that are past a set watermark date, only keywords or perhaps the title would be searched rather than body content.



With that said the site I am working to build will rely heavy on forums. My questions are:

1. Does the DNN home site use a dedicated Database Server?

2. Given your setup with Maximum ASP and the present site how many users average can the DNN home site service before problems due to load appear?

3. Would it be possible as part of DNN admin to slap some real statistics based on "present load" in place. So a DNN site admin could pop in and look at memory usage, CPU usage etc. as PART of the DNN package. Thus they can look and perhaps then be able to react by site changes whatall to adjust as needed.

Additionally... something to consider (and it'd prolly' be a good deal of work): the DNN team might look at building in some distributed computing. Thus a site like the one we are looking at building, which may well have an enormous forums load, could distribute it to its own dedicated server. If DNN is ever to be able to service enormous user loading, it's really the only answer. One of the things REALLY high traffic Joomla/Mambo sites do is bridge, let's say, PHPBB. So a module will serve as a bridge performing the login and transferral of various data to PHPBB and the end user is essentially now using a completely different server but rather transparently if you will.

Lastly... While looking at some stuff at CodeProject I saw a project where DNN has been converted to C#. The author claims considerable performance improvement. Can you make any comments on this?

By tssrg on   10/28/2007

Re: Troubleshooting Site Issues

Looks like you can get more than 4gb of address space according to this article with the enterprise edition of Windows Server 2003.
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEdrv.mspx - the footnote refers to address space (excerpt below)

Introduction
PAE is an Intel-provided memory address extension that enables support of greater than 4 GB of physical memory for most 32-bit (IA-32) Intel Pentium Pro and later platforms. This article provides information to help device driver developers implement Windows drivers that support PAE.

Microsoft supports Physical Address Extension (PAE) memory in Microsoft Windows 2000, Windows XP, and Windows Server 2003 products:

Operating system                                  Maximum memory support with PAE
Windows 2000 Advanced Server                      8 GB of physical RAM
Windows 2000 Datacenter Server                    32 GB of physical RAM
Windows XP (all versions)                         4 GB of physical RAM*
Windows Server 2003 (and SP1), Standard Edition   4 GB of physical RAM*
Windows Server 2003, Enterprise Edition           32 GB of physical RAM
Windows Server 2003, Datacenter Edition           64 GB of physical RAM
Windows Server 2003 SP1, Enterprise Edition       64 GB of physical RAM
Windows Server 2003 SP1, Datacenter Edition       128 GB of physical RAM

* Total physical address space is limited to 4 GB on these versions of Windows.

PAE is supported only on 32-bit versions of the Windows operating system. 64-bit versions of Windows do not support PAE. For information about device driver and system requirements for 64-bit versions of Windows, see 64-bit System Design.

By ActiveInternet on   10/28/2007

Re: Troubleshooting Site Issues

To answer your questions: 1) Yes, we use a dedicated database server. 2) We are currently running only a single web server but plan to migrate to a web farm very soon. The number of users which any DNN site can support is highly dependent on hardware specs, number of apps running on the server, network configuration, geographic dispersion of users, etc... In fact the original point of this post was to explain how the environment affects the scalability of the DNN application. So it would really not be relevant to make any load comparison unless your environment mirrored all of our criteria exactly. 3) It is not possible to include any metrics about load into the DNN application. We approached Microsoft with this problem in the spring and were told there is no way to accomplish this in ASP.NET. Beyond technical limitations there would be security implications of revealing information like this to a DNN user - especially in shared hosting environments.

By sbwalker on   10/28/2007

Re: Troubleshooting Site Issues

Very educational. This posting gave me a very good sense of how one should approach the troubleshooting of a large scale environment. Very methodical, and I appreciate your sharing this.

Thank you.

By MarkHGordon on   4/10/2008