Cameron Fuller’s Technical Theories Ramblings and Rants (T2R2)
July 20

The Day that T2R2 finally went dark…

It’s been a heck of a blogging run here on this Spaces site. I started this three years ago with a simple name, Technical Theories Ramblings and Rants (T2R2), and no real idea what type of stuff I would end up blogging on. Over these three years I have had the distinct pleasure of knowing that pages on this site have been viewed more than a quarter of a million times by people in more than 60 countries. I have had some great feedback on the stuff I have written (and ruffled a few feathers), and I hope it has provided answers to some of the techno-trivia that is a daily part of our IT lives.

So for those of you who have read this blog: thank you! I hope that it has provided some good information and maybe a couple of chuckles. But all good things must come to an end, or in this case just a transition to something bigger and better. Effective today, T2R2 will continue, but in a new home, as I will be joining the incredible crew at SystemCenterCentral (http://www.systemcentercentral.com)! I look forward to working with all of the SCC crew and contributing via my blog site, which will now be available at http://www.systemcentercentral.com/blogs/cameronfuller, where I will be blogging on System Center related technologies.

P.S. The new RSS feed for my site is available at http://www.systemcentercentral.com/Community/Blogs/tabid/150/RSS/1/UserId/357/CategoryId/61/Default.aspx.

P.P.S. I am also blogging on my work blog, located at http://blogs.catapultsystems.com (RSS is http://blogs.catapultsystems.com/cfuller/rss.xml), on all topics including home networking.

July 17

Windows Server 2008 Product Activation & Operations Manager

Recently I logged into my RMS server and had a somewhat unpleasant surprise. While I am sure that I had done the product activation on this server, I noticed in the event logs and on the properties of my system that Windows was not activated yet.
I still had three days left to activate my server, and while activating it was not an issue, I realized that if this one had not yet been activated, other servers in my environment might not have been either!

For background, Windows Server 2008 product activation gives you a 60 day grace period to activate the system. After the 60 days it goes into a notification critical state where you can still log on and the system performs normally; however, the background goes to black (which I make my default configuration regardless, so that classifies under ironic) and only critical updates are applied (also my default configuration, so also ironic). For additional background, see http://www.microsoft.com/windowsserver2008/en/us/WS08-product-activation.aspx for Windows Server 2008 Activation and http://www.microsoft.com/windowsserver2008/en/us/R2-product-activation.aspx for Windows Server 2008 R2 Activation.

What’s really strange in this situation is that this server had already been activated. I know this for a fact because it has been functional for more than 60 days and there were still 3 days left on the countdown. Weird any way you look at it. In our environment servers are Internet restricted, so Windows activation needed to occur via phone instead of via the network. That may be involved in how we got to this situation, but who knows.

So what to do about it? What if this isn’t the only server in the environment that is about to go into a state where it needs to be activated? Go Go Operations Manager 2007 R2! :)

What I found was that in the application log there were multiple events from the source “Security-Licensing-SLC”. With some digging I found three of them that were useful:

8196 - Out of the grace period
8200 - About to be out of the grace period
1003 - Windows activation is solid/life is good

So using these events it was straightforward to create a management pack with two different monitors.
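As a rough illustration of how these three events map onto health states, here is a minimal Python sketch. It is not the management pack itself and makes no OpsMgr calls; the `Health` enum and the `activation_health` function are hypothetical names, and the simplifying assumption is that the most recent of the three events decides the state.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "Healthy"
    WARNING = "Warning"
    CRITICAL = "Critical"


# Event IDs from the "Security-Licensing-SLC" source, as listed in the post.
EVENT_STATE = {
    1003: Health.HEALTHY,   # Windows activation is solid
    8200: Health.WARNING,   # about to be out of the grace period
    8196: Health.CRITICAL,  # out of the grace period
}


def activation_health(event_ids, initial=Health.HEALTHY):
    """Return the state implied by a chronological list of event IDs.

    Assumption for this sketch: the latest relevant event wins, and
    unrelated event IDs leave the state unchanged.
    """
    state = initial
    for event_id in event_ids:
        state = EVENT_STATE.get(event_id, state)
    return state


print(activation_health([8200, 8196]).value)
print(activation_health([8196, 1003]).value)
```

The real monitors are event-driven rather than batch-evaluated, so treat this only as a way to reason about which event should flip which state.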
One of the monitors goes Critical when it finds event 8196 from a source that matches the wildcard *Security-Lic*. Originally I configured this to look for the exact source Security-Licensing-SLC, but it would not work (maybe too many characters, maybe the hyphen throws it off, no clue why).

I ended up using the authoring console to create a test management pack which had three different monitors:

1) A top level aggregate monitor targeted under Windows Computer –> Configuration
2) A lower level simple event monitor under the aggregate monitor which checks for the warning event 8200 (and creates a warning alert), and goes to healthy status on event 1003
3) A lower level simple event monitor under the aggregate monitor which checks for the critical event 8196 (and creates a critical alert), and goes to healthy status on event 1003

[Screenshot: the three monitors in the authoring console]

[Screenshot: Health Explorer on an activated Windows Server system]

This sample MP is available for download at http://www.SystemCenterCentral.com (http://www.systemcentercentral.com/PackCatalog/PackCatalogDetails/tabid/145/IndexID/20935/Default.aspx).

Summary: A quick test MP is available above to determine if your servers are about to fail or have failed activation. Going forward this looks like good functionality to add to the existing Vista and Server 2008 MPs.

July 02

Active Directory Management Pack, Kinda an Alert?

Today we received a couple of alerts from the Active Directory 2008 management pack for a 64-bit Windows Server 2008 domain controller. They caught my eye because they did not have alert descriptions, and in one case there was no alert name. Normally I would post these as ReSearch This KB articles (see http://www.systemcentercentral.com/default.aspx?tabid=39&search=KB), but generally there are three required pieces of information: the Alert, the Issue, and the Resolution.
In both cases there was no alert description, and in one of them we do not have alert text. Both of these alerts occurred at the same time on the same domain controller:

Alert: Overall Essential Services state

Issue: The Overall Essential Services state monitor portion of the Active Directory Domain Controller Server 2008 Computer role identified an alert. No additional knowledge was available.

Resolution: Speaking with the technician, we found out that he had performed an uninstallation of the Exchange 2007 tools from the domain controller at the time that these alerts activated. These alerts had not recurred since that time. We closed the alerts and will monitor to see whether they recur.
Alert: (none)

Issue: The SysVol for Windows 2008 portion of the Management Pack for Active Directory Server 2008 (Monitoring) identified an alert as part of the DFS Service Health alert monitor for one of the domain controllers in our environment. No additional knowledge was available.

Resolution: Speaking with the technician, we found out that he had performed an uninstallation of the Exchange 2007 tools from the domain controller at the time that these alerts activated. These alerts had not recurred since that time. We closed the alerts and will monitor to see whether they recur.

Upgrading to Windows Server 2008 SP2

I ran my first upgrade today from Windows Server 2008 SP1 to Windows Server 2008 SP2. The installation was for a domain controller running in a virtual machine (Hyper-V) and it ran for approximately 1 hour and 20 minutes. At this point it appears to be working without issues. A quick video of the upgrade (less than a minute) is attached to this blog entry to give an idea of what the process looks like.

June 23

OpsMgr Tuning – IIS Discovery Probe Module Failed Execution

Alert: IIS Discovery Probe Module Failed Execution

Issue: Non-recurring failure across multiple servers in the environment. Discussed at http://www.eggheadcafe.com/conversation.aspx?messageid=32202280&threadid=31903288 with no resolution. This is occurring in IIS MP version 6.0.6539.0. We have approximately 115 IIS servers with 85 alerts of this type occurring weekly. Digging into these led to a list of object discoveries specific to the IIS management packs. These included multiple discovery rules set to run every 3600 seconds (hourly).
Including:

IIS 2000:
Windows Internet Information Services Web Sites 0-25 Discovery Rule
Windows Internet Information Services Web Sites 26-50 Discovery Rule
Windows Internet Information Services Web Sites 51-75 Discovery Rule
Windows Internet Information Services Web Sites 76-100 Discovery Rule

IIS 2003:
Windows Internet Information Services Application Pools 0-25 Discovery Rule
Windows Internet Information Services Application Pools 26-50 Discovery Rule
Windows Internet Information Services Application Pools 51-75 Discovery Rule
Windows Internet Information Services Application Pools 76-100 Discovery Rule
Windows Internet Information Services Web Applications 0-25 Discovery Rule
Windows Internet Information Services Web Applications 26-50 Discovery Rule
Windows Internet Information Services Web Applications 51-75 Discovery Rule
Windows Internet Information Services Web Applications 76-100 Discovery Rule
Windows Internet Information Services Web Sites 0-25 Discovery Rule
Windows Internet Information Services Web Sites 26-50 Discovery Rule
Windows Internet Information Services Web Sites 51-75 Discovery Rule
Windows Internet Information Services Web Sites 76-100 Discovery Rule

Each of these discoveries was occasionally failing with error 0x80070006 (The handle is invalid).

Resolution: Set overrides on these discoveries to run every 86400 seconds (daily) instead of every 3600 seconds (hourly). This greatly decreased the frequency (down from 85 a day to 10 a day) but did not completely remove these alerts. The new MP version (the one after version 6.0.6539.0) will use a consolidator rule to alert only if this occurs multiple times a day, so these overrides should not be required unless the objective is to decrease the overhead associated with this management pack.
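To put the override in perspective, here is a quick back-of-the-envelope calculation in Python. It only uses the interval values and alert counts quoted in this post; the function names are made up for the example.

```python
SECONDS_PER_DAY = 86_400


def runs_per_day(interval_seconds: int) -> float:
    """How many times a discovery with this interval runs in one day."""
    return SECONDS_PER_DAY / interval_seconds


def percent_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


hourly = runs_per_day(3600)     # default interval
daily = runs_per_day(86_400)    # overridden interval

print(f"Runs per discovery rule per server: {hourly:.0f}/day -> {daily:.0f}/day")
print(f"Alert reduction (85 -> 10): {percent_reduction(85, 10):.0f}%")
```

Each rule goes from 24 runs a day to 1, while the alert count quoted above fell by roughly 88%. The alerts dropped by less than the number of runs did, so the failures do not appear to scale strictly with how often the discoveries execute.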
June 12

Creating Complex Wildcard Expressions on Performance Views in OpsMgr

When creating a set of dashboards for different servers in our environment, we started by creating a group which had the servers defined within it. Once this was done, we created a series of views to show us the state of the servers and to show relevant performance information. The user requirement was a single performance view that showed both the available disk space and the processor queue information for the same systems.

[Screenshot: the unrestricted list of performance counters available for these systems]

It would have been easy to create one performance view displaying the available disk space for the systems, and another performance view showing the processor queue information. My goal, however, was to provide a single performance view that showed both. So I started working with this idea.

When creating a performance view you can specify that it has a specific object name or a specific counter name. I wanted the counter for “Free Space” and the counter for “Processor ” items. The goal was to provide these counters (but no others):
What I found is that you can use brackets to define the acceptable letters for each character position of the item you are configuring. As an example, if you want to match any one of the letters a-z in a position, you can specify this with [abcdefghijklmnopqrstuvwxyz]. In my case I wanted to restrict enough to get these three counters but no more. So this is how the query was created: I lined up the two strings character by character and built one bracket per position, which is enough to create a unique condition based upon these counters:

Free Space   Processor   Result
(space)      (space)     [ ]
f            p           [fp]
r            r           [rr]
e            o           [eo]
e            c           [ec]
(space)      e           [ e]
s            s           [ss]
p            s           [ps]
a            o           [ao]
c            r           [cr]
e            (space)     [e ]

Combined these form: (nope, sorry, not Voltron for my fellow geeks out there. But you gotta wonder when Hollywood is going to jump on that particular boat.)

[ ][fp][rr][eo][ec][ e][ss][ps][ao][cr][e ]

[Screenshot: the expression entered in the performance view criteria]

The result was that the view showed only the counters expected.

[Screenshot: the resulting performance view]

Summary: You can use [] to specify criteria when defining performance views in OpsMgr. You may not need it often, but this is another tool to keep in the toolbelt if you need it.

New Agents in Grey status

During a large deployment we were debugging several agents which would not go to a healthy state; they were reporting to OpsMgr in a grey state. We investigated the agent side event logs and saw no indication of an error, and there was no indication of an error on the OpsMgr RMS server or the gateway servers which the systems were reporting into. As part of our debugging we restarted the three services (service names available in http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1809.entry) on the RMS, and each of the agents which were in a grey state started reporting correctly.

Summary: Something to consider if you hit a wall with several agents which are all in a grey state and won’t report into OpsMgr correctly: try restarting the OpsMgr services on the Root Management Server.
Multiple Servers reporting sporadic heartbeat failures?

We ran into a situation where several servers in our environment would report a heartbeat failure and then start heartbeating successfully again for a while. We tracked down the common configuration when we found in the OpsMgr console / Administration / Device Management / Agent Managed that each of these servers was communicating with the same gateway server. With further digging it turned out that these servers had not been configured to automatically fail over to another gateway server. We logged into the gateway server that was common to the servers reporting heartbeat errors and found that it was non-functional, so we rebooted it. To avoid this going forward we re-configured these agents to fail over to another gateway server.

Summary: If several servers are reporting sporadic heartbeat failures, check whether there is an issue on the OpsMgr server that they are communicating with (Gateway Server or Management Server). Use redundancy to avoid situations where one Management Server or Gateway Server has issues.

OpsMgr, Filegroup is Full, MSDE

My friend JC and I ran into a situation recently where OpsMgr 2007 R2 started alerting us that:

“Could not allocate space for object in database because the filegroup is full”

Normally this means that the database is not set to auto-grow or that the drive where the filegroup exists is full. These situations are fixed by accessing the database and configuring it to auto-grow, or by adding another drive that the filegroup can be moved to or extended onto. In this case, however, the database was set to auto-grow and the drive where the filegroup was stored was not full. We attempted to manually extend the database but it failed because it was running in MSDE and was restricted in size to 4096 MB.
MSDE is restricted to a database size of 4GB (this article provides a good reference on the limits of MSDE: http://databases.aspfaq.com/database/what-are-the-limitations-of-msde.html). To resolve this we ended up moving the database from MSDE to a full version of SQL Server, and then we expanded the database beyond the 4GB limit.

Summary: OpsMgr alerts on filegroups which are full on an MSDE database; remember the 4GB size limit.

June 01

Monitoring GlobalScape FTP services with OpsMgr

We have a requirement to provide monitoring for FTP services that are provided by a product called GlobalScape FTP server. I considered several different approaches to monitoring this via OpsMgr, so here are the results.

Since it was FTP, I would not have done due diligence unless I tried the Microsoft IIS management pack. I installed it and verified that it does not provide monitoring for 3rd party FTP services (logically enough), so this was not a viable option for this configuration. The IIS management pack does provide monitoring for Microsoft FTP services.

[Screenshot: Microsoft FTP service monitoring in the IIS management pack]

My next thought was to try a web monitor, as an FTP site can be browsed in a web browser via ftp://ftpsite.domain.com. To test this I created a web application monitor for http://ftpsite.domain.com (the wizard would not accept the ftp://ftpsite.domain.com option) in a new management pack, then exported the management pack, edited it to change the http to ftp, and re-imported it. My hope was that I could test the FTP configuration this way, but it did not work to monitor the FTP site. Instead it caused a series of errors on the watcher node, so I deleted the new management pack.

So I decided to try a couple of different approaches combined. The first involved the creation of a TCP port monitor to watch the system on port 21 (the FTP port).
This worked like a champ.

[Screenshot: the TCP port monitor]

To supplement the port monitor I added a service monitor to watch the “GlobalSCAPE Secure FTP Server” service. This monitored the availability of the service, which worked well to complement the TCP port monitor.

[Screenshot: the service monitor]

Summary: Want to monitor a GlobalScape FTP server with OpsMgr out of the box functionality? Try a TCP port monitor on port 21 and add a Service Monitor for the “GlobalSCAPE Secure FTP Server” service!
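If you want to sanity-check the port from a watcher machine before building the monitor, a TCP connect test is easy to script. The following is a minimal Python sketch of that idea, not what the OpsMgr TCP port template runs internally; `tcp_port_open` is a hypothetical helper name and the 5 second timeout is an arbitrary choice.

```python
import socket


def tcp_port_open(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    This only proves that something is listening on the port. It does not
    log in or verify that the FTP service behind it is actually healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call (the host name is the placeholder used in this post):
#   tcp_port_open("ftpsite.domain.com", 21)
```

Because a successful connect only shows that the port is answering, pairing it with the service monitor as described above still makes sense.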