
Blog


    July 20

    The Day that T2R2 finally went dark…

    It’s been a heck of a blogging run here on this spaces site. I started this three years ago with a simple name and no real idea what type of stuff I would end up blogging on – Technical Theories Ramblings and Rants (T2R2). Over these three years I have had the distinct pleasure of knowing that pages on this site have been viewed more than a quarter of a million times by people in more than 60 countries spanning the globe. I have had some great feedback on the stuff I have written (and a few ruffled feathers) and I hope it has provided answers to some of the techno-trivia out there which is a daily part of our IT lives. So for those of you who have read this blog – thank you! I hope that this has provided some good information and maybe a couple of chuckles.

    But all good things must come to an end, or in this case just to a transition to something bigger and better. Effective today, T2R2 will continue but it will be on a new home as I will be joining with the incredible crew at SystemCenterCentral (http://www.systemcentercentral.com)! I look forward to working with all of the SCC crew and contributing via my blog site which will now be available at http://www.systemcentercentral.com/blogs/cameronfuller where I will be blogging on System Center related technologies.

    P.S., the new RSS feed for my site is available at http://www.systemcentercentral.com/Community/Blogs/tabid/150/RSS/1/UserId/357/CategoryId/61/Default.aspx

    P.P.S., I am also blogging to my work blog which is located at http://blogs.catapultsystems.com (RSS is http://blogs.catapultsystems.com/cfuller/rss.xml) on all topics including home networking.

    July 17

    Windows Server 2008 Product Activation & Operations Manager

    Recently I logged into my RMS server and had a somewhat unpleasant surprise. While I was sure that I had done the product activation on this server, I noticed in the event logs and on the properties of my system that Windows was not activated yet. I still had three days left to activate my server, and while activating it was not an issue, I realized that if this one had not yet been activated, other servers in my environment might not have been either!

    For background, Windows Server 2008 product activation gives you a 60 day grace period to activate the system. After the 60 days it goes into a notification critical state where you can still log on and the system performs normally; however, the background goes to black (which is my default configuration anyway, so that classifies under ironic) and only critical updates are applied (also my default configuration, so also ironic).

    For additional background, see http://www.microsoft.com/windowsserver2008/en/us/WS08-product-activation.aspx for Windows Server 2008 Activation and http://www.microsoft.com/windowsserver2008/en/us/R2-product-activation.aspx for Windows Server 2008 R2 Activation.

    What’s really strange in this situation is that this server had already been activated, which I know for a fact because it had been functional for more than 60 days and there were still 3 days left on the countdown – weird any way you look at it. In our environment servers are Internet restricted, so Windows activation needed to occur via phone instead of via the network. That may be involved in how we got to this situation, but who knows. So what to do about it? What if this isn’t the only server in the environment that is about to go into a state where it needs to be activated? Go Go Operations Manager 2007 R2! :)

    What I found was that in the application log there were multiple events from the source of “Security-Licensing-SLC”. With some digging I found three of them that were useful:

    8196 – Out of the grace period

    8200 – About to be out of the grace period

    1003 – Windows activation is solid/life is good

    So using these events it was straightforward to create a management pack with two different event monitors. One goes Critical when it finds event 8196 from a source that matches the wildcard *Security-Lic*; the other goes to Warning on event 8200. Originally I configured this to look for the exact source Security-Licensing-SLC but it would not work (maybe too many characters, maybe the hyphen throws it off, no clue why).

    I ended up using the authoring console to create a test management pack which had three different monitors:

    1) A top level aggregate monitor targeted under Windows Computer –> Configuration

    2) A lower level simple event monitor under the aggregate monitor which checked for the warning event of 8200 (and creates a warning alert), and went to healthy status on event 1003

    3) A lower level simple event monitor under the aggregate monitor which checked for the critical event of 8196 (and creates a critical alert), and went to healthy status on event 1003

    These are shown in the authoring console in the graphic below:

     24

    This is what it looks like in Health Explorer on an activated Windows Server system:
    HealthCheck showing Licensing
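
    As a rough illustration of the logic behind these monitors, here is a minimal Python sketch. It is not the management pack itself: the function names are made up for illustration, and the worst-state rollup for the aggregate monitor is an assumption rather than something confirmed from the MP.

```python
from fnmatch import fnmatchcase

HEALTHY, WARNING, CRITICAL = 0, 1, 2

# Wildcard used for the event source, as described in the post.
SOURCE_WILDCARD = "*Security-Lic*"


def monitor_state(events, unhealthy_id, unhealthy_state, healthy_id=1003):
    """Replay (source, event_id) pairs in order and return the final state."""
    state = HEALTHY
    for source, event_id in events:
        if not fnmatchcase(source, SOURCE_WILDCARD):
            continue  # event came from some other source
        if event_id == unhealthy_id:
            state = unhealthy_state
        elif event_id == healthy_id:
            state = HEALTHY
    return state


def activation_health(events):
    """Aggregate monitor: assumed here to roll up the worst child state."""
    warning_monitor = monitor_state(events, 8200, WARNING)
    critical_monitor = monitor_state(events, 8196, CRITICAL)
    return max(warning_monitor, critical_monitor)
```

    Feeding it an 8200 followed by an 8196 leaves the aggregate in the critical state, and a later 1003 resets both child monitors to healthy.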

    This sample MP is available for download at http://www.SystemCenterCentral.com (http://www.systemcentercentral.com/PackCatalog/PackCatalogDetails/tabid/145/IndexID/20935/Default.aspx).  

    Summary: A quick test MP is available above to determine if your servers are about to fail or have failed activation. Going forward this looks like good functionality to add to the existing Vista and Server 2008 MPs.

    July 02

    Active Directory Management Pack, Kinda an Alert?

    Today we received a couple of alerts from the Active Directory 2008 management pack for a 64-bit Windows Server 2008 domain controller. They caught my eye because they did not have alert descriptions and in one case there was no Alert name.

    Normally I would post these as ReSearch This kb articles (see http://www.systemcentercentral.com/default.aspx?tabid=39&search=KB) but generally there are three required pieces of information: The Alert, the Issue, and the Resolution. In both cases there was no alert description and in one of these cases we do not have alert text. Both of these alerts occurred at the same time on the same domain controller:

    Alert: Overall Essential Services state

    Issue: The Overall Essential Services state monitor portion of the Active Directory Domain Controller Server 2008 Computer role identified an alert. No additional knowledge was available.

    Resolution: Speaking with the technician, we found out that he had performed an uninstallation of the Exchange 2007 tools from the domain controller at the time that these alerts activated. These alerts have not recurred since that time. We closed the alerts and will monitor to see if they recur.

     

    Alert: (none)

    Issue: The SysVol for Windows 2008 portion of the Management Pack for Active Directory Server 2008 (Monitoring) identified an alert as part of the DFS Service Health alert monitor for one of the domain controllers in our environment. No additional knowledge was available.

    Resolution: Speaking with the technician, we found out that he had performed an uninstallation of the Exchange 2007 tools from the domain controller at the time that these alerts activated. These alerts have not recurred since that time. We closed the alerts and will monitor to see if they recur.

    Upgrading to Windows Server 2008 SP2

    I ran my first upgrade today from Windows Server 2008 SP1 to Windows Server 2008 SP2. The installation was for a domain controller running in a virtual machine (Hyper-V) and it ran for approximately 1 hour and 20 minutes. At this point it appears to be working without issues. A quick video of the upgrade (less than a minute) is attached to this blog entry to give an idea what the process looks like.

      
    June 23

    OpsMgr Tuning – IIS Discovery Probe Module Failed Execution

    Alert: IIS Discovery Probe Module Failed Execution

    Issue: Non-recurring failure across multiple servers in the environment. Discussed here at http://www.eggheadcafe.com/conversation.aspx?messageid=32202280&threadid=31903288 with no resolution. This is occurring in IIS MP version 6.0.6539.0. We have approximately 115 IIS servers with 85 alerts of this type occurring weekly. Digging into these led to a list of object discoveries specific to the IIS management packs. These included multiple discovery rules set to run every 3600 seconds (hourly). Including:

    IIS 2000:

    Windows Internet Information Services Web Sites 0-25 Discovery Rule

    Windows Internet Information Services Web Sites 26-50 Discovery Rule

    Windows Internet Information Services Web Sites 51-75 Discovery Rule

    Windows Internet Information Services Web Sites 76-100 Discovery Rule

    IIS 2003:

    Windows Internet Information Services Application Pools 0-25 Discovery Rule

    Windows Internet Information Services Application Pools 26-50 Discovery Rule

    Windows Internet Information Services Application Pools 51-75 Discovery Rule

    Windows Internet Information Services Application Pools 76-100 Discovery Rule

    Windows Internet Information Services Web Applications 0-25 Discovery Rule

    Windows Internet Information Services Web Applications 26-50 Discovery Rule

    Windows Internet Information Services Web Applications 51-75 Discovery Rule

    Windows Internet Information Services Web Applications 76-100 Discovery Rule

    Windows Internet Information Services Web Sites 0-25 Discovery Rule

    Windows Internet Information Services Web Sites 26-50 Discovery Rule

    Windows Internet Information Services Web Sites 51-75 Discovery Rule

    Windows Internet Information Services Web Sites 76-100 Discovery Rule

    Each of these various discoveries was occasionally failing with error 0x80070006 (The handle is invalid).

    Resolution: Set overrides on these discoveries to run every 86400 seconds (daily) instead of every 3600 seconds (hourly). This greatly decreased the frequency (down from 85 a day to 10 a day) but did not completely remove these alerts.

    The new MP version (the one after version 6.0.6539.0) will use a consolidator rule to only alert if this occurs multiple times a day, so these overrides should not be required unless the objective is to decrease the overhead associated with this management pack.

    June 12

    Creating Complex Wildcard Expressions on Performance Views in OpsMgr

    When creating a set of dashboards for different servers in our environment, we started with the creation of a group which had the servers defined within it. Once this was done, we created a series of views to show us the state of the servers and to show relevant performance information. The user requirement was a single performance view that showed both the available disk space and the processor queue information for the same systems.

    An unrestricted list of performance counters which were available for these systems is shown in the graphic below:

    Counter select 03

    It would have been easy to create a single performance view which would have displayed the available disk space for the systems, and another performance view which would show the processor queue information for the systems. My goal however was to provide a single performance view that showed both. So I started working with this idea. When creating a performance view you can specify that it has a specific object name or a specific counter name. I wanted the counter for “Free Space” and the counters for “Processor” items. The goal was to provide these counters (but no others):

    • Processor Queue Length
    • % Processor Time
    • % Free Disk Space

    What I found is that you can use brackets to define the acceptable letters for the item you are configuring. As an example, if you wanted to match any single letter from a to z you can specify this with [abcdefghijklmnopqrstuvwxyz]. In my case I wanted to restrict the view enough to get these three counters but no more. So this is how the query was created: I lined up the two strings character by character and put each pair of characters into its own set of brackets:

    Free Space    Processor    Result
    (space)       (space)      [  ]
    f             p            [fp]
    r             r            [rr]
    e             o            [eo]
    e             c            [ec]
    (space)       e            [ e]
    s             s            [ss]
    p             s            [ps]
    a             o            [ao]
    c             r            [cr]
    e             (space)      [e ]

    Combined these form: (nope, sorry not Voltron for my fellow geeks out there. But you gotta wonder when Hollywood is going to jump on that particular boat).

    [  ][fp][rr][eo][ec][ e][ss][ps][ao][cr][e ]

    As shown below:

    Counter select 04

    The result was the view showed only the counters expected as shown below:

    Counter select 02 
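
    The character-by-character pairing above is easy to reproduce in a few lines of Python. This is only a sketch: the helper name bracket_pattern is made up, the counter names in the loop are assumed examples, and the regular expression search is just a rough stand-in for however the OpsMgr console actually evaluates the expression.

```python
import re


def bracket_pattern(first, second):
    """Pair up the characters of two equal-length strings as [xy] groups."""
    if len(first) != len(second):
        raise ValueError("both strings must be the same length")
    return "".join(f"[{a}{b}]" for a, b in zip(first, second))


# Leading/trailing spaces pad both strings to the same length (11 characters).
pattern = bracket_pattern(" free space", " processor ")
print(pattern)

# Rough stand-in for the console's matching: case-insensitive regex search.
matcher = re.compile(pattern, re.IGNORECASE)
for counter in ("% Free Space", "% Processor Time", "% Committed Bytes In Use"):
    print(counter, "->", bool(matcher.search(counter)))
```

    Keep in mind that a pattern built this way is looser than it looks, since each position accepts either character independently of the others.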

    Summary: You can use [] to specify criteria when defining performance views in OpsMgr. You may not need it often, but this is another tool to keep in the toolbelt if you need it.

    New Agents in Grey status

    During a large deployment we were debugging several agents which would not go to a healthy state; they were reporting to OpsMgr in a grey state. We investigated the agent side event logs and saw no indication of an error, and there was no indication of an error on the OpsMgr RMS server or the gateway servers which the systems were reporting into. As part of our debugging we restarted the three services (service names available in http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1809.entry) on the RMS, and each of the agents which were in a grey state started reporting correctly.

    Summary: Something to consider if you hit a wall with several agents which are all in a grey state and won’t report into OpsMgr correctly: try restarting the OpsMgr services on the Root Management Server.

    Multiple Servers reporting sporadic heartbeat failures?

    We ran into a situation where several servers in our environment would report a heartbeat failure and then resume heartbeating successfully for a while. We tracked down the common configuration in the OpsMgr console under Administration / Device Management / Agent Managed: each of these servers was communicating with the same gateway server, and with further digging it turned out that these servers had not been configured to automatically fail over to another gateway server. We logged into the gateway server that was common to the servers reporting heartbeat errors, found that it was non-functional, and rebooted it. To avoid this going forward we re-configured these agents to fail over to another gateway server.

    Summary: If several servers are reporting sporadic heartbeat failures, check and see if there is an issue on the OpsMgr server that they are communicating with (Gateway Server or Management Server). Use redundancy to avoid situations where one Management Server or Gateway Server has issues.
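
    If you can export which gateway each agent reports to, finding the common server is a simple grouping exercise. The sketch below assumes a hand-built inventory; the server names and the dictionary layout are hypothetical and do not come from any OpsMgr API.

```python
from collections import Counter

# Hypothetical inventory: agent -> (primary gateway, failover gateways).
agents = {
    "web01": ("gw1", []),
    "web02": ("gw1", []),
    "sql01": ("gw1", ["gw2"]),
    "app01": ("gw2", ["gw1"]),
}


def common_gateway(failing_agents, inventory):
    """Return the primary gateway shared by most failing agents, or None."""
    counts = Counter(
        inventory[name][0] for name in failing_agents if name in inventory
    )
    return counts.most_common(1)[0][0] if counts else None


def agents_without_failover(inventory):
    """List agents that have no failover gateway configured."""
    return sorted(name for name, (_, failover) in inventory.items() if not failover)
```

    With the sample data, the failing agents web01 and web02 both point at gw1, and they are also the two agents with no failover configured.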

    OpsMgr, Filegroup is Full, MSDE

    My friend JC and I ran into a situation recently where OpsMgr 2007 R2 started alerting us that:

    “Could not allocate space for object in database because the filegroup is full”

    Normally this means that the database is not set to auto-grow or that the drive where the filegroup exists is full. These situations are fixed by configuring the database to auto-grow or by adding another drive that the filegroup can be moved to or extended onto. In this case, however, the database was set to auto-grow and the drive where the filegroup was stored was not full. We attempted to manually extend the database but it failed because it was running in MSDE, which restricts a database to 4 GB, or 4096 MB (this article provides a good reference on limits of MSDE http://databases.aspfaq.com/database/what-are-the-limitations-of-msde.html). To resolve this we ended up moving the database from MSDE to a full version of SQL Server and then expanded the database beyond the 4 GB limit.

    Summary: OpsMgr alerts on filegroups which are full on an MSDE database – remember the 4GB size limit.

    June 01

    Monitoring GlobalScape FTP services with OpsMgr

    We have a requirement to provide monitoring for FTP services that are provided by a product called GlobalScape FTP server. There were several different approaches that I considered here to monitor this via OpsMgr so here were the results:

    Since it was FTP I would not have done due diligence unless I tried to use the Microsoft IIS Management pack. I installed this and verified that it does not provide monitoring for 3rd party FTP services (logically enough) so this was not a viable option to provide monitoring for this configuration. The IIS management pack does provide monitoring for Microsoft FTP services (as shown below):

    image

    My next thought was to try a web monitor, as an FTP site can be browsed in a web browser via ftp://ftpsite.domain.com. To test this I created a web application monitor for http://ftpsite.domain.com in a new management pack (the wizard would not accept the ftp://ftpsite.domain.com option), then exported the management pack, edited it to change the http to ftp, and re-imported it. My hope was that I could test the FTP configuration this way, but it did not work to monitor the FTP site; it caused a series of errors on the watcher node, so I deleted the new management pack.

    So I decided to try a couple of different approaches combined. The first involved the creation of a TCP port monitor to watch the system on port 21 (the FTP port). This worked like a champ as shown below:

    image

    To supplement the port monitor I added a service monitor to watch the “GlobalSCAPE Secure FTP Server” service. This monitored the availability of the service, which worked well to complement the TCP port monitor, and is shown below:

    image
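
    For a quick sanity check outside of OpsMgr, the same idea as the TCP port monitor can be sketched in Python. This is only an approximation of what a port monitor does (it tests that a TCP connection can be opened, nothing more), and the function name is made up for illustration.

```python
import socket


def is_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

    Calling is_port_open("ftpsite.domain.com", 21) from a watcher machine would check the FTP port, using the placeholder host name from this post.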

    Summary: Want to monitor a GlobalScape FTP server with OpsMgr out of the box functionality? Try a TCP port monitor on port 21 and add a Service Monitor for the “GlobalSCAPE Secure FTP Server” service!

    May 26

    QuickTricks: Multiple Gateway Servers going Grey?

    If you read the previous blog article (and were paying attention) you may have noticed that not only was there an unmonitored Management Server, but there were also two gateway servers which were in a grey state. These two gateway servers were the only gateway servers for a remote domain in the environment.

    MS05

    Reviewing the event logs on the gateway servers pointed towards the action account for the gateway server. In this case the password for the domain’s GWAA (Gateway Action Account) had been changed and resulted in the failure of the gateway servers. To resolve this I accessed the action account and changed the password to the updated one from the Administration \ Run As Configuration \ Accounts section.

    MS04

    Summary: If all of the gateway servers in a domain go to a grey state, verify that the user account and password for the Action Account are correct.

    QuickTricks: Management Server Not Monitored?

    I logged into one of my OpsMgr environments and on the Administration space I ran across an interesting issue. A functional management server which considered itself to be Not monitored (see below).

    MS01

    Reviewing the event logs on the management server there was nothing interesting so it wasn’t obvious what the issue was… Until I looked at the management server from the monitoring pane within the computers view. Ah-ha! Maintenance Mode!

    MS02

    Once this was removed from maintenance mode all was good again from an Administration \ Device Management \ Management Servers perspective.

    Summary: A functional Management Server shows as Not monitored? Check to make sure that someone didn’t put it into maintenance mode.

    May 21

    Free Disk Space on Windows 2008/Vista, Virtualization, Hibernation

    We have a large number of servers in our environment which are virtualized and as a result they are created with C drives that are relatively small compared to physical servers (there is no reason to provide a 128 GB C drive on a virtual system if it can be stored in 30 GB). This approach makes it viable to run a large number of virtual guests on a limited amount of disk space. Unfortunately this means that we need to keep the operating system drives as clean as possible. We recently started running out of room on some of our virtuals, and to dig up additional disk space one of the guys here came up with a solid option.

    In the example below, we have a Windows Server 2008 system which has 2 GB of memory assigned to it.

    Hibernate07 

    This system had 13.5 GB of available disk space (the systems on which we originally performed these changes were often down to less than 1 GB of free disk space).

    Hibernate04 

    To free up the drive space, we removed the hibernation file. This is done with the “powercfg -h off” command from the command line. This will fail unless you open the command line with administrator rights, as shown below:

    Hibernate01

    hibernate02

    A successful removal of the hibernation file is shown below:

    Hibernate03

    This freed up an additional 2 GB of disk space on the C drive (the same size as the amount of memory shown on the server above) and did not require a reboot of the system.

    hibernate06

    For virtual systems this removes the ability to hibernate, but we do not use hibernation on any of our virtualized systems so that does not represent an issue.

    Summary: Running out of free space on a virtualized Vista or Server 2008 system? Try removing the hibernation file!

    Thank you to Shane Carden who took our lack of disk space situation and tracked down this option!

    May 18

    Agents are grey but there are no error messages?

    Recently I was working through activating AD Integration in a remote forest. If you are planning on doing this I highly recommend that you read this in detail (http://www.systemcenterforum.org/wp-content/uploads/ADIntegration_final.pdf) and if you have access check out Pete Zerger’s session on how to get AD Integration working in an untrusted forest (both at MMS and TechEd this year). I spent a bunch of time tracking through how the Gateway server was working, how to get AD Integration in place and how to configure the auto-assignment.

    My issue however was that two of my servers would not communicate through the Gateway server correctly. They both showed up in the OpsMgr console as grey, even though I tracked through the logs on the RMS, Gateway, and Agent itself and did not see any issues. So, I assumed that the error must be somewhere in the path of AD Integration, Gateway, or RMS (hey, these were the most complex pieces so the mistake had to be there right?) which was why I wasn’t able to track this down.

    So what was the issue? Both of my two test clients had been renamed (great choices for test clients eh?). OpsMgr had the correct name and the incorrect name listed as agents and it was reporting a grey state on the one that was no longer the correct system name.

    Summary: When you have an agent that has gone grey, log into the agent system and double-check that the name matches the one that OpsMgr expects it to be. If the name has changed and the system is not being monitored, uninstall the agent and re-install it. If it is monitored under both names, delete the incorrect agent name from the OpsMgr console in the Administration pane.
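
    If you have a list of the agent names shown in the console and a list of the current computer names, spotting the leftovers is a simple comparison. A minimal sketch, assuming both lists were gathered by hand or by some export (the function name and inputs are hypothetical):

```python
def find_stale_agent_names(console_agents, actual_names):
    """Return console agent names that no longer match any real computer name."""
    actual = {name.lower() for name in actual_names}
    return sorted(name for name in console_agents if name.lower() not in actual)
```

    Anything this returns is a candidate for a renamed system that still has an old agent entry in the Administration pane.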

    May 07

    Windows IT Professional Magazine: Top 5 Extensions for OpsMgr

    Windows IT Pro just published my article with them which is now online at:

    http://windowsitpro.com/article/articleid/101719/top-5-extensions-for-opsmgr.html

    The topic is: Top 5 Extensions for OpsMgr (Managing Microsoft System Center Operations Manager 2007 smarter, faster, easier).

    This is the first article that I have published with Windows IT Professional, so I would really like to thank Windows IT Pro for accepting my article and my co-workers for their ideas on this topic!

    Update: There is a link change to this article. I will re-post the updated link when it becomes available.

    April 15

    OpsMgr R2 RC, Process Monitoring, Conference Rooms

    A while back we built out some conference room kiosks (discussed at http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1509.entry) which display the status of a conference room using an Outlook Web Access calendar. We have run into issues where the connection to the OWA server would be lost, or where other issues would occur, which would cause these to stop functioning. To address these situations, we recently started monitoring these with OpsMgr 2007 R2.

    One of the cool new functions in OpsMgr 2007 R2 is the integrated Process Monitoring Template. Prior to R2, there were ways to monitor processes but they were a bit clunky (I can say this, I wrote one which is available in the OpsMgr 2007 Unleashed book!).  To provide monitoring for these systems, we needed there to be a single instance of the iexplore executable running on each conference room system. When this was not the state of the system, the conference room system needed to be rebooted.

    The first step was to deploy the OpsMgr agent to these conference room systems. Next we created a custom group which contained only the conference room systems. After that we needed to create a Process Monitor for these systems as shown in these screenshots. We started by opening the authoring pane and adding a process monitor (Authoring / Management Pack Templates / Add Monitoring Wizard).

    01 

    02

    (It’s not recommended to use the default management pack when creating these. Create your own MP here to store any process monitors you are creating).

    03

    In this example we are checking for the iexplore process name, targeted to a custom group we created which contains only the conference room systems.

    04

    For this example, we needed to monitor a minimum of 1 process and a maximum of 1 process, and we started testing at 1 minute. Eventually this last value was increased to 20 minutes to provide time to reboot the systems if it was required. We were not concerned with how long the process was running as that was not applicable for monitoring a conference room monitor.

    05

    We were not concerned about CPU usage or memory usage for this monitor so we left it with the defaults.

    06

    This is what the summary page looked like when the creation process was completed.

    07

    Now that we were monitoring these systems and had the process monitoring in place we could see their status within the Monitoring pane of the OpsMgr console as shown above within the Windows Service And Process Monitoring / Process State section.

    The next step was to provide a recovery which would reboot the conference room systems if they did not have a single instance of the iexplore executable running. To do this, we opened the monitor to the Diagnostic and Recovery tab and created a server reboot task which executed automatically when

    08

    10

    09

    So now when conference rooms stop responding correctly, OpsMgr identifies the situation and reboots them to put them into a healthy state!
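
    The health rule behind this monitor (exactly one instance of iexplore, no more and no less) can be sketched in Python. This is not how the template is implemented; the function names and the "healthy"/"critical" labels are hypothetical, and the list of running process names is assumed to come from wherever you collect it.

```python
def _base_name(process_name):
    """Lower-case a process name and drop a trailing .exe if present."""
    lowered = process_name.lower()
    return lowered[:-4] if lowered.endswith(".exe") else lowered


def process_health(running, name="iexplore", minimum=1, maximum=1):
    """Return 'healthy' when the instance count is within [minimum, maximum]."""
    count = sum(1 for proc in running if _base_name(proc) == _base_name(name))
    return "healthy" if minimum <= count <= maximum else "critical"
```

    With the defaults, zero instances or two instances both count as unhealthy, which matches the minimum of 1 and maximum of 1 configured in the wizard.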

    Summary: Check out the new Process Monitoring templates in OpsMgr R2. Fully integrated, monitor based and works like a champ!

    UPDATE: I received some excellent feedback on this topic from Jonathan Almquist. I'm attaching his feedback and my thoughts below.

    I have some serious concern about this guidance. I cannot believe there is even a thought of having a recovery workflow that automatically reboots a server. This is a very dangerous and risky proposition, in my opinion, and has the potential to do a lot of damage. When I say damage, I’m not just talking about the server or applications it may host. I’m talking about damage to the reputation of SCOM and Microsoft, because this is going to cause problems eventually, and SCOM and Microsoft is going to be held accountable.

    No workflow should ever reboot a server. Rebooting a server should always be a manual operator recovery action.

    My 2 cents.

    My thoughts on Jonathan's feedback:

    I would not recommend the automated reboot of a server either. For this particular situation we are monitoring client machines which provide a single function which is to show the calendar for a room which the client machine is physically outside of. I would like to echo Jonathan's perspective that this approach should NOT be taken on a server as the risk associated with rebooting a server is significant and should not be done in an automated method such as this.

     

    April 14

    Information Technology needs a dose of reality...

    This blog is called Cameron's Technical Theories, Ramblings and Rants. This one is a Rant, so you may stop reading right here if you want. But hey, it's been a long time and you agreed to it when you put an RSS feed onto my crazy blog so here we go!

    <Rant>

    Who else will stand with me and say no more to trying to provide cool sounding names for IT and Project Management related stuff with no respect for what they actually represent? Do we really need to have a "Black Belt" in project management? What's next a “Pink Belt” in graphics design? Or the “Polka-Dot Belt” in Firewall configuration?

    Rangers? Delta force? Come on, give me a break. A real one of those would not even have to try to destroy any of us, sheer will-power would likely be enough to splatter us into the floor. Doesn’t anyone else consider it to be disrespectful to use the same name for technology related stuff as the people who are out there doing the dangerous work to protect us on a daily basis?

    I believe that from here forward anyone with a yellow-belt in martial arts (or higher) or in the military should have the right to go up to someone with one of these titles and school them. Anyone who has earned martial arts belts could do this in their sleep anyways. In martial arts, belts are earned through blood, sweat and pain. If you were actually, say, a brown belt in martial arts and a green belt in project management, wouldn’t you want to beat yourself up? I know that I would!

    What's next, someone using sports analogies to provide cool IT names? Oops. Got me there... I'll shut up now.

    Cameron Fuller – MVP

    </Rant>

    April 10

    SQL Server, Multiple Forests without Two-Way Trusts, SQL Jobs Failing

    We ran into a situation where SQL jobs started failing on a SQL server in a DMZ domain seemingly out of the blue. As background, we have a situation where:

    • Domain-A trusts Domain-B, but Domain-B doesn’t trust Domain-A (This is common in a DMZ configuration)
    • SQL Server is installed on a Domain-A machine (This is not that common, but it is possible)
    • The SQL Server (and agent jobs) services are running on a Domain-A account.

    We started receiving notifications that specific jobs which were running under credentials from Domain-B were failing:

    JOB RUN: ‘{jobname}’ was run on {date/time}

    DURATION: 0 hours, 0 minutes, 0 seconds

    STATUS: Failed

    MESSAGES: The job failed. Unable to determine if the owner (Domain-B\{Username}) of job {jobname} has server access (reason: Could not obtain information about Windows NT group/user ‘Domain-B\{Username}’, error code 0x5. [SQLSTATE 42000] (Error 15404)).

    Digging into this alert, we ran into information that this is an access denied message (http://www.windows-tech.info/15/944bdabc733a57e3.php).
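
    If you collect these job failure messages, pulling out the error codes makes them easier to sort through. The sketch below is a minimal Python example; the function name and the regular expression are my own, and the lookup table only contains the one Win32 code discussed here (error 5, access is denied).

```python
import re

# Win32 error 5 is ERROR_ACCESS_DENIED ("Access is denied").
WIN32_ERRORS = {0x5: "Access is denied"}

MESSAGE_PATTERN = re.compile(
    r"error code (0x[0-9A-Fa-f]+).*?\(Error (\d+)\)", re.DOTALL
)


def parse_job_failure(message):
    """Extract the Win32 error code and SQL error number from a job message."""
    match = MESSAGE_PATTERN.search(message)
    if match is None:
        return None
    win32_code = int(match.group(1), 16)
    return {
        "win32_code": win32_code,
        "win32_text": WIN32_ERRORS.get(win32_code, "unknown"),
        "sql_error": int(match.group(2)),
    }
```

    Messages that do not contain both an error code and a SQL error number simply return None.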

    On the web the recommendation to address this is:  “In order to work, SQL Server should be running under a Domain-B service account, otherwise it is very likely that Domain-B will not accept the token from the service and fail.” (taken from http://social.msdn.microsoft.com/Forums/en-US/sqlsecurity/thread/841a5446-6689-4612-8629-5029a341a77e/). 

    For our environment however what we found was a little more interesting. There had been an account defined on both Domain-A and Domain-B with the same name and the same password. This account was using pass-through authentication to make this work. Upon digging into both of these accounts (Service account name in Domain-A and the same Service account name in Domain-B) we found that the account in Domain-A had been locked out! We unlocked the account (and started logging information on this account going forward to trap any failed logons or lockout situations).

    Summary: If you have a DMZ domain with SQL installed in the DMZ and agent jobs start failing, check to see if you are using pass-through authentication, and if so verify that neither account is locked out, disabled, or has an expired password.

    April 06

    QuickTricks: OpsMgr Service Names

    There are new service names in the R2 version of OpsMgr. The following is the mapping of the new R2 service names and what they map to for the RTM/SP1 version:

    OpsMgr R2 Service Name                    OpsMgr RTM & SP1 Name
    System Center Data Access                 OpsMgr SDK Service
    System Center Management                  OpsMgr Health Service
    System Center Management Configuration    OpsMgr Config Service
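
    If you have scripts or notes that still use the old names, the mapping above is small enough to keep as a lookup table. A minimal Python sketch (the variable and function names are made up, and only the display names listed above are included):

```python
# R2 display name -> RTM/SP1 display name, taken from the table above.
R2_TO_LEGACY = {
    "System Center Data Access": "OpsMgr SDK Service",
    "System Center Management": "OpsMgr Health Service",
    "System Center Management Configuration": "OpsMgr Config Service",
}

# Reverse lookup: RTM/SP1 display name -> R2 display name.
LEGACY_TO_R2 = {old: new for new, old in R2_TO_LEGACY.items()}


def translate_service_name(name):
    """Return the other version's name for a service, or None if unknown."""
    return R2_TO_LEGACY.get(name) or LEGACY_TO_R2.get(name)
```
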
    April 02

    QuickTricks: Where are the agent install logs?

    While doing a test deployment on the upcoming OpsMgr R2 release I was debugging why some of my agents were not deploying and I couldn’t find my agent installation logs to see what was causing the issue. These are created on the management server which the agent will be reporting to. So as an example, I have an RMS (Root Management Server), MS1 (Management Server), MS2, GW1 (Gateway Server), and GW. In my case they were on the MS1 server which the agent was told to report to via AD Integration.

    Summary: The install logs are located on the Management Server which is actually installing the agent in the C:\program files\System Center Operations Manager 2007\AgentManagement\AgentLogs folder.
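
    To grab the most recent install logs quickly, something like the following Python sketch works. The folder path is the one given above; the function name is made up, and since I am not assuming anything about the log file names it simply lists every file in the folder, newest first.

```python
from pathlib import Path

# Folder from the summary above, on the management server doing the install.
AGENT_LOG_DIR = Path(
    r"C:\program files\System Center Operations Manager 2007"
    r"\AgentManagement\AgentLogs"
)


def newest_logs(folder=AGENT_LOG_DIR, limit=5):
    """Return up to `limit` files from the folder, most recently modified first."""
    folder = Path(folder)
    if not folder.is_dir():
        return []  # wrong server, or the folder does not exist here
    files = [p for p in folder.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

    An empty result is a decent hint that you are looking on the wrong management server.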