Monday, March 30, 2015

Group Policy Fails to Apply on Domain Controllers With IPv6 disabled


Let me start by saying: don't disable IPv6 on domain controllers. There's no reason for it, and it will cause you more headaches in the long run. From what I've read, disabling it used to be considered best practice in the early days of IPv6, but Windows is smart enough now that it shouldn't have any problems with it.

So I inherited a domain setup where the domain controllers (Running Server 2008 (not R2)) had IPv6 disabled for some reason. It's not well documented why this was done, and it's on my list of things to fix, but for the moment I'm stuck with it.

As to why I'm still running DCs on 2008, shut up it's on my to-do list.


Disable the 6to4 adapter. The 6to4 adapter registers itself in DNS and causes lookup problems. I couldn't find a good way to tell the adapter not to register itself, so I settled for disabling it. From an admin command prompt:
netsh int ipv6 6to4 set state disabled
I've been running this way for a week or so and haven't had any more problems. I am able to run a gpupdate on the domain controllers and have it apply successfully.

Problem & Full Story

 See "Forward" as to why I am operating DCs with IPv6 disabled.

Started running into this problem a while back, but had never had the time to troubleshoot it fully. Domain Controllers would randomly stop working correctly requiring reboots and extensive testing. These problems would take the form of

  • Slow logins -- stuck at applying user settings
  • Slow boot -- stuck applying computer settings
  • "no trust relationship" errors
  • DCs failing to apply group policy updates via "gpupdate /force"
    • Updates would apply correctly on reboot
Eventually, going through logs, I was able to narrow it down to a DNS issue. Mostly there were "RPC Server Unavailable" errors, which indicated a lookup failure. The DCs also function as DNS servers, so the fact that they had lookup problems when nothing else seemed to was doubly strange.

Parsing through the "Forward Lookup Zones" I found that the DCs were still registering IPv6 addresses. If I cleared those addresses out manually then everything seemed to work. The DNS entries I cleared out looked like this:

Name                    Type                    Data                    Timestamp
(same as parent folder) IPv6 Host(AAAA)         2222::etc::2220         2/2/22
(same as parent folder) IPv6 Host(AAAA)         2222::etc::2221         2/2/22
DC1                     IPv6 Host(AAAA)         2222::etc::2222         2/2/22
DC2                     IPv6 Host(AAAA)         2222::etc::2223         2/2/22

After clearing these out everything would work for a while. But the entries would eventually re-create themselves and the problems would come back. The (same as parent folder) entries -- which refer to the resolution of the domain name itself -- would recreate every hour or so, whereas the DC1 and DC2 entries seemed to only come back on a reboot. It was baffling because the IPv6 adapters were totally disabled, so I couldn't figure out why/how they were continuing to register themselves.

After looking through ways to more forcibly disable IPv6 (through registry hacks, etc.), and deciding that was a bad idea, I thought to look more closely at the addresses that were being registered. Issuing an ipconfig /all, I realized that those addresses were associated with the 6to4 adapters, not the actual IPv6 interface.
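A quick way to tell whether a registered AAAA record comes from a 6to4 adapter: 6to4 addresses always fall in the 2002::/16 prefix, and the 32 bits that follow are the host's IPv4 address. The sketch below illustrates that check. The function name Get-6to4EmbeddedIPv4 and the sample address are hypothetical, made up for this example rather than taken from the DCs described here.

```powershell
# Hypothetical helper: returns the IPv4 address embedded in a 6to4 address,
# or $null if the address is not in the 6to4 prefix (2002::/16).
function Get-6to4EmbeddedIPv4 {
    param([string]$Address)

    $bytes = [System.Net.IPAddress]::Parse($Address).GetAddressBytes()

    # An IPv6 address is 16 bytes; 6to4 addresses start with 0x20 0x02.
    if ($bytes.Length -ne 16 -or $bytes[0] -ne 0x20 -or $bytes[1] -ne 0x02) {
        return $null
    }

    # Bytes 2..5 hold the IPv4 address of the host.
    return ($bytes[2..5] -join '.')
}

# Sample (made-up) address: c0a8:0a05 is 192.168.10.5 in hex.
Get-6to4EmbeddedIPv4 '2002:c0a8:0a05::1'   # 192.168.10.5
```

If the embedded IPv4 address matches the DC's own address, the record almost certainly came from the 6to4 adapter rather than a native IPv6 interface.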

A quick google later to find out how to disable the 6to4 adapter and everything was in order. As mentioned above it's been a week and I haven't seen any side effects to disabling the 6to4 adapter.

Long-term solution is to reenable IPv6 on the domain controllers, but it's been disabled for so long, and since I have no idea why it was disabled in the first place, that will require more careful testing.

Monday, March 16, 2015

Print Server CPU spikes on reboot -- Spoolsv.exe


Problem in my case turned out to be a bad driver. Luckily it was a driver from an old printer that was no longer on the network.

Cleaning out print drivers is pretty easy: Open up two instances of print manager, go to the Drivers menu on one, and the printers on the other. Arrange the columns on the printers page so that you can see the printer name and the driver name. Sort by driver name, then compare the list to the ones in the Drivers menu. This makes it easy to see which drivers are no longer in use.
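The same comparison can be sketched in PowerShell. This is a minimal sketch, not a tested cleanup tool: Get-UnusedDriverName is a hypothetical helper that only compares two lists of names, and the commented WMI lines assume that Win32_PrinterDriver reports names in the form "Driver Name,3,Windows x64" while Win32_Printer exposes a DriverName property.

```powershell
# Hypothetical helper: lists installed driver names that no print queue uses.
function Get-UnusedDriverName {
    param(
        [string[]]$InstalledDriver,   # every driver installed on the server
        [string[]]$DriverInUse        # drivers referenced by a print queue
    )
    $InstalledDriver |
        Where-Object { $DriverInUse -notcontains $_ } |
        Sort-Object -Unique
}

# On the print server the two lists could be gathered like this (assumption):
# $installed = Get-WmiObject Win32_PrinterDriver | ForEach-Object { ($_.Name -split ',')[0] }
# $inUse     = Get-WmiObject Win32_Printer | ForEach-Object { $_.DriverName }
# Get-UnusedDriverName -InstalledDriver $installed -DriverInUse $inUse
```

The output is only a list of candidates. Removing the drivers themselves is still done from the Drivers menu.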

If you've cleaned out old drivers, and the problem is still occurring, you may need to start removing printers until the problem stops. Remove a printer, reboot the server, and see if the problem still exists. Of course, make sure you copy the printer information so you can re-add it later. Once you've determined which printer/driver combo is causing the issue, see if you can find an alternative driver for that printer. Most printer manufacturers recommend using their "Universal Print Driver" in a print server environment, rather than the 'named' or 'model specific' driver.


In brief, our print server would hang with 100% CPU usage whenever it was rebooted. To recover, spoolsv.exe (the process using 100%) had to be manually terminated, and the spooler and all dependent services restarted. After the manual termination/restart, the CPU levels would remain normal. Printer shares would be unavailable during the CPU spike.

Machine Specs:

  • Server 2008 R2 -- fully updated
  • VM on top of XenServer 6.2 - Intel based system
  • 2 core
  • 3 GB of RAM
  • Additional Printer Related Software
    • Print Manager (print tracking software)
    • Citrix Universal Print Server
Things I tried that didn't help

  • Windows updates
  • Updating paravirtualization driver.
  • SFC / Checkdisk/ other system file scans
  • Re-installing drivers/print queues  
Problem ended up being an issue with some unused print driver. Apparently the spooler service loads up drivers even if they are not being used by any print queue. So the only way to fix the problem was to remove all drivers from printers that had been retired -- see details in Solution above.

Tuesday, February 3, 2015

Printer Error on Boot: 49.38.13 -- HP Color LaserJet CP5525


In my case, I ended up having to clear out the "active" firmware and have it pull a clean copy from its backup. To do this you perform a "Partial Clean" from the preboot menu.

To access the preboot menu follow the instructions in the Service Manual.

Before doing this, be aware it will clear out networking/admin password/service password/etc. The printer will have to be set up again as if from new, for the most part.

After you're in the preboot menu, navigate to the "Administration" menu, then select "Partial Clean". Accept the confirmation dialog, then press the back button until you're at the root (top-level) menu. Select "Continue". The printer will now reinstall the firmware from its backup.

After it finishes its restore, reconfigure the device with TCP/IP settings, admin settings, and any other customizations you'd made to the printer.

Problem & Full Story

I discussed our issues with our CP5525 printer previously. Well, after we had resolved that problem, we started getting a new, more catastrophic problem. About a week after fixing the previous error, the printer got a new error, 49.38.13. I say more catastrophic because this error comes up as soon as the printer finishes booting. The error says to power the printer off/on, but doing so only causes the error to come up again and wastes 2 minutes of your time.

With the error, users are unable to print, and you are unable to get to any settings through the panel on the printer or through the web interface. The only way to make changes is to get into the preboot menu during startup.

Things I tried that didn't work
  • Disabled jet-direct (preboot menu)
  • Selecting First-boot (preboot menu)
  • Removing network cable during boot
  • Removing power cable and holding power button to clear memory/capacitors (~30 seconds)
    • This almost worked, it booted up and I was able to navigate around the menus for a little bit before the error popped up again.
What did end up working was doing a "Partial Clean" from the preboot menu (see Solution section above). It is important to make the distinction between the "Partial Clean" option and the "Full Clean" option. The printer has two copies of the firmware installed, an active and a backup. The active copy holds all the settings you've configured on the device, whereas the backup is a clean image with all default settings. The partial clean removes the active copy and replaces it with a copy of the backup. Functionally this should fix any firmware corruption, but it also resets the device to factory defaults.

The "Full Clean" option removes both the active and backup firmware images, leaving the device in an unbootable state. This is only used if the firmware you installed was corrupt (corrupt download, or something).

So, to be clear. Use the partial clean, not full clean.

Error When Printing: 49.4A.04 -- HP Color Laserjet Enterprise CP5525xh


In my situation, installing the latest firmware fixed the problem, 2304061_439461 at time of writing. Various threads on HP's support site suggest the issue is a "rogue" print job; in less dumb terms, it's a print job the printer doesn't know how to handle. Updating the firmware seems to have given the printer the ability to understand the jobs that were causing the problem.

There are several ways to go about doing a firmware upgrade. It can be done remotely via the web interface, or locally with a USB drive. If you're mad and have your printer internet connected, you can also tell it to download the update itself. Do make sure to read the "Read Me" for the update, however, because it will sometimes specify that the update cannot be applied with a given method because of errors. No idea why this is, you'd think the update process would be pretty much the same regardless of method of application, but you'll probably save yourself a headache or two if you double check beforehand.

Either way, follow the instructions in the user/service guide.

CP5525 User Guide
CP5525 Service Manual

You can easily find these for other models by Googling "$model user guide" or "$model service manual" -- User guides will generally be on HP's site, service manuals are generally not. I like

Important Note: I ran into another problem after firmware upgrade. I'm not sure the problem was because of the firmware upgrade, but be aware of it as a possibility. You can read about the additional problem and how I fixed it here.

Problem & Full Story

We've had numerous problems with our CP5525xh printer. It's only used by three or four people, but, at least recently, has been about 80% of our printer complaints. I can't blame all the issues on the printer itself, there's a fair bit of user error, but the thing is unreasonably unstable in general.

One recurring issue has been the printer crashing when we have the gall to try to make it print something. When this happens we get the 49.4A.04 error, and the printer must be rebooted to restore service. This will fix the printer, until the problem document is printed again.

The error is document specific: certain documents print fine, others cause the error. There doesn't seem to be much rhyme or reason to which documents it accepts or doesn't, other than that problem documents are often more complex (files printed from Photoshop, Lightroom, and some more complex .pdf files). Also interesting to note that documents that once printed fine can have minor edits done, then stop working. The prime example was a poster one user was working on in Photoshop: adding a text box caused the document to stop printing.

Worth noting also that this is strictly a printer issue. The job is spooled and sent through the print server without error. Same error also occurs when printing directly, rather than through a print server.

Some things we tried that didn't work
  • Selecting "First Boot" from preboot menu
  • Selecting "Cold Boot" from preboot menu
  • Direct mapping printer, rather than using print server
  • Rebooting User's machine / restarting offending program
  • Reinstall / Update drivers on User's Machine
  • Reinstall / Update drivers on Print Server
Another option is to have the user export the document (to a more widely accepted format) before printing. This seems to work in most cases, but was met with much hostility as a workaround (additional steps / time in workflow).

What finally fixed it for us was to install the latest firmware for the printer -- see "Solution" section at the top.

Tuesday, October 21, 2014

PowerShell: Managing User Profiles on Remote Machines


If you're like me, you're a charismatic force of nature whom everyone loves unconditionally. However, if your job is like mine, you may often find yourself wanting to remotely remove a user's profile cache from a machine.

Profile cache is something a roaming domain profile creates when a user first logs into a machine. This holds the user's "CURRENT_USER" registry hive (ntuser.dat) as well as any profile folders that are not being redirected. If your domain is set up to download redirected folders (offline file sync) it also stores those.

Why would you want to remove that? Various reasons. My usual use case is that one of my terminal servers (RDS) is running out of disk space. We have upwards of several thousand students that could potentially use our servers, but usually only a few hundred will use them in a given week. This means provisioning out enough space to hold several thousand user profiles (this would be several TB) would be too expensive and would never be necessary with proper profile management. Other reasons to remove local caches are: Corrupt profile, Old/bad settings stored in appdata, long login/other login problems, possibly security (but only if password caching is enabled) .

So how do we clear them? Windows has two native ways to do this: manually through the system properties interface, or through group policy using the "remove profiles older than x days" GPO. The first method works, but is limited: you must be logged into the machine (locally or over RDP); you can only delete one profile at a time, and since each profile takes several seconds to remove, deleting a large number of them takes a really long time; and the "manage user profiles" window can take a really long time (30 minutes or more) to actually open when there are hundreds of profiles on the machine. The GPO method is a bit better, but has two major caveats. First, the machine must be rebooted for the purge to run, which means it cannot be done on demand when the system is in use. Second, it cannot target specific accounts.

So we have a situation in which no real tool exists to do what we want. DelProf2 is one option I've looked at before, and while I'm sure it's a perfectly functional tool, as a rule I don't like running software to automate tasks when I don't know exactly what it's doing (from their change log it appears to be a very manual process). So from the shortcomings of the methods above, we want the following features:
  • Ability to delete multiple profiles quickly
  • No reboot required
  • Command line for batching/automation
  • Ability to target specific profiles, or delete all older profiles
  • Fully remove all registry info and files of user's account
  • Do not have to be logged in / can be done remotely
Turns out all of this can be done via PowerShell and WMI.

 Enter PowerShell

First things first, here is the full code for the three different functions so you can follow along. They should be pretty easy to read; they're commented and have full get-help integration (except the first one, because it's very basic). (Convert-UTCtoDateTime) (Get-UserProfile) (Remove-UserProfile)

The first one I've already talked about on a previous post. It's a simple function to convert UTC time strings into a datetime object that PowerShell understands.

Let's look at the next one then.


I'm going to skip over the get-help information, as covering it would be redundant. So the first bit of code is this:


These are our Cmdlet bindings. They allow us to pass parameters cleanly to the function. Notice nothing here is mandatory. If no parameters are passed to the function, it defaults to returning all profiles on the local system. The UserID and Computer parameters are given default values if none are specified ('%' is the WMI query wildcard for "all" -- analogous to '*' in globbing or '.*' in most regex systems).
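A parameter block matching that description might look roughly like the following. This is a sketch, not the original code. The parameter names come from the flags used elsewhere in this post, but the types, the 'localhost' default, and the function name Get-UserProfileSketch are assumptions. The body only builds the query string so the effect of the defaults is visible.

```powershell
function Get-UserProfileSketch {
    [CmdletBinding()]
    param(
        [string]$UserID = '%',              # '%' matches every profile
        [string]$Computer = 'localhost',    # assumed default: the local machine
        [switch]$ExcludeSystemAccounts,
        [switch]$ExcludeLoaded,
        [switch]$OnlyLoaded,
        [datetime]$OlderThan
    )

    # Return the WQL query that would be sent for these parameters.
    "Select * from win32_userprofile where LocalPath like '%\\$UserID'"
}

Get-UserProfileSketch               # ... LocalPath like '%\\%'
Get-UserProfileSketch -UserID bob   # ... LocalPath like '%\\bob'
```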


if(!(Get-Command Convert-UTCtoDateTime -ErrorAction SilentlyContinue)){
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "################################################################################"
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "#                                                                               "
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "This Program Requires cmdlet ""Convert-UTCtoDateTime""                          "
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "Find it here:                                                                   "
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "  "
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "#                                                                               "
    write-host -BackgroundColor "Black" -ForegroundColor "Red" "################################################################################"
}

Here we check to make sure our dependent function is loaded. If not, it displays this lovely warning with a link.
if($Computer.ToLower() -eq "localhost"){
    $Return = Get-WmiObject -Query "Select * from win32_userprofile where LocalPath like '%\\$UserID'"
}
else{
    $Return = Get-WmiObject -ComputerName $Computer -Query "Select * from win32_userprofile where LocalPath like '%\\$UserID'"
}

OK, first real code. Here we do a quick check to see if we're running on the localhost or against a remote machine. The WMI query we run is the same for both, but the parameters we pass to the get-wmiobject function are slightly different.

On my machine, doing -ComputerName $Computer actually works with 'localhost', but only because 'localhost' is defined in the default hosts file. This might be a safe assumption to make on any Windows system, but I try not to make assumptions when I can avoid it.

About the WMI query. For anyone familiar with SQL this will probably look pretty familiar, though the terminology is a bit different. We're selecting everything from the class "win32_userprofile" where the property "LocalPath" ends with our UserID with a backslash in front of it. The backslash is there to prevent UIDs which are a substring of another UID from returning erroneous results. For example: if you had users 'bob' and 'jimbob', searching for 'bob' would return both bob and jimbob without the slash in front.
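The effect of the leading backslash can be tried locally without WMI, since the '%' wildcard in a WQL LIKE clause behaves like '*' in PowerShell's -like operator. A minimal sketch, with made-up profile paths:

```powershell
# Hypothetical profile paths for the two users in the example.
$paths = 'C:\Users\bob', 'C:\Users\jimbob'

# Without the backslash, 'bob' also matches the tail of 'jimbob'.
$withoutSlash = $paths | Where-Object { $_ -like '*bob' }

# With the backslash, only the folder named exactly 'bob' matches.
$withSlash = $paths | Where-Object { $_ -like '*\bob' }

$withoutSlash   # C:\Users\bob and C:\Users\jimbob
$withSlash      # C:\Users\bob
```

In the WQL string the backslash is written twice ('%\\bob') because a backslash has to be escaped inside a WQL string literal. The -like operator needs no such escaping.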

Here's a command to run to see all what is returned by this query.

Get-WmiObject -Query "Select * from win32_userprofile"

What you'll see, if this runs correctly, is a mess. There are only a few properties in here that are useful (which we'll filter out in a minute). One notable omission is that there is no Username type field. I just wanted to point this out here because it might seem strange to be looking at the LocalPath property otherwise.

So after this block of code we have a full list of user profiles stored in our "$Return" variable. Now it's time for some filtering.

#Filter System Accounts
if($ExcludeSystemAccounts){
    $Return = $Return | Where-Object -Property Special -eq $False
}
#Filter out Loaded Accounts
if($ExcludeLoaded){
    $Return = $Return | Where-Object -Property Loaded -eq $False
}
#Filter other than loaded accounts
if($OnlyLoaded){
    $Return = $Return | Where-Object -Property Loaded -eq $True
}

Here are the first three filters. They're all pretty much the same. First they check if their switch has been set, then use where-object to filter on certain properties. I do these all as individual if statements that modify the $return variable so that they can be chained together.

The two properties we're looking at here are "Special" and "Loaded". The Special property tells us if an account is a non-user (i.e. system) account. You'll see things like "system", "network service", etc. listed as special. The "Loaded" property tells us if the account is currently in use. This property will be important later as you can't remove accounts that are currently loaded.

My inclusion of an "OnlyLoaded" flag might seem strange here. This is not directly related to the removal of user accounts, but an additional functionality. Combine "-OnlyLoaded" and "-ExcludeSystemAccounts" and you can find out what user(s) is(are) logged into the machine. Neat!
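The chaining of those two filters can be sketched against stand-in objects, so it runs without WMI. The three profiles below are invented for illustration and only carry the properties the filters look at.

```powershell
# Stand-in objects shaped like Win32_UserProfile results (made-up data).
$profiles = @(
    New-Object PSObject -Property @{ LocalPath = 'C:\Users\alice'; Special = $false; Loaded = $true }
    New-Object PSObject -Property @{ LocalPath = 'C:\Users\bob'; Special = $false; Loaded = $false }
    New-Object PSObject -Property @{ LocalPath = 'C:\Windows\ServiceProfiles\NetworkService'; Special = $true; Loaded = $true }
)

$Return = $profiles
# Equivalent of -ExcludeSystemAccounts: drop the special accounts.
$Return = $Return | Where-Object { $_.Special -eq $false }
# Equivalent of -OnlyLoaded: keep only profiles that are in use.
$Return = $Return | Where-Object { $_.Loaded -eq $true }

$Return | ForEach-Object { $_.LocalPath }   # C:\Users\alice
```

Because each filter reassigns $Return, any combination of switches narrows the same list step by step.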

Let's look at the last filter now.
#Filter on lastusetime
if([bool]$OlderThan){
    $Return | Where-Object -property LastUseTime -eq $Null | % {Write-Host -BackgroundColor "Black" -ForegroundColor "Yellow" $_.LocalPath " Has no 'LastUseTime', omitting" }
    $Return = $Return | Where-Object -property LastUseTime -ne $Null
    $Return = $Return | Where-Object {$(Convert-UTCtoDateTime $_.LastUseTime -ToLocal) -lt $OlderThan }
}

This one has a bit more going on.

The if statement looks a bit different. Because the variable is a "system.datetime" object rather than a boolean, I'm typecasting it as a boolean. If the variable has been populated, this returns true; if the variable is $Null (that is, was not set), it returns false. The type casting isn't strictly necessary; simply doing "if($OlderThan)" would return the same thing. This is mostly just for readability.

The next lines warn the user that it's skipping over any user accounts with a $Null "LastUseTime" property. This is one aspect that may need to be modified in the future, but I don't think so. I have never seen a $Null LastUseTime on an actual user account. Mostly it shows up on accounts created by programs. For example, my computer has ".Net v4.5 Classic", "DefaultAppPool", and ".Net v4.5" as accounts with no LastUseTime. Even local users who have been created, but never logged in, won't get caught by this; this is because they don't show up at all until their first logon, at which point they'll get a LastUseTime.

Finally, after filtering out the $Null entries, we convert the LastUseTime to a datetime object and compare it to the datetime passed to the -OlderThan parameter. By default the LastUseTime is a very ugly string that is difficult to make sense of at a glance. More importantly, to do any sort of date math, PowerShell needs it in a datetime object. So this is where the Convert-UTCtoDateTime function comes into play. This function takes the ugly UTC string and turns it into something PowerShell can understand.

One caveat here: the -ToLocal flag turns out to be important. When doing date math, PowerShell evidently doesn't take time zones into consideration. So it is necessary to have both dates be in local time before doing math, otherwise it might not behave as expected.
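A minimal sketch of that behaviour, using two made-up timestamps: the clock reading is the same, but one is marked as UTC and the other as local time. A datetime comparison only looks at the clock reading, so the two compare as equal even though they are different instants on any machine that isn't set to UTC.

```powershell
# Same wall-clock value, tagged once as UTC and once as local time.
$clock   = [datetime]'2014-10-21 12:00:00'
$asUtc   = [datetime]::SpecifyKind($clock, [System.DateTimeKind]::Utc)
$asLocal = [datetime]::SpecifyKind($clock, [System.DateTimeKind]::Local)

# The comparison ignores the Kind (UTC vs. local) entirely.
$asUtc -eq $asLocal   # True

# Converting both sides to the same zone first gives a meaningful comparison.
$asUtc.ToLocalTime() -lt $asLocal
```

The last line returns a result that depends on the machine's time zone, which is exactly why both values need to be in the same zone before comparing.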

Next, and final block:

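#Raw objects with -Verbose, formatted output otherwise (the exact -Verbose test shown here is an assumption)
if($PSBoundParameters['Verbose']){ Write-Output $Return }
else{ Write-Output $Return | Select SID,LocalPath,@{Label="Last Use Time";Expression={Convert-UTCtoDateTime $_.LastUseTime -ToLocal}} }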

Here we process our output. I've added support here for the PowerShell verbose flag. -Verbose is a native flag in PowerShell, which all Cmdlets have, even if they don't implement it. Normally this is used with the Write-Verbose Cmdlet, but I've done a little more. I didn't just want to write something additional when verbose is set, but wanted the output formatted differently, that is, unformatted. This is necessary for this Cmdlet's integration with the Remove-UserProfile Cmdlet. So I check to see if verbose has been set; if it has, I simply return the $Return variable. If verbose is not set, I format the output to look nice and show only the relevant information.

The relevant information here is the SID (Profile unique identifier), the LocalPath (c:\users\myuser), and the LastUseTime. The LastUseTime I modify with Convert-UTCtoDateTime to make it look nicer and be more useful at a glance.

That's about all there is to Get-UserProfile. Next we'll look at Remove-UserProfile, which uses Get-UserProfile.


Remove-UserProfile is very much an extension of Get-UserProfile. At a high level, it uses Get-UserProfile to obtain a list of user profiles, then deletes them. That's really about it. Obviously there are a few checks and things in here as well, so let's go through that.

$ProfileList = Get-UserProfile -Verbose -UserID $UserID -Computer $Computer -ExcludeSystemAccounts -OlderThan $OlderThan

Since we've already done all the big work in the Get-UserProfile Cmdlet, all we need to do is call it with the appropriate flags. We use verbose so we get the full object, not just the filtered information. We exclude system accounts because we don't want to delete those for what I hope are obvious reasons -- I'm not sure that it'd actually let you, but better to be safe. We also use the -OlderThan flag regardless of whether the user has actually specified this.

Looking back at the parameter bindings, you'll see I've included a default value for $OlderThan that is one day in the future. This is for a couple of reasons. First, it's way more readable: no nested if statements with different queries. Second, this filters out the not-system-but-also-not-user accounts. I haven't tried removing these accounts to see what would actually happen, but I'm sure .NET would be none too happy about it.
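A minimal sketch of why a cutoff one day in the future works as a default: every profile that has a last use time is older than tomorrow, so nothing is excluded by date, while profiles with no last use time are still dropped. The objects below are invented stand-ins, and their LastUseTime is already a datetime rather than the raw WMI string.

```powershell
# Made-up stand-ins: one real user profile, one program-created profile.
$profiles = @(
    New-Object PSObject -Property @{ LocalPath = 'C:\Users\alice'; LastUseTime = (Get-Date).AddMinutes(-5) }
    New-Object PSObject -Property @{ LocalPath = 'C:\Users\DefaultAppPool'; LastUseTime = $null }
)

# Default cutoff: one day in the future.
$OlderThan = (Get-Date).AddDays(1)

$kept = $profiles |
    Where-Object { $_.LastUseTime -ne $null } |
    Where-Object { $_.LastUseTime -lt $OlderThan }

$kept | ForEach-Object { $_.LocalPath }   # C:\Users\alice
```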

Next block

    if(!$ProfileList){ Write-Warning "NO USER PROFILES WERE FOUND"; return }

This is a simple $Null check to make sure the query actually returned something. If no profiles matched the criteria, the script exits.

    if(!$Batch){
        #List the profiles that are about to be deleted
        Foreach($User in $ProfileList){
            $User | Select SID,LocalPath,@{Label="Last Use Time";Expression={Convert-UTCtoDateTime $_.LastUseTime -ToLocal}}
        }
        $Title = "PROCEED?"
        $Yes = New-Object System.Management.Automation.Host.ChoiceDescription "&Yes","Removes User Accounts"
        $No = New-Object System.Management.Automation.Host.ChoiceDescription "&No","Exits Script, No Changes Will be Made"
        $options = [System.Management.Automation.Host.ChoiceDescription[]]($yes, $no)
        $result = $host.ui.PromptForChoice($title, $message, $options, 1)
        switch ($result){
            0 {}
            1 {return;}
        }
    }


The next bit here is a confirmation dialog. This is a built in PowerShell feature you can read more about here, but a few quick notes about my implementation. First, if the -Batch flag is set, it skips this. This is important as otherwise the script would always require user confirmation which would make it far less useful from an automation standpoint.

The foreach loop here lists out (in a nice format) all the user profiles to be deleted. This is a nice sanity check for the user to make sure they know what they're deleting.

On the choices, "$yes"/0 does nothing, and $no/1 exits the script, with the default being no. I wrote it this way to make the coding easier. With this continue/exit method, the rest of the Cmdlet doesn't have to be embedded within the "switch($result)" block, which makes the code much more readable and the -Batch code easier to write.

    Foreach($User in $ProfileList){
        if($User.Loaded){
            if(!$Batch){ Write-Host -BackgroundColor "Black" -ForegroundColor "Red" "User Account " $User.LocalPath "is Currently in use on" $Computer ":`tSkipping" }
            else{ Write-Output "User $($User.LocalPath) on $($Computer) was in use and could not be removed" }
            continue
        }
        if(!$Batch){ Write-Host -BackgroundColor "Blue" -ForegroundColor "Green" "Removing User $($User.LocalPath) from $($Computer)" }
        else{ Echo "Deleting $($User.LocalPath) from $($Computer)" }
        $User.Delete()
    }


Now we get into the actual deleting. A simple foreach loop that deletes everything that was returned by the Get-UserProfile Cmdlet. A few things to look at in here. I use if(!$Batch) in a couple of places. This is for formatting reasons. The only difference between the batch and non-batch output is the method of writing. Batch uses Write-Output (aka echo), which is nice because it can be redirected to a log file. However, Write-Output lacks a lot of formatting options. So in non-batch mode I use Write-Host, which cannot be redirected to a file, but gives us some formatting/coloring options to make the output more readable.
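The difference between the two writing methods can be sketched by capturing a function's output in a variable. Show-Demo is a hypothetical function used only for this illustration.

```powershell
# Hypothetical demo function that writes one line with each method.
function Show-Demo {
    Write-Output 'written with Write-Output'
    Write-Host 'written with Write-Host'
}

# Only the Write-Output line travels down the pipeline into the variable;
# the Write-Host line goes to the console instead.
$captured = Show-Demo

$captured   # written with Write-Output
```

The same thing happens when the output is redirected with '>' to a log file, which is why the batch branch uses Write-Output.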

Next, let's look at the if($User.Loaded). As discussed in Get-UserProfile, the loaded property tells us whether or not the profile is currently in use. It's important to filter these out, otherwise PowerShell will throw errors when you try to delete the profile. Why not use the -ExcludeLoaded flag we created in Get-UserProfile? I debated about this for a while actually, but decided it would be frustrating if you were trying to delete a specific profile and the script kept saying "no profiles found". This way provides more information, even if it wastes a bit more time.

And lastly, we delete the profile. "$User.Delete()" is really all it takes.

These three functions are about 250 lines all together. And you could get all the same functionality in this.

  Get-WmiObject -Computer MyComputer.mydomain -Query "Select * from win32_userprofile where LocalPath like '%\\MyUser'" | % {$_.Delete()}

Not entirely sure what my aim is in pointing this out. Maybe that there's a tradeoff between writing something you know, and something other people could use?

Anyway, hope someone else can get some use out of this. I know it's something that's bugged me for a long time. 

Wednesday, October 15, 2014

Adventures in PowerShell: Converting UTC to DateTime


Ran into a problem recently. I'm working on a script (more on this later) that will let me pull a list of user profiles on a remote machine. The problem is that the user profile's "last use time", a bit of information I would like to have, is a System.String object in UTC format -- meaning it looks like this:

Pretty standard stuff. Except PowerShell doesn't know what to do with it. It blew my mind when I learned this. PowerShell doesn't have a native function to convert this type of string to a date (that is, a 'System.DateTime' object). After an hour of Googling I couldn't find a good go-to script to handle it (at least, not one in PowerShell; there were several in VB or C#). So I wrote my own. It's a pretty simple script. It relies on the string being in the specific format above, that is:

I may update this script if I find additional formats -- but being a standard, this should work in most places.

One note is that while the format specifies down to 6 decimal places, Windows can only handle 3, so the trailing 3 are dropped.
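Judging by the substring offsets used in the function below, the layout is 14 digits of date and time (yyyyMMddHHmmss), a dot, 6 decimal digits, a sign, and a 3-digit offset in minutes. The sketch below reads such a string with the .NET ParseExact method instead of the substring approach used in this post, keeping only the first 3 decimals. The sample value is made up for illustration.

```powershell
# Hypothetical sample in the layout described above.
$sample = '20141015123456.789000+000'

# The first 18 characters cover the date, the time and 3 decimal places.
$stamp = [datetime]::ParseExact(
    $sample.Substring(0, 18),
    'yyyyMMddHHmmss.fff',
    [System.Globalization.CultureInfo]::InvariantCulture)

# The last 4 characters are the sign and the offset from UTC in minutes.
$offsetMinutes = [int]$sample.Substring(21, 4)

$stamp.ToString('yyyy-MM-dd HH:mm:ss.fff')   # 2014-10-15 12:34:56.789
$offsetMinutes                               # 0
```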

This will convert to your local timezone with the '-ToLocal' flag

The Code:

function Convert-UTCtoDateTime{
<#
  Author: Keith Ballou
  Date: 10/15/14
#>

    #Parameter Binding
    param([string]$UTC, [switch]$ToLocal)

    #Breakout the various portions of the time with substring
    #This is very inelegant, and UTC
    $yyyy = $UTC.substring(0,4)
    $M = $UTC.substring(4,2)
    $dd = $UTC.substring(6,2)
    $hh = $UTC.substring(8,2)
    $mm = $UTC.substring(10,2)
    $ss = $UTC.substring(12,2)
    $fff = $UTC.substring(15,3)
    $zzz = $UTC.substring(22,3)

    if($ToLocal){
        #If local, add the UTC offset returned by get-date
        (get-date -Year $yyyy -Month $M -Day $dd -Hour $hh -Minute $mm -Second $ss -Millisecond $fff) + (get-date -format "zzz")
    }
    else{
        #else just return the UTC time
        get-date -Year $yyyy -Month $M -Day $dd -Hour $hh -Minute $mm -Second $ss -Millisecond $fff
    }
}

Tuesday, September 30, 2014

More Printing Problems - Spooler and Citrix Print Manager Crash

Solution 2

Most of Solution 1 below still applies. Found that a large percentage of crashes/hangs can be avoided by making sure there are no old drivers on the Terminal Server or Print Server. You can check out a separate, related post here for details.

In short, the spooler service will load drivers, even if no mapped/installed printer is using them. This can cause old out of date drivers to crash/hang the spooler even if they are not being used.

Again, the problem is not 100% fixed, but is much better after clearing out drivers.


Today the solution appears to be to use the HP UPD (I'm using 5.8, for the record), recreate the print queues (fancy name for shared printers), and have a nice script to handle failures elegantly.

Use of the HP Universal Print Driver is pretty much mandatory in a terminal server environment (check out the HP compatibility list here (pdf)). Most HP printers, new or old, do not support the use of their device-specific driver in a Citrix Xen* environment.

Recreating the print queues apparently helps for not-quite-adequately-explained reasons. Apparently switching which driver a print queue is using can cause some sort of corruption that can crash the spooler. So you have to delete and re-add the shared printer (you can use the same port, etc.), starting with the UPD driver rather than using the device-specific one and changing later. I've done this for a couple of printers and it has already drastically reduced the number of crashes. I will be recreating other printers that seem to be causing problems and will update this with progress.

Having a script that handles failures nicely is key to reducing user impact. Printers are always problematic, especially in a Xen/RDS environment, so expecting 0 crashes is probably overly optimistic. I've written a script that runs when a service is detected to have failed (I do this through the 'recovery' tab in the service properties). The script restarts both the spooler and the Citrix Print Manager Service. For some reason, if these two are not restarted together, they don't seem to talk to each other. So whenever one fails, both need to be restarted. I'll put the script at the bottom of the article.
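For reference, the same recovery setting can be applied from the command line with "sc.exe failure" rather than the 'recovery' tab. The sketch below only builds the argument list; the helper name, the script path, the 5 second delay and the one day reset window are all assumptions for illustration, not the settings used on this server.

```powershell
# Hypothetical helper: builds the sc.exe arguments that make a service
# run a command when it fails.
function Get-ScFailureArgs {
    param(
        [Parameter(Mandatory = $true)][string]$ServiceName,
        [Parameter(Mandatory = $true)][string]$Command,
        [int]$DelayMs = 5000,         # wait this long after the failure before running the command
        [int]$ResetSeconds = 86400    # failure counter resets after this many seconds
    )

    # sc.exe expects each "option=" and its value as separate tokens
    'failure', $ServiceName, 'reset=', "$ResetSeconds", 'actions=', "run/$DelayMs", 'command=', $Command
}

# Usage, from an elevated prompt on the terminal server (script path is hypothetical):
#   $cmd = 'powershell.exe -NoProfile -File C:\Scripts\Restart-Printing.ps1'
#   foreach ($svc in 'Spooler', 'CpSvc') {
#       $scArgs = Get-ScFailureArgs -ServiceName $svc -Command $cmd
#       & sc.exe @scArgs
#   }
```

Keeping the command free of embedded quotes avoids quoting surprises when PowerShell hands the string to sc.exe.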


We've had more problems with printing since fixing the issue with the Citrix Print Client. The issue now is that the print spooler on our terminal server keeps crashing and it causes people not to be able to print, printers not to map, etc.

In brief, there are two services: "Citrix Print Management" (CpSvc) and "Print Spooler" (spooler). Even though we are no longer using the Citrix Print Client, we still need the CpSvc service because it handles the mapping of printers through Citrix Group Policy. The Citrix Group Policy gives us some additional functionality when mapping printers that would be difficult to replicate in normal AD group policy. Anyway, when either of those services crashes, it breaks everything. Simply setting the service to restart on crash doesn't work either. The processes must be restarted together, otherwise they don't seem to talk to each other.

I've written a small script that restarts both services whenever one fails, which minimizes the impact of a failure, but I'm still working on solving the underlying problem. So it's time for another long rambling post trying to figure out what's happening. The last one went pretty well, so let's give it a shot.


XenDesktop Controller: Server 2008r2SP1,  XenDesktop 7.0, Physical Server with way more resources than necessary
XenDesktop Hosted Desktop: Server 2008r2SP1, runs RDS server/ XenDesktop 7.1, clients connect from Wyse Xenith2 thin clients, about 100 possible clients, but generally have only 30-50 at any given time.
Print Server: Server 2008R2SP1, Microsoft Print Server

Errors in Event Viewer

Here's a brief rundown of the various errors I've gotten and what I've been able to find out about each.

splwow64.exe crash

Type: Error
Source: Application Error
EventID: 1000
Task Category: (100)
Faulting application name: splwow64.exe, version: 6.1.7601.17777, time stamp: 0x4f35fbfe
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521eaf24
Exception code: 0xc0000374
Fault offset: 0x00000000000c4102
Faulting process id: 0xeb30
Faulting application start time: 0x01cfd83a3b2e4f74
Faulting application path: C:\Windows\splwow64.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll

splwow64.exe is a process that translates the x64 print drivers for use by 32-bit applications. For example, all of our print drivers must be 64-bit because the server is running 2008R2SP1, but the server runs 32-bit Office (for various plug-in compatibility). When Office wants to print, it has to go through splwow64.exe because it wouldn't know what to do with a 64-bit driver.

As for why this crashes, I have no idea. The "faulting module" is "ntdll.dll", and the exception code is "0xc0000374". ntdll.dll is explained here in Wikipedia; I'd try to summarize, but since my understanding is vague at best, it's probably best to read it yourself. "0xc0000374" is an error code that indicates "Heap Corruption", which is a fancy way of saying the memory was modified in a way that wasn't expected. Neither of these bits of information is particularly insightful, but they come up over and over in these errors.

spoolsv.exe crash

Type: Error
Source: Application Error
EventID: 1000
Task Category: (100)
Faulting application name: spoolsv.exe, version: 6.1.7601.17777, time stamp: 0x4f35fc1d
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521eaf24
Exception code: 0xc0000374 OR 0xc0000005
Fault offset: 0x00000000000c4102
Faulting process id: 0x9044
Faulting application start time: 0x01cfd8232b50fcd7
Faulting application path: C:\Windows\System32\spoolsv.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll
This error is very similar to the splwow64.exe one above, with the exception of the "0xc0000005" variant. This, as far as I can tell, is a memory access violation.
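Since the same applications and exception codes keep coming up, tallying them can show which combination dominates. The sketch below is an illustration only: the function name is made up, and it assumes the English message layout shown in the events above.

```powershell
# Hypothetical helper: pulls the faulting application and exception code
# out of the text of an "Application Error" event (ID 1000).
function Get-FaultInfo {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)][string]$Message
    )
    process {
        $app  = $null
        $code = $null
        if ($Message -match 'Faulting application name:\s*([^,\s]+)') { $app = $Matches[1] }
        if ($Message -match 'Exception code:\s*(0x[0-9a-fA-F]+)')     { $code = $Matches[1] }

        New-Object PSObject -Property @{ Application = $app; ExceptionCode = $code }
    }
}

# Usage on the affected server (Windows only):
#   Get-WinEvent -FilterHashtable @{ LogName = 'Application'; ProviderName = 'Application Error'; Id = 1000 } |
#       ForEach-Object { $_.Message } | Get-FaultInfo |
#       Group-Object Application, ExceptionCode | Sort-Object Count -Descending
```

Grouping on both fields keeps the "0xc0000374" and "0xc0000005" spooler crashes separate from the splwow64.exe ones.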

Couldn't Load Print Processor

Type: Error
Source: PrintService
EventID: 365
Task Category: Initializing a print Processor
Windows could not load print processor hpcpp160 because EnumDatatypes failed. Error code 126. Module: 18\hpcpp160.dll. Please obtain and install a new version of the driver from the manufacturer (if available), or choose an alternate driver that works with this print device.

This error is a bit more informative. There are variants for other print processors (e.g. hpzppwn7). hpcpp160 happens to be the HP universal print driver (version 5.8). Anyway it's our first indication of something wrong with the print service. Problem is, most of the time this print processor has no problems. Most of our printers use the HP UPD, and they work most of the time.

I've also tried reinstalling the UPD (on the client; doing it on the server would require more extended downtime). This hasn't had an effect.

Citrix -- "Environment is incorrect" "no printers were found" "printer auto-creation failed"

Type: Error
EventID: 1114 / 1116

I'm not going to list out the full text of these because they are erroneous (at least as far as my problem goes) -- see this forum for more info. Basically these errors can be logged even if printer creation is succeeding. It's very obnoxious. It's possible it's logged because printers are not being deleted at logoff, but I haven't found anything to suggest that is still an issue (that forum post is pretty old).

Print Spooler Can't copy file

Note: you must enable the PrintService operational log to see this error. In Event Viewer, find it under Applications and Services > Microsoft > Windows > PrintService. Right-click the Operational log and select "Enable Log".

Type: Error
Source: PrintService
EventID: 811
Task Category: Executing a file operation
 The print spooler failed to move the file C:\Windows\system32\spool\PRTPROCS\x64\hpcpp160.dll to C:\Windows\system32\spool\PRTPROCS\x64\202_hpcpp160.dll, error code 0xb7. See the event user data for context information.

This message again may vary with the print processor (replace hpcpp160.dll with whatever.dll). This is an odd message. The folder it references is full of duplicate print processor dlls (1_hpcpp160.dll to 499_hpcpp160.dll). I have no idea why; this is the current lead I'm working on.
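The numeric codes in these PrintService events are ordinary Win32 error numbers, so .NET can translate them. 126 is ERROR_MOD_NOT_FOUND and 0xb7 (183 decimal) is ERROR_ALREADY_EXISTS. A small sketch, with a made-up function name:

```powershell
# Hypothetical helper: turn a Win32 error number into its system message.
function Get-Win32ErrorText {
    param(
        [Parameter(Mandatory = $true)][int]$Code
    )
    $ex = New-Object System.ComponentModel.Win32Exception -ArgumentList $Code
    '{0} (0x{0:x}): {1}' -f $ex.NativeErrorCode, $ex.Message
}

Get-Win32ErrorText 126     # from the "could not load print processor" event
Get-Win32ErrorText 0xb7    # from the "failed to move the file" event
```

This only covers Win32 error numbers. The 0xc0000374 and 0xc0000005 exception codes from the crash events are NTSTATUS values and are not translated this way.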

Things I've Tried

  1. Create Script that restarts processes
    1. First did this restarting "spooler" and "cpsvc" every 5 minutes
      1. this technically worked, but caused some strange behavior and is pretty inelegant
    2. Set "spooler" and "cpsvc" to run the script when either crashes
      1. this can be done in the Services MMC snap-in.
      2. Still doesn't solve the underlying problem, but is a nice band-aid fix until I can figure the bigger issue out
      3. it's also way more elegant than the "restart every 5 minutes" solution.
      4. Note: had to change CpSvc to log on as a local service with permission to interact with desktop (was just local service), otherwise the script wouldn't run correctly when it failed.
  2. Moving printers to the HP UPD
    1. Thought here is that one of the device-specific print drivers wasn't terminal compatible
    2. This hasn't exactly panned out. Moved device-specific printers to the UPD but errors continue to show up. I've just finished this migration recently, so maybe it'll pay off over time.
  3. Clearing out system32\spool\prtprocs\x64
    1. This folder is full of duplicate .dll files (see "Print Spooler Can't copy file" above)
    2. Found out not to delete everything from that folder. The WinPrint.dll file will not recreate itself.
    3. Print spooler still crashed (0xc0000005), at like 5am when no one would have been using it. So that's fun.
      1. Actually, someone did log in at 5am, just seconds before the crash. So that's something to go on maybe.
    4. Printing hasn't gotten any worse, so at least I haven't broken anything
    5. This at least seems to have stopped the 811 errors. Watching to see if the prtprocs folder starts to build up again.
  4. Recreating Print Queue
    1. Some things I read suggested that print queues may become corrupt when switching drivers/print processors.
      1. Print queue is the technical term for a shared printer
    2. So I deleted and recreated the print queues that seemed to be causing the most issues
      1. "most issues" was determined by cross referencing the crash time with our user tracking to determine which stations (and thus which printers) were most recently logged into before the crash.
    3. This actually appears to have had some effect. I've only had one crash (and it was the splwow64.exe crash, not the spooler or CpSvc) today.
      1. Today's Friday, so load is low, but will continue to monitor.
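To watch whether the print processor folder starts to build up again, the numbered copies can be counted per original file. This is a rough sketch with a made-up function name. It assumes the copies follow the "<number>_<name>.dll" pattern from the 811 event, and the default path assumes a standard Windows install.

```powershell
# Hypothetical helper: count the numbered copies (1_name.dll, 2_name.dll, ...)
# of each print processor DLL in a folder.
function Get-NumberedDllCopy {
    param(
        [string]$Path = "$env:windir\System32\spool\prtprocs\x64"
    )

    Get-ChildItem -Path $Path -Filter '*.dll' |
        Where-Object { $_.Name -match '^\d+_.+\.dll$' } |          # only the numbered copies
        Group-Object { $_.Name -replace '^\d+_', '' } |            # group by the original file name
        Sort-Object Count -Descending |
        Select-Object Count, Name
}

# Usage on the print or terminal server:
#   Get-NumberedDllCopy
```

It only reads the folder. Files without a numeric prefix (including WinPrint.dll) are left out of the count.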

Days Pass....
Spooler/CpSvc/splwow64 continue to crash, but much less frequently. The average is maybe once or twice a day, much lower than the every couple of hours it used to be. I am going to continue to recreate print queues and see if I can eliminate the crashing altogether, and will update this page as I learn more.


write-host "Shutting down Citrix Print Manager"
stop-service -force cpsvc
write-host "Waiting for CpSvc to shut down Gracefully" -nonewline

#Check once a second; after more than 5 checks, kill the process instead
$Count = 0
while($(Get-service cpsvc).Status -ne "Stopped" )
{
    if($Count -gt 5)
    {
        write-host ""
        write-host "CpSvc has not shutdown gracefully, shutting down manually"
        stop-process -force -Name cpsvc
        break
    }
    write-host "." -nonewline
    Start-Sleep 1
    $Count++
}
write-host ""

write-host "Shutting Down Print Spooler"
stop-service -force spooler
write-host "Waiting for Spooler to shut down Gracefully" -nonewline

#Same wait-then-kill logic for the spooler
$Count = 0
while($(Get-service spooler).Status -ne "Stopped" )
{
    if($Count -gt 5)
    {
        write-host ""
        write-host "spooler has not shutdown gracefully, shutting down manually"
        stop-process -force -Name spoolsv
        break
    }
    write-host "." -nonewline
    Start-Sleep 1
    $Count++
}
write-host ""

write-host "Bringing Spooler Back up"

start-service spooler

write-host "Bringing Citrix Print Manager back up"

start-service cpsvc

#Log the time of this restart
Get-Date >> c:\temp\restartprinters.txt