Tuesday, September 30, 2014

More Printing Problems - Spooler and Citrix Print Manager Crash

Solution 2

Most of Solution 1 below still applies. I found that a large percentage of crashes/hangs can be avoided by making sure there are no old drivers on the Terminal Server or Print Server. You can check out a separate, related post here for details.

In short, the spooler service will load drivers even if no mapped/installed printer is using them. This means old, out-of-date drivers can crash or hang the spooler even when they are not in use.

Again, the problem is not 100% fixed, but is much better after clearing out drivers.
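
To find candidates for cleanup, something like the following could list drivers that are installed but not referenced by any printer. This is a minimal sketch, not the method used in this post: it assumes the WMI classes Win32_PrinterDriver and Win32_Printer are available (they are on Server 2008 R2). Printers mapped only inside other users' sessions may not show up, so treat the output as a list to review, not a list to delete.

```powershell
# Installed driver names. Win32_PrinterDriver names look like
# "Driver Name,3,Windows x64", so keep only the first field.
$installed = Get-WmiObject -Class Win32_PrinterDriver |
    ForEach-Object { ($_.Name -split ',')[0] } |
    Sort-Object -Unique

# Driver names referenced by printers visible to the current session
$inUse = Get-WmiObject -Class Win32_Printer |
    ForEach-Object { $_.DriverName } |
    Sort-Object -Unique

# Installed but unreferenced drivers are candidates for removal
$installed | Where-Object { $inUse -notcontains $_ }
```

Run it on both the Terminal Server and the Print Server, then remove the confirmed leftovers through the Print Management console.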

Solution

Today the solution appears to be threefold: use the HP UPD (I'm using 5.8, for the record), recreate the print queues (fancy name for shared printers), and have a nice script to handle failures elegantly.

Use of the HP Universal Print Driver is pretty much mandated in a terminal server environment (check out the HP compatibility list here (pdf)). Most HP printers, new or old, do not support the use of their device-specific driver in a Citrix Xen* environment.

Recreating the print queues helps for not-quite-adequately-explained reasons. Apparently, switching which driver a print queue uses can cause some sort of corruption that can crash the spooler. So you have to delete and re-add the shared printer (you can use the same port, etc.), starting with the UPD driver rather than starting with the device-specific one and changing later. I've done this for a couple of printers and it has already drastically reduced the number of crashes. I will be recreating the other printers that seem to be causing problems and will update this post with progress.

Having a script that handles failures nicely is key to reducing user impact. Printers are always problematic, especially in a Xen/RDS environment, so expecting 0 crashes is probably overly optimistic. I've written a script that runs whenever one of the services fails (I set this up through the 'Recovery' tab in the service properties). The script restarts both the spooler and the Citrix Print Manager Service. For some reason, if these two are not restarted together, they don't seem to talk to each other. So whenever one fails, both need to be restarted. I'll put the script at the bottom of the article.
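
The same recovery action can probably be set from the command line instead of the Recovery tab. The following is a sketch under assumptions: the script path C:\Scripts\RestartPrintServices.ps1 is hypothetical, and the 5-second delay and one-day reset period are arbitrary choices, not values from this post. Note that it must be sc.exe, because plain sc is a PowerShell alias for Set-Content.

```powershell
# Hypothetical location of the restart script; adjust to wherever it is saved.
$script  = 'C:\Scripts\RestartPrintServices.ps1'
$command = "powershell.exe -NoProfile -ExecutionPolicy Bypass -File $script"

foreach ($service in 'Spooler', 'CpSvc') {
    # On each failure, wait 5000 ms and then run the command;
    # reset the failure counter after 86400 seconds (one day).
    sc.exe failure $service reset= 86400 actions= run/5000/run/5000/run/5000 command= "$command"

    # Print the resulting recovery settings so they can be checked
    sc.exe qfailure $service
}
```

Run this from an elevated prompt. The space after each "option=" is part of the sc.exe syntax and must be kept.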


Introduction


We've had more problems with printing since fixing the issue with the Citrix Print Client. The issue now is that the print spooler on our terminal server keeps crashing and it causes people not to be able to print, printers not to map, etc.

In brief, there are two services: "Citrix Print Manager" (CpSVC) and "Print Spooler" (spooler). Even though we are no longer using the Citrix Print Client, we still need the CpSVC service because it handles the mapping of printers through Citrix Group Policy. The Citrix Group Policy gives us some additional functionality when mapping printers that would be difficult to replicate in normal AD Group Policy. Anyway, when either of those services crashes, it breaks everything. Simply setting the service to restart on crash doesn't work either: the processes must be restarted together, otherwise they don't seem to talk to each other.

I've written a small script that restarts both services whenever one fails, which minimizes the impact of a failure, but I'm still working on solving the underlying problem. So it's time for another long, rambling post trying to figure out what's happening. The last one went pretty well, so let's give it a shot.

Environment

XenDesktop Controller: Server 2008 R2 SP1, XenDesktop 7.0, physical server with way more resources than necessary
XenDesktop Hosted Desktop: Server 2008 R2 SP1, runs RDS server / XenDesktop 7.1. Clients connect from Wyse Xenith2 thin clients; about 100 possible clients, but generally only 30-50 at any given time.
Print Server: Server 2008 R2 SP1, Microsoft Print Server

Errors in Event Viewer

Here's a brief rundown of the various errors I've gotten and what I've been able to find out about each.

splwow64.exe crash

Type: Error
Source: Application Error
EventID: 1000
Task Category: (100)
Message:
Faulting application name: splwow64.exe, version: 6.1.7601.17777, time stamp: 0x4f35fbfe
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521eaf24
Exception code: 0xc0000374
Fault offset: 0x00000000000c4102
Faulting process id: 0xeb30
Faulting application start time: 0x01cfd83a3b2e4f74
Faulting application path: C:\Windows\splwow64.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll

splwow64.exe is a process that translates the x64 print drivers for use by 32-bit applications. For example, all of our print drivers must be 64-bit because the server runs 2008 R2 SP1, but the server also runs 32-bit Office (for various plug-in compatibility). When Office wants to print, it has to go through splwow64.exe because it wouldn't know what to do with a 64-bit driver.

As for why this crashes, I have no idea. The "faulting module" is ntdll.dll, and the exception code is 0xc0000374. ntdll.dll is explained here in Wikipedia; I'd try to summarize, but since my understanding is vague at best, it's probably best to read it yourself. 0xc0000374 is the code for heap corruption (STATUS_HEAP_CORRUPTION), which is a fancy way of saying the process's memory was modified in a way that wasn't expected. Neither of these bits of information is particularly insightful, but they come up over and over in these errors.


spoolsv.exe crash

Type: Error
Source: Application Error
EventID: 1000
Task Category: (100)
Message: 
Faulting application name: spoolsv.exe, version: 6.1.7601.17777, time stamp: 0x4f35fc1d
Faulting module name: ntdll.dll, version: 6.1.7601.18247, time stamp: 0x521eaf24
Exception code: 0xc0000374 OR 0xc0000005
Fault offset: 0x00000000000c4102
Faulting process id: 0x9044
Faulting application start time: 0x01cfd8232b50fcd7
Faulting application path: C:\Windows\System32\spoolsv.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll

This error is very similar to the splwow64.exe one above, with the exception of the 0xc0000005 variant, which is a memory access violation (STATUS_ACCESS_VIOLATION).
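
The hex values in these events can be decoded with a few lines of PowerShell. This is only an illustrative sketch: the lookup table covers just the two exception codes seen here, and Convert-HexFileTime is a helper name made up for this example. The "Faulting application start time" field is a Windows FILETIME written in hex.

```powershell
# NTSTATUS names for the two exception codes seen in the crash events
$statusNames = @{
    '0xc0000374' = 'STATUS_HEAP_CORRUPTION'
    '0xc0000005' = 'STATUS_ACCESS_VIOLATION'
}

# Hypothetical helper: convert a hex FILETIME string to a UTC DateTime
function Convert-HexFileTime([string]$Hex) {
    $ticks = [Convert]::ToInt64($Hex.Replace('0x', ''), 16)
    [DateTime]::FromFileTimeUtc($ticks)
}

# Start time of the crashed splwow64.exe process from the event above
Convert-HexFileTime '0x01cfd83a3b2e4f74'

# Name of the exception code from the same event
$statusNames['0xc0000374']
```

Comparing the decoded start time with the event's own timestamp shows how long the process had been running before it crashed.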

Couldn't Load Print Processor

Type: Error
Source: PrintService
EventID: 365
Task Category: Initializing a print Processor
Message:
Windows could not load print processor hpcpp160 because EnumDatatypes failed. Error code 126. Module: 18\hpcpp160.dll. Please obtain and install a new version of the driver from the manufacturer (if available), or choose an alternate driver that works with this print device.

This error is a bit more informative. There are variants for other print processors (e.g. hpzppwn7). hpcpp160 happens to be the print processor for the HP Universal Print Driver (version 5.8). Anyway, it's our first indication that something is wrong with the print service itself. The problem is that most of the time this print processor works fine: most of our printers use the HP UPD, and they work most of the time.

I've also tried reinstalling the UPD (on the client; reinstalling on the server would require more extended downtime). This hasn't had an effect.

Citrix -- "Environment is incorrect" "no printers were found" "printer auto-creation failed"

Type: Error
EventID: 1114 / 1116

I'm not going to list out the full text of these because they are erroneous (at least as far as my problem goes) -- see this forum for more info. Basically these errors can be logged even if printer creation is succeeding. It's very obnoxious. It's possible it's logged because printers are not being deleted at logoff, but I haven't found anything to suggest that is still an issue (that forum post is pretty old).

Print Spooler Can't copy file

Note: you must enable the PrintService operational log to see this error. In Event Viewer, find it under Applications and Services Logs > Microsoft > Windows > PrintService. Right-click the Operational log and select "Enable Log".
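
The log can also be enabled and queried from PowerShell. This is a sketch assuming an elevated prompt; the 20-event limit is an arbitrary choice.

```powershell
# Enable the PrintService operational log (same effect as "Enable Log" in Event Viewer)
wevtutil.exe sl Microsoft-Windows-PrintService/Operational /e:true

# List the most recent event 811 entries once the log has collected some.
# Get-WinEvent reports an error if no matching events exist yet.
Get-WinEvent -FilterHashtable @{ LogName = 'Microsoft-Windows-PrintService/Operational'; Id = 811 } -MaxEvents 20 |
    Select-Object TimeCreated, Message
```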

Type: Error
Source: PrintService
EventID: 811
Task Category: Executing a file operation
Message:
The print spooler failed to move the file C:\Windows\system32\spool\PRTPROCS\x64\hpcpp160.dll to C:\Windows\system32\spool\PRTPROCS\x64\202_hpcpp160.dll, error code 0xb7. See the event user data for context information.

This message, again, may vary with the print processor (replace hpcpp160.dll with whatever.dll). It's an odd message. The folder it references is full of duplicate print processor DLLs (1_hpcpp160.dll to 499_hpcpp160.dll). I have no idea why, and this is the current lead I'm working on.
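
The numeric error codes in events 365 and 811 are ordinary Win32 error codes: 126 is ERROR_MOD_NOT_FOUND and 0xb7 (183) is ERROR_ALREADY_EXISTS, which would be consistent with the spooler trying to move a file to a name that is already taken. A rough sketch for printing the system messages and counting the numbered copies per print processor follows. It assumes the default spool directory under %windir%.

```powershell
# Translate the Win32 error codes from the events into their system messages
foreach ($code in 126, 0xb7) {
    $message = (New-Object System.ComponentModel.Win32Exception $code).Message
    "{0} (0x{0:x}): {1}" -f $code, $message
}

# Count numbered copies (e.g. 1_hpcpp160.dll) per original print processor DLL
$dir = Join-Path $env:windir 'System32\spool\prtprocs\x64'
Get-ChildItem -Path $dir -Filter '*.dll' |
    Where-Object { $_.Name -match '^\d+_.+\.dll$' } |
    Group-Object { $_.Name -replace '^\d+_', '' } |
    Select-Object Name, Count
```

Running the count periodically shows whether the folder is still accumulating copies.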

Things I've Tried

  1. Create Script that restarts processes
    1. First did this restarting "spooler" and "cpsvc" every 5 minutes
      1. this technically worked, but caused some strange behavior and is overly inelegant
    2. Set "spooler" and "cpsvc" to run the script when either crashes
      1. this can be done in the services MMC snap in.
      2. Still doesn't solve the underlying problem, but is a nice band-aid fix until I can figure the bigger issue out
      3. it's also way more elegant than the "restart every 5 minutes" solution.
      4. Note: had to change CpSvc to log on as a local service with permission to interact with desktop (was just local service), otherwise the script wouldn't run correctly when it failed.
  2. Moving printers to the HP UPD
    1. Thought here is that one of the device-specific print drivers wasn't terminal compatible
    2. This hasn't exactly panned out. Moved device-specific printers to the UPD but errors continue to show up. I've just finished this migration recently, so maybe it'll pay off over time.
  3. Clearing out system32\spool\prtprocs\x64
    1. This folder is full of duplicate .dll files (see "print spooler can't copy file" above) 
    2. Found out not to delete everything from that folder. The WinPrint.dll file will not recreate itself.
    3. Print spooler still crashed (0xc0000005) at like 5am, when no one would have been using it. So that's fun.
      1. Actually, someone did log in at 5am, just seconds before the crash. So that's something to go on, maybe.
    4. Printing hasn't gotten any worse, so at least I haven't broken anything
    5. This at least seems to have stopped the 811 errors. Watching to see if the prtprocs folder starts to build up again.
  4. Recreating Print Queue
    1. Some things I read suggested that print queues may become corrupt when switching drivers/print processors.
      1. Print queue is the technical term for a shared printer
    2. So I deleted and recreated the print queues that seemed to be causing the most issues
      1. "most issues" was determined by cross referencing the crash time with our user tracking to determine which stations (and thus which printers) were most recently logged into before the crash.
    3. This actually appears to have had some effect. I've only had one crash (and it was the splwow64.exe crash, not the spooler or CpSvc) today.
      1. Today's Friday, so load is low, but will continue to monitor.
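
For the folder cleanup in step 3, a dry run along these lines could preview which files would be removed. This is a sketch, not the exact procedure used above: it assumes that only the numbered copies are safe to remove, which matches the experience above that clearing them did not make printing worse, and it leaves winprint.dll and the un-numbered print processors alone.

```powershell
$dir = Join-Path $env:windir 'System32\spool\prtprocs\x64'

# Only match files with a numeric prefix such as 202_hpcpp160.dll.
# -WhatIf prints what would be deleted without deleting anything.
Get-ChildItem -Path $dir -Filter '*.dll' |
    Where-Object { $_.Name -match '^\d+_.+\.dll$' } |
    Remove-Item -WhatIf
```

To delete for real, remove -WhatIf. It is probably necessary to stop both cpsvc and spooler first, since DLLs loaded by the spooler may be locked.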

Days Pass....
Spooler/CpSvc/splwow64 continue to crash, but much less frequently: maybe once or twice a day on average, down from every couple of hours. I am going to continue to recreate print queues to see if I can eliminate the crashing altogether, and will update this page as I learn more.

RestartPrintServices.ps1

# Restarts the Citrix Print Manager (CpSvc) and the Print Spooler together.
# If a service has not stopped after about 5 seconds, its process is killed.

function Stop-PrintService($ServiceName, $ProcessName)
{
    Write-Host "Shutting down $ServiceName"
    Stop-Service -Force $ServiceName
    Write-Host "Waiting for $ServiceName to shut down gracefully" -NoNewline

    $count = 0
    while ((Get-Service $ServiceName).Status -ne "Stopped")
    {
        $count++
        if ($count -gt 5)
        {
            Write-Host ""
            Write-Host "$ServiceName has not shut down gracefully, shutting down manually"
            # The process may already be gone, so ignore "not found" errors
            Stop-Process -Force -Name $ProcessName -ErrorAction SilentlyContinue
            break
        }
        Write-Host "." -NoNewline
        Start-Sleep 1
    }
    Write-Host ""
}

# Stop CpSvc first, then the spooler
Stop-PrintService "cpsvc" "cpsvc"
Stop-PrintService "spooler" "spoolsv"

# Start them in the reverse order: spooler first, then CpSvc
Write-Host "Bringing Spooler back up"
Start-Service spooler

Write-Host "Bringing Citrix Print Manager back up"
Start-Service cpsvc

# Record the time of each restart (c:\temp must already exist)
Get-Date >> c:\temp\restartprinters.txt

 

7 comments:

  1. I just came across this post because it seems like I'm dealing with the same issue. I'm on XenApp 5.0 running Server 2008 x32. So far, I've been trying to load the model-specific drivers to solve this problem but it looks like I'm going down the wrong track. I plan on getting your restart script working in my environment first to ease the users pain but I'm very interested to see what the root cause of this will be. Thank you very much for documenting this!! I'll keep you posted on my testing results

    Replies
    1. Right on,

      One other problem I've run into is not the services crashing, which is easy to detect and fix, but the services becoming hung/frozen. The end result is about the same (users get no printers or errors when trying to print) and the fix is the same (restarting both services) but it's much harder to detect/fix automatically.

      So that's something to watch out for even after you get the auto-restart stuff going on.

    2. Yep! I'm noticing that too now. The CPM service is getting stuck on stopping so I have to use task kill to restart the process. Like you say, by the time one user reports the issue, 5 more people are calling help desk to report it as well. I'm opening up a ticket with a consulting group we work closely with and hope to have some good results soon. I'll post what we find!! Truly annoying issue..

    3. We've determined that this issue was being caused by users loading drivers onto the XenApp servers even though we specifically stated, through policy, not to allow users to load drivers. To fix it, we implemented the GPOs referenced in this Citrix article: http://support.citrix.com/article/CTX128786. Once this GPO was in place, drivers stopped being loaded and that stopped my print spooler from crashing. It's been three days since I've seen a crash!! Hopefully this info helps with your issue.

    4. I've updated the article with an additional solution. Based on Spreela's experience, I checked to make sure users were not able to load their own drivers (they were not in my case), but I did realize the server had a bunch of old drivers that were no longer in use (from printers that had been retired). After clearing out those drivers from the Terminal Server and Print Server the stability of both has improved greatly.

      You know, why wouldn't the spooler load drivers it doesn't need, right?

  2. Hi - Do you still have the splwow64 problem? It still irritates us :-(

    Replies
    1. I have not had the problem in quite some time. We moved away from using the Citrix print mapping policy to using just regular AD Group Policy with item-level targeting. This, combined with the various mitigations above, seems to have largely stopped the crashing.