[Disclaimer: This article is a work in progress and may not be 100% accurate]
When developing SharePoint solutions, it is common to script the deployment of artefacts (e.g. WSPs, features, content) so that builds can be deployed and uninstalled frequently across environments. One of the most common steps in our scripts is to retract and then delete WSP files. Retraction happens via an admin job, so if a script tries to delete a solution straight after issuing the retract command, the delete fails because the retract job has not yet completed. In SharePoint 2007, we called stsadm -o execadmsvcjobs between retract and delete to overcome this. In SharePoint 2010 the behaviour has changed, so the following must be considered:
- Stsadm -o execadmsvcjobs is effectively an alias of the PowerShell Start-SPAdminJob cmdlet.
- Stsadm -o execadmsvcjobs / Start-SPAdminJob only executes jobs (synchronously) if the SharePoint Administration Service is not started on the server on which you run the command. If the service is running, the admin jobs are executed according to schedule and you get the error “Start-SPAdminJob : The administration service is running so all administration jobs will be run in the timer service.” This is a change in behaviour from 2007.
- Stsadm -o execadmsvcjobs / Start-SPAdminJob only forces timer jobs to execute on the individual server where it is run, so it needs to be run on every node in the farm. The 2007 documentation (http://technet.microsoft.com/en-us/library/cc288149(office.12).aspx) indicated that it only needed to be run on one server, provided that the admin service was running on all nodes in the farm. See also http://technet.microsoft.com/en-us/library/ee513051.aspx. From investigating the code and speaking to others, it looks like the 2007 documentation may be misleading and that execadmsvcjobs actually needed to be run on all nodes in 2007 as well: running it on one WFE doesn’t implicitly force it to run on all WFEs, so it only appeared to work in the above scenario by coincidence. I have not confirmed this, but there may be a difference in behaviour depending on whether a solution is web application scoped or farm scoped (i.e. if web application scoped, timer jobs are created on each WFE, but this may not be the case if farm scoped).
The process when execadmsvcjobs / start-spadminjob runs is as follows:
- When Start-SPAdminJob / execadmsvcjobs runs, it iterates through all job definitions on the local server and builds a list of the outstanding SPAdministrationServiceJobDefinition instances.
- It then iterates through these matching job definitions and calls each job’s concrete implementation of “Execute” (an override of the virtual “Execute” method in the SPJobDefinition class).
- In the case of SPSolutionDeploymentJobDefinition, the Execute method deploys files to, or removes them from, the local server only.
To ensure that admin jobs run synchronously, we therefore have the following options (other than disabling the admin service):
- Do a simple Thread.Sleep. This obviously does not guarantee that a job has finished executing.
- Wait until the timer service’s running jobs collection (SPTimerService.RunningJobs) returns empty.
- Check the job history entries (e.g. SPJobDefinition.HistoryEntries) to see whether the solution job has finished running.
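The difference between a fixed sleep and the two "wait until done" options above comes down to polling a condition with a timeout. Below is a minimal, language-neutral sketch of that loop in Python. It is not SharePoint code: `wait_until` and its parameters are names made up for illustration, and `is_done` is a placeholder for whatever check you choose (for example, "the running jobs collection is empty" or "a history entry exists for the retract job").

```python
import time


def wait_until(is_done, timeout_s=300.0, interval_s=2.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_done() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False if the timeout expired.
    is_done is a hypothetical callable supplied by the caller; clock and
    sleep are injectable only to make the helper easy to test.
    """
    deadline = clock() + timeout_s
    while True:
        # Check first, so an already-finished job returns immediately.
        if is_done():
            return True
        # Give up once the deadline has passed instead of waiting forever.
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A deployment script would call something like `wait_until(retract_job_finished)` between the retract and delete steps, where `retract_job_finished` is again a hypothetical check, and abort the delete if the result is False. Unlike a fixed sleep, this returns as soon as the job is done and reports a timeout explicitly.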
Also, to ensure that all jobs execute on all nodes in the farm, you could run some code along these lines:
- Get the timer service, e.g. SPTimerService timerService = SPFarm.Local.TimerService
- Find the admin job definition (an SPJobDefinition) in timerService.JobDefinitions
- Iterate through the servers in the farm, i.e. each SPServer server in SPFarm.Local.Servers
- Call job.Execute(server.Id) to execute the job on each server. NB. From what I understand, calling the Execute method directly in code is not advised. SharePoint 2010 has a new method called RunNow(); however, I don’t think it exhibits synchronous behaviour.
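The control flow of the steps above can be sketched as a simple loop. This is Python used as runnable pseudocode, not the SharePoint object model: `execute_on_all_servers` is a made-up name, `server_ids` stands in for the IDs of the servers in the farm, and `execute_job` stands in for a call such as job.Execute(server.Id). The one design choice shown is collecting failures per server, so that an error on one node does not stop the job being attempted on the others.

```python
def execute_on_all_servers(server_ids, execute_job):
    """Call execute_job(server_id) once per server.

    Returns a dict mapping each failed server_id to the exception raised.
    An empty dict means every call completed without raising.
    Both arguments are hypothetical stand-ins supplied by the caller.
    """
    failures = {}
    for server_id in server_ids:
        try:
            execute_job(server_id)
        except Exception as exc:
            # Record the failure and carry on with the remaining servers.
            failures[server_id] = exc
    return failures
```

The caller can then decide whether a non-empty result should fail the whole deployment or just be logged.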
Thanks to Kashif Tahir for his input with this.