[QB-1587] QB server assigns more than one job to the same build agent
Status: | Closed |
Project: | QuickBuild |
Component/s: | None |
Affects Version/s: | 5.0.8 |
Fix Version/s: | None |
Type: | Bug | Priority: | Major |
Reporter: | Irina Kotlova | Assigned To: | Robin Shen |
Resolution: | Cannot Reproduce | Votes: | 0 |
Remaining Estimate: | Unknown | Time Spent: | Unknown |
Original Estimate: | Unknown | ||
Environment: | Build Agent: VMWare vCenter Server 5.0.0; Windows Server 2008 R2 Enterprise SP1 |
File Attachments: | CleanWorkspaceConflict.png |
Description |
This week the CleanUpWorkspace step started failing for a certain configuration. To remedy the situation, I cleaned the workspace from the QB UI and also manually removed the configuration directory on the build agent node. The next day the same problem occurred with a different configuration.
It turned out that at a certain point two separate build jobs of the same build were assigned to the same build agent. Both of them try to clean the same workspace, which causes a conflict. Thanks, Irina |
Comments |
Comment by Irina Kotlova [ 21/Mar/13 06:59 PM ] |
I attached CleanWorkspaceConflict.png - win2k8build1 build agent is assigned to 2 build jobs that are supposed to run on separate build agents. |
Comment by Robin Shen [ 22/Mar/13 01:00 AM ] |
Please send your database backup to [robin AT pmease DOT com] and let me know the configuration to look at. |
Comment by Irina Kotlova [ 22/Mar/13 08:27 PM ] |
I noticed that it is always the same build agent node that exhibits this behavior. I reinstalled the build agent from scratch and will wait for a couple of days. Thank you for your attention. |
Comment by Irina Kotlova [ 26/Mar/13 02:39 PM ] |
Robin,
After re-installing the build agent and manually removing all the workspaces in question, the problem did not reappear. My assumption is that it was originally caused by the QuickBuild Build Agent Service account having been excluded from the Administrators group on this node. As a result, the former workspace directory could not be removed. Sorry for the alarm; please close the ticket. Thank you, Irina |
Comment by Robin Shen [ 26/Mar/13 11:35 PM ] |
No problem. Just re-open it if you encounter the issue again. |
Comment by Irina Kotlova [ 16/Apr/13 04:01 PM ] |
Hi Robin,
Could you please help me understand what is going on with this QuickBuild node, win2k8buildvm11? Previously this node exhibited the behavior described above. Now I do not see it anymore. Since then I have re-installed the QB build agent twice and excluded the account that starts the QB service from the Windows Administrators group. So I no longer see a problem with cleaning the workspace. But something is still wrong with this build agent: for some reason it hangs on certain steps. For example, it now hangs while performing the test that compares the saved artifacts directory tree with the expected list (the baseline). There are very few files in the tree, by the way. The process starts at 02:47:49 and then hangs until 07:08:31, doing nothing. A similar hang happens while building C++ code in another project. It should not be caused by interaction with the desktop, as other build agents are configured similarly and they do not exhibit this behavior. The problem should be somehow related to permissions, although I do not see what is different between win2k8buildvm11 and the others. Your help is much appreciated! Irina
Here is the log. I previously tried setting the logging level to debug, but it produced no additional information.
02:47:49,060 INFO - [echo] Build step: Comparing released files structure with the baseline: D:\LConnected\V14-Blackstone\QuickBuild\nightly\src\publishedDir_x64.txt.
02:47:49,060 INFO - [echo] Command: C:\BuildFarm\src\python\tools\treevalidator.cmd -v -f D:\LConnected\V14-Blackstone\QuickBuild\nightly\src\publishedDir_x64.txt -p D:\LConnected\V14-Blackstone\QuickBuild\nightly\outputqb\Win64\Release > D:\LConnected\V14-Blackstone\QuickBuild\nightly\logStandard.test.comparison.log
02:47:49,061 INFO - [echo] Directory: D:\LConnected\V14-Blackstone\QuickBuild\nightly\src
02:47:49,061 INFO - [echo] ===============
02:47:49,062 INFO - buildUtilities.runShell:
07:08:31,965 INFO - Trying to kill process tree 2840 with OS kill utility...
07:08:33,166 INFO - SUCCESS: The process with PID 4668 (child process of PID 4304) has been terminated.
07:08:33,167 INFO - SUCCESS: The process with PID 4304 (child process of PID 5972) has been terminated.
07:08:33,177 INFO - SUCCESS: The process with PID 5972 (child process of PID 2840) has been terminated.
07:08:33,177 INFO - SUCCESS: The process with PID 2840 (child process of PID 4940) has been terminated.
07:08:33,217 INFO - Executing post-execute action...
07:08:33,222 ERROR - Step 'master>DistributeBuildTargets>ForEachPlatformConfig?BUILD_CONFIG=Release&BUILD_PLATFORM=Win64&BUILD_TYPE=distributed&BUILD_SIGN=true>SelectBuildSystem>RunBuildSteps>IfSyncSuccessful-Build>PhaseTest' is failed.
java.lang.RuntimeException: java.lang.InterruptedException
    at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:350)
    at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:201)
    at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep.run(AntBuildStep.java:275)
    at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58.CGLIB$run$0(<generated>)
    at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58$$FastClassByCGLIB$$5069064b.invoke(<generated>)
    at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
    at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:269)
    at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58.run(<generated>)
    at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:501)
    at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:29)
    at com.pmease.quickbuild.stepsupport.StepAwareJob.executeBuildAwareJob(StepAwareJob.java:47)
    at com.pmease.quickbuild.BuildAwareJob.execute(BuildAwareJob.java:61)
    at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:78)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.InterruptedException
    at java.lang.ProcessImpl.waitFor(Native Method)
    at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:319)
    ... 18 more |
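For longer logs, a hang like the one between 02:47:49 and 07:08:31 can be located by looking for large gaps between consecutive timestamps. The following is only a minimal sketch, not part of QuickBuild: the function name `find_gaps`, the 10-minute threshold, and the assumption that every log entry begins with an `HH:MM:SS,mmm` timestamp are all illustrative choices based on the excerpt above.

```python
import re
from datetime import datetime, timedelta

# Assumption: each log entry starts with "HH:MM:SS,mmm", as in the excerpt above.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2}),(\d{3})\b")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (line_before, line_after, gap) for each pause longer than threshold."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip continuation lines such as stack trace frames
        current = datetime.strptime(match.group(1), "%H:%M:%S")
        current += timedelta(milliseconds=int(match.group(2)))
        if prev_time is not None:
            gap = current - prev_time
            if gap < timedelta(0):
                # Assumption: a negative gap means the log crossed midnight.
                gap += timedelta(days=1)
            if gap > threshold:
                gaps.append((prev_line.rstrip(), line.rstrip(), gap))
        prev_time, prev_line = current, line
    return gaps


# A few lines taken from the log in this ticket.
sample = [
    "02:47:49,061 INFO - [echo] ===============",
    "02:47:49,062 INFO - buildUtilities.runShell:",
    "07:08:31,965 INFO - Trying to kill process tree 2840 with OS kill utility...",
    "07:08:33,166 INFO - SUCCESS: The process with PID 4668 (child process of PID 4304) has been terminated.",
]

for before, after, gap in find_gaps(sample):
    print(f"{gap} of silence after: {before}")
```

To scan a whole log file, the same function can be given an open file object, since it only iterates over lines.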
Comment by Robin Shen [ 17/Apr/13 01:35 AM ] |
To narrow down the issue, please log on to this machine and start the build agent directly from the command line by running:
"bin/agent console"
You will need to stop the agent service first for this to work. Then watch for some time to see if the problem still happens. |