
Key: QB-1587
Type: Bug
Status: Closed
Resolution: Cannot Reproduce
Priority: Major
Assignee: Robin Shen
Reporter: Irina Kotlova
Votes: 0
Watchers: 1
QuickBuild

QB server assigns more than one job to the same build agent

Created: 21/Mar/13 06:53 PM   Updated: 17/Apr/13 01:35 AM
Component/s: None
Affects Version/s: 5.0.8
Fix Version/s: None

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown
File Attachments: None
Image Attachments:

1. CleanWorkspaceConflict.png
(117 kb)
Environment: Build Agent: VMWare vCenter Server 5.0.0; Windows Server 2008 R2 Enterprise SP1


Description
This week the CleanUpWorkspace step started failing for a certain configuration. To remedy this, I cleaned the workspace from the QB UI and also manually removed the configuration directory on the build agent node. The next day the same problem occurred with a different configuration.

It turned out that at some point two separate build jobs of the same build were assigned to the same build agent. Both try to clean the same workspace, which causes a conflict.

Thanks,
Irina

Comments
Irina Kotlova [21/Mar/13 06:59 PM]
I attached CleanWorkspaceConflict.png. It shows the win2k8build1 build agent assigned to 2 build jobs that are supposed to run on separate build agents.

Robin Shen [22/Mar/13 01:00 AM]
Please send your database backup to [robin AT pmease DOT com] and let me know the configuration to look at.

Irina Kotlova [22/Mar/13 08:27 PM]
I noticed that it is always the same build agent node that exhibits this behavior. I reinstalled the build agent from scratch and will wait a couple of days. Thank you for your attention.

Irina Kotlova [26/Mar/13 02:39 PM]
Robin,

After reinstalling the build agent and manually removing all the workspaces in question, the problem did not reappear.

My assumption is that it was originally caused by the QuickBuild Build Agent service account having been excluded from the Administrators group on this node. As a result, the former workspace directory could not be removed.
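
One way to test this assumption is to check, while running under the agent's service account, whether that account can create and delete files in the workspace directory. The following is a minimal Python sketch; the function name `can_create_and_delete` is hypothetical and nothing here is part of QuickBuild itself.

```python
import os
import tempfile


def can_create_and_delete(directory):
    """Return True if the current account can create and remove a file in `directory`."""
    try:
        # mkstemp creates a uniquely named file and returns an open descriptor.
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        os.remove(path)
        return True
    except OSError:
        # Covers a missing directory as well as permission errors.
        return False
```

Running it against the workspace path from a process started by the service account would show whether a cleanup step could plausibly succeed there. It does not check permissions on files that already exist in the directory.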

Sorry for the alarm - please close the ticket. Thank you,
Irina

Robin Shen [26/Mar/13 11:35 PM]
No problem. Just re-open it if you encounter the issue again.

Irina Kotlova [16/Apr/13 04:01 PM]
Hi Robin,

Could you please help me understand what is going on with this QuickBuild node, win2k8buildvm11?

Previously this node exhibited the behavior described above. Now I no longer see it.

Since then I have reinstalled the QB build agent twice and excluded the account that starts the QB service from the Windows Administrators group. I no longer see any problem with cleaning the workspace.

But something is still wrong with this build agent. For some reason it hangs on certain steps. For example, it currently hangs while running a test that compares the saved artifacts directory tree with the expected list (the baseline). There are very few files in the tree, by the way. The process starts at 02:47:49 and then hangs until 07:08:31, doing nothing. A similar hang happens while building C++ code in another project.
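
For reference, a comparison of this kind involves very little work. The sketch below is not the actual treevalidator tool, whose implementation and baseline format are not shown in this ticket. It assumes a baseline with one relative file path per line, and the names `list_tree` and `compare_with_baseline` are hypothetical.

```python
import os


def list_tree(root):
    """Return the set of file paths under `root`, relative to it, with '/' separators."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found


def compare_with_baseline(root, baseline_lines):
    """Return (missing, unexpected) file lists relative to the baseline.

    Assumption: each non-empty baseline line is one relative file path.
    """
    expected = {
        line.strip().replace("\\", "/") for line in baseline_lines if line.strip()
    }
    actual = list_tree(root)
    return sorted(expected - actual), sorted(actual - expected)
```

On a tree with only a few files, a comparison like this is a single directory walk and two set differences, so the amount of work alone would not explain a multi-hour run.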

It should not be caused by interaction with the desktop, as other build agents are configured similarly and do not exhibit this behavior. The problem is presumably related to permissions somehow, although I do not see what differs between win2k8buildvm11 and the others.

Your help is much appreciated!
Irina

Here is the log. I previously tried setting the logging level to debug, but it produced no additional information.

02:47:49,060 INFO - [echo] Build step: Comparing released files structure with the baseline: D:\LConnected\V14-Blackstone\QuickBuild\nightly\src\publishedDir_x64.txt.
02:47:49,060 INFO - [echo] Command: C:\BuildFarm\src\python\tools\treevalidator.cmd -v -f D:\LConnected\V14-Blackstone\QuickBuild\nightly\src\publishedDir_x64.txt -p D:\LConnected\V14-Blackstone\QuickBuild\nightly\outputqb\Win64\Release > D:\LConnected\V14-Blackstone\QuickBuild\nightly\logStandard.test.comparison.log
02:47:49,061 INFO - [echo] Directory: D:\LConnected\V14-Blackstone\QuickBuild\nightly\src
02:47:49,061 INFO - [echo] ===============
02:47:49,062 INFO - buildUtilities.runShell:
07:08:31,965 INFO - Trying to kill process tree 2840 with OS kill utility...
07:08:33,166 INFO - SUCCESS: The process with PID 4668 (child process of PID 4304) has been terminated.
07:08:33,167 INFO - SUCCESS: The process with PID 4304 (child process of PID 5972) has been terminated.
07:08:33,177 INFO - SUCCESS: The process with PID 5972 (child process of PID 2840) has been terminated.
07:08:33,177 INFO - SUCCESS: The process with PID 2840 (child process of PID 4940) has been terminated.
07:08:33,217 INFO - Executing post-execute action...
07:08:33,222 ERROR - Step 'master>DistributeBuildTargets>ForEachPlatformConfig?BUILD_CONFIG=Release&BUILD_PLATFORM=Win64&BUILD_TYPE=distributed&BUILD_SIGN=true>SelectBuildSystem>RunBuildSteps>IfSyncSuccessful-Build>PhaseTest' is failed.
    java.lang.RuntimeException: java.lang.InterruptedException
        at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:350)
        at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:201)
        at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep.run(AntBuildStep.java:275)
        at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58.CGLIB$run$0(<generated>)
        at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58$$FastClassByCGLIB$$5069064b.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:215)
        at com.pmease.quickbuild.DefaultScriptEngine$Interpolator.intercept(DefaultScriptEngine.java:269)
        at com.pmease.quickbuild.plugin.builder.ant.AntBuildStep$$EnhancerByCGLIB$$828dc58.run(<generated>)
        at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:501)
        at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:29)
        at com.pmease.quickbuild.stepsupport.StepAwareJob.executeBuildAwareJob(StepAwareJob.java:47)
        at com.pmease.quickbuild.BuildAwareJob.execute(BuildAwareJob.java:61)
        at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:78)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
    Caused by: java.lang.InterruptedException
        at java.lang.ProcessImpl.waitFor(Native Method)
        at com.pmease.quickbuild.execution.Commandline.execute(Commandline.java:319)
        ... 18 more
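
The silent period in a log like the one above can be located mechanically. The following Python sketch assumes every log entry starts with an `HH:MM:SS,mmm` timestamp, as in the excerpt, and that the log does not cross midnight. The function name `largest_gap` is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches a leading timestamp such as "02:47:49,062 " at the start of a line.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2},\d{3})\s")


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest pause between entries.

    Lines without a leading timestamp (for example stack trace frames) are skipped.
    """
    previous = None
    best = (timedelta(0), None, None)
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%H:%M:%S,%f")
        if previous is not None and current - previous[0] > best[0]:
            best = (current - previous[0], previous[1], line)
        previous = (current, line)
    return best
```

Applied to the excerpt, the longest pause lies between the `buildUtilities.runShell:` entry at 02:47:49,062 and the kill attempt at 07:08:31,965, roughly 4 hours and 20 minutes.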

Robin Shen [17/Apr/13 01:35 AM]
To narrow down the issue, please log on to this machine and start the build agent directly from the command line by running:
"bin/agent console"
You will need to stop the agent service first for this to work.
Then watch for a while to see if the problem still occurs.