Just out of curiosity: In earlier versions, the build was marked as timed out when this happened. Has this been changed?
This is actually the reason I did not notice it in the beginning. I feel so stupid!
While experimenting with the "Disconnect Tolerance", I must have accidentally entered the value for one of the "Sequential" steps into the wrong field, "Timeout", instead of "Disconnect Tolerance". As we had network problems at that time, I did not notice it immediately and later forgot that I had changed the "Disconnect Tolerance" in two places. So the "Timeout" was set to 5 minutes, and this is what caused the 'InterruptedException'! While doing maintenance, I also updated to version 12.0 at the same time and only later looked at the timeouts, so I wrongly attributed the 'new problem' to this update, as it was still there even after fixing the network problems. Sorry for having wasted your time.

Can you please attach a screenshot of the overview of the failed build?
The master step is already configured to run on "DB-FUSION-YOCTO:8811", so there is no parent, is there?
This setting should be specified in the parent step of the step running on node "DB-FUSION-YOCTO:8811".
Actually, I already had the disconnect tolerance set to 180 on the master step (this is where the delegation to Ubuntu is done).
Or is this not the correct place? The sleep task went through just fine.
Would it help to run a build with increased logging on the server? The testGridJob periodically checks connectivity of the node running the job by probing the agent port. It kills the job if it detects a disconnection. You may also edit the parent step of the failed step to specify a network disconnection tolerance value (in the advanced settings of the step) to see if it helps.
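QuickBuild's actual implementation is not shown in this thread; the following is only a minimal sketch of how a periodic connectivity check with a disconnect tolerance might behave, to illustrate why a too-small tolerance (or a misconfigured timeout) kills jobs after a fixed interval. All names, the probe interval, and the tolerance value are illustrative assumptions, not QuickBuild internals:

```python
import socket


def probe_agent(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the agent port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_cancel(probe_results, tolerance_seconds, interval_seconds):
    """Decide whether to cancel the job, given a sequence of probe outcomes.

    The job is cancelled once the node has been continuously unreachable
    for longer than the configured disconnect tolerance.
    """
    consecutive_failures = 0
    for reachable in probe_results:
        if reachable:
            consecutive_failures = 0  # any successful probe resets the clock
        else:
            consecutive_failures += 1
            if consecutive_failures * interval_seconds > tolerance_seconds:
                return True
    return False
```

With a 60-second probe interval and a 180-second tolerance, three consecutive failed probes are still within tolerance, but a fourth exceeds it and triggers cancellation; a single successful probe in between resets the count.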
Sure, I will.
Meanwhile I've restarted all the build machines. I also ran ping on both sides; there was absolutely no interruption. I also looked at the agent log (see below), but probably this will not help. It always breaks 5 minutes into the build. I will just create a build that sleeps for 6 minutes and we will see.

jvm 1 | 2022-03-09 09:38:25,057 INFO Workspace: /home/fusionuser/dev/buildagent/workspace/Fusion/GPR/GPR3/OS/BRCM/GPR3_OS Branch
jvm 1 | 2022-03-09 09:43:25,590 WARN testGridJob is cancelling job 'ac43330f-76c5-47f3-bcae-dfa1aa23f661'...
jvm 1 | 2022-03-09 09:43:25,591 WARN Job still exists on job node and cancel command is issued (job id: ac43330f-76c5-47f3-bcae-dfa1aa23f661, build id: 102982, job node: DB-FUSION-YOCTO:8811)...
jvm 1 | 2022-03-09 09:43:25,592 WARN testGridJob is cancelling job 'fd76e1dc-6c8f-4aa0-a327-e6423a5722d1'...
jvm 1 | 2022-03-09 09:43:25,592 WARN Job still exists on job node and cancel command is issued (job id: fd76e1dc-6c8f-4aa0-a327-e6423a5722d1, build id: 102982, job node: DB-FUSION-YOCTO:8811)...
jvm 1 | 2022-03-09 09:43:25,592 WARN testGridJob is cancelling job 'a221849a-e509-4a3c-aec4-ff821af30219'...
jvm 1 | 2022-03-09 09:43:25,592 WARN Job still exists on job node and cancel command is issued (job id: a221849a-e509-4a3c-aec4-ff821af30219, build id: 102982, job node: DB-FUSION-YOCTO:8811)...
jvm 1 | 2022-03-09 09:43:25,592 WARN testGridJob is cancelling job '1b884602-cf6a-40e2-9fc6-59e66e0a2214'...
jvm 1 | 2022-03-09 09:43:25,593 WARN Job still exists on job node and cancel command is issued (job id: 1b884602-cf6a-40e2-9fc6-59e66e0a2214, build id: 102982, job node: DB-FUSION-YOCTO:8811)...
jvm 1 | 2022-03-09 09:43:25,593 WARN testGridJob is cancelling job '423baf8b-8c03-42f3-8915-a7294f9f79db'...
jvm 1 | 2022-03-09 09:43:25,593 WARN Job still exists on job node and cancel command is issued (job id: 423baf8b-8c03-42f3-8915-a7294f9f79db, build id: 102982, job node: DB-FUSION-YOCTO:8811)...
jvm 1 | 2022-03-09 09:43:25,605 INFO Killing process 15112...
jvm 1 | 2022-03-09 09:43:25,605 INFO Killing process 15077...
jvm 1 | 2022-03-09 09:43:25,606 INFO Killing process 15076...
jvm 1 | 2022-03-09 09:43:25,627 INFO Killing process 15123...
jvm 1 | 2022-03-09 09:43:25,627 INFO Killing process 15125...
jvm 1 | 2022-03-09 09:43:25,627 INFO Killing process 15126...
jvm 1 | 2022-03-09 09:43:25,627 INFO Killing process 15127...
jvm 1 | 2022-03-09 09:43:25,628 INFO Killing process 15128...
jvm 1 | 2022-03-09 09:43:25,628 INFO Killing process 15129...
jvm 1 | 2022-03-09 09:43:25,628 INFO Killing process 15130...
jvm 1 | 2022-03-09 09:43:25,628 INFO Killing process 15131...
jvm 1 | 2022-03-09 09:43:25,629 INFO Killing process 15132...
jvm 1 | 2022-03-09 09:43:25,629 INFO Killing process 15120...
jvm 1 | 2022-03-09 09:43:25,629 INFO Killing process 15123...
jvm 1 | 2022-03-09 09:43:25,629 INFO Killing process 15125...
jvm 1 | 2022-03-09 09:43:25,630 INFO Killing process 15126...
jvm 1 | 2022-03-09 09:43:25,630 INFO Killing process 15127...
jvm 1 | 2022-03-09 09:43:25,630 INFO Killing process 15128...
jvm 1 | 2022-03-09 09:43:25,631 INFO Killing process 15129...
jvm 1 | 2022-03-09 09:43:25,632 INFO Killing process 15130...
jvm 1 | 2022-03-09 09:43:25,632 INFO Killing process 15131...
jvm 1 | 2022-03-09 09:43:25,632 INFO Killing process 15132...

Can you please help to reproduce this on a blank QB instance, and send me the database backup for diagnostics?
Is this happening all the time, or occasionally?
https://track.pmease.com/browse/QB-3841