Lately, a lot of our builds have been getting CANCELLED. We don't have a 100% reliable repro, but it happens quite often, usually under heavy compile/link load.
The build log says:
21:00:23,539 ERROR - Step 'master>Compile' is failed.
java.lang.RuntimeException: java.lang.InterruptedException: Composite step 'Compile' is cancelled.
at com.pmease.quickbuild.stepsupport.CompositeStep.run(CompositeStep.java:121)
at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:539)
at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:30)
at com.pmease.quickbuild.stepsupport.StepAwareJob.executeBuildAwareJob(StepAwareJob.java:45)
at com.pmease.quickbuild.BuildAwareJob.execute(BuildAwareJob.java:60)
at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:106)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.InterruptedException: Composite step 'Group Compile' is cancelled.
... 11 more
I can't find anything with corresponding timestamps in the agent log. We have investigated possible causes in our build infrastructure but didn't find anything. Do you have any idea what might cause this?
Not sure if it's relevant, but the build agents do log the following error every now and then:
2016-05-02 20:57:24,024 [Thread-14] ERROR com.pmease.quickbuild.Quickbuild - Error connecting server.
com.caucho.hessian.client.HessianRuntimeException: java.net.SocketException: Software caused connection abort: recv failed
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
at com.sun.proxy.$Proxy19.connect(Unknown Source)
at com.pmease.quickbuild.grid.AgentConnectivityTask.run(AgentConnectivityTask.java:51)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Software caused connection abort: recv failed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.security.ssl.InputRecord.readFully(Unknown Source)
at sun.security.ssl.InputRecord.read(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(Unknown Source)
at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:101)
at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
... 4 more
We didn't see such behavior in 5.1.30.
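In case it helps anyone checking the same thing, here is a rough sketch of how the agent log could be scanned for ERROR entries shortly before a cancellation. The line format is only inferred from the agent log excerpt above, `errors_before` is a hypothetical helper name, the 5-minute window is an arbitrary choice, and the date of the cancellation is an assumption because the build log only shows a time of day.

```python
import re
from datetime import datetime, timedelta

# Assumed agent log line format, inferred from the excerpt above:
#   "2016-05-02 20:57:24,024 [Thread-14] ERROR com.pmease... - message"
AGENT_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \[[^\]]+\] (\w+)\s+(.*)$"
)


def errors_before(agent_lines, cancelled_at, window=timedelta(minutes=5)):
    """Return (timestamp, message) pairs for agent ERROR lines logged
    between cancelled_at - window and cancelled_at (inclusive)."""
    hits = []
    for line in agent_lines:
        m = AGENT_LINE.match(line)
        # Skip stack trace lines and anything that is not an ERROR entry.
        if not m or m.group(2) != "ERROR":
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if cancelled_at - window <= ts <= cancelled_at:
            hits.append((ts, m.group(3)))
    return hits


if __name__ == "__main__":
    # The build log shows only "21:00:23"; the date is an assumption.
    cancelled_at = datetime(2016, 5, 2, 21, 0, 23)
    sample = [
        "2016-05-02 20:57:24,024 [Thread-14] ERROR "
        "com.pmease.quickbuild.Quickbuild - Error connecting server.",
    ]
    for ts, msg in errors_before(sample, cancelled_at):
        print(ts, msg)
```

If the two excerpts above are from the same day, the connection error would fall about three minutes before the cancellation, but that alone doesn't show the two are related.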