
Key: QB-3148
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Robin Shen
Reporter: John Landers
Votes: 0
Watchers: 0
QuickBuild

QuickBuild Agents drop running build on timeout

Created: 16/Mar/18 05:16 PM   Updated: 17/Mar/18 12:00 AM
Component/s: None
Affects Version/s: 6.1.36
Fix Version/s: None

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown
Environment: Windows Build Server, build agents running on Azure


Description
The build server runs inside the company network and the build agents run in the cloud on Azure. Occasionally our internet switch between the company and Azure has a blip that drops the VPN connection between the build server and an agent for about a minute. Our builds take about 45 minutes, so a drop in the middle of a build means that time is lost. This happens about once a week. We are looking into the network issue, but it would be nice to be able to increase the timeout.

When the connection drops, a running build fails with the exception shown at the end of this issue.

It seems the connect and read timeouts for the remote connection are hard-coded. It would be nice if they were configurable.
v8.0.0 appears to have the same values; we plan to upgrade to 8.0.0 soon.

com.pmease.quickbuild.RemotingProxyFactory.RemotingProxyFactory(String)
    public RemotingProxyFactory(String token) {
        this.token = token;
        setOverloadEnabled(true);
        // Bootstrap constants are in seconds; the setters take milliseconds.
        setConnectTimeout(Bootstrap.NET_CONNECT_TIMEOUT*1000L);
        setReadTimeout(Bootstrap.NET_READ_TIMEOUT*1000L);
    }

com.pmease.quickbuild.bootstrap.Bootstrap
    public static final int NET_CONNECT_TIMEOUT = 120; // in seconds
    
    public static final int NET_READ_TIMEOUT = 300; // in seconds
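
To illustrate the request, here is a minimal sketch of how the two timeouts could be read from Java system properties, falling back to the current defaults of 120 and 300 seconds. The class name TimeoutConfig and the property names example.net.connectTimeout and example.net.readTimeout are hypothetical and not part of QuickBuild; this is not how QuickBuild actually resolves these values.

```java
import java.util.Properties;

// Hypothetical helper, not part of QuickBuild.
final class TimeoutConfig {
    // Defaults mirror the Bootstrap constants quoted above (in seconds).
    static final int DEFAULT_CONNECT_TIMEOUT = 120;
    static final int DEFAULT_READ_TIMEOUT = 300;

    // Assumed property names, chosen only for this example.
    static final String CONNECT_PROP = "example.net.connectTimeout";
    static final String READ_PROP = "example.net.readTimeout";

    private TimeoutConfig() {
    }

    // Returns the configured value in seconds, or the default when the
    // property is missing, not a number, or not positive.
    static int resolveSeconds(Properties props, String key, int defaultSeconds) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return defaultSeconds;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : defaultSeconds;
        } catch (NumberFormatException e) {
            return defaultSeconds;
        }
    }

    // Converts seconds to milliseconds using long arithmetic.
    static long toMillis(int seconds) {
        return seconds * 1000L;
    }

    public static void main(String[] args) {
        Properties props = System.getProperties();
        long connectMs = toMillis(resolveSeconds(props, CONNECT_PROP, DEFAULT_CONNECT_TIMEOUT));
        long readMs = toMillis(resolveSeconds(props, READ_PROP, DEFAULT_READ_TIMEOUT));
        System.out.println("connect timeout (ms): " + connectMs);
        System.out.println("read timeout (ms): " + readMs);
    }
}
```

With something like this, the constructor above could pass the resolved millisecond values to setConnectTimeout and setReadTimeout instead of the fixed constants, and a site with a flaky link could start the JVM with, for example, -Dexample.net.connectTimeout=600.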


Exception that fails the build:

09:46:44,417 ERROR - Build is failed.
    java.lang.RuntimeException: Error executing step execution job.
        at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:29)
        at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:19)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:116)
        at com.pmease.quickbuild.DefaultBuildEngine.run(DefaultBuildEngine.java:532)
        at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:400)
        at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:139)
        at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1142)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: com.pmease.quickbuild.QuickbuildException: Error testing job.
        at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:63)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:98)
        ... 7 more
    Caused by: com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://172.18.12.8:8811/service/node'
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at com.sun.proxy.$Proxy77.testGridJob(Unknown Source)
        at com.pmease.quickbuild.grid.GridTaskFuture.testJobs(GridTaskFuture.java:51)
        ... 8 more
    Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://172.18.12.8:8811/service/node'
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:113)
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
        ... 11 more
    Caused by: java.net.ConnectException: Connection timed out: connect
        at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
        at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
        at sun.net.www.http.HttpClient.New(HttpClient.java:308)
        at sun.net.www.http.HttpClient.New(HttpClient.java:326)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1283)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1258)
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:101)

Comments
Robin Shen [17/Mar/18 12:00 AM]
Please upgrade to QB8 and set the step disconnect tolerance value to work around this issue.