[QB-4077] Retry connection between server and agents
Created: 04/Mar/24  Updated: 18/Mar/24

Status: Open
Project: QuickBuild
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major
Reporter: Nguyen Danh Hung
Assigned To: Robin Shen
Resolution: Unresolved
Votes: 1
Remaining Estimate: Unknown
Time Spent: Unknown
Original Estimate: Unknown


 Description   
Hello Mr. Robin Shen,
Could you please add a connection retry mechanism that is triggered whenever a HessianProxy connection throws an exception?
When a connection exception occurs, we would like QB to keep retrying until the AgentConnectivityTask is maintaining the connection again.

Here are the logs:
Step 'master>Build_Steps' is failed.
    java.lang.RuntimeException: Error executing step process job.
        at com.pmease.quickbuild.stepsupport.StepProcessTask.reduce(StepProcessTask.java:126)
        at com.pmease.quickbuild.stepsupport.StepProcessTask.reduce(StepProcessTask.java:19)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:168)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:172)
        at com.pmease.quickbuild.stepsupport.SequentialStep.triggerChildren(SequentialStep.java:46)
        at com.pmease.quickbuild.stepsupport.CompositeStep.run(CompositeStep.java:133)
        at com.pmease.quickbuild.stepsupport.Step.doExecute(Step.java:661)
        at com.pmease.quickbuild.stepsupport.Step.execute(Step.java:575)
        at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:31)
        at com.pmease.quickbuild.stepsupport.StepAwareJob.executeBuildAwareJob(StepAwareJob.java:56)
        at com.pmease.quickbuild.BuildAwareJob.execute(BuildAwareJob.java:79)
        at com.pmease.quickbuild.grid.GridJob.run(GridJob.java:131)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
    Caused by: java.lang.RuntimeException: Error executing step execution job.
        at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:29)
        at com.pmease.quickbuild.stepsupport.StepExecutionTask.reduce(StepExecutionTask.java:19)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:168)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:172)
        at com.pmease.quickbuild.stepsupport.StepProcessJob.executeStepAwareJob(StepProcessJob.java:46)
        ... 8 more
    Caused by: com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://QB_ip/service/server'
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at com.sun.proxy.$Proxy35.stepUpdated(Unknown Source)
        at com.pmease.quickbuild.stepsupport.StepExecutionJob.executeStepAwareJob(StepExecutionJob.java:57)
        ... 8 more
    Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://QB_ip/service/server'
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:103)
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
        ... 11 more
    Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:607)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
        at sun.net.www.http.HttpClient.New(HttpClient.java:339)
        at sun.net.www.http.HttpClient.New(HttpClient.java:357)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1228)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1342)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1317)
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
        ... 12 more


 Comments   
Comment by Nguyen Danh Hung [ 18/Mar/24 02:31 AM ]
Hello Mr. Robin Shen,
Could you please provide a status update on this improvement?
Comment by Robin Shen [ 18/Mar/24 05:42 AM ]
Hi Nguyen,

There is no plan for this improvement: retrying universally on every connection failure is very cumbersome, and it is also not good practice because it hides real network problems.

Instead, please consider the step retry option in the advanced settings of a step.
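
To illustrate the difference between a universal retry and a scoped one, here is a minimal sketch of a bounded retry applied at a single call site. It is an illustration only and not QuickBuild or Hessian code: the class BoundedRetry, its method names, and the maxAttempts and delayMillis parameters are all hypothetical. It retries only when the cause chain contains a connect-level failure such as the java.net.SocketTimeoutException in the log above, and it gives up after a fixed number of attempts so that a persistent network problem still surfaces.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of QuickBuild or Hessian.
final class BoundedRetry {

    private BoundedRetry() {
    }

    // Returns true if any exception in the cause chain is a connect-level failure.
    // The depth limit guards against cyclic cause chains.
    public static boolean isConnectFailure(Throwable t) {
        for (int depth = 0; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SocketTimeoutException || t instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    // Runs the call, retrying only on connect-level failures and at most
    // maxAttempts times in total. Any other exception is rethrown immediately,
    // and the last connect failure is rethrown once the attempts are used up.
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!isConnectFailure(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Fixed pause between attempts; a real setup might prefer backoff.
                Thread.sleep(delayMillis);
            }
        }
    }
}
```

Because the attempts are capped and unrelated exceptions pass straight through, a sketch like this keeps real network problems visible, which is the concern raised above about retrying everywhere.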
Generated at Thu May 16 08:48:29 UTC 2024 using JIRA 189.