
Key: QB-3228
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Robin Shen
Reporter: AlSt
Votes: 0
Watchers: 0
QuickBuild

When the IP of a build node changes, the change is not recognized, resulting in "No route to host"

Created: 25/Jul/18 08:37 AM   Updated: 28/Feb/19 03:10 PM
Component/s: None
Affects Version/s: 8.0.10
Fix Version/s: None

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown


Description
Recently we have seen quite often, at least on Windows nodes, that QB does not recognize when a node gets a new IP address. The server keeps trying to connect using the old IP address, which results in "No route to host". That error is of course accurate, because the old IP address is no longer assigned to any node.

Reproducible by:
* Set a static IP on a Windows node
* Start the QB agent
* Let a build run on that node
* Change the static IP on the node without stopping the QB agent
* Try to start the same build again -> during build condition checking, a "No route to host" exception is thrown

The log still prints the old IP, although the node already has a completely different one.

I'm not sure if this also happens on Linux.
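
To illustrate the behaviour we would expect, here is a minimal Java sketch of re-resolving a node's host name when a connection to its last known IP fails with NoRouteToHostException. It is not QuickBuild code: NodeAddressRefresh, NodeRecord, Resolver, Connector and connectWithRefresh are hypothetical names made up for this example, and it assumes the node's host name resolves to the new IP once the address has changed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.Socket;

// Hypothetical sketch; none of these names come from QuickBuild.
class NodeAddressRefresh {

    /** A node as a server might track it: stable host name plus last known IP. */
    static final class NodeRecord {
        final String host;
        final int port;
        volatile String cachedIp;

        NodeRecord(String host, int port, String cachedIp) {
            this.host = host;
            this.port = port;
            this.cachedIp = cachedIp;
        }
    }

    /** Maps a host name to an IP address string. */
    interface Resolver {
        String resolve(String host) throws IOException;
    }

    /** Attempts a connection to the given IP and port. */
    interface Connector {
        void connect(String ip, int port) throws IOException;
    }

    /** Resolves through the JVM, which may itself cache results (networkaddress.cache.ttl). */
    static final Resolver DNS = host -> InetAddress.getByName(host).getHostAddress();

    /** Opens and immediately closes a TCP connection, with a 3 second connect timeout. */
    static final Connector TCP = (ip, port) -> {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(ip, port), 3000);
        }
    };

    /**
     * Connects using the cached IP. On "No route to host" the host name is
     * resolved again and, if the address changed, the connection is retried once.
     * Returns the IP that was finally used.
     */
    static String connectWithRefresh(NodeRecord node, Resolver resolver, Connector connector)
            throws IOException {
        try {
            connector.connect(node.cachedIp, node.port);
        } catch (NoRouteToHostException e) {
            String fresh = resolver.resolve(node.host);
            if (fresh.equals(node.cachedIp)) {
                throw e; // address unchanged, so the failure is genuine
            }
            node.cachedIp = fresh; // remember the new address for later calls
            connector.connect(fresh, node.port);
        }
        return node.cachedIp;
    }
}
```

With real networking this would be called as NodeAddressRefresh.connectWithRefresh(node, NodeAddressRefresh.DNS, NodeAddressRefresh.TCP). The resolver and connector are passed in so the retry logic can be exercised without a network. This only helps when name resolution already returns the new address; if the JVM or the DNS server still hands out the old one, the original exception is rethrown.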

Log snippet on server:
2018-07-25 10:28:06,656 [pool-1-thread-1208] INFO com.pmease.quickbuild.DefaultBuildEngine - Processing build request (configuration:root/Run on ld-ws-pci02v, request id:9fe0f3b3-fe4d-452b-921f-509df2108255)
2018-07-25 10:28:06,662 [pool-1-thread-1208] INFO com.pmease.quickbuild.DefaultBuildEngine - Checking build condition on node (address: ld-ws-pci02v:8811, ip: 172.16.117.213)...
2018-07-25 10:28:09,673 [pool-1-thread-1208] ERROR com.pmease.quickbuild.DefaultBuildEngine - Error processing build request.
 java.lang.RuntimeException: Error executing check condition job.
        at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:39)
        at com.pmease.quickbuild.CheckConditionTask.reduce(CheckConditionTask.java:16)
        at com.pmease.quickbuild.grid.GridTaskFuture.get(GridTaskFuture.java:155)
        at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:395)
        at com.pmease.quickbuild.DefaultBuildEngine.access$000(DefaultBuildEngine.java:143)
        at com.pmease.quickbuild.DefaultBuildEngine$2.run(DefaultBuildEngine.java:1233)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
 Caused by: com.caucho.hessian.client.HessianRuntimeException: java.net.NoRouteToHostException: No route to host
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at com.sun.proxy.$Proxy76.executeGridJob(Unknown Source)
        at com.pmease.quickbuild.grid.GridTaskFuture.execute(GridTaskFuture.java:50)
        at com.pmease.quickbuild.grid.GridImpl.execute(GridImpl.java:58)
        at com.pmease.quickbuild.grid.GridImpl.execute(GridImpl.java:45)
        at com.pmease.quickbuild.DefaultBuildEngine.process(DefaultBuildEngine.java:392)
        ... 5 more
 Caused by: java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
        at sun.net.www.http.HttpClient.New(HttpClient.java:308)
        at sun.net.www.http.HttpClient.New(HttpClient.java:326)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1283)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1258)
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
        ... 11 more
2018-07-25 10:31:27,140 [pool-1-thread-1273] ERROR com.pmease.quickbuild.grid.GridJob - Error connecting task node 'ld-ws-pci02v:8811', will cancel running job...
 com.caucho.hessian.client.HessianRuntimeException: java.net.NoRouteToHostException: No route to host
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at com.sun.proxy.$Proxy76.isGridJobActive(Unknown Source)
        at com.pmease.quickbuild.grid.GridJob$1.run(GridJob.java:100)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
 Caused by: java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
        at sun.net.www.http.HttpClient.New(HttpClient.java:308)
        at sun.net.www.http.HttpClient.New(HttpClient.java:326)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)


Comments
Robin Shen [26/Jul/18 12:14 AM]
Right now QB assumes that an agent's IP address does not change. We will add logic to handle this in the next major release.

AlSt [25/Jul/18 08:58 AM]
Sorry for another comment...

This is also a problem when only one node is affected and you want to change some settings (such as plugin settings or system settings): saving fails because of the "No route to host" exception. Clicking "Save" even brings up a very prominent error screen, apparently because QB tries to distribute the setting to all nodes.

AlSt [25/Jul/18 08:39 AM]
Also, the node is still shown as active, but with the wrong IP address.