[QB-3842] How to trace the agent load history
Status: | Closed |
Project: | QuickBuild |
Component/s: | None |
Affects Version/s: | 11.0.26 |
Fix Version/s: | None |
Type: | New Feature | Priority: | Major |
Reporter: | Bin Wu | Assigned To: | Robin Shen |
Resolution: | Won't Fix | Votes: | 0 |
Remaining Estimate: | Unknown | Time Spent: | Unknown |
Original Estimate: | Unknown |
Description |
We keep running into system hangs that are not caused by QB.
They may be caused by the order in which certain builds run, so we want to query the agent load history to identify them. We looked in the server log and the agent log but could not find this information. Can you point us to a way to capture it? |
Comments |
Comment by Robin Shen [ 15/Mar/22 02:23 PM ] |
The load history is not available yet. You may however check the agent log to see if there is any clue. |
Comment by Bin Wu [ 16/Mar/22 01:01 AM ] |
From the system log I can tell that a test was running, but not which test.
The QB agent log only shows some errors, such as a QB connection error. |
Comment by Robin Shen [ 16/Mar/22 10:47 AM ] |
Can you please show me the error? |
Comment by Bin Wu [ 16/Mar/22 11:02 AM ] |
The errors look like the ones below. Their timestamps do not match the time of the test build, so they may be caused by the network or by a failed cancel:
========
2022-02-28 13:34:27,385 [Thread-12] ERROR com.pmease.quickbuild.Quickbuild - Error connecting server.
com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://servername:8810/service/connect'
    at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
    at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
    at com.sun.proxy.$Proxy19.connect(Unknown Source)
    at com.pmease.quickbuild.grid.AgentConnectivityTask.run(AgentConnectivityTask.java:62)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://servername:8810/service/connect'
    at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:103)
    at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
    ... 4 more
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:607)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
    at sun.net.www.http.HttpClient.New(HttpClient.java:339)
    at sun.net.www.http.HttpClient.New(HttpClient.java:357)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1223)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1337)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1312)
    at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
    ... 5 more
=======
2022-03-15 13:02:09,837 [pool-2-thread-1282] WARN com.pmease.quickbuild.grid.NodeServiceImpl - testGridJob is cancelling job '41ce1f76-80e9-4b55-9baf-91f57379facf'...
2022-03-15 13:02:09,838 [pool-2-thread-1282] WARN com.pmease.quickbuild.grid.GridTaskFuture - Job still exists on job node and cancel command is issued (job id: 41ce1f76-80e9-4b55-9baf-91f57379facf, build id: 661422, job node: test-agent-1:8812)... |
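Since the difficulty here is matching agent log entries to the time of a build, one option is to filter the log by time window. The following is only a rough sketch, not a QuickBuild feature: it assumes every entry starts with the `timestamp [thread] LEVEL logger - message` header seen in the excerpt above, and the function name `entries_in_window` is hypothetical.

```python
import re
from datetime import datetime

# Header format assumed from the log excerpt above, for example:
# 2022-02-28 13:34:27,385 [Thread-12] ERROR com.pmease.quickbuild.Quickbuild - Error connecting server.
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \[([^\]]+)\] (\w+)\s+(\S+) - (.*)$"
)


def entries_in_window(lines, start, end, levels=("ERROR", "WARN")):
    """Return (timestamp, level, logger, message) for entries between start and end."""
    found = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # stack-trace continuation lines carry no header and are skipped
        timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if start <= timestamp <= end and match.group(3) in levels:
            found.append((timestamp, match.group(3), match.group(4), match.group(5)))
    return found
```

To use it, pass the lines of the agent log file together with the start and end time of the suspect build as `datetime` values. Any entry it returns still has to be interpreted by hand.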
Comment by Robin Shen [ 16/Mar/22 11:28 PM ] |
This looks like a network issue to me. When it happens, please log in to the agent machine and run the command below to see if it works:
telnet servername 8810 |
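If telnet is not installed on the agent machine, the same TCP reachability check can be sketched in Python. This is a minimal illustration, not part of QuickBuild: the helper name `can_connect` and the 5-second timeout are assumptions, and `servername` and `8810` are the placeholder host and port from the command above.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    # Placeholder host and port taken from the telnet command above
    print(can_connect("servername", 8810))
```

A `False` result while the server is known to be running would be consistent with the `connect timed out` error in the agent log, though it does not identify where on the network path the failure occurs.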