
Key: QB-3842
Type: New Feature
Status: Open
Priority: Major
Assignee: Robin Shen
Reporter: Bin Wu
Votes: 0
Watchers: 0
QuickBuild

How to trace the agent load history

Created: 15/Mar/22 11:39 AM   Updated: 16/Mar/22 11:28 PM
Component/s: None
Affects Version/s: 11.0.26
Fix Version/s: None

Original Estimate: Unknown
Remaining Estimate: Unknown
Time Spent: Unknown


Description
We frequently run into system hangs that are not caused by QB itself.
They may be caused by a particular build order.
We therefore want to query the agent load history to identify the builds responsible.
We looked in both the server log and the agent log but could not find this information.
Can you point us to a way to capture it?

Comments
Robin Shen [15/Mar/22 02:23 PM]
The load history is not available yet. You may however check the agent log to see if there is any clue.

Bin Wu [16/Mar/22 01:01 AM]
From the system log I can tell that a test was running, but not which test.
The QB agent log only shows a few errors, such as QB connection errors.

Robin Shen [16/Mar/22 10:47 AM]
Can you please show me the error?

Bin Wu [16/Mar/22 11:02 AM]
The errors look like the ones below. Their timestamps do not match the time of the test build; they may be caused by the network
or by a failed cancel:

========
2022-02-28 13:34:27,385 [Thread-12] ERROR com.pmease.quickbuild.Quickbuild - Error connecting server.
    com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://servername:8810/service/connect'
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:285)
        at com.caucho.hessian.client.HessianProxy.invoke(HessianProxy.java:171)
        at com.sun.proxy.$Proxy19.connect(Unknown Source)
        at com.pmease.quickbuild.grid.AgentConnectivityTask.run(AgentConnectivityTask.java:62)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: com.caucho.hessian.client.HessianRuntimeException: Error connecting 'http://servername:8810/service/connect'
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:103)
        at com.caucho.hessian.client.HessianProxy.sendRequest(HessianProxy.java:283)
        ... 4 more
    Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:607)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
        at sun.net.www.http.HttpClient.New(HttpClient.java:339)
        at sun.net.www.http.HttpClient.New(HttpClient.java:357)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1223)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1337)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1312)
        at com.caucho.hessian.client.HessianURLConnection.getOutputStream(HessianURLConnection.java:99)
        ... 5 more

=======
2022-03-15 13:02:09,837 [pool-2-thread-1282] WARN com.pmease.quickbuild.grid.NodeServiceImpl - testGridJob is cancelling job '41ce1f76-80e9-4b55-9baf-91f57379facf'...
2022-03-15 13:02:09,838 [pool-2-thread-1282] WARN com.pmease.quickbuild.grid.GridTaskFuture - Job still exists on job node and cancel command is issued (job id: 41ce1f76-80e9-4b55-9baf-91f57379facf, build id: 661422, job node: test-agent-1:8812)...
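
Since the load history is not available, one way to narrow things down is to filter the agent log by time and pull out any build ids it mentions. The following is only a rough sketch: it assumes every log entry starts with a header in the same layout as the lines above (timestamp, thread, level, logger, message), and the function names `parse_line` and `entries_between` are made up for this example, not part of QuickBuild.

```python
import re
from datetime import datetime

# Header layout assumed from the excerpts above, e.g.
# 2022-03-15 13:02:09,837 [pool-2-thread-1282] WARN com.pmease... - message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"\[(?P<thread>[^\]]+)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)
# Some messages carry "build id: <number>", as in the GridTaskFuture warning.
BUILD_ID_RE = re.compile(r"build id: (\d+)")


def parse_line(line):
    """Return a dict for a log header line, or None for stack-trace and other lines."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    entry = m.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S")
    b = BUILD_ID_RE.search(entry["msg"])
    entry["build_id"] = b.group(1) if b else None
    return entry


def entries_between(lines, start, end, levels=("WARN", "ERROR")):
    """Yield parsed entries of the given levels with start <= timestamp <= end."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] in levels and start <= entry["ts"] <= end:
            yield entry
```

To use it, open the agent log file, pass the file object as `lines`, and set `start` and `end` to `datetime` values around the time the system log shows the hang. Entries whose `build_id` is not `None` point to the builds that were on the agent in that window.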

Robin Shen [16/Mar/22 11:28 PM]
This looks like a network issue to me. When it happens, please log in to the agent machine and run the command below to see if it works:
telnet servername 8810
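
If telnet is not installed on the agent machine, the same check can be approximated with a short script. This is a minimal sketch, not a QuickBuild tool: `can_connect` is a hypothetical helper name, the 5 second timeout is an arbitrary choice, and it only tests whether a TCP connection can be opened, just as the telnet command does.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, using the host and port from the log above:
#   can_connect("servername", 8810)
```

A `False` result while the server is known to be running would be consistent with the "connect timed out" error in the agent log and would point toward the network between agent and server.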