
Key: QB-2906
Type: Bug
Status: Closed
Resolution: Cannot Reproduce
Priority: Blocker
Assignee: Robin Shen
Reporter: Todd Scholl
Votes: 0
Watchers: 1

Quickbuild Server Queue hangs

Created: 22/Feb/17 03:57 AM   Updated: 17/Jan/18 07:19 AM
Component/s: None
Affects Version/s: 6.1.9
Fix Version/s: None

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown
File Attachments: 1. GZip Archive qb_issues_02212017_1.tar.gz (2.10 Mb)
2. GZip Archive qb_issues_02212017_2.tar.gz (7.90 Mb)
3. Zip Archive quickbuild_issues_2_28_console.log.1.zip (705 kb)
4. Zip Archive quickbuild_issues_2_28_console.zip (572 kb)
5. Zip Archive quickbuild_issues_2_28_quickbuild.zip (2.30 Mb)

Image Attachments: 1. Issue_2_21_2017.JPG (317 kb)
2. Issue_2_21_2017_2.JPG (181 kb)
Environment: Linux 3.0.101-71-default #1 SMP Thu Mar 3 12:56:15 UTC 2016 (7bdad2e) x86_64 x86_64 x86_64 GNU/Linux

Description
Our QuickBuild server is showing an odd set of symptoms today. This has occurred 3 times in the last 12 hours:
1. Jobs currently in progress stop progressing: jobs collecting metrics stay stuck in that state, jobs checking the build condition stay stuck there, and jobs running steps stay stuck there
2. All newly triggered jobs stay queued with Waiting_Process

The only thing that seems to resolve the issue is a QuickBuild service restart, which interrupts our production job runs.

Can you have a look at the logs and thread dumps we have and see if you can pinpoint what is going on?

Robin Shen [23/Feb/17 12:33 AM]
It looks like SSL is misconfigured on two of your build agents, "cv-web02:8811" and "cv-yup02:8811", as the log is full of SSL handshake errors. You need to either stop these two build agents or fix the SSL issue.
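
One way to check whether an agent still fails the handshake is to attempt a TLS connection to it directly. The sketch below is not part of QuickBuild; `probe_tls` is a hypothetical helper, and it assumes the agent port speaks TLS directly and that the agent certificate is trusted by the system CA store (an agent with a self-signed certificate would be reported as failing here even if the server trusts it).

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Attempt a TLS handshake with host:port and return (ok, detail).

    Hypothetical helper for illustration; uses the default system trust store.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # The handshake happens inside wrap_socket; failures raise ssl.SSLError.
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLError as exc:
        return False, f"TLS handshake failed: {exc}"
    except OSError as exc:
        # Covers refused connections, DNS failures, and timeouts.
        return False, f"connection failed: {exc}"


# Example usage (agent addresses taken from the comment above):
# for host, port in [("cv-web02", 8811), ("cv-yup02", 8811)]:
#     print(host, port, probe_tls(host, port))
```

A result of `(False, "TLS handshake failed: ...")` would point at the certificate or protocol configuration, while `"connection failed: ..."` would point at the network or the agent process itself.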

Todd Scholl [24/Feb/17 02:46 PM]
I have fixed one of the agents and am working on the other. Does that SSL issue hang enough of the server threads to force the jobs to hang?

Robin Shen [25/Feb/17 12:31 AM]
All jobs running on these agents will hang. Jobs running on other agents should not be affected (unless they depend on resources used by the hanging jobs).

Todd Scholl [28/Feb/17 05:06 PM]
Issue returned today. I have cleaned up the SSL issues that you mentioned above but the queue still froze. Is there anything else you can see that could be causing this? I am uploading today's logs.

Todd Scholl [28/Feb/17 06:53 PM]
Upload complete. As a second note, there are 3 thread dumps in the "quickbuild_issues_2_28_console.zip" if that helps.
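
For a quick first look at thread dumps like these, the thread states can be tallied before reading individual stacks. This is a minimal sketch that assumes the dumps are in the standard HotSpot format, where each thread has a `java.lang.Thread.State: ...` line; `summarize_thread_states` is a hypothetical name, not a QuickBuild tool.

```python
import re
from collections import Counter

# Matches lines such as "   java.lang.Thread.State: BLOCKED (on object monitor)".
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State: (\w+)")


def summarize_thread_states(dump_text):
    """Count how many threads are in each state in a Java thread dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = STATE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example usage (the file name is a placeholder):
# with open("thread_dump.txt") as f:
#     print(summarize_thread_states(f.read()).most_common())
```

A large and growing number of `BLOCKED` or `WAITING` threads across successive dumps would be worth a closer look, though the counts alone do not say what the threads are waiting on.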

Robin Shen [01/Mar/17 12:37 AM]
The log is full of SSL errors, and there is nothing odd in the thread dumps. I'd suggest fixing the SSL errors or removing the build agent to see whether the build still hangs.
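
To see which agents are still producing these errors after a fix, the log can be tallied per agent. The sketch below is an illustration only: `count_handshake_errors` is a hypothetical helper, and it assumes that each relevant log line contains both the word "handshake" and the agent address, which may not match the actual QuickBuild log layout.

```python
from collections import Counter


def count_handshake_errors(lines, agents):
    """Count log lines that mention 'handshake' for each agent address.

    Assumption: the agent address appears on the same line as the error.
    """
    counts = Counter({agent: 0 for agent in agents})
    for line in lines:
        if "handshake" not in line.lower():
            continue
        for agent in agents:
            if agent in line:
                counts[agent] += 1
    return counts


# Example usage (the log file name is a placeholder):
# with open("console.log", errors="replace") as f:
#     print(count_handshake_errors(f, ["cv-web02:8811", "cv-yup02:8811"]))
```

If an agent that was supposedly fixed still shows a non-zero count in logs written after the fix, that agent is a candidate for being stopped or removed as suggested above.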