
[QB-2773] Incorrect / Missing Steps in Build Overview
Created: 04/Aug/16  Updated: 06/Sep/16

Status: Resolved
Project: QuickBuild
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.24

Type: Bug
Priority: Critical
Reporter: J. Mash
Assigned To: Robin Shen
Resolution: Fixed
Votes: 0
Remaining Estimate: Unknown
Time Spent: Unknown
Original Estimate: Unknown

File Attachments: QBConfigsQueued.jpg (JPEG), QBSteps-Correct.jpg (JPEG), QBSteps-Incorrect.jpg (JPEG)

 Description   
Greetings,

I have recently set one of our configurations to build concurrently, and set it up to check on a set schedule (once an hour) and build if there are changes in the referenced repositories. This seems to work as advertised, but I am seeing a display error that only appears to occur when multiple builds are running concurrently.

When a single build is running, the master step displays correctly with all child steps being displayed as they are running (the QBSteps-Correct.jpg is attached for an example of what that would look like).
When multiple builds are running, the master step displays incorrectly and lists no child steps at all, even though they are running (the QBSteps-Incorrect.jpg is attached to show this behavior, and the QBConfigsQueued.jpg is attached to show that everything is in fact running correctly).

Please let me know if I can provide any more information to help track this down.

Thanks,
-J. Mash


 Comments   
Comment by J. Mash [ 04/Aug/16 06:29 PM ]
Attached images for added clarity.
Comment by Robin Shen [ 04/Aug/16 10:35 PM ]
Will child steps be displayed when you refresh the page manually or when the build is finished?
Comment by J. Mash [ 04/Aug/16 10:49 PM ]
The child steps do display correctly under the 'Step Status' page after the build is finished. The child steps do not display under the 'Overview' page while the build is running, regardless of whether I force-refresh, refresh, or even reload the page with a full cache clean.

This issue is not consistent either, meaning that it will occur for some builds and not for others. When it does occur for a build, though, it is consistent for that build until it's complete.

I will gather more data on this and post as I find it, particularly with regard to reproduction steps. This happened twice today, but I didn't note what I did that caused it.

-J. Mash
Comment by Robin Shen [ 04/Aug/16 10:59 PM ]
When it happens, please check the agent log (the agent running the child step) to see if there are failures updating the child step status.
Comment by J. Mash [ 05/Aug/16 12:04 AM ]
Alright, I've got some more information on how to cause this pretty consistently:

  - Manually trigger a build of /root/config
    - There were no builds currently running when I did this.
    - The 'Overview' page correctly displayed the master and child steps.
  - First scheduled build of /root/config was triggered some minutes later while the manual build was still running:
    - The 'Overview' page of the manually triggered build became bugged and only displayed the 'master' step (as indicated in the screenshot).
    - The 'Overview' page of the first scheduled build correctly displayed the master and child steps.
  - Second scheduled build of /root/config was triggered many more minutes later while the first scheduled build was still running:
    - The 'Overview' page of the first scheduled build became bugged and only displayed the 'master' step.
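The pattern in the steps above can be written down as a small model. This is only a sketch of the observed behavior, not of how QuickBuild works internally: the `Build` class, the `affected_builds` helper, and the minute values are all hypothetical, and the rule "a running build's Overview breaks when a later scheduled build of the same configuration starts" is an assumption inferred from these three builds.

```python
from dataclasses import dataclass


@dataclass
class Build:
    name: str
    trigger: str  # "manual" or "scheduled"
    start: int    # minutes since the first trigger (hypothetical values)
    end: int


def affected_builds(builds):
    """Return the names of builds that were still running when a later
    scheduled build of the same configuration started (assumed rule)."""
    affected = []
    for build in builds:
        for other in builds:
            started_during = build.start < other.start < build.end
            if other is not build and other.trigger == "scheduled" and started_during:
                affected.append(build.name)
                break
    return affected


# Timeline mirroring the steps above; the exact minutes are made up.
builds = [
    Build("manual", "manual", 0, 30),
    Build("scheduled-1", "scheduled", 10, 50),
    Build("scheduled-2", "scheduled", 40, 80),
]

result = affected_builds(builds)
print(result)  # ['manual', 'scheduled-1']
```

Under that assumed rule, the model flags the manual build and the first scheduled build, which matches the two Overview pages that lost their child steps.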

The agent running the steps is the same for all of them (children and master), and I don't see any errors or mention of failures to update status in the logs.

-J. Mash
Comment by Robin Shen [ 06/Aug/16 12:48 AM ]
I tried the steps below and everything works fine:

1. Connect two agents to the QB server.
2. Set up root/config with the master step running on "any agent".
3. Add step1, step2, step3, step4 and step5 as child steps of the master step above; each of these child steps runs the sleep command for 10 seconds.
4. Set the build condition of root/config to always and schedule it to run every 60 seconds.
5. Open Firefox and trigger a manual build; the overview screen of the triggered build updates the step status correctly.
6. Open another Firefox tab, wait until the scheduled build of root/config launches, and open the overview screen of the scheduled build; both the previously triggered build and this scheduled build still show the steps updating correctly.

So I suspect there may be some locking on your side, where a resource or required agent for some step of the previous build is occupied by the subsequent running build and blocks the previous build. Can you please verify this?
Comment by J. Mash [ 06/Aug/16 02:01 AM ]
That doesn't really make sense. This isn't a case of a configuration waiting for a resource or node to become available; it's very specific to the display of the defined child steps. Somehow that's being clobbered, but *only* that. Everything else is functioning exactly as it should, just the display is incorrect.

There are nine build nodes in our build farm, three per platform. There are two builds running, each one taking one node per platform (and a secondary job taking the last Windows node). The list of configurations running in the queue page shows nothing waiting on a node for any reason, resource or otherwise. There is only one resource defined for our build farm, which is a simple 'AgentAvailable' resource, and it's consumed anytime a node is tasked with work. The resource page shows one more agent available for darwin, one for linux, and none for windows. Still there is nothing waiting though.
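The node accounting above can be checked with a short sketch. The job names and the `available_agents` helper are hypothetical. Only the counts come from the description: three nodes per platform, two builds taking one node per platform each, and one secondary job on Windows.

```python
from collections import Counter

# Farm as described: nine nodes, three per platform.
capacity = Counter({"darwin": 3, "linux": 3, "windows": 3})

# Hypothetical job names; each job lists the platforms whose nodes it occupies.
running = {
    "build-1": ["darwin", "linux", "windows"],
    "build-2": ["darwin", "linux", "windows"],
    "secondary-job": ["windows"],
}


def available_agents(capacity, running):
    """Subtract one 'AgentAvailable' unit per occupied node, per platform."""
    used = Counter(p for platforms in running.values() for p in platforms)
    return {platform: capacity[platform] - used[platform] for platform in capacity}


free = available_agents(capacity, running)
print(free)  # {'darwin': 1, 'linux': 1, 'windows': 0}
```

The result agrees with what the resource page shows: one free agent each for darwin and linux, and none for windows.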

This is an issue that seems to be 100% reproducible in our configurations, though I doubt you'd be able to replicate our farm and everything to be able to duplicate it on your side. How can I help gather information?

Thanks,
-J. Mash
Comment by J. Mash [ 06/Aug/16 04:17 PM ]
For what it's worth, this appears to be an issue specific to the scheduler triggering the builds, since I can trigger multiple builds manually without the issue.
Comment by Robin Shen [ 07/Aug/16 12:33 AM ]
So I am asking whether you can come up with some simple setup steps (like the ones I listed above) that reproduce this; it would be a lot of help.
Comment by J. Mash [ 06/Sep/16 05:25 AM ]
Resolution confirmed in 6.1.24.
Generated at Sat May 18 06:01:30 UTC 2024 using JIRA 189.