AlSt [25/Sep/14 08:00 AM]
The attached log shows a build starting, followed by many further builds being started before the initial build is stopped.
Is it possible for you to attach a simple QB database demonstrating this issue and let me know the reproducing steps?
Reproducing it is not that easy, as we only see it on this one node, but the pattern is very clear:

1. A long-running build starts (one resource on that node, no parallel runs).
2. The server loses its connection to the agent and, after some timeout, simply cancels the build on the server side.
3. The build is still running on the node.
4. The server-agent connection is reestablished and the server triggers fresh builds that want to use that node.

BUG: there is still a build running on that node, and the wrapper is aware of that, as it tries to write back the log when it finally finishes. The server does not accept the log and produces an error, as it has already canceled the build.

The state of the agent is completely ignored when the server-agent connection is reestablished. The server does not ask the agent whether anything is still running; it just assumes that nothing is and triggers a fresh build.

In the log I have attached you can see this with build ID 1094608. This build runs into the issue, then a reconnect happens and the agent just starts fresh builds as they appear in the queue. If the dependency build for those had not been broken at that time, the builds would simply have failed, as the resources were already in use by build 1094608, which was still running and only stopped at the end of its run.

The agent just has to check whether it is running builds before triggering new ones on a reconnect.
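To make the requested behaviour concrete, here is a minimal sketch of the reconnect check described above. It is not QuickBuild code: the `Agent` and `Server` classes, the `on_reconnect` method, the node name and the queued build IDs 1094609 and 1094610 are all hypothetical and only illustrate the idea that the server asks the agent what is still running before it dispatches anything new.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for a build agent (node); not QuickBuild's API."""
    name: str
    # IDs of builds that are still executing on the node
    running_builds: set = field(default_factory=set)


@dataclass
class Server:
    """Hypothetical stand-in for the server-side scheduler."""
    # IDs of builds waiting for the node
    queue: list = field(default_factory=list)
    # node name -> IDs of builds that still occupy it
    busy_nodes: dict = field(default_factory=dict)

    def on_reconnect(self, agent):
        """Ask the agent what is still running before dispatching new builds."""
        still_running = set(agent.running_builds)
        if still_running:
            # The node's resources are still in use, so hold the queue back.
            self.busy_nodes[agent.name] = still_running
            return []
        # Nothing is running any more: release the node and dispatch the queue.
        self.busy_nodes.pop(agent.name, None)
        dispatched, self.queue = self.queue, []
        return dispatched


# Build 1094608 was canceled on the server side but is still running on the node.
server = Server(queue=[1094609, 1094610])  # hypothetical queued builds
agent = Agent("node-1", running_builds={1094608})

# Reconnect while the old build is still running: nothing is dispatched.
first = server.on_reconnect(agent)

# After the old build has finished on the node, the queue is dispatched.
agent.running_builds.clear()
second = server.on_reconnect(agent)
```

With a check like this, the queued builds would wait until build 1094608 has really finished on the node instead of being started next to it. How the server should treat the late log of the canceled build is a separate question that this sketch does not address.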