More help for gridpack failure
Hi all,
I have again run into the following error while trying to generate events from a gridpack that appeared to have been generated without any issues:
internal.
The gridpack was generated in cluster mode on my local batch system. I guess that this is a result of failed subjobs on the cluster.
Now, I do appreciate that it's almost impossible for you guys to test every setup on every type of batch system to work out why this is not caught by the internal cluster bookkeeping (which is much better than it used to be, thanks!). But I do have a request for two changes, which I think are relatively simple, that would help in this situation:
1) Test the gridpack before it is closed.
- Presumably the fact that a bad channel is present could be determined before the gridpack is tarred up and sent off for event generation? This would at least give an earlier warning of the problem.
2) Give instructions for (re)submitting single failed jobs by hand.
- When a bad channel is found, would it be possible to print out what command should be run to resubmit the failed job? Looking at cluster.py, it seems to me that this information could be attached to the cluster object by job id and then printed out when a bad channel is found.
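To make the two requests above concrete, here is a rough sketch of the kind of bookkeeping I have in mind. It is not based on the actual interface of cluster.py: the class `JobLedger`, its methods, and the idea of identifying a bad channel by a missing or empty output file are all assumptions made purely for illustration.

```python
import shlex
from pathlib import Path


class JobLedger:
    """Hypothetical bookkeeping: remembers how each subjob was launched."""

    def __init__(self):
        # job id -> (command as argv list, working directory, expected output file)
        self._jobs = {}

    def record(self, job_id, argv, cwd, expected_output):
        """Store the launch details at submission time."""
        self._jobs[job_id] = (list(argv), Path(cwd), Path(expected_output))

    def failed_jobs(self):
        """Return the ids of jobs whose expected output is missing or empty."""
        bad = []
        for job_id, (_argv, cwd, expected) in self._jobs.items():
            out = cwd / expected
            if not out.is_file() or out.stat().st_size == 0:
                bad.append(job_id)
        return bad

    def resubmit_command(self, job_id):
        """Build a shell line that reruns one job by hand."""
        argv, cwd, _expected = self._jobs[job_id]
        command = " ".join(shlex.quote(arg) for arg in argv)
        return "cd {} && {}".format(shlex.quote(str(cwd)), command)


def check_before_closing(ledger):
    """Print a resubmission hint for every bad job; return True if all is well."""
    bad = ledger.failed_jobs()
    for job_id in bad:
        print("Job {} looks bad, rerun with: {}".format(
            job_id, ledger.resubmit_command(job_id)))
    return not bad
```

Something like `check_before_closing` could run just before the gridpack is tarred up, so that a bad channel produces a warning and a command to rerun at that point rather than at event generation time.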
Thanks a lot,
Josh.
Question information
- Language: English
- Status: Solved
- Assignee: Rikkert Frederix
- Solved by: Rikkert Frederix