timeout error during card update (and eventual pdf copying)
Hi,
We have recently been seeing a lot of crashes in ATLAS production with errors like the one pasted at the bottom of this message.
We (ATLAS) discussed a similar issue in 708223, but this particular one is fairly urgent at the moment, so I wanted to discuss it separately.
From the stack trace, it looks to me as if missing LHAPDF sets are copied during a card update/consistency check, and that this copy eventually runs into a timeout, which crashes the job. It also looks like only 20 s are allowed for this step to finish.
If that's the case, it's no surprise that this times out: we have seen that copying files from the cvmfs file system (where the LHAPDF data lives) sometimes takes longer than that.
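To illustrate the suspected mechanism, here is a minimal sketch (not MG5_aMC code) of how a timer around an interactive question can abort a subprocess that happens to run inside it. It assumes the timeout is implemented with a Unix alarm signal, which the `raise TimeOutError` inside `os.waitpid` in the traceback suggests but does not prove. The names `timed_call`, `slow_copy` and `demo` are hypothetical, and `TimeOutError` is a local stand-in for the exception in the log.

```python
import signal
import subprocess
import sys


class TimeOutError(Exception):
    """Local stand-in for the exception named in the traceback."""


def _on_alarm(signum, frame):
    # Runs in the main thread when the timer expires, wherever the code is.
    raise TimeOutError


def timed_call(fct, timeout):
    """Run fct(), aborting it if it takes longer than `timeout` seconds (Unix only)."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        return fct()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
        signal.signal(signal.SIGALRM, old_handler)


def slow_copy(seconds):
    """Hypothetical stand-in for a slow file copy from a network file system."""
    code = f"import time; time.sleep({seconds})"
    return subprocess.call([sys.executable, "-c", code])


def demo():
    try:
        # The "copy" needs 2 s, but the surrounding timer allows only 0.2 s.
        timed_call(lambda: slow_copy(2.0), timeout=0.2)
    except TimeOutError:
        return "timed out"
    return "finished"
```

If this picture is right, the timer was meant to limit how long the card-editing question waits for an answer, and the PDF copy is only caught by it because it runs inside that question's post-command step.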
What would be the best way to fix this issue? Is there something we can do immediately on our end or do we need a patch for MG5_aMC to become more patient with the filesystem?
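One thing that could be tried on the production side, as a sketch only: copy the required PDF set to a local directory, with retries, before the generation step starts, so that the slow file system access happens outside any timed step. Whether MG5_aMC then skips its own copy depends on how it locates PDF sets, which is not confirmed here. The function name `stage_directory` and all paths passed to it are hypothetical.

```python
import shutil
import time
from pathlib import Path


def stage_directory(src, dst, attempts=3, wait_seconds=5.0):
    """Copy directory `src` to `dst`, retrying on I/O errors. Returns `dst`."""
    src, dst = Path(src), Path(dst)
    if dst.is_dir():
        return dst  # already staged by an earlier call
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Copy into a temporary name first so a half-finished copy is never used.
    tmp = dst.with_name(dst.name + ".partial")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            shutil.rmtree(tmp, ignore_errors=True)
            shutil.copytree(src, tmp)
            tmp.rename(dst)
            return dst
        except OSError as err:  # shutil.Error is a subclass of OSError
            last_error = err
            if attempt < attempts:
                time.sleep(wait_seconds)
    raise RuntimeError(f"could not stage {src} after {attempts} attempts") from last_error
```

Copying to a `.partial` name and renaming at the end means a job that is killed mid-copy does not leave behind a directory that looks complete.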
Cheers,
Hannes
---
generate 22:32:27 Py:MadGraphUtils ERROR Command "generate_events run_01" interrupted with error:
generate 22:32:27 Py:MadGraphUtils ERROR TimeOutError :
generate 22:32:27 Py:MadGraphUtils ERROR Please report this bug on https:/
generate 22:32:27 Py:MadGraphUtils ERROR More information is found in 'ME5_debug'.
generate 22:32:27 Py:MadGraphUtils ERROR Please attach this file to your report.
generate 22:32:27 Py:MadGraphUtils ERROR MadGraph5_aMC@NLO appears to have crashed. Debug file output follows.
generate 22:32:27 Py:MadGraphUtils ERROR #******
generate 22:32:27 Py:MadGraphUtils ERROR #* MadGraph5_
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* * * *
generate 22:32:27 Py:MadGraphUtils ERROR #* * * * * *
generate 22:32:27 Py:MadGraphUtils ERROR #* * * * * 5 * * * * *
generate 22:32:27 Py:MadGraphUtils ERROR #* * * * * *
generate 22:32:27 Py:MadGraphUtils ERROR #* * * *
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* VERSION 3.5.1 2023-07-11 *
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* The MadGraph5_aMC@NLO Development Team - Find us at *
generate 22:32:27 Py:MadGraphUtils ERROR #* https:/
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #******
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* Command File for MadEvent *
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #* run as ./bin/madevent.py filename *
generate 22:32:27 Py:MadGraphUtils ERROR #* *
generate 22:32:27 Py:MadGraphUtils ERROR #******
generate 22:32:27 Py:MadGraphUtils ERROR generate_events run_01
generate 22:32:27 Py:MadGraphUtils ERROR Traceback (most recent call last):
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return self.onecmd_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return func(arg, **opt)
generate 22:32:27 Py:MadGraphUtils ERROR File "/builds/
generate 22:32:27 Py:MadGraphUtils ERROR switch_mode = self.ask_
generate 22:32:27 Py:MadGraphUtils ERROR File "/builds/
generate 22:32:27 Py:MadGraphUtils ERROR self.ask_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR self.ask_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR out = ask(question, '0', possible_answer, timeout=
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR value = Cmd.timed_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR result = fct(question)
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return self.cmdloop()
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR super(SmartQues
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR stop = self.postcmd(stop, line)
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR self.do_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR self.update_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR mecmd.copy_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR self.install_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return self.install_
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR misc.call([getdata, 'install', filename], cwd = pdfsets_dir)
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return f(arg, *args, **opt)
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return subprocess.
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return p.wait(
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR return self._wait(
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR (pid, sts) = self._try_wait(0)
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR (pid, sts) = os.waitpid(
generate 22:32:27 Py:MadGraphUtils ERROR File "/cvmfs/
generate 22:32:27 Py:MadGraphUtils ERROR raise TimeOutError
generate 22:32:27 Py:MadGraphUtils ERROR madgraph.
Question information
- Language: English
- Status: Answered
- Assignee: No assignee