Is it possible to skip or speed up zipping Pythia files?
We are starting large-scale pseudodata generation on a cluster and finding that two phases of the simulation take a surprisingly long time: "merging results from the split PY8 runs..." and "storing pythia8 files of previous run". Looking at the code that performs the merging, it appears to strip some leading and some trailing lines using sed, at least in the version I have. In a simple test, doing the same job with the head and tail commands was at least an order of magnitude faster. Is there any impediment to using those commands instead of sed in this context?
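For concreteness, here is a minimal sketch of the two approaches being compared. It is not the actual merging code: the function names, the throwaway test file and the line counts are all hypothetical, and the negative count passed to `head -n` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not taken from the merging code.

# Drop the first $2 and the last $3 lines of file $1 using tail and head.
# Assumption: GNU head, which accepts a negative line count ("all but the last N").
strip_head_tail() {
    tail -n "+$(( $2 + 1 ))" "$1" | head -n "-$3"
}

# Same result with sed: print only the line range that should be kept.
strip_sed() {
    total=$(wc -l < "$1")
    sed -n "$(( $2 + 1 )),$(( total - $3 ))p" "$1"
}

# Small self-check on a throwaway 10-line file: both methods should agree.
tmp=$(mktemp)
seq 1 10 > "$tmp"
strip_head_tail "$tmp" 2 3 > "$tmp.a"
strip_sed "$tmp" 2 3 > "$tmp.b"
cmp -s "$tmp.a" "$tmp.b" && echo "outputs match"
rm -f "$tmp" "$tmp.a" "$tmp.b"
```

Wrapping each call in `time` on a real multi-gigabyte output file would show whether the speed difference holds for a given setup; the sketch only checks that the two methods give identical output.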
Also, our current analysis does not actually use the information in the PY8 .hepmc file, so we delete it shortly after the run completes. Is there a straightforward way to bypass the gzip call that takes so long at the "storing pythia8 files..." step? This would significantly speed up our usage of the program: in cluster mode these two steps seem to take an order of magnitude more time than the rest of the run, because both are serial processes.
Thanks for any insights!
Question information
- Language: English
- Status: Solved
- Assignee: Valentin Hirschi
- Solved by: William Shepherd