condor batch system - condor_q

Asked by Sanjay Padhi

Hi

For a large number of jobs, calling condor_q can sometimes be expensive.
Is there a way to check the job status less frequently than in the following printouts?

 Idle: 2787 Running: 581 Finish: 2080
 Idle: 2736 Running: 577 Finish: 2135
 Idle: 2640 Running: 626 Finish: 2182
 Idle: 2626 Running: 584 Finish: 2238
 Idle: 2575 Running: 584 Finish: 2289
 Idle: 2537 Running: 581 Finish: 2330
 Idle: 2480 Running: 596 Finish: 2372
 Idle: 2437 Running: 600 Finish: 2411
 Idle: 2405 Running: 586 Finish: 2457

Thanks, Sanjay

Question information

Language: English
Status: Solved
For: MadGraph5_aMC@NLO
Assignee: No assignee
Solved by: Olivier Mattelaer
Best Olivier Mattelaer (olivier-mattelaer) said :
#1

Hi Sanjay,

This is not supported as an option at the moment.
I can add it in the next version.

Otherwise you can modify the file cluster.py, found at
madgraph/various/cluster.py
or
bin/internal/cluster.py (in an already existing SubProcesses directory).

Around line 150, you have:
    @check_interupt()
    def wait(self, me_dir, fct):
        """Wait that all job are finish"""

        while 1:
            idle, run, finish, fail = self.control(me_dir)
            if fail:
                raise ClusterManagmentError('Some Jobs are in a Hold/... state. Please try to investigate or contact the IT team')
            if idle + run == 0:
                time.sleep(20) #security to ensure that the file are really written on the disk
                logger.info('All jobs finished')
                break
            fct(idle, run, finish)
            time.sleep(30)
        self.submitted = 0
        self.submitted_ids = []

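You can change the number 30 in the final time.sleep(30) call to any (integer) number. It is the delay, in seconds, between two status checks, so a larger value means condor_q is called less often.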
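
To illustrate the idea outside of MadGraph, here is a minimal standalone sketch of the same polling loop with the delay exposed as a parameter. This is not the code from cluster.py: the names wait_for_jobs, poll_interval, settle_time and the injectable sleep argument are hypothetical, and control stands in for whatever queries the batch system.

```python
import time


def wait_for_jobs(control, report, poll_interval=30, settle_time=20,
                  sleep=time.sleep):
    """Poll control() until no job is idle or running.

    control:       callable returning (idle, run, finish, fail) counts
    report:        callable invoked with (idle, run, finish) after each poll
    poll_interval: seconds between two polls (the 30 in the loop above)
    settle_time:   extra wait once all jobs are done (the 20 in the loop above)
    sleep:         sleep function, replaceable for testing (hypothetical hook)
    """
    while True:
        idle, run, finish, fail = control()
        if fail:
            raise RuntimeError('Some jobs are in a held or failed state')
        if idle + run == 0:
            sleep(settle_time)  # give the file system time to write the output
            break
        report(idle, run, finish)
        sleep(poll_interval)  # a larger value means fewer status queries


if __name__ == '__main__':
    # Fake batch system: the third poll reports that every job has finished.
    states = iter([(2, 1, 0, 0), (0, 2, 1, 0), (0, 0, 3, 0)])
    wait_for_jobs(lambda: next(states),
                  lambda i, r, f: print(f'Idle: {i} Running: {r} Finish: {f}'),
                  poll_interval=0.01, settle_time=0)
```

With poll_interval=300, for example, the status would be queried once every five minutes instead of every 30 seconds.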

Cheers,

Olivier

Sanjay Padhi (sanjay-padhi) said :
#2

Thanks Olivier Mattelaer, that solved my question.