how to debug a ladon script?

Asked by lgm

hello,

i have a very strange behaviour and i am not able to know where to look and search.

i have 2 machines with the same configuration debian/apache/etc. and using them with ladon 0.7.2 over apache (python2.6/dist-packages/ladon-0.7.2-py2.6.egg)

i have a very strange behaviour today as a ladon script that worked as WS on one of this machine, totally hang on the other. Same script, same md5sum, i can not get the point. (To my mind it used to work)

so please, is there any way to activate some loging or some deeper debug so that i can have an idea of what is happening under the hood? This is very weird... thanks

Question information

Language:
English Edit question
Status:
Expired
For:
ladon Edit question
Assignee:
No assignee Edit question
Last query:
Last reply:
Revision history for this message
jsgaarde (jakob-simon-gaarde) said :
#1

Hi lgm
I would use a logger if I were you if you want to mimic the real world situation as much as possible. You can of course start modify your handler to run with pdb and then start apache in single process mode:

# /usr/sbin/apache2 -DONE_PROCESS

If you choose to go with inserting logging in your service method maybe this can help:

>>> import ladon.tools.log as log
>>> log.info('Log something!')
>>>
jakob@web-misc:~$ cat /tmp/ladon_log
2013-07-06 07:01:02,080 - Log something!

Best regards Jakob Simon-Gaarde

Revision history for this message
lgm (lgm31) said :
#2

ok, so i have checked and continue to have very weird results. on one machine the ladon page that provide data about services, never appears, and load indefinitely. If i asked for soap contract, indefinitely wiating also.

i thought that the two systems where exactly configured on the same basis, but it looks like there is a small difference. See below the line about Mutex when lauching Apache2.

so i have added a timer output at the end of MyWs.py script, as it looks like on one machine it do not end. And a function example() just to do process test data. Module 3, do some import of two other modules needed and that also contains some Ladonized objects.

 import datetime
 a = datetime.datetime.now()
 myWs = MyWs()
 b = datetime.datetime.now()
 print("Module 3 (MyWS): loaded: %s." % (b-a))

i can test it with python, on the machine where it works:

$ python
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import MyWS
Module 1: loaded: 0:00:00.035944.
Features for module 2: loaded: 0:00:04.999658.
Module 2: loaded: 0:00:06.887894
Module 3 (MyWS): loaded: 0:00:00.077702.
>>> MyWS.example()
... # ok everything fine

on the machine that hangs:

$ python
Python 2.6.6 (r266:84292, Dec 27 2010, 00:02:40)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import MyWS
Module 1: loaded: 0:00:00.027162.
Features for module 2: loaded: 0:00:02.794142.
Module 2: loaded: 0:00:03.879293
Module 3 (MyWS): loaded: 0:00:00.054390.
>>> MyWS.example()
... # ok, everything is fine here too... so this not directly a python problem

but when querying from Apache/Ladon, i can see inside my apache log on the machine that never reply:

$ service apache2 start

=>

$ tail -f /var/log/apache2/error.log

[Mon Jul 08 14:07:14 2013] [notice] Apache/2.2.16 (Debian) mod_wsgi/3.3 Python/2.6.6 configured -- resuming normal operations
[Mon Jul 08 14:07:18 2013] [error] [client xxx.xxx.xxx.xxx] File does not exist: /var/www/taln/favicon.ico
[Mon Jul 08 14:07:34 2013] [error] Module 1: loaded: 0:00:00.035079.
[Mon Jul 08 14:07:37 2013] [error] Features for module 2: 0:00:02.988823.
[Mon Jul 08 14:07:38 2013] [error] Module 2: 0:00:03.273715

... # here nothing happen in the web page, whatever i wait, nothing appears inside
and then nothing never come after, while i should see: Module 3 (MyWS): loaded: <timing>.
the browser is waiting endlessly for loading the page.

if i decide to do "service apache2 stop" i then get :

[Mon Jul 08 14:11:44 2013] [warn] child process 10497 still did not exit, sending a SIGTERM
[Mon Jul 08 14:11:46 2013] [warn] child process 10497 still did not exit, sending a SIGTERM
[Mon Jul 08 14:11:48 2013] [warn] child process 10497 still did not exit, sending a SIGTERM
[Mon Jul 08 14:11:50 2013] [error] child process 10497 still did not exit, sending a SIGKILL
[Mon Jul 08 14:11:51 2013] [notice] caught SIGTERM, shutting down

so ti looks like something is lost somewhere, but why ? any clue of something wrong?

on the machine, where it works, everything went fine see below, and i get the expected Module 3 (MyWS): loaded: <timing>. Ladon webservice worked and so on...

BTW, there is also something about Mutex, i do not know what it is, and if this could explain the problem. I do not know where it comes from as i think everything where installed the same way (i'm now trying to understand this)

[Mon Jul 08 11:40:44 2013] [error] python_init: Python version mismatch, expected '2.6.5+', found '2.6.6'.
[Mon Jul 08 11:40:44 2013] [error] python_init: Python executable found '/usr/bin/python'.
[Mon Jul 08 11:40:44 2013] [error] python_init: Python path being used '/usr/lib/python2.6/:/usr/lib/python2.6/plat-linux2:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload'.
[Mon Jul 08 11:40:44 2013] [notice] mod_python: Creating 8 session mutexes based on 6 max processes and 25 max threads.
[Mon Jul 08 11:40:44 2013] [notice] mod_python: using mutex_directory /tmp
[Mon Jul 08 11:40:44 2013] [notice] Apache/2.2.16 (Debian) mod_python/3.3.1 Python/2.6.6 mod_wsgi/3.3 mod_perl/2.0.4 Perl/v5.10.1 configured -- resuming normal operations
[Mon Jul 08 11:41:01 2013] [error] Module 1: loaded: 0:00:00.034712.
[Mon Jul 08 11:41:06 2013] [error] Features for module 2: loaded: 0:00:04.782057.
[Mon Jul 08 11:41:07 2013] [error] Module 2: loaded: 0:00:05.798397
[Mon Jul 08 11:41:08 2013] [error] Module 3 (MyWS): loaded: 0:00:00.081700.
[Mon Jul 08 11:41:08 2013] [error] [client xxx.xxx.xxx.xxx] File does not exist: /var/www/mysite/favicon.ico

Revision history for this message
lgm (lgm31) said :
#3

ok, so i change the configuration, i now have the same things about Mutex on both machines... but one works , while the other do not. The python script is the same, it works without Ladon. It works on one but not on the other. I can not understand what is wrong and where i should look :/

the weird thing i forgot to mention, that is that if i ladonize through apache only the module 1 and 2 separately (as they both contains some ladon exposed processing) it works on both machines !

when i use the one that wrap everything, it works only on one of the two machines...

(the machine where it waits indefinitely have now:

[Mon Jul 08 15:42:17 2013] [error] python_init: Python version mismatch, expected '2.6.5+', found '2.6.6'.
[Mon Jul 08 15:42:17 2013] [error] python_init: Python executable found '/usr/bin/python'.
[Mon Jul 08 15:42:17 2013] [error] python_init: Python path being used '/usr/lib/python2.6/:/usr/lib/python2.6/plat-linux2:/usr/lib/python2.6/lib-tk:/usr/lib/python2.6/lib-old:/usr/lib/python2.6/lib-dynload'.
[Mon Jul 08 15:42:17 2013] [notice] mod_python: Creating 8 session mutexes based on 6 max processes and 25 max threads.
[Mon Jul 08 15:42:17 2013] [notice] mod_python: using mutex_directory /tmp
[Mon Jul 08 15:42:17 2013] [notice] Apache/2.2.16 (Debian) mod_python/3.3.1 Python/2.6.6 mod_wsgi/3.3 configured -- resuming normal operations
)

Revision history for this message
lgm (lgm31) said :
#4

can my hanging be due to a trouble with accessing some ressources like file or databases? i have seen that there could be some trouble to accessing at the same time to a ressource? i do not have use any mutex exclusion, should i process this way?

is there any place with a sample code about this?

in the buggy scirpt i do many import like

import module1
import module2
import module3

each of it contains one class providing ladonized method, and instanciating some, but will also construct object from some loading data from filesystem or databases.

(my apache is set as mpm-worker)

Revision history for this message
lgm (lgm31) said :
#5

well... this time i do not know where to look. And what may happen :/

trying to enclose the import with

from threading import Lock
mutex = Lock()
mutex.acquire()
import MyWS
mutex.release()

do not change anything.

also the handler.py where like this:

from ladon.server.wsgi import LadonWSGIApplication
from os.path import abspath, dirname

#services = ['module1', 'module2', 'module4', 'module5', ] # all these works!
services = ['MyWS', ] # this one hang on one machine but not the other :/
application = LadonWSGIApplication(
  services,
  [dirname(abspath(__file__)), '/home/me/pylibs/', '/home/me/pylibs/otherlib'],
  catalog_name='webservice catalog',
  catalog_desc='Some webservices!')

the thing that drive me crazy is that it works on one machine, and not the other, that if i consider the buggy module (say here 'MyWS' / module3, alone it works on this machine, and not this other. And that using the part without loading this module everything work on both!

Revision history for this message
Launchpad Janitor (janitor) said :
#6

This question was expired because it remained in the 'Open' state without activity for the last 15 days.