Question #145191 “Some helpful tips” : Questions : PySQLPool

Nikoleta Verbeck (nerdynick) said on 2011-02-13:

#1

So reading this makes me a little curious. Did the docs not help out at all, or did you just not manage to find them? If you didn't find them, they are located at http://packages.python.org/PySQLPool.

If you check the commit logs (Google Code), this is still an active project. Currently I'm looking at migrating everything away from Launchpad/Google Code to GitHub, as I find it a lot easier for others to contribute to, and it allows for better forking. The other reason is that I don't find the Launchpad interface intuitive enough, and I think people just get lost in it.

As far as the function naming goes, I too have a blueprint/story slated for the next release to redo a lot of that, including an adaptor so that older implementations can still use it.

So there are a few corrections to your list.

1) Calling getNewPool() does actually return an object that you can operate on. Granted, in most situations no one ever needs to use it. You can see this with terminatePool(), commitPool(), and cleanupPool().

2) getNewQuery() doesn't create a new pool per se. The core of PySQLPool uses the Borg pattern to manage all the connections and allow for multi-threaded support without needing to pass objects around. The PySQLQuery object that is returned from getNewQuery() is more of a class to help manage everything around a query or set of queries. It makes the calls to the pool layer to get an open connection, handles parsing and building escaped queries, and provides helper functions to fetch needed information about the last performed queries. In a future release it will also have everything needed to assist in transaction-based querying.
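For readers who haven't met the Borg pattern mentioned above, here is a minimal sketch of the general idea. It is not PySQLPool's actual code: `PoolHandle`, `register`, and the `connections` attribute are hypothetical names made up for illustration. Each instance is a distinct object, but all instances share one attribute dictionary, so state set through one handle is visible through every other.

```python
class Borg:
    """Every instance shares one attribute dictionary."""

    _shared_state = {}

    def __init__(self):
        # Point this instance's attributes at the shared dictionary.
        self.__dict__ = self._shared_state


class PoolHandle(Borg):
    """Hypothetical stand-in for a pool object (not PySQLPool's real class)."""

    def __init__(self):
        super().__init__()
        # Create the shared connection table only the first time.
        self.__dict__.setdefault("connections", {})

    def register(self, key, conn):
        self.connections[key] = conn


a = PoolHandle()
b = PoolHandle()
a.register("db1", "connection-object")

print(a is b)         # False: two separate handle objects
print(b.connections)  # {'db1': 'connection-object'}: same shared state
```

This is why a "new" handle does not mean a new pool: constructing another object only gives you another way to reach the same shared state.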

If you are willing to help out, I do have a list of Blueprints that have been mapped out. They might solve a lot of the problems you are having as well as provide a large set of new features. Feel free to shoot me any patches you may have as well. I'll get them into the release/code base ASAP. I'll also look towards adding you as a committer after a few patches.

Eric Heller (eheller) said on 2011-02-13:

#2

Yes, I did find the documentation, but as I said, it still left me pretty baffled, to be perfectly honest.

I'm sorry if you took this the wrong way; I'm not trying to make any digs at you. For what the package is, it works fine once you understand how to use it correctly. So I'm not criticizing the code per se, just the way it's organized and the confusion it induces.

I didn't look at the commit logs on Google Code, sorry! There's stuff everywhere. Launchpad, Google Code: maybe pick one? Or, as you said, put the project on GitHub. GitHub is great!

1) Calling getNewPool() does actually return an object that you can operate on.

Yes, I see that you're right. I didn't realize the PySqlPool class keeps a static dictionary of information, and that creating an instance of the class more or less acts as an interface to the class's static information.

That's still a little confusing though, because it's not really a "new" Pool, it's a new object that lets you interface with /the/ Pool.

2. Right, again, I didn't read deep enough into the code. I saw the line `self.Pool = PySQLPool()` in the __init__ method of PySqlQuery and assumed this was creating a new Pool. But (as above) I see that it's not a new pool per se, just an object that interfaces with the pool.

I'm glad there are plans for better transaction support in the future on the 'Query' layer, instead of on the Pool layer. The current arrangement has made me create a pretty ugly work-around.

3. I'd be willing to help out if the direction of the project, as mapped out in your blueprints, points in the same general direction as my own needs. Otherwise, it makes more sense to spend my time either writing my own pooling layer from scratch or adapting your code to fit me better.

Anyway, thanks for the work.

Eric

Eric Heller (eheller) said on 2011-02-13:

#3

Oh, I forgot to mention a couple of small things.

1. You have access to the escape_string() function from MySQLdb on the PySqlQuery object, but I really need to get at real_escape_string(), which is dependent on the connection to the MySQL server (e.g., if `conn` is a MySQLdb Connection, then mysql_real_escape_string() is equivalent to `conn.escape_string()`). The mysql_real_escape_string() function is dependent on the character encoding used by the server. There needs to be some easier way to get at this. For now I've had to hack the PySqlQuery object, call _GetConnection manually, access self.conn.escape_string(), and then let the connection go back to the pool. Too hacky.

2. It should be noted that in the PySQLConnectionManager, when you make a call to TestConnection(), the code executes the following to determine if the connection is 'alive':

cursor = self.connection.cursor(MySQLdb.cursors.DictCursor)
cursor.execute('select current_user')

However, if you HAVE lost connection, calling cursor.execute(..) has the potential to block for a very, very long time (forever, as far as I know) until the connection is re-established, /depending/ on how and when the connection was lost. Furthermore, I don't think a call to .execute() has any sort of timeout in MySQLdb. It may not raise an exception either.

An example of when this occurs is if you lose network connectivity altogether. The call cursor.execute('select current_user') will just continue to block until network connectivity is restored.

This is probably not an issue for most people, and even for those it affects, probably not very often, but it needs a graceful solution. Unfortunately, the application I'm developing will be permanently sitting behind a flaky wireless network connection, so this is an issue for me :D

There needs to be a way to set a timeout when attempting to test the connection. In the past, I've done this by spawning a separate thread for the "Test Alive" task, killing the thread if it doesn't terminate in X seconds, and throwing a Timeout exception. It may not be the best way to handle the issue, but it does work. Maybe you have some ideas?

3. A similar issue is cleaning up dead connections. It seems that, for now, this only happens when you explicitly call cleanupPool(), or when you try to create a new PySqlQuery and there are already maxActiveConnections, in which case it may cull one of those dead connections so that the query gets a new, live connection.

Again, this behaviour can have blocking problems for me (as above).

One option would be to introduce a separate Reaper thread that automatically tests and cleans up the dead connections.
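To make the timeout idea from point 2 concrete, here is a rough sketch of running a liveness check in a helper thread and giving up after a deadline. The names `run_with_timeout` and `CheckTimeout` are made up for this example and are not part of PySQLPool or MySQLdb; `check` stands for any callable, such as one that runs the 'select current_user' query. Note that Python cannot forcibly kill a thread, so this sketch abandons the worker as a daemon thread instead of killing it.

```python
import threading


class CheckTimeout(Exception):
    """Raised when the liveness check does not finish in time."""


def run_with_timeout(check, timeout_seconds):
    """Run check() in a helper thread; raise CheckTimeout if it takes too long."""
    result = {}

    def worker():
        try:
            result["value"] = check()
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    # Daemon thread: a check that blocks forever will not keep the process alive.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_seconds)

    if thread.is_alive():
        # The worker is still blocked. It is abandoned here, not killed.
        raise CheckTimeout("check did not finish in %s seconds" % timeout_seconds)
    if "error" in result:
        raise result["error"]
    return result["value"]


if __name__ == "__main__":
    never = threading.Event()  # never set, so wait() simulates a hung query
    try:
        run_with_timeout(never.wait, 0.2)
    except CheckTimeout as exc:
        print("treating connection as dead:", exc)
```

The trade-off is that a hung check leaves its thread (and the underlying socket) behind until the blocking call returns, so the caller should treat that connection as dead and not hand it out again.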

That's all for now.

Nikoleta Verbeck (nerdynick) said on 2011-02-13:

#4

If you've got some suggestions on where and what in the docs I should cover more, I'd really like to hear them. I will also look at finding a way of integrating a lot of the notes from this thread in there as well.

1) I do have a patch that is supposed to resolve the mysql_real_escape_string() issue. I just haven't managed to get around to testing it, and everything else that was provided within the patch, so that I can commit it and put out a minor release. The patch allows calling escape_string() on a PySQLQuery object. It does exactly what your workaround is doing.

2) I actually didn't know about this. I will have to find a way of determining whether a connection is alive other than attempting to run a query. I would like to avoid running another thread to test this, since it could lead to slower processing times.

3) I am actually working on planning out a way of using a Reaper thread to handle this.
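As a starting point for discussion, a reaper thread along the lines of point 3 could look roughly like the sketch below. This is not PySQLPool code: the `Reaper` class, the dictionary of connections, and the `check` callable are all assumptions made for illustration. The liveness checks run outside the lock so a slow check does not stall other users of the table.

```python
import threading


class Reaper(threading.Thread):
    """Periodically drops entries whose liveness check fails (illustrative only)."""

    def __init__(self, connections, check, interval_seconds, lock=None):
        super().__init__(daemon=True)
        self.connections = connections   # dict mapping key -> connection object
        self.check = check               # callable(conn) -> True if alive
        self.interval = interval_seconds
        self.lock = lock or threading.Lock()
        self._stop_event = threading.Event()

    def reap_once(self):
        """Test every connection once and remove the dead ones."""
        with self.lock:
            items = list(self.connections.items())
        # Run the checks without holding the lock.
        dead = [key for key, conn in items if not self.check(conn)]
        with self.lock:
            for key in dead:
                self.connections.pop(key, None)
        return dead

    def run(self):
        # wait() returns False on timeout and True once stop() has been called.
        while not self._stop_event.wait(self.interval):
            self.reap_once()

    def stop(self):
        self._stop_event.set()


if __name__ == "__main__":
    # Stand-in "connections": True means alive, False means dead.
    table = {"a": True, "b": False}
    reaper = Reaper(table, check=lambda conn: conn, interval_seconds=0.1)
    print(reaper.reap_once())  # ['b']
    print(table)               # {'a': True}
```

In a real pool the `check` callable would itself need a timeout, otherwise the reaper just moves the blocking problem from point 2 into a background thread.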

Launchpad Janitor (janitor) said on 2011-02-28:

#5

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Nikoleta Verbeck (nerdynick) said on 2011-03-14:

#6

You will be happy to know I have finally finished getting the GitHub base repo up and running. I'm still in the middle of trying to move everything over to it, but all code has been moved, along with all open branches. Issues and blueprints will either move into GitHub or into a Jira; I haven't fully decided on that yet.

URL: https://github.com/nerdynick/PySQLPool

Nikoleta Verbeck (nerdynick) said on 2011-03-14:

#7

Updating to Answered so it's tracked in the upfront search results.
