OpenStack Object Storage (swift)

SQLite and .ts maintenance with deletes

Asked by André Cruz on 2013-01-04

I have a container that receives frequent PUTs/DELETEs. Over time I've noticed that the filesystem has accumulated a lot of empty .ts files and the SQLite file has grown a lot.

The .ts files are not being reclaimed after 1 week, as they should be, and it seems to me that data is never deleted from the SQLite file. How can I deal with these 2 issues?

Thanks.

Question information

Language:: English

Status:: Solved

For:: OpenStack Object Storage (swift)

Assignee:: No assignee

Solved by:: Chuck Thier

Solved:: 2013-01-15

Last query:: 2013-01-15

Last reply:: 2013-01-15

Chuck Thier (cthier) said on 2013-01-04:

#1

Have you checked to make sure the replicators are running? They are responsible for reclamation.

André Cruz (andrefcruz) said on 2013-01-04:

#2

They are running. Also, I've noticed that although the SQLite DB is large, there are no deleted objects older than a week, so data seems to be reclaimed there.

However, regarding the .ts files, their number is always increasing and I see some that are 2 months old, which is the age of the system. Clearly they are not being reclaimed. Is there any scenario in which this is supposed to happen?

Can I/should I delete them manually from the file system?

Thanks for the help.

Chuck Thier (cthier) said on 2013-01-08:

#3

How active is the cluster? It looks like the reclamation happens when the suffix hashes are re-hashed, which usually only happens when a change (like a new object PUT) happens in that suffix. If nothing has changed in that suffix dir, then it is likely that the replicator hasn't had a chance to crawl it, and thus to reclaim the .ts files.
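
To see which suffix dirs are holding on to old tombstones, something like the following could help. It is only a rough sketch: it assumes the on-disk layout is `<partition>/<suffix>/<hash>/<timestamp>.ts`, that the file name is a Unix timestamp, and that the reclaim age is the one week mentioned in the question. The function name is made up for illustration and is not part of Swift.

```python
import os
import time

RECLAIM_AGE = 7 * 24 * 3600  # one week, as mentioned in the question (assumption)


def stale_tombstones_by_suffix(partition_dir, now=None, reclaim_age=RECLAIM_AGE):
    """Count .ts files older than reclaim_age, grouped by suffix dir.

    Hypothetical helper. Assumes <partition>/<suffix>/<hash>/<timestamp>.ts,
    where the file name minus ".ts" parses as a Unix timestamp.
    """
    now = time.time() if now is None else now
    counts = {}
    for suffix in sorted(os.listdir(partition_dir)):
        suffix_path = os.path.join(partition_dir, suffix)
        if not os.path.isdir(suffix_path):
            continue  # skips hashes.pkl and any other plain file
        for hash_dir in os.listdir(suffix_path):
            hash_path = os.path.join(suffix_path, hash_dir)
            if not os.path.isdir(hash_path):
                continue
            for name in os.listdir(hash_path):
                if not name.endswith(".ts"):
                    continue
                try:
                    deleted_at = float(name[:-3])
                except ValueError:
                    continue  # not a timestamp-named file, ignore it
                if now - deleted_at > reclaim_age:
                    counts[suffix] = counts.get(suffix, 0) + 1
    return counts
```

A suffix that shows up in the result has tombstones past the reclaim age, which would be consistent with that suffix not having been re-hashed recently.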

André Cruz (andrefcruz) said on 2013-01-10:

#4

I'd say the cluster is pretty active. We get 400k uploads per day and the average file size is 2MB.

As for the "suffix dirs", I read the http://docs.openstack.org/developer/swift/overview_replication.html documentation but could not find out for certain what they mean. However, I have a lot of uploads/deletes for a specific container, and it is here that the problem has been worst, since I already have many more tombstones than regular files. Since this container/dir is changed so frequently, shouldn't it be crawled and its .ts files cleaned? Don't these changes happen in the same "suffix"?

Thanks for the help.

André Cruz (andrefcruz) said on 2013-01-15:

#5

How can I force activity on a certain suffix dir? Is it the suffix of the filename? Do I just have to upload an object to the same dir with a filename with the same suffix?

Thanks!

Samuel Merritt (torgomatic) said on 2013-01-15:

#6

The suffix is the last 3 nybbles of the hash of the object name (that's MD5(object-name + per-cluster-suffix)). It's there so that there's not a ton of subdirectories in the partition directory. Think of a URL like "http://archive.ubuntu.com/ubuntu/pool/main/c/cairo/cairo_1.6.0.orig.tar.gz"; the "/c/cairo/" part is just there to spread the files around a bit. Same thing.

So, if you wanted to force activity on a certain suffix directory, you'd need to come up with an object name such that the last 3 nybbles of hash(objectname) are the suffix *and* the first P bits are the same as the partition number, where P is your cluster's part_power setting. That's computationally feasible (find at most 44 bits of collision), but it's not trivial.
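
To make the mapping concrete, here is a minimal sketch of how a name ends up in a partition and suffix dir, following the description above. It assumes the partition is taken from the top `part_power` bits of the first 4 bytes of the MD5 digest, and that the hashed name is the full object path. `hash_suffix` stands in for your cluster's per-cluster suffix, and the function name is hypothetical, not a Swift API.

```python
import hashlib
import struct


def partition_and_suffix(object_path, hash_suffix, part_power):
    """Return (partition, suffix_dir) for a name, per the scheme described above.

    Hypothetical helper: hash_suffix is a placeholder for the per-cluster
    suffix, and part_power is the ring's part_power (1 to 32 here).
    """
    md5 = hashlib.md5((object_path + hash_suffix).encode("utf-8"))
    # Assumption: the partition is the first part_power bits of the digest,
    # read from its first 4 bytes as a big-endian unsigned integer.
    partition = struct.unpack_from(">I", md5.digest())[0] >> (32 - part_power)
    # The suffix dir is the last 3 nybbles (hex characters) of the hash.
    suffix_dir = md5.hexdigest()[-3:]
    return partition, suffix_dir
```

Two objects only share a suffix dir if both values match, which is why uploads to the same container do not necessarily touch the suffix dirs where the old tombstones live.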

Best

Chuck Thier (cthier) said on 2013-01-15:

#7

As a slightly more drastic measure, you could delete the hashes.pkl file in the partition directory. This will cause the replicator to crawl all of the suffix dirs in that partition, which will take a bit of time, but will also remove the .ts files.

André Cruz (andrefcruz) said on 2013-01-15:

#8

Well, I suppose this is the best I can do. But it is rather painful doing this on multiple partitions...
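
For many partitions, the hashes.pkl removal from #7 could be scripted along these lines. This is a sketch only: it assumes each partition is a directory directly under a device's objects dir with hashes.pkl at its top level, and the function name and its `dry_run` flag are made up for illustration. It defaults to a dry run so you can review the list before deleting anything.

```python
import os


def remove_hashes_pkl(objects_dir, partitions=None, dry_run=True):
    """Find (and optionally delete) hashes.pkl in each partition dir.

    Hypothetical helper. Assumes <objects_dir>/<partition>/hashes.pkl.
    partitions: optional set of partition names to limit the run to.
    Returns the list of hashes.pkl paths found (deleted if dry_run=False).
    """
    matched = []
    for partition in sorted(os.listdir(objects_dir)):
        if partitions is not None and partition not in partitions:
            continue
        pkl = os.path.join(objects_dir, partition, "hashes.pkl")
        if os.path.isfile(pkl):
            if not dry_run:
                os.remove(pkl)
            matched.append(pkl)
    return matched
```

Something like `remove_hashes_pkl("/srv/node/sdb1/objects", partitions={"1234"})` would list the file for one partition (the path and partition number are placeholders). Doing a few partitions at a time is probably wise, since each one makes the replicator re-crawl all of its suffix dirs.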
