(glance S3 client) plan to support multipart upload method

Asked by mozawa

Hi,

Is there any plan for the Glance S3 client to use the multipart upload method for large virtual machine image files in a future release?

I haven't tested this, but it looks like there are chunked upload features in the Swift store and the RBD store but nothing in the S3 store.
Hopefully the S3 store can have this as well.

# What size, in MB, should Glance start chunking image files
# and do a large object manifest in Swift? By default, this is
# the maximum object size in Swift, which is 5GB
swift_store_large_object_size = 5120

# When doing a large object manifest, what size, in MB, should
# Glance write chunks to Swift? This amount of data is written
# to a temporary disk buffer during the process of chunking
# the image file, and the default is 200MB
swift_store_large_object_chunk_size = 200

# Images will be chunked into objects of this size (in megabytes).
# For best performance, this should be a power of two
rbd_store_chunk_size = 8
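
As a rough illustration of what the two Swift settings above imply, the sketch below estimates how many objects an image would be split into. The function name is hypothetical, and the exact boundary behavior (whether an image exactly at the threshold is chunked) is an assumption; the config comments do not state it.

```python
import math


def swift_segment_count(image_mb, large_object_size_mb=5120, chunk_size_mb=200):
    """Estimate how many Swift objects an image of image_mb megabytes becomes.

    Defaults mirror swift_store_large_object_size and
    swift_store_large_object_chunk_size from the config above.
    Assumption: images below the threshold are stored as a single object.
    """
    if image_mb < large_object_size_mb:
        return 1
    # At or above the threshold: split into chunk-sized segments,
    # with a final, smaller segment for any remainder.
    return math.ceil(image_mb / chunk_size_mb)
```

With the defaults, a 1 GB image stays a single object, while an 8 GB image would be written as 41 segments of up to 200 MB each.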

Question information

Language: English
Status: Solved
For: Glance
Assignee: No assignee
Revision history for this message
Jay Pipes (jaypipes) said :
#1

Hi!

Actually, the only reason we had to do something special for Swift was because it doesn't handle images greater than 5G without manually splitting the file into chunks :)

The current Glance S3 driver is actually extremely inefficient because of the Boto library's design. The call to boto.s3.Key.set_contents_from_file() requires a seekable file descriptor, but webob.Request.body_file is not seekable by default, and setting the seekable attribute to True pulls the entire body into a StringIO in memory.

To prevent this problem, which resulted in OutOfMemory errors when image file sizes exceeded the available memory (common for Windows images), we write the incoming webob.Request.body_file to a temporary file on the API server's disk in chunks. Currently, that chunk size is not configurable, but it sounds like you'd like to make it configurable?
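
The buffering approach described above can be sketched roughly as follows. This is a minimal illustration rather than the actual Glance driver code: the function name and the chunk size are hypothetical, since the thread does not give the real value.

```python
import io
import tempfile

# Hypothetical chunk size; the real (non-configurable) value is not stated here.
CHUNK_SIZE = 64 * 1024


def spool_to_tempfile(body_file, chunk_size=CHUNK_SIZE):
    """Copy a non-seekable stream to a temporary file, one chunk at a time.

    Only chunk_size bytes are held in memory at once, so the image size
    is bounded by disk space rather than by available RAM.
    """
    tmp = tempfile.TemporaryFile()
    while True:
        chunk = body_file.read(chunk_size)
        if not chunk:
            break  # end of the incoming body
        tmp.write(chunk)
    tmp.seek(0)  # rewind so the caller gets a seekable file at offset 0
    return tmp
```

The returned file object is seekable, so it could then be handed to a call that needs to seek, at the cost of writing the whole image to local disk first.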

-jay

mozawa (ozaint) said :
#2

Hi,

Thank you for the response.

Well, my question is whether Glance can use the S3 Multipart Upload REST API for large objects. I believe boto already provides this API, and Amazon recommends it for large object uploads. If it could be implemented in the Glance S3 client, it would be very useful for customers and a good thing for AWS and other S3-compatible servers.

http://www.elastician.com/2010/12/s3-multipart-upload-in-boto.html
http://aws.typepad.com/aws/2010/11/amazon-s3-multipart-upload.html
http://www.centernetworks.com/amazon-web-services-launches-multipart-s3-uploads
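
For illustration, a multipart upload from a non-seekable stream might look roughly like the sketch below. It assumes `bucket` behaves like a boto 2 Bucket (initiate_multipart_upload(), then upload_part_from_file(), complete_upload() and cancel_upload() on the returned object). The function names are hypothetical, and this is not code from the Glance S3 driver.

```python
import io

# S3 requires every part except the last to be at least 5 MB.
MIN_PART_SIZE = 5 * 1024 * 1024


def _read_full(stream, size):
    """Read up to size bytes, looping because read() may return short."""
    buf = bytearray()
    while len(buf) < size:
        data = stream.read(size - len(buf))
        if not data:
            break  # end of stream
        buf.extend(data)
    return bytes(buf)


def multipart_upload(bucket, key_name, body_file, part_size=MIN_PART_SIZE):
    """Upload body_file to key_name in parts; return the number of parts.

    Only one part is buffered in memory at a time, and each part is
    wrapped in a BytesIO so the upload call receives a seekable object.
    """
    mp = bucket.initiate_multipart_upload(key_name)
    part_num = 0
    try:
        while True:
            chunk = _read_full(body_file, part_size)
            if not chunk:
                break
            part_num += 1  # part numbers start at 1
            mp.upload_part_from_file(io.BytesIO(chunk), part_num)
        if part_num == 0:
            raise ValueError("nothing to upload: empty body")
        mp.complete_upload()
    except Exception:
        # Abort so S3 does not keep the already-uploaded parts around.
        mp.cancel_upload()
        raise
    return part_num
```

Compared with spooling the whole image to a temporary file, this keeps at most one part in memory and needs no local disk, which is the main appeal of the multipart API for large images.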

Jay Pipes (jaypipes) said :
#3

Thanks for those links, Mozawa! I'll turn this into a blueprint.

Cheers!
-jay

Jay Pipes (jaypipes) said :
#5

Blueprint created