Lost data from my swift cluster, please answer me, it's urgent!

Asked by Anil Bhargava

Hello,

First of all, I want to say that this is my third question here, even though my previous two are still unanswered. I solved those two on my own, but this time the problem is very serious, so please help me.

I am running a swift cluster on a single server with 2 zones and 2TB of total storage space, with all services running on the same server. I was trying to add a new storage node. I took a backup of the existing builder and ring files and added the new zone to the existing ring files.
When I did the rebalance, objects got rsynced successfully, but I got the following account-server errors in syslog:

Jul 4 14:18:57 cloudvault account-server ERROR __call__ error with PUT /sdb1/91380/AUTH_365bc339-7f05-45f3-9854-0fb804cb50ed/Kuppusamy_container1_segments :
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/swift/account/server.py", line 317, in __call__
    res = getattr(self, req.method)(req)
  File "/usr/lib/python2.7/dist-packages/swift/account/server.py", line 105, in PUT
    req.headers['x-bytes-used'])
  File "/usr/lib/python2.7/dist-packages/swift/common/db.py", line 1446, in put_container
    raise DatabaseConnectionError(self.db_file, "DB doesn't exist")
DatabaseConnectionError: DB connection error (/srv/1/node/sdb1/accounts/91380/a31/1ff74362319d350c0921907021125a31/1ff74362319d350c0921907021125a31.db, 0):
DB doesn't exist

Jul 4 14:18:57 cloudvault account-server ERROR __call__ error with PUT /sdb1/27481/AUTH_.auth/.token_6 :
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/swift/account/server.py", line 317, in __call__
    res = getattr(self, req.method)(req)
  File "/usr/lib/python2.7/dist-packages/swift/account/server.py", line 105, in PUT
    req.headers['x-bytes-used'])
  File "/usr/lib/python2.7/dist-packages/swift/common/db.py", line 1446, in put_container
    raise DatabaseConnectionError(self.db_file, "DB doesn't exist")
DatabaseConnectionError: DB connection error (/srv/1/node/sdb1/accounts/27481/528/5bbc8ec47ce5a77183c1f6f33666d528/5bbc8ec47ce5a77183c1f6f33666d528.db, 0):
DB doesn't exist

Output of swauth-list:

swauth-list -A https://my-swift-cluster.example.com:8080/auth/ -K mykey
{"accounts": [{"name": "mainaccount"}, {"name": "testaccount"}]}

swauth-list -A https://my-swift-cluster.example.com:8080/auth/ -K mykey testaccount
{"services": {"storage": {"default": "local", "local": "https://my-swift-cluster.example.com:8080/v1/AUTH_6e1a4b2c-4d79-4a42-8f04-a718236fa1e0"}}, "account_id": "AUTH_6e1a4b2c-4d79-4a42-8f04-a718236fa1e0", "users": [{"name": "test"}]}

swauth-list -A https://my-swift-cluster.example.com:8080/auth/ -K mykey mainaccount
List failed: 404 Not Found

The above command shows that I have lost access to my mainaccount. I had approximately 600GB of very important data stored under mainaccount. I copied the backup of the builder and ring files back into the swift directory and did the rebalance, but I am still getting the same "List failed: 404 Not Found" output.

Please help me either solve this problem or recover my data.
Thanks in advance.

Question information

Language: English
Status: Solved
For: OpenStack Object Storage (swift)
Assignee: No assignee
Solved by: clayg
clayg (clay-gerrard) said (#1):

All the data is undoubtedly still intact, but the error messages seem to
indicate it's not where your running processes expect it to be. You can go
hunting for it piece by piece with swift-get-nodes and swift-object-info.
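
For example, something along these lines (the ring path and names here are
placeholders, not your actual values):

    # where *should* this account live, and on which devices?
    swift-get-nodes /etc/swift/account.ring.gz AUTH_<account>

    # what account/container/object does a .data file on disk belong to?
    swift-object-info /srv/1/node/sdb1/objects/<part>/<suffix>/<hash>/<timestamp>.data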

Did you try putting your old rings back in place? If the new rings have a
different part power - replication isn't going to fix it. You could
probably even put the old rings on the new server and let it drain any data
that accidentally got placed on it back to the old server and we can try
again fresh for a smooth capacity adjustment.
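
Putting the old rings back is roughly just copying the backed-up files over
the current ones and restarting (assuming the backups live somewhere like
/root/ring-backup and your rings are in the usual /etc/swift; adjust to
taste):

    cp /root/ring-backup/*.ring.gz /etc/swift/
    swift-init all restart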

We can always help you with how to proceed, but email isn't the best forum
for an "emergency"

I'm sorry you don't feel like you got a prompt response to your earlier
questions; this week is a holiday in the U.S.

Warm Regards,

-Clay

On Sat, Jul 6, 2013 at 4:46 AM, Anil Bhargava <email address hidden> wrote:

> New question #231972 on OpenStack Object Storage (swift):
> https://answers.launchpad.net/swift/+question/231972
>
> I am running the swift cluster on a single server with 2 zones and 2TB of
> total storage space, with all services running on the same server. I was
> trying to add a new storage node. I took a backup of the existing builder
> and ring files and created the builder files again rather than adding the
> new zone to the existing ones, which I think was a mistake. :(
> When I did the rebalance, objects got rsynced successfully, but I kept
> getting the account-server errors shown in the syslog excerpt above.

Anil Bhargava (anilbhargava777) said (#2):

Thank you very much for your response, Clay.

I don't know how to access the data, because the error shows that the account-related DBs are missing.

The following is the output of the commands you suggested:

$ swift-get-nodes account.ring.gz mainaccount

Account mainaccount
Container None
Object None

Partition 134708
Hash 838d1ece4c89e07201dc996b91d10d6c

Server:Port Device 10.180.32.20:6012 sdb1
Server:Port Device 10.180.32.20:6022 sdb2 [Handoff]

curl -I -XHEAD "http://10.180.32.20:6012/sdb1/134708/mainaccount"
curl -I -XHEAD "http://10.180.32.20:6022/sdb2/134708/mainaccount" # [Handoff]

ssh 10.180.32.20 "ls -lah /srv/node/sdb1/accounts/134708/d6c/838d1ece4c89e07201dc996b91d10d6c/"
ssh 10.180.32.20 "ls -lah /srv/node/sdb2/accounts/134708/d6c/838d1ece4c89e07201dc996b91d10d6c/" # [Handoff]

 $ ls -lah /srv/node/sdb1/accounts/134708/d6c/838d1ece4c89e07201dc996b91d10d6c/
ls: cannot access /srv/node/sdb1/accounts/134708/d6c/838d1ece4c89e07201dc996b91d10d6c/: No such file or directory

 $ swift-object-info /srv/1/node/sdb1/objects/111/bd9/001bf33e41d1d4fa438bf75c814b6bd9/1361176574.73346.data
Path: /AUTH_.auth/mainaccount/rameez
  Account: AUTH_.auth
  Container: mainaccount
  Object: rameez
  Object hash: 001bf33e41d1d4fa438bf75c814b6bd9
Ring locations:
  10.180.32.20:6010 - /srv/node/sdb1/objects/111/bd9/001bf33e41d1d4fa438bf75c814b6bd9/1361176574.73346.data
Content-Type: application/octet-stream
Timestamp: 2013-02-18 14:06:14.733460 (1361176574.73346)
ETag: 6a56ee9b6f38ed2ea8b3a84d0791570f (valid)
Content-Length: 93 (valid)
User Metadata: {'X-Object-Meta-Auth-Token': 'AUTH_tkb8127e8c0c134c6eb9acd6be078d8dd1'}

There was a single account named mainaccount, and various users and their containers were under it.

Could the replication factor be the cause of the error, since it was set to only 1?
Is there any way to recover and download this data manually?
Can we open and read these DB files to get the information out of them?

And can you please tell me how swift works, creating so many directories and long unique file names, and how it maps them?

clayg (clay-gerrard) said (#3):

oic, replica count on your rings is 1 - that's interesting, I've heard of folks trying to get by with only 2...

the architectural overview gives a brief description of how the names of the files uploaded map to the placement of the data in the cluster:
http://docs.openstack.org/developer/swift/overview_architecture.html#the-ring

the replication overview gives a little background on the reason for the file system layout:
http://docs.openstack.org/developer/swift/overview_replication.html

The databases are just sqlite; you can take a copy, open it up, and poke at it. The schema's not bad:

    sqlite3 -line 9d00c9d091fea9ea855f29c22714dc83.db "select * from account_stat"
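
If I'm remembering the account schema correctly, there's also a container
table in there, so listing the containers an account knows about should look
something like this (same copied DB file as above):

    sqlite3 -line 9d00c9d091fea9ea855f29c22714dc83.db \
        "select name, object_count, bytes_used from container where deleted = 0"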

But the objects themselves will be in .data files similar to the swauth object you inspected with swift-object-info

If you know the names of all the objects and containers you can use:

    swift-get-nodes /etc/swift/object.ring.gz AUTH_mainaccount/mycontainer/myobject

If you don't, it might be useful first to find the account database for mainaccount - is it in

    /srv/1/node/sdb1/accounts/134708/d6c/838d1ece4c89e07201dc996b91d10d6c/

Is *everything* on 10.180.32.20?

How many account databases could there be really?

    find /srv/node1/sdb1/accounts/ -name \*.db

Do you still have the old rings? What about the builder files? Can you post the output of

    swift-ring-builder account.builder
    swift-ring-builder container.builder
    swift-ring-builder object.builder

Another idea: you might try removing swauth from the pipeline and adding an overlapping user to tempauth:

    user_mainaccount_rameez = testing .admin

With that `swift -A https://my-swift-cluster.example.com:8080/auth/v1.0 -U mainaccount:rameez -K testing stat -v` might work...
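
To spell that out a little more (just a sketch; the section and pipeline names
below are the stock ones from proxy-server.conf, so trim to match your config):

    # /etc/swift/proxy-server.conf
    [pipeline:main]
    pipeline = healthcheck cache tempauth proxy-server

    [filter:tempauth]
    use = egg:swift#tempauth
    user_mainaccount_rameez = testing .admin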

Or even remove all auth from the pipeline; then most clients won't work, but it's easier to poke at with curl...
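
If you do go the no-auth route, something like this should let you poke around
(the pipeline is a sketch; the account hash and container name are the ones
from your syslog excerpt, and the URL is your proxy's usual one):

    # /etc/swift/proxy-server.conf -- temporarily no auth middleware at all
    [pipeline:main]
    pipeline = healthcheck cache proxy-server

    # list containers in the account, then objects in one container
    curl -i "https://my-swift-cluster.example.com:8080/v1/AUTH_365bc339-7f05-45f3-9854-0fb804cb50ed?format=json"
    curl -i "https://my-swift-cluster.example.com:8080/v1/AUTH_365bc339-7f05-45f3-9854-0fb804cb50ed/Kuppusamy_container1_segments"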

Anil Bhargava (anilbhargava777) said (#4):

Thank you very much, Clay, for your informative and detailed answer.

I tried all the approaches you suggested, and I am finally able to list the containers and data with curl, with no auth system in the pipeline. :)

This is the swift-ring-builder output you asked for:
$ swift-ring-builder account.builder
account.builder, build version 2
262144 partitions, 1 replicas, 2 zones, 2 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices: id zone ip address port name weight partitions balance meta
             0 1 10.180.32.20 6012 sdb1 100.00 137971 0.00
             1 2 10.180.32.20 6022 sdb2 90.00 124173 -0.00

$ swift-ring-builder container.builder
container.builder, build version 2
262144 partitions, 1 replicas, 2 zones, 2 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices: id zone ip address port name weight partitions balance meta
             0 1 10.180.32.20 6011 sdb1 100.00 137971 0.00
             1 2 10.180.32.20 6021 sdb2 90.00 124173 -0.00

$ swift-ring-builder object.builder
object.builder, build version 2
262144 partitions, 1 replicas, 2 zones, 2 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices: id zone ip address port name weight partitions balance meta
             0 1 10.180.32.20 6010 sdb1 100.00 137971 0.00
             1 2 10.180.32.20 6020 sdb2 90.00 124173 -0.00

As you suggested in your previous answer, there could be data on the new node as well, so I went looking for it.
I got the following output on 10.180.32.22, which I had previously added as a new storage node:
$ find /srv/node/ -name \*.db
/srv/node/sdc1/containers/209408/5e8/cc800b2a07d30c3776dedf4f3edc55e8/cc800b2a07d30c3776dedf4f3edc55e8.db
/srv/node/sdc1/containers/231321/a01/e1e67042d7b7390aee99d743c7139a01/e1e67042d7b7390aee99d743c7139a01.db
/srv/node/sdc1/containers/203107/716/c658c9213ce5e47e384bb643f4939716/c658c9213ce5e47e384bb643f4939716.db
/srv/node/sdc1/containers/222609/31a/d9644d9126a48054b763b189cf03f31a/d9644d9126a48054b763b189cf03f31a.db
/srv/node/sdc1/containers/183616/ebf/b35003a21a3fa14be615f0ce8a994ebf/b35003a21a3fa14be615f0ce8a994ebf.db
/srv/node/sdc1/containers/223947/520/dab2ee343d5721cb564cee9b769e1520/dab2ee343d5721cb564cee9b769e1520.db
/srv/node/sdc1/containers/235872/d22/e6583fa257ed916e3e676a002c084d22/e6583fa257ed916e3e676a002c084d22.db
/srv/node/sdc1/containers/225846/1f6/dc8da88182acc2aa3abeee493b1171f6/dc8da88182acc2aa3abeee493b1171f6.db
/srv/node/sdc1/containers/231948/01a/e28323de7b4938763e7455338f1e401a/e28323de7b4938763e7455338f1e401a.db
/srv/node/sdc1/containers/231068/008/e1a7275ad16bb6553fb50089fe0ac008/e1a7275ad16bb6553fb50089fe0ac008.db
/srv/node/sdc1/containers/188038/16d/b7a19cb6844e3c001d55076faf3d516d/b7a19cb6844e3c001d55076faf3d516d.db
/srv/node/sdc1/containers/245841/175/f0145712b7b24186b580e26e34294175/f0145712b7b24186b580e26e34294175.db
/srv/node/sdc1/containers/247277/6c7/f17b4f811f1ae54fe43330dbddb8f6c7/f17b4f811f1ae54fe43330dbddb8f6c7.db
/srv/node/sdb1/containers/254796/d98/f8d33903bae5cfab5ba4fa2bc1456d98/f8d33903bae5cfab5ba4fa2bc1456d98.db
/srv/node/sdb1/containers/198130/235/c17cab7ae29eb87da50fa9559b472235/c17cab7ae29eb87da50fa9559b472235.db
/srv/node/sdb1/containers/209854/e82/ccefb19244f20fe741ce3c3ae0051e82/ccefb19244f20fe741ce3c3ae0051e82.db
/srv/node/sdb1/containers/220045/205/d6e344ddb343ce017ca789a4b9bd1205/d6e344ddb343ce017ca789a4b9bd1205.db
/srv/node/sdb1/containers/231503/7aa/e213dc13c8170cdfb2d1dea6cfc177aa/e213dc13c8170cdfb2d1dea6cfc177aa.db
/srv/node/sdb1/containers/225629/eef/dc57716afcec9c7f828399999ba40eef/dc57716afcec9c7f828399999ba40eef.db
/srv/node/sdb1/containers/203088/235/c65425050db47f7e3e21038c48794235/c65425050db47f7e3e21038c48794235.db
/srv/node/sdb1/containers/236697/86b/e7264fb2cd53361525a36d3f1d65a86b/e7264fb2cd53361525a36d3f1d65a86b.db
/srv/node/sdb1/containers/235301/709/e5c957fa7fd0e679801d92adde157709/e5c957fa7fd0e679801d92adde157709.db
/srv/node/sdb1/containers/209933/2d7/cd0359cea5442c85f6cc90586db962d7/cd0359cea5442c85f6cc90586db962d7.db
/srv/node/sdb1/containers/236620/f24/e7131ad2a2bf8d01a98174ad595f3f24/e7131ad2a2bf8d01a98174ad595f3f24.db
/srv/node/sdb1/containers/205210/aab/c866add742d8b7fecc99869ea88deaab/c866add742d8b7fecc99869ea88deaab.db

So, do I need to transfer all these container-related DB files back from 10.180.32.22 to 10.180.32.20 (the old server)?
If so, how can I do that, given that that node is not in the ring now?

Thanks, Clay, you have helped me a lot. But I still have a few more questions before I proceed with the repair work on my cluster:

1. I have a web application running as a front-end for the swift cluster, and there are some existing users. My plan is to download all the data with curl for each user, do a fresh multinode installation, recreate the existing users, and re-upload the data. Is this the right way to proceed, or can you suggest another approach?

2. I have 3 servers now (4x3TB + 4x3TB + 1x2TB) and am planning a multinode swift production cluster. Can you suggest the number of zones, the partition power, and the replica count for it?

3. Can you tell me how to decide the partition power and replica count based on the available storage space?

4. Can we increase or decrease the partition power of an existing swift cluster's ring files while there is data in the cluster? If so, will replicas be created only for newly uploaded data, or for the existing data as well?

5. What kind of challenges could I run into in the future with this kind of multinode production cluster, so that I can design for a scalable cluster?

You might find it irritating to answer so many questions, but I need help to get my cluster back into a healthy running state!

Regards.

clayg (clay-gerrard) said (#5, best answer):

> 1. I have a web application running as a front-end for the swift cluster,
> and there are some existing users. My plan is to download all the data with
> curl for each user, do a fresh multinode installation, recreate the existing
> users, and re-upload the data. Is this the right way to proceed, or can you
> suggest another approach?
>

I think learning how to do a smooth ring rebalance and capacity adjustment
would be a good exercise - it's sort of a big deal. If you have headroom,
maybe leave things where they are and spin up some VMs to practice?
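
The mechanics of a gradual capacity add are roughly as below (the zone,
device, and weight values are made up for illustration; run swift-ring-builder
with no arguments to see the exact add syntax for your version):

    # add the new device at a low weight, rebalance, push the new .ring.gz
    # files out, and let replication settle before raising the weight further
    swift-ring-builder object.builder add z3-10.180.32.22:6010/sdc1 25
    swift-ring-builder object.builder rebalance
    # repeat for account.builder and container.builder, then step the weight
    # up (25 -> 50 -> 100), rebalancing and waiting at each step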

> 2. I have 3 servers now (4x3TB + 4x3TB + 1x2TB) and am planning a multinode
> swift production cluster. Can you suggest the number of zones, the partition
> power, and the replica count for it?
>

Region and Zone are physical things, just like drives and servers; there's no
longer any reason to make them up. If you have link(s) between some node(s)
that are slower or faster than others, that's a region. A zone is a failure
domain: a TOR switch, redundant power/battery backup, a different
rack/location in the datacenter/colo.

Part power is hard, you have to shoot for the middle and think ahead (see
below).

IMHO the replica count is meant to be 3. I've heard of people getting by with
2, and now that we have an adjustable replica count, maybe the first time you
have a downed node and start throwing 404s or 500s, or lose data because of
multiple correlated disk failures, you can consider adding capacity and
shoring up your replica count? You'd really have to ask people running with 1
or 2 replicas whether that's working out for them.
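
For reference, on a swift new enough to have the adjustable replica count,
raising it later is just another builder change plus a rebalance; replication
then copies the existing data out to the new replicas:

    swift-ring-builder object.builder set_replicas 3
    swift-ring-builder object.builder rebalance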

> 3. Can you tell me how to decide the partition power and replica count based
> on the available storage space?
>

If you plan on running those nodes at capacity without adding more
nodes/drives (say > 70% disk full), I sure would hate to see your replication
cycle times with a part power greater than 17. I think 16 may even be
reasonable, but then if down the road you've got more than 600 drives in maybe
25-30 nodes (let's say 5TB drives by then, that's ~1PB usable) you may be
fighting balance issues as each drive has fewer than 100 partitions on it :\
Maybe limiting your max object size to something like 2GB might help protect
you, or maybe, if you've got metrics, you can apply back pressure from disk
capacity over a rebalance loop until you get the unbalanced partitions spread
around (hard when there are only ~100 of them per drive - that actually
probably wouldn't work :\), so... maybe swift will have an adjustable part
power by then!?

Too big is a problem now, too small is a problem later. I personally save hard
problems for future me. You could go as high as 18 (a very respectable part
power IMHO) if you plan on growing rapidly, and I don't think there's any
reason to limit yourself to less than 16 with 9 drives.
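
To spell out the arithmetic, the usual rule of thumb (from the deployment
guide, if I'm remembering it right) is about 100 partitions per drive at the
largest size you ever expect the cluster to reach, rounded up to a power of
two:

    # planning for ~600 drives eventually:
    #   600 drives * 100 partitions/drive = 60,000
    #   next power of two >= 60,000 is 2^16 = 65,536  ->  part power 16
    python -c 'import math; print int(math.ceil(math.log(600 * 100, 2)))'   # prints 16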

> 4. Can we increase or decrease the partition power of an existing swift
> cluster's ring files while there is data in the cluster? If so, will
> replicas be created only for newly uploaded data, or for the existing data
> as well?
>

Moving from one part power to another is currently not an "adjustment", it's a
migration; it's not a process that is built into swift today. Duplicating
subsets of data during the migration would probably be needed to ensure
seamless availability. Not trivial.

> 5. What kind of challenges could I run into in the future with this kind of
> multinode production cluster, so that I can design for a scalable cluster?
>

Load balancing is a PITA! Swift can make the storage part easy to scale and
takes failures in stride if you've got a sound understanding of how the bits
go together. Good luck!

Anil Bhargava (anilbhargava777) said (#7):

Thanks clayg, that solved my question.