
[Dev] Long Downtime Period



Jin
16-03-2011, 04:13 AM
So, first of all, huge apologies for the very, very, very long downtime, which lasted about 84 hours. An explanation is due, and I will try to keep it as simple as I can so everybody understands. Sorry for any spelling and grammar mistakes; I haven't had much sleep since these issues began.

DC = datacenter (where our server is located)
Host = the middle man between us and the datacenter
Disk = hard drive


12th March

15:06: Work began moving services running on our server to a new disk to relieve R/W wait periods on our main disk.
16:06: Data found to be corrupting silently (we monitor for corruption) on our main disk.
16:10: Sites closed so recent backups can be taken.
17:48: Host is given orders to replace primary disk and reinstall OS.
18:46: Host finally agrees that disk replacement is required after much discussion.
19:33: Final check of services is made on servers and go ahead given to Host to begin work.
22:00: Confirmation that job has been passed to DC is given.

13th March
01:18: Disk is replaced and server is handed back over.
01:20: Restoration begins with core services to server.
04:13: Restoration begins with Habbox websites.
07:23: Backups made from original server show signs of corruption due to data being stored on the bad sectors of the disk.
07:30: Request made to host to restore original disk so new backups can be made.
09:13: Confirmation that job has been passed to DC.
09:59: Confirmed disk has been swapped.
10:15: Server refuses to boot correctly, investigation request made with host.
10:49: Host confirms request sent to DC.
11:17: DC states that corruption has occurred on the boot partition of the disk.
11:25: Request made that primary drive is swapped back to the new drive and the original disk mounted within the system.
12:33: Host replies with message from DC stating that the new primary drive has also become corrupt, due to a motherboard fault which has been corrupting data.
12:45: Request made that motherboard is replaced ASAP.
18:47: Server handed back with new motherboard and disks, booted into a rescue OS.
18:33: Restoration of core services begins again.
22:06: Backups are made and verified, tables show signs of corruption.

14th March
02:12: All tables downloaded to local machine for repair.
04:43: All tables repaired on local machine.
05:02: Request made for OS to be reloaded.
07:15: Confirmation received that job has been passed to DC.
Sleep
16:00: Server has been handed back.
16:19: Server IPs are incorrect compared to those previously allocated, OS is also different. Host notified.
17:13: Some crap story given back and told to update IPs in globalDNS.
17:56: globalDNS system interface not working (nothing to do with us).
18:23: Request made with host to gain the server id number which references our machine in the DC.
20:34: After much arguing server id number obtained.
21:15: Phone call made to DC and issue transferred to ticket with an account previously held with them.
21:25: DC agrees that mistakes have been made and that they only acted on the requests from Host. Work begins to restore OS and IPs.
22:00: Repaired tables on localhost uploaded to cloud. Sleep.

15th March

07:00: Server handed back from DC.
07:15: Restoration of core services begins.
12:12: Core services completed.
12:32: Backup download begins.
12:42: Download fails, fault traced back to incorrectly configured network card.
13:00: DC is called. DC asks for the job to be authorized by the host.
13:04: Request made to host.
13:35: Configuration resolved.
14:09: Backups complete download from cloud.
14:48: Habboxlive restored.
15:00: Work begins on Habbox.com.
17:23: Deep corruption of critical v5 habbox.com files in premium modification, sierk contacted to find source files.
18:12: Source files couldn't be obtained as they belong to an ex-member of staff. V6 beta release uploaded.
18:30: Forum database restoration begins.
| : Restoration fails, configuration changed and retry begins.
| : Restoration fails, configuration changed and retry begins.
| : Restoration fails, configuration changed and retry begins.
| : Restoration fails, configuration changed and retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
| : Restoration fails, InnoDB table pool mismatch. Mismatch resolved. Retry begins.
V

16th March

03:05: Forum Restored.
Power nap
07:00: Basic security hardening begins.
13:30: Forum reopens on a limited service.
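
For anyone curious how the "we monitor for corruption" part from the 16:06 entry on 12th March can work in principle, here is a minimal sketch in Python. It is not the actual monitoring used on our server; the function names (sha256_of, build_manifest, find_corrupted) are made up for illustration. The idea is simply to record a checksum for every file while the data is known to be good, then re-check later and flag anything that no longer matches.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_corrupted(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return bad
```

A check like this only suits files that are not supposed to change, such as finished backup archives, because any legitimate edit also changes the checksum. The manifest should also be kept on a different machine, otherwise faulty hardware can damage the manifest along with the data.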

Hecktix
16-03-2011, 01:48 PM
We are currently aware of a login prompt appearing in certain threads from "http://develop.davzy.com" - we are unsure why this is happening and David will be able to sort it out when he gets online.

We assure you the prompts are not dangerous; simply press "Cancel" and they will go away.

HotelUser
16-03-2011, 01:54 PM
This password prompt was put in place as a temporary precautionary security measure in accordance with the Davzy host and has now been removed :)
