    28/11/16 Outage Report

    On 28th November between 22:45 and 23:03 we suffered a brief outage, the cause of which was discovered within 10 minutes.

    What happened?
    The new servers we migrated to were originally set up three weeks ago. At that point there were two database servers replicating from each other, so that we could offer high availability and load balancing between the two. However, a week before we began the migration, we were performing some tuning steps which broke the replication between host 1 and host 2.

    We thought nothing of it and shut down the second host, intending to go back and repair it over the Christmas holidays.

    Whilst the sites continued to operate on the single database host, this host was generating binary log files which were not being processed by the second host, as it was shut down. As a result, the logs slowly filled the C:\ drive to 100%, which is what caused the outage.

    Why were the sites suspended?

    Whilst attempting to fix the issue we were faced with max connection errors, as the slowed-down server clung onto each connection for several minutes whilst it attempted to process the request. In the end, user traffic had to be stopped from reaching the database server so the fix could be applied.

    How are we going to prevent this from happening?

    There are a number of things we still have left to do on our new infrastructure, the first and foremost being our new monitoring platform, which will monitor things such as disk space utilisation, memory usage, processor usage and service uptime.
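As a rough idea of what a disk space check looks like, here is a minimal Python sketch using the standard library's `shutil.disk_usage`. It is not our monitoring platform. The function names and the 80% and 90% thresholds are assumptions chosen for illustration.

```python
import shutil


def classify(percent, warn_at=80.0, crit_at=90.0):
    """Map a disk usage percentage to a status string (thresholds are assumed)."""
    if percent >= crit_at:
        return "CRITICAL"
    if percent >= warn_at:
        return "WARNING"
    return "OK"


def check_disk(path, warn_at=80.0, crit_at=90.0):
    """Return (status, percent_used) for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    percent = usage.used / usage.total * 100
    return classify(percent, warn_at, crit_at), percent


if __name__ == "__main__":
    status, percent = check_disk(".")
    print(f"{status}: {percent:.1f}% used")
```

A check along these lines, run every few minutes with an alert on WARNING, would have flagged the filling drive well before it reached 100%.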

    As we migrated over in a hurry for commercial reasons, we had to prioritise the web and database servers over monitoring. We are slowly catching up on these tasks.
    Last edited by Jin; 28-11-2016 at 11:14 PM.
