• Welcome to Tamil Brahmins forums.

Site Issues

praveen

Life is a dream
Staff member
On Friday, the server where the site is hosted ran out of space and had to be upgraded.

During the upgrade, the host found that this site's database had become corrupted. In spite of several attempts by the host to fix the issue, the database kept shutting down and the server had to be taken offline.

The issue was resolved only now (7:30 am IST), after the backup from Dec 4th was restored.

Unfortunately, any posts made after Dec 4th are gone.
 
Unfortunately, the server crashed once again after I posted the above message and has only now come back.
 
Praveen, do we have a backup process? I suggest you carry out a daily backup and a weekly backup. Also, please define your RPO and RTO.
 
We have a daily backup. The host keeps two separate daily backups. But because the database got corrupted, we had to skip two days and restore from three days ago.

As a matter of fact, the server is being changed as I type this.

Hopefully we will not have these issues anymore.
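
For anyone curious how "skip the bad backups and restore an older one" relates to the RPO question, here is a minimal Python sketch. It is only an illustration under assumptions: the function names, the `is_usable` check, and the dates in the usage note are hypothetical and do not describe the host's actual restore procedure.

```python
from datetime import datetime, timedelta


def pick_restore_point(backups, is_usable):
    """Return the newest backup timestamp that passes the usability check.

    backups:   iterable of datetime objects, one per backup (hypothetical input)
    is_usable: callable taking a datetime and returning True if that backup
               restores cleanly (hypothetical check, e.g. a test restore)
    """
    for taken_at in sorted(backups, reverse=True):
        if is_usable(taken_at):
            return taken_at
    return None  # no usable backup at all


def data_loss_window(failure_time, restore_point) -> timedelta:
    """How much data is lost: everything written between the restore point
    and the failure. This is the figure an RPO target puts a limit on."""
    return failure_time - restore_point
```

With four daily backups where the two newest turn out to be unusable, `pick_restore_point` falls back to the third newest, and `data_loss_window` then reports a loss of several days even though backups ran daily. That gap is why testing restores matters as much as taking the backups.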
 
We have now moved to a new and better server with a different host. This host is a huge upgrade over the earlier one.
 
Good! That should help! Maybe use an open-source monitoring tool to check the DB health. You can also set up alerts for disk usage!
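
As a rough illustration of the disk usage alert idea, here is a minimal Python sketch built on the standard library's `shutil.disk_usage`. The 80% and 90% thresholds and the function names are assumptions chosen for the example, not settings from any particular host or monitoring tool.

```python
import shutil


def classify(pct, warn_at=80.0, crit_at=90.0):
    """Map a usage percentage to an alert level (thresholds are assumptions)."""
    if pct >= crit_at:
        return "CRITICAL"
    if pct >= warn_at:
        return "WARNING"
    return "OK"


def check_disk(path="/", warn_at=80.0, crit_at=90.0):
    """Return (level, percent_used) for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    pct = 100.0 * usage.used / usage.total
    return classify(pct, warn_at, crit_at), pct
```

Something like `level, pct = check_disk("/")` could run from a scheduled job and send a mail whenever `level` is not `"OK"`. A proper monitoring tool would do the same check with history and escalation on top.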
 
Just to give a further update on what happened.

The previous host reported a disk usage alert on the drive that holds the operating system. This does not usually happen, but it did. I asked them to add a new 500GB hard drive. In the process, they found the database had become corrupted because the operating system ran out of space and failed to save some data.

They tried to repair the database, but it did not work as expected. So they restored the previous day's backup, but that ran into more issues. When they restored the Dec 4th backup, the motherboard and network card had to be replaced. This took a while, and DNS failed because of the new network card, so it took a while longer for the changes to propagate.

The old server had alerts set up for various things, but in spite of all that, it failed. The senior system admin who looked into it said the following, and I quote:

As for the technical issues, this situation has become complicated in strange ways. Generally no software issues should affect the hardware unless it is something actively malicious designed to do so. We have no reason to believe any of this is being caused by a bad actor. Instead it looks like the hardware issues were just a case of very very bad luck. We are still investigating the failing services, so I can't offer much information about why this is happening. If we knew, we'd be fixing it already. Currently I do not see any database issues, but I do see the DNS issues still persisting. I am investigating these right now

I have been with them since 2020, so there was no cause for alarm. What prompted the change now is that they have been acquired by someone else, and the issues we just had are very similar to another issue we had last year (at more or less the same time). That, too, should have been a simple 6-hour migration to a new server, but it dragged on for 3 days.

So, I had enough and switched to a different host.

The cost is more than what I was paying earlier, but they are a good company with a 5-minute turnaround time (TAT) for support tickets. This server has monitoring as well, so I am hoping we do not run into any more issues.
 