Web hosting 101, or how to lose my business in five words

Behold, a startling tale of awful tech support staff, too-little-too-late management strategies, and verbose tirades of derring-do that was supposedly done did! Since January, my virtual private server account with Tektonic has gone down more than a freshman sorority pledge. Every time, I have had to play this two-step swinging game with them to get any sort of response as to why my server was behaving in a way that could only be referred to as "wonky". After the jump, you will read the transcript of the camel's back breaking.

If you don't want to read 36 hours' worth of support ticket transcripts, the CliffsNotes version is that Tektonic basically went out of their way to lose my business.

So, I've moved into an account with Linode and while it's still our honeymoon, it's moving along pretty sweet right now. In the roughly 48 hours since the move from Old Busted to New Hotness, traffic has spiked to approximately 5x the usual load due to my friend Rick, an editor involved in the news side of the comic book industry, parting ways with his now-former employer for the second time in a year. His site, a WordPress-powered site with only mild (read: not aggressive) caching, is single-handedly responsible for said 5x increase in incoming traffic. The new server has not even shrugged under the new load. No sweat has been broken. I might turn up the number of available Apache processes to compensate, but seriously, it just doesn't seem to care.

Mon Aug 04 2008 11:20AM by Ryan

IP: (redacted)

Any one want to tell me why my VPS is report­ing as "The sys­tem is unavail­able" in the con­trol panel, again?

This is the sec­ond time in the last week this has happened.

Realizing that I could not receive their emails as my account was now down, I requested a simple follow-up at another address.

Mon Aug 04 2008 11:21AM by Ryan

Client IP: (redacted)

Adden­dum: since my server is down, my email, again, doesn't work. Please direct all cor­re­spon­dence to (redacted) until this is resolved.

At this point, the ticket was closed once the server started back up, without anyone telling me what happened or what they were doing about it. Approximately 12 hours after my account came back up, it went down again.

Tue Aug 05 2008 11:38AM by Ryan

Client IP: (redacted)

My VPS is down and report­ing as "The sys­tem is unavail­able" in the con­trol panel, again. This is the third time since last week.

Is the machine my VPS lives on expe­ri­enc­ing hard­ware problems?

Addi­tion­ally, why was my ticket from yes­ter­day closed with no correspondence?

Tue Aug 05 2008 11:44AM by support@tektonic.net

Hi, The host is cur­rently down for a file sys­tem check as we found file sys­tem errors on the node. ETA is 1-2 hours.

That's it. That's the reply I received. Con­cise, yes. Almost to the point of rude­ness. I wanted to ring them up and shout "WELL NO SHIT ASSHOLES, DO YOU KNOW WHAT I DO FOR A LIVING?"

But they don't, so I didn't. I'm classy.

Tue Aug 05 2008 11:48AM by Ryan

Client IP: (redacted)

Why don't you notify cus­tomers about these unex­pected down­times then they occur?

Is there a rea­son these down­times almost always occur dur­ing peak hours?

Was any­one going to tell me that I may have suf­fered data loss from a poten­tially dirty file system?

Tue Aug 05 2008 12:02PM by support@tektonic.net

Hi, The out­age wasn't planned so a notice could not of been given. No data loss is expected.

For those playing along at home, those are the magic five words. That was when I did a sanity check on my nightly backups and began looking for a new host.

Tue Aug 05 2008 02:51PM by Ryan

Client IP: (redacted)

"No data loss is expected" is not good enough. If I have files being writ­ten when this hap­pens, or data­base tables being flushed while the sys­tem is halt­ing, and the sync doesn't hap­pen fast enough or it writes the data to inodes that then get "mas­saged" by fsck on the way back up, there is a chance, though slim, of data loss.

The out­age hap­pened dur­ing busi­ness hours and it is well within rea­son to pull a list of affected cus­tomers whose accounts are and fire off a quick email to them along the lines of "hey, we dropped the ball and some­thing blew up. again."

Even if I wouldn't haven got­ten it, my email server being hosted on the affected VPS, the effort alone would keep me from foam­ing at the mouth over this.

As it is right now I've been down all day wait­ing for a filesys­tem check to fin­ish up for the sec­ond time in as many days. Is there a plan to pre­vent this from hap­pen­ing? Is there a migra­tion route to a more sta­ble server?

Frankly put, and you can for­ward this to man­age­ment or sales if you want since you're so adamant about strat­i­fi­ca­tion between tech­ni­cal sup­port and busi­ness admin­is­tra­tion (and I can­not seem to get a cus­tomer ser­vice phone num­ber out of any­one there), what are you doing to keep my busi­ness? If the last 6 months of ser­vice are an indi­ca­tion, I can't count on your ser­vice to be even mar­gin­ally reli­able. I'm not ask­ing for the myth­i­cal 5 9's in reli­a­bil­ity here; I'm just ask­ing for my account to not be knocked offline two days in a row because some sys­tem admin­is­tra­tor didn't notice a mess of disk write errors in /var/log/messages.

I can­not get a straight answer from any­one about what­ever hap­pened every time my VPS is offline. That's infu­ri­at­ing enough, but this cav­a­lier han­dling of my data is utterly unac­cept­able, espe­cially see­ing as how you just billed me for another month of this same level of lack­ing ser­vice YESTERDAY. Maybe I'm not pour­ing thou­sands of dol­lars of high end infra­struc­ture dol­lars into your busi­ness, but maybe there is a damned good rea­son for that.

Tue Aug 05 2008 06:20PM by support@tektonic.net

Hello Ryan,

For major out­ages we post threads on the forums to track them. In regards to the cur­rent one it can found here. http://www.tektonic.net/forum/showthread.php?p=5089&posted=1#post5089 We find this much more effi­cient due to cases such as yours.

The pre­vi­ous errors you have seen were tied to the bad moth­er­board that was men­tioned in the thread above. This will solve and future occur­rences of such prob­lems in the future.

We actively mon­i­tor all of our servers for a mul­ti­tude of prob­lems but things such as this can only have so much pre­ven­ta­tive mea­sure put in place.

I apol­o­gize for any down time you've expe­ri­enced due to the failed hard­ware and if there is any­thing we can do please let us know.

Thanks, Bruce

Tue Aug 05 2008 06:43PM by Ryan

Client IP: (redacted)

I want some­one to explain to me the con­tra­dic­tory state­ments of this sup­port ticket thread and the forum thread, specif­i­cally why when I asked explic­itly about data loss or file cor­rup­tion, I received the curt reply of:

"No data loss is expected."

No con­text, no expla­na­tion, noth­ing more than a terse 5 word answer that pro­vides me with as much assur­ance as it does information.

Now, hav­ing had this forum thread pointed out to me FINALLY, I checked the thread to find the fol­low­ing from Matt Ayres:

"VPS's are start­ing. There may be file corruption."

I also want to know why no one thought to point out this forum thread when they knew this was going to become an all day affair. I've been wait­ing since 2:00pm for what I thought was just an fsck to complete.

Also, why isn't my VPS run­ning yet if the VPSes were com­ing back up as of what looks to be 6:30pm EST from the forum? The con­trol panel still lists my VPS sta­tus as "The sys­tem is unavailable".

Tue Aug 05 2008 07:40PM by support@tektonic.net

Hello,

Ini­tially it was believed to be a sim­ple issue with the raid card that just needed replac­ing. At the time of com­ment we had no indi­ca­tion of pos­si­ble data loss.

After swap­ping the card it became appar­ent that it was indeed a moth­er­board issue.

In regards to the last part of your email there was more cor­rup­tion and errors present then we had antic­i­pated. Start­ing the VPS's began ok but was not stable.

Thanks, Bruce

I am offi­cially spit­ting blood at this point. But it's ok, I've made up my mind and started mov­ing ahead. Noth­ing to do now but log into the server that is finally up, and start mov­ing over what­ever inci­den­tals I might have missed in my nightly backup, right?

Wed Aug 06 2008 10:47AM by Ryan

Client IP: (redacted)

I see my server is down, again. That's ok though. After being fed up with the fiasco yes­ter­day, I moved into a new account last night, at more than twice the price for a lit­tle less than dou­ble the resources, with a com­pany who promised me, explic­itly, that they could pro­vide a more sta­ble account than you. They had me at "we can do bet­ter than tek­tonic" and the price never even entered into it.

It didn't help your case any that my data­bases, file sys­tem, and in some cases sys­tem bina­ries were all cor­rupted by the errant, but recur­ring, hard­ware prob­lems. For­tu­nately for all involved (but no one more so than me) I smelled this com­ing three weeks ago and started rsync­ing the impor­tant data and nightly data­base back­ups off the server the last time there was an extended outage.

I'm can­cel­ing my account, and I'll be con­tact­ing sales to request a full refund of the pay­ment posted for early august as I've been unable to use my account thus far, and have no inten­tions of con­tin­u­ing to use it any longer. If you can expe­dite that process, won­der­ful. If not, well, that doesn't seem to be any dif­fer­ent from any other expe­ri­ence I've had with your support.

I'm hon­estly amazed how, look­ing back at the year on and off between two accounts I spent with Tek­tonic at how often my VPS was down, unavail­able, or unpre­dictable. Look­ing at your server uptime report, it kills me to see how my account invari­ably winds up placed on the least reli­able server in your farm.
