Friday, April 29, 2005
Cross Our Fingers.
Update 05/05/05 Remember all that cool stuff I told you we were working on? Well, instead of building neat new feature sets and funny blinking text buttons, we've spent the last week improving what we do best: Collecting Away Messages.
BuddyGopher is now doing manual Away Message checks for all buddies in the system at a rate of once every hour. Even though BuddyGopher is built on the real-time presence awareness of live instant messaging, the occasional away message has been known to slip through the cracks. Usually this is due to users who change their Away Message without clicking the "I'm Back" button in AIM. It should never take more than five seconds for a new away message to appear in our database, but in the event that our records are inaccurate, BuddyGopher will now catch the error within sixty minutes.
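For the curious, here is a rough Python sketch of what an hourly sweep like this might look like. It is not our production code: the function name `sweep`, the `stored` dictionary standing in for the database, and the `fetch_away_message` callback standing in for a live lookup are all made up for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional


def sweep(
    buddies: Iterable[str],
    stored: Dict[str, Optional[str]],
    fetch_away_message: Callable[[str], Optional[str]],
) -> List[str]:
    """Re-check every buddy and return the names whose stored message was stale.

    `stored` is a hypothetical stand-in for the away message database and
    `fetch_away_message` is a hypothetical stand-in for a manual lookup.
    """
    corrected = []
    for name in buddies:
        current = fetch_away_message(name)
        if stored.get(name) != current:
            # The real-time feed missed this change, so repair the record.
            stored[name] = current
            corrected.append(name)
    return corrected
```

Calling something like this once an hour for the whole list is what bounds a missed update at sixty minutes.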
So we will continue to focus our BuddyGopher development efforts on building scalable and reliable server software that crawls and archives public user data on the AOL Instant Messenger network. Less fluff, more buff.
If you have BerkeleyDB and/or memcached experience, we'd love to get some outside advice on server architecture! Please contact us today.
Friday, April 22, 2005
Please send me an IM if you experience anything strange.
Wednesday, April 20, 2005
Recovery Gossip, Part 5
Monday, April 18, 2005
Recovery Gossip, Part 4
For the first time in four days, BuddyGopher collected away messages. Seven away messages per second is a new record for us. That's over 25,000 away messages per hour. Most of the buddies were already on-line and away when our OSCAR array initialized, so there was a huge backlog to collect - plus all of the buddies who may have gone away during the next fifty-nine minutes. Away Message collection rates later cooled down to our average of 12,000 unique messages per hour. So we are happy that the biggest custom-built AIM BuddyList still runs strong with around 72,000 buddies.
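A quick sanity check on those rates, written out in Python. The variable names are just for illustration; the figures are the ones quoted above.

```python
SECONDS_PER_HOUR = 60 * 60

# Peak rate right after the OSCAR array initialized.
peak_per_second = 7
peak_per_hour = peak_per_second * SECONDS_PER_HOUR

# Our usual average, converted the other way.
average_per_hour = 12_000
average_per_second = average_per_hour / SECONDS_PER_HOUR

print(peak_per_hour)                 # 25200
print(round(average_per_second, 1))  # 3.3
```

So the recovery burst ran at roughly twice our normal pace.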
Strange things continue to plague a few circuits in our faulty disk drive, so we won't know until tomorrow whether or not our loving users will be required to re-enter all of their buddy names into their BuddyGopher Buddy List. Maddest props to anybody reading our development blog and patiently waiting for BuddyGopher to come back up. Shout-outs go to John Wigle, who has been holding our hand with the EV1Servers support staff, Marek Publicewicz, the real brains behind BuddyGopher, and anonymous Japan-bound person for his continued partnership patience. Our tentative back on-line date is Friday?
Sunday, April 17, 2005
Recovery Gossip, Part 3
Things with BuddyGopher are looking a lot better today. While waiting to see if our away message database can be recovered, Marek has been working in overdrive to piece together assorted code fragments to nearly rebuild the entire system. I think he even pulled an all-nighter in Warsaw on Friday night waiting to recompile some libraries?
Side note: In order to see performance gains from running epoll, we had to be running the 2.6 Linux kernel. poll and epoll are both ways of watching a large number of file descriptors (sockets, in our case) and finding out which of them are ready for reading and/or writing. With our gargantuan live Buddy List, scalable performance gains like this are clutch.
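To make the idea concrete, here is a small Python sketch of readiness checking. It is only an illustration and has nothing to do with how our server is actually written. It uses the standard `selectors` module, which picks epoll on Linux when available; the helper name `ready_for_reading` is made up.

```python
import selectors
import socket


def ready_for_reading(sockets, timeout=1.0):
    """Return the sockets in `sockets` that have data waiting to be read."""
    # DefaultSelector chooses the best mechanism the OS offers
    # (epoll on Linux, kqueue on BSD/macOS, otherwise poll or select).
    with selectors.DefaultSelector() as sel:
        for sock in sockets:
            sel.register(sock, selectors.EVENT_READ)
        return [key.fileobj for key, _events in sel.select(timeout)]


if __name__ == "__main__":
    # Two connected socket pairs; only the first has data pending.
    a_send, a_recv = socket.socketpair()
    b_send, b_recv = socket.socketpair()
    a_send.sendall(b"away")
    ready = ready_for_reading([a_recv, b_recv])
    print(ready == [a_recv])
    for sock in (a_send, a_recv, b_send, b_recv):
        sock.close()
```

A long-running server would register its sockets once and reuse the same selector between calls, since that is where epoll saves work over poll. Rebuilding it on every call, as this sketch does, is only for brevity.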
Saturday, April 16, 2005
Recovery Gossip, Part 2
Date: Apr 15, 2005 11:50 PM
Subject: Hard drive problem
The problem started last night when an incorrect move command was issued, causing the server to crash; the user knows he did it and that is not the issue. A restore was ordered for the server since it was going to take more time to recover it than to just restore it. We requested that the old drive be placed into the server, as is commonly done, so that the drive's information could be recovered. Up until this point the server had never had any trouble with the disk drive in the server and no warning messages were ever generated. When the server was brought back up the tech reported that the drive's circuit board had fried and the drive was not recognized by the system. What does not make sense is that the tech said he could physically see the burnt chip. I know that static electricity can fry a circuit board but it should not leave a physical mark as described in a ticket. The only other thing that comes to mind is that somehow it was exposed to some sort of high voltage. This is what does not make sense: the drive that failed should never have left the box and only a single cable switched. Would you be able to look into this issue and perhaps see if anything else can be done? I just find it hard to imagine how this happened and want to make sure that the drive was not misplaced or switched.
Unfortunately, like too many servers, this user did not have any backups and there was no copy of the important data. I do realize this is completely the user's fault; we just want to make sure that everything has been done.
Friday, April 15, 2005
Well, the opinion right now is that BuddyGopher is totally borked for at least the next week. Come back then, or continue reading this blog now for hot recovery gossip.
The server that we all work on died today. What follows is the open trouble ticket with EV1Servers.net, our dedicated hosting provider.
4/15/2005 9:29:38 AM
We are now working on your ticket. And will update you as soon as possible.
4/15/2005 10:56:20 AM
Your original master drive is no good. Not able to be slave. The restore has be completed and your server is online.
4/15/2005 5:04:05 PM
4/15/2005 5:38:52 PM
Unfortunately the drive is physically damaged, showing a burned chip on the drive board. The drive will not spin and can not be recognized by the server. We have forwarded your ticket to management for any other data recovery options.
One character mistake.
How could a missed 'spacebar' key take down a web server? I'm sure there are lots of ways, but here is one that we found:
    mv db /* ./
instead of:
    mv db/* ./
When issued as root, things get nasty.
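Here is a small Python sketch of why that one space matters. It uses `shlex.split` to show how the two command lines break into arguments, and `glob.glob` to show what the stray `/*` stands for on whatever machine runs it. This is only an illustration of the argument splitting; it does not run `mv`.

```python
import glob
import shlex

# The typo: the space turns "db/*" into two separate arguments.
typo = shlex.split("mv db /* ./")
# What we meant to type.
intended = shlex.split("mv db/* ./")

print(typo)      # ['mv', 'db', '/*', './']
print(intended)  # ['mv', 'db/*', './']

# shlex does not expand wildcards, but the shell does. A bare "/*"
# matches every top-level entry in the root directory.
top_level = glob.glob("/*")
print(len(top_level), "top-level entries would have been handed to mv")
```

In other words, the typo asks mv to move `db` plus everything directly under `/` into the current directory.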
EV1 is in the process of restoring the BuddyGopher server after this catastrophic accident. Our production server is a sweet little HP P4 2.0GHz running Red Hat Enterprise Linux on two 80GB HDs and 1.5GB of RAM. BuddyGopher should be back on-line later this weekend.
Wednesday, April 13, 2005
We're working on it.
As a result of ongoing scalability tests, a few away messages might not have been collected from friends on your BuddyGopher BuddyList last night.
In other news...
- History Reduction
We have limited the Away Message history page to only display the last seven away messages. Dustin Smith suggested that we extend our history timeline to infinity. We can do that soon.
- New Favorites Feature
Just another good excuse to play around with XMLHttpRequest techniques? Nope. This was a user suggestion to curb problems with the limited availability of Away Message history. You can save unlimited Favorite messages, and over time we hope it'll turn into something like an Away Message yearbook. We definitely do not have any plans to delete the messages that you add to your Favorites list.
- MyDict bot
Our dictionary bot is not very stable. We didn't expect it to be so popular and are considering opening it up to get developer feedback.
Sunday, April 10, 2005
MyDict is an AIM Dictionary Bot
MyDict is an AIM bot that we built as a test-of-theory. You can query MyDict for English dictionary results from your Danger Sidekick (or any other AIM-enabled mobile device). MyDict works on the desktop, too, but our dictionary results aren't as current as other on-line dictionary sources.
As you may have noticed from using BuddyGopher, we take pride in returning fast, text-only results. One of the reasons that MyDict is so good at what it does is that we locally cache all of the dictionary definitions in an SQL database. This local database eliminates the need to query a central dictionary server.
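A minimal Python sketch of the local-cache idea follows. It is not MyDict's actual code: SQLite is used purely as a stand-in for "an SQL database", and the table name, column names, and helper functions (`open_cache`, `store`, `lookup`) are all hypothetical.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a local definitions cache. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS definitions ("
        "word TEXT PRIMARY KEY, definition TEXT NOT NULL)"
    )
    return conn


def store(conn, word, definition):
    """Save a definition locally so later lookups never leave the machine."""
    conn.execute(
        "INSERT OR REPLACE INTO definitions (word, definition) VALUES (?, ?)",
        (word.lower(), definition),
    )
    conn.commit()


def lookup(conn, word):
    """Return the cached definition, or None if the word is not cached."""
    row = conn.execute(
        "SELECT definition FROM definitions WHERE word = ?", (word.lower(),)
    ).fetchone()
    return row[0] if row else None
```

With every definition stored this way, answering a query is a single indexed local read rather than a round trip to someone else's server.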
I was considering offering a Wikipedia bot, but figured that the content was not as conducive to SMS-length messages. Maybe we could add a hyperlink at the end of MyDict results to a Wikipedia search? For a great implementation of a website integrating Wikipedia content, check out what Answers.com does.
Saturday, April 09, 2005
Thursday, April 07, 2005
Your IM client is silently broadcasting a handful of presence indicators to all of the buddies on your BuddyList. These indicators tell us to what degree you are available for chatting. Different states of presence include:
- away + idle
- Ready to Engage (for example, on Skype as "Skype Me")
In the BuddyGopher Development Wiki, you'll see us refer to these presence broadcasts as buddy_in notifications.
User software only needs to broadcast its presence state once, when it has changed. When you put up an Away Message on the AOL Instant Messenger network, your software broadcasts a presence notification to the AOL servers that says, "Hey! I'm away." and "Here is my Away Message: xxxx." These messages go to the AIM servers, and then the AIM servers tell everybody who has you on their BuddyList that you are now in an Away state.
Some clients are notoriously terrible about re-broadcasting their presence information over and over and over again. It's akin to calling your boyfriend and saying, "Hey! I'm bored!" every commercial break. A few versions of Trillian had problems like this. I have seen Trillian clients continuously broadcast their state once every minute. For one or two hundred buddies on your personal BuddyList, this is not a problem. For a system like BuddyGopher, where we have the largest BuddyList in the world (69,190 buddies at the time of this post), accidentally loquacious IM applications cause us to consume a lot of extra processing power.
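One way to keep chatty clients from costing too much is to remember the last state seen for each buddy and ignore exact repeats. Here is a Python sketch of that idea. It is an assumption about how such a filter could work, not a description of what BuddyGopher does, and the class and method names are made up.

```python
class PresenceFilter:
    """Drop presence notifications that repeat a buddy's last known state."""

    def __init__(self):
        # Maps buddy name -> (state, away_message) from the last notification.
        self._last = {}

    def is_new(self, buddy, state, away_message=None):
        """Return True only if this notification changes what we know."""
        current = (state, away_message)
        if self._last.get(buddy) == current:
            return False  # A re-broadcast of the same state; skip it.
        self._last[buddy] = current
        return True
```

Usage would look something like this: call `is_new` on every incoming notification and only do the expensive work (parsing, database writes) when it returns True. A client that repeats itself every minute then costs one dictionary lookup per repeat.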
Most people use the standard AIM client, which is the right thing to do.
Take a peek! BuddyGopher's freelance team of advisors includes many talented away message lovers. We want your help, too! Photograph by Youngna Park, featuring one-time BuddyGopher GUI architect Zach Klein.
Monday, April 04, 2005
Let's get started.
- one hour - 11,944 messages (between 4pm and 5pm)
- one day - 160,000 messages (between 4am yesterday and 4am today)
One would imagine that with the cumulative nature of our persistent collection system, we would continue to set new house records each week with an increasing user base. One would be correct. (A user is more likely to simply stop using BuddyGopher than to log in and delete all of their buddies.) This just made for a good first post.