blogdex.com

it appears that the folks at nupedia.com are using the address blogdex.com to direct traffic to their service. i just wanted to mention that we are in no way associated with the nupedia project, nor are we quite sure why they're using the domain name 'blogdex.com'.

Posted by cameron on February 27, 2002 at 03:42 PM
new webloggers confirmed

after being away for over a week, the number of unconfirmed, newly added webloggers had grown to nearly 700. this was a daunting task for our crew, but we tackled it on friday. we should be back on a tight schedule (~ 1 day) again for site additions.

Posted by cameron on February 24, 2002 at 11:55 AM
Yet Another Explainable Problem (YAEP)

so last night the crawler failed to complete AGAIN, thanks to the fact that I ran out of disk space :) I guess I didn't really need a copy of the database from EVERY DAY SINCE BLOGDEX BEGAN. got rid of those and now we're good to go.

oops.

Posted by cameron on February 10, 2002 at 12:47 PM
saturday morning bug

somehow, the weirdest bugs always seem to create themselves on saturday morning. i was at home, relaxing with my morning tea, watching a bit of the ballers cribs, when for some reason the index page stopped displaying any links. after getting to work, i still don't know what happened, but it should be working again for the time being.

the statistics were fine, as the other index pages and rss would reveal, but the main index was broken.

Posted by cameron on February 09, 2002 at 03:22 PM
url escapes

some of the information pages were not being found due to unescaped url characters. i'm not sure how this one got by me for so long, but it's fixed nonetheless. unfortunately, now the urls are twice as long as they were before.

for brevity's sake, i did not escape "/" and "\" since these don't seem to affect anything. i was also forced to escape "%" and "+", which wouldn't normally need it, because of the double-escaping problem: some urls are stored in the database already escaped, and i need to prevent perl from unescaping them.
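a rough sketch of the idea, written in python rather than the perl that blogdex actually runs, so treat it as an illustration and not the real code. the function name and the example url are made up. it leaves "/" and "\" alone, escapes everything else that is unsafe (including "%" and "+"), and shows that a single unescape hands back the url exactly as it was stored:

```python
from urllib.parse import quote, unquote


def escape_for_info_page(stored_url: str) -> str:
    """Escape a stored url for use inside another url (hypothetical helper).

    "/" and "\\" are left as-is for brevity. "%" and "+" are NOT in the
    safe set, so they become "%25" and "%2B". That way a url that was
    already escaped in the database survives one round of unescaping.
    """
    return quote(stored_url, safe="/\\")


# made-up example: a url that is stored in already-escaped form
stored = "http://example.com/a%20b?q=c+d"
escaped = escape_for_info_page(stored)

# one unescape gives back the stored form, "%20" and "+" intact
round_trip = unquote(escaped)
```

the cost is the one mentioned above: every "%" in an already-escaped url turns into "%25", so the escaped urls get noticeably longer.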

anywho, let me know if you experience any issues, or think there is a better solution . . .

Posted by cameron on February 08, 2002 at 10:08 PM
crawler confusion

for some reason the crawler didn't complete successfully last night. it's running right now, and the new statistics should be up shortly. all of these hiccups should finally be resolved when we get the new crawler in place, probably within the next week or so.

Posted by cameron on February 08, 2002 at 11:56 AM
i'm hot on the trail

sorry for the garbage interface, but i'm hot on the trail of the new problems that i've created by tweaking a few of the underlying systems.

Posted by cameron on February 05, 2002 at 10:36 AM
one million mark

we just broke 1 million observed links. and we're not stopping. in honor of this joyous event, we'll be testing out our new crawler over the next couple of days. it should provide strikingly close to up-to-the-minute results and fantabulous efficiency. plus, it should enable us to try all kinds of new experiments that were impossible under our old model. more to come...

Posted by cameron on February 04, 2002 at 04:10 PM