configuration has arrived
xml feed fixed
announcing (finally) an rss feed!
blogdex button
weblog in the latimes
titles are working
who needs useless words?
yet another color scheme
a bit of good timing
the most recent redesign (to allow for configuration) used an old version of browseSource.asp, which introduced an error in the listing. i just fixed it, so things should be back to normal. i also changed the default listing to 25 items per page, as opposed to the original 10.
the long awaited set of configurations that i promised have finally arrived! you can now edit your preferences so that blogdex looks appealing to you.
i'm planning on adding more color schemes sometime in the near future (the design is pretty extensible). thanks to some persistence by people, i've also added a checkbox to open links in a new window. as soon as i get some time, i'll engineer it so that this too is a configurable preference. likewise, i'll get around to switching this weblog over to the new design system as soon as i get some time.
if you can think of any other simple preferences that would make your viewing habits easier, please don't hesitate to ask.
thanks to a tip from eric, i've fixed a "bug" that might have affected anyone trying to use xml::xpath to parse the blogdex feed. the problem resulted from some leading whitespace -- i'm not sure if there is a standard regarding what line the xml header needs to appear on, but it's fixed nonetheless.
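to illustrate the whitespace problem, here is a minimal sketch using python's standard xml.etree.ElementTree as a stand-in for xml::xpath (the original report was about the perl module, so this is only an analogy). the feed text and the helper names `parse_feed` and `strict_parse_fails` are made up for the example. a strict parser rejects text that has whitespace ahead of the xml declaration, and stripping it on the client side is one possible workaround:

```python
import xml.etree.ElementTree as ET

# hypothetical feed body with whitespace ahead of the xml declaration
raw_feed = (
    '\n  <?xml version="1.0"?>'
    "<rss><channel><title>blogdex</title></channel></rss>"
)


def strict_parse_fails(text):
    """return True if the parser rejects the text exactly as given."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return True
    return False


def parse_feed(text):
    """parse feed text after dropping any whitespace before the declaration."""
    return ET.fromstring(text.lstrip())


root = parse_feed(raw_feed)
title = root.find("channel/title").text
```

the real fix was on the server side (the feed no longer starts with whitespace), so a client should not need the `lstrip()` any more.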
after a bit of thinking on the issue of feedback (with help from paul nakada and dan chan), blogdex finally offers its first rss feed:
http://blogdex.media.mit.edu/xml/recent.asp
this offers the current top 10 links on blogdex. if you're interested in presenting more than the top 10, you can specify the count:
http://blogdex.media.mit.edu/xml/recent.asp?c=50
will get you the top 50. there is a hard maximum of 1000 links, but any given day only has ~500 links in the range that blogdex notices. this feed is now available on newsisfree.
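as a rough sketch of how a client might build these urls, the helper below (the name `feed_url` is hypothetical) adds the `c` parameter and clamps it to the hard maximum of 1000. clamping on the client side and the lower bound of 1 are assumptions for the example; the post doesn't say what the server does with out-of-range counts.

```python
from urllib.parse import urlencode

FEED_URL = "http://blogdex.media.mit.edu/xml/recent.asp"
MAX_LINKS = 1000  # hard maximum mentioned in the post


def feed_url(count=None):
    """build the feed url; with no count the server default (top 10) applies."""
    if count is None:
        return FEED_URL
    # assumption: keep the count between 1 and the documented maximum
    count = max(1, min(int(count), MAX_LINKS))
    return FEED_URL + "?" + urlencode({"c": count})


top_50 = feed_url(50)
```

calling `feed_url(5000)` would ask for 1000 links rather than sending a count the server may reject.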
in the near future i hope to add rss for all of the other pages, so that a weblog can include the last 10 people to link to them, or follow a link as it progresses through memehood. let me know if you have any suggestions. i'm also working on an appropriate button (88x31, by netscape's standards).
in the process of wasting time, i made a little blogdex button that will be going out with syndicated content:

it's free to use, available at http://blogdex.media.mit.edu/blogdex-site.gif . i might make a smaller one as well, but for the time being i thought i'd just shrink the logo. i've never made a logo button before, and am not sure whether or not there are any "logo button standards" that i'm supposed to be adhering to.
in case you didn't catch it, there was a nice story about weblogs in the context of 9/11 last sunday. there's a bit of a mention of blogdex, in addition to an interesting quote from me: "Tech geeks, Marlow pointed out, are Trekkies at heart." the good stuff comes out under pressure.
i'm still keeping track of all of the news that has come out referencing blogdex, to describe blogdex as a sort of meta-meme. maybe i'll work on that right now.
i had a bug in my title crawler which was preventing it from retrieving content from foreign sites. now that it is back up and working, all of the recent titles should be crawled nightly (the older ones will come with time).
as many of you have noticed, the new interface to blogdex includes a list of phrases used by people to describe a given site, giving a context to an otherwise meaningless piece of information. one of the downsides of this information is that sometimes people don't really add any information, but instead refer to sites as "this" or "here." in order to increase the amount of good information, i've stoplisted a few phrases:
1. contextually meaningless words (link, site, url, article, story)
2. certain demonstratives (this, that)
3. demonstratives + contextually meaningless words (this article, that site, etc.)
4. urls (http://www.thissite.com, www.thissite.com)
if you can think of anything else that's distracting, let me know and i'll recompile the descriptions. i think that it's a lot easier to read now.
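the four rules above could be sketched roughly like this. it is not the actual blogdex code, just an illustration: the names `is_stoplisted` and `filter_descriptions`, the url pattern, and the sample descriptions are all assumptions.

```python
import re

# word lists taken from the rules in the post
MEANINGLESS = {"link", "site", "url", "article", "story"}
DEMONSTRATIVES = {"this", "that"}

# assumption: a description counts as a url if it is a single token
# starting with http://, https:// or www.
URL_RE = re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)


def is_stoplisted(phrase):
    """return True if the phrase adds no context about the site."""
    text = phrase.strip().lower()
    if not text or URL_RE.match(text):
        return True
    words = text.split()
    if len(words) == 1:
        return words[0] in MEANINGLESS or words[0] in DEMONSTRATIVES
    if len(words) == 2:
        return words[0] in DEMONSTRATIVES and words[1] in MEANINGLESS
    return False


def filter_descriptions(phrases):
    """keep only the descriptions that survive the stoplist."""
    return [p for p in phrases if not is_stoplisted(p)]


sample = ["this", "that site", "www.thissite.com", "a story about weblogs"]
kept = filter_descriptions(sample)
```

longer phrases that merely contain a stoplisted word (like "a story about weblogs") are kept, since they still carry some context.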
for those of you that are frustrated with the 3LIT3 color scheme (white on black), i'm brewing up a more respectable look for you which should look something like this:

i said that i'd have a solution by friday, but the events of the past week left me broken on friday and the weekend, so i'll get around to it sometime early this week. i'm also allowing for the configuration of the font size and brevity of descriptions.
so i had this talk to give yesterday about blogdex (which is what originally prompted the redesign that has been so controversial). to make a long story short, there is a point in the talk where i make the claim that blogdex is good at identifying memes. sometimes the top links are just links to other news stories, which tends to make people start asking questions. but not yesterday.. because the bert meme was going strong! it really helped drive the point home. it's such a weird meme that even a seasoned memester like myself didn't know how to interpret it. good stuff.
it's the middle of our sponsor week here at the media lab, and amidst all of the ideas and spittle flying around the room, there's also a huge number of webcast packets flying around our network. this is making blogdex flaky, at least from my end. if you get some weird reaction from the web server, just persist and it will work eventually.
after dealing with a huge spamming of the urls (~ 10000 sites a day), i've added an ip address field to my database so that i can pick out the culprits.
all of the spam has come from one domain, t-dialin.net, which seems to be a pretty major isp in germany. instead of blocking all ip addresses from this domain (which is a pretty big lot), i'll just censor them after they are added.
i really do not want to block any ip addresses, but if the traffic from this guy gets out of hand, then i'll have to. has anyone had any experience with this sort of thing before? since this person probably speaks german, there is little chance that this message will dissuade him, but i thought i would post it anyway.
if you have added your system recently, give me a little bit of time to overcome this problem. it might take a couple of days.
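with an ip address stored next to each submission, picking out the culprits comes down to counting submissions per address. here is a small sketch of that idea; the record layout, the threshold, the function name `heavy_submitters`, and the sample addresses (taken from the reserved documentation range) are all assumptions, not the real database schema.

```python
from collections import Counter


def heavy_submitters(submissions, threshold):
    """return {ip: count} for addresses with at least `threshold` submissions.

    `submissions` is an iterable of (ip_address, submitted_url) pairs.
    """
    counts = Counter(ip for ip, _url in submissions)
    return {ip: n for ip, n in counts.items() if n >= threshold}


# made-up sample data
log = [
    ("192.0.2.10", "http://spam.example/1"),
    ("192.0.2.10", "http://spam.example/2"),
    ("192.0.2.10", "http://spam.example/3"),
    ("192.0.2.77", "http://weblog.example/"),
]
culprits = heavy_submitters(log, threshold=3)
```

the urls submitted from the flagged addresses can then be removed after the fact, without blocking anyone outright.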
i've been tweaking some of the representations that i use in mysql for better performance. despite some frustrating setbacks (first and foremost being the inability to create an index which has two columns sorted in reverse of each other), i think that things should be faster now. browsing the indexes should be much much faster.
in preparation for the coming week here at the media lab, i have completely redesigned the site. if you were fond of the old site, there is good news: i will be implementing a bit of personalization so that you can make things look more like the old way:
1. you will be able to get a brief listing, without all of the extra information
2. the colors will be either white on dark blue OR black on white
3. the fonts will be adjustable.
if you have any general comments about the design, please let me know here. thanks!
i'm working on a new version of the site, which should be going up next week. for those that are curious, it can be viewed immediately here. i've taken a little bit from the design of google and daypop to provide more information in the initial interface.
the phrases listed for each site are the references that were made to that site, ordered from most popular to least popular. this way you can get a good feel for what each site is about before you visit it.
again, this site is in development, so go easy on it. many of the links do not work yet. i should have it up and running by the end of the weekend.
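the ordering of the phrase list can be sketched in a few lines: count how often each phrase was used to reference a site, then sort by that count. the function name `rank_phrases`, the lowercasing, the alphabetical tie-break, and the sample data are assumptions for the example.

```python
from collections import Counter


def rank_phrases(references):
    """return (phrase, count) pairs, most popular first.

    phrases are lowercased and trimmed before counting; ties are
    broken alphabetically so the order is stable.
    """
    counts = Counter(p.strip().lower() for p in references if p.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# made-up references to a single site
refs = ["bert meme", "Bert meme", "weird picture", "bert meme", "news story"]
ranked = rank_phrases(refs)
```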
i apologize to those that came to the index today to find a majority of the websites dominated by this "freesites.net" stuff. blogdex was not, in fact, subverted, but rather fell victim to a common problem that could be avoided.
all of the weblogs that use the server "crosswinds.net" were taken down by their provider last night. in their place was a generic "this site not found, but please use crosswinds" page. this page linked a number of times to sites that blogdex has not seen before.
the system is aware of all pages that are exactly the same, but right now i am manually using a tool to delete those that actually come from the same system (such as the culprits last night). sometime soon i'll put a check in to mark all of the links with duplicate pages as "questionable," and not use them in the statistics.
thanks again to everyone that pointed this out to me, and i'm sorry i didn't get to it sooner.
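one way the planned "questionable" check could work is to hash each crawled page and flag every source whose content is byte-for-byte identical to another source's. this is only a sketch of that idea, not the check that blogdex uses; the function name `questionable_sources`, the use of sha-1, and the sample pages are assumptions.

```python
import hashlib
from collections import defaultdict


def questionable_sources(pages, min_copies=2):
    """return the set of source urls whose page content is duplicated.

    `pages` maps a source url to the html that was crawled from it.
    """
    by_digest = defaultdict(list)
    for url, html in pages.items():
        digest = hashlib.sha1(html.encode("utf-8")).hexdigest()
        by_digest[digest].append(url)

    flagged = set()
    for urls in by_digest.values():
        if len(urls) >= min_copies:
            flagged.update(urls)
    return flagged


# made-up crawl results: two sources replaced by the same placeholder page
placeholder = "<html>this site not found</html>"
crawl = {
    "http://a.example/": placeholder,
    "http://b.example/": placeholder,
    "http://c.example/": "<html>a real weblog</html>",
}
flagged = questionable_sources(crawl)
```

links found on flagged sources could then be left out of the statistics instead of being deleted by hand.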


