Archive for January, 2005

RSS to Combat Spam?

There’s an interesting post over at ZDNet suggesting that RSS might be a way around the problems of spam and phishing-style email attacks.

First, I think they are correct in noting that email has become increasingly problematic as a communication medium because of what spam has done. Confidence that information is reliably transmitted via email has dropped dramatically, and interestingly, not because of its security problems. Given that companies like eBay can’t rely on it to send important information because nobody trusts email that says eBay on it anymore, what kind of solutions are there?

The idea that RSS could replace email is provocative. Here’s what they have to say:

Why not have a separate feed for every customer? This is the same thinking that went into another idea I had — overnight shippers setting up separate RSS feeds for every package they handle. This way, I can subscribe to packages I’m sending or receiving, and my RSS aggregator (Newsgator, etc.) alerts me to changes in each package’s status. To keep a lid on the number of RSS feeds a shipper must run, the RSS feed for each package would expire a few days after the package arrives.

From a security point of view, this makes a tremendous amount of sense. Because the client queries the provider, as opposed to email, where any sender can push messages to the client, it is much easier for clients to be sure they are getting good data.

They go on to think about how whole email systems could be replaced by RSS:
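
To make the per-package feed idea a little more concrete, here is a minimal sketch of what a shipper might generate. Everything in it is an assumption for illustration: the function names, the shape of the package object, the example URL, and the three-day expiry window standing in for “a few days” are mine, not anything a real shipper offers.

```javascript
// Escape XML special characters so status text cannot break the feed.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Assumption: "a few days" after delivery is taken to mean three days.
const EXPIRY_DAYS = 3;

// A feed expires once the package has been delivered for longer than EXPIRY_DAYS.
function isFeedExpired(pkg, now) {
  if (!pkg.deliveredAt) return false;
  const ageMs = now.getTime() - pkg.deliveredAt.getTime();
  return ageMs > EXPIRY_DAYS * 24 * 60 * 60 * 1000;
}

// Build an RSS 2.0 document with one <item> per status event.
function buildPackageFeed(pkg) {
  const items = pkg.events.map(function (e) {
    return '    <item>\n' +
      '      <title>' + escapeXml(e.status) + '</title>\n' +
      '      <pubDate>' + e.at.toUTCString() + '</pubDate>\n' +
      '    </item>';
  }).join('\n');
  return '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<rss version="2.0">\n  <channel>\n' +
    '    <title>Package ' + escapeXml(pkg.trackingId) + '</title>\n' +
    '    <link>' + escapeXml(pkg.link) + '</link>\n' +
    '    <description>Status updates for package ' + escapeXml(pkg.trackingId) + '</description>\n' +
    items + '\n  </channel>\n</rss>\n';
}
```

The aggregator only ever fetches a feed the customer chose to subscribe to, which is the client-queries-provider property that makes this attractive.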

Finally, could widespread use of this approach be the backdoor towards flipping all existing e-mail solutions on their ear, turning them from SMTP-based store-and-forward systems to RSS-based alert-poll-and-retrieve systems (alert my mail server of an RSS feed that has something for me, poll that feed, and retrieve the message)?

Yes, this might be a great way to deal with some of the problems of spam, but really, it’s probably a better idea to start with the security flaws in email instead of trying to use RSS to solve the problem.

Email isn’t going to be replaced by RSS. Perhaps this is naive of me to say, but what works about email is that it allows a “cold call”, something that the client-to-provider query model doesn’t work for. You can’t get somebody to subscribe to your RSS feed unless they take action. While this solves some of the problems around spam, it prevents you from getting information that you’re not actively trying to get.

Instead, figuring out how to let the good “cold calls” through is probably a better way to deal with spam. Sender authentication, secure IDs, and so on offer much better ways of dealing with the problems of the provider-to-client model.

That being said, for those people who actually use RSS as their lifeline, companies like eBay ought to be providing feeds. Since eBay already produces feed-like content, they might as well make it RSS compliant.

Book Content via RSS?

Russell Beattie is suggesting that book content delivery could be done via RSS. By serializing or “chunkifying” a book (or large piece of content) into installments, RSS could be an easy form of delivery. Given that more and more kinds of content are being delivered via RSS, encouraging people to use aggregators, having larger textual content in the same sphere makes some sense. The question is: is it useful?

As much as I have disdained podcasting previously, what I do like about it is that it makes content delivery easy. Since it can be tied into software that people already use (audio players and RSS readers), it makes finding and getting new content easy. Photo services like Flickr do a similar kind of thing: they make it easy to find and retrieve content while using existing tools.

What seems different about syndicating book content is that while people use their RSS readers to read content online, I’m not sure that they use them to read large texts online. Certainly there are people who do, but frankly, give me a book in my hand. Neither my laptop nor my desktop is a happy reading companion, and in my limited research, most people feel similarly. Given that there aren’t reasonable devices to which one might download book content (sure, there are devices, but who really uses them?), the idea that this might take off the way that podcasting has doesn’t really seem to square.

Perhaps the kinds of books I like to read are different from what might be done on an RSS feed, but my hunch is that for this kind of content delivery to work, either there needs to be a semi-ubiquitous device that many people have and use every day to read content (like mp3 players for podcasting), or the kinds of content that might be suitable for this need to begin to appear, à la the two-minute mystery.

I like using RSS to collect data, but I find it most useful where it either becomes an aggregation of a search for content (i.e., a compilation of all the news that I would otherwise have to search out on the web), or a way to get data automatically into another form, which I don’t want to do by hand.

Securing Basecamp

At Eggplant, we use Basecamp to manage our various projects. One of the nice features of Basecamp is that it lets us store files related to projects on a server (not on Basecamp’s own) and keep important documents easily accessible from the various projects.

This creates two problems. For one, since the files aren’t on Basecamp’s servers, you have to put the files somewhere, which means having an FTP login somewhere. A recent update now lets you use SCP, which is a massive improvement, so that’s one problem down. The real problem, though, is that since the place where the files go has to be web accessible, there is an open directory somewhere where all your files live. This is a major security problem and potentially really bad for relationships with clients, especially since Google has a habit of finding things that it shouldn’t, or at least that I don’t want it to.

Here’s a solution.

1) Password-protect the directory where you’re storing the files with an .htaccess file.
2) In the Basecamp URL, include the user name and password:

https://eamorg:flapjacks@red.eggplantmedia.com/~basecamp/

Now you keep Google and other prying eyes out, while still keeping the files accessible to Basecamp. Of course, if Basecamp actually stored the files on their server, this wouldn’t be a problem, but for now, it’s a fix.
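
One wrinkle with step 2: if the user name or password contains characters such as “@” or “:”, they need to be percent-encoded before going into the URL. Here is a small sketch of how that might be done; the function name and its parameters are hypothetical, not part of Basecamp.

```javascript
// Hypothetical helper: build an https URL with embedded credentials.
// encodeURIComponent percent-encodes characters such as "@" and ":"
// that would otherwise be mistaken for URL delimiters.
function buildAuthUrl(user, password, host, path) {
  var credentials = encodeURIComponent(user) + ':' + encodeURIComponent(password);
  return 'https://' + credentials + '@' + host + path;
}
```

With plain alphanumeric credentials the result is the same kind of URL shown above; the encoding only changes anything when the credentials contain reserved characters.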

MusicBrainz ate my mp3s

Jason pointed out MusicBrainz to me recently, and now that I’m giving it a whirl, I have to say that I’m impressed. By taking “fingerprints” of mp3 files, it’s able to compare files to a moderated database of files maintained on the site. This allows users to match existing files to the database and determine mp3 tags. This is really useful if your mp3 (AAC, Ogg, whatever) files weren’t well tagged to start with. I ripped many of my own CDs long before the online databases were as good as they are now, and many of my files have bad tags.

For such a program to work, though, it has to integrate well with how you listen to music. The OS X client is really quite nice, or at least, it seems like it would work well in a standard configuration. The problem is that I don’t use iTunes. Generally, I use Amarok to listen to music at home, but as of yet, Amarok doesn’t have a direct plugin for MusicBrainz, or at least not one that I can figure out how to operate. My devious plan to work around this was to connect my laptop (OS X) to the Samba server on my desktop (Linux), grab my whole mp3 directory into iTunes, tag it via iEatBrainz, and then I’d be done with this mess.

Unfortunately, because the files don’t reside on the Mac, it doesn’t seem to want to actually alter the tags. I was able to finally get good track and album data for some 300 or so tracks that I never bothered to label, but I haven’t had any luck actually getting that data onto those files.

This aside, I did have some bad file matches that were somewhat obvious to me. Some of these errors I was able to quickly catch by looking at the file name. It seems like it would be good to parse the file name in addition to the fingerprint and do some comparative analysis that way. Often times the file names have artist or track names which can be good clues for getting the right complete tag. I’m not sure if iEatBrainz works this way or not, but either it doesn’t or it does it poorly.
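
To show the kind of file name check I have in mind, here is a rough sketch. It is not how iEatBrainz or MusicBrainz works; the function names and the naming patterns it recognizes (“Artist - Title.mp3”, “Artist/Album/03 - Title.mp3”) are assumptions made for illustration.

```javascript
// Hypothetical helper: guess tag fields from a path such as
// "Artist/Album/03 - Title.mp3" or "Artist - Title.mp3".
function guessTagsFromPath(path) {
  var parts = path.split('/').filter(Boolean);
  var file = parts.pop().replace(/\.[^.]+$/, ''); // drop the extension
  var guess = {};
  var rest = file;
  // A leading one- or two-digit number is treated as the track number.
  var numbered = file.match(/^(\d{1,2})[\s._-]+(.+)$/);
  if (numbered) {
    guess.track = parseInt(numbered[1], 10);
    rest = numbered[2];
  }
  var pieces = rest.split(' - ').map(function (s) { return s.trim(); });
  if (pieces.length >= 2) {
    guess.artist = pieces[0];
    guess.title = pieces[pieces.length - 1];
  } else {
    guess.title = pieces[0];
  }
  // Fall back to an artist/album directory hierarchy when present.
  if (!guess.artist && parts.length >= 2) {
    guess.artist = parts[parts.length - 2];
    guess.album = parts[parts.length - 1];
  }
  return guess;
}

// Fraction of guessed fields (artist, album, title) that agree with a
// fingerprint candidate, ignoring case. Returns null if nothing was guessed.
function agreementScore(candidate, guess) {
  var checked = 0;
  var matched = 0;
  ['artist', 'album', 'title'].forEach(function (field) {
    if (guess[field] === undefined) return;
    checked++;
    if (candidate[field] !== undefined &&
        String(candidate[field]).trim().toLowerCase() ===
        String(guess[field]).trim().toLowerCase()) {
      matched++;
    }
  });
  return checked === 0 ? null : matched / checked;
}
```

A low score on a fingerprint match would be a cheap signal that the match deserves a second look before any tags get written.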

I also wish that it used a different interface. It seems like the most useful way to do this would be to have two different modes: one to verify individual tracks, another to view a whole collection. From the collection standpoint, the program could recurse the whole collection and give a % certainty on each track, giving access to the list of potential suggestions if it found them. Simultaneously, it could take data that looks like it’s pretty good (i.e., complete tags, a file that sits in a matching artist/album hierarchy, etc.) and populate the online database.

Even though I haven’t gotten it quite right yet, it’s a remarkable piece of thinking, particularly the fact that the online database can be moderated by members and continually updated. It’s a great idea; I just want it to work!

Template Toolkit Javascript Image Navigator

I built a little JavaScript/TT image navigator that allows you to load images from a directory, display them as thumbnails, and zoom them as you click on them. This is nice for doing things like selecting images for various web applications. It relies on some JavaScript, but to my knowledge it’s totally cross-browser compatible.

Here’s the JavaScript you’ll need:
[code]

// Toggle a specific div's display status between 'block' and 'none'.
function div_toggle(thisdiv) {
  var the_div = document.getElementById(thisdiv);
  if (the_div.style.display == 'block') {
    the_div.style.display = 'none';
  } else {
    the_div.style.display = 'block';
  }
}

[/code]

Here’s the TT code. The code assumes you have print size images in

[code]

[% USE dir = Directory('/var/www/eam/www/images/flyermaker') #name of the directory you're using -%]
[% thepath = "/images/flyermaker/" #path that you want images to use -%]

Choose Image:

[% FOREACH file = dir.files -%]
[% IF ((loop.index mod 12 == 0) && (loop.index >= 1)) -%]

[% END -%]


[% IF loop.index mod 6 == 5 -%]
[% END -%]
[% IF (loop.index mod 12 == 11) -%]Images [% loop.index - 11 %] - [% loop.index + 1 %]
[% IF loop.index > 12 %]
 
< Previous
 |
[% END %]
 
Next >
[% END %]
[% IF loop.last %]
< Previous[% END %]
[% IF ((loop.last) || (loop.index mod 12 == 11) ) -%]

[% END -%]
[% END %]

Image Choice:




Click image to zoom

[/code]
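
The paging arithmetic in the template (groups of 12, labelled “Images 0 - 12”, “Images 12 - 24” and so on) can also be sketched in plain JavaScript. This is only an illustration: the function name is hypothetical, and unlike the template it leaves off the Next link on the final page.

```javascript
// Hypothetical helper: split `count` images into pages of `pageSize`
// and describe each page's label and navigation links.
function pageRanges(count, pageSize) {
  var pages = [];
  for (var start = 0; start < count; start += pageSize) {
    var end = Math.min(start + pageSize, count);
    pages.push({
      label: 'Images ' + start + ' - ' + end,
      hasPrevious: start > 0,
      hasNext: end < count
    });
  }
  return pages;
}
```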

Fun! Here’s an example of its use, shamelessly stolen from the code I wrote for Jason for the forthcoming Eggplant site.

[Live demo not shown: a “Choose Image” thumbnail browser paged in groups of 12 (Images 0 - 12 through Images 60 - 72) with Previous and Next links, followed by an “Image Choice” area. Click image to zoom.]

US stops looking for WMD

Finally, the US stops looking for WMDs in Iraq. Here’s a quote from ABC News:

Chief U.S. weapons hunter Charles Duelfer is to deliver his final report on the search next month. “It’s not going to fundamentally alter the findings of his earlier report,” McClellan said, referring to preliminary findings from last September. Duelfer reported then that Saddam Hussein not only had no weapons of mass destruction and had not made any since 1991, but that he had no capability of making any either. Bush unapologetically defended his decision to invade Iraq.

While it is somewhat surprising to see the semi-critical statement at the end of this quote, the substance of it has been clear for some time. The irony, of course, is that the tactic of the Bush administration has been to take the belligerent position and stick to it no matter what.

Here’s where I really wonder about the “Wisdom of the Crowd” mentality, or even the argument that media is in some sense self-correcting. Now that the administration itself has released information indicating that the pretext for invasion was completely wrong, will the blogosphere “self-correct”? Even if bloggers get the media to continuously mention that the Bush administration’s rationale for entering Iraq was completely mistaken, there is no connection between that “self-correction” and the policies of the administration. Habermas, who championed the idea of a fourth estate, understood that media ceased to function as a corrective mechanism decades ago.

BookTracks and Maybe “Open Book Data Exchange Format”?

After kellan pointed out that somebody else has tried to do trackbacks with books (aka booktracks), I’ve spent a bunch of time trying to think through how this kind of information architecture might work. Much of my thinking around booktracks is related to publishers: for publishers, it’s useful to track reviews and have ways of seeing who is talking about books and where.

The problem with this model is that most people don’t think about books in terms of publishers. When we look for books, for the most part we look by subject matter and author. Since publishers often produce a wide variety of material, the likelihood of liking other books from the same publisher isn’t as high as it would be for other books by the same author.

The problem is that in terms of tracking conversations about books, publishers are probably more interested than anybody else. Of course, authors are going to be interested in what people say about their books, and people who want to read books want to know what people are talking about, but publishers want to be able to watch all of their books and have an easy way of tracking down that information. Amazon has been a central place where people do this, but it drives traffic away from publishers and other book sellers and also doesn’t necessarily have the best format for discussions about books. Given Amazon’s dominance in the world of book selling, it’s easy to understand why it became the place for people to find reviews.

While I generally take the position that the centralization of information is a good thing, as it means that people who want to use information know where to go for it, the problem with things like reviews and book discussions is that they aren’t all going to take place in one spot. Furthermore, the software that Amazon is using isn’t really conducive to sharing information about books or the reviews that are on its site, and it’s a for-profit company that probably couldn’t care less about sharing information about books. Of course, people have figured out, are figuring out, and will keep figuring out ways to mine Amazon for data, but relying on them isn’t exactly great for getting rich data to flow.

So the question is: how do you centralize book review and conversation data without relying on a single site?

One possibility is to use the publisher as the repository. This makes sense in that the publisher ought to have the best data on the book and be able to announce when new editions come out, etc. The problem is that not all publishers are online, they aren’t an intuitive place to look for reviews, and they may not be happy if negative things are said about books.

My hunch is that FOAF (friend of a friend) is something to look at. Imagining for a moment that we have some kind of Open Book Data Exchange Format, this would allow anybody talking about books to publish something in this format, and then any number of applications could sniff around and use this data in unique ways. Because books already have a handy unique identifier (the ISBN), it would be easy to aggregate data and do fun manipulations with it. Of course, the problem is in supporting it and getting people who write reviews to actually start using it, but given that bloggers seem to love various forms of syndication, it makes sense to me that it could be adopted by a community.
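
As a rough illustration of why the ISBN makes aggregation easy, here is a sketch that groups review records from different sites by ISBN. No such exchange format exists yet, so the record fields (isbn, url) and the function names are entirely hypothetical; the only real rule used is the standard ISBN-10 check digit (weights 10 down to 1, total divisible by 11, with “X” standing for 10 in the last position).

```javascript
// Strip hyphens and spaces so "0-306-40615-2" and "0306406152" compare equal.
function normalizeIsbn(raw) {
  return String(raw).replace(/[\s-]/g, '').toUpperCase();
}

// Validate a normalized ISBN-10 using its check digit.
function isValidIsbn10(isbn) {
  if (!/^\d{9}[\dX]$/.test(isbn)) return false;
  var sum = 0;
  for (var i = 0; i < 10; i++) {
    var value = isbn[i] === 'X' ? 10 : Number(isbn[i]);
    sum += value * (10 - i);
  }
  return sum % 11 === 0;
}

// Group hypothetical review records by ISBN, skipping records whose
// ISBN fails the check.
function groupReviewsByIsbn(reviews) {
  var byIsbn = new Map();
  reviews.forEach(function (review) {
    var isbn = normalizeIsbn(review.isbn);
    if (!isValidIsbn10(isbn)) return;
    if (!byIsbn.has(isbn)) byIsbn.set(isbn, []);
    byIsbn.get(isbn).push(review);
  });
  return byIsbn;
}
```

An aggregator built this way never needs to know where a review lives; it only needs the review to carry an ISBN it can validate.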
