Amyloo and Dave on pinging and scaling
Both Amy Bellinger and Dave Winer noticed my sentence about ‘scaling’ in the previous post, where I said:
“As more and more people both publish and subscribe to OPML Reading Lists, the polling that we are all doing isn’t going to scale.” (from “Reading List Ping”)
Amy said: “Polling won’t scale, Pito says. Apparently that’s the conventional wisdom. Could someone explain why it’s the conventional wisdom so I can understand it?”
Dave said: “Amyloo spotted a feature announcement at Pito’s blog, he says that polling of OPML Reading Lists won’t scale, but it will, and imho it’s the only way to go.”
I was being very vague, and I could also have been wrong.
Let me explain what I meant. If “P” is publishing an OPML Reading List and “S” is using that list for some purpose (like displaying the underlying blogs in a web-based user interface, or doing some complex analysis of the stuff referenced in the OPML, perhaps to populate some big database), then somehow “S” will want to know when “P” changes his list, and “P” will want “S” to find out. It’s in the interest of both parties. If they are cooperating with each other they can agree on a way to notify, which is what this ping is about.
Now if we are talking about lots of people publishing and lots of others subscribing, and we all want to cooperate with each other, then it seems like things could get inefficient with all the point-to-point pinging, since every publisher would have to ping every one of its subscribers. To be clear, the ping idea is purely about notifying that something has changed, not about avoiding the need to actually talk to each other to get the changed content.
I don’t have a good answer, though, to Dave’s point. Why can’t “S” just use a conditional HTTP GET (with ETags, say) to see if “P”’s list has changed and, if it has, grab it and check the differences? It is indeed extremely efficient.
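Here is a rough sketch of what that check could look like from “S”’s side, using only Python’s standard library. The function name fetch_if_changed and the opener parameter are made up for illustration, and the sketch assumes “P”’s server actually sends an ETag header and honors If-None-Match; not every server does.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, etag=None, opener=urllib.request.urlopen):
    """Return (changed, body, etag) for a reading list at `url`.

    `etag` is whatever ETag we remembered from the previous fetch, if any.
    `opener` is a hypothetical hook so the network call can be swapped out.
    """
    request = urllib.request.Request(url)
    if etag:
        # Ask the server to send the body only if it differs from our copy.
        request.add_header("If-None-Match", etag)
    try:
        with opener(request) as response:
            # 200 OK: the list is new or changed, so keep the new ETag.
            return True, response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: nothing to download, keep the old ETag.
            return False, None, etag
        raise
```

“S” would call this on a timer, remember the returned ETag, and only re-parse the OPML when the first value comes back True. When nothing has changed, the whole exchange is a request and a short 304 response with no body.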
Actually it gets even weirder if “P”’s reading list references (through a URL) some other reading list or OPML “X”. Now when “X” changes, “P” needs to find out, and then “S” needs to find out. It starts looking like a spreadsheet recalc problem. Pinging won’t help us there at all. Hmm. Needs more thought.
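To make the spreadsheet analogy a bit more concrete, here is a small sketch of the recalc ordering, assuming somebody had the whole “who includes whom” map in one place (which in real life nobody does). The name affected_in_order and the includes dictionary are invented for this example.

```python
from collections import defaultdict, deque


def affected_in_order(includes, changed):
    """List every reading list that must refresh after `changed` changes.

    `includes` maps a list's name to the names of the lists it references.
    Each result appears only after the affected lists it depends on.
    Lists caught in an inclusion cycle are left out of the result.
    """
    # Invert the map: for each list, who references it?
    dependents = defaultdict(set)
    for parent, children in includes.items():
        for child in set(children):
            dependents[child].add(parent)

    # Walk outward from the changed list to find everything affected.
    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dep in dependents[node]:
            if dep != changed and dep not in affected:
                affected.add(dep)
                queue.append(dep)

    # Order them so a list refreshes after the affected lists it includes.
    pending = {
        name: len(set(includes.get(name, ())) & affected) for name in affected
    }
    ready = deque(sorted(name for name, count in pending.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dep in sorted(dependents[node]):
            if dep in pending:
                pending[dep] -= 1
                if pending[dep] == 0:
                    ready.append(dep)
    return order


# "P" includes "X", and "S" uses "P": a change in "X" reaches "P", then "S".
print(affected_in_order({"P": ["X"], "S": ["P"]}, "X"))  # ['P', 'S']
```

The catch, of course, is that the includes map is spread across everybody’s servers, so in practice each party can only discover its own piece of it.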

How many “RSS won’t scale” gurus have turned into mutes?
Comment by Randy Charles Morin — April 22, 2006 @ 11:11 pm
A well-known distributed objects pattern could apply here, in that you have the OPML (a collection of dynamic URIs) be an object that both the P and S audiences are interested in. The P’s have update permission. The S’s listen for changes, and the OPML object knows how to announce “I have changed”. The “I’ve changed” event on the OPML delegates to a push notification that propagates to all the listeners in the S group. This is canonical pub/sub that middleware like Apache Pubscribe handles pretty well, though it uses heavyweight protocols (SOAP, not HTTP alone) and it assumes all nodes are always on.
Comment by rob r — April 23, 2006 @ 12:27 pm