Utterances of a Zimboe

Programming the Internet.

Archive for November 2007

Volatile (i.e., highly dynamic) supergraphs

with 6 comments

Update: Chris Messina had similar thoughts a month earlier:

In fact, it’s no longer even in your best interest to store data about people long term because, in fact, the data ages so rapidly that it’s next to useless to try to keep up with it. Instead, it’s about looking across the data that someone makes transactionally available to you (for a split second) and offering up the best service given what you’ve observed when similar fingerprint-profiles have come to your system in the past.

Now back to my contribution…

Now that everyone’s frenzying over the gruffalo, I guess it’s as good a time as any (at least) to expand on my graph post [also] a little. The post isn’t anything notable, but it gives some context if you’re interested.

For the topic’s current initiation, see TBL on Giant Global Graph:

[Internet] made it simpler because of [instead?] having to navigate phone lines from one computer to the next, you could write programs as though the net were just one big cloud, where messages went in at your computer and came out at the destination one. The realization was, “It isn’t the cables, it is the computers which are interesting”.
The WWW increases the power we have as users again. The realization was “It isn’t the computers, but the documents which are interesting”.
Now, people are making another mental move. There is realization now, “It’s not the documents, it is the things they are about which are important”.

(See also a nice roundup on ZDNet, Who is afraid of the GGG?)

My views next.

Dynamic graphs

Put simply, my approach to building a next-gen SNS would be to extend the current feed polling capabilities to the graph aspects as well (social networks etc.). For example, I currently have Facebook ‘importing’ my Jaiku feed, so why couldn’t FB just as well import changes to my Jaiku contact list and apply them to the FB domain?

The underpinning aspect here is that the future services should not try to be exhaustive by themselves, but to present the whole graph from their point of view as well as possible — be that ‘entertainingly’ or whatever is their beef. More generally, services should fetch data all over the web, and decorate it in their own, valuable way. Services would be much more interesting if they combined all the data and not just tried to hold on to that very tiny fragment (eg., the FB account) of my overall graph so fiercely.

This implies that services would need to adapt to any change in the user’s social graph, which happens to be exactly what I want at a very personal level; of course, it is also professionally highly intriguing. In practice, it’d even be somewhat trivial: just import the damn changes as you read the feeds. The difference from feed polling is that what gets fetched is not new content per se but changes to existing data. So, services need to delete stuff (relationships etc.) as well. For implementation details, see for example “Dynamic Graph Algorithms” by Eppstein et al., 1999, which appeared among the first Google hits (can’t remember for what; try it out). (‘Someone’ should probably adapt those to Rails so we could get the ball rolling…)
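To make the “import the changes, deletions included” idea concrete, here is a minimal sketch in Scala. It assumes a service polls a contact list as a plain set of user names and diffs it against the previous poll; `ContactSync`, `ContactDelta`, `diff` and `applyDelta` are hypothetical names made up for illustration, not part of any existing service API. It only covers the diff-and-apply step, not the dynamic graph algorithms in the paper mentioned above.

```scala
// Hypothetical sketch: none of these names belong to a real service API.
object ContactSync {
  /** Changes between two polls of the same contact list. */
  final case class ContactDelta(added: Set[String], removed: Set[String])

  /** Compare the previously seen contact list with the freshly polled one. */
  def diff(previous: Set[String], current: Set[String]): ContactDelta =
    ContactDelta(added = current -- previous, removed = previous -- current)

  /** Apply a delta to the importing service's local view: removals as well as additions. */
  def applyDelta(local: Set[String], delta: ContactDelta): Set[String] =
    (local -- delta.removed) ++ delta.added
}
```

A service would call `ContactSync.diff` on every poll and feed the result to `applyDelta`, so a contact dropped on the source side also disappears from the importing side instead of lingering there.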

Of course, the biggest obstacle currently is the walled gardens. (I expected OpenSocial to do something about this, but Google failed me.) There are, however, at least a couple of sites that publish more data, but it’s still very little. And note that the data here comprises the basic building blocks of the semantic web, yet the semantic web drafts completely ignore the fluctuating nature of data, which is why I think the semweb is flawed from birth.

Supergraph instead of a flat global graph

Then there was my notion of ‘supergraphs’ in the title; ‘volatile’ was just to emphasize the true dynamism: if you don’t store external data, you don’t need to synchronize it. (Keep it simple.) By supergraph I mean that different kinds of graphs should not be bluntly blended; the metadata about each graph should be kept and used accordingly. Social networks provide a fine example here, too.

This ‘supergraph’ thing should be very intuitive also: if I ever wanted my LinkedIn contacts to be mapped to Facebook, it’d be extremely nice if the LI contacts were presented in a different style than my Jaiku contacts. And, perhaps, there could be some different tools available for each kind of network — please, no vampire stuff (like, wtf…) for the business contacts, aight?
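As a rough illustration of keeping the networks apart rather than blending them, the sketch below tags every relationship with the network it came from and exposes each network as its own layer. The names (`Supergraph`, `Edge`, `layers`, `contacts`) and the string labels for networks are assumptions made for the example, not anything these services provide.

```scala
// Hypothetical sketch of a 'supergraph': one edge list, but every edge keeps its source network.
object Supergraph {
  /** One relationship, tagged with the network it was imported from. */
  final case class Edge(from: String, to: String, source: String)

  /** Group the edges by source network instead of merging them into one flat graph. */
  def layers(edges: Seq[Edge]): Map[String, Seq[Edge]] =
    edges.groupBy(_.source)

  /** Contacts of `person` within a single network layer. */
  def contacts(edges: Seq[Edge], person: String, source: String): Set[String] =
    layers(edges)
      .getOrElse(source, Seq.empty[Edge])
      .collect { case Edge(`person`, to, _) => to }
      .toSet
}
```

With the source kept on each edge, a service can style or equip the `"linkedin"` layer differently from the `"jaiku"` layer, which is the whole point of not flattening them.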

And, just to note, not all contact networks should be mapped to every service (of course), but that’s a whole other graph story and I’ll leave it for later.

So, this ‘volatile supergraph’ thingy should be rather easy to implement (no hard non-trivialities) if you were a systems designer (and not part of the 80%), and I’d be very excited to have it working. I bet a few googol other zimboes would be as well.

Expect next: dynamic, rich graphs. (Or probably not. But still,) Thanks for listening!

Dynamist artists used the concept as part of a way of representing the complexity of processes, rather than be limited by the discrete and static moments within change, which also illustrated the limits of human perception.

— found in Wikipedia



Written by Janne Savukoski

November 28, 2007 at 10:55 pm

Posted in Future, Internet

Processing speed, not compression

with one comment

In Will EXI Mean XML Everywhere?:

“It is unlike data compression, which has overhead associated with it,” explained Schneider, whose day job is as CTO of AgileDelta, which makes XML parsers. “XML is verbose and inefficient; [EXI] streamlines all of [the processing] with the minimal size representation of XML information possible,” said Schneider. “We want to make it competitive with hand-optimized binary formats.”

“When W3C tested EXI, it was, on average, 12 to 14 times faster than processing normal XML.”

I hope people will finally stop mixing up compression and efficient processing.


Quoting myself: ;)

And, as the binary format should be the most efficient way to represent that data, it really is the leanest ad-hoc binary format.


Only minor improvements should be able to be gained with further ad hoc binary formats, as the qualitative improvement in reducing the parsing complexity is already achieved. Thus, there should be no room for more efficient ad hoc format […].

Written by Janne Savukoski

November 18, 2007 at 9:18 pm

Posted in Technology

The Stinky Bits


These are hardly of any inspiration for anyone else, but just to show the code(s) that ‘sparked’ my previous observation:

override def init(ctx: ComponentContext) {
  val descs = descriptions(new File(ctx.getInstallRoot))
  provides.foreach { svcName =>
    debug("Searching description for service %s", svcName)
    // Pair each description's service (null when absent) with its DOM document and
    // take the first pair that really contains the service. Testing the tuple itself
    // against null would always succeed, so the check is on its first element.
    descs.map(t => (t._1.getService(svcName), t._2)).find(_._1 != null) match {
      case Some(t) => services = t :: services
      case None => warn("No description found for service '%s'", svcName)
    }
  }
  info("Found %d services: [%s]", int2Integer(services.size), services.map(_._1.getName).mkString(", "))
}

def descriptions(dir: File): Iterator[(Description, Document)] = {
  // Holds already-parsed descriptions; the code that fills and reads it is cut off below.
  val cache = new HashMap[String, Option[(Description, Document)]]
  dir.listFiles(wsdlFilter).elements.map(f =>
      try {
        // Neither the WSDL reader nor the DOM parser is used concurrently.
        wsdlReader.synchronized {
          debug("Reading WSDL '%s'", f.getName)
          val doc = domParser.synchronized(domParser.parse(f))
          Some((wsdlReader.readWSDL(wsdlReader.createWSDLSource | { s => s.setSource(doc) ; s.setBaseURI(f.toURI) },
                new WodenHandler(f.getName)), doc))
        }
      } catch {
        case t: Throwable => error(format("Exception while reading '%s': %s",
                                   f.getName, t.getMessage), t)
        // … (the rest did not fit in the 100-column editor window)

That’s just a bit of JBI-related framework logic, which lazily parses some WSDL-files and caches the structures until the required descriptions are found.

Works fine. :)

ps. definitely not the best show-case for abstraction, but I just wanted to play around with files a little.

pps. my editor is 100 cols wide, and the parts that don’t fit are irrelevant.

Written by Janne Savukoski

November 7, 2007 at 1:46 am

Posted in Programming