Ok, so I got this idea from those posters you can find that map all the connections from place to place on the Internet. Why not do the same thing with e2? Every node gets a point, every softlink is a line... It would make a cool poster that we could maybe sell (I would buy one, but then I bought the t-shirt too). I'd maybe leave the hardlinks out of this... we can just see where people come from, and where they go. Now, I haven't checked to see whether we keep track of comings and goings once we've hit the maximum number of softlinks, but I hope we do, so we can put The Node Linked To All Others at the center...

Clearly this is a nontrivial project... I personally am much more comfortable with C than with perl, but if we need all that funky string-manipulation capability that perl seems to excel at, then perl it is.

So basically, if you have thoughts on this, ideas, want to tell me I'm a moron, and especially if you want to help write the code (I am no perl guru, and I can always use someone looking over my shoulder on the C stuff), fire away. I would love to see this happen, and I think it would be an incredibly fun project to take on.

Kickin'.

I have seen a few mapping projects out there (like the one that maps the Linux kernel) use a program to crawl over the "source" and output some metadata, which is then fed into another program that generates an insane amount of PostScript. I could imagine something similar, but I would like to take a simpler approach than having to debug some insane PostScript output. Perhaps having an intermediate human-readable form of the data would be a good idea.

I could see nodes color-coded based on type (person, place, thing, idea, superdoc, user, etc.) and possibly given a scale (in boldness, maybe 5 shades of differentiation or something?) based on that node's highest rep or something.
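
A rough sketch of that color-and-shade idea, in Python just for brevity. The type names come from the paragraph above; the actual colors, the function names and the idea of scaling against a known maximum rep are all my assumptions, not anything e2 provides.

```python
# Hypothetical palette: type names are from the text, the colors are arbitrary.
TYPE_COLORS = {
    "person": "red",
    "place": "green",
    "thing": "blue",
    "idea": "orange",
    "superdoc": "purple",
    "user": "gray",
}

SHADES = 5  # "maybe 5 shades of differentiation"


def shade_for_rep(rep, max_rep, shades=SHADES):
    """Map a reputation onto an integer shade from 1 (faintest) to `shades`."""
    if max_rep <= 0 or rep <= 0:
        return 1
    fraction = min(rep / max_rep, 1.0)
    # Split 0..max_rep into `shades` equal bands; the top value is clamped
    # into the last band.
    return min(shades, int(fraction * shades) + 1)


def node_style(node_type, rep, max_rep):
    """Return the drawing attributes for one node (unknown types fall back to black)."""
    return {
        "color": TYPE_COLORS.get(node_type, "black"),
        "shade": shade_for_rep(rep, max_rep),
    }
```

Negative or zero rep simply lands in the faintest shade here; whether that is the right call is open to debate.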

One of the hardest bits of this problem is that unlike the internet, e2 is not a hierarchy or even an n-to-n mesh; any two nodes can be linked. Graphically representing that without having some sort of detail threshold could make any representation of e2 get out of hand awfully quickly. So, it would be best to assign weights to links (soft and hard) per node that would demonstrate numerically how connected one node is to another. That would be shown graphically by modifying the thickness of the line.
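
To make the weighting idea concrete, here is a minimal sketch. It assumes we can get some per-link count (say, how often a softlink was followed); that input, the 0.1 threshold and the width range are all made-up placeholders.

```python
def link_weights(hits):
    """hits: {(source, target): count}. Returns each link's share (0..1) of
    all the traffic leaving its source node."""
    totals = {}
    for (source, _target), count in hits.items():
        totals[source] = totals.get(source, 0) + count
    return {
        pair: count / totals[pair[0]]
        for pair, count in hits.items()
        if totals[pair[0]] > 0
    }


def line_width(weight, min_width=0.5, max_width=4.0):
    """Scale a 0..1 weight linearly onto a line thickness."""
    return min_width + weight * (max_width - min_width)


def visible_links(hits, threshold=0.1):
    """Drop links below the detail threshold; return {(source, target): width}."""
    weights = link_weights(hits)
    return {pair: line_width(w) for pair, w in weights.items() if w >= threshold}
```

Normalizing per source node means a link from a rarely visited node can still come out thick; normalizing against the global maximum instead would favor the busy nodes.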

It would probably be best to have a three-pass parser that would:

  1. Step one: Parse nodes into objects that contain node names and links.
  2. Step two: Further process this data:
    1. Assign weighting to the links.
    2. Filter links below a certain weight.
    3. Arrange the nodes in geographic space, clustering nodes with the most similar links close to each other.
    4. Output some sort of easily parsable format (XML?).
  3. Step three: Parse the intermediate format into PostScript or some other vector description.
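
The three passes above might look something like this in miniature. Everything here is an assumption for illustration: the input is a plain list of (node, linked node) pairs, the weight is just how often a pair shows up, the intermediate format is a made-up XML vocabulary, and the output is SVG rather than PostScript. The layout step in particular is a stand-in (nodes spaced around a circle), not the similarity clustering described in step two.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def parse_nodes(raw):
    """Pass 1: turn (name, linked name) pairs into {name: Counter of targets}."""
    nodes = {}
    for source, target in raw:
        nodes.setdefault(source, Counter())[target] += 1
        nodes.setdefault(target, Counter())
    return nodes


def to_intermediate(nodes, min_weight=2):
    """Pass 2: weight, filter, place, and serialize to a human-readable XML string."""
    root = ET.Element("graph")
    names = sorted(nodes)
    for i, name in enumerate(names):
        # Placeholder layout: evenly spaced on a unit circle, NOT clustering.
        angle = 2 * math.pi * i / len(names)
        ET.SubElement(root, "node", name=name,
                      x=f"{math.cos(angle):.3f}", y=f"{math.sin(angle):.3f}")
    for source in names:
        for target, weight in sorted(nodes[source].items()):
            if weight >= min_weight:  # filter links below the threshold
                ET.SubElement(root, "link", source=source, target=target,
                              weight=str(weight))
    return ET.tostring(root, encoding="unicode")


def to_svg(xml_text, size=200):
    """Pass 3: read the intermediate XML back in and emit SVG lines."""
    root = ET.fromstring(xml_text)
    pos = {n.get("name"): (float(n.get("x")), float(n.get("y")))
           for n in root.iter("node")}

    def scale(v):
        # Map the -1..1 layout range onto 0..size pixels.
        return (v + 1) * size / 2

    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{size}" height="{size}">']
    for link in root.iter("link"):
        x1, y1 = pos[link.get("source")]
        x2, y2 = pos[link.get("target")]
        parts.append(f'<line x1="{scale(x1):.1f}" y1="{scale(y1):.1f}" '
                     f'x2="{scale(x2):.1f}" y2="{scale(y2):.1f}" '
                     f'stroke="black" stroke-width="{link.get("weight")}"/>')
    parts.append("</svg>")
    return "\n".join(parts)
```

Because pass 3 only ever sees the XML, the layout or the output format can be swapped without touching the crawler, which is the main reason to bother with an intermediate form.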

I will post more after I will have thought about it some more and will have talked about it with some of my routing visualization buddies (future perfect tense, wow!).




Changelog:

  • 06/16/2001: Added blurb on how soft/hard links would be graphed.

I've had a first attempt at doing this, and managed to shoe-horn it into one source file for posting here (see nodes2graphviz.cc). Example images are available at "www.tom-carden.co.uk/nodegraphs".

FYI I used the following libraries and third-party tools:

  • libcurl ("a library that groks the URLs"), a part of cURL available from "curl.haxx.se".
  • TinyXML, available from Grinning Lizard at "www.grinninglizard.com/tinyxml".
  • the set of GraphViz tools, specifically "neato" and "twopi", available from "www.research.att.com/sw/tools/graphviz", or "www.graphviz.org".

The program can either take your login name and password and fetch the User Search XML Ticker for itself, or you can provide one (by default it needs to be placed in a sub-directory "username/xml"). After this, it extracts all your node_ids and fetches all your nodes. It then translates the softlinks from all of your nodes into a dot file for processing with neato or twopi. The current version also adds links to the nodes if SVG or PS are used for output. Unfortunately, PNG is the only format which works reliably with the files I have produced so far (but I don't have a good PostScript viewer). I am working on producing image maps to display links for each node, but there are issues to be resolved with scaling and with overlapping nodes.
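
For anyone curious what the softlinks-to-dot step amounts to, here is a small Python sketch. It is not the code from nodes2graphviz.cc; the function names, the input shape (a dict of node titles to softlinked titles), the choice of an undirected graph and the example URL are all assumptions. The only Graphviz features relied on are quoted IDs, the "--" edge operator and the URL node attribute.

```python
def dot_quote(text):
    """Quote a string as a DOT ID, escaping embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'


def softlinks_to_dot(softlinks, urls=None):
    """softlinks: {title: [softlinked titles]}; urls: optional {title: url}.
    Returns the text of an undirected DOT graph."""
    lines = ["graph softlinks {"]
    titles = sorted(set(softlinks) | {t for ts in softlinks.values() for t in ts})
    for title in titles:
        attrs = ""
        if urls and title in urls:
            # URL makes the node clickable in SVG, PS and image-map output.
            attrs = f" [URL={dot_quote(urls[title])}]"
        lines.append(f"  {dot_quote(title)}{attrs};")
    seen = set()
    for source in sorted(softlinks):
        for target in softlinks[source]:
            edge = frozenset((source, target))
            if source == target or edge in seen:
                continue  # skip self-links and the reverse of an edge already drawn
            seen.add(edge)
            lines.append(f"  {dot_quote(source)} -- {dot_quote(target)};")
    lines.append("}")
    return "\n".join(lines)
```

Saved to a file, the result can be rendered with something like "neato -Tpng graph.dot -o graph.png".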

At this point, I wouldn't recommend using neato if you have more than about 150 nodes, unless you have a large amount of memory and a fast processor. I don't have hard figures, but my 70 nodes took around 5 minutes, conform's ~120 nodes took around 20 minutes, and I was unable to produce anything with neato for wertperch's ~300 nodes. The process used by twopi differs from that used by neato (neato uses an iterative, spring-based layout model) and so is much less computationally expensive.
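
That rule of thumb is easy to automate. A tiny sketch, where the function name is hypothetical and the 150-node cutoff is just the rough figure above, not a measured limit:

```python
def layout_command(node_count, dot_path, out_path, neato_limit=150):
    """Pick neato for small graphs and twopi for big ones, and build the
    command line that renders dot_path to a PNG at out_path."""
    tool = "neato" if node_count <= neato_limit else "twopi"
    return [tool, "-Tpng", dot_path, "-o", out_path]
```

The returned list can be handed straight to subprocess.run() on a machine with the GraphViz tools installed.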

If you'd like one of these graphs right now, the best thing to do would be to e-mail me the XML source of your User Search XML Ticker output (Sorry, I'm quite backed up with these at the moment, and now work full-time). I can provide binaries or assistance with compiling if you would like to run it yourself or modify the source for your own purposes. At the very least, C and C++ fans might be interested in the libcurl code for fetching e2 pages as XML.
