Update - I wrote a caching e2 proxy. If you're in edev, you have the URL (check your inbox)... If you're not but you have a strong desire to see it in its horribly incomplete state, /msg s_alanet. And those of you viewing this page through the proxy... Salud! More information below under "The Implementation."

Introduction

E2 has some problems. We've got a huge database to store, and hefty bandwidth requirements. We don't have the cash to buy a shiny cluster to host the site, or a bigger, phatter pipe to feed it through. There isn't really cash to hire full time developers or editors, though we have a lot of really wonderful volunteers that should be paid.

I was thinking about the whole distributing e2 thing, and I realized that we need a stopgap. An interim measure. Maybe a long term measure (after all, making a fully distributed e2 system is a big task - it might not be done for years!).

I was also thinking about the difference between e2 and a blog. I like my friend's blogs because they're very personal. I look at whose I want. If they're fancy I can have little discussion in the comment boards with my friends. I post stuff, they post stuff back - it's a lot of fun. But not very interesting for other people. E2 is the opposite - it's all about high quality writing for everyone to enjoy. If I post a story about my dog, it had better be good, not just some drivel that only people who have seen my dog will appreciate. Factual information - which would be boring on a blog - has a strong presence here.

We also have a lot of neat tools - the e2 node tracker, the E2 Nodegel Visualizer, and a lot of really cool clients.

How - I thought to myself - could these wonderful things be combined? It seems like they're all connected... but how?

And then it all came together.

The Idea

An Everything2 Hub - or an E2 Blog, or whatever you want to call it.

What I have in mind is a website that you can drop into your webspace on your ISP or university. It's a PERL or PHP script that automagically does everything to set itself up - a seed, so to speak. It probably uses flat files for maximum compatibility.

  • It provides a blog syched with your daylogs. You post as you see fit - it deals with making sure the right daylog writeup on E2 is appended to, edited, or created. Sort of a blogger for the E2 set. It also displays the blog entries (including converting hardlinks to href's).
  • It shows statistics relevant to you - latest writeups, for instance. People who are interested in you can hit your e2 hub and see what you've written. Optionally, more statistics could be available - most popular writeups, "goodest" writeups, least voted on writeups, etc.
  • It provides a cached view of E2. Click on a link in the blog and it'll take you to a cached view of the node. You only hit e2's server if you want to.
  • It provides "user services" to the owner of the blog. An integrated node tracker would be a good start (and pretty much a necessary feature to do the statistics). A nodegel visualizer would also be useful - in fact, there could be a whole subsection devoted to managing one's homenode. In addition, a scratchpad would be good (saving E2 from rendering lots of pageviews of the scratchpad), and the ability to make rough drafts of writeups, then upload them when they're ready, would be even better.

Basically what I'm talking about is a web-based client integrated with a web-based user info tool. A hub for your everything experience.

The Implementation

In the process of writing this thing, I've found some lingering XML malformations in E2's code. I've submitted patches for these. If you're using the e2hub and you get a blank page, it's probably because you hit a page with an unescaped character, like a bare & or something.

I also went through and cleaned a lot of stuff up in the code... It caches user information, nodes, and displays them. I'm about ready to call it 1.0 - I have a few issues with Simpleton's wonderful E2Interface library, mostly relating to me using it in ways it was not designed to be used, and when I clear those up, I'll be posting the code to the hub somewhere.

Last updated 15:52 23/2

So I've actually started to code some of the stuff mentioned here. Specifically, what I have at this point is a web-based caching E2 proxy. In other words, it grabs data from the various XML sources in the site to generate an e2-look-alike. It's got a 100 megs of cache space, and it keeps data cached for one hour. This is all tweakable. Right now it supports viewing nodes and users. If you try to view something else it will tell you to go back to E2 and look at it. The whole thing is templated.

This is the key to the hub concept, in my mind. Without a cache - without an interface to E2 - then I'm just talking about fancy newsletters. Now that I have a pretty solid cache implementation, I can really go somewhere.

If you have any questions, feel free to contact me.

Extensions

It's also not hard to imagine such a tool being turned towards making E2 more approachable. How often have you heard, "It seems nice, but I just don't think I'd fit in..."? We are losing talented contributors every day because they don't know what to make of hot nude thespians. Sure, it's a part of E2's charm.... but we're reaching the end of where charm can take us. We need quality to reach the next stage of development.

How better to attract quality contributors for music writeups than a hub that focuses on, say, "Music Reviews"? Or a "Computer Science" hub that focuses on terminology and concepts? A "classical literature" hub? E2 is a big place. If we want to lure in experts and promote really quality writing, we're going to have to start filtering things to fit the audience. Metanodes are a good start, but they're not enough.

It should always be easy to start free-associating - clicking through the database, wherever the information flow takes you. But if we want to attract serious contributors, we need to be able to present them focused information.

Conclusion

There are a lot of other areas that this concept could be extended along - letting people check their messages from their hub, for instance, or making a hub "farm", sort of like Sourceforge does for software projects or LiveJournal does for blogs. Such a site could even charge people a small fee for advanced functionality, or faster service - thus helping E2 stay afloat.

Call for Comment

But of course, all this is just conjecture. I would like to make it, or something like it, a reality. E2 should live on! So please, if you have any comments, thoughts, or suggestions on this, drop s_alanet a message or e-mail me at <bgarney@purdue.edu>. I'll note them down here (as well as thanking you profusely for your willingness to tolerate my ramblings :). If you are struck by the urge to start coding this, talk to me, too... I'd like to help (I don't have time to write it on my own).


And the responses come in...

Ouroboros says re Everything2 Hubs: like a "distributed" E2?
you said "re hubs: Except here we have everyone talking to a central server. On Making E2 Distributed, I was talking about a fully distributed, peer to peer model. This is just a bit more hierarchical."
(That is to say, this is an extension to the existing architecture whereas distributed e2 in my mind is a total redesign.)

PhillC sez re: Hubs. I like the idea, but perhaps the easiest 4 hubs would be E2 People, E2 Places, E2 Ideas, E2 Things. They could all be easily pulled from the DB.
you said "re hubs: True... Though given the arbitrary nature of the people/place/idea/thing distinction, I think topical hubs would be more useful."

HongPong sez:

well i think it would be good to get better clients available for everyone. there is no good reason for scratch pads. however i think it will be more effective to look at where e2 is getting processor eaten. also i think it would be good to have a perl/PHP script which could generate fuzzy connections from your nodes to other topical nodes... within topics and those of your friends...

i kno from my metanodes that e2 has a broad-level anarchy to its organization. my general idea is to make a new 'metanode' node type which appears bold in the softlinks.. ie my Middle east would always appear bold. this would point people to topical indices... eventually fuzzy math + something like voting could be used to create topic fields. think fractal organization

have you seen this thing called 'helloworld' which works by making fuzzy collections of data. http://www.cooperatingsystems.com/helloworld/factsheet/index.html

i support anything making topical stuff and metanodes better. i think in particular it could be possible to flag certain nodes and have it generate others you might be interested in... but how to do this without crushing the server...

me sez:

"re hubs: A goal of the E2 hub is to offload things from the central server. I think that a generic mechanism to 'cache' unfinished writeups would serve nicely in that role, as it would give people a chance to collaboratively review or individually rewrite wus before posting them on the live server. The scratchpad is a very convenient part of e2, but it's also a source of database usage and system load.
A huge portion of e2's load is caused by the complexity of its pages. Every pageview results in dozens and dozens of comples queries, as well as in the execution of lots of code. Making the system more complex by adding filtering rules or nodetypes will do nothing to reduce pageviews caused by casual browsing, while making page renders more expensive. Quite the opposite of what we wish to accomplish!
On the other hand, if we write a hub that works to off-load as much as possible - both caching content for casual browsers, as well as providing a "draft then upload" paradigm for nodes in progress, we cut two expensive operations out of the loop - we reduce page renders for passive consumers by at least an order of magnitude, and page renders/database updates caused by writing/rewriting.
As of 1pm PST today, E2 has gotten fifty thousand hits. What if we could reduce that to 25k?

(Addendum to that: helloworld does look pretty neat... but for E2, I think that a web based solution will be more amenable than a program to download and run.)