So what else would I do to celebrate the dawning of the morning of 4/20 but... learn XML? Thanks to Professional PHP Programming by WROX Publishing (on loan from Knarphie), I figured out how to play around with the XML code put out by a bunch of weblog and dynamic content sites, including this one. This is the basic code that I put into a file on my page called everything.php. Explanations are below the code.

<?php

$curel = "";

function Neil($parser, $name, $attribs) {
  global $curel;
  $curel = $name;
  if ($name == "WRITEUP") {
    # $attribs is an associative array; NODE_ID is the assumed attribute name
    print "<br><a href=\"http://everything2.com/index.pl?node_id=".$attribs["NODE_ID"]."\">";
  }
}

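function Bob( $parser, $name ) {
  global $curel;
  if ($name == "WRITEUP") {
     print "</a>\n";
  }
  # We have left the tag, so Joe should stop printing until the next open tag
  $curel = "";
}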

function Joe ( $parser, $data ) {
  global $curel;
  if ($curel == "WRITEUP") {
    print $data;
  }
}

# Interesting stuff starts here

$parser = xml_parser_create();
xml_set_element_handler ( $parser, "Neil", "Bob" );
xml_set_character_data_handler ( $parser, "Joe" );

$file = "/home/rob/everything.xml";
if (!($fp = fopen($file, "r"))) {
  die ("Could not open file");
}

while ($data=fread($fp, 4096)){
  if (!xml_parse($parser, $data, feof($fp))) {
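     # Say what went wrong instead of dying silently
     die("XML error: ".xml_error_string(xml_get_error_code($parser)));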
  }
}

?>

First of all, yes, I decided to call my Element Handler functions Neil and Bob in honor of the Goats comic. Let's take a look at what they do. If you look at the lines after "Interesting stuff starts here", you'll see that a parser gets initialized by the xml_parser_create function. This just gives the next two functions something to reference while we build the program's knowledge of the XML code it's going to be fed.

The next function, xml_set_element_handler, takes that $parser variable we made and registers the functions "Neil" and "Bob" as what to run on an open tag and a close tag. When the program runs and the xml_parse function sees an open tag, it knows to call "Neil" in this case. So "Neil" gets called as:

  Neil($parser, $name, $attribs)

Where $name is the tag being opened. For instance, <html> would be $name = "HTML" when the function was called. Then $attribs is an associative array keyed by attribute name. That is, if you have <TAGNAME QUALITY="High"> then you'll find the value "High" in $attribs["QUALITY"]. And yes, the caps are important in this case. Even if the XML itself is all lowercase, the parser folds tag and attribute names to capitals by default.
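To make the capitals business concrete, here is a minimal sketch that records what the open tag handler actually receives. The function name collect_open_tags and the sample tag are made up for illustration. It uses anonymous functions rather than named handlers only to keep it short, and it also shows the XML_OPTION_CASE_FOLDING option, which turns the uppercasing off if you'd rather match names as written.

```php
<?php
// Hypothetical helper: parse $xml and return every [name, attribs] pair
// that the open-tag handler was given. $fold switches case folding on/off.
function collect_open_tags($xml, $fold) {
    $seen = [];
    $parser = xml_parser_create();
    // Case folding is on by default; set it explicitly so both cases show.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold ? 1 : 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attribs) use (&$seen) {
            $seen[] = [$name, $attribs];   // same arguments Neil receives
        },
        function ($parser, $name) {
            // close tags are ignored in this sketch
        }
    );
    xml_parse($parser, $xml, true);
    return $seen;
}

// Lowercase on purpose, to show what folding does to it.
$xml = '<tagname quality="High">text</tagname>';
print_r(collect_open_tags($xml, true));   // folded names
print_r(collect_open_tags($xml, false));  // names as written
```

With folding on, both the tag name and the attribute key come back in capitals, which is why the handlers compare against "WRITEUP" and not "writeup".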

So Neil basically waits for the name of the tag to be WRITEUP, saves the fact that we're inside a WRITEUP node into the global $curel variable, and starts the HTML code output with an A HREF link. Then, the next thing to occur will be the data, which is handled by Joe. If Joe sees that we're still within a pair of WRITEUP tags (by looking in $curel), Joe outputs whatever is between them. This is whatever Joe was passed in $data when it got called. In this case, it's the title of the node.

Then the parser reaches the closing WRITEUP tag and calls Bob, which closes up the hypertext link. This continues as long as the open file (in my case /home/rob/everything.xml) still has data in it. Once it's done, the parser exits and the program is done. And I have a nice Everything2 slashbox on my own site with my profile. I can call this code with a PHP include function from any page on my site.
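If you want to watch the open tag, data, close tag sequence without setting up the file and the cron job first, something like this sketch runs the same three steps over a string. The sample XML, the node_id attribute name, and the function name render_writeups are assumptions for illustration, so check them against what the real feed sends. It collects the HTML in a string instead of printing, and it escapes the titles with htmlspecialchars, which the code above does not do.

```php
<?php
// Hypothetical sample of the feed; the real markup may differ.
$sample = '<usersearch>'
        . '<writeup node_id="111">First title</writeup>'
        . '<writeup node_id="222">Second title</writeup>'
        . '</usersearch>';

// Parse $xml and return one link per WRITEUP element.
function render_writeups($xml) {
    $html  = "";
    $curel = "";   // name of the tag we are currently inside, if any
    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Open tag (Neil's job): remember the tag and start the link.
        function ($p, $name, $attribs) use (&$html, &$curel) {
            $curel = $name;
            if ($name == "WRITEUP") {
                $html .= '<br><a href="http://everything2.com/index.pl?node_id='
                       . $attribs["NODE_ID"] . '">';
            }
        },
        // Close tag (Bob's job): finish the link and forget the tag.
        function ($p, $name) use (&$html, &$curel) {
            if ($name == "WRITEUP") {
                $html .= "</a>\n";
            }
            $curel = "";
        }
    );

    // Character data (Joe's job): only text inside WRITEUP is kept.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$html, &$curel) {
            if ($curel == "WRITEUP") {
                $html .= htmlspecialchars($data);
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $html;
}

print render_writeups($sample);
```

Returning a string rather than printing makes it easy to drop the result into whatever part of the page you like.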

Then, a handy

wget 'http://everything2.com/index.pl?node_id=762826&usersearch=vees' \
-O /home/rob/everything.xml

line in the crontab file downloads it every once in a while to keep my homepage up to date with Everything. Thanks also to alex for the tip about using wget instead of lynx for the downloads.

omegas wrote me to tell me about some neat functionality in wget that allows you to send your E2 cookie along with the XML node request. If you do this, you get your writeup reputations and other interesting info that only you're supposed to know about your nodes included in the XML.

You can grab the userpass value from your E2 cookie contents (how to do this specifically depends on what browser you're using) and pass it to the wget --header option like this:

wget --header="Cookie: userpass=vees%257CABCDEFG123XX..." \
'http://everything2.com/index.pl?node_id=762826&usersearch=vees' \
-O /home/rob/everything.xml

If you want to put Cools in bold, try adding a few lines to the open tag code that look something like this:

global $cooled;
$curel = $name;
$cooled = 0;

# COOLED is the assumed attribute name; check it against your XML
if ($attribs["COOLED"] == 1)
{ 
  print "<b>";
  $cooled=1;
}

Then add corresponding code in the closing tag function that checks the global $cooled and closes the <b> after the link is finished.
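Put together, the open and close handlers for bold Cools might look roughly like this sketch. The handler names, the helper render_with_cools, the sample XML, and the attribute names cooled and node_id are all assumptions for illustration; look at your own copy of the XML to see what the feed really calls them.

```php
<?php
$curel  = "";
$cooled = 0;

// Open tag: start bold for cooled writeups, then start the link.
function cool_open($parser, $name, $attribs) {
    global $curel, $cooled;
    $curel  = $name;
    $cooled = 0;
    if ($name == "WRITEUP") {
        print "<br>";
        // COOLED is an assumed attribute name; a missing one counts as 0.
        if (($attribs["COOLED"] ?? 0) == 1) {
            print "<b>";
            $cooled = 1;
        }
        print "<a href=\"http://everything2.com/index.pl?node_id="
            . $attribs["NODE_ID"] . "\">";
    }
}

// Close tag: finish the link first, then the bold if we opened one.
function cool_close($parser, $name) {
    global $curel, $cooled;
    if ($name == "WRITEUP") {
        print "</a>";
        if ($cooled) {
            print "</b>";
            $cooled = 0;
        }
        print "\n";
    }
    $curel = "";
}

// Character data: print the title while inside a WRITEUP.
function cool_data($parser, $data) {
    global $curel;
    if ($curel == "WRITEUP") {
        print $data;
    }
}

// Hypothetical wrapper: run the handlers and capture what they print.
function render_with_cools($xml) {
    $parser = xml_parser_create();
    xml_set_element_handler($parser, "cool_open", "cool_close");
    xml_set_character_data_handler($parser, "cool_data");
    ob_start();
    xml_parse($parser, $xml, true);
    return ob_get_clean();
}

// Made-up sample: one cooled writeup, one plain.
print render_with_cools(
    '<usersearch>'
    . '<writeup node_id="111" cooled="1">Cooled one</writeup>'
    . '<writeup node_id="222" cooled="0">Plain one</writeup>'
    . '</usersearch>'
);
```

The <b> is opened before the link and closed after it, so the tags nest properly.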

You can see the results at http://epistolary.org/rob/ on the bottom of the blue sidebar. I'll keep this line here as long as this information is accurate.


PS: So, sockpuppet told me if this were good, I should post it up. If you like it, I deserve the XP. If not, take your wrath out on him. :-) Thanks.