After a day here on Everything, I wrote an article called "my first day on everything" in which I lamented being short of XP. Someone, I don't know who, soft linked me to "learn how to integrate", and I am now and forever in his or her debt.

From "learn how to integrate" I learned the importance of taking five minutes to follow the hard links in the writeup to create soft links back to me.

But, since I'm a little bit of a geek boy, I thought to myself, Why spend five minutes doing something simple when I can spend a couple of hours to write a program to do the very same thing?

So I did.

Here's a short Perl script that looks at your E2 node and automagically creates soft links for you.

If nothing else, it's a nice example of how to use HTML::Parser and LWP.


#!/usr/bin/perl -w

#
# $Id: e2autonoder.pl,v 1.2 2001/02/26 06:02:17 eric Exp $
#

use strict;

use HTML::Parser;
die "HTML::Parser needs to be version 3.x or higher!" 
    if ( $HTML::Parser::VERSION < 3 );

use LWP::UserAgent;
use HTTP::Request;      # used directly below, so load it explicitly


exit main();

##############################################################
############################################################## main
##############################################################
sub main {
    my $node_id = $ARGV[0] || return _usage();

    my $content = _get_e2_node( $node_id );          # download the node
    if ( !defined($content) ) {
        print STDERR "Error getting node no. $node_id.\n";
        return 1;                                    # non-zero exit on failure
    }

    my $parser = _e2_parser();                       # parse the node
    $parser->parse( $content );

    my %temp = map { $_ => 1 } @{$parser->{links}};  # make a unique list
    my @unique_uris = keys( %temp );

    _set_e2_soft_links( \@unique_uris );              # set soft links to node

    return 0;
}


##############################################################
############################################################## e2 interface
##############################################################
sub _get_e2_node {
    my $node_id = shift() || return undef;

    my $ua = LWP::UserAgent->new();
    $ua->agent( "e2autonoder" );
    my $req = HTTP::Request->new( "GET", 
        "http://www.everything2.com/index.pl?node_id=$node_id" );
    my $response = $ua->request( $req );

    return undef unless ( $response->is_success() );
    return $response->content();
}

sub _set_e2_soft_links {
    my $uri_ref = shift();

    my $ua = LWP::UserAgent->new();
    $ua->agent( "e2autonoder" );

    foreach my $uri ( @$uri_ref ) {
        my $req = HTTP::Request->new( "GET", $uri );
        my $response = $ua->request( $req );
        if ( $response->is_success() ) {
            if ( $response->content() =~ />Here's the stuff:/ ) {
                print "NAK $uri\n";
            }
            else {
                print "OK $uri\n";
            }
        }
        else {
            print STDERR "ERROR $uri\n";
        }
    }
}

##############################################################
############################################################## e2 node parser
##############################################################
###
### This is the E2 parser.  Its only job is to find and
### store the links in the actual writeup.  The writeup
### is defined as being the second table row after the
### topic (i.e. node name) is displayed.
###
sub _e2_parser {
    my $p = HTML::Parser->new( api_version => 3 );
    $p->handler( default => "" );
    $p->handler( start => \&_start, 'self, tagname, attr' );
    $p->handler( end => \&_end, 'self, tagname' );
    
    $p->{flag_topic} = 0;       # set when we get to the topic
    $p->{flag_writeup} = 0;     # set when $p is looking at the writeup
    $p->{links} = [];           # list of nodes linked to
    $p->{tr} = 0;               # incremented at <tr>, decremented at </tr>
    $p->{tr_rows} = 0;          # # of table rows we've seen

    return $p;
}

sub _end {
    my ($self, $tagname) = @_;

    if ( $tagname eq 'tr' ) { 
        ($self->{tr})--; 
        if ( $self->{tr} == 0 ) {
            $self->{tr_rows}++;
        }
        if ( $self->{flag_writeup} ) { $self->{flag_writeup} = 0; }
    }
}

sub _start {
    my ($self, $tagname, $attr) = @_;

    # default to '' so an <h1> without a class attribute does not warn under -w
    if ( $tagname eq 'h1' && ( $$attr{class} || '' ) eq 'topic' ) {
        $self->{flag_topic} = 1;
        $self->{tr_rows} = 0;
        $self->{tr} = 0;
    }
    elsif ( $tagname eq 'tr' ) {
        ($self->{tr})++; 
        if ( $self->{flag_topic} && $self->{tr_rows} == 1 ) {
            $self->{flag_writeup} = 1;
        }
    }
    elsif ( $tagname eq 'a' && $self->{flag_writeup}) {
        my $cgi_param_string = $$attr{href} or return;
        # keep only the query string; skip links that have none
        $cgi_param_string =~ s/^.*?\?// or return;
        my $uri = "http://www.everything2.com/index.pl?$cgi_param_string";
        push( @{$self->{links}}, $uri );
    }

}


##############################################################
############################################################## miscellany
##############################################################
sub _usage {
    print <<__HERE__;
usage: e2autonoder.pl <node_id>
       where node_id is the id number of your writeup, *not* the full node.
__HERE__
}
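
One side effect of building the unique list through a hash, as main() does, is that keys() hands the URIs back in no particular order. If you would rather visit the links in the order they appear in the writeup, something like the sketch below could stand in for that step. The name unique_e2_uris and the sample hrefs are made up for this illustration, and it assumes every internal link carries a query string, which the parser above already relies on.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of e2autonoder.pl: turn raw href values
# into index.pl URIs, drop hrefs without a query string, and remove
# duplicates while keeping first-seen order.
sub unique_e2_uris {
    my @hrefs = @_;
    my ( %seen, @uris );

    foreach my $href (@hrefs) {
        next unless defined $href;

        # everything after the first '?' is the query string;
        # hrefs without one are skipped
        my ($query) = $href =~ /\?(.+)$/ or next;

        my $uri = "http://www.everything2.com/index.pl?$query";
        push @uris, $uri unless $seen{$uri}++;   # first occurrence wins
    }
    return @uris;
}

# Sample hrefs invented for the example.
my @uris = unique_e2_uris(
    '/index.pl?node=foo',
    'http://example.com/',      # no query string: skipped
    '/index.pl?node=bar',
    '/index.pl?node=foo',       # duplicate: skipped
);
print "$_\n" for @uris;
```

Run on its own, this prints the foo URI followed by the bar URI.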

I'll leave it as an exercise for the reader to edit the script to confirm that the links were made.
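
One possible starting point for that exercise, offered as a rough sketch, not a tested solution: fetch the target node again and look for an anchor that points back at your node. The helper name links_to_node is mine, and so is the assumption that a soft link shows up as an <a> tag whose href contains node_id=<id>. Check the real page markup before relying on it.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: returns 1 if $html contains an <a> tag whose href
# carries the parameter node_id=$node_id, 0 otherwise.
# ASSUMPTION: soft links are rendered as plain anchors with node_id=...
sub links_to_node {
    my ( $html, $node_id ) = @_;
    my $found = 0;

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ( $tagname, $attr ) = @_;
                return unless $tagname eq 'a' && defined $attr->{href};

                # match node_id as a whole parameter, so 42 does not
                # match node_id=421 or lastnode_id=42
                $found = 1
                    if $attr->{href}
                    =~ /(?:^|[?&;])node_id=\Q$node_id\E(?:[&;#]|$)/;
            },
            'tagname, attr',
        ],
    );
    $p->parse($html);
    $p->eof();

    return $found;
}
```

Inside _set_e2_soft_links you could request each $uri a second time and pass $response->content() to this function along with the id of the node you expect to see linked.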

For hard links, see the E2 node autolinker.

Update (2001-02-25): if a soft link can't be made because the target node does not exist, the script now prints "NAK" instead of "OK". Kudos to dmd for the suggestion.
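
The check behind that update boils down to a pattern match on the returned page. Pulled out into a function of its own, the decision looks roughly like the sketch below. The name link_status and the sample page bodies are made up; the marker text is the one the script already searches for.

```perl
use strict;
use warnings;

# Hypothetical refactoring of the decision made in _set_e2_soft_links.
#   $ok   : true if the HTTP request succeeded
#   $body : the page content (may be undef when the request failed)
sub link_status {
    my ( $ok, $body ) = @_;

    return 'ERROR' unless $ok;

    # The script treats this text as the sign that the target node
    # does not exist.
    return 'NAK' if defined $body && $body =~ />Here's the stuff:/;

    return 'OK';
}

# Sample inputs invented for the example.
print link_status( 1, "<b>Here's the stuff:</b>" ), "\n";
print link_status( 1, '<h1 class="topic">perl</h1>' ), "\n";
print link_status( 0, undef ), "\n";
```

Keeping the decision separate from the download loop makes it easy to try out against saved pages without touching the network.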
