nodegraph.pl

This code is for the E2 Nodegel Visualizer. Please don't vote on this writeup -- go to the original node. Thanks.

You should be able to copy this script directly into Notepad and save the file, or on Unix, run cat > nodegraph.pl, paste the script, and end with ctrl-d. For more hints, see the writeups under E2 node tracker.

Usage: nodegraph.pl parameters

-data filename

Specify the node database (filename) to use. The file must have been generated by the original E2 Node Tracker or in the same format. cowofdoom's file format is also acceptable. If this option isn't specified, "getnodes.dat" is assumed.
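
As a rough illustration of the E2 Node Tracker format, here is a minimal sketch based on how the script's own loadnodes routine reads the file: one "field: value" pair per line, with each "name:" line starting a new record. The sample values and the helper name read_nodes are made up for this example and are not part of the script.

```perl
use strict;
use warnings;

# Hypothetical sample record; the field names follow the script's loader,
# the values are invented for illustration.
my $sample = <<'END';
name: Example writeup (idea)
type: idea
rep: 12
C: someuser
time: 987654321
node_id: 123456
END

# Read "field: value" lines; each "name:" line starts a new record.
# (read_nodes is a hypothetical helper, not a function in nodegraph.pl.)
sub read_nodes {
    my ($fh) = @_;
    my (%nodes, $name);
    while (my $line = <$fh>) {
        chomp $line;
        my ($field, $value) = $line =~ /^(name|type|rep|C|time|node_id): (.*)/
            or next;                       # skip lines that are not fields
        if ($field eq 'name') { $name = $value }
        elsif (defined $name) { $nodes{$name}{$field} = $value }
    }
    return \%nodes;
}

# Parse the sample through an in-memory filehandle.
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $nodes = read_nodes($fh);
close $fh;
print "$_ has rep $nodes->{$_}{rep}\n" for sort keys %$nodes;
```

The script plots "time" on one axis and "rep" on the other, so those two fields are the ones that matter most for the scatterplot.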

-size WxH

Sets the width and height of the scatterplot

-spread N

Sets the fudge-factor used when marks collide. Larger numbers make fuzzier, more dishonest graphs, but can finish faster. (currently set to 0.5)
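
To give a feel for what the fudge-factor does, here is a simplified sketch of the collision search: when a cell is taken, the script looks around it in circles whose radius grows by the spread value each round. This is not the script's exact loop (the real one starts at a random angle and clamps to the plot edges instead of skipping); find_free_cell and its parameters are hypothetical names for this example.

```perl
use strict;
use warnings;

# Find a free cell near ($x, $y) by walking outward in widening circles.
# %$taken maps "x, y" keys to occupied cells; $spread is the radius step.
# (Hypothetical helper; a simplification of the loop in print_scat.)
sub find_free_cell {
    my ($taken, $x, $y, $width, $height, $spread) = @_;
    return ($x, $y) unless $taken->{"$x, $y"};
    my $two_pi = 2 * atan2(0, -1);     # atan2(0, -1) is pi
    my $max_r  = $width + $height;     # give up beyond this radius
    for (my $r = $spread; $r <= $max_r; $r += $spread) {
        my $step = 1 / $r;             # finer angular steps on larger circles
        for (my $theta = 0; $theta < $two_pi; $theta += $step) {
            my $nx = $x + int($r * cos $theta);
            my $ny = $y + int($r * sin $theta);
            # Skip candidates that fall off the grid.
            next if $nx < 0 || $ny < 0 || $nx >= $width || $ny >= $height;
            return ($nx, $ny) unless $taken->{"$nx, $ny"};
        }
    }
    return;                            # nothing free found within range
}

my %taken = ("2, 2" => 1);
my ($fx, $fy) = find_free_cell(\%taken, 2, 2, 5, 5, 0.5);
print "mark moved to $fx, $fy\n";
```

A larger spread makes each round jump further from the true position, which is why the plot gets less honest but the search ends sooner.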

(+|-)(opt)(=C)?

Turns scatterplot options (opt) on (+) or off (-), optionally setting the mark (C) used to draw them. The scatterplot options (and their current settings) are:

nodes: draw a mark for each of your nodes (on, 'O')
cools: draw a mark for each of your C!ed nodes. This option lets you use a different mark for C!ed nodes, or display only C!ed nodes if -nodes is also used. (on, '@')
links: generate pipelinks for the node marks (on)
hrefs: generate HTML links for the node marks. This lets you put the plot into a webpage outside of Everything. (off)
flip: flips the axes so time is indicated on the y-axis (off)
axes: draw axes and their max/min values (on)
rate: draw line for writeup creation rate (off, '=')
avg: draw line for normal average node-fu (off, '.')
mavg: draw line for moving average node-fu (off, ',')
all: turns everything on or off. Any option settings appearing before 'all' on the command line will be ignored.

Some example scatterplot options to try: "-links" gives a nice scatterplot which you can read on your terminal. With "-all+nodes+cools -size WxH", picking W and H so that (W * H) is about twice your nodecount gives a neat compact plot.
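
For the compact plot, the arithmetic can be sketched as follows. This is only one way to pick W and H: suggest_size and the width-to-height ratio are assumptions for the example, and the node count of 300 is made up.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Suggest a plot size whose area is about twice the node count.
# $aspect is the desired width/height ratio (an assumption for this sketch).
sub suggest_size {
    my ($nodecount, $aspect) = @_;
    my $area   = 2 * $nodecount;
    my $height = ceil(sqrt($area / $aspect));
    my $width  = ceil($area / $height);
    return ($width, $height);
}

my ($w, $h) = suggest_size(300, 3);
print "-size ${w}x${h}\n";    # prints "-size 40x15"
```

Any W and H with roughly the same product should do; the script only refuses to run when the node count exceeds W * H.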

#!/usr/bin/perl

##
## You need [kaatunut]'s original [E2 Node Tracker], available on everything2.com.
##
## As for license, well, let's say it's GPL. I hope it's applicable to source-
## form (non-compilable) software.
##
## Most of the code by me, except the bits which read the datafile.
## That code is straight from [kaatunut]'s original [E2 Node Tracker].
##

use FileHandle;
use CGI qw(escape);
use Time::Local;
use POSIX qw(strftime);
use re 'eval';

sub print_scat;
sub getcell;
sub loadnodes;
sub parse_file;

$E2server="everything2.com";

$dataname="getnodes.dat";

## This option changes the way command line options are parsed;
## makes it easier to integrate the script into getnodes.pl if
## you so desire.
$strict_opts = "";
#$strict_opts = "-scat";

## These are the default options
## "+" == enabled
## "-" == disabled
%scat_opts  = ( "nodes", "+",
                                "cools", "+",
                                "links", "+",
                                "hrefs", "-",
                                "axes",  "+",
                                "flip",  "-",
                                "rate",  "-",
                                "avg",   "-",
                                "mavg",  "-");

@scat_opts_names = keys %scat_opts;
$scat_opts_names_piped = join('|', @scat_opts_names, "all");

@scat_help  = (
    "nodes", "draw a mark for each of your nodes",
    "cools", "draw a mark for each of your C!ed nodes.  This option lets you use a different mark for C!ed nodes, or display only C!ed nodes if -nodes is also used.",
    "links", "generate pipelinks for the node marks",
    "hrefs", "generate HTML links for the node marks.  This lets you put the plot into a webpage outside of Everything.",
    "flip",  "flips the axes so time is indicated on the y-axis",
    "axes",  "draw axes and their max/min values",
    "rate",  "draw line for writeup creation rate",
    "avg",   "draw line for normal average node-fu",
    "mavg",  "draw line for moving average node-fu",
    "all",   "turns everything on or off.  Any option settings appearing before 'all' on the command line will be ignored."
);

## These are the default marks to use for each type of point or line
%scat_marks = ( "nodes", "O",
                                "cools", "@",
                                "rate",  "=",
                                "avg",   ".",
                                "mavg",  "," );

## Start with empty hashes (assigning {} would store a stray hash-ref key).
%map = ();
%mapmark = ();

$spread = 0.5;

$SIG{'INT'} = "cleanup";
## Just in case someone does a control-c. Added by ymelup

#if (!@ARGV) {
#    push @ARGV,"-help";
#}
my $manyargs = 1 if scalar(@ARGV) > 1;
$printscat = 1 if !$strict_opts;
while (@ARGV) {
    $t=shift(@ARGV);

    if ($t eq "-help") {
        my $stderr = select STDERR;
          my $dash_scat_help = "      -scat                 Generates a scatterplot of your nodespace" if $strict_opts;
        print STDERR <<EOF
Usage: $0 [parameters]

      -data        filename  Specify the node database (filename) to use.
                      The file must have been generated by the original
                      [E2 Node Tracker] or in the same format.
                            [cowofdoom]'s file format is also acceptable.
                      If this option isn't specified, "$dataname" is assumed.

$dash_scat_help
      $strict_opts-size WxH        Sets the width and height of the scatterplot
      $strict_opts-spread N        Sets the fudge-factor used when marks collide.
                                   Larger numbers make fuzzier, more dishonest graphs,
                                   but can finish faster.  (set to $spread)

      $strict_opts(+|-)(opt)(=C)?  Turns scatterplot options (opt) on (+) or off (-),
                                   optionally setting the mark (C) used to draw them.

EOF
;

        print STDERR <<EOF
      The scatterplot options listed below may be specified in one or more
      separate command line parameters.  Thus, the following two command
      lines are equivalent:
        $0 $strict_opts $strict_opts-size 80x24 $strict_opts+nodes=o $strict_opts+cools=! $strict_opts-axes
        $0 $strict_opts $strict_opts-size 80x24 $strict_opts+nodes=o+cools=!-axes
EOF
if $strict_opts;

        print STDERR "      The scatterplot options (and their current settings) are:\n";
                format SCATHELP =
         @<<<<<  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
         $key    $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
~                ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $helptext
.
                $~ = SCATHELP;
                select $stderr;
                while (@scat_help) {
                  $key = shift(@scat_help);
                  $helptext = shift(@scat_help);
                  if ($key ne "all") {
                    my $on = $scat_opts{$key} eq "+" ? "on" : "off";
                    $helptext .= " ($on";
                    $helptext .= ", '$scat_marks{$key}'" if $scat_marks{$key};
                    $helptext .= ")";
                  }
                  write STDERR;
                }
        print STDERR <<EOF

      Some example scatterplot options to try:

         $strict_opts-links
                 Gives a nice scatterplot which you can read on your terminal

         $strict_opts-all+nodes+cools $strict_opts-size WxH
            Pick W and H so that (W * H) is about twice your nodecount and
            get a neat compact plot.

EOF
;
        exit;
    } elsif ($t eq "-data") {
        $dataname = shift(@ARGV);
    } elsif ($t eq "-scat") {
        $printscat=1;
    } elsif ($t =~ /^$strict_opts(?:([+-])($scat_opts_names_piped)(?:=(.))?(?{
        %scat_opts = () if $2 eq "all";
        $scat_opts{$2}=$1;
        $scat_marks{$2}=$3 if $3;
    }))+$/) {
          ## all the work is done in the regexp.
    } elsif ($t eq "$strict_opts-spread") {
           $spread = 0 + shift(@ARGV) or die "spread must be a decimal number!";
    } elsif ($t eq "$strict_opts-size") {
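        ## Reject a missing or malformed size instead of reusing stale $1/$2.
        my $size = shift(@ARGV);
        defined($size) && $size =~ /^([0-9]+)x([0-9]+)$/
            or die "size must look like WxH, e.g. 80x24!\n";
        $scat_width = $1;
        $scat_height = $2;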
    } else {
        print STDERR "Unknown parameter: $t!\n";
    }
}

loadnodes($dataname, \%oldnode);

print_scat(\%oldnode) if $printscat;

sub print_scat {
  my %nodes = %{$_[0]};
  my ($mindate, $maxdate) = (9e9, -9E9);
  my ($minrep, $maxrep) = (9e9, -9E9);
  my ($resx, $resy) = (60,20);

  ## pull in command-line values
  $resx = $scat_width if $scat_width;
  $resy = $scat_height if $scat_height;

  ## clean up the command line options
  foreach $key (@scat_opts_names) {
    $scat_opts{$key} ||= $scat_opts{"all"};
    delete $scat_opts{$key} if $scat_opts{$key} eq "-";
    if ($scat_opts{$key} && $key ne "all") {
      print STDERR "drawing $key";
      print STDERR " as '$scat_marks{$key}'" if $scat_marks{$key};
      print STDERR "\n";
    }
  }

  my ($flip, $axes, $links) = ($scat_opts{"flip"},
                                         $scat_opts{"axes"},
                                         $scat_opts{"links"});

  if ($flip) {
    ($resx, $resy) = ($resy, $resx);
  }

  my $nodecount = scalar(keys %nodes);
  my $gridcells = $resx * $resy;
  die "Not enough space on the graph for your massive nodespace.\n($nodecount nodes won't fit into $gridcells cells...)" if $nodecount > $gridcells;
  print STDERR "I'll be fitting $nodecount nodes into $gridcells cells...\n";

  foreach (keys(%nodes)) {
    #print "node $_";
    my %node = %{$nodes{$_}};
    my $rep = $node{"rep"};
    #$rep += 10 if $node{"C"};
    my $date = $node{"time"};
    $minrep = $rep if $rep < $minrep;
    $maxrep = $rep if $rep > $maxrep;
    $mindate = $date if $date < $mindate;
    $maxdate = $date if $date > $maxdate;
    #print " = ".%node."\n";
  }

  #print "rep range  = [$minrep, $maxrep]\n";
  #print "date range = [$mindate, $maxdate]\n";

  my $xf = $resx / ($maxdate - $mindate);
  my $yf = $resy / ($maxrep - $minrep);

  my $yzero = int ($yf  * (0 - $minrep));

  #print "xf, yf, y0 = $xf, $yf, $yzero \n";

  my $time_lie = 0;
  my $rep_lie  = 0;

  foreach (keys(%nodes)) {
    my $name = $_;
    #print "node $name ";
    #print STDERR "\n.";
    #flush STDERR;
    my %node = %{$nodes{$_}};
    my $rep = $node{"rep"} - $minrep;
    #$rep += 10 if $node{"C"};
    my $date = $node{"time"} - $mindate;
    #print " $rep/$date";
    my $x = int ($date * $xf);
    my $y = int ($rep * $yf);
    my ($x0, $y0) = ($x, $y);
    my $rstep = $spread;
    my $r = 0;
    ## NOTE: despite its name, $Pi holds 2*pi (one full turn in radians).
    my $Pi = 6.283185;
    my $otheta = rand() * $Pi;
    my $theta = $otheta;
    my $thetastep;
    while ($map{"$x, $y"}) {
      #print STDERR "*";
      #flush STDERR;
      ## If a cell is occupied, we search in circles around the
      ## cell until we find an empty one.  Among other things,
      ## this method eventually tries every cell deterministically.
      if ($theta - $Pi > $otheta || !$thetastep) {
        $r += $rstep;
        $theta = $otheta;
        $thetastep = 1 / $r;
        #print STDERR "^";
        #flush STDERR;
      } else {
        $theta += $thetastep;
      }
      my $yfac = 1.0;
      $yfac = $yzero / $resy if sin($theta) < 0;
      $nx = $x + int ($r * cos $theta);
      $ny = $y + int ($r * sin $theta * $yfac);
      $nx = 0 if $nx < 0;
      $ny = 0 if $ny < 0;
      $nx = $resx if $nx > $resx;
      $ny = $resy if $ny > $resy;
      if (!$map{"$nx, $ny"}) {
        ($x, $y) = ($nx, $ny);
      }
    }
    #print "$name @ ($x,$y) \n";
    $map{"$x, $y"} = $name;
    $mapmark{"$x, $y"} = $scat_marks{"nodes"} if $scat_opts{"nodes"};
    $mapmark{"$x, $y"} = $scat_marks{"cools"} if $scat_opts{"cools"} && $node{"C"};
    $time_lie += abs ($x - $x0);
    $rep_lie  += abs ($y - $y0);
    #print "\$map{\"$x, $y\"} = ".$map{"$x, $y"}."\n";
    #print "\n";
  }

  $time_lie /= $xf * 3600;
  $rep_lie  /= $yf;
  printf STDERR "Total dishonesty: %.2f rep, %.2f hours.\n", $rep_lie, $time_lie;
  $time_lie /= $nodecount;
  $rep_lie  /= $nodecount;
  printf STDERR "Average dishonesty: %.2f rep per node, %.2f hours per node.\n", $rep_lie, $time_lie;

  ## draw the rate & average
  my $x = 0;
  my $otot = 0.0;
  my $ocnt = 0.0;
  my $omy = 0;
  while ($x++ <= $resx) {
    my $y = $resy;
    my $tot = 0.0;
    my $cnt = 0.0;
    while ($y-- + 1) {
      if ($map{"$x, $y"}) {
        $cnt++;
        $tot += $y;
      }
    }
    ## draw the rate (nodecount for interval)
    my $ratey = $yzero + $cnt;
    $mapmark{"$x, $ratey"} ||= $scat_marks{"rate"} if $scat_opts{"rate"};

    ## draw the average (overall reputation/node)
    $otot = $otot + $tot;
    $ocnt = $ocnt + $cnt;
    my $my;
    $my = int ($otot/$ocnt) if $ocnt;
    $my ||= 0;
    $my += $yzero;
    $mapmark{"$x, $my"} ||= $scat_marks{"avg"} if $scat_opts{"avg"};
  }

  ## draw the decaying average
  my $window = 4;
  my $wrat = ($window - 1) / $window;
  ## reset the accumulators declared for the rate/average pass above
  $x = 0;
  $otot = 0.0;
  $ocnt = 0.0;
  $omy = 0;
  while ($x++ <= $resx) {
    my $y = $resy;
    my $tot = 0.0;
    my $cnt = 0.0;
    while ($y-- + 1) {
      if ($map{"$x, $y"}) {
        $cnt++;
        $tot += $y;
      }
    }
    $otot = $otot * $wrat + $tot;
    $ocnt = $ocnt * $wrat + $cnt;
    my $mx = int ($x - $window/2);
    my $my = int (0.5 + $otot/$ocnt) if $ocnt;
    $my ||= 0;
    $mapmark{"$mx, $my"} ||= $scat_marks{"mavg"} if $scat_opts{"mavg"};
  }

  print "\n<pre>\n";

  my $lessthan = $links ? "&lt;" : "<";

  ## TRUST ME: this is easier to do with two separate chunks of code.
  if ($flip) {
    my $indent = "";
    if ($axes) {
      $indent = "$minrep".$lessthan;
      print $indent;
      $indent =~ s/./ /g;
    }
    my $x = -1;
    while ($x++ <= $resx + 2) {
      my $y = -1;
      while ($y++ < $resy + 2) {
        my $zero = 1 if $y == $yzero;
        my $default_mark;
          if ($axes) {
          if ($x == 0) {
            $default_mark = "+" if $zero;
                $default_mark ||= "-";
            } else {
            $default_mark   =    "\\|/"   if $y == $yzero - 1 && $x == $resx + 1;
            $default_mark ||=     "now"   if $y == $yzero - 1 && $x == $resx + 2;
              $default_mark ||=      "|"    if $zero            && $x <= $resx;
          }
        }
        print getcell("$x, $y", $default_mark);
      }
      print ">".$maxrep if $x == 0;
      print "\n$indent";
    }
  } else {
    print "$maxrep\n" if $scat_opts{"axes"};
    print "/|\\\n" if $scat_opts{"axes"};
    my $y = $resy + 2;
    while ($y-- + 1) {
      my $zero = 1 if $y == $yzero;
      if ($scat_opts{"axes"}) {
        if ($zero) {
          print "0+-";
        } else {
          print " | ";
        }
      }
      my $x = -1;
      while ($x++ <= $resx) {
        my $default_mark = "-" if $zero && $scat_opts{"axes"};
        print getcell("$x, $y", $default_mark);
      }
      if ($zero && $scat_opts{"axes"}) {
        print "> now";
      }
      print "\n";
    }
    if ($scat_opts{"axes"}) {
      print "\\|/\n";
      print "$minrep\n";
    }
  }

  print "\n</pre>\n";

}

sub getcell {
  my ($cell, $default_mark) = @_;

  #print "\$map{\"$cell\"} = $map{$cell}\n";

  my $node = $map{$cell};
  my $mark = $mapmark{$cell};
  $mark ||= $default_mark;
  $mark ||= " ";
  if ($node) {
    $mark = "[$node|$mark]" if $scat_opts{"links"};
    ## escape() must be called outside the string; "&escape(...)" does not interpolate.
    $mark = "<a href=\"http://$E2server/?node=" . escape($node) . "\">$mark</a>" if $scat_opts{"hrefs"};
  }

  return $mark;
}

# loadnodes(filename,nodearray-ref)
# filename=filename to load from
# nodearray-ref=ref to 2-dim array to save the nodes into
# name => [ rep, C, type, time ]
sub loadnodes {
    my($name,$type,$rep,$C,$time);


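    open(DATA,$_[0]) || return;
    ## A bare <DATA>; does not set $_, so keep the first line explicitly.
    my $first = <DATA>;
    my $cow_format = defined($first) && $first =~ /([0-9]+:){3}[0-9]+/;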
    close DATA;

    open(DATA,$_[0]) || return;

    if ($cow_format) {
        print STDERR "detected [Cow of Doom] format datafile...";
          while (<DATA>) {
          chomp;
          /^([^:]):(([^:]):([^:]):?(?{
            ${$_[1]}{$1}{$2}=$3;
          }))+$/;
        }
    } else {
      while (<DATA>) {
        chomp;
        /^(name|type|rep|C|time|node_id): (.*)/ or next;
        if ($1 eq "name") {
            $name=$2;
        } else {
            ${$_[1]}{$name}{$1}=$2;
        }
      }
    }

    close(DATA)
        || die "can't close datafile!";
}

# parse_file(*FILEHANDLE,target,target_type)
# see getnodes() for notes about target and target_type, but this is an internal
# function now, sort of
# btw, I wonder if there's something wrong with how I'm passing filehandles...
# it seems to work but the books don't seem to like it, they just don't tell why
sub parse_file {
    local *E2 = $_[0];
    my $data_read=0;
    my $old_data_read=0;
    my $DATAREAD_REPORTFREQ=4000;

    if ($_[2]==1) {   # target: file
        open(DATAOUT,"$_[1]");
    }

    while (<E2>) {
        $old_data_read=$data_read;
        $data_read+=length($_);
        if ($_ =~
/<writeup\s[^>]*?
node_id="(\d+)"\s[^>]*?
reputation="(-?[0-9]+)"\s[^>]*?
createtime="(\d+)-(\d+)-(\d+)\s(\d+):(\d+):(\d+)"\s[^>]*?
cooled="([01])"[^>]*?
cooledby_user="(.*?)">(.*?)\s\((thing|idea|person|place)\)
<\/writeup>/x) {
            $name=$11;
            ${$_[1]}{$name}{"rep"}=$2;
            ${$_[1]}{$name}{"C"}=($10 || ($9 && "[unknown user]"));
            ${$_[1]}{$name}{"type"}=$12;
            ${$_[1]}{$name}{"time"}=timelocal($8,$7,$6,$5,$4-1,$3);
            ${$_[1]}{$name}{"node_id"}=$1;
        }
        if ($DATAREAD_REPORTFREQ &&
            ((($data_read-$old_data_read)>$DATAREAD_REPORTFREQ) ||
             ($data_read % $DATAREAD_REPORTFREQ) <
              ($old_data_read % $DATAREAD_REPORTFREQ))) {
            printf("%.fk ",$data_read/1000);
        }
    }

    close(E2);
    close(DATAOUT) if $_[2]==1;

#    return $nodes_read;
}

sub cleanup {
    system "stty", "echo";
    print "Aborted!\n";
    exit(2);
}
