The parselinks code is four lines of Perl responsible for turning all your [bracketed text] into hardlinks. It works well, but a little too well, because there's no way to add brackets to your writeup without knowing their HTML escape codes ( &#91; and &#93; ).

This is a real problem for edevdocs and (probably) superdocs that use JavaScript code. Normal writeups have JavaScript code automatically parsed out, so it's not a problem there, and the HTML escape codes (while inconvenient) can be used there.

JavaScript, however, uses brackets to index array elements and, where needed, inside regular expressions. parselinks interprets these brackets as hardlinks and screws up the JavaScript code utterly. Because this is script, and not HTML text, the HTML escape codes can't be used.
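To make the failure concrete, here is a minimal sketch of the hardlinking substitution applied to one line of JavaScript. The linkNodeTitle below is a hypothetical stand-in (the real one builds a proper node link), and parselinks_demo is an illustrative name, not a function in the codebase.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the engine's linkNodeTitle(); it only makes
# the generated link visible in the output.
sub linkNodeTitle {
    my ($title) = @_;
    return qq{<a href="/title/$title">$title</a>};
}

# The link-making substitution, applied to a plain string instead of a
# node field.
sub parselinks_demo {
    my ($text) = @_;
    $text =~ s/\[(.*?)\]/linkNodeTitle($1)/egs;
    return $text;
}

my $script  = 'total += prices[i];';
my $mangled = parselinks_demo($script);
print "$mangled\n";   # the array index has been replaced by link markup
```

The array subscript vanishes into an anchor tag, which is why the script stops working once it has been through parselinks.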

My suggestion would be to modify parselinks to search for an "escaped bracket" and remove it, add hardlinks as usual, then replace the removed "escaped bracket" with a single bracket. The best way to "escape" a bracket would be either using a double-bracket ( [[ ) or a slash-bracket ( \[ ). Since a slash-bracket may be used as a normal part of a regular expression, but a double-bracket shouldn't, I would use a double-bracket.

The new code for parselinks would look like this:

my ($field) = @_;
my $text = $$NODE{$field};
$text =~ s/\[{2}/\&\#91\;/gs;  # replace double brackets with HTML escapes
$text =~ s/\]{2}/\&\#93\;/gs;
$text =~ s/\[(.*?)\]/linkNodeTitle ($1, $NODE)/egs;  # hardlink what remains
$text =~ s/\&\#91\;/\[/gs;     # replace HTML escapes with single brackets
$text =~ s/\&\#93\;/\]/gs;
unMSify($text);  #take out microsoft chars

This has the side-effect of turning any &#91; and &#93; entered in the writeup into [ and ] , but since it's displayed the same in a browser, and since the writeup itself is unchanged in the database, this shouldn't matter.

The following test string produced the desired results when these regular expressions were run in JavaScript:

  [start][start2] [3]  [[double]] [[re]][[peat]] 
  [in[[side] [also]]inside] [ [[] []]] [end]
Which should output like this:
startstart2 3 [double] [re][peat] in[side also]inside [ ] end

Footnote: I originally tried a solution very similar to JK's below. However, it failed to trap zero-character hardlinks, one-character hardlinks, and unusual combinations of three or more [s or ]s, not to mention hardlinks at the beginning or end of the string. Hence the above solution.

update - 17 Aug, 2001

The changes described here are now implemented in the new function devparselinks, which is now called from the edevdoc display page instead of parselinks.

If this works out ok (and it's a pretty simple change, so there's no reason to expect problems) might it be an idea to do the same for displaying superdocs, for example like the ekw preferences page, where brackets in JavaScript code might be wanted?

Also, can anyone think of browser problems, internationalisation issues, etc, that would prevent us simply modding parselinks along the lines of devparselinks? We wouldn't want to encourage people to use double-bracketed escaping in writeups if this type of problem existed, but if it doesn't, it might be a big win in terms of noder-convenience.



Here's my version of this.

One difference is that we ignore &#91; and &#93; altogether - these just pass through the regexp unmolested.

Another difference is that I compound all occurrences of two or more identical bracketing characters into just one literal bracket. If this is not wanted, we just lose the commas in lines 2 and 3, so that {2,} becomes {2}. (This will be necessary if we want to allow un-space-separated and nested constructions like the C expression arry[a[i],b[j]]. I don't know whether similar constructions exist in browser scripting languages.)

The behaviour is that text within brackets gets hardlinked as at present, except where there are two or more identical brackets at one end or the other of the bracketed text, in which case the multiple brackets are reduced to a single literal bracket by the last two lines, and what's between them isn't linked.

The thought behind my 'double-bracket' suggestion was that the syntax it adds is currently just broken syntax - instances have no use, and we shouldn't worry about any legacy data that contains such syntax. As JayBonci noted in an edev discussion on this, if we used \[ and \] then existing nodes containing linked directory paths (for example) would be broken:

C:\[windows]\[mess]
would render as C:[windows][mess] instead of the intended C:\windows\mess

Anyway, here's the suggested code modification:

1: $text=~s/(?<!\[)\[(?!\[)(.*?)(?<!\])\](?!\])/linkNodeTitle ($1, $NODE)/egs;
2: $text=~s/\[{2,}/[/gs;
3: $text=~s/\]{2,}/]/gs;

This has no problems with hardlinks (or 'escaped' brackets) at the start and end of strings, because it uses zero-width assertions.
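The three substitutions can be tried in isolation with a sketch like the following. The linkNodeTitle here is a hypothetical stub and lookaround_parselinks_demo is an illustrative name; the regular expressions themselves are the ones listed above, minus the $NODE argument.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the engine's linkNodeTitle().
sub linkNodeTitle {
    my ($title) = @_;
    return qq{<a href="/title/$title">$title</a>};
}

sub lookaround_parselinks_demo {
    my ($text) = @_;
    # Link only text delimited by brackets that have no identical
    # bracket immediately beside them.
    $text =~ s/(?<!\[)\[(?!\[)(.*?)(?<!\])\](?!\])/linkNodeTitle($1)/egs;
    # Collapse runs of two or more brackets into one literal bracket.
    $text =~ s/\[{2,}/[/g;
    $text =~ s/\]{2,}/]/g;
    return $text;
}

my $input  = '[start] [[double]] a[[i]] &#91;x&#93;';
my $output = lookaround_parselinks_demo($input);
print "$output\n";
```

Unlike the escape-code version, the &#91; and &#93; typed by the author reach the output untouched.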

On mblase's original test data,

[start][start2] [start3] [[double]] [[re]][[peat]]
[in[[side] [also]]inside] [ [[] []]] [end]
(he's changed it as I noded this! Grrrr!) this produces the output (except I've added a linebreak for 'clarity' :)
startstart2 start3 [double] [re][peat]
in[side also]inside [ ] [end
However, I don't think we really need worry exactly how
[ [[] []]] [end]
renders, since it seems quite obviously broken in any case. The important thing is that it does link normally linked stuff, and it does convert [[string]] to [string], which is all we really require.

One possible flaw with altering parselinks in this way is that noders might start using double brackets in the database proper to get literal brackets in the browser. This would possibly be bad, if the bracket characters are not absolutely standard (as presumably they *are* inside <script> tags).

As a solution to this, we could create a new devparselinks function along the lines above, and call it from edevdocs, replacing parselinks, while still using parselinks from the main database. (Though we lose the advantage of having an easy way to node brackets!)

In any case, I think it's quite clear, cut and dried, and more or less unarguable, that if the user puts &#91;, then &#91; is what should get sent to the browser. If mblase's approach is thought superior, we could get round the problem by replacing the bracket pairs with strings that are likely to occur in nodes with probability 0 (as ariels would say).

For example, we could replace [[ with 'JerboaKolinowski is cool!', and ]] with 'dem_bones is a drooling idiot'. We then just substitute these unlikely strings back into the appropriate single brackets. I think mblase's idea of using &#91; and &#93;, while cute, is flawed - ultimately because it loses information - the distinction between &#91; and a literal [ just evaporates during processing.
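A sketch of that placeholder idea, under some loud assumptions: the two marker strings are control characters picked purely for illustration (any strings that never occur in node text would do), linkNodeTitle is a hypothetical stub, and parselinks_marker_demo is not a real function.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the engine's linkNodeTitle().
sub linkNodeTitle {
    my ($title) = @_;
    return qq{<a href="/title/$title">$title</a>};
}

# Assumed placeholders: control characters that should never appear in
# a node. The exact choice is an illustration, not a recommendation.
my $OPEN_MARK  = "\x01";
my $CLOSE_MARK = "\x02";

sub parselinks_marker_demo {
    my ($text) = @_;
    $text =~ s/\[\[/$OPEN_MARK/g;     # park the escaped brackets
    $text =~ s/\]\]/$CLOSE_MARK/g;
    $text =~ s/\[(.*?)\]/linkNodeTitle($1)/egs;
    $text =~ s/\Q$OPEN_MARK\E/[/g;    # restore them as single brackets
    $text =~ s/\Q$CLOSE_MARK\E/]/g;
    return $text;
}

my $input  = '[start] [[double]] &#91;x&#93;';
my $output = parselinks_marker_demo($input);
print "$output\n";
```

Because the placeholders are distinct from the HTML escape codes, an author's &#91; stays &#91; all the way to the browser.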

I also think if people really want their node to have [sometext], etc., they should probably just use &#91; and &#93;.

The question of exactly how

[[[string]]]
should be treated is somewhat thorny! :)

Edevdocs are a problem.

Firstly, I've seen a lot of edevdocs around that seem to indicate that people aren't aware of the difference between edevdocs and superdocs. So, you know, maybe we need a bit of documentation. Hey, it's just a thought.

Secondly... brackets. Brackets are, of course, a pain on Everythings in general, what with parselinks being used somewhat universally on content.

There is no simple, elegant way to denote a bracket that will survive a trip through parselinks. The standard way of passing a bracket through parselinks doesn't pass a bracket through at all; it just passes the HTML entity for a bracket, &#091;. This only happens to look a bit like a bracket when your browser renders HTML. As far as, say, JavaScript is concerned, though, it's not a bracket at all.

The 'solution' we currently have in place to this problem, is, in a word, wrong. The solution is that while normal documents are displayed with parselinks, edevdocs are displayed with devparselinks, an entirely different function.

This brings trouble when we have a JavaScript utility, such as E2 Bookmarklets or Writeup table formatting utility, which we want to open up to the general E2 user population rather than just edev. The only real way to do this is to convert the edevdoc (viewable only by edev members) into a document, viewable by all. And here we can get bitten by the difference between documents and edevdocs.

E2 Bookmarklets uses its own scheme to get round the lack of brackets; if it had used the edevdoc scheme, it would be useless in a document.

There are ways to get round the lack of brackets, of course.

  • construct any regular expressions using ASCII equivalents, and for array dereferencing, parselinks can be defeated by using, eg:
    array /*[*/ [i /*]*/ ] = f(i);

    This causes the HTML generated by parselinks to be encapsulated inside the javascript comment, while the following [ is thought to be inside the '[]' so isn't interpolated, and the closing ']' is thought to be an unmatched ']' so isn't interpolated.

    Please note, the technical term for this method is 'A Gross Hack', and if I find anyone attempting it, I shall kill them. Or... you know, laugh at them. Whichever I feel like at the time.

  • encode the code (eg. with escape) and store it in a string, which can then be unescaped and evaluated. The edev JavaScript escaper will do this for you.
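The second workaround can be sketched in Perl as a small generator. This is only a guess at the general shape, not a description of how the edev JavaScript escaper actually works: escape_for_edevdoc is a hypothetical name, and the sketch assumes the script is plain ASCII.

```perl
use strict;
use warnings;

# Percent-encode every non-alphanumeric character so that no literal
# bracket is left for parselinks to find, then wrap the result in a
# JavaScript eval(unescape(...)) call. Assumes ASCII-only input.
sub escape_for_edevdoc {
    my ($js) = @_;
    (my $encoded = $js) =~ s/([^A-Za-z0-9])/sprintf('%%%02X', ord($1))/ge;
    return qq{eval(unescape("$encoded"));};
}

my $wrapped = escape_for_edevdoc('a[0]=1;');
print "$wrapped\n";
```

The generated line contains no brackets at all, so it passes through parselinks unchanged and is decoded only when the browser runs it.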

However, I would like to propose that we actually solve this problem, rather than simply bubble-pushing it. I see two solutions:

  1. Create a publicly accessible nodetype that uses devparselinks for its display page, which edevdocs can be transformed into to make them publicly accessible. Naturally, for the same reasons that edevdocs can only be viewed by edev members, we would want to ensure that the new nodetype can't be created by non-deity edev members, but can easily be created from an edev member's edevdoc. Alternatively:
  2. Beef up parselinks so that it doesn't catch brackets which are inside HTML tags or HTML comments (a little tricky perhaps), so it ignores brackets inside properly commented-out scripts. Such a parseLinks might look a little like this:

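        my ($field) = @_;
        my ($text) = $$NODE{$field};
        # Split on comment terminators; the -1 limit keeps trailing empty
        # fields, so text ending in "-->" does not lose its terminator.
        join "-->", map {
            my @a = split /<!--/, $_, -1;
            # Only $a[0] lies outside a comment, so only it gets hardlinked.
            $a[0] =~ s/\[(.*?)\]/linkNodeTitle ($1, $NODE)/egs if @a;
            join '<!--', @a;
        } split /-->/, $text, -1;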
    

    Not quite as elegant as the current definition, but certainly does the job.
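Here is a standalone sketch of the comment-skipping idea for experimenting with. Assumptions: linkNodeTitle is a hypothetical stub, parselinks_skip_comments is an illustrative name, and both split calls pass a limit of -1 so that trailing empty fields (for example, text that ends in "-->") are kept. Like the version above, it only skips HTML comments, not brackets inside tags.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the engine's linkNodeTitle().
sub linkNodeTitle {
    my ($title) = @_;
    return qq{<a href="/title/$title">$title</a>};
}

sub parselinks_skip_comments {
    my ($text) = @_;
    return join '-->', map {
        # Everything after the first "<!--" in this chunk is comment
        # body, so only the first piece is hardlinked.
        my @parts = split /<!--/, $_, -1;
        $parts[0] =~ s/\[(.*?)\]/linkNodeTitle($1)/egs if @parts;
        join '<!--', @parts;
    } split /-->/, $text, -1;
}

my $input  = 'See [home]. <!-- if (a[i]) b[j] = 1; --> Also [end].';
my $output = parselinks_skip_comments($input);
print "$output\n";
```

Text outside the comment is linked as usual, while the script inside the comment keeps its brackets.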
