update - 17 Aug, 2001
The changes described here are now implemented in a new function, devparselinks, which the edevdoc display page now calls instead of parselinks.
If this works out OK (and it's a pretty simple change, so there's no reason to expect problems), might it be an idea to do the same when displaying superdocs, such as the ekw preferences page, where brackets in JavaScript code might be wanted?
Also, can anyone think of browser problems, internationalisation issues, etc, that would prevent us simply modding parselinks along the lines of devparselinks? We wouldn't want to encourage people to use double-bracketed escaping in writeups if this type of problem existed, but if it doesn't, it might be a big win in terms of noder-convenience.
Here's my version of this.
One difference is that we ignore &#91; and &#93; altogether - these just pass through the regexp unmolested.
Another difference is that I compound all occurrences of two or more identical bracketing characters into just one literal bracket. If this is not wanted, we just lose the commas in lines 2 and 3. (Losing them will be necessary if we want to allow un-space-separated and nested constructions like the C expression arry[a[i],b[j]]. I don't know whether similar constructions exist in browser scripting languages.)
The behaviour is that text within brackets gets hardlinked as at present, except where there are two or more identical brackets at one end or the other of the bracketed text, in which case the multiple brackets are reduced to a single literal bracket by the last two lines, and what's between them isn't linked.
The thought behind my 'double-bracket' suggestion was that the syntax it adds is currently just broken syntax - instances have no use, and we shouldn't worry about any legacy data that contains such syntax. As JayBonci noted in an edev discussion on this, if we used \[ and \] then existing nodes containing linked directory paths (for example) would be broken:
C:\[windows]\[mess]
would render as
C:[windows][mess] instead of the intended C:\windows\mess (with windows and mess as hardlinks)
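To make JayBonci's objection concrete, here is a rough sketch comparing a simplified version of the current behaviour with a hypothetical backslash-escaping variant. Both functions, and the linkNodeTitle stand-in that just wraps the title in a visible marker, are illustrative assumptions and not the engine's actual code.

```perl
use strict;
use warnings;

# Stand-in for the engine's linkNodeTitle (assumption): marks the linked title.
sub linkNodeTitle {
    my ($title) = @_;
    return "<link:$title>";
}

# Simplified model of the current behaviour: every [text] becomes a link.
sub current_parselinks {
    my ($text) = @_;
    $text =~ s/\[(.*?)\]/linkNodeTitle($1)/egs;
    return $text;
}

# Hypothetical backslash scheme: \[ and \] are literal brackets,
# and only brackets NOT preceded by a backslash form links.
sub backslash_parselinks {
    my ($text) = @_;
    $text =~ s/(?<!\\)\[(.*?)(?<!\\)\]/linkNodeTitle($1)/egs;
    $text =~ s/\\([\[\]])/$1/g;    # drop the escaping backslash
    return $text;
}

my $path = 'C:\[windows]\[mess]';
print current_parselinks($path),   "\n";   # C:\<link:windows>\<link:mess>
print backslash_parselinks($path), "\n";   # C:[windows][mess]
```

The second line of output has lost both the links and the backslashes, which is the breakage described above.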
Anyway, here's the suggested code modification:
1: $text=~s/(?<!\[)\[(?!\[)(.*?)(?<!\])\](?!\])/linkNodeTitle ($1, $NODE)/egs;
2: $text=~s/\[{2,}/[/gs;
3: $text=~s/\]{2,}/]/gs;
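This has no problems with hardlinks (or 'escaped' brackets) at the start and end of strings, because it uses zero-width assertions: the lookbehinds and lookaheads only test what sits next to a bracket, without needing a neighbouring character to be there to consume.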
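For anyone who wants to try this outside the engine, here is a minimal, self-contained sketch of the three substitutions wrapped in a function. The linkNodeTitle below is only a stand-in that wraps the title in a visible marker (the real function's output is not shown in this writeup), and passing $NODE in as a parameter is an assumption made so the script runs on its own.

```perl
use strict;
use warnings;

# Stand-in for the engine's linkNodeTitle (assumption): marks the linked title.
sub linkNodeTitle {
    my ($title, $node) = @_;
    return "<link:$title>";
}

# Sketch of a devparselinks-style function built from the three substitutions.
sub devparselinks {
    my ($text, $NODE) = @_;
    # Link [text] only when neither bracket touches an identical bracket.
    $text =~ s/(?<!\[)\[(?!\[)(.*?)(?<!\])\](?!\])/linkNodeTitle($1, $NODE)/egs;
    # Reduce runs of two or more identical brackets to one literal bracket.
    $text =~ s/\[{2,}/[/gs;
    $text =~ s/\]{2,}/]/gs;
    return $text;
}

print devparselinks('see [foo] and [[bar]]', undef), "\n";
# see <link:foo> and [bar]
print devparselinks('[start][start2] [[re]][[peat]]', undef), "\n";
# <link:start><link:start2> [re][peat]
```

Note that the first example begins mid-string and the second begins and ends on a bracket; both behave the same way, which is the point about zero-width assertions.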
On mblase's original test data,
[start][start2] [start3] [[double]] [[re]][[peat]]
[in[[side] [also]]inside] [ [[] []]] [end]
(he's changed it as I noded this! Grrrr!) this produces the output (except I've added a linebreak for 'clarity' :)
startstart2 start3 [double] [re][peat]
in[side also]inside [ ] [end
However, I don't think we really need worry exactly how
[ [[] []]] [end]
renders, since it seems quite obviously broken in any case. The important thing is that it does link normally linked stuff, and it does convert [[string]] to [string], which is all we really require.
One possible flaw with altering parselinks in this way is that noders might start using double brackets in the database proper to get literal brackets in the browser. This could be bad if the bracket characters are not absolutely standard (as presumably they *are* inside <script> tags).
As a solution to this, we could create a new devparselinks function along the lines above, and call it from edevdocs, replacing parselinks, while still using parselinks from the main database. (Though we lose the advantage of having an easy way to node brackets!)
In any case, I think it's quite clear, cut and dried, and more or less unarguable, that if the user puts &#91;, then &#91; is what should get sent to the browser. If mblase's approach is thought superior, we could get round the problem by replacing the bracket pairs with strings that are likely to occur in nodes with probability 0 (as ariels would say).
For example, we could replace [[ with 'JerboaKolinowski is cool!', and ]] with 'dem_bones is a drooling idiot'. We then just substitute these unlikely strings back into the appropriate single brackets. I think mblase's idea of using &#91; and &#93;, while cute, is flawed, ultimately because it loses information: the distinction between &#91; and a literal [ just evaporates during processing.
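The placeholder idea can be sketched roughly as follows. The sentinel strings here (built from a control character) are arbitrary choices standing in for the 'probability 0' strings, the link step is a simplified model of parselinks, and linkNodeTitle is again just a marker-producing stand-in; none of this is the engine's real code.

```perl
use strict;
use warnings;

# Stand-in for the engine's linkNodeTitle (assumption): marks the linked title.
sub linkNodeTitle {
    my ($title) = @_;
    return "<link:$title>";
}

# Hypothetical sentinels, assumed never to occur in node text.
my $OPEN  = "\x01LB\x01";
my $CLOSE = "\x01RB\x01";

sub sentinel_parselinks {
    my ($text) = @_;
    # 1. Hide doubled brackets behind the sentinels.
    $text =~ s/\[\[/$OPEN/g;
    $text =~ s/\]\]/$CLOSE/g;
    # 2. Link whatever single-bracketed text remains (simplified parselinks).
    $text =~ s/\[(.*?)\]/linkNodeTitle($1)/egs;
    # 3. Turn the sentinels back into single literal brackets.
    $text =~ s/\Q$OPEN\E/[/g;
    $text =~ s/\Q$CLOSE\E/]/g;
    return $text;
}

print sentinel_parselinks('[link] and [[literal]]'), "\n";
# <link:link> and [literal]
```

Because the sentinels are distinct from any bracket or entity the noder typed, nothing the user entered is lost along the way.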
I also think if people really want their node to have [sometext], etc., they should probably just use &#91; and &#93;.
The question of exactly how
[[[string]]]
should be treated is somewhat thorny! :)
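Part of the thorniness comes down to whether runs of brackets are compounded. The sketch below compares the two bracket-collapsing substitutions as given (with {2,}) against the comma-less variant (with {2}). The link-forming substitution is left out, since in both samples every bracket touches an identical bracket and so it would not match anyway; the function names are made up for illustration.

```perl
use strict;
use warnings;

# Compounding form ({2,}): any run of two or more identical
# brackets becomes a single literal bracket.
sub collapse_runs {
    my ($text) = @_;
    $text =~ s/\[{2,}/[/g;
    $text =~ s/\]{2,}/]/g;
    return $text;
}

# Comma-less form ({2}): each pair of identical brackets becomes
# one literal bracket; an odd bracket left over stays as it is.
sub collapse_pairs {
    my ($text) = @_;
    $text =~ s/\[{2}/[/g;
    $text =~ s/\]{2}/]/g;
    return $text;
}

for my $sample ('[[[string]]]', 'arry[[a[[i]],b[[j]]]]') {
    print "input: $sample\n";
    print "  runs:  ", collapse_runs($sample),  "\n";
    print "  pairs: ", collapse_pairs($sample), "\n";
}
# [[[string]]]          -> runs: [string]        pairs: [[string]]
# arry[[a[[i]],b[[j]]]] -> runs: arry[a[i],b[j]  pairs: arry[a[i],b[j]]
```

Neither form turns [[[string]]] into a literal bracket wrapped around a link, and only the comma-less form gets the nested C expression right, so the choice depends on which of these cases matters more.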