Too many ideas and not enough doing.
That's my refrain, and I'm looking to break it. It's very easy for me to think of ideas which are fascinating, but then I often find something else to occupy me while I think about them. Laundry, dishes, breakfast, lunch, after-lunch snack, after-after lunch snack, jogging, GemCraft. Some days it seems the only time I get anything done is when I'm busy not getting something else done. Which might explain why I'm writing this.
I was struck by a video I saw the other day. It was ostensibly about how to do a video blog, but one of the points he included was an advisement about life in general: "When doing something, try. Because it might hurt less if you fail after putting in your full effort, but you're a lot more likely to fail if you don't." It's not like I haven't heard such a thing before. It's not so much a revelation as it is a call to focus. Part of the reason I don't get some things done is that I know they are difficult, and I want to think about them as much as possible beforehand, so they don't go wrong. And I think them right into never getting done. A weird sort of being so cerebral as to not try at all.
Which leads into a bunch of stuff related to E2, go figure. You guys are probably aware I have a ton of random ideas relating to E2, and I try to only write the stuff in root logs which actually has been coded and is running. Still, there's dozens of ideas buzzing around, some of merit and others certainly not, and I thought I'd write down a smattering of them that have crossed my mind tonight:
- Autocomplete linking to reduce typos and save time. Say you'd type »[foo« and a small list of autocompletes would appear below where you're typing: "foobar", "foo", "food". A very short list of matches, but enough that it'd save people time searching and linking. You could, of course, just keep typing and ignore it, just like spelling completions in Word. This would have the added side benefit that the common confusion about pipelinks ("Node first or text first?") would be automatically answered, since the autocomplete would trigger immediately.
- Links displayed in the WYSIWYG editor (aka TinyMCE). I've been trying to use TinyMCE lately to get an idea of how capable it is, and where its shortcomings are. It'd be nice if stuff you put in square brackets actually did give you the "what you see is what you get" treatment -- or at least if the link button were restored and created a WYSIWYG link. This is a little hairy to do because the way TinyMCE works for non-HTML stuff is that it keeps translating back and forth between HTML code and source code. You run a regexp on the source code to turn <a href="/title/foobar">bar</a> links into [foobar|bar] and back. That's hazardous and easy to get wrong. It's why Wikipedia chose not to use TinyMCE.
- Accurate display in WYSIWYG. Everything is black text on white, which might be easier to edit with if you're using a theme optimized for reading (but not writing), but it really isn't WYSIWYG. It'd be nice to ensure the same styles got applied to the WYSIWYG div as to normal writeups. Individual themes could, of course, still customize the WYSIWYG display to make it easier.
- Prevent pasted junk in WYSIWYG. Although we disable buttons for tags that don't work on E2, you can still paste in arbitrarily-formatted HTML from another web page or Word and have it show up in the WYSIWYG editor. And it'll look fine in there, but terrible when other users see it rendered on E2. No idea how to do this one just now.
- AJAX saving of writeup being edited. I've begun work on this one. Common sense thing to do, but it's important to do it right and not accidentally make it a liability.
- Other languages. Perl is somewhat unpopular at present with no sign of that changing. Perl 6 has been in hazy development for 7 years now (though it has produced some impressive essays about language design called "Apocalypses"), and Larry Wall's once-benefactor O'Reilly can no longer harp on a publish date to make it get out faster. Being Perl-only is probably bad for E2's long-term viability.
However, greenfield rewrite projects almost always fail, and more than enough would-be-E3 projects have died already.
It might be possible to make production-quality changes to support other languages while keeping E2 just as functional as it always has been. We could expose symbols to other, more popular languages, and still have acceptable performance, by compiling them down to machine code. Since we already have the framework of htmlcodes as self-contained functions, this seems very easy to test while keeping it self-contained. I've played with PHP::Interpreter, which does most of this (minus the machine code part) for PHP, and it has really helped with debugging some ugly PHP before. The problematic thing (in my eyes) is ensuring that adding support doesn't mean we're running a giant mod_perl *and* a giant mod_something_else for what is presently only a theoretical gain in coders.
- Compilation of opcodes and htmlcodes. Perl is an interpreted language, which means that when the code is first loaded, a parser has to spend a fair amount of time and memory making the human-readable squadoo into machine-readable squadoo. This is done for ecore once when Apache loads up a process, and then is done, allowing the compiled Perl to run fairly fast.
Unfortunately, we do this compilation step every time an htmlcode is called, and dozens of these are called on every pageload. Months ago I wrote some code to store the result of this compilation so that E2 would use less CPU. It didn't work quite right, though. DonJaime found it was causing every htmlcode to run as if it were using the same node as when it was first run. He suggested a solution, but I have not tried it yet.
This might cause a greater use of RAM, since we'd be storing the compiled code, but maybe not. If Apache serves pages faster, it's less likely to need to fork off processes, so we might wind up using less RAM overall.
This is also of interest to me because, at present, notifications recompile and run two bits of code for every one displayed. One to make sure the notification is still valid, and one to show the notification. That adds up quickly.
- Notifications list. tentative suggested this one months if not years ago, but it was never a priority. However, with notifications being used for more and more things, it's easy to lose track of them, and the only way to find a notification that has "scrolled off" is to keep clicking the little 'x' to dismiss other notifications. Most of the code is there and already pretty modular from the notifications nodelet. Just need to turn it on.
- Messages nodelet AJAX load. Oolong wrote up a Messages nodelet which will hopefully be the default for new users soon, replacing the current non-intuitive Chatterbox private messages section. The main thing I've had trouble with on it is that when I archive and delete messages, it doesn't pull in more messages to replace them, so I have to reload the page. On further investigation, the nodelet reloads fine so long as it's in the sidebar. It just doesn't work on the nodelet display page, where I was mostly using it.
- Nodelet editing for individual pages. This is something DonJaime has put in place. There just needs to be a clean interface for adding and removing nodelets. I want this so that I can use chatterlighter with the Messages nodelet for the most part. Also, it'd be really nice now to have chatterlighter replace chatterlight. (And, yes, I was the heel who complained about DonJaime's plan to do that very thing way back when.)
- Profiling E2 and optimizing. We now have multiple webheads, so I can do something we couldn't do before the move: run a profiler on Apache (which requires running it in single-threaded mode) and actually get some hard numbers about what eats up the most time when running E2.
- Memcached. Ecore has the preliminary bits to use a distributed memcached as a backup to the local NodeCache that every Apache instance keeps. It may or may not buy us much. However, at present, on every pageload we check that every single node we access isn't stale, and this includes every htmlcode, container, stylesheet, and user. On many pageloads, that's close to a hundred brief SQL calls of the form "SELECT version FROM version WHERE version_id = 123456". The good news is these are tiny requests which can be answered by the MySQL query cache the vast majority of the time. The bad news is that they are almost always saying "Your node is perfectly fine, no need to update." If memcached offers a way to do a push notification to all clients that a given stored item has been updated, thus removing the need to do a read to check whether a node is updated, this could cut down on a lot of read traffic to the database.
But I need to research to know whether this is a viable solution.
- AJAX user creation form (aka Create a New User). This is something most modern sites do. If you've got mismatched passwords or you have chosen an illegal/previously-used username, it shows up on the page before you submit, so you don't have to go through a couple of pageloads to get it to work. Just a small thing to make the place look nicer.
- Switch health check page to something lower cost. Right now I have haproxy, our load balancing software, try to grab ENN to determine if a web head is healthy. If this takes longer than 50 seconds or comes back with an error page, that web head gets removed from the pool. This was a switch from grabbing a static HTML page, which ascertained whether the machine was healthy but not whether ecore was, and so it left a bad machine in the pool. However, it takes on average between 2 and 3 seconds to get this page, which is really long. Although that makes it a good test for errors, we might want to switch to grabbing something cheaper, like Everything Quote Server.
- AJAX sign in form. Primarily for failed sign ins, so you just get a message on the current page, without everything loading just to tell you that you got your password wrong.
- Prevent auto-saved pages. There have been multiple reports now of people claiming they were "signed out by E2". I can reproduce this behavior with Safari, but the problem isn't really with E2. It goes like this: You sign into E2 but don't check "Remember me". You open multiple tabs to E2. You close Safari. You open Safari. It remembers the pages you were on, and loads them from cache, making it look like you're signed in. However, since you didn't check "Remember me", your cookie got deleted when you closed the browser. Any links you click on the page take you to an E2 page where you're signed out, and stuff like the catbox doesn't seem to work. If there's a way to tell Safari not to cache pages like this, it'd help. Otherwise, we will have to do something with JavaScript: encoding a variable like "var signIn=" . ($$USER{title} ne 'Guest User' ? 'true' : 'false') . ";" in the served page, then doing a regular AJAX call to see if the user is still signed in, and, if the two don't match, reloading the page so things look right. Opera might do this same thing.
Call for Developers
That's a short list, and most of those aren't priorities, but all of them are worth a look. However, outside of people on staff, E2 is really low on casual coder contributions. Traffic on edev has dropped to almost nothing. I'm sure part of this is because I have been ill-tempered on that list in the past, and for that I apologize.
Development is way better than it has been for years, though. We have multiple dev servers, so people can work on pet projects without disturbing each other. Almost all changes are now done using patches, so things are well-documented, and it's a lot easier to track down where new bugs come from. Plus, all changes go through the dev server first, so we almost never see site-breaking coding mistakes any longer. If something goes kaboom on the dev side, we can recreate a dev server from scratch in under five minutes. And I'm not sure when was the last time we had as many active people on staff who understood ecore so well.
So, if any of that stuff up there sounds interesting to somebody, toss me or Oolong a message to get in edev, and we'll get you rolling on the dev server. Contributors very much welcomed.
Cool Python/Ruby Talk
Most programmers are aware that Python and Ruby are very popular languages today. Ruby's popularity is driven a great deal by the popular Rails platform, which makes it possible for somebody who doesn't know much about writing code to still make an acceptable, database-driven website. And Python's popularity has gotten huge boosts from testimonials by geek luminaries such as Eric Raymond (seen here) and Randall Munroe (seen here).
Slightly related to the long list of ideas for E2 above, I caught a neat little presentation, about 20 minutes long, entitled Python Vs. Ruby: A Battle to the Death. It's more about the design philosophy of the two languages and how those affect how much you can bend the language syntax.
What I really came away with was how difficult it must be to make Ruby's interpreter run quickly. One of the neatest things in compilers I have seen in the last few years was HipHop, an invention by Facebook's engineers which will take PHP code and create C++ code which you can then compile, gaining significant performance improvements. Now, PHP is sort of hideous in one major fashion: It didn't have namespaces for most of its existence. So most PHP code puts everything in a global namespace. Symbols are everywhere, and it's hard to tell where they come from. Even Perl doesn't do that. But the PHP language syntax itself is still static enough that most code can still be compiled to C++ without preserving the PHP interpreter.
Ruby, with its ability to add entirely new syntax, not just at the start of interpretation but by dynamically changing what syntax means as a program runs, seems to make it impossible to abstract it into a lower-level language for speed gains. Python, on the other hand, already has Shed Skin, which compiles a restricted, statically typeable subset of Python to C++, and PyPy, whose toolchain translates a restricted subset called RPython down to C. Since things compiled to C/C++ can be exported as symbols and called easily from Perl, this provides a pre-existing route to potentially add Python htmlcodes with very good performance and little original code necessary.
That's not to say Python, if its full featureset is utilized, can't be confusing or prevent this sort of optimization. I do recall an amusing hack, a few April 1sts back, when Python was extended to add the "comefrom" directive. But the way that worked was by preparsing the relevant Python files and catching the exceptions thrown by the Python interpreter when it found this new (illegal) keyword. Generally speaking, Python is static enough that it can be compiled down, removing the need to lug around the Python interpreter to run Python code.
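To make the "full featureset" caveat a bit more concrete, here is a minimal sketch of the kind of runtime dynamism that ahead-of-time compilers generally have to restrict or handle with some fallback. The Greeter class and call_greet function are made up for illustration; nothing here comes from the talk.

```python
class Greeter:
    def greet(self, name):
        return "Hello, " + name


def call_greet(greeter, name):
    # The method is looked up by name at call time, not fixed at compile time.
    return greeter.greet(name)


g = Greeter()
before = call_greet(g, "e2")

# Replace the method on the class while the program is running. This is legal
# Python, but a compiler that had hard-wired call_greet to the original
# Greeter.greet would now produce the wrong answer.
Greeter.greet = lambda self, name: "HELLO, " + name.upper() + "!"

after = call_greet(g, "e2")
```

Code that avoids tricks like this is the code that compiles down cleanly, which is why "most Python" rather than "all Python" is the realistic target.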
Drifting away from the E2 implication for a bit, and back to the talk, the thesis seemed to be "Ruby lets you create domain specific languages which can allow more readable code than Python ever will". Which is funny, since one of the things people tout about Python is that it almost reads like English, whereas Ruby often looks similar to Perl, with punctuation all over the place. So there Ruby is: extending itself to overcome its own weaknesses. Which I have to admit is so meta that it is very cool.