
Hello crew! If you're reading this, you're interested in the future of the site, and for that I thank you. There's a lot of work to be done, but I'm super optimistic about the near- and long-term future of this place. I'm going to talk a bit about the project list and what it means for the changes you will see. The work of the last few months has been mostly invisible, but it has helped turn the platform we're on from something that holds us back into something that really works for us and allows us to grow. If you're interested in the technicals, read below the project tree.

As you can see from last month's log, I made a lot of cuts to functionality that just wasn't used. It's not that I want to destroy people's work, but if we're going to make radical engine changes, we need to support only the code that we use. We have accrued non-trivial technical debt here that needs to be paid down.

Here's the project tree. Each of these items has its own dependencies, but this is the high-level list:

    S3 Integration - Leverage Amazon's S3 as a makeshift CDN for us.
  • Improve Everything::S3 to be able to delete items and upload non-filesystem objects
  • Integrate the s3content dbtable into jscript and stylesheet nodes so that items can have their own version and bucket configuration.
    • For performance reasons, we will want to minify JS/CSS and upload a gzipped version
    • S3 cannot negotiate content encoding, but we can check browser capabilities on the main page. If the browser supports gzip encoding, we can point it at something like $NODEID.$version.min.gzip.css, otherwise just $NODEID.$version.min.css
    • We will eventually need to ship four versions of CSS for each upload of CSS to S3.
    • Implement a preference for theme developers (and site developers) to get the raw, uncached JSCRIPT stuff directly from the page.
  • Back up logs to S3, so that we can keep disk usage down and avoid depending on any one machine.
  • Figure out why backups are completing to S3, but are not showing up as complete.
  • Apply content cache suggestion headers to the jscript and the CSS. Because we are doing explicit versioning, we can set very long-running caches.
  • Research built-in S3 versioning, and whether it makes sense for us.
  • Move the S3 versioning away from using the NodeCache's version, as it's really not the right way to go for some technical reasons. It also screws with the development environment.
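
A rough sketch of the file-name selection and cache headers described in the list above. It is written in Python rather than the site's Perl, purely as an illustration. The function names, the one-year max-age, and the simplified Accept-Encoding parsing are all assumptions, not how Everything::S3 actually does it.

```python
def asset_key(node_id: int, version: int, accept_encoding: str, ext: str = "css") -> str:
    """Pick the S3 object name for a versioned, minified asset (hypothetical helper)."""
    # Deliberately simple parse of the Accept-Encoding header: q-values are
    # ignored except for an explicit "gzip;q=0", which means "do not send gzip".
    supports_gzip = False
    for part in accept_encoding.lower().split(","):
        token, _, params = part.strip().partition(";")
        if token.strip() == "gzip":
            supports_gzip = params.replace(" ", "") != "q=0"
    suffix = "min.gzip" if supports_gzip else "min"
    return f"{node_id}.{version}.{suffix}.{ext}"


def upload_headers(gzipped: bool, ext: str = "css") -> dict:
    """Headers to store with the object at upload time (assumed values)."""
    headers = {
        "Content-Type": "text/css" if ext == "css" else "application/javascript",
        # The version is part of the object name, so a published object never
        # changes and can be cached for a long time. One year is an assumption.
        "Cache-Control": "public, max-age=31536000",
    }
    if gzipped:
        # S3 serves back whatever metadata was stored, so the gzipped copy
        # has to carry its own Content-Encoding.
        headers["Content-Encoding"] = "gzip"
    return headers
```

Because the page picks the name, the bucket only ever serves static objects. A new version of a stylesheet gets a new name, so nothing cached under the old name needs invalidating.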
    Sitemap generation
  • Set the content type and caching properly on the sitemap stuff. That's more Everything::S3 work.
  • Set the granularity of the date in the sitemap to be finer than one day.
  • Include superdoc positions and set proper reload suggestions on various superdoc and other dynamic pages
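
For the date granularity item, here is a minimal sketch of a sitemap entry with a full timestamp in place of a bare date. It is in Python for illustration only, and the function name is hypothetical. Mapping "reload suggestions" onto the sitemap protocol's changefreq element is also an assumption.

```python
from datetime import datetime, timezone
from typing import Optional
from xml.sax.saxutils import escape


def sitemap_entry(loc: str, modified: datetime, changefreq: Optional[str] = None) -> str:
    """Render one <url> element with a second-granularity <lastmod>."""
    # The sitemap protocol takes W3C Datetime values, so a full UTC timestamp
    # is allowed in place of a bare YYYY-MM-DD date.
    # `modified` is assumed to be timezone-aware.
    lastmod = modified.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    parts = [f"<loc>{escape(loc)}</loc>", f"<lastmod>{lastmod}</lastmod>"]
    if changefreq is not None:
        parts.append(f"<changefreq>{escape(changefreq)}</changefreq>")
    return "<url>" + "".join(parts) + "</url>"
```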
    Platform stuff
  • Add better verbose logging now that we capture cronjob output. This might be better handled as an Amazon SNS message, but I'm not sure.
  • Install NTP to combat clock skew
    Webserver stuff
  • Add in a reverse proxy rule for images, which might solve the Jukka Dim emulation problem below
  • Add in a reverse proxy rule for java chatterbox, move to S3
  • Add in a reverse proxy rule for TinyMCE, because having it off page breaks some stuff, including the html edit function
    Layout stuff
  • Remove the h1 from the logo in zen, because it confuses search engine snippet generators about where the actual content is. Ideally, this sort of depends on S3 upload stuff.
  • Custodian reported a bug where categories are spooling up a bunch of stuff in Javascript, so there might be some performance work there.
  • Jukka Dim emulation has hardcoded image paths, which is causing it to be broken. I can fix it with either a reverse proxy rule or an S3 thing
    Arbitrary node metadata
  • I'm implementing a system where we can place a key-value pair on any node. This is going to eventually replace user $VARS, and allow us to place any arbitrary piece of data on a node. The inaugural feature of this is to squash ads on pages that Google has reported to me as being out of policy for their ad generation.
  • I'm also going to be documenting each piece of metadata (somehow), so that I can look it up when I inevitably forget what particular keys are across the app.
  • This feature will expand into providing microdata on various types of nodes. See http://schema.org for those technically inclined to learn how microdata works. Non-nerdy version is that it's going to give Google better insight into your writing so that it's more easily findable, and the content is easier to pick up.
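
To make the key-value idea concrete, here is a small in-memory sketch in Python (not the site's Perl). The class, its method names, and the 'prevent_ads' key are hypothetical stand-ins. The real backend lives in the database and its key names are not given here.

```python
class NodeParams:
    """Arbitrary key-value pairs attached to nodes, held in memory for illustration."""

    def __init__(self):
        self._params = {}  # (node_id, key) -> value

    def set(self, node_id: int, key: str, value: str) -> None:
        self._params[(node_id, key)] = value

    def get(self, node_id: int, key: str, default=None):
        return self._params.get((node_id, key), default)

    def delete(self, node_id: int, key: str) -> None:
        self._params.pop((node_id, key), None)

    def all_for(self, node_id: int) -> dict:
        """Every parameter set on one node, e.g. for an editor tool."""
        return {k: v for (n, k), v in self._params.items() if n == node_id}


def should_show_ads(params: NodeParams, node_id: int) -> bool:
    # 'prevent_ads' is an assumed key name; the parameter only has to be
    # defined on the node for ads to be suppressed.
    return params.get(node_id, "prevent_ads") is None
```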
    Everything engine
  • I'm going to be implementing a delta-encoded layer 1 nodecache where anything that is marked as "permanent" is tracked by a global delta update time. Whenever a perm node gets updated, it will update that global time. This will save us from having to go back to the versions table whenever we access a node inside of our L1 cache. If we pass the global delta check, no version checks are needed. If we fail the global delta check, then on that pageload only we will revalidate the versions of everything in cache. This is a fairly straightforward change that should give us pretty huge performance gains, which I will subsequently hammer into useful features. As noted below, typeversion handles most of that now without a problem, so we'll continue with that until we need memcached.
  • Remove a lot of the Everything::* functions which really want to be inside of the Everything::HTML execution space, and place them in that space instead. When that's done, we can move a lot of those out of a polluted namespace into an app logic section so that we can reduce the permutations of copy and pasted code in the codebase.
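
The global delta check can be sketched roughly as follows. This is Python for illustration, not the actual Perl NodeCache. The class and method names are hypothetical, and a simple counter stands in for the global delta update time so that the example is deterministic.

```python
class FakeStore:
    """Stand-in for the database: nodes, per-node versions, one global delta."""

    def __init__(self):
        self.nodes = {}          # node_id -> (version, data)
        self.delta = 0           # bumped on every permanent-node update
        self.version_checks = 0  # counts trips to the "versions table"

    def update(self, node_id, data):
        version = self.nodes.get(node_id, (0, None))[0] + 1
        self.nodes[node_id] = (version, data)
        self.delta += 1

    def global_delta(self):
        return self.delta

    def version_of(self, node_id):
        self.version_checks += 1
        return self.nodes[node_id][0]

    def load(self, node_id):
        return self.nodes[node_id]


class DeltaNodeCache:
    """L1 cache that skips per-node version checks while the delta is unchanged."""

    def __init__(self, store):
        self.store = store
        self.entries = {}  # node_id -> (version, data)
        self.seen_delta = store.global_delta()

    def begin_pageload(self):
        current = self.store.global_delta()
        if current == self.seen_delta:
            return  # passed the global check: no version checks needed
        # Failed the global check: revalidate everything in cache, once.
        for node_id, (version, _) in list(self.entries.items()):
            if self.store.version_of(node_id) != version:
                del self.entries[node_id]
        self.seen_delta = current

    def get(self, node_id):
        if node_id not in self.entries:
            self.entries[node_id] = self.store.load(node_id)
        return self.entries[node_id][1]
```

The trade-off is that one update to any permanent node costs a full revalidation on the next pageload, which only pays off if permanent nodes change rarely.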

This section gets updated as I work on stuff, and I often leave notes for myself here.
    October 1
  • Update guestuserbanner to put a comma in the right place.
  • Create cachedeltaversion for delta-encoded nodecache. Since that is such a complicated piece, it's easier to do that before node parameters, though it's tempting to only create a read-once cache for nodeparameters and be done with it until we need a memcache
  • Moved the items from Everything::NodeBase::PERM into Everything::CONF so that they are tweakable outside of code, basically
    October 2
  • Did a pretty serious investigation last night into what I thought would be a good idea to handle delta-encoded L1 caching, but the truth is that it's going to be a lot of really finicky work to untangle the typeversion entries from a broader typeversion strategy. No one has actually touched the typeversion stuff for years, but it's still working under the hood, so that's a good thing. The problem is that we don't pick out individual types; we just grab all of the typeversion table and run with it, which is relatively performant. Killing off nodetypes that we don't need is also a super minor speed increase. I'll need to bootstrap the typeversion table in dev in the future to properly simulate it, so there's some work to be done there before I can really continue to see if that or a different cache hierarchy strategy is in order. I did update sustype, nodelet, and a few other typeversion items to improve performance, but the other nodetypes don't sit in cache enough to warrant it.
  • Backend of arbitrary node metadata is working fine and installed into production. If database problems show up, I'll deal with them then with a front-end memcache/elasticache. For now the caching is in-page nodecache only, so it gets thrown away after the pageload. I'm going to wire up some admin/c_e tools shortly to tag stuff, and we'll move on from there.
  • Cleaning out the restricted_testdoc type just to wind down an unused nodetype. Nuked mauler's testdoc
  • Nuked New punch thyself
  • Nuked mauler's Sandbox X: Sand of Destiny
  • Nuked my updated proposal in 300 words or less
  • Nuked mauler's sandbox XI: To Infinity and Beyond
  • Nuked mauler's Sandbox I: A New Hope
  • Nuked restricted_testdoc, the nodetype
  • Added the Node Parameter Editor, and added it to episection_ces. I'm working on validation and application logic rules right now.
  • As one of the first cleanups, I'm adding cloaker support to the nodeparam system, starting with the inactive E2Med account. Name of the parameter, for my reference: 'chatterbox_cloaker'
  • Added verbosity to the database_upload_to_s3 script so that I can help catch why it's acting like it's failing.
    October 3rd
  • Deleted user_attributes, as it was a skeleton function very similar to the nodeparam stuff. Removed the user attributes code from Everything::Experience.
  • Updated noding speedometer to not use Everything::Experience as an explicit namespace in getLevel
  • Removed the Everything::Experience explicit namespacing inside of swapvote. This is an untested change, please let me know if it breaks. I need to move the Everything::Experience functions into Everything::HTML, then into something sane that isn't that, because we're getting namespace poisoning everywhere as I try to untangle stuff and move it into the .pms, and de-duplicate the code.
  • Edited kill opcode, adding in an explicit Everything::Experience call for the moment, so that when I change namespaces around for a few functions, it will all work out.
  • Removed the explicit Everything::Experience bit from the wertperch occasional playground.
  • Major change: completely revamped the module Everything::Experience to be under Everything::HTML. This is a really good thing as it is going to help us clean up a huge amount of the code sloppiness here with recursive and unmet dependencies in things. Everything is going to go into ::HTML, since it's the primary execution context for all of the eval()s, and then we can break it out into properly spaced, non-exported functions.
    October 4
  • Did the same thing with Everything::Room that I did with Experience. Similar results. Will need to remove literally 200 'use Everything::*' statements in places.
  • Cleaned out some Everything::XML pieces, so that I can have less code to maintain. We aren't going to use nbmasta anymore. Everything engine isn't an engine as much as it is something that drives this site, so we'll continue to clean it up to make it the best engine for E2 it can be.
  • Turned static nodetyping back on, and put both that and the configuration items for memcached into the JSON file, so that it, you know, uses the same configuration as everything else
  • Removed 'Everything::Experience' stuff from a few places: e2 gift shop, Voting Oracle, Websterbless, e2compile.pl
    October 7
  • Hollowed out all of Everything::Search and placed it into our new global Application logic module, Everything::Application. This is going to simplify our namespacing problems, and make our object hierarchy less convoluted. It will also enable proper code removal from various htmlcodes and other less great processing hacks we have had to use over the years.
  • ... and subsequently fixed a problem where getId and getType were not covered by the $this->{db}->$method namespacing.
  • Write out everything.errlog now as e2app.$date.log, which makes so much more sense and will allow an automated log-to-S3 delivery agent to move the files properly. It can also help track down when various problems occurred without a long, cumbersome grep.
    October 10
  • Having a busy week away from E2, but should be back shortly. Removed the horrible, horrible namespace cross-contamination inside of Everything::NodeBase, and replaced it with an instantiated call to Everything::Application.
  • Updated All The Code to use Everything::Application instead of Everything::Search
  • Removed 'Everything::Search' pieces from rdf search. Also fixed a typo
  • Removed the 'Everything::Search' piece from E2 xml search interface
  • Ugh, turns out there are two searchNodeNames. One in Everything.pm and one in Everything::Search. I'm getting rid of the one in Everything.pm, but I don't like this change. It's tough to suss out what various pages were doing.
  • Updated parentdraft and rootbeer277's other sandbox to remove the other Everything::Search calls
  • Updated cache dump to give info on which htmlcodes are compiled.
  • Added an AUTHORS.md to the repository to give people credit. I want to know if someone is missing who has worked on the site in the past.
    October 14th
  • Changed the nodecache nodelet to use the newer form of what is going to be the ARRAY return of NodeCache::cacheDump/CacheQueue::listItems
  • Updated cache dump to do the same
  • Made CacheQueue::listItems return an arrayref that includes the permanent information. Turns out we have some usergroup bloat we are going to need to handle
  • Added group size caching information to cache dump
  • Turns out that after some investigation, group caching code is somewhat broken, and group is not getting recalculated on permanent nodes. Group calculation is a form of allowed nodecache poisoning, and that is more than a bit weird.
  • Figured out the cache corruption in Everything::NodeBase. Moved all of the groupCache stuff to NodeCache where it can't break anything, and actually clear the groupCache stuff on cache purge
  • Changed Master Control to use the new $APP->isEditor() code.
  • Moved getLevel over to Everything::Application, installing a compatibility layer
  • Moved noding speedometer over to the Everything::Application version of getLevel
    October 16
  • Continuing with the getLevel cleanups from yesterday: Epicenter
    October 19th
  • Added a parameter item to start to wind down untrue gods
  • Updated Other users to use that item
  • Added a replacement for system_settings as system in Everything::CONF, so we can get rid of %HTMLVARS. The reason is that %HTMLVARS is not accessible from outside of the web application, so those shortcuts make it harder to abstract the logic. Guest user is the first such piece to go, and we'll be pulling the thirty different ways we check for guest user into Everything::Application
  • Moved the flushcbox stuff to use APP->securityLog instead of a raw insert into seclog.
  • Pulling out e2contact stuff, as that group is no more. Touched: set_htmlvars, message, chatterbox, flushcbox, Usergroup discussions
    October 24th
  • Performance enhancement on the isApproved stuff with groupCache, and added a "nogods" parameter (just has to be defined) in Everything::Application::isEditor
  • Deleted e2mottos, afd2009Username
  • Completely rewrote the way we handle meta description tags. This is kind of important. If it's a writeup, it grabs a cleaned 155 characters of the writeup. If it's an e2node and there's a lede writeup, it grabs that one's text; otherwise it uses the highest rated writeup in the node. It's not perfect, but it helps push our Google problem downfield some.
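
The selection rule for meta descriptions can be sketched like this. It is Python for illustration only. The dictionary field names ("type", "doctext", "reputation", "is_lede", "writeups") are hypothetical, and "cleaned" is assumed here to mean tags stripped and whitespace collapsed, which may not match the real cleanup.

```python
import re

MAX_LEN = 155  # description length, as stated in the log entry


def clean_text(markup: str) -> str:
    """Assumed cleanup: drop HTML tags, then collapse runs of whitespace."""
    text = re.sub(r"<[^>]*>", " ", markup)
    return " ".join(text.split())


def meta_description(node: dict) -> str:
    if node["type"] == "writeup":
        source = node["doctext"]
    elif node["type"] == "e2node":
        writeups = node.get("writeups", [])
        if not writeups:
            return ""
        # Prefer the lede writeup if there is one, else the highest rated.
        lede = next((w for w in writeups if w.get("is_lede")), None)
        chosen = lede or max(writeups, key=lambda w: w["reputation"])
        source = chosen["doctext"]
    else:
        return ""
    return clean_text(source)[:MAX_LEN].rstrip()
```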
    October 26th
  • Our site got pretty badly chuffed up by 80legs.com, which is a shady botnet operation. Fixed it, robots.txt'd them and mailed them.
  • Pushed an optimization to isSpider so that it doesn't run unless it needs to.
  • Looked at displaywriteuptitle some. It's a bit frustrating because node hits were getting updated there, but it doesn't look like it's being called from zen. Either way, removed the direct call to the old isspider htmlcode
  • Added pagecache stuff to cache dump
  • Made ajax update page return 'application/json' as the mime type
  • Through a series of recent code changes, we were no longer triggering a codepath which repaired autocommit to being on, so I explicitly turned it on. This removed a stack of cases where selects were happening for_update and leaving hugely long-running transactions open in the background. When the bot storm hit us, we collapsed under the weight of those open transactions. I've cleaned up the code a bit, and will have to do some more, but that was a deep, odd problem to hit.
  • Fixed up update New Writeups Feeder so that it doesn't just update all the time whenever someone votes :(
