Hello noderfolks. I wanted to give an update on the infrastructure moves that we've made here. With today's database maintenance complete, we are now on sustainable footing for the next decade. Today we'll be locking in a lower price for a year and reducing costs. Ultimately E2 needs to be a business, so anything I put in that reduces OpEx is excellent and allows me to focus on the next few layers of the problem. E2 is a complex technical debt problem, so here is what was accomplished in phase 2:

Moved all resources into the VPC: There was limited support inside of EC2-Classic for features and inexpensive hardware types, so I moved everything into the new virtual datacenter for E2. We are still stuck in one availability zone (us-west-2a), but that's for expense. Expanding to two AZs would roughly double the cost because everything would have to be replicated, so that's not an option right now.

Finished automated deployment mechanisms. The site can now deploy itself through AWS OpsWorks. There's going to be some tweaking to make sure that apache2 automatically restarts in every necessary situation, but that's mostly polish.

Started down the path towards "secretless" E2

Right now there is a database password and a set of S3 IAM keys, but since the apps have their own instance roles, we can now use ephemeral credentials to contact other services. Once E2 is secretless, that unlocks the next generation of services. This is part of making E2 a more "cloud native" app, which simplifies a lot of the deployment.

What's next?

There are a few threads here:

Code in the database: I need to get all the code out of the database. Right now the site is incredibly difficult to profile and debug. There's a proper MVC construct in the site; even though it's not used in the mainline display routines, it is ready to go. There are about a hundred or so documents left to move. This will allow us to wind down the patch system, because everything would just be handled through git pull requests on our GitHub site.

Spam prevention and user management: We've got a pretty big spam account problem, so I am thinking of implementing a CAPTCHA or some other mechanism.

MySQL 5.6 compatibility: E2 is an app that is 20 years old. Practices and literal defaults that were okay 20 years ago are no longer acceptable to MySQL. In the development system I have those defaults turned off, but without better knowledge of the code, it's hard to say how they are used. For instance, the default for node creation times is 0000-00-00 00:00:00, which is not a valid date. As a curiosity, dates before the year 1000 aren't accepted either.
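
To make the zero-date problem concrete, here is a minimal Python sketch of how stored timestamps could be screened before the stricter behavior is turned on. The helper name and the choice to map bad values to None (a NULL column) are assumptions for illustration, not how E2 handles it today; the year-1000 floor is the lower end of MySQL's supported DATETIME range.

```python
from datetime import datetime
from typing import Optional

# Lower bound of MySQL's supported DATETIME range.
MYSQL_MIN_DATETIME = datetime(1000, 1, 1, 0, 0, 0)


def normalize_legacy_datetime(raw: str) -> Optional[datetime]:
    """Return a datetime that falls in MySQL's supported range, or None.

    Hypothetical helper: zero dates, unparseable strings and values
    before the year 1000 all map to None, i.e. a NULL column value.
    """
    try:
        # "0000-00-00 00:00:00" fails here: month 0 and day 0 do not parse.
        parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None
    if parsed < MYSQL_MIN_DATETIME:
        return None
    return parsed
```

Running something like this over a dump of the node table would show how many rows depend on the old zero-date default before any schema change is attempted.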

Frontend consolidation: Our frontend JavaScript code is all over the place and doesn't inherently play nicely together. We need to create one solid JS library instead of improperly behaving snippets. This also means getting more recent versions of jQuery.

Remove the cron server: Right now I pay for a cron server (it's about 200 a year or whatever). It's not much, but it's an expense I'd like to eliminate. This is doable by converting the cron jobs over to AWS Lambda.
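
For a sense of what a converted job might look like, here is a minimal Python sketch of a Lambda handler that a scheduled rule could invoke. The job names and the JOBS table are hypothetical placeholders, and it assumes the schedule passes the job name in the event payload; the only part taken from Lambda itself is the handler(event, context) entry point.

```python
from typing import Any, Callable, Dict


def expire_old_sessions() -> str:
    # Placeholder body: a real job would do its database work here.
    return "sessions expired"


def rebuild_statistics() -> str:
    # Placeholder body, same caveat as above.
    return "statistics rebuilt"


# Hypothetical mapping from a job name to the function that runs it.
JOBS: Dict[str, Callable[[], str]] = {
    "expire_old_sessions": expire_old_sessions,
    "rebuild_statistics": rebuild_statistics,
}


def handler(event: Dict[str, Any], context: Any) -> Dict[str, str]:
    """Run the job named in the event, e.g. {"job": "rebuild_statistics"}."""
    job_name = event.get("job", "")
    job = JOBS.get(job_name)
    if job is None:
        # Failing loudly makes a misconfigured schedule visible.
        raise ValueError(f"unknown job: {job_name!r}")
    return {"job": job_name, "result": job()}
```

One function with a dispatch table keeps the deployment to a single artifact; one function per job would work just as well and gives each job its own timeout and permissions.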

Use Amazon's logging agent for E2 logs: A big part of what the webservers do today is keep their disks from filling up by cleaning up their own logs. Now that we are on a managed Linux platform, we need to rationalize our log format and get it over to CloudWatch Logs so that we can alert on error types. Our logs are kind of unparseable since they contain things like Perl backtraces, and that doesn't help us detect errors.
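
As a rough illustration of what rationalizing the log format could mean, the Python sketch below folds a multi-line error (message plus backtrace) into one JSON record per event, which is the kind of line a log agent and an alerting filter can match on. The input layout (a bracketed timestamp and level starting each entry, with backtrace lines following) is an assumption, not E2's actual log format.

```python
import json
import re
from typing import Iterable, Iterator

# Assumed entry header: "[2019-05-01 12:00:00] [error] message text".
ENTRY_START = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] (?P<message>.*)$"
)


def to_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Fold each entry and its continuation lines into one JSON line."""
    current = None
    for line in lines:
        line = line.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            # A new header closes the previous entry.
            if current is not None:
                yield json.dumps(current)
            current = {**match.groupdict(), "backtrace": []}
        elif current is not None and line.strip():
            # Any other non-blank line is treated as backtrace.
            current["backtrace"].append(line.strip())
        # Lines seen before the first header are dropped.
    if current is not None:
        yield json.dumps(current)
```

With one record per event, alerting on error types becomes a match on the level field rather than a guess at where a backtrace starts and ends.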

The nearest alligators to the boat are secrets and logging. I'm going to continue pulling code out of the database and working towards a future where we can start to address the display problems.

Towards the future!

-Jay