contegix: beyond managed hosting

Archive for the ‘Tech’ Category

As of late, Erlang (the functional programming language behind ejabberd and portions of Facebook) has been picking up steam and gaining some popularity. Recently, there was a question regarding Erlounge meetings in Missouri on the Erlang email list. I have personally been hacking around with Erlang for a few months and decided it would be the perfect time to start ErloungeSTL.

I am proud to announce the first meeting will be held at Contegix on September 11th at 7pm. There will be something for the seasoned Erlanger as well as the novice. The presentations/sessions should be interesting for anyone that has a background or interest in Erlang, functional languages, concurrent programming, or just programming.  More info can be found here.

All are welcome to come.  Contegix will be providing food and drinks to celebrate this first step.

We’re often asked at Contegix, “Do you perform automatic upgrades of Application XYZ?”, and our answer is always an emphatic “No”. This tends to spark some debate, since we do perform RHEL updates automatically. First, let’s define “automatic”, because obviously we’re not shutting down instances/servers without explicit permission from you or your team. For standard RHEL updates, we inform you that we need to perform updates on your servers once the updates have passed a rigorous round of testing and have both the Red Hat and Contegix internal “go-ahead”. We consider these mandatory for security reasons. Red Hat doesn’t push superfluous updates down the pipe to your servers. They’re generally provided for very specific purposes, and the number one reason is security. We can push these updates because 99% of the time the end-user (you) won’t even notice the difference. On rare occasions an update may have an odd effect, but I’d like to stress that this is excruciatingly rare.

Let’s compare this to… well, -any- web application you’re running right now. First off, keeping up with what’s running on every customer’s server is a massive chore in and of itself. Keeping up with that list, and checking that every web application is running the newest version, is a research nightmare. Obviously, we’re aware of the big applications (aka our managed applications), such as the Atlassian suite of applications, Wordpress, Jive’s suite of applications, and so on. Unfortunately, keeping tabs on all the various web applications we use, and their version numbers, is a bit rough, but it is something we plan on tackling in the future. The real problem, however, lies in the following question: “Do you really want to upgrade?”

The problem is that many applications have introduced the wondrous world of plugins. Honestly, from our side of the fence, plugins create a lot of havoc. For one, they’re not always supported by the main developers of the application in question, which restricts the level of support we can offer for a product using them. Secondly, they make application upgrades comparable to a roller coaster where the cars may or may not come unhinged from the track, sending you careening into a brick wall. That’s not to say we don’t like plugins, because we love plugins. For instance, the Wordpress Automatic Upgrade Plugin turns Wordpress upgrades into a quick 5 minute ordeal. No need to worry about asking us to upgrade your Wordpress, take backups, and hope that we catch any theme changes that need to be made in the process. Instead, a few button clicks and this plugin will complete the upgrade in no time flat, bringing you to the latest version of Wordpress. I’ve used it on my personal blog a couple of times now, and it worked flawlessly. Obviously, your mileage may vary, but if nothing else it performs backups before it does anything, so if the upgrade fails, reverting is a snap.

Why on Earth would a Wordpress upgrade fail though? Plugins. It’s the same reason we have upgrade problems with any application we work with: plugins inherently create issues for upgrade procedures because they introduce new quirks that may fail when the core application is upgraded. Depending on how integral that plugin is to your application instance, this could cause an upgrade to become a complete failure. A default instance of Confluence/JIRA/Crowd upgrades smoothly, with no problems to worry about. An instance of Confluence with a bunch of plugins, theme changes, and so on, however, tends to be a bit more interesting. It’s not really Confluence’s fault; in fact, it’s quite likely that the weird plugin you were skeptical about installing is breaking something internally, thus causing the upgrade to fail. More often than not, though, Confluence upgrades fail due to heavy edits to themes, generally via the Theme Builder plugin. This causes theme anomalies when the Theme Builder plugin is out of date or not functioning properly, and the changes in Confluence between versions also contribute to issues with your themes, such as in 2.8 when the theme was prettied up quite a bit (nice job Atlassian!). All of a sudden, what should have been an easy, smooth ride is now resulting in an extra half hour of downtime as we scramble to fix the problems. Then we have to come to a decision on rolling back or progressing through the issues.

This is why we generally frown on automatic upgrades: plugins add a significant curve ball to the mix that we can’t foresee. If keeping up with every web application is a documentation job of epic proportions, imagine trying to track plugin compatibility, the plugins installed, and the ones not installed on all customer Confluence instances! We like to keep downtime to an absolute minimum, which we hope is half the reason you’re with us, and that’s why we avoid automatic upgrades. Instead we encourage staging instances, scheduled tasks, and taking each upgrade on a case-by-case basis. Do you want us to merely say “Confluence 2.8.1 is out, and we’ll be upgrading you on MM/DD/YYYY at 00:00”? We believe it’s in everyone’s best interest for you to decide when to upgrade, and to let us know. We’ll work through the process with you, check compatibility/dependency issues, and set the event up for a time that suits your needs best. If you’d like to see it staged out first, that’s fine too. We’re more than happy to set up a small staging instance for the upgrade when necessary, assuming it’s not detrimental to the overall health of the server. We want to work with you as much as we work for you and your company. If you have any thoughts or suggestions on our upgrade procedures, feel free to drop them in the comment box!

We’ve spoken in the past about Hyperic monitoring, and the roll out of this application to our managed customers. I felt that Hyperic is so slick it deserved more attention. Some of our customers have already been given access to our monitoring system, and from the feedback we’ve received they’re quite ecstatic with it. That’s not to say that there aren’t some kinks, because there are, but I must say the microscopic kinks are almost unnoticeable. Hyperic is always improving though, and we’re doing our best to exploit the very best of this application to better serve your infrastructure here at Contegix. The servers that have Hyperic configured on them have a wide range of monitoring options such as:

  • CPU Monitoring
  • Load Averages
  • Filesystem Usage
  • Database Monitoring (MySQL, PostgreSQL, EnterpriseDB, etc)
  • HTTP Checks
  • Zimbra 4.X
  • IMAP, POP3, SMTP (on any obscure ports imaginable!)
  • Memcached
  • Tomcat
  • Resin
  • Apache HTTP
  • And so many more options you would die reading the list

We receive well over a thousand emails a day from our monitoring system, letting us know when your servers are leaving the realm of acceptable levels in a wide variety of categories. This allows us to be proactive in regards to your server’s health, and attack trouble areas before services are impacted. For instance, if we see the load on your server climbing above the typically acceptable level of 5, and staying high, we know to investigate the server before services are impacted.
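To make the "climbing above 5, and staying high" idea concrete, here is a minimal sketch in Python. It is not how Hyperic implements its alerts; the names `is_sustained_high` and `SUSTAINED_SAMPLES`, and the choice of three consecutive samples, are assumptions made purely for illustration.

```python
import os

LOAD_THRESHOLD = 5.0    # the "typically acceptable" ceiling mentioned above
SUSTAINED_SAMPLES = 3   # hypothetical: consecutive high samples before we care


def is_sustained_high(samples, threshold=LOAD_THRESHOLD, needed=SUSTAINED_SAMPLES):
    """Return True when the most recent `needed` samples all exceed `threshold`."""
    recent = list(samples)[-needed:]
    return len(recent) == needed and all(s > threshold for s in recent)


def current_load():
    """One-minute load average of the local machine (Unix only)."""
    return os.getloadavg()[0]


if __name__ == "__main__":
    # A real monitor would collect one sample per polling interval;
    # here we only show a single reading.
    print("current 1-minute load:", current_load())
```

Requiring several high samples in a row is one simple way to ignore a brief spike while still catching a load that stays high.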

Most importantly though, you don’t have to deal with the awkward situation of your website’s visitors telling you your site is down, if you’re monitored by our system. If Hyperic is monitoring your site, then that site will be checked every 5 minutes, making sure it gets a response, and checking the site for a search string that should appear on your site. If the monitor fails, we’re alerted immediately, and respond to the situation. If you have special instructions for us, we make every effort to follow them to a tee, and if you don’t have special instructions we’ll handle the situation the best way we know how to return your site to working order. For instance, on typical Java applications, we’ll thread dump the instance, restart it, and notify you of the maintenance that was performed.
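The check described above (get a response, then look for a search string) can be sketched in a few lines of Python. This is only an illustration of the idea, not Hyperic's own HTTP check; the function names `body_is_healthy` and `check_site` and the ten second timeout are assumptions.

```python
import urllib.error
import urllib.request


def body_is_healthy(status, body, search_string):
    """A check passes only if the status is 200 and the body contains the string."""
    return status == 200 and search_string in body


def check_site(url, search_string, timeout=10):
    """Fetch `url` and report whether it answered with the expected text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return body_is_healthy(response.status, body, search_string)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Running something like `check_site(url, "text that should be on the page")` from a scheduler every five minutes would approximate the behavior described. Checking for a search string matters because a broken application can still return a page with a 200 status.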

I do admit though, as much as we strive to be, we’re not always perfect. At times we do require assistance from you and your team to help us be the best that we can be. While many servers at Contegix follow the Contegix way of doing things, not everything follows exactly what we’re accustomed to. That’s okay though, we don’t mind it; after all, these are your servers! However, for us to monitor your services to our fullest potential, we do encourage you to let us know what needs to be monitored. Even if you don’t have a special setup, we don’t mind you checking with us on what’s being monitored. In fact, I encourage that too! We want you to feel comfortable here, so if you’d like to double check with us that everything you need monitored is monitored, drop us a line. There’s absolutely no harm in that, as it ensures that nothing is missed, and that we’re serving you to the best of our ability. Please keep in mind though that the Hyperic agent is a Java application, so running it on your server will require a small amount of memory. If you already have a heavily taxed server, throwing the Hyperic agent into the mix may not be a good idea, but I believe this to be a very rare situation.

Finally, maybe the coolest part of Hyperic is that we can give you access to the system as well! This gives you the ability to see the metrics that are produced by the monitoring system for your servers. The access granted to you is read-only, so you can’t create sensors, but you can always ask for new ones (again, it’s encouraged!). This ability has already helped a few of our customers by giving them insight into how their services were behaving, allowing them to clean up trouble spots in their applications and infrastructure. All you need to do to gain access is drop a line to support@contegix.com, and we’ll be happy to get it set up for you. Let’s take a look at Atlassian for a perfect use-case scenario in which Hyperic can be of great assistance.

Their documentation has a section for monitoring critical production systems. If you visit that section you’ll notice the power of Hyperic on display in the images shown. In that article they go on to demonstrate one particular scenario in which the graphs enabled them to catch a critical issue with an instance of theirs, which gave them a nudge in the right direction towards correcting the problem. Furthermore, Hyperic themselves noticed Atlassian’s documentation, and hinted at a potential pair of plugins for monitoring Confluence and JIRA in particular! Just remember, we’re here to help you improve in any way possible. Drop us a line, and get more from your hosting environment with Hyperic access!

At Contegix, the NOC engineers spend a lot of time working with Atlassian’s products. We are in a constant cycle of installing, maintaining, and upgrading Confluence, JIRA, etc. for customers. We install their plugins, help work out the kinks, and make sure the applications stay running as close to 100% of the time as possible. Oddly, up until lately we’ve mainly used only one of Atlassian’s applications internally - Confluence. Of course, we use Crowd here and there as well, but it’s transparent and I never need to worry that it even exists in our infrastructure. JIRA is used for projects with our development consultants and special projects. Everything we document ends up in Confluence, and that allows us to be more productive as we have this incredible encyclopedia of knowledge constantly at our disposal. The need for JIRA by the engineers didn’t ever really seem excessively relevant in the past, nor did Bamboo, Fisheye and Crucible. We’re a hosting company with system administrators, not a software development company. In our minds, it didn’t make as much sense. At least, it wasn’t overtly clear that we needed JIRA.

We had Subversion running to handle the code for most of our internal projects, emails to and fro between engineers served as bug reports for the internal scripts we use, and email was used to announce new versions of various scripts. We’d also use Confluence in a backwards way to help manage some of our internal projects, which wasn’t the best solution. It worked, yes, but it wasn’t optimal. We didn’t know any better though. We’re administrators, not users! How were we supposed to know that JIRA was so slick?

Well, the advent of JIRA Studio has taught us a solid lesson: the NOC engineers needed JIRA a looooong time ago. We often ran into problems in the past where one person would write a script or an application, but it wouldn’t gain widespread use amongst our engineers. The simple fact was that either not everyone knew about it, or the script needed updates and wasn’t being properly maintained. This would lead to a wide variety of editions of the script floating around - the absolute death of the script in the first place. Then, we’d revert to everyone doing everything by hand again. It was an endless cycle that would start every time someone wrote a mediocre or decent script. We wouldn’t give the script the proper care and love it deserved, it’d gain moderate use for a bit, and later find itself in the script graveyard. The alternative reality was that individual engineers wrote their own scripts, but never really found the means to openly share them. All of that soon changed, as our illustrious leaders bestowed upon us a great and magical gift… JIRA Studio (add your own fantasy based sound effects here, I prefer trumpets personally).

All of a sudden, we went from having no way to manage our many scripts, which had been tossed around the office like dirty laundry, to having five projects so far managed in JIRA Studio. We haven’t had the instance up for more than a couple of weeks, and I’m sure the number of projects we maintain internally with JIRA Studio will only increase. It’s amazing how much easier it has made life for us already though. We have Subversion repositories for our individual projects, issue tracking, code reviews, great tools to analyze our repositories with, plus a solid documentation backbone. Each project is sectioned off to what feels like its own little world, yet it’s still a part of the big picture that is our developmental operations here at Contegix. We have scripts for customers, programs that make our internal life less chaotic, along with our very website just in case anyone feels the need to make improvements to it.

I think the best part of the whole experience is that we’re all finally starting to share code, discuss new ideas for automation, and expand our thinking quite a bit. Gone are the days of not enough time, not enough resources, or too much effort. We don’t have to worry about those issues now. Now I can whip up a script that has a solid base to help out our company, create a JIRA Studio project, and as a team we can nurture the script to a fruitful life.

It’s odd how just having a launching pad for our internal development projects has opened our eyes quite a bit. Before, we were quite content doing a lot of our work by hand, because our development process wasn’t exactly the best. The big problem with automating what we do is we’re dealing with your (our customers) production systems. If we’re going to develop automation tasks for your systems, they absolutely must be 100%. We won’t mess around with your systems by testing half baked scripts on what for many of our customers is their livelihood. That was the past though, because now we have solid testing sandboxes set up for our automation tests, along with JIRA Studio to help us manage the process of developing our applications. We’re already starting to see the benefits as bug reports roll in, fixes roll out, and new projects are started. I’d say this is the beginning of a new era for Contegix, as we’re now more capable of streamlining our efforts thanks to JIRA Studio.

I guess my overall point is that if you think JIRA Studio isn’t right for your company because you’re not a development team, you may want to reconsider. I don’t believe we ever thought we needed JIRA, and we don’t need it, but I sure don’t want to go back to life without it!

As a large hosting provider we use a lot of different applications, and we try to keep them all as secure as possible. Unfortunately, we can only win so many battles at any given time, and at times we do require help from you, the customer, to ensure your system is safe. Wordpress, as great as it might be as a blogging platform, seems to find itself getting hacked more than most applications that we host. Now, I’m not saying that Wordpress is a bad application by any means, but with it being such a large platform it draws a lot of unwanted attention.

As such, there are quite a few hackers and script kiddies out there that will try to compromise your Wordpress based website. We’re hoping that with this article we can further educate Wordpress users on how to protect their sites. Here are a few helpful tips we can provide, some of which you can have us do, and some of which we recommend that you do:

5. Please, please, please don’t use a user name of ‘admin’! I know that’s the Wordpress default, and it’s just easy to use it, but what user name do you think is in every brute force attack? You guessed it, ‘admin’. We’d recommend using a unique user name for administration purposes, like ‘mark.rogers’, or ‘mrogers’. Of course, you can use your own name if you don’t like mine I suppose.

4. Remove the Wordpress version number from any headers, footers, CSS, etc. Leaving the version number in your page source is a dead giveaway that lets would-be vandals dig through Google to find ways to exploit your specific version. Granted this is the equivalent of leaving your lights on at home while you’re away, but if it deters someone, then consider it a victory! It’s just too easy to use the version number to find exploits for your site, as Wordpress exploits become public knowledge too often.

3. Let us put basic Apache authentication on the /wp-admin section of your blog! We’d be more than happy to do it, and it’ll make every PHP file in the /wp-admin path even harder to get to. Granted it can be a bit of a nuisance to log in twice, but not nearly as big of a nuisance as restoring from that backup you took last week, right? We can also limit access to the wp-content and wp-includes directories. Plus we can lock it down by IP, or user name/password combos.

2. I know it should go without saying, but please choose hard, random passwords. I know a lot of blogs, my own included, started off really small, and I never worried about getting hacked. My blog never got big, but maybe yours will! Either way, play it safe, and go with a hard password from the get go. That way if your little playground gets bigger one day, or if you land on Digg by accident, you’ll be at least somewhat prepared.

1. While the above options are great for helping secure your instance of Wordpress, there’s one piece of the puzzle that is probably the most important. That piece? Keep Wordpress up to date at all costs! There isn’t an option that can replace this critical piece, because Wordpress being the giant of blogging that it is, is constantly being updated to fix security flaws. Staying up to date is a way of staying ahead of the game, and it’s generally a ten minute ordeal that we’ll take care of for you if you’re a customer of ours! Look at it this way, if you’re running a year old version of Wordpress then you’ve given vandals a year to figure out how to hack you. Why give them that edge? Most Wordpress upgrades are painless, and you know we’ll gladly work with you to schedule it for a time that’s best for your company’s needs as well.
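Two of the tips above lend themselves to a quick self-check, sketched here in Python. The first helper looks for the `generator` meta tag that Wordpress themes commonly emit (for example `<meta name="generator" content="WordPress 2.6" />`); the exact tag layout on your site is an assumption, since themes can change it. The second builds a hard, random password with the standard `secrets` module. The function names are hypothetical and purely for illustration.

```python
import re
import secrets
import string

# Assumed layout of the tag: name attribute first, then content.
GENERATOR_RE = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']WordPress\s+([\d.]+)["\']',
    re.IGNORECASE,
)


def exposed_wordpress_version(html):
    """Return the version string leaked in the page source, or None if absent."""
    match = GENERATOR_RE.search(html)
    return match.group(1) if match else None


def random_password(length=20):
    """Build a password from letters, digits, and punctuation using a CSPRNG."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

If `exposed_wordpress_version` returns anything for your front page source, tip 4 still applies to you. The version number can leak in other places too (feeds, script and stylesheet query strings), so a `None` here is not proof that it is hidden.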

Hopefully this helps answer some questions on how to protect yourself from would-be hackers in regards to Wordpress. The fun part is that this applies to quite a few PHP applications in a general sense. Drupal, Simple Machines Forum, and so on can all benefit from these security tips, especially security tip #1! As always, customers, drop us a line at support@contegix.com with any questions you might have.

In the past few months, nginx (pronounced “Engine X”) has become The Little Engine That Could. This is most evident in Rails deployments and in Zimbra 5, where it replaced perdition for IMAP/POP3 proxying. For Rails, it is typically replacing Apache 2.2’s mod_proxy_balancer as a front-end to Mongrel.

One of our engineers, Joe Williams, decided to put both systems to the test with a Battle Royale. Check out the results.