Why we need a secure, high performance content engine
Conventional CMS platforms have not changed much in the last decade. Although minor features have been added and plugins fill some of the gaps, the core platforms still depend largely on legacy technologies, typically the LAMP stack: the Linux OS, the Apache web server, the MySQL database and PHP as the application platform.
By contrast, web technologies have developed significantly: rich single-page applications and websites built on HTML5, CSS3 and JavaScript have emerged; cloud-based, API-driven services now offer ready-to-use interactive features that conventional websites had to build on their own; and DevOps has evolved to automate the design, development and deployment of web applications and websites. This white paper outlines how continuing to use conventional CMS platforms is becoming increasingly risky and irrelevant as these technologies emerge, and describes the architecture of an alternative CMS based on them. To understand the alternatives to a conventional CMS, we first identify the core features that a CMS provides.
The Content Management Lifecycle
The following diagram represents a typical Content Management Lifecycle:
Conventional Database Driven CMS Platforms
Conventional database-driven CMS platforms handle the entire content lifecycle by keeping all content in the database and rendering HTML, CSS and JavaScript from it on demand.
They provide an integrated platform for every step in the content management lifecycle and manage all artifacts in a centralized database. The price of this convenience is that the database and application code are directly exposed to users, which makes the CMS application a single point of failure for both security and performance.
The following section highlights some of the security and performance challenges posed by conventional database driven CMS platforms.
Risks in conventional database driven CMS platforms
Most conventional database-driven CMS platforms are based on the LAMP stack, i.e. Linux, Apache, MySQL and PHP. In Accion’s experience, although PHP and MySQL based content management platforms such as WordPress and Drupal have matured in features and capabilities, significant risk areas have emerged in the last decade. This section highlights some of these risk areas; later sections recommend solutions based on implementations that Accion has done for other customers.
Risk 1: Security
PHP and the platforms built on it have been exposed to significant security risks. The following table lists the number of security updates released for core PHP, core Drupal and core WordPress only.
Data Sources:
- http://php.net/ChangeLog-5.php
- https://www.drupal.org/security
- https://wordpress.org/news/category/security/
The data above shows significant growth in the number of patches released for core PHP, Drupal and WordPress over the last few years. It does not account for patches to plugins and extensions, which may add up to a significantly larger number. Several of the security risks relate to the database, since the core capability of PHP is to integrate dynamic, database-driven content into HTML pages. In our experience, none of these security patches can safely be ignored; each one must be applied. Applying a patch requires a full cycle of database backup, code backup, installation of the update, testing and release. The overall maintenance overhead for PHP-based websites has therefore increased considerably in the last few years.
Risk 2: Performance
The second major risk is the scalability and performance of PHP and MySQL (LAMP) based websites. As the number of page visits per day has risen significantly for all websites, and particularly for search engine optimized websites, it is now common for enterprise websites to have to handle several thousand page views per day.
Because each website visit is served from the database, the largest performance bottleneck for LAMP-based websites is the MySQL database. In Accion’s experience, a relatively standard corporate website requires at least a 4-node cluster of reasonably sized machines, with 2 database nodes and 2 web server nodes behind a load balancer. An ideal configuration is an auto-scaled setup that starts at 2+2 nodes and grows to 5+5 nodes based on actual traffic.
Despite such high infrastructure requirements, LAMP-based websites are quite slow because all content is generated dynamically from the database. Since content does not change with each website visit, it is prudent to cache the generated HTML, CSS and JavaScript in a cache server such as Redis or Memcached.
The caching server requirements further increase the cost of infrastructure for a LAMP based website.
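The caching pattern described above can be illustrated with a minimal sketch. It is not taken from any particular CMS: an in-memory dictionary stands in for a cache server such as Redis or Memcached, and the names `CountingRenderer` and `PageCache` are hypothetical. It only shows that the expensive database-backed render runs once per page until the cached entry is invalidated.

```python
from typing import Callable, Dict


class CountingRenderer:
    """Hypothetical stand-in for the expensive path: query the database and build the page."""

    def __init__(self) -> None:
        self.calls = 0  # how many times the "database" was hit

    def __call__(self, path: str) -> str:
        self.calls += 1
        return f"<html><body>Page for {path}</body></html>"


class PageCache:
    """Cache-aside lookup; a dict stands in for Redis or Memcached (assumption)."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._store: Dict[str, str] = {}

    def get(self, path: str) -> str:
        # Render only on a cache miss, then reuse the stored HTML.
        if path not in self._store:
            self._store[path] = self._render(path)
        return self._store[path]

    def invalidate(self, path: str) -> None:
        # Called when an editor changes the content behind this page.
        self._store.pop(path, None)


if __name__ == "__main__":
    renderer = CountingRenderer()
    cache = PageCache(renderer)
    for _ in range(1000):
        cache.get("/about")
    print("database-backed renders:", renderer.calls)
```

In this sketch the cache is one more component sitting in front of the database, which is the extra infrastructure cost the text refers to.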
The Approach - Separating the Content Lifecycle
The approach differs from conventional database-driven CMS platforms chiefly in how it handles the content lifecycle. For most web applications, and particularly for websites, content is updated far less often than it is delivered to visitors. Design changes are even less frequent than content changes.
The new approach to content management leverages this difference in the frequency of updates across the stages of the content lifecycle. It separates the content and design phases from delivery and maintenance, placing them in independent systems.
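The difference in update frequency can be made concrete with a small simulation. This is an illustrative sketch, not Accion's implementation: `RequestTimeSite` and `PublishTimeSite` are hypothetical names, and the single-page content and the visit and update counts are made-up inputs. It counts how often a page is rendered when rendering happens per visit versus per content update.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class RequestTimeSite:
    """Conventional model: every visit renders the page from the content store."""
    content: Dict[str, str]
    renders: int = 0

    def visit(self, slug: str) -> str:
        self.renders += 1
        return f"<h1>{self.content[slug]}</h1>"


@dataclass
class PublishTimeSite:
    """Separated model: pages are rendered when content is published, not when visited."""
    content: Dict[str, str]
    published: Dict[str, str] = field(default_factory=dict)
    renders: int = 0

    def publish(self) -> None:
        for slug, title in self.content.items():
            self.renders += 1
            self.published[slug] = f"<h1>{title}</h1>"

    def visit(self, slug: str) -> str:
        # Delivery is a plain lookup of pre-rendered output.
        return self.published[slug]


def simulate(visits: int, updates: int) -> Tuple[int, int]:
    """Return (renders with request-time rendering, renders with publish-time rendering)."""
    dynamic = RequestTimeSite({"home": "Welcome"})
    static = PublishTimeSite({"home": "Welcome"})
    for _ in range(updates):
        static.publish()
    for _ in range(visits):
        dynamic.visit("home")
        static.visit("home")
    return dynamic.renders, static.renders


if __name__ == "__main__":
    print(simulate(visits=10_000, updates=3))
```

With one page, 10,000 visits and 3 updates, the request-time model renders 10,000 times and the publish-time model renders 3 times. The rendering work scales with the number of updates rather than the number of visitors.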
Emerging Technologies Used
The new architecture uses the following new developments in the web technology space that have emerged in the last decade:
1. NoSQL Database - Provides flexible content and metadata schemas that are easily managed with minimal database redesign. This makes the process of content planning highly flexible and provides custom content objects without the overheads of a conventional RDBMS.
2. HTML5, CSS3 and JavaScript - With the emergence of these standards along with platforms like AngularJS, ReactJS and EmberJS, web content is now driven by browser-based rendering, and the server only interacts with the browser for dynamic data. This means that the theme and templates required to render a website need not be stored in the database and can be served directly to the browser for dynamic rendering.
3. Separation of Content and Design - With design being driven by HTML5, CSS3 and JavaScript, the content and design are clearly separated. Content and metadata can now be stored independently of design, in any form that captures only the structure of the data and not the formatting or layout.
4. Content Delivery Network - Most cloud service providers have CDN services that can deliver rendered HTML5, CSS and JavaScript files from geographically distributed, automatically synchronized servers. This makes the cache servers of a conventional CMS platform redundant and provides extremely high performance for websites and web applications.
5. Continuous Integration and Deployment (DevOps) - With the emergence of DevOps methodologies and agile development tools such as GitHub, Jenkins, Selenium, Chef and Puppet, it is now possible to automate the entire process of integration, testing and deployment. As a result, the content delivery capabilities of a conventional CMS are rendered redundant.
6. Cloud Based API Service - A proliferation of API-based cloud services allows easy integration of dynamic capabilities into websites, including examples such as:
- Online marketing integration using platforms like Marketo
- Sales lead capture using platforms like Salesforce
- Integration of e-commerce features and payment gateways using platforms like Shopify, Bigcommerce or Volusion
- Interactive commenting and social media likes using platforms like Facebook and Disqus
- Multimedia integration using platforms like YouTube or Vimeo for videos, SlideShare for slideshows, SoundCloud for audio, and Google Photos, Flickr and other services for automatically optimized images
- Website analytics using platforms like Google Analytics
- Online surveys using platforms like SurveyMonkey
- Several other such free and SaaS platforms that provide scalable interactive features as a service.
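The first and third items above, a flexible document schema and the separation of content from design, can be sketched together. This is an assumption-laden illustration: the sample documents, field names and the `TEMPLATES` mapping are made up, plain Python dictionaries stand in for records in a document database, and `string.Template` stands in for whatever templating the real platform uses.

```python
from html import escape
from string import Template
from typing import Dict

# Content: structure only, no formatting. Documents of different types carry
# different fields without any schema migration (hypothetical sample data).
CONTENT = [
    {"type": "article", "slug": "launch", "title": "Product launch",
     "body": "We shipped."},
    {"type": "event", "slug": "meetup", "title": "User meetup",
     "body": "Join us.", "venue": "Pune", "date": "2016-03-01"},
]

# Design: kept apart from the content and chosen by document type.
TEMPLATES: Dict[str, Template] = {
    "article": Template("<article><h1>$title</h1><p>$body</p></article>"),
    "event": Template(
        "<article><h1>$title</h1><p>$body</p><p>$venue, $date</p></article>"
    ),
}


def render(doc: Dict[str, str]) -> str:
    """Combine one content document with the template for its type."""
    template = TEMPLATES[doc["type"]]
    # Escape field values so content cannot inject markup into the design.
    fields = {key: escape(str(value)) for key, value in doc.items()}
    return template.substitute(fields)


if __name__ == "__main__":
    for document in CONTENT:
        print(render(document))
```

Adding a new content type in this sketch means adding a document shape and a template; neither the other documents nor the other templates change.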
Advantages of Emerging Technologies
Accion’s solution provides the features typically expected of a conventional CMS by addressing each part of the Content Management Lifecycle with standard web components and technologies.
- Performance: The entire website is delivered through Amazon’s CloudFront CDN, so visitors receive pre-rendered files from CDN servers rather than pages generated per request.
- Flexibility: The content platform is based on a NoSQL document-oriented database, which provides flexibility in the content management workflow, including role-based access control, content workflow and dynamic object schemas.
- Security: The content platform is accessible only to authorized users and is NOT used to deliver the actual web content. On each content update, the content platform publishes static HTML5, CSS3 and JavaScript files directly to Amazon CloudFront, from where the site is served from the location closest to the visitor. Because the database is not used for content delivery, it is not exposed to site visitors.
- Cloud & DevOps Ready: The website is hosted directly on the Amazon Web Services platform, where native support exists for continuous integration, testing and deployment.
- Low Infrastructure Cost: The only infrastructure used for delivery is Amazon’s CloudFront network, which is cost-effective to scale up.
- Low Development and Maintenance Cost: Accion maintains all the components of the solution for several existing customers and has a dedicated support team for the platform.
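The publish step described under Security can be sketched as follows. This is a simplified illustration under stated assumptions, not the platform's actual code: a local directory stands in for the CDN origin, the functions `publish` and `deliver` are hypothetical, and no AWS API is called. The point it shows is that the delivery path only reads files and holds no reference to the content database.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def publish(pages: Dict[str, str], origin_dir: Path) -> List[Path]:
    """Write pre-rendered pages as static files (runs inside the content platform)."""
    written = []
    for slug, html in pages.items():
        target = origin_dir / slug / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


def deliver(origin_dir: Path, slug: str) -> str:
    """Serve a page by reading a file; no database or application code is involved."""
    return (origin_dir / slug / "index.html").read_text(encoding="utf-8")


if __name__ == "__main__":
    # A temporary directory stands in for the CDN origin (assumption).
    with tempfile.TemporaryDirectory() as tmp:
        origin = Path(tmp)
        publish({"about": "<h1>About</h1>", "contact": "<h1>Contact</h1>"}, origin)
        print(deliver(origin, "about"))
```

In a real deployment the write in `publish` would be an upload to whatever origin the CDN reads from; that step is deliberately left out here because its details are not given in this paper.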
Section 3: Case Studies
The following selected customers use the emerging-technologies-based solution provided by Accion Labs:
Object Rocket: a division of Rackspace
Object Rocket provides database as a service for various NoSQL databases including MongoDB, Redis and Elasticsearch. Object Rocket was one of the early adopters of the emerging-technologies-based CMS; its website is hosted on Rackspace Cloud and loads quickly.
Image Consulting Business Institute
A university that provides training and certification for image consultants throughout the world. Accion’s solution powers the institute’s site as well as hundreds of professional business sites belonging to image consultant graduates of the institute.
http://imageconsultinginstitute.com
Fastacash
A Singapore-based mobile payments company that provides an omni-channel, device-agnostic payment engine.
TripChamp
An innovative startup that has created a machine-learning-based platform for intelligent search that combines GDS with an unlimited number of consolidators, giving travelers more travel options at different price points. TripChamp used Accion’s emerging technology solutions to implement the front end and some of the backend APIs.