Content Management Meets DevOps (Part 1 of 2) How a Git-based CMS Improves Content Authoring and Publishing

Traditional CMS platforms based on SQL and JCR repositories have begun to show major signs of weakness in keeping up with today’s demands for a high rate of innovation and rapid scalable deployment on modern elastic architectures. This is nowhere more evident than the move towards headless CMS. Many CMS platforms today push headless, or what some call Content as a Service (CaaS), as the one-stop-shop solution to the struggles most CMS platforms have in providing support for scalability, multi-channel, and development integration. It’s not. Headless capability is important but it has its own limitations.

Crafter CMS, an open source Git-based dynamic CMS tackles all of these challenges head-on with a set of technologies and that incorporate lightweight development, integration with developer tools and process, and elastic scalability for content delivery that provides the ability to serve any front-end technology via API or markup with fully dynamic content.

In this two-part series, we’ll explain the basic mechanics that support content authoring, publishing and developer workflow and demonstrate how these mechanics combined with Crafter’s architecture and developer stack set a new standard for what a CMS can provide in today’s competitive digital marketplace.

Content Management and Deployment Mechanics

In this section we’ll explore the mechanics of how (non-technical) content authors work with the CMS and how their changes, once reviewed and approved, are deployed from their authoring tools to a live content delivery system.

Crafter CMS is decoupled, composed of several microservices where content authoring and content delivery services are separated into their own distinct, subsystems. This model has many advantages related to security, scalability and delivery flexibility. In a decoupled architecture, content is published from authoring to delivery as shown in the diagram below. The delivery system may be any number of independent digital channels – enterprise website, mobile app, social, augmented reality, digital kiosk or signage, e-commerce front end, microsite, etc.

Crafter CMS supports authoring via Crafter Studio that sits on top of a headless Git-based repository and publishing system. Content authors don’t need to know anything about Git. They simply work with Crafter Studio, a web-based application. Crafter Studio provides beautiful content entry forms, in-context editing with multi-channel preview, drag-and-drop layout, component placement, image cropping, and more. While content authors are performing their work, Crafter is managing all of the Git mechanics, managing locking, creating a time-machine like, Git-based version history and audit trail for them behind the scenes, all accessible to them via the Studio UI.

 

Figure 1: Crafter CMS microservices applied to decoupled architecture

Crafter’s publish mechanism deploys content from the Authoring system to the Delivery system. Content logically flows from the authoring environment to the delivery environment. The mechanism for this, given the underlying Git repo, is a “pull” type interaction. Meaning the actual network conversation is initiated from the delivery infrastructure to the authoring infrastructure, as shown in Figure 2.

Each delivery node has a Deployer agent that coordinates deployment activities on the node for each site that is being delivered on that node.

  • Delivery nodes can initiate deployment pulls either on a scheduled interval (a “duty cycle”), on-demand via an API call, or both.
  • The Deployer performs a number of activities beyond receiving and updating content on the delivery node. A list of post-commit processors is run. These can be used to execute updates on search indexes, clear caches and perform other such operations.
  • The Delivery node maintains a clone of the Authoring Git-based repository.
  • The Crafter Deployer takes care of managing the synchronization of the delivery node’s clone authoring repository from the authoring environment.
  • Git-mechanics ensure content sync is 100% accurate.

Figure 2: Crafter’s Dynamic CMS Publishing via Git

Technically speaking, Authoring does not require knowledge of the Delivery nodes. This makes the architecture more elastic, globally scalable and even enables Crafter to support disconnected and intermittent content delivery.

  • Elastically add new nodes or revive dormant nodes and they will sync to the latest without any additional wiring.
  • Create region-based depots to avoid transferring data more than once over long distances for global deployments.
  • Airplanes, cruise ships, drilling/mining locations and other remote disconnected deployments can operate with their latest pull of content, and sync up with Authoring when connectivity is available.

Figure 3: Elastic Delivery

In Crafter CMS, only approved content is published to the delivery environment. Crafter manages this by using 2 repositories for each project. One called a “Sandbox” which contains work-in-progress and the other called “Published” which represents approved, published work and complete content history.

  • Authors use the Crafter Studio UI to review and approve content via workflow.
  • Crafter Studio takes care of moving approved work between Sandbox and Published repositories.
  • Delivery nodes monitor the published repository for updates.


Figure 4: Authors work in Sandbox. Delivery nodes pull from Published.

Benefits

Crafter’s Git-based publishing model provides your authoring team with a highly reliable, highly accurate publishing mechanism that is elastically scalable, globally distributable and supports multi-channel.  Crafter CMS’s architecture enables your team to reliably deliver your dynamic content on any channel, wherever and whenever it is needed.

Further, As we’ll see in in Part 2, this architecture enables content authors to work side-by-side with DevOps, while they continually introduce new features and functionality without any disruption to the authors.

How do I set up this workflow?

The underlying Git repositories and related workflow for Authoring require no setup at all. When you create a project in Crafter Studio it automatically creates the local “Sandbox” and “Published” repositories. When you add a new “Delivery” node a simple command line script is run on that node that configures the node’s deployer to replicate and process content from the “Published” repository from authoring.

  • Instructions for creating a site via Crafter Studio can be found here.
  • Instructions for initializing a delivery node can be found here.

Conclusion

Content authors are non-technical users who need powerful but easy-to-use tools to help create, maintain and manage their digital experiences. Crafter Studio provides these users with a web-based application that makes it easy for content authors to achieve their goals. Under the hood, Crafter Studio leverages a powerful Git-based repository and deployment engine that provides authors with next-generation versioning and auditing mechanics as well as robust, elastic and distributed deployment.

Today’s digital marketplace is constantly evolving. Companies are always iterating on existing functionality with improvements and deeper integration or introducing new functionality and channels for their audiences. For today’s most innovative and competitive organizations, ongoing development and the move to DevOps is a fact of life. The companies that have the greatest success are those that have the right technology and processes through which they are able to achieve a high, sustainable continuous rate of constant, iterative development, integration and delivery, i.e., Continuous Integration and Delivery (CI/CD).

In the second half of this blog series, we will take a deep dive into how Crafter CMS seamlessly integrates with your CI/CD processes to enable your entire team of developers and content authors to innovate collaboratively without interfering with each other’s workstream.

 

Working with Crafter Studio’s API

Crafter CMS is a decoupled CMS composed multiple microservices where content authoring and content delivery capabilities and services are separated into their own distinct, subsystems.

Organizations often want to interact with the content authoring and management system via APIs. In this article, we’ll show the basics of interacting with this API by example:

  • Authenticate
  • Get a list of projects under management
  • Write content to a project

To keep things really basic, we’ll use CURL, a ubiquitous Linux command tool as our client.

You can find the full Crafter Studio API for Crafter CMS version 3.0 here
http://docs.craftercms.org/en/3.0/developers/projects/studio/api/index.html

Step 1: Authenticate

We’ll use the authenticate API
http://docs.craftercms.org/en/3.0/developers/projects/studio/api/security/login.html

curl -d '{"username":"admin","password":"admin"}' --cookie "XSRF-TOKEN=A_VALUE" --header "X-XSRF-TOKEN:A_VALUE" --header "Content-Type: application/json" -v -X POST http://localhost:8080/studio/api/1/services/api/1/security/login.json

The first thing you’ll note is that we’re going to perform a POST, passing the username and password as a JSON object.  In a production environment, you will want to use HTTPS.

The next thing you will notice, we are passing a cookie “XSRF-TOKEN” and a header “X-XSRF-TOKEN”.  The value passed for these are arbitrary.  They must match and they must be passed in all future PUT and POST API calls.  These are used to protect against certain cross-browser scripting attacks.  If you are using Studio APIs as part of a web client you want to make sure these values randomly generated.

When you issue the curl command you will get back a response:
 * Trying ::1...
 * Trying 127.0.0.1...
 * Connected to localhost (127.0.0.1) port 8080 (#0)
 > POST /studio/api/1/services/api/1/security/login.json HTTP/1.1
 > Host: localhost:8080
 > User-Agent: curl/7.43.0
 > Accept: */*
 > Cookie: XSRF-TOKEN=A_VALUE
 > X-XSRF-TOKEN:A_VALUE
 > Content-Type: application/json
 > Content-Length: 39
 >
 * upload completely sent off: 39 out of 39 bytes
 < HTTP/1.1 200
 < Cache-Control: no-cache, no-store, max-age=0, must-revalidate
 < Pragma: no-cache
 < Expires: 0
 < Set-Cookie: JSESSIONID=2E114725C82F3EE44ADC04B578A3BE8F; Path=/studio; HttpOnly
 < Content-Type: application/json;charset=UTF-8
 < Content-Language: en-US
 < Transfer-Encoding: chunked
 < Date: Mon, 22 Jan 2018 21:32:48 GMT
 <
 * Connection #0 to host localhost left intact
 {"username":"admin","first_name":"admin","last_name":"admin","email":"evaladmin@example.com"}

Note the response returned is a successful 200 status code and the response contains JSON with details for the authenticated user.

Also found as part of the request is the JSESSION cookie.  You will need this value for all future requests.

Step 2: Get a list of sites the user is authorized to work with

http://docs.craftercms.org/en/3.0/developers/projects/studio/api/site/get-sites-per-user.html

curl --cookie "XSRF-TOKEN=A_VALUE;JSESSIONID=2E114725C82F3EE44ADC04B578A3BE8F" -H "X-XSRF-TOKEN:A_VALUE"  -X GET http://localhost:8080/studio/api/1/services/api/1/site/get-per-user.json?username=admin
Note the CURL command contains your session ID and XSRF tokens.
When you issue the CURL you will get a response that contains sites your user has access to:
{"sites":[{"id":9,"siteId":"ar","name":"ar","description":"","status":null,"liveUrl":null,"lastCommitId":"951004363449cc83209f307b1e9f110dab37fed7","publishingEnabled":1,"publishingStatusMessage":"idle|Idle","lastVerifiedGitlogCommitId":null},{"id":5,"siteId":"diiot","name":"diiot","description":"","status":null,"liveUrl":null,"lastCommitId":"92d543eaa164b1ebfbdd6ce538ae028d4d6421b7","publishingEnabled":0,"publishingStatusMessage":"idle|Idle","lastVerifiedGitlogCommitId":"92d543eaa164b1ebfbdd6ce538ae028d4d6421b7"},{"id":10,"siteId":"editorialcom","name":"editorialcom","description":"","status":null,"liveUrl":null,"lastCommitId":"503d922f226e8ab821073e23ef5a229f907212a0","publishingEnabled":1,"publishingStatusMessage":"","lastVerifiedGitlogCommitId":"503d922f226e8ab821073e23ef5a229f907212a0"},{"id":3,"siteId":"flow","name":"flow","description":"","status":null,"liveUrl":null,"lastCommitId":"21923775c3a1fc778a364d47884b9ee2bb4928a5","publishingEnabled":1,"publishingStatusMessage":"idle|Idle","lastVerifiedGitlogCommitId":"21923775c3a1fc778a364d47884b9ee2bb4928a5"},{"id":8,"siteId":"vr","name":"vr","description":"","status":null,"liveUrl":null,"lastCommitId":"c67fd9dd25d1aa59ff13e3fda2a4387be50dfc69","publishingEnabled":1,"publishingStatusMessage":"idle|Idle","lastVerifiedGitlogCommitId":null}],"total":6}

The response above contains a number of projects.  In the next call I want to write a content object to one of the projects (editorial.com.) To do this I need the site ID.  I get this from the response above: editorialcom

Step 3: Write content to the Editorial com Project

http://docs.craftercms.org/en/3.0/developers/projects/studio/api/content/write-content.html

curl -d "<page><content-type>/page/category-landing</content-type><display-template>/templates/web/pages/category-landing.ftl</display-template><merge-strategy>inherit-levels</merge-strategy><file-name>index.xml</file-name><folder-name>test3</folder-name><internal-name>test3</internal-name><disabled >false</disabled></page>" --cookie "XSRF-TOKEN=A_VALUE;JSESSIONID=2E114725C82F3EE44ADC04B578A3BE8F" -H "X-XSRF-TOKEN:A_VALUE"  -X POST "http://localhost:8080/studio/api/1/services/api/1/content/write-content.json?site=editorialcom&phase=onSave&path=/site/website/test3/index.xml&fileName=index.xml&user=admin&contentType=/page/category-landing&unlock=true"

In the call above note:

  • We are passing in content as the POST body.  The content is in XML format.  In Crafter content objects are stored as simple XML documents.
  • We are passing the Session ID and the XSRF tokens
  • We are passing a number of parameters that tell Crafter CMS where and how to store the content in the repository

Conclusion

In this article we covered the basic mechanics of connecting to and interacting with Crafter Studio, the authoring services of Crafter CMS.  We’ve avoided the nitty gritty details of each API call in favor of the macro mechanics.  You now have the basic skills and capability to interact with any Crafter Studio API found here: http://docs.craftercms.org/en/3.0/developers/projects/studio/api/index.html.  Get out there and integrate!

 

Match Highlighting for Search in Crafter CMS

Highlighting search terms in search results is a common requirement for many websites.  Crafter CMS builds on top of Apache Solr and make implementing rich search and other query-driven experiences super simple.

In this tutorial, we’ll create a simple article search backend that highlights the search terms that were used within the results returned to the user.

Step 0: Prerequisites

If you haven’t gotten Crafter CMS set up and built your first site you can follow this tutorial to get started: Working with Your First Crafter CMS Web site.

Step 1: Build a content model for articles

The next thing we need is content to query against.  The first step in supporting content creation is defining the Content Type.  The Content type is the definition of the structure of a particular type of content.  In our example, we want to define the structure of an Article.

A simple article should have:

  • A Url
  • A title
  • An author name
  • A body

To define this we use Crafter’s Content Type management console:  Below you can see the example model including the fields and their types.

You can learn more about content modeling here Content Modeling in Crafter CMS.

Step 2: Create content

Now that you have your article content type defined you can create articles.  To create an article open the pages folder (we modeled the article as a page with a URL) in the sidebar and right click on the home page:

Step 3: Create a REST script to return highlighted results

Now that you have content you can write a RESTful controller and test it.  Let’s create a simple GET based REST controller.

  1. Open the Sidebar and locate the Scripts folder.
  2. Open the Scripts folder and navigate to “rest” folder.
  3. Right click on it and choose “Create Controler
  4. In the script name dialog, enter “search.get” and click Create.
  5. Enter the following code:
// build a query
def keyword = params.q 
def queryStatement = "content-type:\"/page/article\" " 
 
 if(keyword) {
      queryStatement += " AND $keyword"
 }
 
 def query = searchService.createQuery()
 query.setQuery(queryStatement)
 query.addParam("hl", "true")
 query.addParam("hl.fl", "body_html")
 query.addParam("hl.simple.pre", "<b>")
 query.addParam("hl.simple.post", "</b>")
 
 // execute the query
 def executedQuery = searchService.search(query)

def matches = [:]
matches.found = executedQuery.response.numFound
matches.articles = executedQuery.response.documents
matches.highlights = executedQuery.highlighting

return matches

Step 4: Execute the Script

In a browser, go to http://SERVER:PORT/api/search.json?q=AWORDTHATEXISTSINRESULTS

example: http://localhost:8080/api/search.json?q=bacon

Why Developers Should Care About CMS

As developers, we’ve got a strong handle on how to manage and deploy our code assets.  Yet every one of us, at some point in our application build has said, “What about this text? What about these images? Where do these belong?”  That’s pretty universal.  Nearly every single application today has content in it.  Be it a web app or a native app; it’s full of strings, images, icons, media and other classes of content.

This content doesn’t really belong in our code base — because it’s not code.  These non-code assets make us as developers pretty uneasy.  We know that at some point a business user is going to ask us to make a change to one of those strings and we’re going to spend hours of build and deploy cycles to handle a 30-second code change.  We know that at some point we’re going to need to translate that content.  We know at some point we’re going to replace this UI with another one.   We know all these things  — and we know leaving that content, even if it’s abstracted into a string table or a resource bundle is going to come back to haunt us; no matter the abstraction: it’s part of the build, developers need to update it.  Developers and Systems folks need to deploy it.  

Smart developers separate code from content.  They make sure that the content in the application is completely independent of the build and deploy cycle of the application itself.  Where appropriate they make sure non-technical business users and have access to the externalized content and can update it and publish changes at any time.

Consider the following scenario:  Your application is for an insurance company.  Along comes major regulation change.  Now your (and every other application) needs to change its language accordingly.  If the content was separate from the application the response time and cost to make changes is minimal.  It’s business as usual.  If the content is not separate you are looking at months of builds, testing and deployment that costs time, opportunity and a tremendous amount of money,

What we’re really talking about is application architecture.  Making the right decisions has significant impacts on our development teams and process and as we’ve illustrated, our organization’s responsiveness and bottom lines.

Solving the Content Problem Through Application Architecture

This problem is very similar to another problem:  Websites.  Waaaaaaaaaaaaay back in the 1990’s people built and maintained entire websites in HTML by hand.  You had to be a programmer who understood the markup and how to update it and ultimately deploy it to a server once complete.  This model didn’t work. Once business started using the web to communicate to the masses marketing departments started getting involved.  Marketers wanted constant updates. Developers wanted to write code, not update copy.  They needed a solution.  The Web Content Management System (WCM for short) was born.  WCM offered a new architectural solution to manage websites. WCM allowed the developers to put the code into templates that contained placeholders for the content and gave authors a user interface to manage and edit the content at any time.  The content is managed separately from the code (the templates) and the two are brought together to create the final product. Everyone wins.

We need the same capabilities that WCM provides but for our applications.

Why Not Use a Traditional CMS?!

Simple.  Traditional CMS platforms have the wrong underlying design. Most CMS platforms were built or are based on the same design as CMS’s that were built 20 years ago.

  • They were built to manage pages:  Most CMSs do not have a generic concept of what content is.   These systems were built to manage pages.  You might be able to contort the system to get what you want from it but it will be a hack. You need a CMS that is agnostic to the kind or type of content you need to manage.
  • They rely on the wrong type of data tier: RDBMS/SQL databases were the primary backends for data regardless of the problem and open source options like MySQL were readily available.  You need a CMS that leverages a data tier that is more distributed and flexible.
  • They are built on the wrong technology: Many CMS platforms are built in PHP.  That’s fine for certain use cases. But frankly, Java has a larger community, more library support and a lot of powerful platforms like Solr, Elastic Search, Hadoop (and many others) that plug into a Java CMS perfectly.  Why run multiple tech stacks to accomplish a single goal?
  • They aren’t secure: Most CMS platforms couple authoring and delivery together in a single database.  That means work in progress (like pending policy updates) is available in the delivery runtime to anyone who finds a way into the database.  That can spell disaster in the event of a hack. Large companies and governments have suffered expensive and dangerous leaks due to this kind of coupled, insecure architecture.
  • They don’t support multi-tenancy: Most of the CMS platforms out there are not multi-tenant.  If you’re going to back apps with a CMS you need a CMS that will let you properly segregate and secure content between apps.  No one wants to do a CMS install PER app.
  • They don’t scale:  RDBMS is hard to scale.  You either cluster or replicate.  Both have serious issues when it comes to really large, global and auto-scale type deployments.  Supporting an app with your CMS will demand scale. Design for scale from the start.
  • They don’t fit into your development tools and process: When most CMS platforms were designed the content authors were the only audience that mattered.  Today it’s different.  Developer tasks like managing templates, CSS, JavaScript, content models and other artifacts need to be just as easy — and it needs to fit in with your existing workflow.  That means Git source code management, Continous Integration (CI), Integration with your IDE, the ability to fork and support multiple teams simultaneously.

If Not Traditional CMS, Then What? A Modern CMS, That’s What!

To get different results you need a different design.  You need a modern CMS:

  1. Built with the right technologies like Java, Spring, Groovy, Solr, Git.  These are the technologies that will be easy to hire for, will fit into and integrate with any enterprise and will have the out of the box capabilities and horsepower to back any scale deployment.
  2. Decoupled architecture that cleanly separates authoring and content delivery responsibilities on to separate, independently highly securable, highly scalable infrastructures by a publishing process.  Decoupled Architectures don’t have work in progress outside the firewall.  They are much easier to scale.
  3. Built on scalable document-oriented data tiers like XML+disk/Git and Casandra and others.  Disk-based tech is much better for streaming rich content and support global distribution better because they don’t need to be clustered.  At a minimum, you want something that’s document oriented and that distributes globally better than a than a SQL backend RDBMS.
  4. Content type agnostic:  From strings to web pages, to videos and virtual reality experiences.  The CMS needs to accept any kind of content.  That comes down to how content is modeled and how it’s stored.  Does the CMS store content in a SQL database or a JCR repo?  Red flag.  Can’t provide Headless Content / APIs AND rendered content?  Another Red Flag.
  5. Is Multi-Tenant. You need to be able to support many customers/apps on a single install. Either you support it or you don’t and today it’s a must!
  6. Supports authors AS WELL AS developers and DevOps. Developers don’t want to work in some clunky CMS.  They want to work locally in an IDE, then be able to work with their team and ultimately go through the entire workflow out to production.  A modern CMS gives them this.
  7. Open Source.  Open is better than closed. The debate is over. Is it a must?  No. But you’ll get more for your money and you’ll get a lot more innovation at a faster rate if you leverage open source.

Where can I find a Modern CMS?!

Check out Crafter CMS!  Crafter CMS is a modern, 100% open source Java based CMS built on Git!  It’s easy for authors, Amazing for developers and Awesome for DevOps!  Crafter doesn’t sit on a SQL database or a JCR repo like so many CMS platforms out there.  It’s built on Git, Java/Spring, and Solr.  It brings the power Git to CMS!  Authors have true versioning.  Developers and DevOps can work they way they are used to, with the tools they are used to based on the technologies they already know.  Crafter CMS is 100% content type agnostic, highly secure, high performance and scales like nothing else.  If you’re building any kind of application or digital experience, you have content, and you NEED Crafter CMS.  Great experiences are crafted!

 

Start an Activiti Process via Rest Script in Crafter CMS

Activiti is a powerful open source workflow engine built by Alfresco.  Incorporating a workflow engine into your customer and employee facing sites and portals is an excellent solution for automating complex workflows that cross system boundaries while providing the user with a simple to use, contextual user experience.   Working for a bank or an insurance company and need to workflow contracts with customers?  No problem. Crafter CMS (an open source Java based CMS) paired with Alfresco is a perfect solution.  It’s customer friendly, highly scalable and about as powerful as enterprise technology gets!

In this blog, I’ll demonstrate the most straightforward example of a Crafter CMS REST service being used to start an Activiti Process.

Prerequisites

  • You have Activiti (http://www.activiti.org/) installed
    NOTE:
    The authentication and process are hard coded to simplify the example

Step 1: Create a REST Controller

  • Under Scripts/rest right click and click create controller
    • Enter start-process.get as the controller name
  • Add the following code to the controller. This code assumes Activiti is deployed into the same container as Crafter Engine.
@Grab('org.codehaus.groovy.modules.http-builder:http-builder:0.7')
@Grab('oauth.signpost:signpost-core:1.2.1.2')
@Grab('oauth.signpost:signpost-commonshttp4:1.2.1.2')

import groovyx.net.http.HTTPBuilder
import groovyx.net.http.ContentType
import groovyx.net.http.Method
import groovyx.net.http.RESTClient

def http = new HTTPBuilder("http://localhost:8080")
def user = "kermit"
def password = "kermit"
def authPair = user + ":" + password
def authEncoded = authPair.bytes.encodeBase64().toString()

http.setHeaders([Authorization: "Basic "+authEncoded])

def ret = null

http.request( Method.POST ) {
    uri.path = "/activiti-rest/service/runtime/process-instances"
    // ACTIVITI ENTERPRISE URL
    // uri.path = "/activiti-app/api/enterprise/process-instances"

    requestContentType = ContentType.JSON
    body =  [ processDefinitionKey: "vacationRequest", variables:[  [name:"employeeName", value: "Russ"], [name:"numberOfDays", value: "5"],[name:"startDate", value:"10-08-2015 11:11"],[name:"vacationMotivation", value: "rest"]    ]]

    response.success = { resp, reader ->
        ret = reader
    }
}

return ret

Step 2: Execute the Service

Step 3: Verify that a new process instance has been started