Five Reasons Why You Should Use a Git-based CMS (Part 3 of 5)

Most CMS technologies are what we would call a “coupled CMS.” The content authoring and content delivery environments are usually part of the same stack. “Going live” with new content or a feature essentially amounts to setting a true/false flag in a database field. Coupled CMS platforms have many problems around security, performance, scalability, and flexibility (you can learn more here).

For these reasons and many others, Crafter CMS is built as a decoupled CMS. With a decoupled CMS you author content in one system and publish to another, separate system. When correctly implemented, a decoupled architecture like that of Crafter CMS provides great solutions for the issues mentioned above. That said, nothing is without its challenges. Decoupled systems, by their nature, are typically very scalable and can have many instances all over the world. Security, scalability, and distribution are no longer issues that only concern the Internet’s biggest players like Google and Amazon. Security and distribution affect customer experience and safety, and they help reduce operating costs. Every brand-conscious and customer-forward organization in the world is focused on these tactical issues.

Once you have a decoupled, distributable deployment model, the challenge becomes making certain that the content on the servers all over the world is the same everywhere. Every decoupled solution has an approach for this, some better than others. That said, few if any of the approaches offered out of the box by today’s traditional CMS platforms “mathematically” ensure every remote instance is 100% up to date and in sync with every other instance. If there are bugs in the deployment code or there is trouble in the environment, instances may fall out of sync.

In our previous posts, we looked at Crafter CMS and its Git-based versioning (part 1) and distributed repository (part 2).  In this post, we’ll take a deeper dive into how Crafter CMS leverages Git mechanics to provide a better, more consistent distributed publishing mechanism.

Reason #3: Distributed, scalable, consistent publishing

Crafter CMS uses Git mechanics to publish content to its decoupled delivery space. When Git reports that its repository is at a specific version, every file is guaranteed to be present and in the proper state for that version. That is a fact, and it’s provable.

The reason it’s provable is that the Git mechanics underlying Crafter CMS’ content repository are based on Git’s purely functional data structures. “The main difference between an arbitrary data structure and a purely functional one is that the latter is (strongly) immutable” (Wikipedia). What this means is that as commits happen within the repository, an entirely new immutable data structure is created containing the changes for the commit. No action is taken on the previous data structure(s). Nothing you change can be lost or corrupted by a later operation once the change has been committed. Moreover, in Git, the ID for the commit is essentially a SHA-1 hash of metadata and the content in the directory tree. By definition, if a single bit changes anywhere in the tree, a new SHA-1 hash must be generated.
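To make the hashing idea concrete, here is a minimal Python sketch. The `git_blob_id` function follows the way Git computes the ID of a single file (a short header plus the file's bytes, hashed with SHA-1). The `toy_tree_id` function, the file names, and the sample content are assumptions made up for illustration; real Git trees use a binary, nested format, so treat this as a model of the idea rather than Git's actual algorithm.

```python
from __future__ import annotations

import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git assigns to a file (blob) with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def toy_tree_id(files: dict[str, bytes]) -> str:
    """Toy stand-in for a Git tree: hash the sorted (path, blob ID) pairs.

    This is NOT Git's real tree format. It only illustrates that the
    parent ID depends on the ID of every file beneath it.
    """
    listing = "".join(
        f"{path} {git_blob_id(data)}\n" for path, data in sorted(files.items())
    )
    return hashlib.sha1(listing.encode("utf-8")).hexdigest()


# Hypothetical site content, for illustration only.
site = {"index.xml": b"<page>Home</page>", "about.xml": b"<page>About</page>"}
edited = {**site, "about.xml": b"<page>About us</page>"}

print(git_blob_id(b""))                               # ID of an empty file
print(toy_tree_id(site) == toy_tree_id(dict(site)))   # same content, same ID
print(toy_tree_id(site) == toy_tree_id(edited))       # any change, different ID
```

The first line printed is the same ID that `git hash-object` reports for an empty file. The last two lines show the property that matters for publishing: identical content always yields an identical ID, and any edit yields a different one.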

While this explanation is an oversimplification of Git’s algorithm, it is essentially the model of how it works. The point is that two repositories on two different machines with the same commit ID are mathematically guaranteed to be the same (barring a hash collision, which is vanishingly unlikely in practice). That’s an extremely useful mechanism for versioning, and it is also a great help in publishing. Crafter CMS publishes (replicates) content based on Git commits. If you want to know if an endpoint on the other side of the world is the same as what you expect, you only have to compare the commit ID(s).
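The comparison itself can be very small. The sketch below assumes each delivery endpoint can report the commit ID it is currently serving (for a plain Git checkout, `git rev-parse HEAD` prints it). The endpoint names and commit IDs are made up, and `out_of_sync` is a hypothetical helper, not part of Crafter CMS.

```python
from __future__ import annotations


def out_of_sync(expected_commit: str, reported: dict[str, str]) -> list[str]:
    """Return the endpoints whose reported commit ID differs from the expected one."""
    return sorted(
        name for name, commit in reported.items() if commit != expected_commit
    )


# Hypothetical endpoint names and made-up commit IDs, for illustration only.
published = "9fceb02d0ae598e95dc970b74767f19372d61af8"
reported = {
    "us-east": "9fceb02d0ae598e95dc970b74767f19372d61af8",
    "eu-west": "9fceb02d0ae598e95dc970b74767f19372d61af8",
    "ap-south": "1a410efbd13591db07496601ebc7a059dd55cfe9",
}

stale = out_of_sync(published, reported)
print(stale)  # ['ap-south']
```

No file-by-file comparison is needed: one string per endpoint answers the question for the entire content tree.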

In today’s elastically scalable, globally distributed world you can have any number of servers. You need a means to make sure they are all in sync. As described above, Git’s internal mechanics give us just that. Crafter CMS is the first decoupled CMS with the capacity to scale geographically across an elastic cloud and at the same time make 100% certain that remote instances are consistently running the same version of content and code.

CONCLUSION

Decoupled CMS platforms provide push-based publishing, offer greater architectural flexibility and are much easier to scale elastically and distribute globally.  Along with this increased power and flexibility comes a need to ensure that all remote endpoints are in sync with one another and are up to date.  While this problem is solvable, few of today’s decoupled CMS platforms provide a solution for this that is 100% guaranteed and mathematically verifiable.  Crafter CMS and its Git-based repository leverage Git mechanics for publishing and replication to remote nodes.  Calculating changesets and verifying that an endpoint is in a particular state is based on the proven algorithms and data structures that back Git, the world’s most powerful and popular distributed source repository.
Stay tuned for our next blog entry to learn another major reason why you should use a Git-based CMS!

Five Reasons Why You Should Use a Git-based CMS (Part 2 of 5)

Since the birth of content management system (CMS) technology, well over 20 years ago, platforms have been leveraging “obvious backends” like SQL databases as a store for the content. Not because it’s the best or right store for the job, but because SQL databases are a commonly available, simple to use technology that (kinda) gets the job done. By the early 2000s, it was clear from many implementations that SQL and similar database stores, used directly, do not provide the full range of features, like versioning, that a CMS requires. They can’t. They were not built to do it. The Java Content Repository (JCR) and other similar technologies entered the scene. The implementations of these technologies sit on top of the same old database stores and add a layer of capability to fill the gaps. This is good but not good enough. Ultimately, the fact that they sit on top of a database comes back to haunt them.

In Part 1, we looked at what kind of versioning model is needed to support modern digital experiences. Today we focus on another critical capability that is missing in traditional CMS solutions: a distributed repository. More specifically, distributed versioning and workflow.

Reason #2: Distributed repository

Most databases are not easily distributable from a geographic sense, and more importantly, they are not distributable from a versioning and workflow sense.

I could spend a lot of time talking about how scaling and distributing a database geographically matters in the context of CMS and why it’s so difficult. I don’t have to. If you have the need for a CMS with high availability and global distribution you already know why it matters. If you have tried to make this work with a CMS based on a traditional database or a JCR repository, you already know it’s a difficult and sometimes impossible errand.

What is distributed versioning and workflow? The easiest way to get at this is by example. In the software development space, we’ve had Source Code Management (SCM) systems for a long time. These SCM systems allow teams of developers to work on a single code base as a team without stepping on each other’s toes by checking out work locally, working on it and then checking back in edits. Hint: This is not much different from what a CMS provides to content authors behind its UI.

Back to developers: In the past, we had CVS and SVN, along with many others. These SCM systems provided basic version management as well as branching and tagging, but fundamentally they followed a centralized model. With such solutions, there is a single central store and source of truth for the code base.

This SCM model worked well for smaller teams and smaller code bases but for large projects like the Linux operating system, it failed completely. Linux has so many developers spread out all over the world, working on many separate but related projects. A single, centralized system simply does not scale (in several ways) to meet this need. To make a long story short (collapsing a lot of history and detail), Linus Torvalds created Git as a lightning fast, open source solution to solve this problem. Git allows developers to have their own local and intermediary repositories that are all born from a parent repository. This makes distributing developers easy, it makes concurrency simple and most importantly to us, it distributes the versioning and workflow which makes “flowing” code to and from these independent repositories possible, fast and easy. Yes!

In the CMS space, for more than 20 years and right up to this day, we’ve had repository solutions of various capabilities and quality. None of them offers a real, workable way to move content back from production to lower environments like Staging, QA, Development, Load Testing and local developer machines. Yes, you can do it. But it’s a nightmare. You end up doing an export/import process and it’s not easy. Some systems are easier than others but they all stink. CMS consumers rig up all kinds of replication and publishing workarounds to try and deal with this problem. It’s all a hack. There’s no technical solution in the CMS space that was built to handle the problem specifically. For this reason and many others, development and operations teams HATE the CMS options available today. These options do nothing to help the team work; worse, they fight the team in almost every way. The technical members of the team put up with CMS technology because their business counterparts need content creation and editing capabilities. That’s all.

Moreover, today we understand that to some degree, in the digital experience space, “code is content.” Just as we need to be able to move content back to environments, we also must be able to move code (templates, JavaScript, CSS, etc.) forward through the environments. Developers have processes that they use to ensure quality and performance. With a traditional CMS, moving code forward through environments is even harder than moving content back. Wholesale export/import doesn’t work!

Because Crafter CMS is Git-based and because we’ve specifically built capability in Crafter CMS to handle these needs, the world finally has a CMS that solves this problem. The same approach developers use to make and promote source code changes with Git is used by Crafter CMS to move code forward and content back.

Every organization that uses a CMS for more than simple edits and blog posts knows exactly what I am talking about. Today, it’s understood that customer experience is one of the biggest competitive advantages an organization can have. Further, beyond the human element, digital enablement and innovation are the most important components of delivering great customer experience. Because content and code are inseparable from customer experience, the CMS is a mission-critical component of any and all customer experience solutions. Here’s the kicker: nearly the entire world is using a CMS technology that not only fails to enable the organization to innovate faster, it actually fights them!

The Git-based distributed capabilities in Crafter CMS allow your organization to have many environments that are all related to one another. Syncing and moving objects between them is natural, part and parcel of the technology itself. This means it’s easy to move content back and code forward.
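Why is syncing natural in a Git-style repository? Because every object is stored under an ID derived from its content, so syncing two environments reduces to copying whichever objects the other side lacks. The toy model below illustrates that idea only. `ToyRepo`, its methods, and the sample content are hypothetical; this is not how Crafter CMS or Git is implemented, and real Git also transfers commits and trees and handles divergent histories.

```python
from __future__ import annotations

import hashlib


class ToyRepo:
    """Minimal content-addressed store standing in for one environment."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def add(self, content: bytes) -> str:
        """Store content under the hash of its bytes and return that ID."""
        object_id = hashlib.sha1(content).hexdigest()
        self.objects[object_id] = content
        return object_id

    def fetch_from(self, other: "ToyRepo") -> int:
        """Copy only the objects this repo lacks; return how many were copied."""
        missing = other.objects.keys() - self.objects.keys()
        for object_id in missing:
            self.objects[object_id] = other.objects[object_id]
        return len(missing)


production = ToyRepo()
development = ToyRepo()

# Both environments start from the same page.
shared = b"<page>Home</page>"
production.add(shared)
development.add(shared)

production.add(b"<article>Authored in production</article>")  # new content
development.add(b"body { margin: 0; }")                       # new code

moved_back = development.fetch_from(production)     # content back
moved_forward = production.fetch_from(development)  # code forward
print(moved_back, moved_forward)
print(production.objects == development.objects)
```

Each direction copies exactly one object, and afterwards both environments hold the same set. Running either fetch again copies nothing, which is what makes this style of sync safe to repeat.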

Because the system is distributed and Git-based, developers can work locally and still be part of the CMS. That means they can use the tools they know and like, and they are not working on an island. The best way to make a developer love the CMS is to let them work with the CMS without having to work _in_ the CMS. Organizations that want to win need to innovate without impedance.

Conclusion

Today’s CMS systems are rooted in 20-year-old architectures and technologies. As the demand for innovation and digital experience has grown, and as organizations have come under pressure to deliver more at ever-increasing rates, CMS platforms have become more of a hindrance than a help. Crafter CMS, with its Git-based approach, not only solves these fundamental problems but also integrates so well with developer processes and tools that innovation moves even faster. Finally, a CMS approach that accelerates development instead of blocking it.

Stay tuned for our next blog entry to learn another major reason why you should use a Git-based CMS!

Five Reasons Why You Should Use a Git-based CMS (Part 1 of 5)

Crafter CMS is a revolutionary open source digital experience platform based on Git. Crafter CMS solves problems, from scalability and performance to ease of innovation, that have existed in the CMS space for more than 20 years. What makes Crafter CMS so unique is its technical approach and underlying architecture. From its repository layer to its content delivery technology, Crafter CMS is designed to handle today’s most difficult content management challenges associated with creating and managing omnichannel digital experiences.

While there are many architectural advantages of Crafter CMS, in this series we will focus your attention on Crafter’s underlying repository technology: Git. Crafter CMS is the first and only enterprise-class CMS based on Git. We’ve based our CMS on Git for many reasons, and throughout this series, we’ll explore five of the most important.

Reason #1: Event-based multi-object versioning

Traditional CMS platforms like Drupal, WordPress, Adobe Experience Manager, Sitecore, and most others either have severely limited versioning or provide basic versioning capabilities that track single object graphs or maintain clunky data structures to track relationships.

Figure 1: Single file versioning model. Each object has its own version tree. How and whether relationships are tracked between objects differ from one system to the next.

Such simplistic approaches work for basic content management needs like blogs or boring websites but largely fall down in the face of managing today’s multi-object, multi-asset digital experiences. Today’s content models are component-based, and they have many relationships and dependencies. Further, there is often a relationship between the content and the code (CSS, JavaScript, templates, etc.) that needs to be considered. Tracking the edits of any one specific object in isolation is simply not enough.

While simple object versioning does support basic editing, simple review, and basic reversion, these use cases are only the tip of the iceberg in a real-world environment. Scenarios like legal audits, company re-branding, and concurrent feature development drive the need for much more sophisticated CMS capabilities like a “time-machine” preview, multi-object reversion and content/codebase branching.

Instead of the single file versioning we see in the CMS space, what’s needed is a multi-object versioning approach like we see in the programming space. We require an approach that tracks “the entire state of the universe on each change.” With this level of version detail, a system can provide real previews at any point in time, make intelligent decisions about what must be reverted and support a host of branching and workflow needs.

Figure 2: Multi-object (“striped”) versioning model. Each event tracks the state of the entire repository at the time of the event.  

This type of solution already exists in the enterprise software development space. With software, one source file is often related to many others. Versions between objects matter. Modern Source Code Management (SCM) has evolved to support this need. Git is today’s most popular and widely used source code management system. It’s clear that the content and technical components of today’s digital experiences share many of the same needs that we see in the software development space. Rather than re-invent Git to achieve the same versioning capabilities in the context of content management, we’ve based Crafter CMS on Git’s versioning mechanics.

Because Crafter CMS is based on Git, every content change event is tracked with an event ID known as a “Commit ID.” Using this ID, it is possible to know the state of every content object in the system at the time of the event. For the sake of simplicity, we can say that we’ve created a version “stripe” across the entire repository at a given moment in time. The system does not make a copy of every object on every edit. That would be too slow and cost too much in terms of storage. Instead, this is done in a very efficient and effective manner by leveraging Git’s own proven versioning mechanics.
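A small Python model shows how a “stripe” can cover every object without copying every object. Each commit records which content ID every path points to, and content is stored once under its hash, so files that did not change add nothing to storage. `ToyStore`, the file names, and the integer commit numbers are assumptions for illustration. Git's real commit IDs are hashes, and Git shares unchanged directory trees between commits rather than copying a flat mapping as this toy does.

```python
from __future__ import annotations

import hashlib


class ToyStore:
    """Toy snapshot store: each commit maps every path to a content ID."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}        # content ID -> bytes, stored once
        self.commits: list[dict[str, str]] = []  # commit number -> {path: content ID}

    def commit(self, changes: dict[str, bytes]) -> int:
        """Record a new snapshot that applies `changes` to the previous one."""
        snapshot = dict(self.commits[-1]) if self.commits else {}
        for path, data in changes.items():
            blob_id = hashlib.sha1(data).hexdigest()
            self.blobs[blob_id] = data           # only changed files add bytes
            snapshot[path] = blob_id
        self.commits.append(snapshot)
        return len(self.commits) - 1

    def read(self, commit_id: int, path: str) -> bytes:
        """Return the content of `path` as it was at `commit_id`."""
        return self.blobs[self.commits[commit_id][path]]


store = ToyStore()
first = store.commit(
    {"home.xml": b"<page>Home</page>", "site.css": b"h1 { color: red; }"}
)
second = store.commit({"site.css": b"h1 { color: blue; }"})

print(store.read(first, "site.css"))   # the stylesheet as of the first commit
print(store.read(second, "home.xml"))  # untouched page, visible in the new stripe
print(len(store.blobs))                # 3 stored objects across 2 commits
```

The second commit touched one file, yet it still describes the state of both. That is the difference from single-file versioning: the relationship between the page and the stylesheet at any moment is recorded, not reconstructed.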

Moreover, because of the way Git stores and manages versions, traversals to any point in time are extremely fast. Performance is very important when it comes to the types of use cases we discussed earlier. Let’s take, for example, an auditing scenario: legal needs to see what the site looked like 46 days, 2 hours and 42 minutes ago. With most CMS platforms, this scenario is impossible to support. At best a systems group can attempt to restore a backup from that date and staff can be diverted to give the lawyers what they need. Even if your CMS claims to support this kind of review, the speed at which it can be provided is of key importance. If it’s too slow it won’t be practical. I’ve seen demos of CMS platforms that take minutes to render a previous version of a dynamic site. That’s too slow when you are doing a triage. It’s worse if you are traversing for editorial reasons. Crafter CMS simply doesn’t have this issue. Because of the way Git stores versions, traversal of versions in our Git-based CMS is extremely fast.
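The audit scenario boils down to one lookup: find the latest commit made at or before the requested moment, then render the site from that commit. Here is a sketch of that lookup using a binary search over commit timestamps. The `commit_at` helper, the dates, and the short commit IDs are all made up for illustration; with a real Git repository, `git rev-list -1 --before=<date> <branch>` answers the same question.

```python
from __future__ import annotations

from bisect import bisect_right
from datetime import datetime, timedelta


def commit_at(history: list[tuple[datetime, str]], when: datetime) -> str | None:
    """Return the ID of the latest commit made at or before `when`.

    `history` must be sorted by timestamp, oldest first.
    Returns None if `when` predates the first commit.
    """
    timestamps = [stamp for stamp, _ in history]
    index = bisect_right(timestamps, when)
    return history[index - 1][1] if index else None


# Hypothetical history with made-up commit IDs.
now = datetime(2024, 6, 1, 12, 0)
history = [
    (now - timedelta(days=90), "a1b2c3d"),
    (now - timedelta(days=50), "d4e5f6a"),
    (now - timedelta(days=10), "0718293"),
]

audit_time = now - timedelta(days=46, hours=2, minutes=42)
print(commit_at(history, audit_time))  # d4e5f6a
```

Once the commit ID is known, every object in the repository can be read as of that commit, with no backup restore involved.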

Finally, Crafter’s Git-based versioning approach itself hints at another important and related characteristic of Crafter CMS: content is managed in a document-oriented, file-based store. In short, content is stored as XML. Git is a file-based versioning system. Storing content as files is not only necessary; the file-based approach also has several major advantages. Because we’re dealing with files, content is easy to move among environments (Dev, QA, Prod, etc.) and migrate between systems. It’s much easier to integrate the content with other 3rd party systems, such as for language translation, e-commerce, and marketing automation. And because we store content in an XML format, it’s multi-byte character set friendly and totally extensible.
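As a small illustration of those last two points, the sketch below reads a content item with Python's standard XML parser, then adds a new field without touching the existing ones. The element names and values are hypothetical and are not Crafter CMS's actual content schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content item; element names are illustrative only,
# not Crafter CMS's actual schema.
CONTENT = """<?xml version="1.0" encoding="UTF-8"?>
<page>
  <title>Bienvenue à Zürich 東京</title>
  <body>Hello</body>
</page>
"""

# Parse from bytes so the encoding declaration in the file is honored.
root = ET.fromstring(CONTENT.encode("utf-8"))
title = root.findtext("title")
print(title)

# Extending the model is just adding an element; existing readers that
# only look up <title> and <body> keep working.
ET.SubElement(root, "promoCode").text = "SPRING"
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

Any third-party tool that can read a file and parse XML can work with content stored this way, which is what makes integrations such as translation straightforward.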

Conclusion

Most CMSs lack the sophisticated versioning mechanics needed by today’s multi-disciplinary teams who are creating modern digital experiences. Today’s sophisticated digital experiences call for a much richer set of versioning mechanics, similar to those we see in the software development space with Source Code Management (SCM) systems. Git is today’s most powerful and popular SCM system. Because it’s based on Git, Crafter CMS is able to meet the versioning needs of today’s most sophisticated use cases.

Stay tuned for our next blog entry to learn more reasons why you should use a Git-based CMS!