Alfresco One 5.x Developer's Guide - Second Edition

By: Benjamin Chevallereau, Jeff Potts

Overview of this book

Do you want to create more reliable and secure solutions for enterprise apps? Alfresco One 5.x is your gateway to developing the best industry-standard enterprise apps and this book will help you to become a pro with Alfresco One 5.x development. This book will help you create a complete, fully featured app for your organization and, while you create that perfect app, you will explore and implement the new and intriguing features of Alfresco. The book starts with an introduction to the Alfresco platform and you’ll see how to configure and customize it. You will learn how to work with the content in a content management system and how you can extend it to your own use case. Next, you will find out how to work with Alfresco Share, an all-purpose user interface for general document management, and customize it. Moving on, you’ll write web scripts that create, read, and delete data in the back-end repository. After that, you’ll work with a set of tools that Alfresco provides to generate a basic AngularJS application supporting use cases such as authentication, document list, and document view. Finally, you’ll learn how to develop your own Alfresco Mobile app and understand how Smart Folders and Search manager work. By the end of the book, you’ll know how to configure Alfresco to authenticate against LDAP, be able to set up Single Sign-On (SSO), and work with Alfresco’s security services.

Alfresco in the real world


Alfresco will tell you that the product is a platform for enterprise content management (ECM). But ECM is a somewhat nebulous and nefarious term. What does it really mean? It depends on who is saying it. ECM vendors usually use it as an umbrella term to describe a collection of content-centric technologies as follows:

  • Document management (DM): This is used for capturing, organizing, and sharing binary files. These files are typically produced from office-productivity software, but the scope of the files being managed is unlimited.

  • Web content management (WCM): This is used for managing files and content specifically intended to be delivered to the Web. The key theme of WCM is to reduce the "web developer" bottleneck and empower non-technical content owners to publish their own content.

  • Digital asset management (DAM): This is used for managing graphics, video, and audio. You can think of this as DM with added functionality specific to the needs of working with rich media such as thumbnailing, transcoding, and editing. Like WCM, the intent is to streamline the production process.

  • Records management (RM): This is used for managing content as a legal record. Like DAM, RM starts with DM and adds functionality specific to the world of RM such as retention policies, records plans, and audit trails.

  • Knowledge management (KM): This is used for capturing knowledge from employees or customers and providing it in a form that others can use.

  • Case management (CM): This is used for managing information related to a case, such as an insurance claim, an investigation, or personnel processing.

  • Imaging: This includes capturing, tagging, and routing images of documents from scanners.

Most people will also include collaboration, search, and occasionally, portals as well.

Practitioners have a different perspective. They will say that ECM is less about the technology and more about how you capture, organize, and share information across the entire enterprise. For them, the how is more important than the what.

What's important to know from an Alfresco perspective is that Alfresco is a platform for doing all these things.

So rather than worrying about a concise definition of ECM, let's look at a few examples to illustrate how clients are using Alfresco today, particularly in Alfresco's sweet spots such as DM and WCM.

Basic document management

Alfresco started its life as a document management repository with some basic services for document management. Alfresco focused on this area initially for two reasons. First, it allowed Alfresco to establish a strong foundation and then build upon that foundation by expanding into other areas of ECM. Second, there is a huge market for systems that can manage unstructured content (aka "documents").

The market is so big because document management is a problem for everyone. All companies generate files that benefit from the kind of features document management provides such as check-in/check-out, versioning, metadata, security, full-text search, and workflow.

Examples of classic document management are often found in insurance, manufacturing, packaged goods, or other companies with large research and development divisions. As you can imagine, companies such as these deal with thousands of documents every day. The documents are in a variety of formats and languages, and are created and leveraged by many different types of stakeholders from various parts of the company.

The critical functionality required for basic document management includes things such as:

  • Easy integration with authoring tools: If users can't get documents into and out of the repository easily, user adoption will suffer. This means users must be able to open and save documents to the repository from applications such as Microsoft Office, Microsoft Windows Explorer, and e-mail.

  • Security: Many documents, particularly legal documents and anything around new product development, are very sensitive. Employees must be able to log in with their normal username and password, and see only the documents they have access to.

  • Library services: This is a grouping of foundational document management functionality that includes check-in/check-out, versioning, metadata, and search. The ability to offer these library services is one of the things that sets a document repository apart from a plain filesystem.

  • Workflow: Quite literally, workflow describes the "flow of work" or business process related to a document. Requirements vary widely in this area and not everyone will leverage workflows right away. Workflows can be used to streamline and automate manual business processes by letting the document management system keep track of who needs to do what to a document at any particular time.

  • Scalability/Reliability: The system needs to scale in order to support several hundred or more users and hundreds of thousands or even millions of documents with some percentage of growth each year. Because the repository holds content that's critical to the business, it needs to be highly available.

  • Customizable user interface: Alfresco is split into two web applications. The first one contains only the core engine capabilities that are required for all Alfresco installations. The second one is the out-of-the-box Alfresco Share client made for generic document management, which may be appropriate in many cases. Most clients will want to make at least some customizations to the web client to help increase productivity and improve user adoption. It is also possible to develop your own frontend from scratch.

The following diagram shows an example of high-level architecture to understand how basic document management might be implemented:

The diagram shows a single instance of Alfresco authenticating against a Directory Server (such as LDAP). Some content managers are using Alfresco Share via HTTP/S, while others are using Windows Explorer, Microsoft Office, and other thick clients to work with content via one or more protocols such as CIFS, WebDAV, FTP, or SMTP. As noted in the diagram, Alfresco stores metadata in a relational database and the actual content files on the filesystem.

Most of the techniques for customizing Alfresco for DM solutions apply to other ECM solutions such as WCM, RM, Imaging, and DAM. Of course, there are business concepts and technical implementation details specific to each that make them unique, but the details provided in this book apply to all because the specialized solutions are built as extensions to the core Alfresco repository. This book dedicates an entire chapter, Chapter 9, Amazing Extensions, to some well-known extensions such as Alfresco Mobile and Alfresco Analytics.

Web content management

On the surface, WCM is very similar to document management. In both cases, content owners store files in a repository. Often, the content is assigned metadata, secured, indexed for search, and routed through a workflow. The most obvious difference between DM and WCM is that the content being managed is meant specifically to be published on a website or as part of a web application. Beyond that high-level distinction, there are several other differences that make WCM worthy of separate discussion. These include:

  • Content authoring tools used to create content

  • Separation of presentation from content

  • Systematic publication or deployment of content

Let's briefly look at each of these.

Content authoring tools

The majority of document management solutions deal with files generated by an office suite. Of course, there are exceptions such as various types of graphics files, CAD/CAM drawing formats, and other specialized tools. But mostly, the files are generated by a small number of different tools and an even smaller number of different software vendors.

In the case of WCM, there is a wide variety of tools involved from text editors to integrated development environments (IDEs) to graphics programs with multiple vendors in each category. This means the WCM solution needs to be very flexible in the way it integrates with authoring tools. The alternative, which is forcing authors to give up their favorite tools in favor of a standard, can be a management nightmare.

Separation of presentation from content

WCM does not require separating how content appears on the website from how it is stored. But many implementations take advantage of this principle because it makes redesigning the site easier, facilitates multichannel publishing, and enables people to author content without web skills.

To understand why this is so, think about a website that has its content and presentation of that content merged together. When it is time to redesign the site, you have to touch every single web page because every page contains presentation markup. Similarly, content authoring is limited to people with technical skills. Otherwise, there is a risk that the content owner (for example, the person writing a press release or a job posting) will inadvertently clobber the page design.

One way to address this is to separate the content (the press release copy) from the presentation of that content. A common way to do that is to store the content as presentation-independent XML. The XML can then be transformed into any presentation that's needed. A redesign is as simple as changing the presentation in a single place, and then regenerating all of the pages.
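As a minimal sketch of this idea, the following Java class stores a press release as presentation-independent XML and renders it to HTML with an XSLT stylesheet, using only the JDK's `javax.xml.transform` API. The element names (`pressRelease`, `title`, `body`), the sample text, and the stylesheet are all hypothetical and are not taken from Alfresco. XSLT is one possible transformation technology, not necessarily the one a given WCM system uses.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Renders presentation-independent XML into HTML with an XSLT stylesheet. */
class PressReleaseRenderer {

    // Hypothetical content: a press release stored without any presentation markup.
    static final String CONTENT_XML =
        "<pressRelease>"
      + "<title>New product launched</title>"
      + "<body>The product is available today.</body>"
      + "</pressRelease>";

    // Hypothetical presentation: a redesign only requires changing this stylesheet.
    static final String HTML_STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/pressRelease\">"
      + "<article><h1><xsl:value-of select=\"title\"/></h1>"
      + "<p><xsl:value-of select=\"body\"/></p></article>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the content and returns the generated markup. */
    static String render(String contentXml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(contentXml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception so callers can stay simple in this sketch.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(CONTENT_XML, HTML_STYLESHEET));
    }
}
```

Passing a different stylesheet to `render` (for example, one that produces a plain-text or mobile layout) would publish the same content to another channel without touching the stored XML.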

The impact of separating content from presentation is three-fold. First, assuming the content consumers aren't interested in reading raw XML, something has to be responsible for transforming the content. Depending on the implementation, it may be up to the WCM system or a frontend web application.

Second, in the case of static content, any change in the underlying content has to trigger a transformation so that the presentation will be up-to-date, keeping in mind that there may be more than one file affected by the change. For example, data from a job posting appears in the job posting detail as well as the list of job postings. If the posting and the job posting index are both static, the list has to be regenerated whenever the job posting changes.
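The bookkeeping this implies can be sketched as a simple lookup from each content item to the static pages that display its data. The class below is only an illustration under that assumption: the file names and the `StaticPageRegenerator` class are hypothetical, and a real WCM system would track these dependencies in its own way.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Works out which static pages are stale after a content change. */
class StaticPageRegenerator {

    // Hypothetical mapping: content item -> static pages that render data from it.
    private final Map<String, List<String>> pagesByContent;

    StaticPageRegenerator(Map<String, List<String>> pagesByContent) {
        this.pagesByContent = pagesByContent;
    }

    /** Returns every static page that must be regenerated when the given items change. */
    Set<String> pagesToRegenerate(List<String> changedContent) {
        // A LinkedHashSet keeps insertion order and drops duplicates, so a shared
        // page such as the index is regenerated only once.
        Set<String> pages = new LinkedHashSet<>();
        for (String item : changedContent) {
            pages.addAll(pagesByContent.getOrDefault(item, List.of()));
        }
        return pages;
    }

    public static void main(String[] args) {
        StaticPageRegenerator regenerator = new StaticPageRegenerator(Map.of(
            "job-posting-42.xml", List.of("jobs/42.html", "jobs/index.html"),
            "job-posting-43.xml", List.of("jobs/43.html", "jobs/index.html")));

        // Changing one posting affects its detail page and the shared index.
        System.out.println(regenerator.pagesToRegenerate(List.of("job-posting-42.xml")));
    }
}
```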

Third, content authors lose the benefit of WYSIWYG (What You See Is What You Get) content authoring because the content they edit doesn't look the way it will once it is published to the website. The WCM system, then, has to be able to let content authors preview the content as they author it, preferably in the context of the entire site.

Systematic publication or deployment

A document management system is a lot like a relational database in the sense that it is typically an authoritative, centralized repository. There are exceptions, but for the most part, content resides in the repository and is retrieved by the systems and applications that need it. On the other hand, a WCM system often faces a publication or deployment challenge. Files go into the repository, but must be delivered somewhere to be consumed. This might happen on a schedule, at the request of a user, as part of a workflow, or all of the above. Granted, some websites retrieve their content dynamically; but most sites have at least a subset of content that should be statically delivered to a web server.

Alfresco WCM example

Let's look at an example of a basic corporate website. Most companies have a mix of About Us content that probably doesn't change very often, Press releases or News sections that might get updated daily, and maybe some document-based content such as marketing slicks, product information sheets, technical specifications, and so on. There's also some content that is used to build the site such as HTML, XML, JavaScript, Flash, CSS, and image files.

It is likely that there are several different teams with several different skill sets, all collaborating to produce the site. In this example, suppose the About Us and News pages come from the marketing team, the site is built by the web team and the document-based content can come from many organizations within the company.

Alfresco WCM sits on top of the core Alfresco product to provide additional WCM-specific functionality. An important distinction between Alfresco WCM and other open source content management systems (CMS) is that Alfresco is a decoupled CMS while something such as Drupal is a coupled CMS. This means that Alfresco manages the website but does not concern itself with presentation unlike Drupal, which is both a repository and a presentation framework. This doesn't mean that Alfresco can only manage static sites. You can easily query the repository in any number of ways. It just means it is up to you to provide the frontend from the ground up.

Custom content-centric applications

Content-centric applications are those in which the primary purpose of the application is to process, produce, archive, collaborate on, or manage unstructured or semi-structured content.

The Alfresco Share client is an example of a content-centric application, although it is meant for a very general, all-purpose use case. When solutions are very close to basic document management, Alfresco Share can be customized as previously discussed. At some point, it makes more sense to build a separate custom application with Alfresco as the backend repository for that application.

Consider the sales process within a company, for example. Sales people create proposals. Those proposals are usually routed internally for review and approval, and then are delivered to the client. If the client accepts the proposal, a contract is drawn up and the product is delivered. The out-of-the-box Alfresco Share could be used to manage these documents, assign metadata, manage the review process through workflows, and make it all searchable. But the sales team might be even more productive if it used a purpose-built user interface. For this solution, a frontend built on top of NodeJS and Angular, a custom Spring web application, or even a custom mobile application might be a good option. Alfresco would provide the document management services. The frontend would talk to Alfresco via CMIS or RESTful services.

Another example is using Alfresco in a digitization project. More and more companies are trying to reduce their use of paper-based processes for many different reasons. Alfresco can be integrated with various scanning solutions, such as Ephesoft via CMIS or Kofax via the connector supported by Alfresco. Documents can be ingested and processed by the scanning solution and exported to Alfresco. Alfresco is then responsible for storing, indexing, and securing the scanned documents. Using the integrated Activiti framework, Alfresco can automatically start a process depending on the document type. If an invoice has been scanned, Alfresco will start a review process for the financial team. If it's a job application, Alfresco will start a new process for the HR team to track the different stages of this application.
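The routing rule in this example can be pictured as a lookup from document type to the process that should start. The sketch below is plain Java and deliberately does not call any Alfresco or Activiti API. The type names and process names are hypothetical placeholders for whatever content model types and workflow definitions a real project would define.

```java
import java.util.Map;
import java.util.Optional;

/** Chooses which business process to start for a newly scanned document. */
class ScannedDocumentRouter {

    // Hypothetical document types and process names; a real deployment would
    // use its own content model types and workflow definition keys.
    private static final Map<String, String> PROCESS_BY_TYPE = Map.of(
        "invoice", "financeReviewProcess",
        "jobApplication", "hrApplicationTrackingProcess");

    /** Returns the process to start for a non-null document type, or empty if none applies. */
    static Optional<String> processFor(String documentType) {
        return Optional.ofNullable(PROCESS_BY_TYPE.get(documentType));
    }

    public static void main(String[] args) {
        for (String type : new String[] {"invoice", "jobApplication", "brochure"}) {
            String decision = processFor(type)
                .map(process -> "start " + process)
                .orElse("store only, no process");
            System.out.println(type + " -> " + decision);
        }
    }
}
```

Keeping the mapping in one place means a new document type can be routed by adding an entry rather than changing the ingestion logic.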

As discussed previously, Alfresco provides two out-of-the-box web applications. The first one is the Alfresco repository engine, which, from a user interface point of view, provides only administration capabilities. The second one is the default web interface, Alfresco Share. Many clients appreciate this separation because it gives them complete freedom with regard to how they build the frontend. Depending on your use case, you may want to use the standard Alfresco Share user interface, add some customizations to it, or even build the frontend from scratch.

Alfresco Share provides many different options if you need customizations. The basic level is to configure some forms and pages to display your custom metadata. If you need further customization, you may want to customize an existing dashlet or develop a new one to add to the user or site dashboard. You may also need to create custom actions in the user interface. If that is not enough, you can even create new pages within Alfresco Share that reuse the entire UI framework. Finally, if that is still not sufficient, Alfresco can be integrated with any frontend using CMIS or the REST API.

We'll see in one of the following chapters how Alfresco created tools to generate Angular applications from scratch.

The openness of the Alfresco repository, particularly its ability to be easily exposed as a set of services, makes Alfresco an ideal platform for content-centric applications. As the examples have shown, custom content-centric web applications use Alfresco as the backend. As a result, they have complete flexibility in frontend technology choices from portals to lower-level frameworks to no framework at all.