Over the last few articles we’ve covered a number of important content management topics: knowledge graphs, blockchain, misinformation. These concepts have a real bearing on how we manage our information and even on how we work.
Put together, these concepts form a picture of the entire lifecycle of information in the foreseeable future. In this column, I’d like to take you on a trip in a content management time machine for a realistic preview of what managing and using information will look like in the next three to five years, from how we create and capture information to how we consume it, all in the way content management was intended: true content management.
Goodbye Clippy, Hello Assisted Content Management
Remember when you used to start typing a resume or business letter, and Clippy (Microsoft’s Word Office Assistant) would pop up and say, “Hey, it looks like you’re typing a resume!”
Within the next few years, you should be able to start work on a document and the content management system will recognize what you’re doing and suggest any and all related documents extracted from somewhere in the system. Until now, you could only extract and aggregate related information if you did the work of applying good metadata tags and architecting robust cross-system search.
Natural language processing (NLP) gives computers the ability to understand text and spoken words in the same way that humans can. NLP uses linguistics (modeling of human language) along with statistics, machine learning and deep learning to ‘understand’ the meaning of voice or text data. NLP can translate text from one language to another, respond to spoken commands and summarize large volumes of text in real time.
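To make the summarization idea concrete, here is a toy sketch of extractive summarization: score each sentence by the frequency of the words it contains, then keep the top-scoring sentences. This is a simplification I've written for illustration; production NLP systems use trained language models, not raw word counts.

```python
import re
from collections import Counter

def summarize(text: str, n_sentences: int = 2) -> str:
    """Toy extractive summarizer: keep the sentences whose words
    occur most frequently across the whole text."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> int:
        # A sentence scores the total frequency of the words it contains.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Emit the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)
```

Even this crude scoring tends to surface the sentences most representative of a document's vocabulary, which is the intuition behind real-time summarization of large volumes of text.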
Knowledge graphs, or semantic networks, connect and relate events, concepts and things. A knowledge graph aggregates related sets of information into a model of your knowledge domain by describing entities and their relationships. Knowledge graphs have become popular as a way to organize knowledge across the internet and to integrate information extracted from multiple data sources.
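At its simplest, a knowledge graph is a set of subject-predicate-object statements, and "aggregating related information" is just a matter of collecting every statement that touches an entity. The sketch below uses invented document and entity names to show the shape of the idea; real deployments use RDF triple stores or graph databases.

```python
# A minimal knowledge graph as (subject, predicate, object) triples.
# All names here are hypothetical examples.
triples = {
    ("Contract-041", "type", "Contract"),
    ("Contract-041", "party", "Acme Corp"),
    ("Contract-041", "relatedTo", "Invoice-88"),
    ("Invoice-88", "type", "Invoice"),
    ("Invoice-88", "issuedTo", "Acme Corp"),
}

def facts_about(entity: str) -> dict:
    """Aggregate every predicate/object pair recorded for an entity."""
    out = {}
    for s, p, o in triples:
        if s == entity:
            out.setdefault(p, set()).add(o)
    return out

def related(entity: str) -> set:
    """Entities connected to `entity` in either direction."""
    return ({o for s, p, o in triples if s == entity} |
            {s for s, p, o in triples if o == entity})
```

The payoff is that relationships added from one data source (the invoice) are immediately discoverable from another (the contract), without either system knowing about the other.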
Related Article: Is Viva Topics the World’s First Topic Computing Solution?
I Don’t Think ‘ECM’ Means What You Think it Means
As director of research in the Data and Analytics practice at Info-Tech, my colleague Irina Sedenko has rich experience in data and content management as well as artificial intelligence (AI) and machine learning (ML). She said, “When we talk about ECM, we’re talking about document management, not really content.”
Real content management means not managing fully formed documents as units, but rather reusable modules that you create once and package together with other modules to build flexible new products. You manage it once, change it once at its source and the change is reflected in all its multiple instances. Translation software works on this premise: you catalogue and tag standard reused paragraphs or snippets and pull them into your workspace in translated form to build a finished document.
Current technology allows us to build solutions that can detect what document a user is working on and automatically add a description and classification tag. As authors add more content, the system may update the document’s classification tag. This is made possible by NLP and knowledge graphs.
“We should be able to do this with simple configuration,” said Sedenko. “I should not be cutting and pasting.” With real ECM, the selection of possible paragraphs should be presented in a preview panel so that we can pick that paragraph and that one and drag them into our template workspace. This is what you see when you ask questions in Google: you not only get lists of links, you get complete paragraphs or snippets pulled from the source document to answer your question without you having to open the link.
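The snippet-preview behavior Sedenko describes boils down to ranking stored paragraphs against a query and surfacing the best matches, the way a search engine answers without making you open the link. A minimal sketch, assuming simple word-overlap scoring (real engines use far richer ranking signals):

```python
import re

def tokens(text: str) -> set:
    """Lowercase word set for crude overlap matching."""
    return set(re.findall(r"[a-z']+", text.lower()))

def best_snippets(question: str, paragraphs: list, k: int = 2) -> list:
    """Rank candidate paragraphs by shared vocabulary with the question
    and return the top k for a preview panel."""
    q = tokens(question)
    ranked = sorted(paragraphs, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]
```

From the user's side, the result list is exactly the preview panel: candidate paragraphs to pick and drag, rather than links to open and mine by hand.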
Related Article: How Well Do You Understand Your Content Processing Pipeline?
Checking In: Assisted Metadata Management
Many organizations know the importance of metadata but struggle to get users or systems to successfully populate metadata values. Sometimes they get the metadata in at check-in, but it’s never revisited.
Metadata changes over the course of the information lifecycle. These technologies should change the metadata automatically for you as the information is processed over time.
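One way to picture metadata that evolves with the lifecycle: every processing step updates the record's metadata as a side effect, instead of relying on an author to revisit it after check-in. The stage names and fields below are illustrative, not any product's schema.

```python
from datetime import date

def advance(meta: dict, stage: str) -> dict:
    """Return updated metadata after a lifecycle transition,
    stamping the stage and an audit trail automatically."""
    meta = dict(meta)  # work on a copy; don't mutate the caller's record
    meta["stage"] = stage
    meta.setdefault("history", []).append((stage, date.today().isoformat()))
    if stage == "archived":
        meta["editable"] = False  # e.g., lock records for retention
    return meta

record = {"title": "Q3 report", "stage": "draft", "editable": True}
record = advance(record, "published")
record = advance(record, "archived")
```

The point is that the metadata entered at check-in is only the starting state; each downstream process leaves the record more accurately described than it found it.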
Products such as OpenText Magellan crawl content and enrich it with metadata to aid records management and findability. According to the company, Magellan Text Mining quickly accesses and analyzes billions of pages of textual content and imagery from documents, email, social media and web pages, and uses powerful algorithms to determine what a piece of text is about and assess its relevance. By some estimates, unstructured content makes up more than 80% of our data, yet we can’t find it or use it. The components we need are all here: NLP, knowledge graphs, AI.
Related Article: Using AI for Metadata Creation
We’ve Got the Content Management Tools, We Just Need to Put Them Together
If we have all the components, why haven’t we arrived at true content management? “Someone needs to put it all together,” Sedenko said: a team that understands content managers and users and can translate that understanding into modern practices and technologies. It’s not just about technologies, but about rethinking, or returning to, what content management really is.
Next-generation content management systems can only do so much. They’re just waiting for us to catch up.
Andrea Malick is a Research Director in the Data and Analytics practice at Info-Tech, focused on building best practices knowledge in the Enterprise Information Management domain, with corporate and consulting leadership in content management (ECM) and governance.
Andrea has been launching and leading information management and governance practices for 15 years, in multinational organizations and medium-sized businesses.