How to curate learning content at scale

By Filtered

4 minute read

Corporate learning is about doing as much as you can with as little as possible. That’s why curation is central to everything that Filtered does.

So why is it important to curate learning content at scale? It’s through content curation that we get rid of anything superfluous and redundant and distil your libraries so that the only content left is the stuff your people, and your organisation, need.

It’s the first stage of matching the right content to the right person. And it’s a vital precursor to the second - personalisation.

We’ve spoken at length about our theory behind effective curation and its proven benefits. But it’s important that you know how human and artificial intelligence work together to curate massive learning content libraries so you can understand why it’s so effective for our clients. So - short only of our best-kept secrets - here’s how the process of curation goes.

1 - Groundwork

You can’t filter anything until you know what you want to get out of it. In the case of most of our clients, the two overriding criteria are strategic relevance and quality.

The first of these relies entirely on developing a business-aligned skills framework. We go into more depth on how we develop them here, but it entails working closely with an organisation to develop a simple, needs-focussed, and business-oriented set of weighted capabilities which will act as the raw materials to drive organisational change. 

We've helped change the skills landscape at GlaxoSmithKline, Heineken and many more. Book in a chat with the team and they'll help you draw up a free action plan for your skills framework.

The second is more difficult. To be in any way scalable, curation has to be undertaken by software and, despite its many abilities, artificial intelligence is not yet human enough to accurately define something as subjective as “quality”. Luckily, we have a workaround which, while simple, is incredibly effective. As Gloria Origgi elaborates here, with an excess of information too large to possibly digest, the solution is to rely on the quality of the provider, not the information itself.

That’s what we do. Part of the content curation for learning process is judging which providers can be trusted to consistently supply high quality content. This is a human process in which a provider’s portfolio of content is judged according to a set of 7 criteria: 

  1. Business-useful (applicable in real business) 
  2. Good value per word (visually appealing, persuasive, enjoyable, succinct)
  3. Evidenced
  4. Original
  5. Independent: no in-content agenda (eg selling a product)
  6. Serviceable: available, mobile-friendly, easy to navigate, not full of clickbait/ads
  7. (As a selection) a good mix of length and format-type
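To make the rubric concrete, here is a minimal sketch of how a provider assessment against the seven criteria above could be scored. The criterion keys, the scoring function, and the pass threshold are all illustrative assumptions, not Filtered’s actual system.

```python
# The seven quality criteria from the list above, as rubric keys.
CRITERIA = [
    "business_useful",      # applicable in real business
    "good_value_per_word",  # visually appealing, persuasive, succinct
    "evidenced",
    "original",
    "independent",          # no in-content agenda (e.g. selling a product)
    "serviceable",          # available, mobile-friendly, not full of ads
    "varied_selection",     # good mix of length and format-type
]

def provider_score(assessment: dict) -> float:
    """Fraction of the seven criteria a provider's portfolio satisfies."""
    return sum(bool(assessment.get(c)) for c in CRITERIA) / len(CRITERIA)

def is_trusted(assessment: dict, threshold: float = 6 / 7) -> bool:
    """Treat a provider as 'trusted' if it meets at least `threshold` of the criteria."""
    return provider_score(assessment) >= threshold
```

A real assessment would be made by a human curator reviewing the provider’s portfolio; the code only aggregates those human judgements into a repeatable trust decision.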

There are exceptions. For example, if you’re looking for a really niche bit of content that very few providers will cater to, a dip in some of the quality criteria is an acceptable sacrifice if the material covers the gaps. However, in more standard content, where there are overlaps, choosing the trusted provider is an effective way of ensuring a base level of quality. 


2 - Method

Our curation technology follows a flexible, dual-pronged approach to achieve optimal results across the huge variety of content situations we deal with for our clients. The first prong, which leans on human intelligence, is what we call our autotagger; the second, which leans on artificial intelligence, is our neural network.

In the first, human-based system, an algorithm is set rules by human curators which it then applies to an entire library. After a skills framework and set of capabilities is decided on for an organisation, human curators go through a range of relevant content (say 50-100 pieces) judging it according to the framework’s capabilities. They then draw up a lexical blueprint for each capability. This involves working out which words and phrases are best linked to a capability and then weighting them according to their relevance. So, a word/phrase very clearly linked to a capability would get a higher score than a word/phrase whose connection was only tangential.

This information is then programmed into the algorithm for the entire skills framework. The algorithm trawls through a whole library of content in a matter of seconds and selects and ranks the content best suited to each capability. 
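The autotagger steps above can be sketched in a few lines. The capability names, terms, and weights below are invented for illustration; the real lexical blueprints are drawn up by human curators against a client’s skills framework.

```python
# A "lexical blueprint": for each capability, words/phrases weighted by how
# clearly they are linked to it (clear link -> high weight, tangential -> low).
lexical_blueprint = {
    "data_literacy": {
        "data visualisation": 3.0,
        "spreadsheet": 2.0,
        "chart": 1.0,
    },
    "communication": {
        "active listening": 3.0,
        "presentation": 2.0,
        "email": 1.0,
    },
}

def score_content(text: str, blueprint: dict) -> dict:
    """Score one piece of content against every capability's weighted terms."""
    lower = text.lower()
    return {
        capability: sum(w for term, w in terms.items() if term in lower)
        for capability, terms in blueprint.items()
    }

def rank_library(library: list, blueprint: dict) -> dict:
    """For each capability, rank content (by index) from best to worst fit."""
    scores = [score_content(text, blueprint) for text in library]
    return {
        cap: sorted(range(len(library)), key=lambda i: scores[i][cap], reverse=True)
        for cap in blueprint
    }
```

Because the rules are fixed up front, scoring is just counting weighted term matches, which is why a whole library can be processed in seconds.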

This method is utilised for its immediate effectiveness and scalability. Once the parameters have been set, reams of content can be analysed and tagged in a matter of seconds. While its usefulness is undeniable, the price paid for the autotagger’s speed is a set of limitations. Firstly, the level of granularity it can achieve is bounded by the human curator’s rules: because those rules are rigid, it cannot be as fine a comb as something with its own intelligence. Secondly, the data it is able to leverage is incomplete. While the text itself is the most important factor, the metadata around it can give the algorithm even more clues about how relevant a piece of content is to a specific capability.


Our alternative is our artificial intelligence-based curation engine. This technology relies on the same notion of a human-machine handover but is able to generate far more accurate and specific results due to the algorithm's ability to learn through a neural network. For the sake of intellectual property, we can’t go into too much detail about the specifics. But, here’s an outline of how it works: 

First, a trained human curator goes through a selection of a few hundred test pieces of content. The human curator then tags each piece of content according to the capabilities in the defined skills framework.

This tagged content is then fed to the neural network as training data. The algorithm uses it to learn to draw its own connections between the text/metadata and the tagged capabilities. This algorithm stack isn’t just connecting and listing phrases and words; it uses the human tagging as a springboard rather than a guide. It can work with a far fuller and more nuanced picture: understanding larger and more complex selections of text in context, seeing how metadata relates to content, and keeping in mind a breadth of overlapping connections far too sprawling for humans to compute. From here, the tagging process can be scaled up to hundreds of thousands of pieces of content - and the machine keeps learning as it goes along. While it’s not as rapid as the autotagger method, Filtered technology (LXP) can go through a library of content unfathomably fast in comparison to the human alternative.
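The human-machine handover can be illustrated with an off-the-shelf multi-label text classifier. scikit-learn stands in here for the proprietary neural network stack, and the tiny training set and capability labels are invented; the point is only the shape of the process: humans tag a few hundred pieces, the model learns the connections, then tags the rest.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Step 1: human curators tag each training piece with one or more capabilities.
train_texts = [
    "Building dashboards and charts from raw data",
    "Running effective one-to-one meetings with your team",
    "Presenting data insights to senior stakeholders",
]
train_tags = [["data_literacy"], ["leadership"], ["data_literacy", "communication"]]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(train_tags)  # one column per capability

# Step 2: the model learns term/capability connections from the human tags.
# TF-IDF features + one-vs-rest logistic regression = a simple stand-in for
# the neural network described above.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(train_texts, y)

# Step 3: the trained model can now tag the rest of the library at scale.
predicted = model.predict(["A beginner's guide to data visualisation"])
```

In production you would train on hundreds of labelled pieces rather than three, and retrain as new human tags arrive, which is what lets the machine “keep learning as it goes along”.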

3 - Next Steps

The meat of Filtered’s curation process is in providing the information - tagging and organising all of a company’s content according to its skills framework. The process could feasibly end here and still be hugely useful. First and foremost, companies can find out what content they need and leverage it as the fuel to drive change. And this initial curation doesn’t have to be the final one. If organisational priorities change or re-skilling efforts need to be directed, the structure is there to make re-curation a simple process. There are other benefits here too: by revealing redundant and irrelevant content within their libraries, organisations can potentially save vast amounts of money by getting rid of content providers they don’t need.


However, we usually keep going. For most of our clients, the curation stage is just the first of two filtrations. The second is personalisation, where the already-culled content selection is tailored to perfectly fit the individual needs of each learner. Learn more about that here.

Have you got way too much content with not enough idea of what’s useful? Chat to one of our team about getting it Filtered.

