Five years ago, I wrote an article about a pervasive corporate problem: Learning Gunk. I described it as the glut of files - .mp4, .mp3, .pptx, .pdf, SCORM 1.2, SCORM 2004, AICC, .zip, .docx, .m4a, .wav, .wma - accumulating over the years in disparate, rarely-opened locations on corporate servers. The core issue then was that wading through this mess was an unpleasant, painful experience, and the vast majority of people simply avoided it entirely.
I wrote it because I saw it at many of Filtered’s clients and at most large companies I spoke to in general. This phenomenon and problem is very much part of the essence of Filtered - indeed, the reason that Filtered is called Filtered. We’ve always been about filtering out the useless many in order to turn attention to the vital few.
Most of what I observed half a decade ago is painfully true today. Every single company I’ve ever spoken to about learning content (a large number!) still has an overwhelming amount of this material. After five years, they just have even more. Much of it remains locked in SCORM formats, and these traditional file types stubbornly persist in huge volumes.
The small but truly important minority of hyper-relevant, high-quality, genuinely impactful learning materials is usually the content created by your own workforce. But companies - and the individuals at those companies - can’t distinguish the good from the obsolete, duplicative, irrelevant, or low-quality (note the many different ways learning content can be poor!) because this internal content tends to have poor or missing metadata. They can’t find it. The truth is that barely anyone even looks.
The big change in the world since then - and for L&D and learning gunk specifically - is, of course, generative AI. But AI can’t actually generate this tacit, internal, company-specific knowledge. It’s too specific, and foundation models trained on the broad, generic text of the internet just won't get there.
What the technology can do is unlock the immense value trapped inside monolithic SCORM packages. We can break them open, break them down, and tag them at a granular level so the best of it actually sees the light of day.
In a little more detail:
- Break it open. We can parse the SCORM manifest and internal structure to automatically extract individual modules and lessons.
- Understand it. We can use LLMs to analyze the extracted material to generate clear summaries, contextual descriptions, and inferred learning objectives without any manual instructional design effort.
- Break it down. Instead of broadly categorizing a massive course, each newly freed lesson is independently analyzed and auto-tagged to specific skills. This ensures precision and relevance, bypassing the old method of inheriting generic skills from the parent package.
- Fix it. That means adding transcripts or metadata where needed. If audio or video content is missing a text counterpart, the system can automatically generate it. The ultimate goal is microlesson-level discovery. We don’t stop at tagging the sealed container; we tag and bring to life the small individual learning units.
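To make the “break it open” step concrete, here’s a minimal sketch of what parsing a SCORM package can look like using only Python’s standard library. The function name is made up for illustration, and the mapping of lessons to `<item>` elements with `identifierref` attributes is a simplifying assumption about the IMS manifest structure - not Filtered’s actual implementation.

```python
import xml.etree.ElementTree as ET
import zipfile

def list_lessons(package):
    """Return (title, launch file) pairs for each item in a SCORM manifest.

    `package` is a path or file-like object for a SCORM .zip. Illustrative
    sketch only: real packages nest organizations and vary by SCORM version.
    """
    with zipfile.ZipFile(package) as pkg:
        root = ET.fromstring(pkg.read("imsmanifest.xml"))

    # Manifests use IMS Content Packaging XML namespaces; strip them so we
    # can match local tag names regardless of SCORM 1.2 vs 2004.
    def local(tag):
        return tag.split("}")[-1]

    # Map resource identifiers to their launchable files.
    resources = {
        res.get("identifier"): res.get("href")
        for res in root.iter()
        if local(res.tag) == "resource"
    }

    lessons = []
    for item in root.iter():
        # Items that reference a resource are the launchable lessons.
        if local(item.tag) == "item" and item.get("identifierref"):
            title_el = next((c for c in item if local(c.tag) == "title"), None)
            title = title_el.text if title_el is not None else item.get("identifier")
            lessons.append((title, resources.get(item.get("identifierref"))))
    return lessons
```

Each `(title, href)` pair is then a candidate microlesson that can be summarized and tagged on its own, rather than inheriting tags from the whole course.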
This is a major new capability at Filtered, which we’re officially launching now: SCORM Intelligence. We’ve tuned our existing skills-tagging models to do exactly this work, now applied to stubborn SCORM files. This analysis, alongside analysis of your other content (web, off-the-shelf libraries), gives you a complete, cohesive view of what you have and what you’re missing, relative to the high-value skills at your organisation.
“Filtered’s SCORM Intelligence finally unlocks what’s inside our SCORM eLearning packages. By extracting transcripts and tagging skills at module level, we get far more accurate skill alignment at scale, without breaking the course structure. It turns SCORM from a black box into AI-ready, skills-powered learning.”
- Global L&D Innovation & AI Leader at a Big Four professional services firm
What L&D gets out of this
Reuse, rather than buying or building. Instead of buying or renewing large off-the-shelf libraries to fill perceived gaps, you may well be able to get much of that value (and probably more) from your existing content. A cost saving of that scale should be a win internally on its own. But even if you can’t get full coverage from internal content, the real benefit is that you bring to life the highest-value, most expensively-acquired content in your library. The downstream performance gains here - though never easy to measure - will be massive.
The personalization benefits of granularity. You can’t personalize a monolithic one-hour course for a time-poor employee. At that size it just won’t be, or feel, personal - it doesn’t consist solely of the lesson that they, specifically, need. But you can personalize a 3-minute video. Disaggregating these packages allows for much more precise learning experiences and recommendations - from your LLM, from your LXP, or simply from (human!) managers who want a member of their team to retain a particular thought, theme, lesson, or idea.
Governance-friendly. Tampering with potentially important files that have been with the company a while may feel daunting. But the original files needn’t be destroyed or overwritten. They should be archived, with the newly extracted lessons remaining logically linked to the parent SCORM asset. This means your governance, reporting, and traceability to the source material are fully preserved while enabling lesson-level discovery and personalization. You stay confident, compliant, and accountable.
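One way to picture that logical linkage is a small record per extracted lesson carrying a pointer back to its archived parent package. This is a hypothetical sketch - the class and field names are my own, not Filtered’s schema - but it shows how lesson-level discovery and source traceability can coexist.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MicroLesson:
    """Illustrative record for one extracted lesson. Field names are
    assumptions for the sketch, not an actual product schema."""
    lesson_id: str
    title: str
    skills: tuple          # skill tags inferred for this lesson alone
    parent_scorm_id: str   # the archived source package, never overwritten
    source_path: str       # location of the original file inside the package

def trace_to_source(lessons, lesson_id):
    """Governance helper: recover the parent package for any lesson."""
    by_id = {lesson.lesson_id: lesson for lesson in lessons}
    return by_id[lesson_id].parent_scorm_id
```

Because every microlesson keeps its `parent_scorm_id`, reporting can always roll lesson-level activity back up to the original, untouched SCORM asset.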
What You Can Do Right Now
1. Run the Audit Yourself
You can start with a simple extraction from your LXP or LMS. On paper, that sounds straightforward. In reality, it usually means pulling exports from multiple systems, opening spreadsheets full of inconsistent metadata, trying to interpret course titles that made sense in 2019, and guessing what might still be relevant. Then you’re left trying to answer questions like:
- Where is all this actually sitting?
- How much of it might be genuinely valuable internal knowledge?
- What are we currently doing to maintain, tag, or surface it?
- How much duplication or decay are we carrying?
You’ll probably confirm what you already suspect — there’s a lot of “stuff.” But separating gold from gunk manually is slow, subjective, and rarely conclusive. It’s a significant time investment, and the outcome is often… a bigger spreadsheet.
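If you do attempt the audit yourself, the first pass - an inventory of what file types are sitting where, which zips are actually SCORM packages, and how much byte-identical duplication you’re carrying - can at least be scripted. A rough sketch in Python, with illustrative names and a deliberately crude heuristic (a zip containing `imsmanifest.xml` is treated as SCORM); near-duplicates and content decay need far more than this:

```python
import hashlib
import os
import zipfile
from collections import Counter

def audit(content_dir):
    """Inventory a content folder: counts by extension, SCORM packages
    found, and byte-identical duplicate files. Sketch only."""
    by_ext = Counter()
    seen_hashes = {}
    duplicates = []
    scorm_packages = []
    for dirpath, _, filenames in os.walk(content_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            by_ext[ext] += 1
            # Flag exact duplicates by content hash.
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            if digest in seen_hashes:
                duplicates.append((path, seen_hashes[digest]))
            else:
                seen_hashes[digest] = path
            # Crude heuristic: a zip containing imsmanifest.xml is SCORM.
            if ext == ".zip" and zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as z:
                    if "imsmanifest.xml" in z.namelist():
                        scorm_packages.append(path)
    return {"by_ext": dict(by_ext),
            "scorm_packages": scorm_packages,
            "duplicates": duplicates}
```

Even this much tells you the shape of the problem - but it can’t tell you which files are valuable, which is exactly where manual auditing stalls.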
2. Or Send Us a Sample
Alternatively, send us a small sample of your files or data (post-NDA). We’ll run it through our algorithms and SCORM Intelligence and come back to you within 48 hours. No workshops. No internal clean-up project. No weeks of manual tagging. We’ll show you:
- What’s genuinely valuable
- What’s reusable at microlesson level
- Where the real skill signals sit
- How much hidden internal knowledge you actually have
In other words, we’ll show you exactly what kind of gold you’re sitting on - amongst the gunk. Fast. Low lift. Clear outcome. If you’re curious, it starts with a sample. See below to schedule a call and activate your sample SCORM Intelligence analysis.
Stop Wasting Learning Content.
Start Leveraging It.
See how SCORM Intelligence can extract value from your SCORM library in minutes, not months.
Click here to book a no-obligation 30-minute consultation to explore SCORM Intelligence.