Build an AI-powered content taxonomy - fast & free
Building a content taxonomy doesn’t need to be complicated or costly. This guide shows you how to use AI-powered image recognition to generate a working taxonomy from your existing content - fast, free, and scalable.
Start smarter: let your content define the structure
Taxonomies are a foundational element of effective content management. They shape how information is organized, discovered, and reused across teams and systems. Yet for many, building a taxonomy from scratch feels intimidating - especially when the pressure to “get it right” leads to overcomplication or internal bias.
At Fotoware, we’ve spent three decades helping organizations structure content at scale.
A common pitfall we see is that teams either over-plan - designing taxonomies too rigidly around personal roles - or "underbuild", adding only a handful of tags relevant to their immediate needs.
The result is often the same: a structure that fails to scale, becomes cluttered with duplicates, and confuses more users than it helps.
This article introduces a more intuitive starting point - one powered by AI. Instead of retrofitting auto-tags into a prebuilt taxonomy, we flip the process: using image recognition tools to generate a vocabulary based on what your content actually contains. It’s a bottom-up approach that reflects real usage and provides a lean, data-driven starting point for further refinement.
By the end of this guide, you’ll have a fast, free method to build your own working taxonomy - from real content, not assumptions. Whether you're structuring a DAM, CMS, or enterprise content platform, this method provides a smarter foundation for the metadata that follows.
Start simple - then grow your taxonomy naturally

Create a vocabulary based on real content to create immediate value.
Creating a taxonomy with auto-tagging tools means that users can immediately build a vocabulary based on the content that currently exists in their libraries, not on what they imagine those libraries contain or predict they may need to include in the future. This creates something very valuable: a taxonomy based only on ‘hard data’ captured in the present moment.
Of course, with this method users can then adapt and grow their taxonomy as it expands naturally over time, either by using auto-tagging or by adding structure themselves. The key concept is to create a basic taxonomy that can be built on and developed.
Expert guidance when you need it - we can help
If you would like assistance with getting started on your own taxonomy, using a content management system, or any related topic, please get in touch with our consultants.
What you’ll need to start building a taxonomy
With the power of auto-tagging, it is possible to quickly and effectively create a basic taxonomy.
For this to work, all you need is the following:
- A sample of existing images to work with.
- A willing DAM vendor, a developer, or access to an online auto-tagging service.
- A test group to help you refine the AI-built taxonomy vocabulary.
Choosing your sample size is an important decision to make before you start creating your taxonomy. You must first decide which images to use and how many. As in any research, a larger sample gives you a wider base to experiment on. In theory, this reduces the impact of anomalies on your results and narrows what statisticians refer to as the ‘margin of error’.
However, you must balance this against the amount of time you are willing to invest and the number of appropriate images you have. It is better to go for a smaller sample than to bulk it out unnecessarily with images that will distort your taxonomy’s vocabulary. As a minimum, you should have a few hundred images to use.
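To give a rough feel for how sample size affects the margin of error, here is a minimal sketch using the standard formula for a proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case proportion of 0.5; the function name and defaults are our own illustrative choices. A real image library is rarely a true random sample, so treat the numbers as a guide only.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion (e.g. the share of
    images carrying a given tag), assuming a simple random sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Compare a few candidate sample sizes.
for n in (100, 400, 1600):
    print(f"{n} images: about +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Note that quadrupling the sample only halves the margin of error, which is one reason a well-chosen sample of a few hundred images is usually a sensible place to start.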
Three ways to build an AI-powered taxonomy
In the interests of versatility, we will cover three different methods for creating a taxonomy using auto-tagging tools:
- Option 1: use a DAM with built-in AI tagging
- Option 2: go directly to an AI provider and use a demo version of a web-based image recognition service.
- Option 3: create a small API client. This method requires some development time, either your own or, if coding is not your forte, that of a developer hired for a few hours' work.
Option 1: use a DAM with built-in AI tagging
Most DAM systems have integrated artificial intelligence functionality. For instance, Fotoware Alto has an inbuilt Clarifai connector that allows content to be auto-tagged. You can simply request a free trial from us and get set up quickly, so that you can begin the process. To do so, simply fill in your details here.
Using a DAM provider like Fotoware also means that you will have access to features that make the whole process of building a taxonomy vocabulary much easier, such as a native option to export a list of keywords based on your image collection. We’ll cover why this is important soon.
Option 2: Using an AI Web Demo
If you choose to go direct to an artificial intelligence provider with a browser-based trial, there are various options to consider. If you would like to use one of the providers we use frequently here at Fotoware, then head on over to the Clarifai demo page. There are also other free online demos to consider, such as Imagga and Microsoft Azure.

A glimpse at Clarifai’s browser-based image recognition demo.
Using these tools might be the best option if you only have a small sample of images to build your taxonomy vocabulary from. Because these are just browser-based demo versions, they come with no option to export keywords en masse, so be prepared to copy and paste!
Option 3: build your own API-based tagging tool
If you opt to build your own API client, you also have a range of options. As well as Clarifai’s own API, you can look at other artificial intelligence providers that give out free API keys for development purposes, including Google Vision, DeepAI, Amazon Rekognition and CloudSight. Be aware that the free versions of the API keys are limited in various ways and the full versions are paid. Privacy terms also differ between providers, so check which works best for your purposes and read the small print from each one before you start any development.
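To show roughly what such a client involves, here is a minimal sketch using only the Python standard library. Everything provider-specific in it is a placeholder we made up for illustration: the endpoint URL, the bearer-token header, the raw-bytes upload and the response shape do not belong to any real service. Each provider defines its own request format and authentication, so check their documentation before adapting this.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint and key from your provider.
API_URL = "https://api.example.com/v1/tags"
API_KEY = "your-api-key"


def build_request(image_bytes: bytes, api_url: str = API_URL,
                  api_key: str = API_KEY) -> urllib.request.Request:
    """Build a POST request that uploads raw image bytes (assumed upload style)."""
    return urllib.request.Request(
        api_url,
        data=image_bytes,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def parse_tags(response_body: str, min_confidence: float = 0.8) -> list[str]:
    """Extract tag names from the assumed response shape:
    {"tags": [{"name": "beach", "confidence": 0.97}, ...]}
    """
    payload = json.loads(response_body)
    return [
        tag["name"].strip().lower()
        for tag in payload.get("tags", [])
        if tag.get("confidence", 0.0) >= min_confidence
    ]


def tag_image(path: str) -> list[str]:
    """Send one image file to the service and return its tags."""
    with open(path, "rb") as image_file:
        request = build_request(image_file.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_tags(response.read().decode("utf-8"))
```

Keeping the response parsing in its own function means you can try it on a saved sample response before spending any of your free API quota. The confidence threshold of 0.8 is an arbitrary starting value, not a recommendation.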
From tags to taxonomy in two steps
The process is simple, with only two steps:
- Collect auto-generated tags: Harvest keywords through one of the three aforementioned methods.
- Begin text analysis: Use a text analysis tool to prioritise and remove duplicates.
Step 1: collect auto-generated keywords
Once you’ve selected the method that works best for you, you can start processing your images and harvesting the keywords that will go on to form your taxonomy structure.
Looking through your list of keywords, you’ll notice that some may have been misidentified. Studying how frequently each keyword appears, which you will do shortly, should filter these out.
There are also likely to be many duplicates, but don’t worry: the next step in the process is to begin ‘digesting down’ the data you have captured.
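If your keywords arrive as one list of tags per image, a small script can merge them into a single flat list ready for analysis. This is a sketch under our own assumptions: the dictionary layout and the file names are hypothetical, and your DAM export or API output may look different. Duplicates across images are kept on purpose, since how often a term appears is exactly what the next step measures.

```python
def flatten_keywords(tags_per_image: dict[str, list[str]]) -> list[str]:
    """Merge per-image tag lists into one flat keyword list.

    Tags are lowercased and trimmed. Each tag counts at most once per image,
    but repeats across different images are kept.
    """
    keywords = []
    for tags in tags_per_image.values():
        seen_in_image = set()
        for tag in tags:
            cleaned = " ".join(tag.lower().split())
            if cleaned and cleaned not in seen_in_image:
                seen_in_image.add(cleaned)
                keywords.append(cleaned)
    return keywords


# Hypothetical harvest from three images.
harvest = {
    "IMG_001.jpg": ["Beach", "Sea", "beach "],
    "IMG_002.jpg": ["beach", "Sun  lounger"],
    "IMG_003.jpg": ["Office", "Laptop"],
}
print("\n".join(flatten_keywords(harvest)))
```

The printed list, one keyword per line, can be pasted straight into a spreadsheet or a text analysis tool.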
Step 2: clean, sort, and refine with text analysis
To do the text analysis, we will be using a free tool called Textalyser. This tool will sort through your keywords and find out which are used most frequently. The idea is to begin making a second list of terms, ideally in a new spreadsheet, that can be used directly for your taxonomy.

After you have copied your list of keywords into Textalyser, you will see a results page. On this page, the two areas you need to pay attention to are “Frequency and top words” and “2 word phrases frequency”. Before you copy the data from these areas into a new spreadsheet, look through your results and study the frequency percentage. This shows, as a percentage, how frequently a word or phrase has been listed. Choose the percentage below which your results are no longer relevant and use it as the cut-off point where you stop copying. See the screenshot below for an example of the frequency percentage.

A freshly generated flat taxonomy.
There is no single correct percentage to use as a cut-off point. Instead, look through your results and judge where they stop being relevant. Setting the cut-off too low will mean that you include anomalies and ‘AI mis-fires’ in your taxonomy. Also keep in mind that you can always add back any omitted terms, so don’t go low just to see them included.
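If you would rather script this step than paste into a web tool, the same idea can be sketched in a few lines of Python. This is not Textalyser’s own algorithm, just a simple frequency count that treats each keyword as one term (so two-word phrases stay intact). The function names, the sample data and the 20% cut-off are illustrative assumptions only.

```python
from collections import Counter


def frequency_table(keywords: list[str]) -> list[tuple[str, int, float]]:
    """Return (term, count, percentage of all keywords), most frequent first."""
    counts = Counter(keywords)
    total = sum(counts.values())
    return [(term, count, 100 * count / total) for term, count in counts.most_common()]


def apply_cutoff(table: list[tuple[str, int, float]], min_percent: float) -> list[str]:
    """Keep only the terms whose frequency percentage reaches the cut-off."""
    return [term for term, _count, percent in table if percent >= min_percent]


# Hypothetical harvested keywords, one entry per occurrence.
keywords = ["beach"] * 5 + ["sea"] * 3 + ["dog", "umbrella"]

table = frequency_table(keywords)
for term, count, percent in table:
    print(f"{term}: {count} ({percent:.0f}%)")

print(apply_cutoff(table, min_percent=20))
```

Try a few different values for the cut-off and review the terms that drop out each time. As with the manual method, choosing where to stop is a judgment call.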
This step is more of an art than a science, and it is a good point in the process to involve your internal test group for a review. Some DAM, content management or intranet software can export information from user search queries. Using this information, it becomes possible to directly compare and contrast your AI-generated taxonomy against a ‘crowdsourced’ human test group. If you would like guidance in this area, a professional taxonomist could prove very helpful - see the end of the blog for more information.
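One simple way to make that comparison is to check which terms overlap. The sketch below assumes you have two plain lists, one of AI-generated terms and one of terms taken from exported search queries; the function name and the sample terms are hypothetical, and it only matches exact terms after lowercasing.

```python
def compare_vocabularies(ai_terms: list[str], search_terms: list[str]) -> dict[str, list[str]]:
    """Compare AI-generated terms with the terms users actually search for."""
    ai = {term.strip().lower() for term in ai_terms}
    searched = {term.strip().lower() for term in search_terms}
    return {
        "shared": sorted(ai & searched),         # confirmed by real searches
        "only_ai": sorted(ai - searched),        # tagged, but nobody searches for it
        "only_searched": sorted(searched - ai),  # searched for, but missing from the taxonomy
    }


result = compare_vocabularies(
    ai_terms=["Beach", "sea", "umbrella"],
    search_terms=["beach", "Sunset"],
)
for group, terms in result.items():
    print(group, terms)
```

The "only_searched" group is often the most useful to review with your test group, as it points to terms your users expect but the auto-tagging did not produce.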
Once you have copied this onto a new spreadsheet, you have created the backbone of your own basic taxonomy. This can now be used for a variety of purposes: improving it by adding a hierarchy or synonyms (for which our sample taxonomy download could prove interesting), noting down interesting observations about your content, or even creating your own instance on a content management system.
Taxonomies don’t have to be intimidating
Taxonomies can often be an area of worry, one that can intimidate new users when starting out in content management.
What we set out to do with this blog post is to prove that creating a basic taxonomy doesn’t have to involve a great deal of work, patience or even foresight.
A taxonomy is not static. Instead, it grows and adapts over time as content changes.
Image recognition can be used for many things. In this case, we reverse-engineered the use of artificial intelligence to create a purpose-built taxonomy from a sample collection of images. Once created, this taxonomy can provide a framework for manual additions or be handed to a taxonomist to improve.
Turn metadata into a strategic asset
Book a meeting to see how Fotoware connects AI, taxonomy, and structured metadata into one powerful content system.