Why you need to organise your law firm's data more effectively — and how to do it

Why does it help to classify your legal work, experience, knowledge and other information in a law firm, asks Graeme Johnston, a software entrepreneur in the legal sector, and how do you go about it?
My thoughts about this topic have evolved over many years: first in the 1990s and 2000s, while studying law, then working as a lawyer and writing about law; then in the 2010s, while setting up and running a legal software company; and in the 2020s, while establishing and developing an open source community with legal taxonomy issues at its heart and doing related work for law firms and legal departments.
Whatever you think of the latest round of artificial intelligence (AI), one thing that’s been interesting to observe over the past couple of years is how it has motivated organisations to pay more attention to the classification of data. The realisation is dawning more widely that, unless you define which concepts, relationships and distinctions matter to you, you’re going to get whatever the models give you, which isn’t necessarily what you need.
Key areas to consider
I think of the four major areas in which classification can add value to a law firm as, in no particular order: (1) delivering legal services, including pricing; (2) sales and marketing; (3) people – training, staffing, credentials; and (4) knowledge – developing and effectively sharing what the firm and its people know and can do. They overlap in ways that will be well known to you as a reader of the Journal but are also distinct in the sense that they tend to have their own teams of people and characteristic software applications.
Addressing this topic in a way that is joined up across those four areas can help you with a coherent approach to identifying:
- what is more or less profitable – so you can reinforce success and tackle the challenges, be they of a process, pricing, people, knowledge or other nature;
- what areas of work are waxing or waning – so you can decide how to invest in marketing, sales and knowledge; and
- where you need more people with certain skills, whether that means newly qualifieds, partner promotions or laterals.
Addressing this in practice involves a mixture of search (‘finding a needle in a haystack’) and analysis or summarisation (‘understanding what the different types of needle, and indeed haystack, amount to’). Better classification of matters, people’s experience and knowledge helps with this. A lot.
Common problems
If your classification schemes are fragmented across your various software systems, it becomes hard, and expensive, to join these things up. In practice, the classification schemes used in most law firms tend to suffer from several types of problem:
- taxonomies of variable quality;
- structural design weaknesses in taxonomies;
- taxonomies which are fragmented, with unnecessary and unhelpful differences between the various systems and groups mentioned above; and
- poor application to data in practice – where the rubber hits the road.
These problems also tend to become worse over time. Think of it as ‘taxonomy entropy’. Classification systems become increasingly disorganised as concepts are added piecemeal without sufficient consideration of the overall structure and context. And the failure to apply them accurately to data can give the sense that it’s all hopeless: not even worth tackling.
On the taxonomy design side of this problem, an important part of the answer is ‘faceting’ – breaking down classification into different aspects (like process types, sectors and legal specialisms) that can then be combined. For example, ‘employment’ would be a legal specialism and ‘litigation’ would be a process type, so that ‘employment litigation’ is expressed not as a subtype of ‘employment’ or ‘litigation’ but as a combination of the two.
This approach is flexible and efficient. To illustrate the arithmetic: with only three facets of 10 types each, you can express 1,000 combinations (10 to the power of three). In practice, a half-dozen or so facets with a modest number of types in each will give most law firms what they need.
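For readers who like to see the mechanics, the faceting idea and its arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the facet names and values below are invented for the example and are not the noslegal facets, and the helper functions are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical facets for illustration only; not the noslegal facet lists.
FACETS = {
    "legal_specialism": ["employment", "property", "tax"],
    "process_type": ["litigation", "transaction", "advice"],
    "sector": ["energy", "retail", "public"],
}


def count_combinations(facets):
    """Number of distinct combinations when one value is picked per facet."""
    return prod(len(values) for values in facets.values())


def all_combinations(facets):
    """Every combination, as a dict mapping facet name to the chosen value."""
    names = list(facets)
    return [dict(zip(names, combo)) for combo in product(*facets.values())]


def matches(record, **criteria):
    """True if the record carries every requested facet value."""
    return all(record.get(name) == value for name, value in criteria.items())


# A matter is tagged with one value per facet, rather than with a single
# pre-combined label such as "employment litigation".
matter = {
    "legal_specialism": "employment",
    "process_type": "litigation",
    "sector": "retail",
}

# Three facets of 10 types each, as in the arithmetic in the text.
ten_by_three = {
    f"facet_{i}": [f"type_{j}" for j in range(10)] for i in range(3)
}

print(count_combinations(FACETS))        # 27
print(count_combinations(ten_by_three))  # 1000
print(matches(matter, legal_specialism="employment", process_type="litigation"))  # True
```

The point of the design is that 'employment litigation' never has to exist as its own entry: it is found by filtering on two facets at once, and adding a new sector or process type does not require touching the other lists.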
Keeping it simple
One really important thing I would suggest is to balance expressivity with simplicity. While hierarchical structures (types and subtypes) can be valuable for both reporting and searching purposes, too many levels can become impractical. Even a large firm can benefit from relatively simple facets, each with only two or three levels, supplemented by reporting or search tools when needed.
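As a rough sketch of what 'two or three levels, supplemented by reporting tools' can look like, the Python below rolls detailed tags up to their top-level type for reporting. The facet values and function names are assumptions made up for this example, not part of any particular firm's taxonomy or of noslegal.

```python
from collections import Counter

# Hypothetical two-level facet: each subtype maps to its parent type.
# None marks a top-level type.
LEGAL_SPECIALISM_PARENT = {
    "employment": None,
    "employment/discrimination": "employment",
    "employment/pensions": "employment",
    "property": None,
    "property/leases": "property",
}


def top_level(value, parents):
    """Walk up the hierarchy until a top-level type is reached."""
    while parents[value] is not None:
        value = parents[value]
    return value


def roll_up(values, parents):
    """Count records by top-level type, whatever level they were tagged at."""
    return Counter(top_level(value, parents) for value in values)


# Matters tagged at a mixture of levels.
tagged = [
    "employment/discrimination",
    "employment",
    "property/leases",
    "employment/pensions",
]

report = roll_up(tagged, LEGAL_SPECIALISM_PARENT)
print(report["employment"], report["property"])  # 3 1
```

With shallow hierarchies like this, people can tag at whatever level of detail they are confident about, and reports still add up at the top level.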
Through the noslegal project (a voluntary, not-for-profit open source initiative), we’ve developed core facets that are starting to be used, with configuration, by major firms including Shepherd and Wedderburn and A&O Shearman. We released our first two versions in 2022 and 2024, and the third version at the end of March 2025.
The noslegal facets are designed to be jurisdiction-agnostic and extensible. Extensible means that you can add detail where needed for:
- your particular firm’s specialisms; and
- the distinctions that matter in particular parts of your firm – for example, Business Development will often have less need for a detailed legal specialism breakdown than, say, Knowledge.
Even though you can and should extend where it is useful to do so, bear in mind the importance of simplicity. Insist on having a realistic business need for each distinction and a workable way to apply the taxonomy accurately. Then follow through to make sure it happens properly and that you learn from the inevitable errors.
Avoid the pitfalls
Probably the most common taxonomy-related failures I’ve seen in law firms are to:
- spend ages debating and over-specifying the taxonomy, in ways that make people grumpy, go beyond what’s useful and are not applied well in practice;
- not spend enough time addressing how it will be applied in practice, focusing on needs and practical realities and challenging over-complication; and
- not spend enough time following up to improve the process, the taxonomy and the quality of its application to data.
These errors can lead to a large waste of money, energy and goodwill, together with a lost opportunity. But if you keep it simple and focus on practical usefulness and realistic processes – all in the context of your firm’s systems and what you’re trying to do – then it can be valuable.
A few words on AI. Recent developments appear to present new opportunities for taxonomy development and application to data. Some care will be needed to apply these thoughtfully, but in my view an appropriate strategy is likely to involve a fairly high-level human-designed taxonomy such as noslegal or your version of it – while leaving the AI tools to make the finer distinctions ‘on the fly’ when you’re searching for something or trying to make sense of a dataset.
In the context of Scotland specifically, I think there’s a real opportunity for firms of medium size – tens or low hundreds of people – to address this topic now. At that sort of size you’ll typically have a sufficiently complex practice that would benefit from better insights into what’s going on, but without the level of organisational complexity that can slow down getting things done in really large firms.
Written by Graeme Johnston, a software entrepreneur and former Herbert Smith Freehills partner