Recently, SSTC was engaged by an overseas company to help them position their Secure Coding Practices courses in the UK. Because this particular arrangement was new to both companies, the boilerplate contracts we had on hand were not quite right for the business relationship both companies were after, so we had to source or create new ones. Boilerplate commission and referral agreements are, of course, not hard to source, but tailoring one for use in a multinational agreement does present some challenges. SSTC has been watching the growing integration of LLMs into all aspects of business, and we decided this would be an interesting opportunity to test how various large language models deal with the problem of legal document refinement, and how that compares with the way humans would approach it, with a view to understanding how these models will be further integrated into our working lives in the future.
We first trialled a service called Genie AI, which markets itself as a legal-specific language model. At the recent Risk AI conference, one of the more salient points I noticed was that several people suggested more specialised, discrete large language models are likely to become the norm for specialised roles and tasks, as specialisation improves the accuracy of responses, minimises hallucination, and drastically reduces the compute needed to run a model. Rather than requiring enterprise-scale compute, future LLMs could run on machines at the SME or even consumer level. The legal-specific LLM did make some useful suggestions, including language localisation, with specific advice that some legalese may not translate as direct cognates. It also singled out that the country we were contracting with takes a particular interest in protecting the intellectual property rights of its companies. This was interesting because our opposite legal number had pointed out the same thing, showing a congruence between human and machine advice. That said, the generalised large language model we used caught most of these suggestions, with the exception of the specific emphasis on intellectual property. Without knowing the inner workings of Genie AI’s model, it is hard to say whether this small but noticeable increase in accuracy is actually a function of it using a more specialised model, but it is certainly interesting to note.
When we switched to a general AI, it made many of the same useful suggestions, including adding a language clause and a localisation clause, and covered much of what Genie AI had raised. It agreed with recommending a neutral venue for arbitration, though it noted that the specific mention of Vienna was considered too far away for both parties, and it also suggested including a force majeure clause, along with clearer specification of all times, currencies, and other placeholders. To my eye, these recommendations would not have put a legal practitioner in a much worse position than if they had relied on a legal-specific AI, which makes the utility of legal-specific models as a differentiated tool a bit questionable at the moment. That said, being able to run an AI discretely, and therefore offline, may be a significant advantage, and one that these smaller models might soon afford.
When we used a general AI to compose a table comparing recommendations for the two legal jurisdictions, the humans reviewing the table found many similarities between the recommendations for each. However, synthesising actionable advice from the comparison proved challenging, as the details required deeper legal expertise to interpret effectively. When asked, by contrast, the general AI was able to recommend a set of actions that would bring the document into further compliance with BOTH jurisdictions, a task that would seemingly require one or several human experts and a considerable amount of time and effort. As the jurisdiction we are looking to contract with is also quite a niche business partner for UK businesses, it is not even a given that experts in both legal systems would be on hand to make such a recommendation.
Both the legal and general-purpose AIs excelled at giving clear, actionable advice of a kind that would probably have taken human practitioners a large amount of time and resources to produce. This can be a bit of a double-edged sword: as discussed below, the advice can occasionally be a bit rough around the edges on very fine, niche, specific details. This coupled with the availability and clarity of the instructions
The generalised LLM made an error of omission in not noting the particular emphasis the foreign jurisdiction places on intellectual property rights. If we had not also been running the document through a legal-specific LLM for refinement, we would have needed a legal expert in that foreign jurisdiction to catch this error, which limits how far we can currently rely on an LLM alone for document refinement.
The general AI showed clear advantages over the legal-specific AI in how digestible its recommendations and language were. However, compared with the legal AI, it does sometimes make errors of omission. Using both models may end up yielding the best results for teams seeking legal solutions that are accurate but conveyed in a manageable, non-legalese fashion.