By Michael Borella --
When boiled down to a fundamental level, all technologies are double-edged swords. A spear can be used to hunt game or to wage war. A hammer can be used to build a shelter or to murder fellow humans. Social media can be used to connect lonely and geographically distanced affinity groups in an emotionally meaningful way or to foster misinformation and possibly even genocide.
Artificial intelligence (AI) is no different.
We are largely unaware of the prevalence of AI. From content recommendation to financial fraud detection, drug discovery, and spam filtering, these computational models operate in the background of everyday life, shaping it in ways we rarely notice. But the slow rollout of autonomous vehicles and the significantly faster adoption of personal digital assistants are more overt examples.
In science fiction, there is no shortage of utopian stories in which menial tasks carried out by humans are performed by various types of robots, ostensibly leaving humans with more time to think, create, relax, and enjoy life. In reality, replacement of human labor with non-intelligent automation has so far proven to be disruptive to many societies. For the most part, knowledge workers such as lawyers have escaped this disruption. We believe, perhaps arrogantly, that the value we provide to our clients requires a generalized intelligence and a sense of empathy that is missing from modern AI. Thus, we may look down our noses at the thought of incorporating AI into our workflows.
It is time to reassess that viewpoint.
All lawyers, especially those of us in patent law, employ various types of technological assists. We draft, edit, and review on computers. We look things up in search engines and Wikipedia. We use docketing software and reminders to stay on top of our schedules and deadlines. The technology-assisted lawyer is already here, and those who eschew these technologies are hard-pressed to keep up.
But let's not forget the aforementioned dual nature of these technologies. The same tools that help us do our jobs can also distract us with non-stop notifications; search engine results can be misleading, and Wikipedia can contain mistakes. Yet we've adapted by applying the same skeptical and inquisitive frame of mind that makes us suited for the profession. We take non-verified information for what it is -- information -- and when in doubt, we double- and triple-source it. Indeed, being a patent attorney who needs to understand new and complex science and engineering inventions would be frustrating and difficult without our technological assists, even accepting that they are not 100% reliable.
Generative AI is yet another assistive tool, though with bigger caveats.
The latest large language models, such as ChatGPT, are remarkably good at producing human-like text focused on a particular topic. While ChatGPT's output generally falls far short of the work of a well-trained and experienced human author, it can often exceed the writing quality of an average human. Thus, it is premature to say that patent lawyers (or other types of lawyers) are going to be replaced anytime soon. However, ignoring the trends in large language models may leave some of us gradually rendered obsolete.
Currently, these models are useful yet unreliable. They frequently produce insightful results, but they can also "hallucinate" pure nonsense, falsehoods, and fabrications. This is because they are little more than sophisticated sentence autocompleters, with arguably no understanding of what they write. Still, their output can be cogent and detailed, in the form of a paragraph or essay.
Thus, one might be tempted to cut and paste these results into a legal document. Of course, that would be a mistake, if only because of those hallucinations. Instead, large language model output, when relevant, should be edited and/or recrafted. In a sense, this is not that different from what one might do when paraphrasing or otherwise incorporating information found in a web search, in case law, or on Wikipedia. But large language models put you closer to the finish line by writing a first draft for you.
For example, when writing a patent application, one might describe how an invention can be used -- say, it makes a handful of existing technologies more efficient, faster, or otherwise better. We can spend an hour or two writing descriptions of each of these technologies from scratch, or we can farm that task out to ChatGPT. Within minutes it will provide workmanlike descriptions that can be edited for stylistic consistency and accuracy,[1] and then we can explain how the invention improves each.
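To make this concrete, here is a minimal sketch of what farming that task out might look like programmatically. It assumes the OpenAI Python client (v1.x) with an API key in the environment; the model name, prompt, and example technology are illustrative only, and the result still needs the editing described above.

    # A hedged sketch: ask a large language model for a first-draft description
    # of a well-known technology, to be checked for accuracy and edited for
    # style before it goes anywhere near a filing. Assumes the OpenAI Python
    # client (v1.x) with an API key in the OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    prompt = (
        "In two paragraphs, describe how a content delivery network caches "
        "and serves web resources. Use neutral, technical prose suitable for "
        "the background section of a patent application."
    )

    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )

    # A starting point only, not a finished draft.
    print(response.choices[0].message.content)

The point is not the particular API call but the division of labor: the model produces the workmanlike first pass, and the attorney supplies the accuracy check, the judgment, and the explanation of how the invention improves on what is described.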
Not unlike junior associates or talented paralegals, today's large language models can help us shave a couple of hours off each application by helping us write the background section and describe the prior art. Beyond that, current technology is hit or miss. For example, ChatGPT can draft patent claims, but there are numerous reasons not to use it for this purpose.
Like their predecessors, these tools can be used for various purposes, some constructive and others destructive.[2] The key is to use them for what they are good at doing, and not for tasks at which they are likely to fail. Over time, ChatGPT may evolve to a point where it can automate even more of the drafting process, perhaps even taking a first pass at office action responses, as well as validity or invalidity arguments. This may be as little as five to ten years out, though no one knows for sure whether these models will continue to improve at their current pace or hit some unforeseen plateau.
Regardless, the AI-assisted patent attorney is just the latest iteration of the technology-assisted patent attorney. As the world changes, we need to be flexible and adapt to new professional and business realities. ChatGPT and the many rival models now being developed and launched represent just one of these realities.
[1] In this scenario, ChatGPT is likely to provide a reasonably on-point result because it is describing something well-known.
[2] One of the more troubling abilities of ChatGPT is that it can reduce the marginal cost of generating massive amounts of disinformation to nearly zero. In the wrong hands and without safeguards built into the model, it may not be long before our social media and news channels are overflowing with nonsense at a level well beyond what we already see. This is not quite what Orwell predicted, but likely just as bad.
Will Artificial Intelligence Force Us to be Less Dumb about How We Evaluate Humans?
By Michael Borella --
Years ago, I was a proud parent when my children were invited to participate in an honors math program at their grade school. But this initial delight turned to confusion, and eventually frustration.
As just one example of why I was less than pleased with our school's pedagogy, one heavily emphasized part of the curriculum required that the kids memorize as many digits of pi as they could, with the minimum being 25. Sure, they also learned that pi is the ratio of a circle's circumference to its diameter and how to use it in simple algebra, but this memorization task was the focus of the unit, with the child who memorized the most digits (130 one year) winning special accolades.
To me, this assignment missed the point. Pi is a critical value in many aspects of science and engineering, and can be taught directly or indirectly in a number of compelling and fun ways involving wheels, pizza, spirographs, and so on. And its importance in aviation and communications can at least be mentioned.
But the focus was on committing those 25-plus digits to memory and being able to recite them on demand. When I pointed out to the teachers that maybe -- just maybe -- this was not the best way to prepare children to have an appreciation for STEM fields, they looked at me like I was from another planet. The curriculum was designed around what was easy to test (can the kid produce the 25 digits when asked?) rather than the harder-to-evaluate skills (does the kid know how and when to use pi to solve problems?) that are actually important when using math in the real world.[1]
Thus, when the news broke that OpenAI's GPT-4 large language model passed the Uniform Bar Exam at the 90th percentile, I was less than impressed. In fact, this outcome is completely unremarkable given that the model was trained on billions of units of human text.
The bar exam is a memorization exam. Aspiring lawyers typically spend 10-12 weeks taking a bar exam review course, which involves committing a massive body of legal rules and principles to memory, as well as learning how to write essays in a formulaic fashion (issue, rule, application, conclusion -- IRAC). Then you sit for two days of testing in which you regurgitate as much as you can. If you manage to score highly enough, you pass and become a licensed attorney.
During the summer that I spent preparing, I remember at one point mentioning in frustration to my study partner that what the bar exam actually tests is how much pain one is willing to accept to be a lawyer, and that rapping us across the knuckles a few times with a ruler would probably have the same effect. Indeed, I know of individuals who graduated law school in the top ten percent of their class (in terms of GPA), failed the exam on their first try, later passed, and went on to be excellent attorneys. Clearly, these folks were bright, but when speaking to them they attributed their failure (which was quite the source of shame) to not studying hard enough during bar review. Let that sink in -- top law students can fail to be licensed because they do not learn the mechanical proclivities of one specific exam.
A recent paper from Professor Daniel Katz evaluates GPT-4's bar exam performance and states that "These findings document not just the rapid and remarkable advance of large language model performance generally, but also the potential for such models to support the delivery of legal services in society."[2] The key word in this sentence is "potential," but even so the statement is misleading.
GPT-4 scored well on the bar exam not because AI is achieving human levels of intelligence, but because the bar exam tests a human's ability to perform like a robot. Missing from the bar exam are tests of executive function (e.g., staying organized, keeping to deadlines), soft skills (e.g., client interaction and counseling, interpersonal competencies), and law firm operation (e.g., finance, marketing, managing groups, how to be a good employer), all of which are more relevant to a lawyer's success than their ability to stuff facts into their brains.
Indeed, it is now widely accepted that GPA is much more predictive of a student's ultimate success than standardized test scores. This is because maintaining a high GPA requires more than the raw cognitive ability to do well on memorization-based exams -- the aforementioned executive functioning and soft skills play a significant role. Intellectual ability is important, but so is emotional intelligence.
Turning to patent law, there might be one multiple-choice question out of 200 addressing intellectual property on a typical year's bar exam. So for us patent attorneys, the bar exam measures our ability to regurgitate law that we are unlikely to ever apply in practice. To that point, the USPTO requires that we pass a separate patent bar exam. Admittedly, it is also memorization-based, but at least it is open book.
So, to the extent that Professor Katz is implying that GPT-4 or any other current-generation large language model can perform significant legal tasks, I have to disagree. Large language models are tools that lawyers can employ, not unlike search engines or Wikipedia. They may be able to carry out certain first-level research functions in place of a junior associate. But when it comes to crafting creative legal strategies that guide clients through complex transactions, they are still far from the mark.
Nonetheless, the strong performance of GPT-4 on memorization-based exams provides us with a golden opportunity to re-evaluate how we teach both children and law students. If the goal is to turn out humans with skills that can be easily replaced by automation, then maintaining the status quo will get us there. But we would be much better off by recognizing and embracing large language models, while remaining cognizant of their strengths and weaknesses. Integrating these tools into a broad-spectrum education system with a flexible curriculum is much more likely to produce graduates who can adapt to the changing needs of the legal profession, or any other field for that matter.
The modern education system is still based too much on a paradigm established in the 1800s, one in which an instructor lectures and the students passively receive their lessons. Given that large language models can outperform most humans in these scenarios, we need to seriously consider changing the system to meet the demands of 21st century life.
And for anyone who absolutely needs to know the first 25 digits of pi, don't worry because GPT has you covered: "The first 25 digits of pi (π) are: 3.14159265358979323846264. Note that pi is an irrational number, meaning that its decimal representation goes on infinitely without repeating." Or, it almost has you covered, as the 25th digit is missing from its output.
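For the curious, the discrepancy is easy to check. Here is a short sketch using Python's mpmath library (an assumption on my part -- any arbitrary-precision calculator would do) that compares the quoted value against an independently computed one.

    # Compare GPT's quoted value of pi against an independently computed one.
    from mpmath import mp

    mp.dps = 30  # compute with a few guard digits beyond the 25 we care about

    reference = mp.nstr(mp.pi, 25)        # "3.141592653589793238462643"
    quoted = "3.14159265358979323846264"  # the output quoted above

    print(reference)                        # 25 significant digits
    print(len(reference.replace(".", "")))  # 25
    print(len(quoted.replace(".", "")))     # 24 -- the final "3" is missing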
[1] To get a sense of how prevalent issues like this are in education, consider that 60 years ago Nobel laureate physicist Richard Feynman was asked to help the state of California select math textbooks for its schools. His account of the process is both humorous and disheartening. From what I have seen, today's textbooks are better than they were back then but still leave plenty of room for improvement . . . such as justifying why one needs a textbook, period.
[2] Katz, Daniel Martin and Bommarito, Michael James and Gao, Shang and Arredondo, Pablo, GPT-4 Passes the Bar Exam (March 15, 2023). Available at SSRN: https://ssrn.com/abstract=4389233 or http://dx.doi.org/10.2139/ssrn.4389233.