By Michael Borella --
Introduction
It seems like everyone is talking about artificial intelligence, especially the subset thereof referred to as machine learning. While some of the discussion is cast in terms of politically stirred angst about human jobs being replaced by robots or algorithms, a more informed and rational dialog would also set forth machine learning as a platform for the next great breakthroughs in science, technology, medicine, and lifestyle. Regardless of rhetorical positioning, machine learning represents a fundamental shift in how problems are solved across industries and lines of business. In the near future, a machine learning library may become a standard part of all computers, just as networking and database technologies have in the past.
For most of the history of computing, programmers wrote functions designed to take some input and produce a desired output. Machine learning inverts this paradigm. A data set (which in practice usually needs to be quite extensive) of mappings between inputs and their respective desired outputs is obtained. This data set is fed into a machine learning algorithm (e.g., a neural network, decision tree, support vector machine, etc.), which trains a model to "learn" a function that reproduces the mappings with reasonably high accuracy. In other words, if you give the computer a large enough set of inputs and outputs, it finds the function for you. This function may even be able to produce the correct output for input that it has not seen during training. The programmer (who has now earned the snazzy title of "data scientist") prepares the mappings, selects and tunes the machine learning algorithm, and evaluates the resulting model's performance. Once the model is sufficiently accurate on test data, it can be deployed for production use.
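For readers newer to the technology, the short Python sketch below shows this inversion in miniature using the scikit-learn library; the toy data set and the choice of a decision tree are purely illustrative assumptions, not a recommendation of any particular model.

```python
# A minimal sketch of the "inverted" paradigm: rather than writing the
# function ourselves, we give the algorithm example inputs and outputs
# and let it learn a mapping.  Data and model choice are illustrative.
from sklearn.tree import DecisionTreeClassifier

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]   # hypothetical feature vectors
outputs = [0, 1, 1, 0]                      # their desired labels

model = DecisionTreeClassifier()
model.fit(inputs, outputs)                  # "learn" the function from examples

print(model.predict([[1, 0]]))              # apply the learned function to input
```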
Such models are already in use today -- they suggest products you might want to purchase, and movies and music that you might find interesting. They also silently improve the quality of photos taken with your digital camera, help security screeners at airports and sports stadiums, detect financial fraud, and improve your online search results. And the real-world applicability of machine learning has yet to peak.
Naturally, innovators in machine learning, like innovators in any other industry sector, seek to protect their work with patents. Indeed, the number of patent application filings related to artificial intelligence and machine learning has been growing dramatically in the past several years, especially in the U.S. Nonetheless, inventors, applicants, and even patent attorneys have often struggled to adopt a claiming strategy for inventions incorporating machine learning.
Not surprisingly, the strategy to be taken when drafting such claims depends on the character of the invention and how machine learning is incorporated therein. Thus, there is no one particular "silver bullet" approach. Use of just a small number of guidelines, however, can help you focus your claim drafting approach in a direction that may bear the most fruit for your clients.
In short, these guidelines involve looking to several areas where innovation is most likely to occur in machine learning inventions: (i) the structure of the model, (ii) the training process, (iii) input data preparation, (iv) input data mapping to the model, and (v) post-processing and interpretation of output data from the model. Along with these five "positive" rules are two "negative" rules of what not to do: (i) do not mix the training phase and the execution phase in the same claim, and (ii) be careful with inventions that are no more than just conventionally applying an existing model to existing data.
Each will be addressed in turn. But throughout this discussion, it is important to remember that details matter. Like claims to most inventions, claims to a machine learning process must specify enough detail for the reviewer (e.g., patent examiner or judge) to be able to determine that the invention as claimed is substantial enough for patenting. High-level or vague claims are unlikely to meet the requirements of novelty and non-obviousness, much less subject-matter eligibility.
Claim the structure of the model
When the invention involves a new or unusual model structure, this aspect may be a candidate for claiming. For example, is a neural network with a particular pattern of layers or number of neurons per layer key to providing a desirable result? Or are multiple neural networks used in parallel or tandem? Taking this point one step further, the best known solutions for some problems involve ensembles of two or more models. If your problem is well-addressed by an ensemble, and the structure of the ensemble is new, that might be the starting point of a claim.
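As a rough illustration (not drawn from any particular claim), the Python sketch below uses scikit-learn to define a network whose hidden-layer pattern is spelled out explicitly and then combines it with a second model in a simple voting ensemble; the layer sizes and component models are assumptions invented for the example.

```python
# Illustrative only: a network whose hidden-layer pattern (128 -> 32 neurons)
# is specified explicitly, combined with a second, different model in a
# simple ensemble.  Sizes and component models are made up for the example.
from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

net = MLPClassifier(hidden_layer_sizes=(128, 32))

ensemble = VotingClassifier(
    estimators=[("net", net), ("tree", DecisionTreeClassifier())],
    voting="soft",   # average the models' predicted probabilities
)
```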
Claim the training process
Especially when an unconventional model is used, the training of this model may be unconventional as well. This provides another avenue for claiming. For instance, are parts of the model trained with specific subsets of the input data? Or is the model trained in phases? Does the training employ a clever form of parallel processing in order to reduce training time? Regardless, if the execution of the trained model (i.e., its application to input data) is carried out in an orthodox manner, the training might be more easily protected and should be carefully considered. On the other hand, detecting an infringing training phase might be difficult once the training is complete and the model is commercially deployed.
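A toy example of phased training, assuming randomly generated placeholder data and an off-the-shelf scikit-learn classifier, might look like the following; the split into two phases is the only point being illustrated.

```python
# Hypothetical phased training: the model is first fit on one subset of the
# data and then refined on a second subset, rather than trained all at once.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))             # placeholder feature vectors
y = rng.integers(0, 2, size=1000)           # placeholder labels

model = SGDClassifier()
model.partial_fit(X[:600], y[:600], classes=np.unique(y))  # phase 1
model.partial_fit(X[600:], y[600:])                        # phase 2: refinement
```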
Claim the input data preparation
Data scientists spend a surprising amount of time preparing their data for introduction to a model. Real-world data is messy and often needs to be normalized, transformed, have outliers removed, or otherwise processed so that its characteristics can help the model produce quality results. Often, this is a trial-and-error procedure, with the data scientist attempting several approaches before finding one that is satisfactory. For instance, some natural language processing models might utilize word counts, but common words such as "and", "the", "it", and so on may be removed from these counts in order to force the model to focus on the words with a contextual meaning closer to that of the problem being addressed.
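In plain Python, that stop-word example might look like the sketch below; the stop-word list and the sample sentence are, of course, just illustrative assumptions.

```python
# Common words are dropped before counting so that the counts emphasize
# contextually meaningful terms.  The stop-word list and text are made up.
from collections import Counter

STOP_WORDS = {"and", "the", "it", "a", "of", "to"}

def word_counts(text):
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    return Counter(words)

print(word_counts("The model learns the mapping and it generalizes"))
# Counter({'model': 1, 'learns': 1, 'mapping': 1, 'generalizes': 1})
```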
Claim the input mapping to the model
Once you have chosen your model and prepared your data, you have to map the input data to the model's input. This mapping often follows naturally, as the model and input data preparation are selected to work together. Nonetheless, this mapping can be of interest. For example, a neural network that classifies sections of black and white images might have 64 inputs, one for each pixel in an 8x8 patch of an image. The individual inputs may be numbers representing the intensity (brightness) of the respective pixels. To the extent that such a mapping is innovative, it is fodder for a claim.
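The 8x8 patch example can be sketched in a few lines of Python with NumPy; the image dimensions and the scaling into [0, 1] are assumptions made for the sake of illustration.

```python
# Each pixel's intensity becomes one of the model's 64 inputs by flattening
# an 8x8 patch into a vector.  The image and its dimensions are made up.
import numpy as np

image = np.random.randint(0, 256, size=(480, 640))   # grayscale image
patch = image[0:8, 0:8]                               # one 8x8 patch
model_input = patch.flatten() / 255.0                 # 64 intensities in [0, 1]

assert model_input.shape == (64,)
```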
Claim the post-processing and interpretation of the output data
Just because a model provides a result, even a desirable result, does not mean that the overall machine learning pipeline is complete. In some cases, the raw output of a model has to be transformed, normalized, or run through another algorithm to provide useful output data. In other cases, and as hinted at above, the output of one model may be used (with or without intermediate processing) as input to another model. For certain classes of models, part of the model itself is the output -- perhaps a particular layer of a neural network that encodes a semantic meaning of the input.
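As one hedged example of such post-processing, the sketch below converts a model's raw scores into probabilities and reports a label only when the model is sufficiently confident; the scores, labels, and threshold are invented for illustration.

```python
# Post-processing a model's raw output: convert scores (logits) to
# probabilities and map them to a label only above a confidence threshold.
import numpy as np

logits = np.array([2.1, 0.3, -1.2])              # raw model output (made up)
probs = np.exp(logits) / np.exp(logits).sum()    # softmax
labels = ["cat", "dog", "bird"]

best = int(np.argmax(probs))
result = labels[best] if probs[best] > 0.8 else "uncertain"
print(result, probs.round(3))                    # cat [0.832 0.137 0.031]
```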
Do not mix the training phase and the execution phase in the same claim
A machine learning model is first trained and then deployed for production use. Thus, it is likely that the entity performing the training and the entity executing the model will be different. Accordingly, combining steps or elements directed to both training and execution in a single claim can result in divided infringement, where no one entity performs all of the claimed steps and the claim becomes difficult to enforce against either. Instead, draft separate independent claims for the training and execution phases. In cases where the execution phase does not seem to have enough substance to stand alone, the details of the training phase can be provided in passive clauses (e.g., "wherein the model was trained by comparing random pixel inputs to ground-truth image classifications . . .").
Don't claim conventionally applying an existing model to existing data
As with any type of technology, some solutions are less patentable than others. If the invention at hand applies an off-the-shelf model to a known data set in a non-specific manner, and none of the above "positive" rules are pertinent, then the machine learning aspects might not be where to focus your claiming efforts. While machine learning is relatively new, the general concept of applying a generic model to a data set is likely to be considered obvious, at a minimum. Instead, look to other aspects of the invention for protectable innovation.
Conclusion
While machine learning will almost certainly remain an active area of patenting in the coming years, the path to obtaining meaningful protection for machine learning inventions is strewn with pitfalls. Although the above guidelines are not exhaustive, following them will help you avoid most of the elementary traps.
And finally, a note to patent agents and attorneys attempting to break into machine learning -- do your homework. Do not treat machine learning as a black box technique that can be added to claims as an afterthought. Instead, educate yourself in how and why machine learning works. Read papers and books, watch videos, take a course, do some programming -- use whatever means are available. Doing so will dramatically enhance your ability to draft meaningful claims (and other parts of patent applications) that employ these technologies.
In a vacuum, this might contain some nice tidbits.
In the Alice-driven world? Not so much.
Posted by: Skeptical | November 27, 2018 at 12:33 PM
I've had good experiences avoiding and overcoming Alice issues following this approach.
Posted by: Mike Borella | November 27, 2018 at 03:16 PM
Thanks Mike. I do not doubt your personal reflections, but would add (and link to this story as somewhat of a confirmation: http://www.ipwatchdog.com/2018/11/28/artificial-intelligence-technologies-facing-heavy-scrutiny-uspto/ ), that for every one of your "successes," four others likely fail with the same approach.
The link does point out that an additional case after Alice (Electric Power Group) is having a highly negative effect.
Posted by: Skeptical | November 28, 2018 at 08:30 AM
Interesting and timely article. I certainly could poke holes in the analysis (such as, looking just at 101 rejections for AI applications is misleading when the same trend likely applies to software patents as a whole). But it is hard to draw conclusions from USPTO data, period, so I applaud the efforts. Nonetheless, I don't think that the situation is as dire as it would seem, and I suspect that if one were to look at the claims of these applications, one would see that most of the applications being rejected have broad and vague language.
Posted by: Michael Borella | November 29, 2018 at 08:00 AM