By Michael Borella --
Machine learning is more than just a buzzword. It represents a fundamental shift in how problems are solved across industries and lines of business. In the near future, a machine learning library may become a standard part of all operating systems, just like TCP/IP and database technologies have in the past.
For most of the history of computing, programmers wrote functions designed to take some input and produce a desired output. Or, if i represents the input and o represents the output, the goal of the programmer was to develop a function f such that o = f(i).
Machine learning inverts this paradigm to some extent. A data set (which in practice usually needs to be quite extensive) of mappings between inputs and their respective outputs is obtained. This data set is fed into a machine learning algorithm (e.g., a neural network, decision tree, support vector machine, etc.) which trains a model to "learn" a function that produces the mappings with a reasonably high accuracy. In other words, if you give the computer a large enough set of inputs and outputs, it finds f for you. And this function may even be able to produce the correct output for input that it has not seen during training.
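The inverted paradigm described above can be sketched in a few lines of code. This is a hypothetical, minimal illustration, not any particular patented technique: the "machine learning algorithm" is simple least-squares line fitting, and the data set and function names (`learn_f`, `data`) are invented for the example.

```python
# A minimal sketch of the inverted paradigm: instead of hand-coding f,
# we estimate it from a data set of (input, output) pairs. Here the
# "machine learning algorithm" is least-squares fitting of a line.
def learn_f(pairs):
    """Fit o = a*i + b to the given (i, o) pairs by least squares."""
    n = len(pairs)
    mean_i = sum(i for i, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((i - mean_i) * (o - mean_o) for i, o in pairs)
    var = sum((i - mean_i) ** 2 for i, _ in pairs)
    a = cov / var
    b = mean_o - a * mean_i
    return lambda i: a * i + b

# Training data implicitly generated by f(i) = 2*i + 1; the programmer
# never writes that function down.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]
f = learn_f(data)
print(f(10))  # an input never seen during training
```

The last line makes the article's point about generalization: the learned f produces a sensible output for an input absent from the training set.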
The programmer (who has now earned the snazzy title of "data scientist") prepares the mappings, selects and tunes the machine learning algorithm, and evaluates the resulting model's performance. Once the model is sufficiently accurate on test data, it can be deployed for production use.
The number of patent application filings related to artificial intelligence (of which machine learning is the hottest subset) has been growing dramatically in the past several years, especially in the U.S. But so has the legal uncertainty surrounding certain types of software and business method inventions due to the Supreme Court's rulings in Bilski v. Kappos, Mayo Collaborative Services v. Prometheus Labs., Inc., and Alice Corp. v. CLS Bank Int'l. So a natural question to ask is: what does the patent-eligibility landscape look like for machine learning inventions? As with all questions regarding patent eligibility, there are no easy answers.
But first, let's dig into the nuances of the case law. In Alice, the Supreme Court set forth a two-part test to determine whether claims are directed to patent-eligible subject matter under 35 U.S.C. § 101. One must first decide whether the claim at hand is directed to a judicially-excluded law of nature, a natural phenomenon, or an abstract idea. If so, then one must further decide whether any element or combination of elements in the claim is sufficient to ensure that the claim amounts to significantly more than the judicial exclusion. But generic computer implementation of an otherwise abstract process does not qualify as "significantly more," nor will elements that are well-understood, routine, and conventional lift the claim over the § 101 hurdle.
The Federal Circuit has not been a paragon of consistency across its § 101 cases. For instance, Electric Power Group v. Alstom S.A. has been interpreted as holding that claims directed to no more than gathering, processing, and outputting data are ineligible. On the other hand, in Enfish LLC v. Microsoft Corp., the Court found that a database arrangement that provided improvements over traditional relational databases met the § 101 requirements. Likewise, in McRO, Inc. v. Bandai Namco Games America Inc., claims to software displaying lip synchronization and facial expressions of animated characters were eligible because they used a rule-based approach that was different from manual animation techniques.
Clearly, it is advantageous for claims related to machine learning inventions to be more than Electric Power Group's gathering, processing, and outputting of data. But, at its core, that is what machine learning is all about. Training a model takes in data, crunches it, and produces a program as output. Using a model also involves taking input data, running it through the model, and obtaining output data as a result. Thus, a naïve approach to claiming machine learning procedures may lead to § 101 difficulties.
To date, the Federal Circuit has not considered the patent-eligibility of machine learning claims. The closest opportunity the Court had was in 2014's I/P Engine, Inc. v. AOL, Inc. But in that case the claims were directed to non-specific ways of conducting a search for one user based on relevant search results found for other users. While the patentee pointed out that its specification described use of a neural network to carry out the invention, the Court (in a footnote) dismissed this disclosure since the claims were not limited to such techniques.
In the district courts, there is relatively little to report. Most relevant decisions to date, such as Kaavo Inc. v. Amazon.com, Inc. (D. Del. 2018), eResearchTechnology, Inc. v. CRF, Inc. (W.D. Pa. 2016), and Neochloris, Inc. v. Emerson Process Mgmt. LLLP (N.D. Ill. 2015), are similar to I/P Engine in that the claims and/or the specification do not explicitly detail specific aspects of machine learning.
In Blue Spike, LLC v. Google Inc. (N.D. Cal. 2015), the claims were similarly non-specific to machine learning. The patentee argued against the abstractness of inventions mirroring human perception and analysis on a computer, cautioning that a restrictive approach could render future breakthroughs in artificial intelligence technology unpatentable. The Court remained focused on the breadth of the claims, stating that "[t]he mere fact that the claims may cover a computer implementation that surpasses in scope or complexity what a human mind is capable of accomplishing is irrelevant where the claims are not limited to such complex activities, but also encompass more basic approaches."
One case that provides a more substantive discussion, and is therefore worthy of our attention, is PurePredictive, Inc. v. H2O.AI, Inc. (N.D. Cal. 2017). PurePredictive sued H2O.AI, alleging infringement of U.S. Patent No. 8,880,446. H2O.AI filed a motion to dismiss on the grounds that the claims of the '446 patent were invalid under § 101.
The invention at issue was described in the '446 patent as "an apparatus, system, method, and computer program product to generate a predictive ensemble in an automated manner . . . regardless of the particular field or application, with little or no input from a user or expert." In this context, an "ensemble" is a set of machine learning models that can be operated in series or in parallel, with the goal of providing better results than the output of any one individual model.
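To make the notion of an ensemble concrete, here is a toy sketch of models operated in parallel with their outputs combined by averaging. The three models and their coefficients are invented stand-ins; the '446 patent does not prescribe this combination rule.

```python
# A hedged illustration of a parallel "ensemble": several models run on
# the same input, and their predictions are combined (here, averaged) in
# the hope of beating any single model on its own.
def ensemble_predict(models, x):
    """Average the predictions of the individual models."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Three hypothetical models whose individual errors partially cancel.
models = [
    lambda x: 2.0 * x,        # model 1
    lambda x: 1.8 * x + 0.5,  # model 2: shallower slope, positive offset
    lambda x: 2.2 * x - 0.5,  # model 3: steeper slope, negative offset
]
print(ensemble_predict(models, 3.0))
```

A series arrangement would instead feed one model's output into the next; the patent's broader point is that either topology is assembled automatically.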
Claim 14 of the patent recites:
A method for a predictive analysis factory, the method comprising:
pseudo-randomly generating a plurality of learned functions based on training data without prior knowledge regarding suitability of the generated learned functions for the training data, the training data received for forming a predictive ensemble customized for the training data;
evaluating the plurality of learned functions using test data to generate evaluation metadata indicating an effectiveness of different learned functions at making predictions based on different subsets of test data; and
forming the predictive ensemble comprising a subset of multiple learned functions from the plurality of learned functions, the subset of multiple learned functions selected and combined based on the evaluation metadata, the predictive ensemble comprising a rule set synthesized from the evaluation metadata to direct different subsets of the workload data through different learned functions of the multiple learned functions based on the evaluation metadata.
The Court summarized the claim as consisting of three steps: "First, it receives data and generates 'learned functions,' or, for example, regressions from that data[, then] it evaluates the effectiveness of those learned functions at making accurate predictions based on the test data[, and then] it selects the most effective learned functions and creates a rule set for additional data input."
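The court's three-step summary can be loosely rendered in code. Everything below is hypothetical: the noisy training data, the two-point line fitting used to "learn" candidate functions, the mean-squared-error "evaluation metadata," and the pick-the-best-five selection rule are all assumptions made for illustration, not the method of the '446 patent.

```python
import random

random.seed(0)

# Hypothetical training and test data: y = 2x + 1, with noise on training.
train = [(x, 2 * x + 1 + random.uniform(-1, 1)) for x in range(10)]
test = [(x, 2 * x + 1) for x in range(10, 15)]

# Step 1: pseudo-randomly generate a plurality of "learned functions",
# here by fitting a line through two randomly chosen training points.
def random_learned_function(train):
    (x1, y1), (x2, y2) = random.sample(train, 2)
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    return lambda x: a * x + b

candidates = [random_learned_function(train) for _ in range(50)]

# Step 2: evaluate each candidate on test data to produce "evaluation
# metadata" (here, mean squared error).
def mse(f, data):
    return sum((f(x) - y) ** 2 for x, y in data) / len(data)

metadata = [(mse(f, test), f) for f in candidates]

# Step 3: select the most effective learned functions and combine them
# into an ensemble (here, the best 5 by error, averaged).
best = [f for _, f in sorted(metadata, key=lambda t: t[0])[:5]]
ensemble = lambda x: sum(f(x) for f in best) / len(best)
```

Note how well this maps onto the court's characterization: data goes in, is crunched, and a program (the ensemble) comes out, which is precisely why such claims are exposed to an Electric Power Group-style analysis.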
The Court applied the Alice test starting with part one. H2O.AI took the position that the patent was "an attempt to monopolize the use of basic mathematical manipulations without reference to any specific implementation, application, purpose, or use." PurePredictive disagreed, stating that the invention solved "a specific problem in and [made] improvements to computer-related technology," and argued that the claims were analogous to those of Enfish and McRO.
The Court leaned heavily on FairWarning IP, LLC v. Iatric Systems, Inc. as the most relevant precedent. Similar to Electric Power Group, FairWarning involved claims that (in the view of the Federal Circuit) amounted to no more than "collecting and analyzing information to detect misuse and notifying a user when misuse is detected" and did not improve a technological process.
In light of this holding, the Court stated that "[t]he method of the predictive analytics factory is directed towards collecting and analyzing information." Particularly, "[t]he first step, generating learned functions or regressions from data—the basic mathematical process of, for example, regression modeling, or running data through an algorithm—is not a patentable concept." Further, the next two steps were "mathematical processes that not only could be performed by humans but also go to the general abstract concept of predictive analytics rather than any specific application."
PurePredictive argued that it would be impossible for a human to carry out the claimed invention. But the Court found that this point held little weight, noting that "just because a computer can make calculations more quickly than a human does not render a method patent eligible." The Court also stated that "[t]he patent specification's description of this process as a 'brute force, trial-and-error approach,' reinforces that this process is merely the running of data through a machine." Regarding Enfish and McRO, the Court concluded that the claimed invention did not improve the functionality of computers or computer-related technology, and instead just used computers as a tool.
Moving on to part two of Alice, PurePredictive attempted to make analogies between its claims and those found to provide "significantly more" in DDR Holdings, LLC v. Hotels.com, L.P. and BASCOM Glob. Internet Servs., Inc. v. AT&T Mobility LLC. Particularly, it contended that the claimed ensemble technique "do[es] not need extensive tuning and customization" and is "applicable regardless of the particular field or application." But the Court disagreed because, unlike DDR Holdings, the claims address a broad scope of problems rather than being focused on solving a specific technical problem. Furthermore, unlike BASCOM, the claims did not describe a specific physical architecture and were instead focused on software modules.
As a consequence, the claims failed both parts of the Alice test and were therefore held invalid as ineligible under § 101.
While this case is a useful data point, it reinforces the old saying that "bad facts make bad law." The claims are indeed quite broad and not directed to solving a specific problem or making a particular technical improvement. Prior to Alice and its progeny, this was unlikely to trigger issues under § 101, but now it does more often than not.
Still, it should surprise no one that machine learning claims are treated like any other type of software claim in the § 101 analysis. One could posit that the whole point of machine learning is to train a computer to do something that a human cannot do (or cannot practically do), and to do it in a way that no human would; but that is not enough to avoid Alice pitfalls. Machine learning claims can be even more vulnerable to § 101 challenges when they recite only data manipulation and do not provide a well-defined technological need or advantage.
Therefore, like other types of software inventions, those involving machine learning should be focused. Some example claiming strategies involve reciting specific types of data associations, detailing the training phase and/or the structure of the model, and placing the model within the context of a larger system. Until we hear more from the district courts -- and hopefully from the Federal Circuit as well -- moving forward with these best practices is the recommended approach.