My work involving big data and new measures focuses on developing an integrated, consistent, and validated tool set to support theory-driven research.
This project applies novel patent-to-patent similarity data (described below) to examine the information content of patent citations. Existing measures of innovation often rely on patent citations to indicate intellectual lineage and impact. We show that the data-generating process for patent citations has changed substantially since citation-based measures were validated a decade ago.
Today, far more citations are created per patent, and the mean technological similarity between citing and cited patents has fallen significantly. These changes suggest that the use of patent citations for scholarship needs to be re-validated. We propose a basic correction and show that methods for sub-setting and/or weighting informative citations can substantially improve the predictive power of patent citation measures.
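To make the idea of sub-setting and weighting concrete, the following is a minimal sketch rather than the correction used in the paper. It assumes each citation comes with a similarity score between 0 and 1; the function name `citation_counts`, the threshold of 0.3, and the patent identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a citation is a (citing_patent, cited_patent) pair.
Citation = Tuple[str, str]


def citation_counts(
    citations: List[Citation],
    similarity: Dict[Citation, float],
    threshold: float = 0.3,  # illustrative cutoff, not a value from the paper
) -> Dict[str, Dict[str, float]]:
    """Return raw, sub-set, and similarity-weighted forward citation counts.

    raw      : every citation received counts as 1
    subset   : only citations with similarity >= threshold count as 1
    weighted : every citation counts as its similarity score
    """
    counts: Dict[str, Dict[str, float]] = {}
    for citing, cited in citations:
        # Pairs with no similarity score are treated as uninformative (0.0).
        sim = similarity.get((citing, cited), 0.0)
        row = counts.setdefault(
            cited, {"raw": 0.0, "subset": 0.0, "weighted": 0.0}
        )
        row["raw"] += 1
        if sim >= threshold:
            row["subset"] += 1
        row["weighted"] += sim
    return counts


# Toy data: patent A is cited three times, once by a dissimilar patent.
citations = [("B", "A"), ("C", "A"), ("D", "A")]
similarity = {("B", "A"): 0.8, ("C", "A"): 0.1, ("D", "A"): 0.5}
result = citation_counts(citations, similarity)
```

In this toy case the raw count treats all three citations alike, while the sub-set and weighted counts discount the citation from the dissimilar patent C.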
This project is joint work with Kenneth Younge, who holds the Chair of Technology and Innovation Strategy at EPFL, and Alan Marco, the Chief Economist of the United States Patent and Trademark Office. I presented this paper at the Searle Center Conference on Innovation Economics and the Academy of Management Annual Meeting. It was also presented at the Munich Summer Institute. We plan to make the data available for public use.
Concepts of technological space, distance, and relatedness are central to the study of invention and innovation. Empirical studies of technological space generally rely on manual classification of patents by the patent office or the linking of patents through prior art citations. In this project, we demonstrate that these approaches are simply too coarse or too biased for many of the comparisons or groupings required for academic research. We introduce a new, continuous measure of technological similarity based on a vector space model.
We apply the model to calculate the pairwise similarity for more than 14 trillion pairs of patents. We validate the measure and demonstrate that it can provide greater accuracy, specificity, and generality than existing approaches for many common research questions. Moreover, we illustrate how a pairwise similarity comparison of every pair of patents in the USPTO patent space can open new avenues of research in economics, management, and public policy.
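As an illustration of how a vector space model can yield a continuous similarity score, here is a minimal sketch. It assumes TF-IDF term weighting and cosine similarity over whitespace-tokenized text; these are common choices for such models, not necessarily the ones used in the project, and the patent identifiers and texts are invented.

```python
import math
from collections import Counter
from typing import Dict


def tfidf_vectors(docs: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Map each document to a sparse TF-IDF vector (term -> weight)."""
    tokens = {pid: text.lower().split() for pid, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for words in tokens.values() for t in set(words))
    vectors: Dict[str, Dict[str, float]] = {}
    for pid, words in tokens.items():
        term_freq = Counter(words)
        vectors[pid] = {
            # Smoothed inverse document frequency (an assumed weighting).
            t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
            for t, c in term_freq.items()
        }
    return vectors


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy documents standing in for patent text.
docs = {
    "P1": "rotor blade wind turbine",
    "P2": "wind turbine rotor hub",
    "P3": "antibody protein binding assay",
}
vecs = tfidf_vectors(docs)
sim_12 = cosine(vecs["P1"], vecs["P2"])  # two wind turbine patents
sim_13 = cosine(vecs["P1"], vecs["P3"])  # no shared terms
```

Because the score is continuous, it can rank any pair of documents rather than only those sharing a class or a citation. If pairs are counted as unordered, n documents produce n(n-1)/2 comparisons, so a corpus of roughly 5.3 million documents would already imply about 14 trillion pairs.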
This project is also joint work with Kenneth Younge. I have presented this paper at the USPTO Visiting Speaker Series, the Academy of Management Annual Meeting, and SKEMA Business School (Sophia Antipolis). It was also presented at the DRUID Academy Conference 2016.
In addition to the projects described above, I have developed several patent data sets that I apply across different projects.