Title: Mining chemical information from Open patents
Authors: Jessop, David M; Adams, Sam; Murray-Rust, Peter
Keywords: chemistry; text-mining; patents; natural language; semantic; CML
Issue Date: 4-Jul-2011
Publisher: Murray-Rust group, Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge
Abstract: Linked Open Data presents an opportunity to vastly improve the quality of science in all fields by increasing the availability and usability of the data upon which it is based. In the chemical field, there is a huge amount of information available in the published literature, the vast majority of which is not available in machine-understandable formats. PatentEye, a prototype system for the extraction and semantification of chemical reactions from the patent literature, has been implemented and is discussed. A total of 4444 reactions were extracted from 667 patent documents comprising 10 weeks’ worth of publications from the European Patent Office (EPO), with a precision of 78% and a recall of 64% with regard to determining the identity and amount of reactants employed, and an accuracy of 92% with regard to product identification. NMR spectra reported as product characterisation data are additionally captured.
URI: http://www.dspace.cam.ac.uk/handle/1810/238389
Appears in Collections: Visions of a Semantic Molecular Future

Files in This Item:

Open-patents.pdf (1.66 MB, Adobe PDF)
mining-open-patents-FINAL.doc (535.5 kB, Microsoft Word)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.