CINECA IRIS Institutional Research Information System

Technology identification from patent texts: A novel named entity recognition method

Puccetti, Giovanni; Giordano, Vito; Spada, Irene; Chiarello, Filippo; Fantoni, Gualtiero

2023-01-01

Abstract

Identifying technologies is a key step in mapping a domain and its evolution. It allows managers and decision makers to anticipate trends, supporting accurate forecasting and effective foresight. Researchers and practitioners are taking advantage of the rapid growth of publicly accessible sources to map technological domains. Among these sources, patents are the largest open-access technical database used in the literature and in practice. Natural Language Processing (NLP) techniques now enable new methods for analysing patent texts. In this paper we explore one of these techniques, Named Entity Recognition (NER), to identify the technologies mentioned in patent texts. We compare three NER methods: gazetteer-based, rule-based and deep learning-based (e.g. BERT), measuring their performance in terms of precision, recall and computational time. We test the approaches on 1600 patents from four assorted IPC classes as case studies. Our NER systems collected over 4500 fine-grained technologies, with the best results obtained by combining the three methods. The proposed method improves on the existing literature through its ability to filter out generic technological terms. Our study delivers a valid technology identification tool that can be integrated into any text analysis pipeline to support academics and companies in investigating a technological domain.
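To make the comparison described in the abstract more concrete, the following is a minimal sketch of a gazetteer-based extractor, a rule-based extractor, their union, and a precision/recall computation. It is not the authors' implementation: the gazetteer entries, head-noun list, stop-word list, example sentence and gold annotations are all hypothetical and chosen only for illustration, and the deep learning-based (BERT) extractor is left out to keep the example dependency-free.

```python
import re

# Hypothetical gazetteer of known technology names (illustrative, not from the paper).
GAZETTEER = {"neural network", "lithium-ion battery", "lidar"}

# Hypothetical rule resources: head nouns that often denote a technology,
# and stop words that should not be taken as modifiers.
HEAD_NOUNS = {"sensor", "battery", "exchanger", "network"}
STOP_WORDS = {"a", "an", "the", "and", "to", "of"}


def gazetteer_ner(text: str) -> set[str]:
    """Return the gazetteer entries that occur as whole words in the text."""
    lowered = text.lower()
    return {
        term
        for term in GAZETTEER
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def rule_ner(text: str) -> set[str]:
    """Return 'modifier head-noun' pairs, skipping stop-word modifiers."""
    tokens = re.findall(r"[a-z][a-z-]*", text.lower())
    found = set()
    for prev, head in zip(tokens, tokens[1:]):
        if head in HEAD_NOUNS and prev not in STOP_WORDS:
            found.add(f"{prev} {head}")
    return found


def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Compute precision and recall of predicted entities against gold ones."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical sentence and hand-made gold annotations.
TEXT = "The vehicle uses lidar, a lithium-ion battery and a temperature sensor."
GOLD = {"lidar", "lithium-ion battery", "temperature sensor"}

for name, entities in [
    ("gazetteer", gazetteer_ner(TEXT)),
    ("rule", rule_ner(TEXT)),
    ("combined", gazetteer_ner(TEXT) | rule_ner(TEXT)),
]:
    p, r = precision_recall(entities, GOLD)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On this toy sentence each single extractor misses one gold entity while their union finds all three, which is the kind of complementarity the abstract attributes to combining methods. Real patent text would need a far larger gazetteer, richer rules and a learned model.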

Year: 2023
DOI: https://dx.doi.org/10.1016/j.techfore.2022.122160
All authors: Puccetti, Giovanni; Giordano, Vito; Spada, Irene; Chiarello, Filippo; Fantoni, Gualtiero
Publication type: 1.1 Journal article

Files in this item:

File	Availability	Type	License	Size	Format
1-s2.0-S0040162522006813-main.pdf	Not available	Publisher's final version	NOT PUBLIC - private/restricted access	2.23 MB	Adobe PDF
techNER__REV.pdf	Open Access from 01/02/2025	Post-print	Creative Commons	668.04 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1157021
