Text Data Based Output Indicators as Base of a New Innovation Metric
In this joint project, new output indicators of innovation activity were developed by applying computational linguistics methods to large volumes of text data. ZEW in Mannheim and the Justus Liebig University in Gießen developed the methods and validated the resulting indicators. At ZEW, the analysis was based on text from corporate websites, which a web scraper collects automatically at regular intervals. Text data mining (e.g. topic models) is then used to identify information on innovations in these texts and to derive innovation indicators from it. Access to the websites relies on the databases available at ZEW. These databases make it possible to continuously monitor the websites of the current population of companies in Germany and to take extensive metadata into account (e.g. a company's industry and location). In addition, the ZEW databases allow the newly generated innovation indicators to be compared with conventional innovation indicators.
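The step from website text to an indicator can be illustrated with a minimal Python sketch. It is not the project's method: in place of a topic model it uses a much simpler stand-in, the share of words in a text that belong to a small seed vocabulary, and then averages that score by industry as one example of using company metadata. The vocabulary, the function names, the record fields and the sample texts are all assumptions made up for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical seed vocabulary. A topic model would learn such terms
# from the data instead of relying on a fixed list.
INNOVATION_TERMS = {"innovation", "innovative", "patent", "prototype", "novel", "launch"}


def innovation_score(text: str) -> float:
    """Return the share of word tokens in `text` found in the seed vocabulary."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in INNOVATION_TERMS)
    return hits / len(tokens)


def mean_score_by_industry(companies):
    """Average the per-company score within each industry (a metadata field)."""
    grouped = defaultdict(list)
    for company in companies:
        grouped[company["industry"]].append(innovation_score(company["text"]))
    return {industry: mean(scores) for industry, scores in grouped.items()}


# Made-up example records, not real company data.
sample = [
    {"industry": "machinery", "text": "We launch a novel prototype this year"},
    {"industry": "machinery", "text": "Opening hours and contact details"},
    {"industry": "retail", "text": "Our shop sells shoes"},
]

if __name__ == "__main__":
    for industry, score in mean_score_by_industry(sample).items():
        print(f"{industry}: {score:.3f}")
```

A score built this way could then be set against a conventional indicator for the same companies, which is the kind of comparison the ZEW databases are described as enabling.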