Artificial Intelligence, Web-Derived Signals, and the Measurement of Firm-Level Innovation: A Multidisciplinary Analytical Framework

Authors

  • Aarav Mehta, Department of Economics and Innovation Studies, University of Amsterdam, Netherlands

Keywords:

Artificial intelligence, innovation measurement, web mining, corporate websites

Abstract

The measurement and prediction of firm-level innovation have long challenged researchers and policymakers because innovation processes are latent, multidimensional, and context-dependent. Recent advances in artificial intelligence (AI), big data analytics, and web mining have opened new avenues for capturing real-time, scalable, and nuanced indicators of innovation. This study develops a multidisciplinary framework that integrates insights from innovation economics, information retrieval theory, financial economics, and machine learning to examine how corporate websites, textual data, and AI-driven analytical tools can be used to measure and predict firm-level innovation. Drawing on a curated set of academic references, the article synthesizes theoretical and empirical contributions on web-based innovation indicators, diffusion theory, transformer-based language models, and financial disclosure analytics.

The methodology relies on a conceptual integration of web scraping techniques, natural language processing models such as transformer architectures, and retrieval-augmented generation approaches to extract innovation signals from corporate digital footprints. The results demonstrate that website characteristics (including linguistic complexity, technological signaling, and product-related disclosures) serve as strong proxies for innovation activity, particularly when augmented with AI-driven analysis. Furthermore, the study highlights the role of AI in enhancing both the production and measurement of innovation, showing that firms leveraging AI technologies exhibit higher growth rates and increased product innovation.
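
As a rough illustration of how website text might be turned into the kinds of characteristics named above (linguistic complexity, technological signaling, product-related disclosures), the following minimal sketch computes a few simple text features. It is not the study's method: the function names, the two term lists, and the choice of features are all hypothetical assumptions made for this example, and it operates on plain text rather than scraped pages or transformer embeddings.

```python
import re
from collections import Counter

# Hypothetical lexicons, assumed for illustration only; the article
# does not specify any term lists.
TECH_TERMS = {"ai", "patent", "algorithm", "prototype", "software"}
PRODUCT_TERMS = {"product", "launch", "release", "new", "solution"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def website_features(text):
    """Return simple proxies computed from a page's text.

    type_token_ratio and mean_word_length stand in for linguistic
    complexity; tech_share and product_share are the fractions of
    tokens found in the hypothetical lexicons above.
    """
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        # Empty pages get all-zero features instead of a division error.
        return {
            "type_token_ratio": 0.0,
            "mean_word_length": 0.0,
            "tech_share": 0.0,
            "product_share": 0.0,
        }
    counts = Counter(tokens)
    return {
        "type_token_ratio": len(counts) / n,
        "mean_word_length": sum(len(t) for t in tokens) / n,
        "tech_share": sum(counts[t] for t in TECH_TERMS) / n,
        "product_share": sum(counts[t] for t in PRODUCT_TERMS) / n,
    }


if __name__ == "__main__":
    sample = "We launch a new AI product built on a patented algorithm."
    for name, value in website_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Features of this kind could serve as inputs to a downstream model of innovation activity; in practice, a lexicon-based share is a crude signal, which is one reason the framework points toward transformer-based and retrieval-augmented approaches.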

The discussion critically evaluates the limitations of web-derived indicators, including issues related to data bias, interpretability, and cross-sector comparability, while proposing future research directions involving multimodal data integration and dynamic innovation tracking. The findings contribute to the broader literature by offering a unified framework that bridges theoretical and methodological gaps, providing actionable insights for academics, investors, and policymakers seeking to understand innovation dynamics in the digital age.

References

1. Axenbeck, J., & Breithaupt, P. (2021). Innovation indicators based on firm websites - Which website characteristics predict firm-level innovation activity? PLoS ONE, 16(4), e0249583. https://doi.org/10.1371/journal.pone.0249583

2. Babina, T., Fedyk, A., He, A., & Hodson, J. (2024). Artificial intelligence, firm growth, and product innovation. Journal of Financial Economics, 151, 103745. https://doi.org/10.1016/j.jfineco.2023.103745

3. Battisti, G., & Stoneman, P. (2003). Inter- and intra-firm effects in the diffusion of new process technology. Research Policy, 32(9), 1641–1655. https://doi.org/10.1016/S0048-7333(03)00055-6

4. Blazquez, D., & Domenech, J. (2018). Big Data sources and methods for social and economic analyses. Technological Forecasting and Social Change, 130, 99–113. https://doi.org/10.1016/j.techfore.2017.07.027

5. Bottai, C., Crosato, L., Domenech, J., Guerzoni, M., & Liberati, C. (2024). Scraping innovativeness from corporate websites: Empirical evidence on Italian manufacturing SMEs. Technological Forecasting and Social Change, 207, 123597. https://doi.org/10.1016/j.techfore.2024.123597

6. Bouschery, S. G., Blazevic, V., & Piller, F. T. (2023). Augmenting human innovation teams with artificial intelligence: Exploring transformer-based language models. Journal of Product Innovation Management. https://doi.org/10.1111/jpim.12656

7. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. https://doi.org/10.48550/arXiv.1810.04805

8. Dewar, R. D., & Dutton, J. E. (1986). The adoption of radical and incremental innovations: An empirical analysis. Management Science, 32(11), 1422–1433. https://doi.org/10.1287/mnsc.32.11.1422

9. European Commission. (2003). Commission recommendation of 6 May 2003 concerning the definition of micro, small and medium-sized enterprises. Official Journal of the European Union, L 124, 36–41.

10. European Commission. (2008). NACE Rev. 2: Statistical classification of economic activities in the European Community. Publications Office.

11. European Commission. (2024). Regions in the European Union: Nomenclature of territorial units for statistics (NUTS). Publications Office. https://doi.org/10.2785/714519

12. Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., & Wang, H. (2024). Retrieval-augmented generation for large language models: A survey. https://doi.org/10.48550/arXiv.2312.10997

13. Gök, A., Waterworth, A., & Shapira, P. (2015). Use of web mining in studying innovation. Scientometrics, 102, 653–671. https://doi.org/10.1007/s11192-014-1434-0

14. Singh, M., & Biswas, A. (2023). AI stocks rally in latest Wall Street craze sparked by ChatGPT. Reuters.

15. Somefun, K., Perchet, R., Yin, C., & Leote de Carvalho, R. (2023). Allocating to thematic investments. Financial Analysts Journal, 79, 18–36.

16. Sortino, F. A., & Price, L. N. (1994). Performance measurement in a downside risk framework. Journal of Investing, 3, 59–64.

17. Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11–21.

18. Spence, M. (1973). Job market signaling. Quarterly Journal of Economics, 87, 355–374.

19. Stice, E. K. (1991). The market reaction to 10-K and 10-Q filings and to subsequent earnings announcements. Accounting Review, 66, 42–55.

20. Tailor, P., & Kale, A. (2025). Multimodal Sentiment Analysis of Earnings Calls and SEC Filings: A Deep Learning Approach to Financial Disclosures. Utilitas Mathematica, 122(1), 3163–3168. Retrieved from https://utilitasmathematica.com/index.php/Index/article/view/2676

Published

2026-02-28

How to Cite

Aarav Mehta. (2026). Artificial Intelligence, Web-Derived Signals, and the Measurement of Firm-Level Innovation: A Multidisciplinary Analytical Framework. International Journal of Advance Scientific Research, 6(02), 175-183. https://sciencebring.com/index.php/ijasr/article/view/1169
