Research Papers

Understanding teams and productivity in information retrieval research: Academia, industry, and cross-community collaborations

1 Institute of Education, Tsinghua University, Beijing 100084, China
2 Department of Information Management, Peking University, Beijing 100871, China
3 School of Library and Information Studies, The University of Oklahoma, Norman, Oklahoma 73019, U.S.A.
† Yi Bu (Email: buyi@pku.edu.cn); Jiqun Liu (Email: jiqunliu@ou.edu).
* These authors contributed to this paper equally.

Received date: 2025-06-06

Revised date: 2025-08-07

Accepted date: 2025-09-19

Online published: 2025-10-13

Abstract

Purpose: Prior Information Retrieval (IR) research synthesizes progress from individual studies, yet academia-industry collaboration dynamics remain unexplored. This study investigates: (1) productivity patterns and venues, (2) citations-downloads relationships, (3) topic evolution, and (4) collaboration trends.
Design/methodology/approach: We analyze 53,471 ACM IR papers (2000-2018) using bibliometrics and DistilBERT topic modeling.
Findings: Industry-involved papers preferred WWW/CIKM venues, while collaborations dominated RecSys/CSCW. Academia-industry collaborations achieved the highest download-to-citation conversion rates. Academia focused on algorithms and industry on applications; collaborations bridged both, with human-centered themes on the rise.
Research implications: This pioneering large-scale bibliometric analysis reveals the impact of collaboration on IR knowledge evolution and provides a methodological framework for cross-sector analysis.
Practical implications: The paper identifies optimal venues (RecSys/CSCW) for partnerships and guides joint initiatives (shared datasets, grants) to bridge academia-industry divides and enhance research translation.
Originality/value: This is the first large-scale bibliometric analysis of IR academia-industry collaboration. It shows that collaboration boosts citation efficiency, enables complementary specialization, and drives topic convergence.

Cite this article

Jiaqi Lei, Liang Hu, Yi Bu, Jiqun Liu. Understanding teams and productivity in information retrieval research: Academia, industry, and cross-community collaborations[J]. Journal of Data and Information Science, 0: 20251013-20251013. DOI: 10.2478/jdis-2025-0051 CSTR: 32295.14.jdis-2025-0051
