1 Introduction
2 Related work
2.1 Academia-industry collaborations
2.2 Emerging topics and collaborations in Information Retrieval research
3 Data preparation
Figure 1. Flow chart of data processing. * indicates the focus of this paper.
Figure 2. Distribution of the number of papers of the three types over the years.
4 Methods
5 Analysis and results
5.1 Productivity patterns and preferred venues
Figure 3. The distribution of conference frequency.
Figure 4. The top 10 conferences by number of publications in each of the three categories.
5.2 Citations and downloads
Figure 5. Heat map of the correlation coefficient matrix for the three types of articles.
Figure 6. Conversion rate for the three types of articles.
5.3 Research topic analysis
Figure 7. Cosine similarity between the three types of articles by year. “Academic-industry” is the similarity between publications by authors purely from academia and publications by authors purely from industry. “Industry-collaboration” is the similarity between publications by authors purely from industry and publications co-authored by scientists from academia and industry. “Collaboration-academic” is the similarity between publications co-authored by scientists from academia and industry and publications by authors purely from academia.
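The three pairwise similarities named in the Figure 7 caption can be illustrated with a minimal sketch. It assumes each group's publications for one year are pooled into a single bag-of-words count vector; the text representation actually used in the paper is not stated in this outline, so `term_counts`, `pairwise_similarities`, and the group keys are hypothetical names chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def term_counts(texts):
    """Pool a list of texts into one lowercase term-count vector (assumed representation)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty group has no defined direction
    return dot / (norm_a * norm_b)


def pairwise_similarities(groups):
    """Compute the three pairs from the caption for one year.

    `groups` maps the hypothetical keys 'academia', 'industry', and
    'collaboration' to lists of publication texts.
    """
    vecs = {name: term_counts(texts) for name, texts in groups.items()}
    return {
        "academic-industry": cosine_similarity(vecs["academia"], vecs["industry"]),
        "industry-collaboration": cosine_similarity(vecs["industry"], vecs["collaboration"]),
        "collaboration-academic": cosine_similarity(vecs["collaboration"], vecs["academia"]),
    }
```

Calling `pairwise_similarities` once per year and plotting the three values against the year would give curves of the kind the caption describes.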
5.4 Scientific collaborations
Figure 8. Distribution of the number of co-authors for each type of publication. Because an Academia-Industry Collaboration is defined as having at least one author from academia and one author from industry, the blue curve representing Academia-Industry Collaboration starts at two co-authors.
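The definition in the Figure 8 caption can be sketched as a small classification rule. This assumes each author has already been labelled with a sector string, "academia" or "industry"; the labelling step itself and the names `classify_paper` and `coauthor_distribution` are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def classify_paper(author_sectors):
    """Assign a paper to one of the three types from its authors' sector labels."""
    sectors = set(author_sectors)
    has_academia = "academia" in sectors
    has_industry = "industry" in sectors
    if has_academia and has_industry:
        return "collaboration"  # at least one author from each sector
    if has_academia:
        return "academia"
    if has_industry:
        return "industry"
    return "unknown"  # no recognised sector label


def coauthor_distribution(papers):
    """Count papers by number of co-authors, separately for each paper type.

    `papers` is an iterable of per-paper lists of author sector labels.
    """
    dist = defaultdict(Counter)
    for author_sectors in papers:
        dist[classify_paper(author_sectors)][len(author_sectors)] += 1
    return dist
```

Under this rule a single-author paper can never be a collaboration, which is why the collaboration curve has no mass below two co-authors.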
6 Discussion
Acknowledgements
Funding information
Conflict of interests
Author contributions
Appendix
Table A1. ACM article keywords/phrases (sorted alphabetically by column; all keywords/phrases lowercased).
| | | | |
|---|---|---|---|
| accountability information retrieval | explainability information retrieval | navigation | sentiment analysis |
| active learning | exploratory search | neural network | similarity |
| adaptation | eye tracking | neural networks | similarity measure |
| annotation | faceted search | novelty | similarity search |
| annotations | fairness information retrieval | online social networks | social media |
| audio | feature extraction | ontologies | social network |
| augmented reality | feature selection | ontology | social network analysis |
| benchmark | federated search | open data | social networks |
| big data | filtering | opinion mining | social search |
| blog | flickr | optimization | social tagging |
| browsing | folksonomy | P2P | spam |
| caching | geographic information retrieval | pagerank | spoken search system |
| CBIR | graph mining | passage retrieval | sponsored search |
| children | group recommendation | peer-to-peer | summarization |
| classification | hashing | performance | supervised learning |
| cloud computing | human-computer interaction | performance evaluation | SVM |
| clustering | image annotation | personal information management | tagging |
| collaboration | image classification | personalization | tags |
| collaborative filtering | image retrieval | personalized search | test collection |
| collaborative tagging | image search | privacy | test collections |
| community detection | implicit feedback | pseudo relevance feedback | text categorization |
| complex event processing | index | pseudo-relevance feedback | text classification |
| content analysis | indexing | query | text mining |
| content-based filtering | information extraction | query classification | time series |
| content-based image retrieval | information filtering | query expansion | topic model |
| content-based retrieval | information retrieval | query formulation | topic modeling |
| context | information seeking | query intent | topic models |
| context-awareness | information visualization | query log analysis | transfer learning |
| conversational information retrieval | interaction | query logs | transparency information retrieval |
| convolutional neural networks | interactive information retrieval | query performance prediction | trust |
| correlation | interoperability | query processing | |
| credibility | inverted index | query reformulation | unsupervised learning |
| cross-language information retrieval | kernel methods | query suggestion | usability |
| cross-modal retrieval | keyword search | question answering | user behavior |
| crowdsourcing | knowledge base | random walk | user interaction |
| data integration | knowledge management | ranking | user interface |
| data mining | language model | RDF | user interfaces |
| database | language modeling | recommendation | user modeling |
| deep learning | language models | recommendation system | user profile |
| digital humanities | learning | recommendation systems | user profiling |
| digital libraries | learning to rank | recommender system | user studies |
| digital library | lifelogging | recommender systems | user study |
| digital preservation | link analysis | relation extraction | video |
| dimensionality reduction | linked data | relevance | video analysis |
| distributed information retrieval | locality sensitive hashing | relevance feedback | video annotation |
| diversification | location-based services | re-ranking | video retrieval |
| diversity | log analysis | responsible information retrieval | video search |
| document clustering | machine learning | retrieval | video summarization |
| document representation | machine translation | retrieval models | visualization |
| document retrieval | MapReduce | sampling | web |
| e-commerce | matrix factorization | scalability | web 2.0 |
| education | measurement | search | web mining |
| efficiency | metadata | search behavior | web search |
| e-government | mobile | search engine | web search engine |
| emotion | mobile computing | search engines | web service |
| enterprise search | mobile devices | semantic relatedness | web services |
| entity linking | multimedia | semantic search | wiki |
| ethics information retrieval | multimedia retrieval | semantic similarity | wikipedia |
| evaluation | music | semantic web | word embeddings |
| event detection | music information retrieval | semantics | world wide web |
| events | music recommendation | semi-supervised learning | XML |
| experimentation | named entity recognition | sensor networks | XML retrieval |
| expert finding | natural language processing | | |
Table A2. Research topics over time.
| Year | Academia | Industry | Academia-Industry Collaboration |
|---|---|---|---|
| 2000 | intelligent libraries library classes library technologies novel browser internet classrooms | video classroom video watermarking video performance video technical video recording | learning algorithms analysis hashing proliferation internet classifies algorithms learning algorithm |
| 2001 | mouse popular mouse 3d popular multiplayer 3d computing powerful 3d | advanced algorithms computationally feasible researchers improve modeling useful interface powerful | software engineers designer needs designing ontology xml software documentation engineers |
| 2002 | libraries tutorial databases attractive library technology efficient indexing databases tutoring | designing web auctions improving expanding rehearsal bioinformatics emerging search engines | offering algorithms tackling algorithms semantic web algorithms software web query |
| 2003 | optimization cancer audition algorithms partitioning algorithms algorithms comparison algorithms haplotype | music photo2video retrieves songs music concert music database new songs | simplifying web internet experiments popular web online semantic new spam |
| 2004 | algorithm updating novel algorithms algorithms lessons developing algorithms algorithms learning | toolkit debugging browser optimizations search algorithms web verification algorithms methodology | simplifying web internet experiments popular web online semantic new spam |
| 2005 | valuable indexing algorithms improving efficient ontology interesting research important crosscutting | algorithms methodology magic instructional data webgazeanalyzer webgazeanalyzer brings laser pointer | holistic algorithms novel internetworking smart phones algorithms scalability wireless broadband |
| 2006 | valuable indexing algorithms improving efficient ontology interesting research important crosscutting | research challenging executed twice algorithms counter bioinformatics motivation steep learning | challenge traffic traffic engineering algorithms damping algorithms research engineering algorithms |
| 2007 | servers cheating privacy vulnerabilities online personalization leakage internet online anonymity | huge database hashing billions largest commerce amazon highly huge collections | major browsers algorithms large algorithms widely extensive programming fastdash developer |
| 2008 | wikipedia huge browsers popular huge databases favourite websites winning podcasts | study robots survey robots querybuilder query querylogs bundling botnet detection | algorithm recommender study novel research recent tagging podcasts researchers paper |
| 2009 | rfid popular patents important rfid algorithms patents essential tagging expert | browsers actively growth online online advertising online surveys pushing browsers | internet researchers importance researchers benchmarking browsers research researchers data researchers |
| 2010 | new algorithms hashtag innovation evolving wiki new apps discovery bioinformatics | research revolutionize expensive simulators database huge budget challenging consumption skyrocketed | wikiprojects increased pagerank algorithm playlists photoselect brainstorming stylus retagging online |
| 2011 | design tutorial bioinformaticians designing designing sparql clustering bioinformatics tutorials technologies | driver safety automobiles tutorial refocus driving driver infotainment opportunistic driver | attractive websites understanding internet moderating online good websites modeling internet |
| 2012 | comics techcommix cancer increased cyberinfrastructure scientist algorithms physicists scientists seeking | improving tweet threefold allows reducing aliasing algorithm reducing strategies threefold | avatar conferencing rearranging videos video hashing video tutorials media tutorials |
| 2013 | detectives solving algorithms greedy played detectives crime notepad detecting cyberbullying | latest poll proposed algorithm tweeted wedding motivation graphbuilder research innovations | project openstreetmap tutorial overview openstreetmap editors pattern openstreetmap modeling tutorial |
| 2014 | discriminating online misinformation crowdturfing instagram traffic kickstarter interview internet challenging | improving online improvements online google overtaking web analytics wikipedia benefitted | pagerank algorithm web searchers search engines search engine online surveys |
| 2015 | apps revolutionize apps changing simplifying mobile investigating smartphone developing smartwatch | improve genetic experiments expensive expensive experiments novel algorithms quantitative genetic | youtube tutorials movielens netflix netflix datasets netflix dataset youtube flickr |
| 2016 | learning analytics analytics learning agile analytics learning smartwatches learning evolution | google facebook facebook microsoft facebook conducting traffic staggering economics rigorously | chatbots developers needed mobile designing android prototype chatbot apps study |
| 2017 | videos rebooting videoconferencing application video tutorials video study videoconferencing | google facebook facebook microsoft facebook conducting traffic staggering economics rigorously | online bullying facebook misleading bullying twitter online distractions reducingcontroversy homepage |
| 2018 | videos rebooting videoconferencing application video tutorials video study videoconferencing | important research seismic interpreters cryptography needed noisy training research challenges | crowdsourcing analytics reuse networking cache management cache reuse web transformational |


