About

The Computer Science Ontology (CSO) is a large-scale ontology of research areas that was automatically generated by running the Klink-2 algorithm [1] on the Rexplore dataset [2], which consists of about 16 million publications, mainly in the field of Computer Science. Klink-2 combines semantic technologies, machine learning, and knowledge from external sources to automatically generate a fully populated ontology of research areas. Some relationships were also revised manually by experts during the preparation of two ontology-assisted surveys in the fields of Semantic Web and Software Architecture. The main root of CSO is Computer Science; however, the ontology also includes a few secondary roots, such as Linguistics, Geometry, and Semantics.

CSO offers two main advantages over the manually crafted categorisations used in Computer Science (e.g., the 2012 ACM Classification, the Microsoft Academic Search Classification). First, it characterises higher-level research areas by means of hundreds of sub-topics and related terms, which makes it possible to map very specific terms to higher-level research areas. Second, it can be easily updated by running Klink-2 on a set of new publications. A more comprehensive discussion of the advantages of adopting an automatically generated ontology in the scholarly domain can be found in [3].
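As a rough illustration of the first advantage, the sketch below walks a small topic hierarchy upwards to find every higher-level area that a specific term belongs to. The hierarchy shown is a hand-written toy fragment, not data copied from CSO, and the names SUPER_TOPICS and ancestors are hypothetical.

```python
# Toy fragment of a topic hierarchy: topic -> its direct super-topics.
# These entries are illustrative assumptions, not taken from CSO itself.
SUPER_TOPICS = {
    "sparql": ["semantic web"],
    "linked data": ["semantic web"],
    "semantic web": ["world wide web", "artificial intelligence"],
    "world wide web": ["computer science"],
    "artificial intelligence": ["computer science"],
}


def ancestors(topic, super_topics=SUPER_TOPICS):
    """Return the set of all higher-level areas reachable from `topic`."""
    seen = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        for parent in super_topics.get(current, []):
            if parent not in seen:  # a topic may have several super-topics
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("sparql")))
```

Because a topic may have more than one super-topic, the traversal keeps a set of visited areas rather than assuming a strict tree.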

Data Model

The CSO model is an extension of SKOS. It includes eight semantic relations:

Resource Exploration

Each resource is available at its own URI. For instance, the resource 'semantic web' is browsable at the URI https://cso.kmi.open.ac.uk/topics/semantic_web.

The CSO Portal supports content negotiation, so the same resource (URI) can be served in different representations. The following formats are available:

Format | Header | Resource
HTML | - | semantic web
RDF/XML | application/rdf+xml | semantic web.rdf or semantic web.xml
Turtle | text/turtle | semantic web.ttl
JSON-LD | application/json or application/ld+json | semantic web.json or semantic web.jsonld
N-Triples | application/n-triples | semantic web.nt
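A minimal sketch of how a client might request one of these representations through the Accept header, using only the Python standard library. The helper names (topic_uri, build_request, fetch) are hypothetical, and the rule that spaces in a label become underscores is an assumption inferred from the 'semantic web' example URI above.

```python
import urllib.request

BASE_URI = "https://cso.kmi.open.ac.uk/topics/"

# Short format name (hypothetical keys) -> Accept header value listed above.
ACCEPT_HEADERS = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
    "ntriples": "application/n-triples",
}


def topic_uri(label):
    """Build a resource URI from a topic label.

    Assumption: spaces map to underscores, as in the 'semantic web' example.
    """
    return BASE_URI + label.strip().lower().replace(" ", "_")


def build_request(label, fmt):
    """Prepare a GET request asking for the given representation."""
    return urllib.request.Request(
        topic_uri(label), headers={"Accept": ACCEPT_HEADERS[fmt]}
    )


def fetch(label, fmt, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(label, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, fetch("semantic web", "turtle") would ask the portal for the Turtle representation of the 'semantic web' resource; whether the request succeeds depends on the portal being reachable.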

CSO Uptake

CSO was officially released in 2019 and has already been adopted by several major organizations, including Springer Nature.
In the last two years, CSO has supported the creation of many innovative applications and technologies, including ontology-driven topic models (e.g., CoCoNoW (Beck et al., 2020)), recommender systems for articles (e.g., SBR (Thanapalasingam et al., 2018)) and video lessons (Borges & dos Reis, 2019), visualisation frameworks (e.g., ScholarLensViz (Loffler et al., 2020), ConceptScope (Zhang et al., 2021)), temporal knowledge graphs (e.g., TGK (Rossanez et al., 2020)), NLP frameworks for entity extraction (Dessi et al., 2021), tools for identifying domain experts (e.g., VeTo (Vergoulis et al., 2020)), and systems for predicting academic impact (e.g., ArtSim (Chatzopoulos et al., 2020)).
It has also been used for several large-scale analyses of the literature (e.g., Cloud Computing (Lula et al., 2021), Software Engineering (Chicaiza & Reategui, 2020), Ecuadorian publications (Chicaiza & Reategui, 2020)).

Applications

Smart Topic Miner. The Smart Topic Miner (STM) [4] is a tool which uses semantic web technologies to classify scholarly publications on the basis of a very large automatically generated ontology of research areas. It was developed to support the Springer Nature Computer Science editorial team in classifying proceedings. A demo of the system is available at http://stm-demo.kmi.open.ac.uk/.

Smart Book Recommender. The Smart Book Recommender (SBR) [5] is a semantic application designed to support the Springer Nature editorial team in promoting their publications at Computer Science venues. It takes as input the proceedings of a conference and suggests books, journals, and other conference proceedings which are likely to be relevant to the attendees of the conference in question. A demo of the system is available at http://rexplore.kmi.open.ac.uk/SBR_demo/.

Rexplore. Rexplore [2] is a system which leverages novel solutions in large-scale data mining, semantic technologies and visual analytics, to provide an innovative environment for exploring and making sense of scholarly data.

EDAM methodology. EDAM [6] is a novel expert-driven automatic methodology for creating systematic reviews that keeps human experts in the loop but does not require them to check all papers included in the analysis.

Research Communities Map Builder. Temporal Semantic Topic-Based Clustering (TST) [7, 8] is an approach for detecting research communities by clustering researchers according to their research trajectories, defined as distributions of topics over time.

The CSO Classifier. The CSO Classifier [9] is an unsupervised approach for automatically classifying research papers according to the Computer Science Ontology. The classifier takes as input the metadata of a research paper (usually title, abstract, and keywords) and returns a set of research topics drawn from the ontology.
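To make the input and output shape concrete, here is a deliberately simplified sketch that matches word n-grams from a paper's metadata against a handful of topic labels. It is not the CSO Classifier's actual method; the topic set, the classify function, and the sample paper are all illustrative assumptions.

```python
import re

# Illustrative topic labels only; the real classifier draws its topics from CSO.
TOPICS = {"semantic web", "ontology", "machine learning", "digital libraries"}


def ngrams(tokens, max_n=3):
    """Yield every run of 1 to max_n consecutive tokens as a string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def classify(paper, topics=TOPICS):
    """Return the topic labels that appear verbatim in the paper metadata."""
    text = " ".join(paper.get(field, "") for field in ("title", "abstract", "keywords"))
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({gram for gram in ngrams(tokens) if gram in topics})


paper = {
    "title": "Classifying research papers with an ontology",
    "abstract": "We combine semantic web technologies and machine learning.",
    "keywords": "digital libraries",
}
print(classify(paper))
```

Exact string matching misses synonyms, plurals, and topics that are only implied by the text, which is part of why a dedicated classifier is needed in practice.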

Academia/Industry DynAmics Knowledge Graph. The Academia/Industry DynAmics (AIDA) Knowledge Graph [10] describes 21M publications and 8M patents according to the research topics drawn from the Computer Science Ontology. 5.1M publications and 5.6M patents are further characterized according to the type of the authors' affiliations (academia, industry, or collaborative) and 66 industrial sectors (e.g., automotive, financial, energy, electronics) organized in a two-level taxonomy.

People

Steering Committee

Aliaksandr Birukou

Executive Editor, Springer-Verlag GmbH

Enrico Motta

Professor of Knowledge Technologies

Francesco Osborne

Research Fellow

Team

Enrico Motta

Professor of Knowledge Technologies

Francesco Osborne

Research Fellow

Angelo Salatino

Research Associate

Alumni

Andrea Mannocci

Research Associate

Thiviyan Thanapalasingam

Research Assistant

How to Cite CSO

Please cite the following paper:

Salatino, Angelo A., Thiviyan Thanapalasingam, Andrea Mannocci, Francesco Osborne, and Enrico Motta. "The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas." International Semantic Web Conference 2018, Monterey (CA), USA, 2018. http://oro.open.ac.uk/55484/

Relevant Papers

[1] Osborne, F. and Motta, E. (2015) Klink-2: Integrating Multiple Web Sources to Generate Semantic Topic Networks, International Semantic Web Conference 2015, Bethlehem, Pennsylvania, USA

[2] Osborne, F., Motta, E. and Mulholland, P. (2013) Exploring Scholarly Data with Rexplore, International Semantic Web Conference, Sydney, Australia

[3] Osborne, F. and Motta, E. (2012) Mining Semantic Relations between Research Areas, International Semantic Web Conference, Boston, MA

[4] Osborne, F., Salatino, A., Birukou, A. and Motta, E. (2016) Automatic Classification of Springer Nature Proceedings with Smart Topic Miner. International Semantic Web Conference 2016, Kobe, Japan.

[5] Osborne, F., Birukou, A., Thanapalasingam, T. and Motta, E. (2017) Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products. International Semantic Web Conference 2017, Poster Track. Vienna, Austria.

[6] Osborne, F., Lago, P., Muccini, H., Motta, E. (2018) Reducing the Effort for Systematic Reviews in Software Engineering.

[7] Osborne, F., Scavo, G. and Motta, E. (2014) A Hybrid Semantic Approach to Building Dynamic Maps of Research Communities, EKAW 2014, Linkoping, Sweden.

[8] Osborne, F., Scavo, G. and Motta, E. (2014) Identifying diachronic topic-based research communities by clustering shared research trajectories, Extended Semantic Web Conference 2014, Crete, Greece.

[9] Salatino, A., Osborne, F., Thanapalasingam, T. and Motta, E. (2019) The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly Articles, Theory and Practice of Digital Libraries, Oslo, Norway.

[10] Angioni, S., Salatino, A., Osborne, F., Reforgiato Recupero, D. and Motta, E. (2020) Integrating Knowledge Graphs for Analysing Academia and Industry Dynamics, Workshop on Scientific Knowledge Graphs 2020, Lyon, France.

License

This work is licensed under a Creative Commons Attribution 4.0 International License.