I am a Professor of Software Engineering in the Department of Computer Science at the University of York, where I research and teach automated and model-based software engineering. I am also an Eclipse Foundation committer, leading the development of the open-source Epsilon model-based software engineering platform, an associate editor of IET Software, and a member of the program committees of MODELS and ICSE.

Looking for a PhD in Software Engineering?

I am recruiting PhD candidates in the field of software engineering, on topics such as model-based software engineering, engineering of big-data analytics systems, and mining software repositories. If you have strong object-oriented design and development skills and you would like to join a world-class research group in Britain's best place to live, please read on.

Featured Publications

I have co-authored more than 100 peer-reviewed journal, conference and workshop papers, some of which are listed below.

journal Zolotas, A., Matragkas, N., Devlin, S., Kolovos, D.S. & Paige, R.F. Type inference in flexible model-driven engineering using classification algorithms. Software and Systems Modeling, pages 1-22, 2018.

conference Zolotas, A., Rodriguez, H.H., Kolovos, D.S., Paige, R.F. & Hutchesson, S. Bridging Proprietary Modelling and Open-Source Model Management Tools: The Case of PTC Integrity Modeller and Epsilon. In 20th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2017, Austin, TX, USA, September 17-22 (Best paper award), pages 237-247, 2017.

journal Zolotas, A., Clarisó, R., Matragkas, N., Kolovos, D.S. & Paige, R.F. Constraint programming for type inference in flexible model-driven engineering. Computer Languages, Systems and Structures, 49:216-230, 2017.

journal Ajit, S., Holmes, C., Johnson, J., Kolovos, D.S. & Paige, R.F. Model-based tool support for Tactical Data Links: an experience report from the defence domain. Software and Systems Modeling, 16(2):559-586, 2017.

journal Kolovos, D.S., García-Domínguez, A., Rose, L.M. & Paige, R.F. Eugenia: towards disciplined and automated development of GMF-based graphical model editors. Software and Systems Modeling, 16(1):229-255, 2017.

conference García-Domínguez, A., Barmpis, K., Kolovos, D.S., Silva, M.A.A.d., Abherve, A. & Bagnato, A. Integration of a graph-based model indexer in commercial modelling tools. In Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, Saint-Malo, France, October 2-7, 2016, pages 340-350, 2016.

Research Grants

My research has been funded through several national and European Commission grants, some of which appear below.

Innovate UK KTP (Rolls-Royce): Knowledge Transfer Partnership with Rolls-Royce on domain-specific modelling and model-based engineering of aerospace systems (Principal Investigator, 2018-2021)

TYPHON: EC H2020 project on polyglot and hybrid persistence architectures for Big Data analytics (Principal Investigator, 2018-2020)

CROSSMINER: EC H2020 project on knowledge extraction from open-source software (Principal Investigator, 2017-2019)

Innovate UK KTP (Smith & Nephew): Knowledge Transfer Partnership with Smith & Nephew on applications of MDE technologies in the medical device domain (Principal Investigator, 2017-2019)

Innovate UK KTP (IBM): Knowledge Transfer Partnership with IBM on applications of MDE technologies in the data management domain (Co-investigator, 2017-2020)

ESRC National Productivity Investment Fund: Collaborative project (with Keyfort) on delivery of mental health support services through mobile apps (Principal Investigator, 2017-2018)

MONDO: EC FP7 STREP project on scalable model-driven engineering (Principal Investigator, 2013-2016)

Innovate UK KTP (JC Chapman): Knowledge Transfer Partnership with JC Chapman on applications of MDE technologies in the financial domain (Principal Investigator, 2016)

Research Interests

My research interests are in model-based software and systems engineering, in mining software repositories and communities to extract actionable insights, and in technologies for persisting and analysing large volumes of heterogeneous data.

Model-Based Software Engineering

Model-based software engineering (MBSE) is the practice of treating models as first-class artefacts of the software engineering process: models are used to analyse, simulate and reason about properties of the system under development and, in many cases, to generate part of its implementation automatically.

MBSE adapts well-understood and long-established principles and practices of trustworthy systems engineering to software engineering: it would be unthinkable to start constructing a bridge or an aircraft without first designing and analysing several models of it. MBSE is used extensively in organisations that produce business- or safety-critical software (e.g. in the aerospace, automotive and robotics industries), where defects can have catastrophic effects (e.g. loss of life) or can be very expensive to remedy (e.g. a large-scale product recall). It is also increasingly used for non-critical systems because of the productivity and consistency benefits it delivers, largely through automated code generation (e.g. JHipster for microservice architectures).
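To make the idea of generating code from a model concrete, here is a minimal sketch in Python. It is not how Epsilon or JHipster work internally; the Entity and Attribute classes and the validate and generate_class functions are hypothetical names invented for this illustration. The model is first analysed (a simple well-formedness check) and then transformed into source text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """One typed attribute of a model element (hypothetical metamodel)."""
    name: str
    type: str


@dataclass
class Entity:
    """A model element that should end up as a class in the generated code."""
    name: str
    attributes: List[Attribute] = field(default_factory=list)


def validate(entity: Entity) -> List[str]:
    """Analyse the model before generation: report duplicate attribute names."""
    seen = set()
    problems = []
    for attr in entity.attributes:
        if attr.name in seen:
            problems.append(f"{entity.name}: duplicate attribute '{attr.name}'")
        seen.add(attr.name)
    return problems


def generate_class(entity: Entity) -> str:
    """Model-to-text transformation: emit Python source for one entity."""
    params = "".join(f", {a.name}: {a.type}" for a in entity.attributes)
    lines = [f"class {entity.name}:", f"    def __init__(self{params}):"]
    if entity.attributes:
        lines += [f"        self.{a.name} = {a.name}" for a in entity.attributes]
    else:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


# Illustrative model: a sensor with two attributes.
sensor = Entity("Sensor", [Attribute("id", "int"), Attribute("label", "str")])
if not validate(sensor):
    print(generate_class(sensor))
```

Changing the model and regenerating keeps every generated class consistent with it, which is where the consistency benefit of this style of development comes from.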

I have authored many highly cited peer-reviewed papers on topics related to MBSE and I lead the development of the Epsilon open-source MBSE platform under the Eclipse Foundation, which has a wide user base, including engineers at organisations such as NASA, IBM, BAE Systems and THALES. I am on the Program Committee of the ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, and I have been the Technical Director of a large European Commission project (MONDO, €2.67M) which investigated techniques for scaling up MBSE technologies for very large systems. I am currently involved in knowledge transfer projects with Rolls-Royce, Smith & Nephew and IBM, which aim to apply the results of our MBSE research to problems of interest to our industry partners.

Big Data Persistence and Analytics Architectures

The need for levels of availability and scalability beyond those supported by relational databases has led to the emergence of a new generation of purpose-specific databases grouped under the term NoSQL. In general, NoSQL databases are designed with horizontal scalability as a primary concern and deliver increased availability and fault-tolerance at the cost of temporary inconsistency and reduced durability of data. To balance the requirements for data consistency and availability, organisations increasingly migrate towards hybrid data persistence architectures comprising both relational and NoSQL databases. The consensus is that this trend will only become stronger in the future: critical data will continue to be stored in ACID (predominantly relational) databases, while non-critical data will be progressively migrated to high-availability NoSQL databases.
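The split between critical and non-critical data can be sketched as follows. This is an illustration only, not part of TYPHON: the HybridStore class and its method names are hypothetical, SQLite (from the Python standard library) stands in for the ACID relational store, and a plain in-memory dictionary stands in for a document-oriented NoSQL store.

```python
import sqlite3


class HybridStore:
    """Routes critical records to an ACID store and the rest to a document store."""

    def __init__(self):
        # Relational side: transactional, schema-enforced.
        self.relational = sqlite3.connect(":memory:")
        self.relational.execute(
            "CREATE TABLE payments (id INTEGER PRIMARY KEY, amount_pence INTEGER NOT NULL)"
        )
        # Stand-in for a NoSQL document store: schemaless, keyed by session.
        self.documents = {}

    def record_payment(self, payment_id, amount_pence):
        # Critical data: the connection context manager commits on success
        # and rolls back if the statement raises.
        with self.relational:
            self.relational.execute(
                "INSERT INTO payments VALUES (?, ?)", (payment_id, amount_pence)
            )

    def record_click(self, session_id, event):
        # Non-critical data: appended without any transactional guarantees.
        self.documents.setdefault(session_id, []).append(event)

    def total_payments(self):
        row = self.relational.execute(
            "SELECT COALESCE(SUM(amount_pence), 0) FROM payments"
        ).fetchone()
        return row[0]


store = HybridStore()
store.record_payment(1, 1250)
store.record_click("session-a", {"page": "/checkout"})
print(store.total_payments(), store.documents)
```

In a real deployment the two sides would be separate database systems, and queries that span both are exactly where such architectures become hard to design and evolve.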

TYPHON is a European Commission H2020 project (2018-2020, €4.4M) of which I am the Technical Director. It aims to provide a methodology and an integrated technical offering for designing, developing, querying and evolving scalable architectures for persistence, analytics and monitoring of large volumes of hybrid (relational, graph-based, document-based, natural language etc.) data.

TYPHON brings together research partners with a long track record of conducting internationally-leading research on software modelling, domain-specific languages, text mining and data migration, and of delivering research results in the form of robust and widely-used open-source software; industrial partners active in the automotive, earth observation, banking, and motorway operation domains; an industrial advisory board of world-class experts in the fields of databases, business intelligence and analytics, and large-scale data management; and a global consortium including more than 400 organisations from all sectors of IT.

Mining Software Repositories and Communities

Deciding whether an open source software (OSS) product or component meets the required standards for adoption in terms of quality, maturity, activity of development and user support is not a straightforward process. It involves exploring several sources of information: its source code repositories, to identify how actively the code is developed, which programming languages are used, how well the code is commented, whether there are unit tests etc.; communication channels such as newsgroups, forums and mailing lists, to identify whether user questions are answered in a timely and satisfactory manner and to estimate the number of experts and users of the software; its bug tracking system, to identify whether the software has many open bugs and the rate at which bugs are fixed; and other relevant metadata such as the number of downloads, the license(s) under which it is made available, its release history etc.
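A few of the indicators mentioned above can be computed with very little code once the raw data has been extracted. The sketch below is a simplified illustration with hypothetical names (Bug, open_bug_ratio, mean_days_to_fix, comment_ratio); it does not reproduce the metrics of any particular platform, and the comment check only recognises lines that start with '#'.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Bug:
    """A bug report as it might be exported from a tracker (hypothetical shape)."""
    opened: date
    closed: Optional[date] = None  # None means the bug is still open


def open_bug_ratio(bugs: List[Bug]) -> float:
    """Fraction of reported bugs that are still open."""
    if not bugs:
        return 0.0
    return sum(1 for b in bugs if b.closed is None) / len(bugs)


def mean_days_to_fix(bugs: List[Bug]) -> Optional[float]:
    """Average number of days between opening and closing, over closed bugs."""
    durations = [(b.closed - b.opened).days for b in bugs if b.closed is not None]
    return sum(durations) / len(durations) if durations else None


def comment_ratio(source: str) -> float:
    """Fraction of non-blank lines that are '#' comment lines."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return sum(1 for line in lines if line.startswith("#")) / len(lines)


# Illustrative data, not taken from any real project.
bugs = [
    Bug(date(2018, 1, 1), date(2018, 1, 11)),
    Bug(date(2018, 1, 5), date(2018, 1, 9)),
    Bug(date(2018, 2, 1)),
]
print(open_bug_ratio(bugs), mean_days_to_fix(bugs))
print(comment_ratio("# setup\nx = 1\n\n# result\ny = 2\n"))
```

Individual numbers like these say little on their own; they become useful when tracked over time and compared across candidate projects.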

Having been involved in open-source software development for more than a decade, I have developed an interest in automatically analysing software repositories and communities to guide and support future software development. I was the Technical Director of OSSMETER (2011-14, €2.7M), a European Commission FP7 project that developed a platform for incremental analysis of source code repositories, bug trackers and communication channels, to support decision makers in the process of discovering, comparing, assessing and monitoring the health, quality, impact and activity of open-source software.

I am currently a principal investigator in a follow-up project (CROSSMINER, 2017-20, €4.4M) which is investigating techniques for mining information from different sources and making it available within the Eclipse IDE. CROSSMINER builds on the results of OSSMETER and is the driving force behind the new Eclipse CROSSMETER project, where I am a committer.