About the machine learning project
The government's ambition is to create a patient-centric health service in which patients' voices are heard, including by strengthening their involvement in decision-making and in the development and evaluation of healthcare services. This innovation project, funded by the Research Council of Norway (NFR), aims to develop and test tools for sentiment analysis of patient comments written in Norwegian.
A key patient-centred tool at the national level is the national system for measuring patient experiences. Free-text comments from these surveys are highly relevant for clinicians and leaders in their quality improvement efforts. However, they are largely unused due to the time and resources required to analyse them.
Natural Language Processing (NLP) is a branch of computer science and artificial intelligence concerned with the automated analysis of human language, typically using machine learning models. Sentiment analysis is one NLP task; it aims to identify subjective attitudes in a text: whether an opinion is positive or negative, and to whom or what the opinion refers. Aspect-based sentiment analysis goes a step further by linking the identified targets to broader topic categories.
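To make the distinction concrete, the sketch below shows one possible way to represent the output of aspect-based sentiment analysis: an opinion with a target, a polarity, and the broader topic category the target is linked to. The class name, the function name, and the target-to-category mapping are hypothetical and chosen only for illustration; they are not the project's annotation scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    target: str    # who or what the opinion refers to
    polarity: str  # "positive" or "negative"
    aspect: str    # broader topic category the target is linked to


# Hypothetical mapping from opinion targets to topic categories.
# A real system would learn or curate this; it is hard-coded here for illustration.
TARGET_TO_ASPECT = {
    "doctor": "staff",
    "nurse": "staff",
    "waiting time": "accessibility",
}


def link_aspect(target: str, polarity: str) -> Opinion:
    """Attach a topic category to an identified target (unknown targets map to 'other')."""
    aspect = TARGET_TO_ASPECT.get(target.lower(), "other")
    return Opinion(target=target, polarity=polarity, aspect=aspect)


opinions = [
    link_aspect("doctor", "positive"),
    link_aspect("waiting time", "negative"),
]
for opinion in opinions:
    print(opinion)
```

Plain sentiment analysis would stop at the target and polarity fields; the aspect field is what the aspect-based step adds.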
Sentiment analysis has been introduced to analyse comments from patients in international health service research, but these tools are both domain- and language-specific and have not been developed for Norwegian text in the field of health. This project will therefore develop and evaluate resources and tools for aspect-based sentiment analysis of free-text comments in Norwegian. We will use patient comments from the Norwegian Institute of Public Health's (NIPH) national patient experience surveys as a data source when developing the model.
The results from this project will be of great value to the national system for patient experience surveys, as well as to other parts of NIPH and the public sector in general. Automatic analysis of comments has wide application in the public sector, and the project will also lead to efficiency gains and cost savings.
The innovation project is formally organised within the Division for Health Services at NIPH. The University of Oslo, Research Group for Language Technology, is a partner in the project. Various actors in the health service will be involved to ensure that the results are as relevant as possible for the services.
Work Package 1
The goal of this work package is to adapt a machine learning model that can automatically classify unstructured patient comments as positive or negative and estimate the degree of polarity. This requires gold-standard data for both training and testing. Gold-standard data are comments that have been manually annotated for polarity at both the sentence and the comment level. The first phase will therefore involve manual annotation of comments for polarity; the annotated comments will then be used to construct training and test sets. Neural models, including large language models for Norwegian (1), will be the starting point for adapting the model to the task and domain.
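As a minimal sketch of how gold-standard annotations might be turned into training and test sets, the example below splits labelled comments so that each polarity class is represented in both sets, and scores predictions against the held-out gold labels. The function names, the 80/20 split, the placeholder comments, and the use of simple "positive"/"negative" labels are assumptions for illustration; the source does not specify the project's actual procedure.

```python
import random
from collections import defaultdict


def stratified_split(examples, test_fraction=0.2, seed=0):
    """Split (text, label) pairs into train and test sets, label by label.

    At least one example per label goes to the test set, so a label with a
    single example would end up with no training data.
    """
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def accuracy(gold_labels, predicted_labels):
    """Share of predictions that match the manually annotated labels."""
    matches = sum(g == p for g, p in zip(gold_labels, predicted_labels))
    return matches / len(gold_labels)


# Placeholder comments standing in for manually annotated patient comments.
gold = [(f"comment {i}", "positive" if i % 2 == 0 else "negative") for i in range(10)]
train_set, test_set = stratified_split(gold)
print(len(train_set), len(test_set))
```

Keeping the test set separate from the data used to adapt the model is what allows the gold-standard annotations to serve as an independent check on model quality.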
Work Package 2
This work package involves developing and testing algorithms that automatically classify the content of comments into main themes and sub-themes through aspect-based sentiment analysis. Aspect-based sentiment analysis will help to classify patient experiences across different aspects of health services, such as diagnosis and follow-up. The first phase involves manually annotating free-text comments against the conceptual framework for each questionnaire (2-3), then developing a domain-specific overview of aspects and performing a fine-grained annotation. The annotated data will be used for both training and evaluation of the model. Here, too, we will test large language models and use them as a starting point for the model (1, 4).
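The two-level structure of main themes and sub-themes can be sketched as a small taxonomy against which fine-grained annotations are checked and then counted. The main themes "diagnosis" and "follow-up" come from the text above; the sub-themes, function names, and example annotations are hypothetical and do not reflect the conceptual frameworks cited.

```python
from collections import Counter

# Main themes from the text; the sub-themes are invented for illustration.
TAXONOMY = {
    "diagnosis": {"information", "waiting time"},
    "follow-up": {"continuity", "coordination"},
}


def is_valid(theme: str, sub_theme: str) -> bool:
    """Check that an annotation uses a sub-theme that belongs to its main theme."""
    return sub_theme in TAXONOMY.get(theme, set())


def summarise(annotations):
    """Count valid annotations per (main theme, polarity) pair."""
    counts = Counter()
    for theme, sub_theme, polarity in annotations:
        if is_valid(theme, sub_theme):
            counts[(theme, polarity)] += 1
    return counts


# Hypothetical fine-grained annotations: (main theme, sub-theme, polarity).
annotations = [
    ("diagnosis", "information", "positive"),
    ("diagnosis", "waiting time", "negative"),
    ("follow-up", "continuity", "negative"),
    ("follow-up", "information", "positive"),  # rejected: sub-theme not under this theme
]
summary = summarise(annotations)
print(summary)
```

Validating annotations against a fixed taxonomy is one simple way to keep manual annotation consistent before the data are used for training and evaluation.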
Work Package 3
The model developed for the analysis of polarity and aspects (work packages 1 and 2) will be used to create the first version of provider-level reports. These reports will primarily be directed at health service providers, such as general practitioners. The evaluation will investigate how the new sentiment analysis results are used in practice and what promotes or inhibits the use of patient experiences for quality improvement. This will involve qualitative interviews with health personnel and electronic surveys.
Work Package 4
The new data sources, methods, and tools developed will be documented in a report. This will lay the foundation for integration with the system for national patient experience surveys and the possibility of adaptation to other public applications. The documentation process will include workshops with the entire research team at NIPH and meetings with all project partners. User participation will be a central factor throughout the process to ensure relevance and utility for the main target group.
The innovation project will be documented in a report, opinion articles in the press, and popular science articles. In addition, we will publish results in scientific articles in international journals.