SIPHER Products

SIPHER’s Qualitative Products

SIPHER’s Data Products

SIPHER’s Quantitative Products



Causal Systems Mapping

Snapshot from a causal loop diagram specifying the pathways by which quality-adjusted life expectancy (QALE) is influenced.

Characteristic Details
Status Ready
Purpose Concise visual representations of SIPHER’s policy areas of interest, including inclusive economies, public mental health and housing, which capture the causal connections between parts of a system.
Strengths There are a range of causal system mapping approaches, with different strengths and limitations, and the choice of which systems mapping approach to use is determined by the problem. Different systems mapping approaches have been used in SIPHER, including participatory systems mapping and causal loop diagrams. Generally, the strengths of systems mapping are bringing together information from different sources, including documents and stakeholders’ tacit knowledge and presenting it visually, which better reflects the underlying complexity. The maps can bring together a range of perspectives on a topic and be used: to analyse the structure of the system; as tools for thinking and discussion; or developed into quantitative models to test scenarios.
Limitations Complex and comprehensive causal systems maps can be overwhelming and may not be easily useable in policy settings or for computational modelling. In contrast, simplified systems maps may appear more useable but may not capture all relevant variables. Systems maps developed in workshop settings are typically driven by the participants and their understanding of the system, therefore the maps developed reflect participants’ knowledge and experience.
Variables Multiple systems maps have been developed for different policy areas of interest and different policy partners; variables are dependent on the maps.
Examples / Link with Other Models and Data A causal loop diagram connecting the SIPHER Inclusive Economy indicators underlies the Inclusive Economy Dynamical Systems model.
Additional Resources Explore SIPHER’s approach to systems mapping: https://www.gla.ac.uk/research/az/sipher/systemsmapping/ Clackmannanshire Inclusive Economy system map: https://kumu.io/Sipher-Consortium/clacks-systems-map#clacks-ie-policy-map-final
Contact
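
To illustrate what "analysing the structure of the system" can mean for a causal loop diagram, the sketch below stores a diagram as signed links, lists its feedback loops, and labels each loop as reinforcing or balancing. The variables and link signs are hypothetical and are not taken from any SIPHER map; the loop search is a plain depth-first walk chosen for brevity, not a description of SIPHER's tooling.

```python
from math import prod

# Hypothetical signed links, for illustration only (not from a SIPHER map).
# +1: the target moves in the same direction as the source; -1: the opposite direction.
links = {
    ("employment", "income"): +1,
    ("income", "health"): +1,
    ("health", "employment"): +1,
    ("employment", "workload"): +1,
    ("workload", "health"): -1,
}


def find_loops(links):
    """Return every simple feedback loop once, as a list of variable names."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, []).append(target)
    loops = []

    def walk(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                loops.append(path[:])
            elif nxt not in path and nxt > start:
                # Only visit names that sort after the start node, so each
                # loop is reported once, beginning at its smallest name.
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops


def loop_polarity(loop, links):
    """A loop is reinforcing if the product of its link signs is positive."""
    signs = [links[(a, b)] for a, b in zip(loop, loop[1:] + loop[:1])]
    return "reinforcing" if prod(signs) > 0 else "balancing"


loops = find_loops(links)
polarities = [loop_polarity(loop, links) for loop in loops]
for loop, polarity in zip(loops, polarities):
    print(" -> ".join(loop), ":", polarity)
```

A loop with an even number of negative links amplifies an initial change (reinforcing), while an odd number dampens it (balancing).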



Compare All Qualitative Products

Characteristic | Employment and Health Evidence and Gap Map | Causal Systems Mapping
Status | Ready | Ready
Purpose A visual and interactive resource to locate published systematic reviews on the topic of health and work. Concise visual representations of SIPHER’s policy areas of interest, including inclusive economies, public mental health and housing, which capture the causal connections between parts of a system.
Strengths The primary strength lies in the simplification of complex and diverse research findings. The interactive map contains studies that have explored the relationship between an employment feature and a health and social outcome. The map only contains systematic reviews. There are a range of causal system mapping approaches, with different strengths and limitations, and the choice of which systems mapping approach to use is determined by the problem. Different systems mapping approaches have been used in SIPHER, including participatory systems mapping and causal loop diagrams. Generally, the strengths of systems mapping are bringing together information from different sources, including documents and stakeholders’ tacit knowledge and presenting it visually, which better reflects the underlying complexity. The maps can bring together a range of perspectives on a topic and be used: to analyse the structure of the system; as tools for thinking and discussion; or developed into quantitative models to test scenarios.
Limitations Does not provide any analysis on the studies identified. Users may not have access to all academic papers that are captured in the evidence and gap map as some of the covered material is not open access. Complex and comprehensive causal systems maps can be overwhelming and may not be easily useable in policy settings or for computational modelling. In contrast, simplified systems maps may appear more useable but may not capture all relevant variables. Systems maps developed in workshop settings are typically driven by the participants and their understanding of the system, therefore the maps developed reflect participants’ knowledge and experience.
Variables A range of work related characteristics (including contract conditions, employer attributes and working environment) and health related measures (including physical health outcomes and psychological health). Multiple systems maps have been developed for different policy areas of interest and different policy partners; variables are dependent on the maps.
Examples / Link with Other Models and Data Informs model building and interpretation of quantitative findings other workstreams have obtained. A causal loop diagram connecting the SIPHER Inclusive Economy indicators underlies the Inclusive Economy Dynamical Systems model.
Additional Resources Explore further with link to the interactive tool: https://www.gla.ac.uk/research/az/sipher/products/employmentandhealthegm/ Explore SIPHER’s approach to systems mapping: https://www.gla.ac.uk/research/az/sipher/systemsmapping/ Clackmannanshire Inclusive Economy system map: https://kumu.io/Sipher-Consortium/clacks-systems-map#clacks-ie-policy-map-final
Contact




Compare All Data Products

Characteristic | SIPHER Synthetic Population | Health Indicator Dataset | Inclusive Economy Indicator Dataset | SIPHER-7 Wellbeing Domain Preferences (Survey Dataset) | Aversion to Inequality (Survey Dataset) | HWMIC (Health and Wellbeing Multi-Instrument Comparison) Dataset
Status | Ready | Ready | Ready | In Progress/Ready Soon | In Progress/Ready Soon | In Progress/Ready Soon
Purpose A quality-controlled, publicly available data source containing attribute-rich data at the individual level - with the aim to create a digital twin for every adult in the population with a large amount of associated information about each person. A variety of population health indicators for small geographical units (local authorities and LSOAs/MSOAs) for use in statistical analyses and monitoring of area-level health inequalities. This dataset was designed to provide a meaningful operationalisation of the underlying concept of the inclusive economy, and to enable statistical models to further explore the concept. To represent a multi-dimensional measure of wellbeing, consisting of seven indicators, in terms of a single index metric, equivalent income. To elicit public preferences regarding trade-offs between improving wellbeing and reducing inequality. A dataset with a battery of self-reported health and wellbeing indicators from a large UK sample, oversampling from Scotland.
Context Individual level data enable us to understand individuals’ situations, what happens to them over time or when affected by changes due to external events or policies. The lack of a comprehensive register-based system in Great Britain has made it challenging to access data on individuals across multiple domains. The SIPHER Synthetic Population helps bridge this gap by providing a representative, attribute-rich dataset reflecting the whole of the adult population in Great Britain. By randomly selecting individuals from a survey and assigning them to small geographical areas based on census statistics, the SIPHER Synthetic Population ensures that the distribution of demographic characteristics for all sampled individuals corresponds exactly to the true demographic structure within each small census output area. This enables researchers to derive area-level profiles which would otherwise not be available. In more complex applications, the dataset can be used to simulate policy interventions and explore their potential impact on individuals and households at a granular resolution, distinguishing small geographical areas and even population subgroups within these areas. Modelling the impact of public policy on health requires a shared understanding of how we conceptualise and measure health as an outcome. We need a set of health indicators that are meaningful in the context of understanding the effects of policies and interventions of interest to SIPHER, such as those aiming to create an inclusive economy or improve mental health. These indicators can be derived either from synthetic data (e.g., SIPHER Synthetic Population) or from non-synthetic data sources (e.g., ONS/NRS data). SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. There are multiple approaches and definitions of what constitutes an inclusive economy. To date, there is no single definition of the concept.
In response, SIPHER has developed a collection of indicators for researchers and policymakers which describes the extent and nature of economic inclusion across local authorities in Great Britain. The creation of the data set has been informed by an initial review of the underlying theoretical concepts. The selection and estimation of all indicators benefited from co-production between SIPHER researchers and policy partners. SIPHER’s WS6 team has developed a wellbeing indicator set comprising seven indicators - SIPHER-7. While SIPHER-7 describes people’s wellbeing across these seven indicators, when some indicators improve and others worsen, it is difficult to judge whether overall wellbeing is improving or worsening. The purpose of this part of the project is to collapse the multi-dimensional wellbeing indicators into a single index metric for wellbeing, equivalent income. To do this, four surveys using Discrete Choice Experiments (DCE) were conducted with a sample of the UK public. Participants were asked to review a set of ten choice tasks, each involving two imaginary scenarios described in terms of SIPHER-7, and select which scenario they believed was better. In three of the surveys, participants were asked to complete the tasks from a personal perspective (i.e., which scenario they would want for themselves), and in the remaining survey, participants were asked to complete the task from a social perspective (i.e., which scenario they think would be better for policy makers to bring about for others). The econometrically estimated parameters represent the relative values given to the seven wellbeing indicators of SIPHER-7 by samples of the UK general public. Public policies aim to improve wellbeing and reduce wellbeing inequality, but it is not always possible to do both. How do the public balance the trade-off between improving wellbeing and reducing inequality? 
The relative importance people place on increasing averages and reducing inequalities (or “inequality aversion”) was elicited from a sample of the UK general public (n=53). Respondents participated in one of eleven online discussion groups, where a series of quantitative trade-off exercises were explained and discussed. Each respondent then completed the same exercise individually. The exercises covered aversion to inequality in: (a) an overall measure of wellbeing (equivalent income); (b) lifetime health across otherwise equal individuals; and (c) lifetime health across the rich and poor. Different surveys use different health outcome indicators. Therefore, data might be available for one indicator set when another is required. For example, answers to SF-12 survey items are available but a WEMWBS value is required. This is a large cross-sectional online survey of the general public (n=12,401) where respondents are asked to self-report their health and wellbeing across a battery of questions. This dataset allows the estimation of a statistical mapping algorithm between the different indicator sets.
Strengths The SIPHER Synthetic Population is representative of the demographic characteristics of the respective area - down to a low geographical resolution. The strength of the SIPHER Synthetic Population is that it provides a wide range of information at the level of individuals. This information can be aggregated into groupings of interest (e.g. sex, income groups) and particular geographical units of interest (LSOA/DZ; MSOA; Local Authorities etc.). The method used to develop the dataset is referred to as spatial microsimulation. We often use the SIPHER Synthetic Population in conjunction with other models we have developed. This enables us to determine whether an intervention has benefitted a population group of interest. Small-area health indicators can be used to monitor area-level health inequalities or as inputs in statistical models. In addition, all health outcome measures can be attached to the Synthetic Population representing area-level health indicators. SIPHER reviewed the available measures and conducted a consensus process with SIPHER colleagues to agree on a final set of indicators. The criteria used were: 1. Interpretability -accessible & meaningful to decision makers, 2. Sensitivity to policy – the indicator can plausibly show the effects of policy. 3. Indicator can show impacts of pandemic on health. 4. Timeliness – refers to the current health state. 5. Availability of timeseries data
6. Changes in mental AND physical health can be separately studied. 7. Regular updates into the future are expected, 8. Comparability – between areas, ideally comparable between England & Scotland, 9. High resolution – available for small areas with LA as a minimum, 10. Disaggregate – available by subgroups (e.g. broken down by age, sex etc).
The data set has been subject to a thorough geographical harmonisation and review process. In addition, the dataset contains a number of supplementary health and demographic indicators for all local authorities. The major strength of this data set is the wide range of potential applications; from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing. The data set is available as an open access resource via the Open Science Framework: https://osf.io/vnsur/ The DCE data on relative preferences allow the calculation of equivalent income - a quantitative preference-based single metric of wellbeing - for any combination of SIPHER-7 indicators. The samples are large (ranging from 1000 to 3000, totalling just under 11,000) and representative of the UK general public in terms of age and sex. Public policies aim to improve wellbeing and to reduce wellbeing inequality. When there is a conflict between these, policy makers need to make difficult decisions. The quantitative data on inequality aversion is derived from discussion groups, where participants had the opportunity to examine the trade-off exercise in detail. The results help inform policy makers on the trade-offs between the two policy aims that members of the public would support. Different surveys have different health and wellbeing indicators, and this dataset allows the estimation of a statistical mapping algorithm between them. This would allow predicting SIPHER-7 information where the relevant variables are not available.
Limitations The accuracy of the SIPHER Synthetic Population depends on the quality and availability of the underlying data. Some variables may have poor completion rates in the underlying survey, resulting in missing data after linkage. Despite the high number of participants in the Understanding Society survey, explicit spatial constraints cannot be applied when creating the dataset. This means that an individual who was interviewed as part of the survey and who is actually residing in place X can be assigned to a variety of places A, B, and C, as long as they match the demographic constraints such as age, sex, marital status etc. Although recent updates of the code have led to more constraints on how to perform this selection process, it is important to remember that the creation of the SIPHER Synthetic Population is based on associations and descriptive statistics. It can only ever serve as an approximation of the true population in Scotland, England and Wales - which is likely to be much more heterogeneous and diverse than the population captured in the synthetic data source. Therefore, all results obtained from the SIPHER Synthetic Population should always be interpreted carefully as model output, and not as equivalent to a population-based register. The data set cannot resolve situations where no data is available at all or where sampling in surveys is not representative of small geographical units. For a few of the indicators, exact definitions differ between countries. For example, there are different definitions of fuel poverty in use in Scotland and England. In these cases, national deciles were created and comparable alternative indicators were identified. For example, food insecurity was used as an alternative cost-of-living indicator. Currently not available. Currently not available. Currently not available.
Geography Individuals in the SIPHER Synthetic Population have a geography assigned to them (a synthetic DZ/LSOA). This allows all levels of geography upwards from DZ/LSOA Level for Scotland, England and Wales - excluding Northern Ireland - to be analysed and modelled. The exact geographical resolution is indicator-dependent. Typically, the following resolutions are available for Mortality: DZ/LSOA Level for Scotland, England and Wales and LA Level Longitudinal (2017-2021) and geographically harmonised data is available at the level of local authorities in England, Scotland, and Wales. The data set covers all 363 local authorities in Great Britain, reflecting their 2021 boundaries according to ONS definition. The surveys collected data from participants resident in the UK with sampling quotas for age and for sex. UK with sampling quotas for age and for sex. The survey collected data from participants resident in the UK with sampling quotas for age and for sex. Oversamples Scotland.
Variables / Indicators A large variety of variables can be included. This includes all variables included in the Understanding Society survey - the underlying survey data source. It is also possible to estimate other derived variables from this data source, for example ‘Equivalent Income’, using the ‘Equivalent Income Calculator’ method. The data set includes measures of mortality, physical, and mental health, and composite measures combining mortality and health. It is open to data updates, and additional health indicators can be estimated and incorporated if required. Details on all indicators are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources. In addition to the DCE choice data, the surveys include participant self-reported data on: SIPHER-7; household size; age; gender; etc. Surveys (1) and (2) use the original SIPHER-7. Surveys (3) and (4) use the revised version of SIPHER-7. In addition to the inequality aversion task, the survey includes participant self-reported data on: SIPHER-7; household size; age; gender; etc. The indicator sets and questions included in the survey: SIPHER-7; ICECAP-A; EQ-5D-5L; SF-12 v2; HUI; WEMWBS; EQ-HWB; ONS-4; Understanding Society items on crime and housing; items from the Labour Force Survey, the Living Wage Foundation questionnaire; education, income, ethnicity, children, informal caregiving; gender, age; etc. Includes sampling weights to correct for age and sex with respect to the mid-year UK population estimate.
Time Period The latest release reflects the years 2019-2021. Results from the UK census 2011 are used as constraints for the spatial microsimulation - the process generating the Synthetic Population. Preliminary updated versions for England and Wales, based on the UK census 2021, are available. However, Scotland has not yet published all required input data from its most recent census. DZ/LSOA/MSOA Level: typically cross-sectional, representing the period covered by the synthetic population. Local Authority level: typically longitudinal for 2004-2020 when based on non-synthetic data. Data will be updated as new data becomes available. Longitudinal data are available for every year between 2017 and 2021. There are four datasets: (1) people’s personal preferences in autumn 2020; (2) people’s personal preferences in autumn 2021; (3) people’s personal preferences in spring 2022; (4) people’s social preferences in spring 2022. Dataset (2) includes returning respondents from (1). Otherwise, the observations are independent. Data collected: summer - autumn 2022. Data collected: late 2022.
Missing Data The level of missing information for a particular variable is determined by the levels of missingness in the underlying Understanding Society survey. Level of missing data determined by data availability. Older data not always comparable across time or form for some indicators. Missing data were imputed using a sophisticated multiple imputation algorithm. In some cases, only cross-sectional measurements were available, which were carried forward or backward. For example, local elections did not take place every single year. Currently not available. Currently not available. Currently not available.
Examples / Link with Other Models and Data The Synthetic Population is used as the underlying data source in several SIPHER models. These include: (1) dynamic systems model, (2) static and dynamic microsimulation and (3) decision support tool. Information covered in the Synthetic Population can be extended by adding additional variables from other data sources. These could be datasets that are not publicly available. In addition, the SIPHER Synthetic Population can be used to derive more complex concepts such as the ‘Equivalent Income’ - a variable which is calculated using the ‘Equivalent Income Calculator’ method. A portfolio of area-level summary indicators on mortality, health, and composite indicators that combine information on mortality and health. These indicators can be attached as area-level indicators to the Synthetic Population. In addition, health measures are used in the Local Authority clustering work, as well as in the Dynamic Systems model. The data set is currently used in a k-means clustering machine learning study. The primary aim of this study is to identify clusters of similar local authorities and to examine the association of each cluster with a number of health outcomes. In another application, we explore the association between Quality-Adjusted Life Expectancy (QALE) and indicators of economic inclusion. The estimated parameters can be used to calculate an equivalent income variable in the Synthetic Population. The estimated inequality aversion parameter is used to identify the optimal trade-off between maximising wellbeing and reducing inequality in the decision support tools. Currently not available.
Software Requirements Requires software that can handle the size of the data file, such as R or Python. An interactive R Shiny dashboard allows a code-free exploration of an aggregated version: https://sipherdashboard.sphsu.gla.ac.uk/ Requires software that can handle the size of the data file, such as R or Python. Requires software that loads data, such as Excel, R, or Python. Access SIPHER Inclusive Economy Dataset Interactive Map - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ieinteractivemap/#d.en.1054750 The main choice data and respondent background variables are saved in Stata format and require software that can read Stata files. The main trade-off data and respondent background variables are saved in Stata format and require software that can read Stata files. Currently saved in Stata format and requires software that can read Stata files.
Data Requirements / Restrictions The SIPHER Synthetic Population is available for full independent use via the UK Data Service’s Curated Data Collection. To set up the SIPHER Synthetic Population, the synthetic population file (UK Data Service ID: SN9277) must be linked with Understanding Society survey data (UK Data Service ID: SN6614) - as is typically done for area-level linkages of surveys. Both datasets are subject to the General End-User License Agreement terms and conditions, and can be downloaded free of charge directly from the website of the UK Data Service. For key indicators such as QALE, Life Expectancy, and Lifespan Variation it is planned that a final version of the dataset and the underlying code will be made publicly available. In order to fully reproduce health measures requiring the Synthetic Population, access to the Synthetic Population is required. The final data set is available as an open access resource. Currently not available. Currently not available. Currently not available.
Data / Code Available Due to the underlying license agreement, the dataset cannot be shared as an open access version. However, the dataset can be downloaded through the UK Data Service website, after acceptance of the General End-User license terms and conditions: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details In addition, we have made a wealth of supplementary material available, documenting creation, validation, linkage, and exploration of the dataset: https://reshare.ukdataservice.ac.uk/856754/ Work in progress, final dataset will be made publicly available. Pipeline of code for estimation of Quality-Adjusted Life Expectancy (QALE) is available. The final dataset and additional documentation are publicly available via the Open Science Framework: https://osf.io/vnsur/ Currently not available. Currently not available. Currently not available. The dataset will be archived. There is no associated code.
Training We have provided a comprehensive, open access User Guide for our SIPHER Synthetic Population. The User Guide provides background information and explains how to set up the data and analyse it swiftly: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf Online pipeline example via GitHub. The data is accompanied by a comprehensive data dictionary which provides context to all included variables. Currently not available. Currently not available. Currently not available.
Additional Resources SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021 (UK Data Service Curated Collection, SN9277): https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details Comprehensive User Guide: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf Supplementary Resources: https://reshare.ukdataservice.ac.uk/856754/ Paper describing the statistical creation process: https://www.nature.com/articles/s41597-022-01124-9 Understanding Society Survey Blog: https://www.understandingsociety.ac.uk/news/2024/07/10/building-synthetic-population-data/ Introduction Video: https://www.youtube.com/watch?v=CkiORY7GSLc Choosing the SIPHER health Indicators Report: https://www.gla.ac.uk/media/Media_970682_smxx.pdf and QALE exemplar: https://github.com/AndreasxHoehn/QALE_Exemplar Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/ Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ SIPHER Inclusive Economy Indicator Set: Technical paper [PDF] - https://www.gla.ac.uk/media/Media_970680_smxx.pdf SIPHER Inclusive Economy Indicator Set: Summary [PDF] - https://www.gla.ac.uk/media/Media_1029792_smxx.pdf Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Measuring the Inclusive Economy Blog - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1049629_en.html Explore - https://www.gla.ac.uk/research/az/sipher/products/sipher-7wellbeingindicators/ Blog: Collapsing multi-dimensional wellbeing into equivalent income - March 2022 https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1019908_en.html Currently not available. Currently not available.
Contact
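
The set-up step described under Data Requirements (linking the synthetic population file to the survey records) and the aggregation described under Strengths can be sketched as follows. This is a minimal illustration with in-memory records: the field names (`person_id`, `area_code`, `wellbeing_score`) and all values are hypothetical placeholders, not the variable names used in SN9277 or SN6614, and the User Guide remains the authoritative description of the linkage.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-ins for the two files; names and values are placeholders.
synthetic_population = [
    {"person_id": 1, "area_code": "A"},
    {"person_id": 2, "area_code": "A"},
    {"person_id": 1, "area_code": "B"},  # one respondent can be assigned to several areas
    {"person_id": 3, "area_code": "B"},
]
survey = {
    1: {"sex": "F", "wellbeing_score": 50.0},
    2: {"sex": "M", "wellbeing_score": 40.0},
    3: {"sex": "F", "wellbeing_score": 30.0},
}


def link(synthetic_rows, survey_by_id):
    """Attach survey attributes to each synthetic individual via the shared id."""
    return [
        {**row, **survey_by_id[row["person_id"]]}
        for row in synthetic_rows
        if row["person_id"] in survey_by_id
    ]


def group_means(linked_rows, variable, group_keys=("area_code",)):
    """Average a variable within each group, skipping missing values."""
    groups = defaultdict(list)
    for row in linked_rows:
        value = row.get(variable)
        if value is not None:
            groups[tuple(row[key] for key in group_keys)].append(value)
    return {key: mean(values) for key, values in groups.items()}


linked = link(synthetic_population, survey)
by_area = group_means(linked, "wellbeing_score")
by_area_and_sex = group_means(linked, "wellbeing_score", ("area_code", "sex"))
print(by_area)
```

Passing more keys to `group_keys` gives the subgroup-within-area profiles mentioned above; for the full file, a data-frame library in R or Python would be the practical choice.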


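
The trade-off between raising average wellbeing and reducing inequality, described in the Aversion to Inequality column, can be made concrete with an inequality aversion parameter. The sketch below uses the Atkinson "equally distributed equivalent" as one common formalisation; this functional form is an assumption made for illustration, since the text above does not state which form SIPHER uses, and the two policy outcomes are hypothetical numbers.

```python
import math


def equally_distributed_equivalent(values, epsilon):
    """Atkinson equally distributed equivalent (EDE) of positive values.

    epsilon = 0 gives the plain mean (no inequality aversion); larger
    epsilon puts more weight on the worst-off.
    """
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    n = len(values)
    if math.isclose(epsilon, 1.0):
        # Limiting case: the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    power = 1.0 - epsilon
    return (sum(v ** power for v in values) / n) ** (1.0 / power)


# Hypothetical wellbeing outcomes (e.g. equivalent income) under two policies.
policy_a = [10.0, 40.0]  # higher average, more unequal
policy_b = [20.0, 24.0]  # lower average, more equal

for epsilon in (0.0, 1.0, 2.0):
    ede_a = equally_distributed_equivalent(policy_a, epsilon)
    ede_b = equally_distributed_equivalent(policy_b, epsilon)
    preferred = "A" if ede_a > ede_b else "B"
    print(f"epsilon={epsilon}: EDE(A)={ede_a:.2f}, EDE(B)={ede_b:.2f}, preferred={preferred}")
```

With no inequality aversion the higher-average policy ranks first; as the parameter grows, the more equal policy overtakes it.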

Compare All Quantitative Products

Characteristic | Dynamic Systems Model | Static Microsimulation | Dynamic Microsimulation - MINOS | Decision Support Tool | K-Means Clustering | Small-Area Indicator Estimation
Status | In progress/ready soon | Ready | In progress/ready soon | In progress/ready soon | Ready | Ready
Main Perspective | Population Level (Macro) | Individual Level (Micro) | Individual Level (Micro) | From Individual Level (Micro) to Population Level (Macro) | Population Level (Macro) | Population Level (Macro)
Purpose This state-space dynamic system model provides a simulation of how each variable contained in the systems map will be affected over time, given specific changes to one or more variables. All studied variables (unemployment, poverty, health, etc.) have to be represented by the input data. The model provides results at the local authority level and allows us to compare system-level effects of different (or no) policy interventions over time. This static microsimulation, using a digital twin of the UK population as a data source, provides a granular picture of the impact of policy interventions. This model enables us to examine changes relatively quickly and with a relatively low amount of computational resources. It achieves this by simplifying the relationships and interconnections of an individual’s attributes. This Microsimulation For Interrogation Of Social And Health Systems (MINOS) dynamic microsimulation, using longitudinal survey data such as the SIPHER Synthetic Population, provides a very granular picture of the impact of policy interventions on different population groups. This model uses individual-level data and simulates the transitions of individuals across different states (such as health states) over time, based on a specific set of models describing these transitions. The decision support tool is not a model in itself. Rather, it uses the available SIPHER models to provide decision support to policy analysts. K-means clustering is a data-driven approach that allows users to identify clusters of local authorities based on their performance with respect to the utilised inclusive economy data collection. This enables the identification of more or less inclusive clusters. In addition, the association between these clusters and a number of selected local authority level health outcomes is examined. The estimation of area-level indicators for small geographical units such as Local Authorities, MSOAs, or LSOAs is challenging.
For example, fluctuations in the number of deaths can introduce imprecision and fluctuations when estimating life expectancy. Typically, these challenges increase as the size of the geographical unit decreases. Therefore, we employ a suite of specific small-area estimation methods to address these challenges. This suite of methods can then be applied to both non-synthetic and synthetic sources of data, such as the synthetic population, to obtain area-level estimates for the dimensions captured in the Understanding Society main stage survey.
Strengths The model captures an entire system, including feedback loops to allow for the modelling of dynamic behaviour. In addition, the model allows the testing of policy changes ex-ante - rather than retrospectively. The model can capture both increases and decreases (such as increases or decreases in funding to supplement disposable household income). A particular strength of the model is that it enables the examination of immediate outcomes at the level of individuals or households based on a policy change. Aggregating the outcomes allows a user to derive changes on the level of small geographies such as MSOA/LSOA, DZ, and local authorities. Models can provide immediate information on how many people will be affected, where those people live, and what their basic demographic characteristics are. Aggregation allows us to identify potential changes for specific geographical areas of interest. Designated longitudinal approach for the individual level, while outcomes can also be aggregated to reflect changes for population subgroups and geographical areas. Can search over many thousands of different intervention options (e.g. local communities, socio-demographic sub-groups, levels of intervention) to reveal trade-offs between outcomes. A summarising cluster solution clearly reduces complexity and leads to intuitive results. Outcomes have a straightforward meaning. Another strength of this approach lies in its ability to be updated and transferred to other sets of indicators or used over time. The suite of models aims to account for fluctuations and increase reliability of small-area estimates. This enables us to obtain reliable estimations given potentially unreliable data situations. The use of synthetic data can help to navigate situations in which no non-synthetic data would be available at all.
Limitations Any change to be modelled must be quantifiable by the model. This means that changes in variables which are not explicitly covered or for which there is no dependency will not become visible in the model. This implies that results are sensitive to pre-defined pathways which were specified in the systems map. Another limitation is posed by the assumption of known causal pathways between domains. This can be problematic in some cases and requires careful consideration and good justification. Furthermore, assumptions on the time frame for causal relationships need strong justification and supporting information, which might not always be available. Finally, all modelled policy interventions need to be attributable to the LA level. Limitations of the synthetic population apply. Interventions can be applied to specific variables, and outcomes applied to specific health variables. The decision support tool is dependent on SIPHER models and therefore subject to the limitations of these underlying models. Synthetic Population, Dynamic Systems Model and Dynamic Microsimulation can all be integrated but their limitations will then apply to the resulting decision support tool. It is important to note that the decision support tool is not intended to be used as a decision-making tool. Rather, the tool will provide a range of possible answers reflecting the trade-offs associated with potential decisions. The tool does not make any decisions - this responsibility rests with the user. In some cases, the achieved reduction in complexity might not be desired. It is a limitation that complete observations are required, which often adds another preparatory step to the process (imputation of missing data). As a data-driven algorithm there are only limited options to intervene, for example with respect to the number of optimal clusters. Despite their advantages in dealing with small numbers, these methods cannot resolve situations where no data are available at all.
The interpretation of results obtained from synthetic data needs care - for example, when interpreting very specific attributes for a very distinct geographical region.
Geography Local Authority level for Scotland/England/Wales. LSOA/MSOA/DZ, and local authority level for Scotland, England, and Wales. DZ/LSOA Level for Scotland, England, and Wales. Adopts the same geographical perspective as the SIPHER models that have been integrated - typically it is matched to the needs of the policy partner (so we have created Sheffield, Greater Manchester, Scotland (and Scottish LA) versions of the tool). The clustering is currently based on all local authorities in Scotland, England, and Wales. A previous application covered the LSOA level for selected English Local Authorities. The most common geographical level is the Local Authority level for England, Scotland, and Wales. In addition, estimates can be derived for the MSOA Level in England and Wales. Deriving estimates for the Intermediate Zone Level in Scotland is currently in progress. Due to the use of synthetic data, even smaller geographical resolutions can be achieved for some indicators.
Time Period Based on available and imputed data for previous years (currently 2004-2021). The model provides a dynamic annual forecast for a specified period, for example 5 years, for each variable in the model. Corresponding to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021). The ‘jump off’ point for the scenarios is the latest period in the underlying Understanding Society input data (currently wave k (2019-2021)). The ‘time horizon’ for the scenario is set at 2037. Adopts the same time period as the SIPHER models that have been integrated. Corresponds to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021) up to 2025/2026. The current approach is cross-sectional, covering the last available year (2020/2021). As data on inclusive economies is available for a much longer period, it is planned to study the stability of clusters over time. Estimates are available for 2004/2014 to 2020/2021 - dependent on indicator and underlying data sources. Data updates and suggestions of new indicators can be incorporated easily.
Adjustments / Extensions Factors which can be modified include: the underlying systems map (representing domains and their interactions), features of each respective intervention (including the amount of uplift or characteristics of recipients). In addition, the method can be used to capture different systems (environment, housing etc.). All information describing individuals in all or only particular areas can be seen as potentially modifiable. For example, income, employment status, health etc. These interventions are typically informed by previous research and are often referred to as “the morning after” scenarios - situations in which a change to one or more individual-level factors has occurred instantaneously. Features of each respective intervention, including the amount of uplift or characteristics of recipients receiving the uplift. Potential adjustments include characteristics of the underlying models as well as features and the geographical granularity of the reported outcomes. Adjustments to the current model include the number of clusters, a designated focus on one or more UK Nations (Scotland, England or Wales) in isolation as well as the respective Inclusive Economy indicators and health outcomes considered. Data updates can be incorporated easily. Ideas for additional indicators are welcome and can be estimated given that suitable data is available in a synthetic or non-synthetic source.
Data Requirements Aggregate level inputs for units of the studied geographical level (e.g. unemployment rate for the LA). Sufficient longitudinal data is required for all variables to validate the model. Cross-sectional data can supplement the longitudinal data for model determination. Domain-specific definitions need to be similar across all geographical units. Please note that different indicators have been selected for England and Wales and Scotland due to data availability. Synthetic Population (see Product Guide details) Understanding Society (waves a-k). If spatial results are required, the latest version of the Synthetic Population (see data for details). The decision support tool requires results from other SIPHER models. In addition, information on the intervention as well as cost-effectiveness assumptions are required. Aggregate-level information for geographical areas on a selected set of indicators. Indicators can come from various different sources, but each indicator must have been measured consistently across observation units. For k-means to work properly, the level of missing information should be 0%. In case any information is missing, imputation methods can be utilised to achieve this requirement. This is indicator dependent. For some indicators, all required data is free and publicly available via ONS/NRS vital statistics data on population, deaths, and health outcomes. In particular for those indicators combining mortality and health information (e.g., QALE) access to the General and Special License of Understanding Society is required - depending on the level of geography required. If the underlying data source is synthetic data, such as the synthetic population, requirements of this source apply.
Applications Typical applications include examining a system's behaviour as a result of policy interventions, such as interventions addressing poverty, living wage, participation in employment, skills and qualification. In addition, this set of models can help to answer questions about the potential impact of direct policy responses to the current cost-of-living crisis. It is possible to forecast the impact of an intervention for a specific local authority. Number and characteristics of people affected by a financial uplift policy or labour market intervention as well as total costs of this policy for a particular geographical area. Shocks and policy interventions which can be expressed as changes at the individual level. For example: changes to disposable income. Transition models need to be constructed for new problems. Applications include local community interventions on components of wellbeing; spatial targeting of job creation schemes; impact of targeted employment stimuli on health outcomes. The method is currently used to cluster local authorities based on inclusive economy indicators. It can be expanded to other indicator sets and domains as well as other outcome measures (environmental indicators). Estimated measures include measures of mortality such as life expectancy and lifespan variation, measures of health such as the SF-12 instrument capturing physical and mental health, and composite measures combining health and mortality. Measures at the household level related to cost-of-living are also available and can be obtained from synthetic sources.
Modelling Assumptions Models depend on a pre-defined systems map that describes how domains impact each other and which domains can be subject to interventions. These systems maps need to specify causal pathways between domains with pre-defined time lags. Models also depend on data to provide evidence for quantifying relationships. Assumptions of the Synthetic Population apply. The model relies on the assumption that transitions between states over time - representing the characteristics of an individual - can be modelled using a set of specified and measured characteristics of this individual. In addition, the Markov assumption needs to hold, meaning that the time spent in a particular state (e.g. unemployed) does not have an impact on the probability of transitioning into other states (e.g. employed). Inherits the assumptions of the SIPHER models that have been integrated. In addition, assumptions on the costs and effectiveness of interventions are required. Clusters are identified based on the similarity of observed units with respect to a number of defined domains. The major assumption is that small population sizes require specific methods to account for random fluctuations due to small numbers. Many measures, such as mortality rates, follow a very distinct pattern over age (standard trajectory), which requires knowledge of this approximate standard trajectory. When synthetic data is used, assumptions of the synthetic population apply.
User Options Which variable to change and by how much, corresponding to the policy intervention (or shock/absence of intervention) which is evaluated. All changes can be applied differentially to local authorities. Character, target group, and magnitude of particular interventions. In addition, the user can choose the geography level and select specific geographical regions of interest. Character, target group, and magnitude of particular interventions. In addition, users can assess the impacts for LSOAs/DZs within a given area. Geographical and temporal focus. Intervention configuration options. The primary option for adjustment is the number of clusters. The most common options are the measure itself, the geographical resolution, and year.
User Type(s) Modellers, decision makers Modellers, decision makers, descriptive overview to inform statistical modelling Modellers, decision makers Modellers, decision makers Provides descriptive overview to inform decision making and modelling Outcomes are used as inputs in other models, for monitoring purposes, and can inform decision making.
Examples / Link with Other Models and Data Models of dynamic systems can inform individual-level approaches and help to validate results which were obtained in individual-level approaches. This also works in the opposite direction: changes at the individual level can be aggregated and expressed at LA level. This model requires SIPHER’s Synthetic Population. This model uses SIPHER’s Synthetic Population. The decision support tool uses the synthetic population, the systems dynamic model, the static and dynamic microsimulations, and the equivalent income utility function. Can inform the interpretation of WS4 models. In turn, can inform WS4 model input. Some of the derived health measures are used as input data in WS4 models, as outcomes for the association of clusters with health outcomes. In addition, some derived health measures can be attached to the synthetic population to represent area-level features as they cannot be derived directly from the synthetic population.
Software Requirement(s) Matlab. R or Python Python Python R R
Options for Extension Building different models for different systems. Modelling and quantifying uncertainty. All results can be combined with cost information where available to conduct cost-benefit analyses. Building different models for different interventions. Factors impacting transitions can be adjusted based on different contexts and assumptions. Alternative policy/intervention configurations. Other domains for which indicator sets exist or can be created (crime, transport, environment etc.). A k-means clustering approach can be applied to individual-level life course trajectories. Extension to a variety of small-area indicators is possible, such as age trajectories of fertility rates, employment rates, emergency admissions etc. In addition, different synthetic data sources can be utilised to create synthetic populations.
Additional Resources Explore: https://www.gla.ac.uk/research/az/sipher/development/dynamicsystemsmodel/ Paper describing applied static microsimulation to create the Synthetic Population: https://www.nature.com/articles/s41597-022-01124-9 Explore: https://www.gla.ac.uk/research/az/sipher/products/minos/ For documentation visit: https://leeds-mrg.github.io/Minos/ and for code and more detailed user instructions visit: https://github.com/Leeds-MRG/Minos Explore: https://www.gla.ac.uk/research/az/sipher/products/decisionsupporttool/ Software: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_interventions Database: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_database Preliminary results available upon request Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ An exemplary pipeline, estimating a range of health measures: https://github.com/AndreasxHoehn/QALE_Exemplar Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/
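The Markov assumption described under Modelling Assumptions can be illustrated with a minimal sketch. The two states and the annual transition probabilities below are hypothetical values invented for illustration; they are not SIPHER estimates and the code is not taken from any SIPHER model. The point is only that the probability of moving between states depends on the current state, not on how long an individual has been in it.

```python
# Hypothetical two-state example; all probabilities are invented for illustration.
STATES = ["employed", "unemployed"]

# TRANSITIONS[current][nxt] = probability of moving from `current` to `nxt` in one year.
# Under the Markov assumption these do not depend on time already spent in `current`.
TRANSITIONS = {
    "employed": {"employed": 0.9, "unemployed": 0.1},
    "unemployed": {"employed": 0.5, "unemployed": 0.5},
}


def step(distribution):
    """Advance the population shares across states by one annual step."""
    return {
        nxt: sum(distribution[cur] * TRANSITIONS[cur][nxt] for cur in STATES)
        for nxt in STATES
    }


def project(distribution, years):
    """Apply `step` repeatedly to obtain the shares after `years` steps."""
    for _ in range(years):
        distribution = step(distribution)
    return distribution


start = {"employed": 0.8, "unemployed": 0.2}  # assumed starting shares
after_one = project(start, 1)
after_five = project(start, 5)
print(after_one)
print(after_five)
```

In an individual-level model the transition probabilities would typically also depend on measured characteristics of each person; the sketch keeps them fixed to isolate the Markov property.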
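The requirement that k-means needs complete observations, with imputation as a preparatory step, can be sketched as follows. Everything here is an assumption made for illustration: the area names and indicator values are invented, column-mean imputation is only one of many possible imputation methods, and the deterministic farthest-point initialisation is a simplification chosen so the example is reproducible. It is not the SIPHER clustering pipeline.

```python
def impute_column_means(rows):
    """Replace None with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two indicator vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(rows, k):
    """Deterministic start: first row, then the row farthest from chosen centroids."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd's algorithm: assign to nearest centroid, then update centroids."""
    centroids = farthest_point_init(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical areas with two indicators already on a comparable 0-1 scale.
areas = ["A", "B", "C", "D", "E", "F"]
indicators = [
    [0.10, 0.20],
    [0.15, None],  # missing value that must be imputed before clustering
    [0.12, 0.25],
    [0.90, 0.80],
    [0.85, 0.90],
    [0.95, 0.85],
]

imputed = impute_column_means(indicators)
labels, centroids = kmeans(imputed, k=2)
print(dict(zip(areas, labels)))
```

The number of clusters `k` is the main user-facing choice here, which mirrors the User Options row; in practice indicators on different scales would also need standardising before distances are computed.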