SIPHER Products

SIPHER Product Guide

Category headings offer a dropdown menu to allow navigation and selection of products, including the option to compare all products within a category.

Each product page details the product's characteristics, including its strengths, limitations, and current status. There are two status types:

Ready: Complete and available.

In progress/ready soon: Nearing completion; available shortly.

Please direct all enquiries and feedback to the SIPHER Consortium via email: sipher@glasgow.ac.uk.

This dashboard was built using the flexdashboard package for R. We have shared the underlying code of the first release via GitHub.

Please use the links provided to return to the SIPHER homepage.

Last updated: 06-December-2024

Employment and Health Evidence and Gap Map

The Employment and Health Evidence and Gap Map is a systematic visualisation of research findings surrounding the impact employment has on health.

Employment and Health Evidence and Gap Map
Characteristic	Details
Status	Ready
Purpose	A visual and interactive resource to locate published systematic reviews on the topic of health and work.
Strengths	The primary strength lies in the simplification of complex and diverse research findings. The interactive map contains studies that have explored the relationship between an employment feature and a health and social outcome. The map only contains systematic reviews.
Limitations	Does not provide any analysis on the studies identified. Users may not have access to all academic papers that are captured in the evidence and gap map as some of the covered material is not open access.
Variables	A range of work-related characteristics (including contract conditions, employer attributes and working environment) and health-related measures (including physical health outcomes and psychological health).
Examples / Link with Other Models and Data	Informs model building and interpretation of quantitative findings other workstreams have obtained.
Additional Resources	Explore further with link to the interactive tool: https://www.gla.ac.uk/research/az/sipher/products/employmentandhealthegm/
Contact	sipher@glasgow.ac.uk

Return to SIPHER homepage

Causal Systems Mapping

Snapshot from a causal loop diagram specifying the pathways by which quality-adjusted life expectancy (QALE) is influenced.

Causal Systems Mapping
Characteristic	Details
Status	Ready
Purpose	Concise visual representations of SIPHER’s policy areas of interest, including inclusive economies, public mental health and housing, which capture the causal connections between parts of a system.
Strengths	There are a range of causal system mapping approaches, with different strengths and limitations, and the choice of which systems mapping approach to use is determined by the problem. Different systems mapping approaches have been used in SIPHER, including participatory systems mapping and causal loop diagrams. Generally, the strengths of systems mapping are bringing together information from different sources, including documents and stakeholders’ tacit knowledge and presenting it visually, which better reflects the underlying complexity. The maps can bring together a range of perspectives on a topic and be used: to analyse the structure of the system; as tools for thinking and discussion; or developed into quantitative models to test scenarios.
Limitations	Complex and comprehensive causal systems maps can be overwhelming and may not be easily useable in policy settings or for computational modelling. In contrast, simplified systems maps may appear more useable but may not capture all relevant variables. Systems maps developed in workshop settings are typically driven by the participants and their understanding of the system, therefore the maps developed reflect participants’ knowledge and experience.
Variables	Multiple systems maps have been developed for different policy areas of interest and different policy partners; variables are dependent on the maps.
Examples / Link with Other Models and Data	A causal loop diagram connecting the SIPHER Inclusive Economy indicators underlies the Inclusive Economy Dynamical Systems model.
Additional Resources	Explore SIPHER’s approach to Systems mapping: https://www.gla.ac.uk/research/az/sipher/systemsmapping/ Clackmannanshire Inclusive Economy system map: https://kumu.io/Sipher-Consortium/clacks-systems-map#clacks-ie-policy-map-final and more about our systems mapping work: https://www.gla.ac.uk/research/az/sipher/systemsmapping/
Contact	sipher@glasgow.ac.uk

Compare All Qualitative Products

Compare All Qualitative Products
Characteristic	Employment and Health Evidence and Gap Map	Causal Systems Mapping
Status	Ready	Ready
Purpose	A visual and interactive resource to locate published systematic reviews on the topic of health and work.	Concise visual representations of SIPHER’s policy areas of interest, including inclusive economies, public mental health and housing, which capture the causal connections between parts of a system.
Strengths	The primary strength lies in the simplification of complex and diverse research findings. The interactive map contains studies that have explored the relationship between an employment feature and a health and social outcome. The map only contains systematic reviews.	There are a range of causal system mapping approaches, with different strengths and limitations, and the choice of which systems mapping approach to use is determined by the problem. Different systems mapping approaches have been used in SIPHER, including participatory systems mapping and causal loop diagrams. Generally, the strengths of systems mapping are bringing together information from different sources, including documents and stakeholders’ tacit knowledge and presenting it visually, which better reflects the underlying complexity. The maps can bring together a range of perspectives on a topic and be used: to analyse the structure of the system; as tools for thinking and discussion; or developed into quantitative models to test scenarios.
Limitations	Does not provide any analysis on the studies identified. Users may not have access to all academic papers that are captured in the evidence and gap map as some of the covered material is not open access.	Complex and comprehensive causal systems maps can be overwhelming and may not be easily useable in policy settings or for computational modelling. In contrast, simplified systems maps may appear more useable but may not capture all relevant variables. Systems maps developed in workshop settings are typically driven by the participants and their understanding of the system, therefore the maps developed reflect participants’ knowledge and experience.
Variables	A range of work-related characteristics (including contract conditions, employer attributes and working environment) and health-related measures (including physical health outcomes and psychological health).	Multiple systems maps have been developed for different policy areas of interest and different policy partners; variables are dependent on the maps.
Examples / Link with Other Models and Data	Informs model building and interpretation of quantitative findings other workstreams have obtained.	A causal loop diagram connecting the SIPHER Inclusive Economy indicators underlies the Inclusive Economy Dynamical Systems model.
Additional Resources	Explore further with link to the interactive tool: https://www.gla.ac.uk/research/az/sipher/products/employmentandhealthegm/	Explore SIPHER’s approach to Systems mapping: https://www.gla.ac.uk/research/az/sipher/systemsmapping/ Clackmannanshire Inclusive Economy system map: https://kumu.io/Sipher-Consortium/clacks-systems-map#clacks-ie-policy-map-final and more about our systems mapping work: https://www.gla.ac.uk/research/az/sipher/systemsmapping/
Contact	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk

SIPHER Synthetic Population

This table provides an example of the SIPHER Synthetic Population data structure and the result after a linkage with Understanding Society survey data.

SIPHER Synthetic Population
Characteristic	Details
Status	Ready
Purpose	A quality-controlled, publicly available data source containing attribute-rich data at the individual level - with the aim to create a digital twin for every adult in the population with a large amount of associated information about each person.
Context	Individual-level data enable us to understand individuals’ situations and what happens to them over time or when they are affected by changes due to external events or policies. The lack of a comprehensive register-based system in Great Britain has made it challenging to access data on individuals across multiple domains. The SIPHER Synthetic Population helps bridge this gap by providing a representative, attribute-rich dataset reflecting the whole of the adult population in Great Britain. By randomly selecting individuals from a survey and assigning them to small geographical areas based on census statistics, the SIPHER Synthetic Population ensures that the distribution of demographic characteristics for all sampled individuals corresponds exactly to the true demographic structure within each small census output area. This enables researchers to derive area-level profiles which would otherwise not be available. In more complex applications, the dataset can be used to simulate policy interventions and explore their potential impact on individuals and households at a granular resolution, distinguishing small geographical areas and even population subgroups within these areas.
Strengths	The SIPHER Synthetic Population is representative of the demographic characteristics of the respective area - down to a low geographical resolution. The strength of the SIPHER Synthetic Population is that it provides a wide range of information at the level of individuals. This information can be aggregated into groupings of interest (e.g. sex, income groups) and particular geographical units of interest (LSOA/DZ; MSOA; Local Authorities etc.). The method used to develop the dataset is referred to as spatial microsimulation. We often use the SIPHER Synthetic Population in conjunction with other models we have developed. This enables us to determine whether an intervention has benefitted a population group of interest.
Limitations	The accuracy of the SIPHER Synthetic Population depends on the quality and availability of the underlying data. Some variables may have poor completion rates in the underlying survey, resulting in missing data after linkage. Despite the high number of participants in the Understanding Society survey, explicit spatial constraints cannot be applied when creating the dataset. This means that an individual who was interviewed as part of the survey and who is actually residing in place X can be assigned to a variety of places A, B, and C, as long as they match the demographic constraints such as age, sex, marital status etc. Although recent updates of the code have led to more constraints on how to perform this selection process, it is important to remember that the creation of the SIPHER Synthetic Population is based on associations and descriptive statistics. It can only ever serve as an approximation of the true population in Scotland, England and Wales - which is likely to be much more heterogeneous and diverse than the population captured in the synthetic data source. Therefore, all results obtained from the SIPHER Synthetic Population should always be interpreted carefully as model output, and not as equivalent to a population-based register.
Geography	Individuals in the SIPHER Synthetic Population have a geography assigned to them (a synthetic DZ/LSOA). This allows all levels of geography upwards from DZ/LSOA Level for Scotland, England and Wales - excluding Northern Ireland - to be analysed and modelled.
Variables / Indicators	A large variety of variables can be included. This includes all variables included in the Understanding Society survey - the underlying survey data source. It is also possible to estimate other derived variables from this data source, for example ‘Equivalent Income’, using the ‘Equivalent Income Calculator’ method.
Time Period	The latest release reflects the years 2019-2021. Results from the UK census 2011 are used as constraints for the spatial microsimulation - the process generating the Synthetic Population. Preliminary updated versions for England and Wales, based on the UK census 2021, are available. However, Scotland has not yet published all required input data from its most recent census.
Missing Data	The level of missing information for a particular variable is determined by the levels of missingness in the underlying Understanding Society survey.
Examples / Link with Other Models and Data	The Synthetic Population is used as the underlying data source in several SIPHER models. These include: (1) dynamic systems model, (2) static and dynamic microsimulation and (3) decision support tool. Information covered in the Synthetic Population can be extended by adding additional variables from other data sources. These could be datasets that are not publicly available. In addition, the SIPHER Synthetic Population can be used to derive more complex concepts such as the ‘Equivalent Income’ - a variable which is calculated using the ‘Equivalent Income Calculator’ method.
Software Requirements	Requires software that can handle the size of the data file, such as R or Python. An interactive R Shiny dashboard allows code-free exploration of an aggregated version: https://sipherdashboard.sphsu.gla.ac.uk/
Data Requirements / Restrictions	The SIPHER Synthetic Population is available for full independent use via the UK Data Service’s Curated Data Collection. To set up the SIPHER Synthetic Population, the synthetic population file (UK Data Service ID: SN9277) must be linked with Understanding Society survey data (UK Data Service ID: SN6614) - as is typically done for area-level linkages of surveys. Both datasets are subject to the General End-User License Agreement terms and conditions, and can be downloaded free of charge directly from the UK Data Service website.
Data / Code Available	Due to the underlying license agreement, the dataset cannot be shared as an open access version. However, the dataset can be downloaded through the UK Data Service website, after acceptance of the General End-User license terms and conditions: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details In addition, we have made a wealth of supplementary material available, documenting creation, validation, linkage, and exploration of the dataset: https://reshare.ukdataservice.ac.uk/856754/
Training	We have provided a comprehensive, open access User Guide for our SIPHER Synthetic Population. The User Guide provides background information and explains how to set up the data and analyse it swiftly: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf
Additional Resources	SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021 (UK Data Service Curated Collection, SN9277): https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details Comprehensive User Guide: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf Supplementary Resources: https://reshare.ukdataservice.ac.uk/856754/ Paper describing the statistical creation process: https://www.nature.com/articles/s41597-022-01124-9 Understanding Society Survey Blog: https://www.understandingsociety.ac.uk/news/2024/07/10/building-synthetic-population-data/ Introduction Video: https://www.youtube.com/watch?v=CkiORY7GSLc
Contact	sipher@glasgow.ac.uk
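
The linkage described under Data Requirements / Restrictions, followed by an aggregation to area level, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the User Guide: the column names `zone_id`, `person_id` and `sf12_mcs`, the helper functions, and all values are hypothetical placeholders, and the real files from SN9277 and SN6614 should be read and linked as documented in the User Guide.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical synthetic population file: one row per synthetic individual,
# holding the assigned small area and the ID of the survey respondent drawn.
synthetic_population = [
    {"zone_id": "A", "person_id": 1},
    {"zone_id": "A", "person_id": 2},
    {"zone_id": "B", "person_id": 2},  # one respondent can be assigned to several areas
    {"zone_id": "B", "person_id": 3},
]

# Hypothetical survey extract: one row per respondent with one attribute.
survey = [
    {"person_id": 1, "sf12_mcs": 50.0},
    {"person_id": 2, "sf12_mcs": 40.0},
    {"person_id": 3, "sf12_mcs": None},  # missing in the survey, so missing after linkage
]


def link(synthetic_rows, survey_rows, key="person_id"):
    """Attach survey attributes to each synthetic individual (left join on key)."""
    by_id = {row[key]: row for row in survey_rows}
    return [{**row, **by_id.get(row[key], {})} for row in synthetic_rows]


def area_means(linked_rows, variable, area="zone_id"):
    """Average a variable per area, skipping individuals with missing values."""
    values = defaultdict(list)
    for row in linked_rows:
        if row.get(variable) is not None:
            values[row[area]].append(row[variable])
    return {zone: mean(vals) for zone, vals in values.items()}


linked = link(synthetic_population, survey)
profile = area_means(linked, "sf12_mcs")
print(profile)
```

In this toy data, area A averages 45.0 and area B averages 40.0, because the respondent with a missing score is skipped. This mirrors the point made under Missing Data: gaps in the survey carry through to the linked file.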

Health Indicator Dataset

An illustration of the spatial distribution of quality-adjusted life expectancy for local authorities across Great Britain in 2018-2020.

Health Indicator Dataset
Characteristic	Details
Status	Ready
Purpose	A variety of population health indicators for small geographical units (local authorities and LSOAs/MSOAs) for use in statistical analyses and monitoring of area-level health inequalities.
Context	Modelling the impact of public policy on health requires a shared understanding of how we conceptualise and measure health as an outcome. We need a set of health indicators that are meaningful in the context of understanding the effects of policies and interventions of interest to SIPHER, such as those aiming to create an inclusive economy or improve mental health. These indicators can be derived either from synthetic data (e.g., SIPHER Synthetic Population) or from non-synthetic data sources (e.g., ONS/NRS data).
Strengths	Small-area health indicators can be used to monitor area-level health inequalities or as inputs in statistical models. In addition, all health outcome measures can be attached to the Synthetic Population representing area-level health indicators. SIPHER reviewed the available measures and conducted a consensus process with SIPHER colleagues to agree on a final set of indicators. The criteria used were: 1. Interpretability – accessible & meaningful to decision makers; 2. Sensitivity to policy – the indicator can plausibly show the effects of policy; 3. Indicator can show impacts of pandemic on health; 4. Timeliness – refers to the current health state; 5. Availability of timeseries data; 6. Changes in mental AND physical health can be separately studied; 7. Regular updates into the future are expected; 8. Comparability – between areas, ideally comparable between England & Scotland; 9. High resolution – available for small areas with LA as a minimum; 10. Disaggregate – available by subgroups (e.g. broken dow
Limitations	The dataset cannot resolve situations where no data is available at all or where sampling in surveys is not representative of small geographical units.
Geography	The exact geographical resolution is indicator-dependent. Typically, the following resolutions are available for Mortality: DZ/LSOA Level for Scotland, England and Wales and LA Level
Variables / Indicators	The dataset includes measures of mortality, physical, and mental health, and composite measures combining mortality and health. It is open to data updates, and additional health indicators can be estimated and incorporated if required.
Time Period	DZ/LSOA/MSOA Level: typically, cross-sectional representing the period covered by the synthetic population. Local Authority level: typically, longitudinal for 2004-2020 when based on non-synthetic data. Data will be updated as new data becomes available.
Missing Data	Level of missing data determined by data availability. Older data not always comparable across time or form for some indicators.
Examples / Link with Other Models and Data	A portfolio of area-level summary indicators on mortality, health, and composite indicators that combine information on mortality and health. These indicators can be attached as area-level indicators to the SIPHER Synthetic Population. In addition, health measures are used in the Local Authority clustering work, as well as in the Dynamic Systems model.
Software Requirements	Requires software that can handle the size of the data file, such as R or Python.
Data Requirements / Restrictions	For key indicators such as QALE, Life Expectancy, and Lifespan Variation it is planned that a final version of the dataset and the underlying code will be made publicly available. In order to fully reproduce health measures requiring the Synthetic Population, access to the Synthetic Population is required.
Data / Code Available	Work in progress, final dataset will be made publicly available. Pipeline of code for estimation of Quality-Adjusted Life Expectancy (QALE) is available.
Training	Online pipeline example via GitHub.
Additional Resources	Choosing the SIPHER health Indicators Report: https://www.gla.ac.uk/media/Media_970682_smxx.pdf and QALE exemplar: https://github.com/AndreasxHoehn/QALE_Exemplar Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/
Contact	sipher@glasgow.ac.uk

List of Available Health Indicators
Health Indicator	Definition	Geographical Resolution	Source
Life Expectancy (LE)	We often use mortality to measure population health. LE is a measure of mortality. It quantifies the average length of life, in years, for a newborn individual in a lifetable cohort. A lifetable cohort represents a hypothetical birth cohort of individuals. We assume that this cohort is exposed to the mortality rates observed in a period throughout their entire lives. LE is an implicitly age-standardised measure which allows direct comparisons over time and between areas.	GB, National, LA, MSOA	own calculations based on ONS/NRS data
E-Dagger	E-Dagger is an absolute measure of inequalities in mortality and typically presented in years. It quantifies the average number of years an individual of the lifetable cohort dies too early - in comparison to all other individuals in this population. In a perfectly equal population, everybody dies at the same age and E-Dagger would be 0.	GB, National, LA, MSOA	own calculations based on ONS/NRS data
Keyfitz Entropy	Keyfitz Entropy is a relative measure of inequalities in mortality. It is estimated by standardising the measure E-Dagger by the level of life expectancy. This means that it is dimensionless, which allows a better comparison between populations with very contrasting levels of mortality.	GB, National, LA, MSOA	own calculations based on ONS/NRS data
Healthy Life Expectancy (HLE)	Life expectancy does not distinguish whether years of life are spent in good or poor health. Health expectancy indicators such as HLE aim to overcome this limitation by combining information on mortality and health. HLE is a measure of average and can be understood as an adjustment of life expectancy by the health status - with health being simply distinguished into a healthy and an unhealthy state.	GB, National, LA	directly from PHE and PHS websites
Quality-Adjusted Life Expectancy (QALE)	Life expectancy does not distinguish whether years of life are spent in good or poor health. Health expectancy indicators such as QALE aim to overcome this limitation by combining information on mortality and health. QALE is a measure of average and can be understood as an adjustment of life expectancy by quality of life. In contrast to the binary approach to health (healthy/unhealthy) reflected in HLE, QALE captures health on a much more granular scale and incorporates mental and physical elements of health.	GB, National, LA, MSOA	own calculations based on ONS/NRS data for the mortality components and Understanding Society data for the health components
Age Standardised SF-12 Mental Health (SF-12 MCS)	SF-12 MCS provides a summary score for the mental health status of individuals and populations. The score is obtained from a standardised set of questions, the SF-12. As these measures are very sensitive to the age structure, we use age-standardisation to allow for a direct comparison between areas and across time.	GB, National, LA	own calculations based on Understanding Society data for the health components and ONS/NRS data for age standardisation
Age Standardised SF-12 Physical Health (SF-12 PCS)	SF-12 PCS provides a summary score for the physical health status of individuals and populations. The score is obtained from a standardised set of questions, the SF-12. As these measures are very sensitive to the age structure, we use age-standardisation to allow for a direct comparison between areas and across time.	GB, National, LA	own calculations based on Understanding Society data for the health components and ONS/NRS data for age standardisation
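
The relationship between the three mortality indicators in the table (Life Expectancy, E-Dagger and Keyfitz Entropy) can be illustrated with a small life table in Python. This is a sketch under simplifying assumptions, not the SIPHER estimation pipeline: it uses single-year age intervals, assumes deaths occur at mid-interval, and approximates E-Dagger with the mean of remaining life expectancy at the start and end of each interval, which is one common discrete approximation. The function names and input values are hypothetical.

```python
def life_table(qx):
    """Build a single-year life table from death probabilities qx.

    Assumptions: radix of 1, deaths at mid-interval, and qx[-1] == 1.0
    (nobody survives the last age interval).
    """
    if qx[-1] != 1.0:
        raise ValueError("the last death probability must be 1.0")
    lx, dx = [1.0], []
    for q in qx:
        dx.append(lx[-1] * q)        # deaths within the age interval
        lx.append(lx[-1] - dx[-1])   # survivors to the next exact age
    lx.pop()                         # survivors beyond the last interval (always 0)
    person_years = [l - 0.5 * d for l, d in zip(lx, dx)]
    ex, remaining = [], 0.0
    for l, years in zip(reversed(lx), reversed(person_years)):
        remaining += years           # person-years lived at this age and above
        ex.append(remaining / l if l > 0 else 0.0)
    ex.reverse()
    return lx, dx, ex


def e_dagger(dx, ex):
    """Average years of life lost at death, using the mean of e(x) and e(x+1)."""
    ex_next = ex[1:] + [0.0]
    return sum(d * 0.5 * (e + e_next) for d, e, e_next in zip(dx, ex, ex_next))


def keyfitz_entropy(dx, ex):
    """E-Dagger divided by life expectancy at birth (dimensionless)."""
    return e_dagger(dx, ex) / ex[0]


# Hypothetical two-age example: half the cohort dies in each one-year interval.
lx, dx, ex = life_table([0.5, 1.0])
print(ex[0], e_dagger(dx, ex), keyfitz_entropy(dx, ex))
```

For this toy input, life expectancy at birth is 1.0 years, E-Dagger is 0.5 years and Keyfitz Entropy is 0.5. If all deaths fall into a single age interval, the sketch returns an E-Dagger of 0.25 rather than exactly 0, because deaths are assumed to be spread evenly within that year.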

Inclusive Economy (Local Authority Level) Dataset

Inclusive Economy indicator cluster solution.

Inclusive Economy (Local Authority Level) Dataset
Characteristic	Details
Status	Ready
Purpose	This dataset was designed to provide a meaningful operationalisation of the underlying concept of the inclusive economy at local authority level, and to enable statistical models to further explore the concept.
Context	SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. There are multiple approaches and definitions of what constitutes an inclusive economy. To date, there is no single definition of the concept. In response, SIPHER has developed a collection of indicators for researchers and policymakers which describes the extent and nature of economic inclusion across local authorities in Great Britain. The creation of the dataset has been informed by an initial review of the underlying theoretical concepts. The selection and estimation of all indicators benefited from co-production between SIPHER researchers and policy partners.
Strengths	The dataset has been subject to a thorough geographical harmonisation and review process. In addition, the dataset contains a number of supplementary health and demographic indicators for all local authorities. The major strength of this dataset is the wide range of potential applications; from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing. The dataset is available as an open access resource via the Open Science Framework: https://osf.io/vnsur/
Limitations	For a few of the indicators, exact definitions differ between countries. For example, there are different definitions of fuel poverty in use in Scotland and England. In these cases, national deciles were created and comparable alternative indicators were identified. For example, food insecurity was used as an alternative cost-of-living indicator.
Geography	Longitudinal (2017-2021) and geographically harmonised data is available at the level of local authorities in England, Scotland, and Wales. The dataset covers all 363 local authorities in Great Britain, reflecting their 2021 boundaries according to ONS definition.
Variables / Indicators	Details on all indicators are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources.
Time Period	Longitudinal data are available for every year between 2017 and 2021.
Missing Data	Missing data were imputed using a sophisticated multiple imputation algorithm. In some cases, only cross-sectional measurements were available, which were carried forward or backward. For example, local elections (Indicator 6B) did not take place every single year.
Examples / Link with Other Models and Data	The dataset is currently used in a k-means clustering machine learning study. The primary aim of this study is to identify clusters of similar local authorities and to examine the association of each cluster with a number of health outcomes. In another application, we explore the association between Quality-Adjusted Life Expectancy (QALE) and indicators of economic inclusion.
Software Requirements	Requires software that can load data, such as Excel, R, or Python. Access SIPHER Inclusive Economy Dataset Interactive Map - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ieinteractivemap/#d.en.1054750.
Data Requirements / Restrictions	The final dataset is available as an open access resource.
Data / Code Available	The final dataset and additional documentation are publicly available via the Open Science Framework: https://osf.io/vnsur/.
Training	The data is accompanied by a comprehensive data dictionary which provides context relating to all variables included.
Additional Resources	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ SIPHER Inclusive Economy Indicator Set: Technical paper [PDF] - https://www.gla.ac.uk/media/Media_970680_smxx.pdf SIPHER Inclusive Economy Indicator Set: Summary [PDF] - https://www.gla.ac.uk/media/Media_1029792_smxx.pdf Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Measuring the Inclusive Economy Blog - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1049629_en.html
Contact	sipher@glasgow.ac.uk
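
The carry forward or backward step mentioned under Missing Data can be sketched as follows. This is a minimal Python illustration, not the imputation code used for the dataset: the function name, the order of the two passes (forward first, then backward for leading gaps) and the turnout values are assumptions made for the example.

```python
def carry_forward_backward(series):
    """Fill gaps with the last observed value, then back-fill any leading gaps."""
    filled = list(series)
    last_seen = None
    for i, value in enumerate(filled):          # forward pass
        if value is None:
            filled[i] = last_seen
        else:
            last_seen = value
    next_seen = None
    for i in range(len(filled) - 1, -1, -1):    # backward pass for leading gaps
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]
    return filled


# Hypothetical local election turnout (%) for 2017-2021,
# with elections held in 2018 and 2021 only.
turnout = [None, 36.0, None, None, 41.5]
print(carry_forward_backward(turnout))
```

Here the 2018 value fills 2019 and 2020 by carrying forward and fills 2017 by carrying backward, giving 36.0 for 2017 to 2020 and 41.5 for 2021. A series with no observations at all stays empty.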

List of Inclusive Economy Indicators (Local Authority Level)
Type	Indicator	Explanation
Economic outcomes	Participation in paid employment	Percentage of working-age people (aged 16-64) who are employed
Economic outcomes	Involuntary exclusion from the labour market	Percentage of working-age people (aged 16-64) who are inactive due to ill health or disability
Economic outcomes	Wealth inequality	Ratio of median house prices in most expensive wards to median in least expensive
Economic outcomes	Earnings inequality	Ratio of weekly earnings (residents in full-time work) between 80th and 20th percentiles
Economic outcomes	Poverty	Percentage of children living in low income households (based on national relative threshold, after adjustment for housing costs)
Economic outcomes	Decent pay	Percentage of employee jobs that are paid below the Living Wage (as defined by the Living Wage Foundation)
Economic outcomes	Job security/precarity	Percentage of employees on a permanent contract
Wider outcomes / enablers	Skills and qualifications	Percentage of adults aged 20-49 with a Level 2 or higher National Vocational Qualification (NVQ)
Wider outcomes / enablers	Digital exclusion	Percentage of individuals who are classified as a) e-withdrawn, b) passive and uncommitted internet users, or c) settled offline communities; based on the Internet User Classification (IUC)
Wider outcomes / enablers	Physical connectivity	Public transport accessibility measure, percentage of Lower Super Output Area (LSOAs)/Data Zones (DZs) within the local authority area that are among the 50% least accessible LSOAs/DZs for each devolved nation
Wider outcomes / enablers	Housing affordability	Ratio of median house prices to median gross annual earnings
Wider outcomes / enablers	Costs of living	Percentage of households experiencing food insecurity
Wider outcomes / enablers	Inclusion in decision-making	Percentage of eligible voters participating in local elections
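
Several indicators in the list are ratios of distributional summaries. As an illustration of the earnings inequality indicator, the Python sketch below computes the ratio between the 80th and 20th percentiles of weekly earnings. It is not the method used to build the dataset: the function name and the earnings values are hypothetical, and the "inclusive" quantile method from Python's `statistics` module is an assumption, since percentile definitions differ between sources.

```python
from statistics import quantiles


def earnings_ratio_80_20(weekly_earnings):
    """Ratio of the 80th to the 20th percentile of weekly earnings."""
    # n=5 returns four cut points: the 20th, 40th, 60th and 80th percentiles.
    cuts = quantiles(weekly_earnings, n=5, method="inclusive")
    return cuts[3] / cuts[0]


# Hypothetical weekly earnings (GBP) of full-time residents in one local authority.
weekly_earnings = [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
print(earnings_ratio_80_20(weekly_earnings))
```

With these values the 20th percentile is 400 and the 80th percentile is 1000, so the ratio is 2.5. A larger ratio indicates a wider gap between higher and lower earners.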

Inclusive Economy (Ward Level) Dataset

Participation in local elections for electoral wards in Glasgow, obtained from the CDRC Interactive Map.

Inclusive Economy (Ward Level) Dataset
Characteristic	Details
Status	Ready
Purpose	This dataset was designed to provide a meaningful operationalisation of the underlying concept of the inclusive economy at electoral ward level.
Context	SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. There are multiple approaches and definitions of what constitutes an inclusive economy. To date, there is no single definition of the concept. In response, SIPHER has developed a collection of indicators for researchers and policymakers which describes the extent and nature of economic inclusion across electoral wards in Great Britain. The creation of the dataset has been informed by an initial review of the underlying theoretical concepts. The selection and estimation of all indicators benefited from co-production between SIPHER researchers and policy partners.
Strengths	The dataset has been subject to a thorough geographical harmonisation and review process. In addition, the dataset contains a number of supplementary wellbeing and demographic indicators for all local authorities. The major strength of this dataset is the wide range of potential applications, from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing. The dataset is available as an open access resource via the Open Science Framework: https://osf.io/s24ye/
Limitations	The metrics for two indicators differ from those in the SIPHER Inclusive Economy (Local Authority Level) Dataset: (1) for Indicator 5A (poverty), low income before housing costs (BHC) was used rather than after housing costs (AHC); (2) for Indicator 5B (cost of living), fuel poverty was used rather than food insecurity.
Geography	Longitudinal (2019-2021) and geographically harmonised data is available at the level of electoral wards in England, Scotland, and Wales. The dataset covers 7,973 of 8,020 wards in Great Britain, reflecting their 2022 boundaries according to ONS definition.
Variables / Indicators	Details on all indicators are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources.
Time Period	Longitudinal data are available for every year between 2019 and 2021.
Missing Data	Missing data were imputed using a sophisticated multiple imputation algorithm. In some cases, only cross-sectional measurements were available, which were carried forward or backward. For example, local elections (Indicator 6B) did not take place every single year.
Examples / Link with Other Models and Data	The dataset relies on the SIPHER Synthetic Population for 8 of the 13 inclusive economy indicators. It also includes several demographic and wellbeing indicators, the latter in the form of the Short Form-12 (SF-12) physical and mental component scores (PCS and MCS).
Software Requirements	Requires software that can load the data, such as Excel, R, or Python. Access SIPHER Inclusive Economy Dataset Interactive Map - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ieinteractivemap/#d.en.1054750
Data Requirements / Restrictions	The final dataset is available as an open access resource.
Data / Code Available	The final dataset and additional documentation are publicly available via the Open Science Framework: https://osf.io/s24ye/.
Training	The data is accompanied by a comprehensive data dictionary which provides context relating to all variables included.
Additional Resources	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ SIPHER Inclusive Economy Indicator Set: Technical paper [PDF] - https://www.gla.ac.uk/media/Media_970680_smxx.pdf SIPHER Inclusive Economy Indicator Set: Summary [PDF] - https://www.gla.ac.uk/media/Media_1029792_smxx.pdf Inclusive Economy Indicators for Electoral Wards Blog - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1132578_en.html
Contact	sipher@glasgow.ac.uk
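
The Missing Data entry above notes that indicators measured only in some years (for example, local election turnout) were carried forward or backward. The following is a minimal sketch of that idea only, not the SIPHER imputation procedure: the function name carry_forward_backward and the turnout figures are hypothetical, and the multiple imputation step is not shown.

```python
from typing import Optional, Sequence


def carry_forward_backward(values: Sequence[Optional[float]]) -> list:
    """Fill gaps in a yearly series (hypothetical helper, for illustration).

    Each missing value takes the most recent earlier observation; any gaps
    before the first observation take the first observed value.
    """
    filled = list(values)

    # Forward pass: carry the last observed value into later gaps.
    last_seen = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last_seen
        else:
            last_seen = value

    # Backward pass: fill any remaining leading gaps from the next value.
    next_seen = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]

    return filled


# Made-up turnout (%) for one ward where an election took place only in 2020.
years = [2019, 2020, 2021]
turnout = [None, 41.5, None]
print(dict(zip(years, carry_forward_backward(turnout))))
```

A series with no observations at all is returned unchanged, so such cases would still need a separate imputation approach.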

List of Inclusive Economy Indicators (Ward Level)
Type	Indicator	Explanation
Economic outcomes	Participation in paid employment	Percentage of working-age people (aged 16-64) who are employed
Economic outcomes	Involuntary exclusion from the labour market	Percentage of working-age people who are long-term unemployed or inactive due to ill health or disability
Economic outcomes	Wealth inequality	Ratio of median house prices in most expensive neighbourhood to median in least expensive
Economic outcomes	Earnings inequality	Ratio of weekly earnings (residents in full-time work) between 80th and 20th percentiles
Economic outcomes	Poverty	Percentage of children living in low income households (based on national relative threshold, before adjustment for housing costs)
Economic outcomes	Decent pay	Percentage of employee jobs that are paid at or above Real Living Wage (as defined by the Living Wage Foundation)
Economic outcomes	Job security	Percentage of employees in permanent work
Wider outcomes / enablers	Skills and qualifications	Percentage of adults aged 20-49 with a Level 2 or higher National Vocational Qualification (NVQ)
Wider outcomes / enablers	Digital exclusion	Percentage of individuals who are classified as a) e-withdrawn, b) passive and uncommitted internet users, or c) settled offline communities; based on the Internet User Classification (IUC)
Wider outcomes / enablers	Physical connectivity	Public transport accessibility measure: percentage of Lower Super Output Areas (LSOAs)/Data Zones (DZs) within the local authority area that are among the 50% least accessible LSOAs/DZs for each devolved nation
Wider outcomes / enablers	Housing affordability	Ratio of median house prices to median gross annual earnings
Wider outcomes / enablers	Costs of living	Percentage of households experiencing fuel poverty
Wider outcomes / enablers	Inclusion in decision-making	Percentage of eligible voters participating in local elections
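
Several indicators in the list above are ratios. As a rough illustration of how two of them could be computed from raw values, the sketch below uses only the Python standard library. The function names and input figures are hypothetical, and the percentile method (the default of statistics.quantiles) is an assumption; the Technical Report remains the authoritative definition.

```python
from statistics import median, quantiles


def earnings_ratio_80_20(weekly_earnings):
    """Ratio of the 80th to the 20th percentile of weekly earnings."""
    # n=5 returns the cut points at the 20th, 40th, 60th and 80th percentiles.
    cuts = quantiles(weekly_earnings, n=5)
    return cuts[3] / cuts[0]


def housing_affordability_ratio(house_prices, annual_earnings):
    """Ratio of median house price to median gross annual earnings."""
    return median(house_prices) / median(annual_earnings)


# Made-up figures for a single area.
earnings = [300, 400, 500, 600, 700, 800, 900, 1000, 1100]
print(earnings_ratio_80_20(earnings))

prices = [150_000, 200_000, 250_000]
salaries = [25_000, 30_000, 40_000]
print(round(housing_affordability_ratio(prices, salaries), 2))
```

Higher values of either ratio indicate greater earnings inequality or less affordable housing, respectively.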

SIPHER-7 Wellbeing Domain Preferences (Survey Dataset)

SIPHER-7 Wellbeing Domain Preferences (Survey Dataset)
Characteristic	Details
Status	In Progress/Ready Soon
Purpose	To represent a multi-dimensional measure of wellbeing, consisting of seven indicators, in terms of a single index metric, equivalent income.
Context	SIPHER’s WS6 team has developed a wellbeing indicator set comprising seven indicators - SIPHER-7. While SIPHER-7 describes people’s wellbeing across these seven indicators, when some indicators improve and others worsen, it is difficult to judge whether overall wellbeing is improving or worsening. The purpose of this part of the project is to collapse the multi-dimensional wellbeing indicators into a single index metric for wellbeing, equivalent income. To do this, four surveys using Discrete Choice Experiments (DCE) were conducted with a sample of the UK public. Participants were asked to review a set of ten choice tasks, each involving two imaginary scenarios described in terms of SIPHER-7, and select which scenario they believed was better. In three of the surveys, participants were asked to complete the tasks from a personal perspective (i.e., which scenario they would want for themselves), and in the remaining survey, participants were asked to complete the task from a social perspective (i.e., which scenario they think would be better for policy makers to bring about for others). The econometrically estimated parameters represent the relative values given to the seven wellbeing indicators of SIPHER-7 by samples of the UK general public.
Strengths	The DCE data on relative preferences allow the calculation of equivalent income - a quantitative preference-based single metric of wellbeing - for any combination of SIPHER-7 indicators. The samples are large (ranging from 1,000 to 3,000, totalling just under 11,000) and representative of the UK general public in terms of age and sex.
Limitations	Currently not available.
Geography	The surveys collected data from participants resident in the UK with sampling quotas for age and for sex.
Variables / Indicators	In addition to the DCE choice data, the surveys include participant self-reported data on: SIPHER-7; household size; age; gender; etc. Surveys (1) and (2) use the original SIPHER-7. Surveys (3) and (4) use the revised version of SIPHER-7.
Time Period	There are four datasets: (1) people’s personal preferences in autumn 2020; (2) people’s personal preferences in autumn 2021; (3) people’s personal preferences in spring 2022; (4) people’s social preferences in spring 2022. Dataset (2) includes returning respondents from (1). Otherwise, the observations are independent.
Missing Data	Currently not available.
Examples / Link with Other Models and Data	The estimated parameters can be used to calculate an equivalent income variable in the Synthetic Population.
Software Requirements	The main choice data and respondent background variables are saved in Stata format and require software that can read Stata files.
Data Requirements / Restrictions	Currently not available.
Data / Code Available	Currently not available.
Training	Currently not available.
Additional Resources	Explore - https://www.gla.ac.uk/research/az/sipher/products/sipher-7wellbeingindicators/ Blog: Collapsing multi-dimensional wellbeing into equivalent income (March 2022) - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1019908_en.html
Contact	sipher@glasgow.ac.uk

Aversion to Inequality (Survey Dataset)

An example question from the underlying survey: Which inequality setting would you consider more favorable in a country?

Aversion to Inequality (Survey Dataset)
Characteristic	Details
Status	In Progress/Ready Soon
Purpose	To elicit public preferences regarding trade-offs between improving wellbeing and reducing inequality.
Context	Public policies aim to improve wellbeing and reduce wellbeing inequality, but it is not always possible to do both. How do the public balance the trade-off between improving wellbeing and reducing inequality? The relative importance people place on increasing averages and reducing inequalities (or “inequality aversion”) was elicited from a sample of the UK general public (n=53). Respondents participated in one of eleven online discussion groups, where a series of quantitative trade-off exercises were explained and discussed. Each respondent then completed the same exercise individually. The exercises covered aversion to inequality in: (a) an overall measure of wellbeing (equivalent income); (b) lifetime health across otherwise equal individuals; and (c) lifetime health across the rich and poor.
Strengths	Public policies aim to improve wellbeing and to reduce wellbeing inequality. When there is a conflict between these, policy makers need to make difficult decisions. The quantitative data on inequality aversion is derived from discussion groups, where participants had the opportunity to examine the trade-off exercise in detail. The results help inform policy makers on the trade-offs between the two policy aims that members of the public would support.
Limitations	Currently not available.
Geography	UK with sampling quotas for age and for sex.
Variables / Indicators	In addition to the inequality aversion task, the survey includes participant self-reported data on: SIPHER-7; household size; age; gender; etc.
Time Period	Data collected: summer - autumn 2022.
Missing Data	Currently not available.
Examples / Link with Other Models and Data	The estimated inequality aversion parameter is used to identify the optimal trade-off between maximising wellbeing and reducing inequality in the decision support tools.
Software Requirements	The main trade-off data and respondent background variables are saved in Stata format and require software that can read Stata files.
Data Requirements / Restrictions	Currently not available.
Data / Code Available	Currently not available.
Training	Currently not available.
Additional Resources	Currently not available.
Contact	sipher@glasgow.ac.uk

HWMIC (Health and Wellbeing Multi-Instrument Comparison) Dataset

HWMIC (Health and Wellbeing Multi-Instrument Comparison) Dataset
Characteristic	Details
Status	In Progress/Ready Soon
Purpose	A dataset with a battery of self-reported health and wellbeing indicators from a large UK sample, oversampling from Scotland.
Context	Different surveys use different health outcome indicators. Therefore, data might be available for one indicator set when another is required. For example, answers to SF-12 survey items are available but a WEMWBS value is required. This is a large cross-sectional online survey of the general public (n=12,401) in which respondents were asked to self-report their health and wellbeing across a battery of questions. This dataset allows the estimation of a statistical mapping algorithm between the different indicator sets.
Strengths	Different surveys have different health and wellbeing indicators, and this dataset allows the estimation of a statistical mapping algorithm between them. This would allow predicting SIPHER-7 information where the relevant variables are not available.
Limitations	Currently not available.
Geography	The survey collected data from participants resident in the UK with sampling quotas for age and for sex. Oversamples Scotland.
Variables / Indicators	The indicator sets and questions included in the survey: SIPHER-7; ICECAP-A; EQ-5D-5L; SF-12 v2; HUI; WEMWBS; EQ-HWB; ONS-4; Understanding Society items on crime and housing; items from the Labour Force Survey, the Living Wage Foundation questionnaire; education, income, ethnicity, children, informal caregiving; gender, age; etc. Includes sampling weights to correct for age and sex with respect to the mid-year UK population estimate.
Time Period	Data collected: late 2022.
Missing Data	Currently not available.
Examples / Link with Other Models and Data	Currently not available.
Software Requirements	Currently saved in Stata format and requires software that can read Stata files.
Data Requirements / Restrictions	Currently not available.
Data / Code Available	Currently not available. The dataset will be archived. There is no associated code.
Training	Currently not available.
Additional Resources	Currently not available.
Contact	sipher@glasgow.ac.uk
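
The Variables / Indicators entry above mentions sampling weights that correct for age and sex. As a minimal sketch of how such weights are typically applied when summarising a score, the example below computes a weighted mean. The function name, scores and weights are hypothetical and are not taken from the HWMIC dataset.

```python
def weighted_mean(values, weights):
    """Mean of `values`, with each respondent counted by their sampling weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Two made-up respondents: the second belongs to an under-sampled group
# and therefore carries three times the weight of the first.
scores = [50.0, 40.0]
weights = [1.0, 3.0]
print(weighted_mean(scores, weights))  # 42.5, versus an unweighted mean of 45.0
```

Without the weights, over-sampled groups (such as respondents in Scotland) would have a disproportionate influence on UK-wide summaries.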

Compare All Data Products

Compare All Data Products
Characteristic	SIPHER Synthetic Population	Health Indicator Dataset	Inclusive Economy (Local Authority Level) Dataset	Inclusive Economy (Ward Level) Dataset	SIPHER-7 Wellbeing Domain Preferences (Survey Dataset)	Aversion to Inequality (Survey Dataset)	HWMIC (Health and Wellbeing Multi-Instrument Comparison) Dataset
Status	Ready	Ready	Ready	Ready	In Progress/Ready Soon	In Progress/Ready Soon	In Progress/Ready Soon
Purpose	A quality-controlled, publicly available data source containing attribute-rich data at the individual level - with the aim to create a digital twin for every adult in the population with a large amount of associated information about each person.	A variety of population health indicators for small geographical units (local authorities and LSOAs/MSOAs) for use in statistical analyses and monitoring of area-level health inequalities.	This dataset was designed to provide a meaningful operationalisation of the underlying concept of the inclusive economy at local authority level, and to enable statistical models to further explore the concept.	This dataset was designed to provide a meaningful operationalisation of the underlying concept of the inclusive economy at electoral ward level.	To represent a multi-dimensional measure of wellbeing, consisting of seven indicators, in terms of a single index metric, equivalent income.	To elicit public preferences regarding trade-offs between improving wellbeing and reducing inequality.	A dataset with a battery of self-reported health and wellbeing indicators from a large UK sample, oversampling from Scotland.
Context	Individual level data enable us to understand individuals’ situations, what happens to them over time or when affected by changes due to external events or policies. The lack of a comprehensive register-based system in Great Britain has made it challenging to access data on individuals across multiple domains. The SIPHER Synthetic Population helps bridge this gap by providing a representative, attribute-rich dataset reflecting the whole of the adult population in Great Britain. By randomly selecting individuals from a survey and assigning them to small geographical areas based on census statistics, the SIPHER Synthetic Population ensures that the distribution of demographic characteristics for all sampled individuals corresponds exactly to the true demographic structure within each small census output area. This enables researchers to derive area-level profiles which would otherwise not be available. In more complex applications, the dataset can be used to simulate policy interventions and explore their potential impact on individuals and households at a granular resolution, distinguishing small geographical areas and even population subgroups within these areas.	Modelling the impact of public policy on health requires a shared understanding of how we conceptualise and measure health as an outcome. We need a set of health indicators that are meaningful in the context of understanding the effects of policies and interventions of interest to SIPHER, such as those aiming to create an inclusive economy or improve mental health. These indicators can be derived either from synthetic data (e.g., SIPHER Synthetic Population) or from non-synthetic data sources (e.g., ONS/NRS data).	SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. There are multiple approaches and definitions of what constitutes an inclusive economy. To date, there is no single definition of the concept. In response, SIPHER has developed a collection of indicators for researchers and policymakers which describes the extent and nature of economic inclusion across local authorities in Great Britain. The creation of the dataset has been informed by an initial review of the underlying theoretical concepts. The selection and estimation of all indicators benefited from co-production between SIPHER researchers and policy partners.	SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. There are multiple approaches and definitions of what constitutes an inclusive economy. To date, there is no single definition of the concept. In response, SIPHER has developed a collection of indicators for researchers and policymakers which describes the extent and nature of economic inclusion across electoral wards in Great Britain. The creation of the dataset has been informed by an initial review of the underlying theoretical concepts. The selection and estimation of all indicators benefited from co-production between SIPHER researchers and policy partners.	SIPHER’s WS6 team has developed a wellbeing indicator set comprising seven indicators - SIPHER-7. While SIPHER-7 describes people’s wellbeing across these seven indicators, when some indicators improve and others worsen, it is difficult to judge whether overall wellbeing is improving or worsening. The purpose of this part of the project is to collapse the multi-dimensional wellbeing indicators into a single index metric for wellbeing, equivalent income. To do this, four surveys using Discrete Choice Experiments (DCE) were conducted with a sample of the UK public. Participants were asked to review a set of ten choice tasks, each involving two imaginary scenarios described in terms of SIPHER-7, and select which scenario they believed was better. In three of the surveys, participants were asked to complete the tasks from a personal perspective (i.e., which scenario they would want for themselves), and in the remaining survey, participants were asked to complete the task from a social perspective (i.e., which scenario they think would be better for policy makers to bring about for others). The econometrically estimated parameters represent the relative values given to the seven wellbeing indicators of SIPHER-7 by samples of the UK general public.	Public policies aim to improve wellbeing and reduce wellbeing inequality, but it is not always possible to do both. How do the public balance the trade-off between improving wellbeing and reducing inequality? The relative importance people place on increasing averages and reducing inequalities (or “inequality aversion”) was elicited from a sample of the UK general public (n=53). Respondents participated in one of eleven online discussion groups, where a series of quantitative trade-off exercises were explained and discussed. Each respondent then completed the same exercise individually. The exercises covered aversion to inequality in: (a) an overall measure of wellbeing (equivalent income); (b) lifetime health across otherwise equal individuals; and (c) lifetime health across the rich and poor.	Different surveys use different health outcome indicators. Therefore, data might be available for one indicator set when another is required. For example, answers to SF-12 survey items are available but a WEMWBS value is required. This is a large cross-sectional online survey of the general public (n=12,401) in which respondents were asked to self-report their health and wellbeing across a battery of questions. This dataset allows the estimation of a statistical mapping algorithm between the different indicator sets.
Strengths	The SIPHER Synthetic Population is representative of the demographic characteristics of the respective area - down to a low geographical resolution. The strength of the SIPHER Synthetic Population is that it provides a wide range of information at the level of individuals. This information can be aggregated into groupings of interest (e.g. sex, income groups) and particular geographical units of interest (LSOA/DZ; MSOA; Local Authorities etc.). The method used to develop the dataset is referred to as spatial microsimulation. We often use the SIPHER Synthetic Population in conjunction with other models we have developed. This enables us to determine whether an intervention has benefitted a population group of interest.	Small-area health indicators can be used to monitor area-level health inequalities or as inputs in statistical models. In addition, all health outcome measures can be attached to the Synthetic Population representing area-level health indicators. SIPHER reviewed the available measures and conducted a consensus process with SIPHER colleagues to agree on a final set of indicators. The criteria used were: 1. Interpretability – accessible & meaningful to decision makers. 2. Sensitivity to policy – the indicator can plausibly show the effects of policy. 3. Indicator can show impacts of pandemic on health. 4. Timeliness – refers to the current health state. 5. Availability of time series data. 6. Changes in mental AND physical health can be separately studied. 7. Regular updates into the future are expected. 8. Comparability – between areas, ideally comparable between England & Scotland. 9. High resolution – available for small areas with LA as a minimum. 10. Disaggregate – available by subgroups (e.g. broken down by age, sex etc.).	The dataset has been subject to a thorough geographical harmonisation and review process. In addition, the dataset contains a number of supplementary health and demographic indicators for all local authorities. The major strength of this dataset is the wide range of potential applications, from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing. The dataset is available as an open access resource via the Open Science Framework: https://osf.io/vnsur/	The dataset has been subject to a thorough geographical harmonisation and review process. In addition, the dataset contains a number of supplementary wellbeing and demographic indicators for all local authorities. The major strength of this dataset is the wide range of potential applications, from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing. The dataset is available as an open access resource via the Open Science Framework: https://osf.io/s24ye/	The DCE data on relative preferences allow the calculation of equivalent income - a quantitative preference-based single metric of wellbeing - for any combination of SIPHER-7 indicators. The samples are large (ranging from 1,000 to 3,000, totalling just under 11,000) and representative of the UK general public in terms of age and sex.	Public policies aim to improve wellbeing and to reduce wellbeing inequality. When there is a conflict between these, policy makers need to make difficult decisions. The quantitative data on inequality aversion is derived from discussion groups, where participants had the opportunity to examine the trade-off exercise in detail. The results help inform policy makers on the trade-offs between the two policy aims that members of the public would support.	Different surveys have different health and wellbeing indicators, and this dataset allows the estimation of a statistical mapping algorithm between them. This would allow predicting SIPHER-7 information where the relevant variables are not available.
Limitations	The accuracy of the SIPHER Synthetic Population depends on the quality and availability of the underlying data. Some variables may have poor completion rates in the underlying survey, resulting in missing data after linkage. Despite the high number of participants in the Understanding Society survey, explicit spatial constraints cannot be applied when creating the dataset. This means that an individual who was interviewed as part of the survey and who is actually residing in place X can be assigned to a variety of places A, B, and C, as long as they match the demographic constraints such as age, sex, marital status etc. Although recent updates of the code have led to more constraints on how to perform this selection process, it is important to remember that the creation of the SIPHER Synthetic Population is based on associations and descriptive statistics. It can only ever serve as an approximation of the true population in Scotland, England and Wales - which is likely to be much more heterogeneous and diverse than the population captured in the synthetic data source. Therefore, all results obtained from the SIPHER Synthetic Population should always be interpreted carefully as model output, and not as equivalent to a population-based register.	The dataset cannot resolve situations where no data is available at all or where sampling in surveys is not representative of small geographical units.	For a few of the indicators, exact definitions differ between countries. For example, there are different definitions of fuel poverty in use in Scotland and England. In these cases, national deciles were created and comparable alternative indicators were identified. For example, food insecurity was used as an alternative cost-of-living indicator.	The metrics for two indicators differ from those in the SIPHER Inclusive Economy (Local Authority Level) Dataset: (1) for Indicator 5A (poverty), low income before housing costs (BHC) was used rather than after housing costs (AHC); (2) for Indicator 5B (cost of living), fuel poverty was used rather than food insecurity.	Currently not available.	Currently not available.	Currently not available.
Geography	Individuals in the SIPHER Synthetic Population have a geography assigned to them (a synthetic DZ/LSOA). This allows all levels of geography upwards from DZ/LSOA Level for Scotland, England and Wales - excluding Northern Ireland - to be analysed and modelled.	The exact geographical resolution is indicator-dependent. Typically, the following resolutions are available for Mortality: DZ/LSOA Level for Scotland, England and Wales and LA Level	Longitudinal (2017-2021) and geographically harmonised data is available at the level of local authorities in England, Scotland, and Wales. The dataset covers all 363 local authorities in Great Britain, reflecting their 2021 boundaries according to ONS definition.	Longitudinal (2019-2021) and geographically harmonised data is available at the level of electoral wards in England, Scotland, and Wales. The dataset covers 7,973 of 8,020 wards in Great Britain, reflecting their 2022 boundaries according to ONS definition.	The surveys collected data from participants resident in the UK with sampling quotas for age and for sex.	UK with sampling quotas for age and for sex.	The survey collected data from participants resident in the UK with sampling quotas for age and for sex. Oversamples Scotland.
Variables / Indicators	A large variety of variables can be included. This includes all variables included in the Understanding Society survey - the underlying survey data source. It is also possible to estimate other derived variables from this data source, for example ‘Equivalent Income’, using the ‘Equivalent Income Calculator’ method.	The dataset includes measures of mortality, physical, and mental health, and composite measures combining mortality and health. It is open to data updates, and additional health indicators can be estimated and incorporated if required.	Details on all indicators are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources.	Details on all indicators are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources.	In addition to the DCE choice data, the surveys include participant self-reported data on: SIPHER-7; household size; age; gender; etc. Surveys (1) and (2) use the original SIPHER-7. Surveys (3) and (4) use the revised version of SIPHER-7.	In addition to the inequality aversion task, the survey includes participant self-reported data on: SIPHER-7; household size; age; gender; etc.	The indicator sets and questions included in the survey: SIPHER-7; ICECAP-A; EQ-5D-5L; SF-12 v2; HUI; WEMWBS; EQ-HWB; ONS-4; Understanding Society items on crime and housing; items from the Labour Force Survey, the Living Wage Foundation questionnaire; education, income, ethnicity, children, informal caregiving; gender, age; etc. Includes sampling weights to correct for age and sex with respect to the mid-year UK population estimate.
Time Period	The latest release reflects the years 2019-2021. Results from the UK census 2011 are used as constraints for the spatial microsimulation - the process generating the Synthetic Population. Preliminary updated versions for England and Wales, based on the UK census 2021, are available. However, Scotland has not yet published all required input data from its most recent census.	DZ/LSOA/MSOA Level: typically, cross-sectional representing the period covered by the synthetic population. Local Authority level: typically, longitudinal for 2004-2020 when based on non-synthetic data. Data will be updated as new data becomes available.	Longitudinal data are available for every year between 2017 and 2021.	Longitudinal data are available for every year between 2019 and 2021.	There are four datasets: (1) people’s personal preferences in autumn 2020; (2) people’s personal preferences in autumn 2021; (3) people’s personal preferences in spring 2022; (4) people’s social preferences in spring 2022. Dataset (2) includes returning respondents from (1). Otherwise, the observations are independent.	Data collected: summer - autumn 2022.	Data collected: late 2022.
Missing Data	The level of missing information for a particular variable is determined by the levels of missingness in the underlying Understanding Society survey.	Level of missing data determined by data availability. Older data not always comparable across time or form for some indicators.	Missing data were imputed using a sophisticated multiple imputation algorithm. In some cases, only cross-sectional measurements were available, which were carried forward or backward. For example, local elections (Indicator 6B) did not take place every single year.	Missing data were imputed using a sophisticated multiple imputation algorithm. In some cases, only cross-sectional measurements were available, which were carried forward or backward. For example, local elections (Indicator 6B) did not take place every single year.	Currently not available.	Currently not available.	Currently not available.
Examples / Link with Other Models and Data	The Synthetic Population is used as the underlying data source in several SIPHER models. These include: (1) dynamic systems model, (2) static and dynamic microsimulation and (3) decision support tool. Information covered in the Synthetic Population can be extended by adding additional variables from other data sources. These could be datasets that are not publicly available. In addition, the SIPHER Synthetic Population can be used to derive more complex concepts such as the ‘Equivalent Income’ - a variable which is calculated using the ‘Equivalent Income Calculator’ method.	A portfolio of area-level summary indicators on mortality, health, and composite indicators that combine information on mortality and health. These indicators can be attached as area-level indicators to the SIPHER Synthetic Population. In addition, health measures are used in the Local Authority clustering work, as well as in the Dynamic Systems model.	The dataset is currently used in a k-means clustering machine learning study. The primary aim of this study is to identify clusters of similar local authorities and to examine the association of each cluster with a number of health outcomes. In another application, we explore the association between Quality-Adjusted Life Expectancy (QALE) and indicators of economic inclusion.	The dataset relies on the SIPHER Synthetic Population for 8/13 of the inclusive economy indicators. It also includes several demographic and wellbeing indicators in the form of the Shortform-12 (SF-12) measures, physical and mental components scores (PCS and MCS).	The estimated parameters can be used to calculate an equivalent income variable in the Synthetic Population.	The estimated inequality aversion parameter is used to identify the optimal trade-off between maximising wellbeing and reducing inequality in the decision support tools.	Currently not available.
Software Requirements	Requires software that can handle the size of the data file, such as R or Python. An interactive R Shiny dashboard allows code-free exploration of an aggregated version: https://sipherdashboard.sphsu.gla.ac.uk/	Requires software that can handle the size of the data file, such as R or Python.	Requires software that can load the data, such as Excel, R, or Python. Access SIPHER Inclusive Economy Dataset Interactive Map - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ieinteractivemap/#d.en.1054750	Requires software that can load the data, such as Excel, R, or Python. Access SIPHER Inclusive Economy Dataset Interactive Map - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ieinteractivemap/#d.en.1054750	The main choice data and respondent background variables are saved in Stata format and require software that can read Stata files.	The main trade-off data and respondent background variables are saved in Stata format and require software that can read Stata files.	Currently saved in Stata format; requires software that can read Stata files.
Data Requirements / Restrictions	The SIPHER Synthetic Population is available for full independent use via the UK Data Service’s Curated Data Collection. To set up the SIPHER Synthetic Population, the synthetic population file (UK Data Service ID: SN9277) must be linked with Understanding Society survey data (UK Data Service ID: SN6614), as is typically done for area-level linkages of surveys. Both datasets are subject to the General End-User License Agreement terms and conditions, and can be downloaded free of charge directly from the UK Data Service website.	For key indicators such as QALE, Life Expectancy, and Lifespan Variation it is planned that a final version of the dataset and the underlying code will be made publicly available. In order to fully reproduce health measures requiring the Synthetic Population, access to the Synthetic Population is required.	The final dataset is available as an open access resource.	The final dataset is available as an open access resource.	Currently not available.	Currently not available.	Currently not available.
Data / Code Available	Due to the underlying license agreement, the dataset cannot be shared as an open access version. However, the dataset can be downloaded through the UK Data Service website, after acceptance of the General End-User license terms and conditions: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details In addition, we have made a wealth of supplementary material available, documenting creation, validation, linkage, and exploration of the dataset: https://reshare.ukdataservice.ac.uk/856754/	Work in progress, final dataset will be made publicly available. Pipeline of code for estimation of Quality-Adjusted Life Expectancy (QALE) is available.	The final dataset and additional documentation are publicly available via the Open Science Framework: https://osf.io/vnsur/.	The final dataset and additional documentation are publicly available via the Open Science Framework: https://osf.io/s24ye/.	Currently not available.	Currently not available.	Currently not available. The dataset will be archived. There is no associated code.
Training	We have provided a comprehensive, open access User Guide for our SIPHER Synthetic Population. The User Guide provides background information and explains how to set up the data and analyse it swiftly: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf	Online pipeline example via GitHub.	The data is accompanied by a comprehensive data dictionary which provides context relating to all variables included.	The data is accompanied by a comprehensive data dictionary which provides context relating to all variables included.	Currently not available.	Currently not available.	Currently not available.
Additional Resources	SIPHER Synthetic Population for Individuals in Great Britain, 2019-2021 (UK Data Service Curated Collection, SN9277): https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=9277#!/details Comprehensive User Guide: https://doc.ukdataservice.ac.uk/doc/9277/mrdoc/pdf/9277_user_guide_r4_clean.pdf Supplementary Resources: https://reshare.ukdataservice.ac.uk/856754/ Paper describing the statistical creation process: https://www.nature.com/articles/s41597-022-01124-9 Understanding Society Survey Blog: https://www.understandingsociety.ac.uk/news/2024/07/10/building-synthetic-population-data/ Introduction Video: https://www.youtube.com/watch?v=CkiORY7GSLc	Choosing the SIPHER health Indicators Report: https://www.gla.ac.uk/media/Media_970682_smxx.pdf and QALE exemplar: https://github.com/AndreasxHoehn/QALE_Exemplar Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ SIPHER Inclusive Economy Indicator Set: Technical paper [PDF] - https://www.gla.ac.uk/media/Media_970680_smxx.pdf SIPHER Inclusive Economy Indicator Set: Summary [PDF] - https://www.gla.ac.uk/media/Media_1029792_smxx.pdf Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Measuring the Inclusive Economy Blog - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1049629_en.html	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ SIPHER Inclusive Economy Indicator Set: Technical paper [PDF] - https://www.gla.ac.uk/media/Media_970680_smxx.pdf SIPHER Inclusive Economy Indicator Set: Summary [PDF] - https://www.gla.ac.uk/media/Media_1029792_smxx.pdf Inclusive Economy Indicators for Electoral Wards Blog - https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1132578_en.html	Explore - https://www.gla.ac.uk/research/az/sipher/products/sipher-7wellbeingindicators/ Blog: Collapsing multi-dimensional wellbeing into equivalent income - March 2022 https://www.gla.ac.uk/research/az/sipher/sharingourevidence/blog/headline_1019908_en.html	Currently not available.	Currently not available.
Contact	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk
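
The linkage described under Data Requirements / Restrictions can be illustrated as a join on a person identifier, followed by aggregation to areas. The following is a minimal Python sketch only. It assumes the synthetic population file holds one row per synthetic individual with an area code and a survey person identifier; the column names `zone_id` and `pidp` and all values are hypothetical placeholders, not taken from the published files.

```python
from collections import Counter

# Hypothetical extract of the synthetic population file: each row assigns
# one survey respondent (pidp) to one small area (zone_id).
synthetic_population = [
    {"zone_id": "A001", "pidp": 1},
    {"zone_id": "A001", "pidp": 2},
    {"zone_id": "A002", "pidp": 1},
    {"zone_id": "A002", "pidp": 3},
]

# Hypothetical extract of the survey data, keyed by person identifier.
survey = {
    1: {"age": 34, "employed": True},
    2: {"age": 61, "employed": False},
    3: {"age": 45, "employed": True},
}


def link(population, survey_records):
    """Attach survey attributes to every synthetic individual."""
    linked_rows = []
    for row in population:
        attributes = survey_records.get(row["pidp"])
        if attributes is None:
            continue  # respondent not present in the survey extract
        linked_rows.append({**row, **attributes})
    return linked_rows


def employment_rate_by_zone(linked_rows):
    """Aggregate linked individuals to an area-level employment rate."""
    totals, employed = Counter(), Counter()
    for row in linked_rows:
        totals[row["zone_id"]] += 1
        employed[row["zone_id"]] += int(row["employed"])
    return {zone: employed[zone] / totals[zone] for zone in totals}


linked = link(synthetic_population, survey)
rates = employment_rate_by_zone(linked)
```

The same respondent can appear in several areas, which is why the join runs from the synthetic population file to the survey records rather than the other way round.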

Return to SIPHER homepage

Dynamic Systems Model

This systems map underlying the WS4 model provides a visualisation of all captured nodes and their interactions (Scotland Model).

Dynamic Systems Model
Characteristic	Details
Status	In progress/ready soon
Main Perspective	Population Level (Macro)
Purpose	This state-space dynamic system model provides a simulation of how each variable contained in the systems map will be affected over time, given specific changes to one or more variables. All studied variables (unemployment, poverty, health, etc.) have to be represented by the input data. The model provides results at the local authority level and allows us to compare system-level effects of different (or no) policy interventions over time.
Strengths	The model captures an entire system, including feedback loops, to allow for the modelling of dynamic behaviour. In addition, the model allows policy changes to be tested ex ante rather than retrospectively. The model can capture both increases and decreases (such as increases or decreases in funding to supplement disposable household income).
Limitations	Any change to be modelled must be quantifiable by the model. This means that changes in variables which are not explicitly covered, or for which there is no dependency, will not become visible in the model. This implies that results are sensitive to the pre-defined pathways specified in the systems map. Another limitation is posed by the assumption of known causal pathways between domains. This can be problematic in some cases and requires careful consideration and good justification. Furthermore, assumptions on the time frame for causal relationships need strong justification and supporting information, which might not always be available. Finally, all modelled policy interventions need to be attributable to the LA level.
Geography	Local Authority level for Scotland/England/Wales.
Time Period	Based on available and imputed data for previous years (currently 2004-2021). The model provides a dynamic annual forecast for a specified period, for example 5 years, for each variable in the model.
Adjustments / Extensions	Factors which can be modified include: the underlying systems map (representing domains and their interactions), features of each respective intervention (including the amount of uplift or characteristics of recipients). In addition the method can be used to capture different systems (environment, housing etc.).
Data Requirements	Aggregate level inputs for units of the studied geographical level (e.g. unemployment rate for the LA). Sufficient longitudinal data is required for all variables to validate the model. Cross-sectional data can supplement the longitudinal data for model determination. Domain-specific definitions need to be similar across all geographical units. Please note that different indicators have been selected for England and Wales and Scotland due to data availability.
Applications	Typical applications include examining system behaviour in response to policy interventions, such as interventions targeting poverty, the living wage, participation in employment, and skills and qualifications. In addition, this set of models can help to answer questions about the potential impact of direct policy responses to the current cost-of-living crisis. It is possible to forecast the impact of an intervention for a specific local authority.
Modelling Assumptions	Models depend on a pre-defined systems map that describes how domains impact each other and which domains can be subject to interventions. These systems maps need to specify causal pathways between domains with pre-defined time lags. Models also depend on data to provide evidence for quantifying relationships.
User Options	Which variable to change and by how much, corresponding to the policy intervention (or shock/absence of intervention) which is evaluated. All changes can be applied differentially to local authorities.
User Type(s)	Modellers, decision makers
Examples / Link with Other Models and Data	Models of dynamic systems can inform individual-level approaches and help to validate results obtained from them. This also works in the opposite direction: changes at the individual level can be aggregated and expressed at the LA level.
Software Requirement(s)	Matlab.
Options for Extension	Building different models for different systems. Modelling and quantifying uncertainty.
Additional Resources	Explore: https://www.gla.ac.uk/research/az/sipher/development/dynamicsystemsmodel/
Contact	sipher@glasgow.ac.uk
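
To illustrate what a state-space forecast with and without an intervention looks like, the following is a minimal Python sketch. It assumes a linear, discrete-time form with annual steps, x[t+1] = A x[t] + u; this form, the two variables, and all coefficients are hypothetical and chosen for illustration only. The SIPHER model itself is implemented in Matlab and its equations are not reproduced here.

```python
def simulate(initial_state, transition, intervention, years):
    """Project the system forward one year at a time.

    Each step applies x[t+1] = A x[t] + u, where A is `transition`
    and u is the constant yearly `intervention`.
    """
    names = list(initial_state)
    trajectory = [dict(initial_state)]
    for _ in range(years):
        current = trajectory[-1]
        following = {}
        for target in names:
            carried = sum(
                transition[target][source] * current[source] for source in names
            )
            following[target] = carried + intervention.get(target, 0.0)
        trajectory.append(following)
    return trajectory


# Hypothetical two-variable system, expressed as deviations from baseline.
start = {"income": 0.0, "poverty": 0.0}
A = {
    "income": {"income": 0.9, "poverty": 0.0},    # income effects persist
    "poverty": {"income": -0.1, "poverty": 0.8},  # higher income lowers poverty, one year later
}

# Five-year forecasts without and with a yearly income uplift.
no_policy = simulate(start, A, {}, years=5)
uplift = simulate(start, A, {"income": 1.0}, years=5)
```

Comparing `no_policy` with `uplift` year by year mirrors the comparison of system-level effects of different (or no) policy interventions described in the table above. The one-year delay before poverty responds stands in for the pre-defined time lags of the systems map.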

Inclusive Economy Dynamic Systems Model Variable Definitions
Model Variable	Scotland Model Indicator	England and Wales Model Indicator
Employment Rate	Employment rate – aged 16-64	Employment rate – aged 16-64
Job Security / Precarity	Percentage of 16+ in non-permanent employment amongst employed	Percentage of 16+ in non-permanent employment amongst employed
Skills and Qualifications	Percentage of adults aged 16-64 with a NVQ 2+ qualification	Percentage of adults aged 16-64 with a NVQ 2+ qualification
Labour Remuneration	Percentage of employee jobs paid below “living wage”	Percentage of employee jobs paid below “living wage”
Involuntary Exclusion (long term sick)	Proportion of economically inactive due to long-term ill health over working age population	Proportion of economically inactive due to long-term ill health over working age population
Disposable Income	Gross disposable household income per head	Gross disposable household income per head
Earnings Inequality	Ratio of weekly earnings between 80th and 20th percentiles	Ratio of weekly earnings between 80th and 20th percentiles
(Child) Poverty	Percentage of children living in low income households	Percentage of children living in low income households
Cost of living	Percentage of households in a LA with financial problems	Percentage of fuel-poor households in a LA
Health Outcome	Percentage of people in a LA who reported a mental health problem	SF-12 mental health values from survey data (Understanding Society)
Health Outcome	Directly age-standardized mortality rate per 100,000 (age under 75)	Directly age-standardized mortality rate per 100,000 (age under 75)
Health Outcome	Life expectancy at birth (male)	Healthy life expectancy at birth
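
The directly age-standardised mortality rate listed above is a weighted average of age-specific death rates, with weights taken from a standard population. The following minimal Python sketch shows the arithmetic. The age bands, counts, and standard weights are hypothetical and do not correspond to any official standard population.

```python
def directly_standardised_rate(deaths, population, standard_weights, per=100_000):
    """Weighted average of age-specific death rates, scaled to `per` people."""
    if not (deaths.keys() == population.keys() == standard_weights.keys()):
        raise ValueError("all inputs must cover the same age bands")
    total_weight = sum(standard_weights.values())
    weighted_rates = sum(
        standard_weights[band] * deaths[band] / population[band] for band in deaths
    )
    return per * weighted_rates / total_weight


# Hypothetical local authority, ages under 75 only.
deaths = {"0-39": 10, "40-64": 50, "65-74": 120}
population = {"0-39": 50_000, "40-64": 25_000, "65-74": 8_000}
standard_weights = {"0-39": 50, "40-64": 35, "65-74": 15}

rate = directly_standardised_rate(deaths, population, standard_weights)
```

Because every area is weighted by the same standard age structure, the resulting rates can be compared across local authorities with different age profiles.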

Return to SIPHER homepage

Static Microsimulation

Static microsimulation allows us to grasp baseline scenarios or the immediate impact of changes.
The example illustrates the percentage of households across local authorities in England and Wales that receive support from food banks.

Static Microsimulation
Characteristic	Details
Status	Ready
Main Perspective	Individual Level (Micro)
Purpose	This static microsimulation, using a digital twin of the UK population as a data source, provides a granular picture of the impact of policy interventions. This model enables us to examine changes relatively quickly and with a relatively low amount of computational resources. It achieves this by simplifying the relationships and interconnections of an individual’s attributes.
Strengths	A particular strength of the model is that it enables the examination of immediate outcomes of a policy change at the level of individuals or households. Aggregating the outcomes allows a user to derive changes at the level of small geographies such as MSOA/LSOA, DZ, and local authorities. Models can provide immediate information on how many people will be affected, where those people live, and what their basic demographic characteristics are. Aggregation allows us to identify potential changes for specific geographical areas of interest.
Limitations	Limitations of the synthetic population apply.
Geography	LSOA/MSOA/DZ, and local authority level for Scotland, England, and Wales.
Time Period	Corresponding to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021)
Adjustments / Extensions	All information describing individuals, in all or only particular areas, can be seen as potentially modifiable, for example income, employment status, or health. These interventions are typically informed by previous research and are often referred to as “morning after” scenarios: situations in which an immediate change to one or more individual-level factors has occurred instantaneously.
Data Requirements	Synthetic Population (see Product Guide details)
Applications	Number and characteristics of people affected by a financial uplift policy or labour market intervention as well as total costs of this policy for a particular geographical area
Modelling Assumptions	Assumptions of the Synthetic Population apply.
User Options	Character, target group, and magnitude of particular interventions. In addition, the user can choose the geography level and select specific geographical regions of interest.
User Type(s)	Modellers, decision makers, descriptive overview to inform statistical modelling
Examples / Link with Other Models and Data	This model requires SIPHER’s Synthetic Population.
Software Requirement(s)	R or Python
Options for Extension	All results can be combined with cost information where available to conduct cost-benefit analyses.
Additional Resources	Paper describing applied static microsimulation to create the Synthetic Population: https://www.nature.com/articles/s41597-022-01124-9
Contact	sipher@glasgow.ac.uk
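
A “morning after” scenario of the kind described above can be sketched in a few lines of Python: an uplift is applied instantaneously to eligible individuals, and the results are aggregated by area to give the number of people affected and the cost. This is a minimal illustration only. The field names, eligibility rule, threshold, and amounts are hypothetical.

```python
from collections import defaultdict

# Hypothetical individuals from a linked synthetic population.
population = [
    {"zone_id": "A001", "weekly_income": 180.0},
    {"zone_id": "A001", "weekly_income": 420.0},
    {"zone_id": "A002", "weekly_income": 150.0},
    {"zone_id": "A002", "weekly_income": 210.0},
    {"zone_id": "A002", "weekly_income": 600.0},
]


def apply_uplift(people, threshold, uplift):
    """Return a new population in which eligible individuals receive the uplift."""
    result = []
    for person in people:
        eligible = person["weekly_income"] < threshold
        result.append({
            **person,
            "eligible": eligible,
            "weekly_income": person["weekly_income"] + (uplift if eligible else 0.0),
        })
    return result


def summarise(people, uplift):
    """Count affected individuals and weekly cost for each area."""
    summary = defaultdict(lambda: {"affected": 0, "weekly_cost": 0.0})
    for person in people:
        if person["eligible"]:
            summary[person["zone_id"]]["affected"] += 1
            summary[person["zone_id"]]["weekly_cost"] += uplift
    return dict(summary)


after = apply_uplift(population, threshold=250.0, uplift=25.0)
by_zone = summarise(after, uplift=25.0)
```

No behaviour changes over time are modelled here, which is what keeps a static approach quick and computationally cheap.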

Return to SIPHER homepage

Dynamic Microsimulation - MINOS

A visualisation of the pathways from disposable income to mental health used in the dynamic microsimulation model.

Dynamic Microsimulation - MINOS
Characteristic	Details
Status	In progress/ready soon
Main Perspective	Individual Level (Micro)
Purpose	This Microsimulation For Interrogation Of Social And Health Systems (MINOS) dynamic microsimulation, using longitudinal survey data such as the SIPHER Synthetic Population, provides a very granular picture of the impact of policy interventions on different population groups. This model uses individual-level data and simulates the transitions of individuals across different states (such as health states) over time, based on a specific set of models describing these transitions.
Strengths	Designated longitudinal approach for the individual-level while outcomes can also be aggregated to reflect changes for population subgroups and geographical areas.
Limitations	Interventions can only be applied to specific variables, and outcomes are only available for specific health variables.
Geography	DZ/LSOA Level for Scotland, England, and Wales
Time Period	The ‘jump off’ point for the scenarios is the latest period in the underlying Understanding Society input data (currently wave k, 2019-2021). The ‘time horizon’ for the scenario is set at 2037.
Adjustments / Extensions	Features of each respective intervention, including the amount of uplift or characteristics of recipients receiving the uplift.
Data Requirements	Understanding Society (waves a-k). If spatial results are required, the latest version of the Synthetic Population (see data for details).
Applications	Shocks and policy interventions which can be expressed as changes at the individual level. For example: changes to disposable income. Transition models need to be constructed for new problems.
Modelling Assumptions	The model relies on the assumption that transitions between states over time - representing the characteristics of an individual - can be modelled using a set of specified and measured characteristics of this individual. In addition, the Markov assumption needs to hold, meaning that the time spent in a particular state (e.g. unemployed) does not have an impact on the probability of transitioning into other states (e.g. employed).
User Options	Character, target group, and magnitude of particular interventions. In addition, users can assess the impacts for LSOAs/DZs within a given area.
User Type(s)	Modellers, decision makers
Examples / Link with Other Models and Data	This model uses SIPHER’s Synthetic Population.
Software Requirement(s)	Python
Options for Extension	Building different models for different interventions. Factors impacting transitions can be adjusted based on different contexts and assumptions.
Additional Resources	Explore: https://www.gla.ac.uk/research/az/sipher/products/minos/ For documentation visit: https://leeds-mrg.github.io/Minos/ and for code and more detailed user instructions visit: https://github.com/Leeds-MRG/Minos
Contact	sipher@glasgow.ac.uk
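
The Markov assumption named under Modelling Assumptions can be illustrated with a minimal Python sketch: the next state depends only on the current state, not on how long an individual has been in it. The two states and the fixed transition probabilities below are hypothetical. In MINOS the transitions are described by models that use an individual's characteristics, which this sketch deliberately leaves out.

```python
import random

# Hypothetical yearly transition probabilities between two states.
TRANSITIONS = {
    "employed": {"employed": 0.9, "unemployed": 0.1},
    "unemployed": {"employed": 0.3, "unemployed": 0.7},
}


def step(state, rng):
    """Draw next year's state using only the current state (Markov assumption)."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate_individual(initial_state, years, rng):
    """Simulate one individual's yearly states, including the starting year."""
    path = [initial_state]
    for _ in range(years):
        path.append(step(path[-1], rng))
    return path


rng = random.Random(2021)  # fixed seed so that runs are repeatable
# 16 yearly steps, e.g. from a 2021 starting point to a 2037 horizon.
paths = [simulate_individual("unemployed", 16, rng) for _ in range(1000)]
share_employed = sum(path[-1] == "employed" for path in paths) / len(paths)
```

Aggregating the simulated end states, as `share_employed` does, is the same step that turns individual-level transitions into results for population subgroups or areas.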

Dynamic Microsimulation Pathways and Definitions
Pathway	Measurement
Housing Quality	Measured by the level of access to a range of facilities within the home.
Loneliness	Measured by the frequency and quality of social interaction
Neighbourhood Safety	Measured by an individual’s perception of how safe they feel in their neighbourhood
Nutrition Quality	Measured by the quality of an individual’s diet
Smoking Intensity	Measured by the number of cigarettes smoked

Return to SIPHER homepage

Decision Support Tool

Diagrammatic representation of a trade-off between three different objectives.
The Decision Support Tool identifies different compromise options to balance this trade-off.

Decision Support Tool
Characteristic	Details
Status	In progress/ready soon
Main Perspective	From Individual Level (Micro) to Population Level (Macro)
Purpose	The decision support tool is not a model in itself. Rather, it uses the available SIPHER models to provide decision support to policy analysts.
Strengths	Can search over many thousands of different intervention options (e.g. local communities, socio-demographic sub-groups, levels of intervention) to reveal trade-offs between outcomes.
Limitations	The decision support tool is dependent on SIPHER models and therefore subject to the limitations of these underlying models. Synthetic Population, Dynamic Systems Model and Dynamic Microsimulation can all be integrated but their limitations will then apply to the resulting decision support tool. It is important to note that the decision support tool is not intended to be used as a decision making tool. Rather the tool will provide a range of possible answers reflecting the trade-offs associated with potential decisions. The tool does not make any decisions - this responsibility rests with the user.
Geography	Adopts the same geographical perspective as the SIPHER models that have been integrated - typically it is matched to the needs of the policy partner (so we have created Sheffield, Greater Manchester, Scotland (and Scottish LA) versions of the tool).
Time Period	Adopts the same time period as the SIPHER models that have been integrated. Corresponds to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021) up to 2025/2026.
Adjustments / Extensions	Potential adjustments include characteristics of the underlying models as well as features and the geographical granularity of the reported outcomes.
Data Requirements	The decision support tool requires results from other SIPHER models. In addition, information on the intervention as well as cost-effectiveness assumptions are required.
Applications	Applications include local community interventions on components of wellbeing; spatial targeting of job creation schemes; impact of targeted employment stimuli on health outcomes.
Modelling Assumptions	Inherits the assumptions of the SIPHER models that have been integrated. In addition, assumptions on the costs and effectiveness of interventions are required.
User Options	Geographical and temporal focus. Intervention configuration options.
User Type(s)	Modellers, decision makers
Examples / Link with Other Models and Data	The decision support tool uses the synthetic population, the systems dynamic model, the static and dynamic microsimulations, and the equivalent income utility function.
Software Requirement(s)	Python
Options for Extension	Alternative policy/intervention configurations.
Additional Resources	Explore: https://www.gla.ac.uk/research/az/sipher/products/decisionsupporttool/ Software: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_interventions Database: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_database
Contact	sipher@glasgow.ac.uk
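
One common way to present a trade-off between several objectives is to keep only the options that no other option beats on every outcome (the non-dominated, or Pareto, set). The following minimal Python sketch shows that filtering step. It is an assumption about how compromise options could be identified, not a description of the tool's actual search method, and the intervention options and outcome values are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every outcome and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(options):
    """Keep the options that are not dominated by any other option."""
    front = {}
    for name, outcomes in options.items():
        beaten = any(
            dominates(other, outcomes)
            for other_name, other in options.items()
            if other_name != name
        )
        if not beaten:
            front[name] = outcomes
    return front


# Hypothetical options scored as (wellbeing gain, inequality reduction, -cost),
# so that a larger value is better on every dimension.
options = {
    "A": (5, 1, -10),
    "B": (3, 3, -10),
    "C": (2, 2, -12),  # worse than B on all three outcomes
    "D": (4, 1, -6),
}

compromises = pareto_front(options)
```

The sketch returns several remaining options rather than a single answer, which matches the stated intent that the tool provides a range of possible answers and leaves the decision to the user.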

Return to SIPHER homepage

K-Means Clustering

K-means clustering methods allow us to identify areas which perform similarly with respect to a number of indicators, as shown in this example for electoral wards in central Scotland.

K-Means Clustering
Characteristic	Details
Status	Ready
Main Perspective	Population Level (Macro)
Purpose	K-means clustering is a data-driven approach that allows users to identify clusters of local authorities based on their performance with respect to the utilised inclusive economy data collection. This enables the identification of more or less inclusive clusters. In addition, the association between these clusters and a number of selected local authority level health outcomes is examined.
Strengths	A summarising cluster solution clearly reduces complexity and leads to intuitive results. Outcomes have a straightforward meaning. Another strength of this approach lies in its ability to be updated and transferred to other sets of indicators or used over time.
Limitations	In some cases, the achieved reduction in complexity might not be desired. It is a limitation that complete observations are required which often adds another preparatory step to the process (imputation of missing data). As a data-driven algorithm there are only limited options to intervene, for example with respect to the number of optimal clusters.
Geography	The clustering is currently based on all local authorities in Scotland, England, and Wales. A previous application covered the LSOA level for selected English Local Authorities.
Time Period	The current approach is cross-sectional, covering the last available year (2020/2021). As data on inclusive economies is available for a much longer period, it is planned to study the stability of clusters over time.
Adjustments / Extensions	Adjustments to the current model include the number of clusters, a designated focus on one or more UK Nations (Scotland, England or Wales) in isolation as well as the respective Inclusive Economy indicators and health outcomes considered.
Data Requirements	Aggregate-level information for geographical areas on a selected set of indicators. Indicators can come from various different sources, but each indicator must have been measured consistently across observation units. For k-means to work properly, the level of missing information should be 0%. In case any information is missing, imputation methods can be utilised to achieve this requirement.
Applications	The method is currently used to cluster local authorities based on inclusive economy indicators. It can be expanded to other indicator sets and domains as well as other outcome measures (environmental indicators).
Modelling Assumptions	Clusters are identified based on the similarity of observed units with respect to a number of defined domains.
User Options	The primary option for adjustment is the number of clusters.
User Type(s)	Provides descriptive overview to inform decision making and modelling
Examples / Link with Other Models and Data	Can inform the interpretation of WS4 models. In turn, can inform WS4 model input.
Software Requirement(s)	R
Options for Extension	Other domains for which indicator sets exist or can be created (crime, transport, environment etc.). A k-means clustering approach can be applied to individual-level life course trajectories.
Additional Resources	Preliminary results available upon request
Contact	sipher@glasgow.ac.uk
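
The k-means procedure summarised above can be sketched in plain Python: areas are repeatedly assigned to their nearest cluster centre, and each centre is then moved to the mean of its members. This is a minimal illustration of the standard algorithm, not the SIPHER implementation (which uses R). The indicator values are hypothetical, and the check for missing values reflects the requirement that observations be complete.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two indicator vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=0, max_iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    if any(value is None for point in points for value in point):
        raise ValueError("k-means needs complete observations; impute first")
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen areas
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical areas described by (employment rate, child poverty rate).
areas = [
    (0.70, 0.10), (0.72, 0.12), (0.68, 0.11),
    (0.55, 0.25), (0.53, 0.27), (0.56, 0.24),
]

labels, centroids = k_means(areas, k=2)
```

The number of clusters `k` is the main choice left to the user. In practice, indicators measured on different scales would usually be standardised before clustering, a step omitted here for brevity.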

Return to SIPHER homepage

Small-Area Indicator Estimation

A designated small-area approach enables us to obtain estimates when data is sparse or incomplete.
The example shows spatial patterns for the MSOA-level in Wales, obtained from the SIPHER Synthetic Population.

Small-Area Indicator Estimation
Characteristic	Details
Status	Ready
Main Perspective	Population Level (Macro)
Purpose	The estimation of area-level indicators for small geographical units such as Local Authorities, MSOAs, or LSOAs is challenging. For example, fluctuations in the number of deaths can introduce imprecision and fluctuations when estimating life expectancy. Typically, these challenges increase as the size of the geographical unit decreases. Therefore, we employ a suite of specific small-area estimation methods to address these challenges. This suite of methods can then be applied to both non-synthetic and synthetic sources of data, such as the synthetic population, to obtain area-level estimates for the dimensions captured in the Understanding Society main stage survey.
Strengths	The suite of models aims to account for fluctuations and increase reliability of small-area estimates. This enables us to obtain reliable estimations given potentially unreliable data situations. The use of synthetic data can help to navigate situations in which no non-synthetic data would be available at all.
Limitations	Despite their advantages in dealing with small numbers, these methods cannot resolve situations where no data is available at all. The interpretation of results obtained from synthetic data needs care - for example, when interpreting very specific attributes for a very distinct geographical region.
Geography	The most common geographical level reflects the Local Authority level for England, Scotland, and Wales. In addition, estimates can be derived for the MSOA Level in England and Wales. Deriving estimates for the Intermediate Zone Level in Scotland is currently in progress. Due to the use of synthetic data, even smaller geographical resolutions can be achieved for some indicators.
Time Period	Estimates are available for 2004/2014 to 2020/2021, depending on the indicator and underlying data sources. Data updates and suggestions for new indicators can be incorporated easily.
Adjustments / Extensions	Data updates can be incorporated easily. Ideas for additional indicators are welcome and can be estimated, provided that suitable data is available in a synthetic or non-synthetic source.
Data Requirements	This is indicator dependent. For some indicators, all required data is free and publicly available via ONS/NRS vital statistics data on population, deaths, and health outcomes. In particular for those indicators combining mortality and health information (e.g., QALE) access to the General and Special License of Understanding Society is required - depending on the level of geography required. If the underlying data source is synthetic data, such as the synthetic population, requirements of this source apply.
Applications	Estimated measures include measures of mortality such as life expectancy and lifespan variation, measures of health such as SF-12 instrument capturing physical and mental health, and composite measures combining health and mortality. Measures at the household-level related to cost-of-living are also available and can be obtained from synthetic sources.
Modelling Assumptions	The major assumption is that small population sizes require specific methods to account for random fluctuations due to small numbers. Many measures, such as mortality rates, follow a very distinct pattern over age (standard trajectory), which requires knowledge of this approximate standard trajectory. When synthetic data is used, assumptions of the synthetic population apply.
User Options	The most common options are the measure itself, the geographical resolution, and year.
User Type(s)	Outcomes are used as inputs in other models, for monitoring purposes, and can inform decision making.
Examples / Link with Other Models and Data	Some of the derived health measures are used as input data in WS4 models and as outcomes when examining the association of clusters with health. In addition, some derived health measures can be attached to the synthetic population to represent area-level features, as they cannot be derived directly from the synthetic population.
Software Requirement(s)	R
Options for Extension	Extension to a variety of small-area indicators is possible, such as age trajectories of fertility rates, employment rates, emergency admissions etc. In addition, different synthetic data sources can be utilised to create synthetic populations.
Additional Resources	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ An exemplary pipeline, estimating a range of health measures: https://github.com/AndreasxHoehn/QALE_Exemplar Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/
Contact	sipher@glasgow.ac.uk

Return to SIPHER homepage

Compare All Quantitative Products

Compare All Quantitative Products
Characteristic	Dynamic Systems Model	Static Microsimulation	Dynamic Microsimulation - MINOS	Decision Support Tool	K-Means Clustering	Small-Area Indicator Estimation
Status	In progress/ready soon	Ready	In progress/ready soon	In progress/ready soon	Ready	Ready
Main Perspective	Population Level (Macro)	Individual Level (Micro)	Individual Level (Micro)	From Individual Level (Micro) to Population Level (Macro)	Population Level (Macro)	Population Level (Macro)
Purpose	This state-space dynamic system model provides a simulation of how each variable contained in the systems map will be affected over time, given specific changes to one or more variables. All studied variables (unemployment, poverty, health, etc.) have to be represented by the input data. The model provides results at the local authority level and allows us to compare system-level effects of different (or no) policy interventions over time.	This static microsimulation, using a digital twin of the UK population as a data source, provides a granular picture of the impact of policy interventions. This model enables us to examine changes relatively quickly and with relatively few computational resources. It achieves this by simplifying the relationships and interconnections of an individual’s attributes.	This Microsimulation For Interrogation Of Social And Health Systems (MINOS) dynamic microsimulation, using longitudinal survey data such as the SIPHER Synthetic Population, provides a very granular picture of the impact of policy interventions on different population groups. This model uses individual-level data and simulates the transitions of individuals across different states (such as health states) over time, based on a specific set of models describing these transitions.	The decision support tool is not a model in itself. Rather, it uses the available SIPHER models to provide decision support to policy analysts.	K-means clustering is a data-driven approach that allows users to identify clusters of local authorities based on their performance with respect to the utilised inclusive economy data collection. This enables the identification of more or less inclusive clusters. In addition, the association between these clusters and a number of selected local authority level health outcomes is examined.	The estimation of area-level indicators for small geographical units such as Local Authorities, MSOAs, or LSOAs is challenging. For example, fluctuations in the number of deaths can introduce imprecision and fluctuations when estimating life expectancy. Typically, these challenges increase as the size of the geographical unit decreases. Therefore, we employ a suite of specific small-area estimation methods to address these challenges. This suite of methods can then be applied to both non-synthetic and synthetic sources of data, such as the synthetic population, to obtain area-level estimates for the dimensions captured in the Understanding Society main stage survey.
Strengths	The model captures an entire system, including feedback loops to allow for the modelling of dynamic behaviour. In addition, the model allows the testing of policy changes ex-ante - rather than retrospectively. The model can capture both increases and decreases (such as increases or decreases in funding to supplement disposable household income).	A particular strength of the model is that it enables the examination of immediate outcomes at the level of individuals or households based on a policy change. Aggregating the outcomes allows a user to derive changes at the level of small geographies such as MSOA/LSOA, DZ, and local authorities. Models can provide immediate information on how many people will be affected, where those people live, and what their basic demographic characteristics are. Aggregation allows us to identify potential changes for specific geographical areas of interest.	Designated longitudinal approach for the individual-level while outcomes can also be aggregated to reflect changes for population subgroups and geographical areas.	Can search over many thousands of different intervention options (e.g. local communities, socio-demographic sub-groups, levels of intervention) to reveal trade-offs between outcomes.	A summarising cluster solution clearly reduces complexity and leads to intuitive results. Outcomes have a straightforward meaning. Another strength of this approach lies in its ability to be updated and transferred to other sets of indicators or used over time.	The suite of models aims to account for fluctuations and increase reliability of small-area estimates. This enables us to obtain reliable estimations given potentially unreliable data situations. The use of synthetic data can help to navigate situations in which no non-synthetic data would be available at all.
Limitations	Any change to be modelled must be quantifiable by the model. This means that changes in variables which are not explicitly covered or for which there is no dependency will not become visible in the model. This implies that results are sensitive to pre-defined pathways which were specified in the systems map. Another limitation is posed by the assumption of known causal pathways between domains. This can be problematic in some cases and requires careful consideration and good justification. Furthermore, assumptions on the time frame for causal relationships need strong justification and supporting information, which might not always be available. Finally, all modelled policy interventions need to be attributable to the LA level.	Limitations of the synthetic population apply.	Interventions can be applied to specific variables, and outcomes applied to specific health variables.	The decision support tool is dependent on SIPHER models and therefore subject to the limitations of these underlying models. Synthetic Population, Dynamic Systems Model and Dynamic Microsimulation can all be integrated but their limitations will then apply to the resulting decision support tool. It is important to note that the decision support tool is not intended to be used as a decision making tool. Rather, the tool will provide a range of possible answers reflecting the trade-offs associated with potential decisions. The tool does not make any decisions - this responsibility rests with the user.	In some cases, the achieved reduction in complexity might not be desired. It is a limitation that complete observations are required which often adds another preparatory step to the process (imputation of missing data). As a data-driven algorithm there are only limited options to intervene, for example with respect to the number of optimal clusters.	Despite their advantages in dealing with small numbers, these methods cannot resolve situations where no data is available at all. The interpretation of results obtained from synthetic data needs care - for example, when interpreting very specific attributes for a very distinct geographical region.
Geography	Local Authority level for Scotland/England/Wales.	LSOA/MSOA/DZ, and local authority level for Scotland, England, and Wales.	DZ/LSOA Level for Scotland, England, and Wales	Adopts the same geographical perspective as the SIPHER models that have been integrated - typically it is matched to the needs of the policy partner (so we have created Sheffield, Greater Manchester, Scotland (and Scottish LA) versions of the tool).	The clustering is currently based on all local authorities in Scotland, England, and Wales. A previous application covered the LSOA level for selected English Local Authorities.	The most common geographical level reflects the Local Authority level for England, Scotland, and Wales. In addition, estimates can be derived for the MSOA Level in England and Wales. Deriving estimates for the Intermediate Zone Level in Scotland is currently in progress. Due to the use of synthetic data, even smaller geographical resolutions can be achieved for some indicators.
Time Period	Based on available and imputed data for previous years (currently 2004-2021). The model provides a dynamic annual forecast for a specified period, for example 5 years, for each variable in the model.	Corresponding to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021)	The ‘jump off’ point for the scenarios is the latest period in the underlying Understanding Society input data (currently wave k (2019-2021)). The ‘time horizon’ for the scenario is set at 2037.	Adopts the same time period as the SIPHER models that have been integrated. Corresponds to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021) up to 2025/2026.	The current approach is cross-sectional, covering the last available year (2020/2021). As data on inclusive economies is available for a much longer period, it is planned to study the stability of clusters over time.	Estimates are available for 2004/2014 to 2020/2021 - dependent on the indicator and underlying data sources. Data updates and suggestions of new indicators can be incorporated easily.
Adjustments / Extensions	Factors which can be modified include: the underlying systems map (representing domains and their interactions), features of each respective intervention (including the amount of uplift or characteristics of recipients). In addition the method can be used to capture different systems (environment, housing etc.).	All information describing individuals in all or only particular areas can be seen as potentially modifiable. For example, income, employment status, health etc. These interventions are typically informed by previous research and are often referred to as “the morning after” scenarios - situations in which an immediate change to one or more individual-level factors has occurred instantaneously.	Features of each respective intervention, including the amount of uplift or characteristics of recipients receiving the uplift.	Potential adjustments include characteristics of the underlying models as well as features and the geographical granularity of the reported outcomes.	Adjustments to the current model include the number of clusters, a designated focus on one or more UK Nations (Scotland, England or Wales) in isolation as well as the respective Inclusive Economy indicators and health outcomes considered.	Data updates can be incorporated easily. Ideas for additional indicators are welcome and can be estimated given that suitable data is available in a synthetic or non-synthetic source.
Data Requirements	Aggregate level inputs for units of the studied geographical level (e.g. unemployment rate for the LA). Sufficient longitudinal data is required for all variables to validate the model. Cross-sectional data can supplement the longitudinal data for model determination. Domain-specific definitions need to be similar across all geographical units. Please note that different indicators have been selected for England and Wales and Scotland due to data availability.	Synthetic Population (see Product Guide details)	Understanding Society (waves a-k). If spatial results are required, the latest version of the Synthetic Population (see data for details).	The decision support tool requires results from other SIPHER models. In addition, information on the intervention as well as cost-effectiveness assumptions are required.	Aggregate-level information for geographical areas on a selected set of indicators. Indicators can come from various different sources, but each indicator must have been measured consistently across observation units. For k-means to work properly, the level of missing information should be 0%. In case any information is missing, imputation methods can be utilised to achieve this requirement.	This is indicator dependent. For some indicators, all required data is free and publicly available via ONS/NRS vital statistics data on population, deaths, and health outcomes. In particular for those indicators combining mortality and health information (e.g., QALE) access to the General and Special License of Understanding Society is required - depending on the level of geography required. If the underlying data source is synthetic data, such as the synthetic population, requirements of this source apply.
Applications	Typical applications include examining a system’s behaviour in response to policy interventions, such as interventions to address poverty, living wage, participation in employment, skills and qualification. In addition, this set of models can help to answer questions about the potential impact of direct policy responses to the current cost-of-living crisis. It is possible to forecast the impact of an intervention for a specific local authority.	Number and characteristics of people affected by a financial uplift policy or labour market intervention as well as total costs of this policy for a particular geographical area	Shocks and policy interventions which can be expressed as changes at the individual level. For example: changes to disposable income. Transition models need to be constructed for new problems.	Applications include local community interventions on components of wellbeing; spatial targeting of job creation schemes; impact of targeted employment stimuli on health outcomes.	The method is currently used to cluster local authorities based on inclusive economy indicators. It can be expanded to other indicator sets and domains as well as other outcome measures (environmental indicators).	Estimated measures include measures of mortality such as life expectancy and lifespan variation, measures of health such as the SF-12 instrument capturing physical and mental health, and composite measures combining health and mortality. Measures at the household-level related to cost-of-living are also available and can be obtained from synthetic sources.
Modelling Assumptions	Models depend on a pre-defined systems map that describes how domains impact each other and which domains can be subject to interventions. These systems maps need to specify causal pathways between domains with pre-defined time lags. Models also depend on data to provide evidence for quantifying relationships.	Assumptions of the Synthetic Population apply.	The model relies on the assumption that transitions between states over time - representing the characteristics of an individual - can be modelled using a set of specified and measured characteristics of this individual. In addition, the Markov assumption needs to hold, meaning that the time spent in a particular state (e.g. unemployed) does not have an impact on the probability of transitioning into other states (e.g. employed).	Inherits the assumptions of the SIPHER models that have been integrated. In addition, assumptions on the costs and effectiveness of interventions are required.	Clusters are identified based on the similarity of observed units with respect to a number of defined domains.	The major assumption is that small population sizes require specific methods to account for random fluctuations due to small numbers. Many measures, such as mortality rates, follow a very distinct pattern over age (standard trajectory), which requires knowledge of this approximate standard trajectory. When synthetic data is used, assumptions of the synthetic population apply.
User Options	Which variable to change and by how much, corresponding to the policy intervention (or shock/absence of intervention) which is evaluated. All changes can be applied differentially to local authorities.	Character, target group, and magnitude of particular interventions. In addition, the user can choose the geography level and select specific geographical regions of interest.	Character, target group, and magnitude of particular interventions. In addition, users can assess the impacts for LSOAs/DZs within a given area.	Geographical and temporal focus. Intervention configuration options.	The primary option for adjustment is the number of clusters.	The most common options are the measure itself, the geographical resolution, and year.
User Type(s)	Modellers, decision makers	Modellers, decision makers, descriptive overview to inform statistical modelling	Modellers, decision makers	Modellers, decision makers	Provides descriptive overview to inform decision making and modelling	Outcomes are used as inputs in other models, for monitoring purposes, and can inform decision making.
Examples / Link with Other Models and Data	Models of dynamic systems can inform individual-level approaches and help to validate results which were obtained in individual-level approaches. This also works in the opposite direction: changes at the individual level can be aggregated and expressed at the LA level.	This model requires SIPHER’s Synthetic Population.	This model uses SIPHER’s Synthetic Population.	The decision support tool uses the synthetic population, the dynamic systems model, the static and dynamic microsimulations, and the equivalent income utility function.	Can inform the interpretation of WS4 models. In turn, can inform WS4 model input.	Some of the derived health measures are used as input data in WS4 models, as outcomes for the association of clusters with health outcomes. In addition, some derived health measures can be attached to the synthetic population to represent area-level features as they cannot be derived directly from the synthetic population.
Software Requirement(s)	Matlab.	R or Python	Python	Python	R	R
Options for Extension	Building different models for different systems. Modelling and quantifying uncertainty.	All results can be combined with cost information where available to conduct cost-benefit analyses.	Building different models for different interventions. Factors impacting transitions can be adjusted based on different contexts and assumptions.	Alternative policy/intervention configurations.	Other domains for which indicator sets exist or can be created (crime, transport, environment etc.). A k-means clustering approach can be applied to individual-level life course trajectories.	Extension to a variety of small-area indicators is possible, such as age trajectories of fertility rates, employment rates, emergency admissions etc. In addition, different synthetic data sources can be utilised to create synthetic populations.
Additional Resources	Explore: https://www.gla.ac.uk/research/az/sipher/development/dynamicsystemsmodel/	Paper describing applied static microsimulation to create the Synthetic Population: https://www.nature.com/articles/s41597-022-01124-9	Explore: https://www.gla.ac.uk/research/az/sipher/products/minos/ For documentation visit: https://leeds-mrg.github.io/Minos/ and for code and more detailed user instructions visit: https://github.com/Leeds-MRG/Minos	Explore: https://www.gla.ac.uk/research/az/sipher/products/decisionsupporttool/ Software: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_interventions Database: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_database	Preliminary results available upon request	Explore - https://www.gla.ac.uk/research/az/sipher/products/inclusiveeconomydataset/ An exemplary pipeline, estimating a range of health measures: https://github.com/AndreasxHoehn/QALE_Exemplar Estimating quality-adjusted life expectancy (QALE) for local authorities in Great Britain and its association with indicators of the inclusive economy: a cross-sectional study BMJ Open March 2024 - https://bmjopen.bmj.com/content/14/3/e076704 Some indicators are available through the SIPHER Synthetic Population Dashboard: https://sipherdashboard.sphsu.gla.ac.uk/
Contact	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk	sipher@glasgow.ac.uk
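
The K-Means Clustering column above describes two steps: imputing missing indicator values so that every observation is complete, and then grouping areas by similarity. A minimal sketch of both steps follows, using only the Python standard library. The indicator values, the use of mean imputation, the choice of two clusters, the deterministic initialisation, and all function names are illustrative assumptions. This is not SIPHER's actual pipeline, which the table lists as implemented in R.

```python
import math


def impute_column_means(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]


def k_means(points, k, max_iterations=100):
    """Lloyd's algorithm; returns one cluster label per point.

    For reproducibility this sketch uses the first k points as starting
    centroids. Practical implementations usually prefer random restarts.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical scores (0-100) on two indicators for six areas; None = missing.
AREAS = [
    [80.0, 75.0],
    [82.0, None],
    [78.0, 77.0],
    [40.0, 35.0],
    [42.0, 33.0],
    [None, 36.0],
]

completed = impute_column_means(AREAS)
labels = k_means(completed, k=2)
print(labels)
```

Both hypothetical indicators share the same 0-100 scale. When indicators are measured in different units, they would normally be standardised before clustering so that no single indicator dominates the distance calculation.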

Return to SIPHER homepage

Glossary

Glossary of Terms
Term	Explanation
Attribute-rich	Attribute-rich means that for whatever unit of observation (e.g. an individual) there are many variables relating to this unit. For an individual in an attribute-rich dataset there would be many variables capturing different pieces of information about this individual. This information can, for example, relate to income, education, employment, health etc.
Causal Loop Diagram	A causal loop diagram is a diagram which consists of variables (also known as nodes) and links (also known as arcs). Variables are connected by links, shown as arrows, which capture the direction of the relationship. Next to the arrow is either a ‘+’ or ‘-’ sign which symbolises the type of relationship. Two variables connected by an arrow are taken to mean that a change in the variable at the tail of the arrow leads to a change in the variable at the head of the arrow. If an arrow has a ‘+’ sign next to it, it means an increase (decrease) in the first variable leads to an increase (decrease) in the second variable. If an arrow has a ‘-’ sign, it means an increase (decrease) in the first variable leads to a decrease (increase) in the second variable.
Cross-sectional	A cross-sectional measurement or analysis refers to a perspective in which there is only one single measurement. This measurement can represent one particular point in time, such as one day, one week, one year or one 3-year period. Analysing two or more cross-sectional measurements results in a longitudinal perspective.
Data Zone (DZ)	Data zones (DZ) are census-based, geographical areas in Scotland. Data zones contain approximately the same number of people who live in the area (around 700), but the geographical size can vary. The geographical size of DZs is small in densely populated areas (like a city) and larger in less densely populated areas (like rural areas).
Discrete Choice Experiment	Discrete choice experiments are a preference elicitation technique. They seek to understand and allow for quantification of how individuals trade-off alternatives. Typically, participants are asked to choose between a discrete number of alternatives. Alternatives can contain a set of associated characteristics and the choice between which alternative is better than the other depends on the underlying preferences of the participants (the weights the individual places on the individual characteristics).
Dynamical Systems Model	This is a mathematical framework used to describe and analyse the behaviour of complex systems over time. It represents how variables in the system change and interact with each other, capturing the dynamics and dependencies within the system. This type of model is widely applied in various fields, including physics, biology, economics, and engineering, to understand and predict the behaviour of real-world systems.
Electoral Ward	Refers to a level of geographical resolution as well as an electoral district in the UK. In total Scotland, England, and Wales are made up of 8,082 wards as of 2022. Information on boundaries and boundary changes over time, as well as geographical codes for maps, is provided here: https://geoportal.statistics.gov.uk/.
Equivalent Income	Equivalent income: given a current situation captured in terms of income (y) and non-income aspects (X), equivalent income (y’) is the amount of income that, if combined with the best levels of non-income aspects (X*), is as good as the current situation (y, X). It is a preference-based single index of wellbeing in monetary terms that penalises the level of income to reflect any deficiencies in the non-income aspects, based on how important each deficit is.
General License / Special License	Data provided by the UK Data Service, such as the Understanding Society Survey, often comes with different data access agreements. This is due to the fact that data may differ with respect to amount and character of potentially confidential information - for example with respect to details surrounding an individual’s place of residence. Most of the time, the more confidential a data set is, the higher the thresholds for accessing it. A general license agreement represents a situation in which data can often be accessed without further restrictions, after agreement to the respective terms and conditions. In contrast, a special license agreement refers to a situation in which data can only ever be accessed after prior approval by the data provider, which often involves sharing a research proposal including a clear justification for why certain safeguarded information is required. Further information on different license types can be found here: https://ukdataservice.ac.uk/help/access-policy/what-data-can-i-get/
Health Inequalities	Health inequalities are the differences in health of different groups within a population. These inequalities are considered unjust, unfair, avoidable and systematic.
Imputed Data	Data often contains missing values, which can be a challenge for some types of statistical modelling. When a missing data point is replaced with a particular value, for example the overall mean or median, this new value represents an imputed data point. There are various approaches to deciding the best replacement value, all of which have their strengths and limitations. A summary of one of the most common and robust approaches is provided here: https://gking.harvard.edu/amelia
Inclusive Economy (IE)	Improvements across a large number of domains, for example with respect to population health, have historically been attributed to factors surrounding economic growth. Within the past decade, limits to economic growth, for example given by environmental and planetary boundaries, have become increasingly obvious. The idea of an inclusive economy is particularly concerned with economic inclusion, rather than measures of size or growth of an economy. While there are different definitions of the inclusive economy, a number of common attributes have previously been pointed out. Through an iterative, consultative process, SIPHER has selected 13 relevant dimensions and compiled two indicator datasets reflecting these dimensions for local authorities and electoral wards in Scotland, England, and Wales. Further information about SIPHER’s approach to defining and developing indicators of an inclusive economy can be found in the Technical Report, available here: https://sipher.ac.uk/wp-content/uploads/2022/10/SIPHER-Inclusive-Economy-Indicator-set.pdf. A review outlining different approaches to defining inclusive economies can be found here: https://jech.bmj.com/content/75/11/1129.abstract
Inclusive Growth	This refers to the broad idea that growth and prosperity should create opportunities for the wider population and that the benefits of growth should be distributed fairly and reduce inequalities. The RSA Commission, for example, defined inclusive growth as ‘broad-based growth that enables the widest range of people and places to contribute to economic success, and to benefit from it too’ (see page 7, RSA (2016) https://www.thersa.org/reports/emerging-findings-of-the-inclusive-growth-commission, September 2016). However, inclusive growth remains a fuzzy concept, with some versions emphasising the need for inclusion within existing growth models (a ‘growth plus’ version), while others emphasise the need to change the economy so that poverty and inequalities are reduced by design (see the discussion starting page 6 from the report Achieving Inclusive Growth in Greater Manchester: what can be done? http://hummedia.manchester.ac.uk/institutes/mui/igau/IGAU-Consultation-Report.pdf)
Inequality Aversion	The extent to which an equal distribution of an outcome (such as income or health) is preferred over an unequal distribution. Where people are averse to inequality, a less-but-equal distribution is preferred over a more-but-unequal distribution.
Local Authority (LA)	Refers to a level of geographical resolution as well as a level of local government funding in the UK. In total Scotland, England, and Wales are made up of 363 local authorities (district level / lower tier). Information on boundaries and boundary changes over time, as well as geographical codes for maps, is provided here: https://geoportal.statistics.gov.uk/.
Longitudinal	Longitudinal data are data that were collected repeatedly over time from the same subject. This is in contrast to cross-sectional data, which are collected once. For example, the Understanding Society survey data are longitudinal data as the same households participate in the survey and answer the same questions annually.
Lower Super Output Area (LSOA)	Lower Super Output Areas (LSOA) are census-based, geographical areas in England and Wales. LSOAs contain approximately the same number of people who live in the area but the geographical size can vary - their geographical size is small in densely populated areas (like a city) and larger in less densely populated areas (like rural areas).
Macro-level modelling	Macro-level modelling produces aggregate outputs and provides information about larger population groups, which are often geographical areas or countries.
Matlab	Matlab is a commercial statistical and mathematical platform which comes with its own syntax. More information on Matlab can be found on the provider’s website: https://www.mathworks.com/products/matlab.html
Microsimulation	A modelling technique used to simulate the behaviour and interactions of individuals offering an insight into the impact of policy interventions. It requires the creation of a synthetic population data source providing a digital twin of the individuals to be examined with their specific attributes. By simulating their decisions, actions, and interactions over time researchers can explore the effects of different policies, interventions, or scenarios at the individual level and aggregate this to understand community impacts.
Micro-level modelling	Micro-level means the model seeks to tell us something about individuals, groups of individuals or very small areas.
Model output	Model output reflects a situation in which a result was obtained through a statistical process, in contrast to situations in which we deal with raw input data. Even descriptive statistics can, in some situations, represent model output, for example when obtained from synthetic data sources such as the SIPHER Synthetic Population. Model output should always be interpreted and discussed within the context of its creation, acknowledging the strengths and limitations of the creation process.
Non-synthetic data	Non-synthetic data represents data that were gathered through various data collection methods - for example as part of a register-based population system or a survey. Non-synthetic data are reflecting observations of true real-world units, such as real individuals or geographical areas.
Python	Python is a free and publicly available, general-purpose programming language. It is particularly useful when processing large amounts of data. More information on Python can be found here: https://www.python.org/
R	R is a free and publicly available programming language. R has a large, interdisciplinary user base and is particularly well-suited for statistical analyses and data visualisation. The R-project: https://www.r-project.org/
Register-based System	A register-based system refers to a situation in which each individual of a country can be traced exactly - for example due to a unique personal identification number - across multiple sources of routinely collected administrative data. These sources could cover for example: education, tax, prescriptions, hospital admissions, fertility, housing, cause of death etc. This system is common in the Nordic countries, were a register-based systems has been in place since the 1960s. Within the past years, a strong move towards data linkage has taken place in Scotland and the UK and with respect to NHS data. Historically, register-based data were collected routinely for administrative purposes, rather than with a particular research purpose in mind. For Scotland, eDRIS provides the infrastructure to many registers - often referred to as national data safe havens: https://www.isdscotland.org/products-and-services/edris/ The system is more decentralised in England and Wales
Shortform 12 (SF-12)	Shortform 12 (SF-12) is a standardised questionnaire consisting of 12 questions related to the responding individual’s health. From these responses, two summary scores known as the physical summary component (PCS) and mental component summary (MCS) can be calculated for the individual. This differentiation into physical and mental health enables a holistic perspective on an individual’s health. This holistic perspective presents an advantage over more traditional health measures which are often only indicative of an individual’s risk of dying or the reflection of a broadly captured self-rated health status. There are different version of the SF-12 available. For example, Understanding Society uses the SF12v2. The SF-12 instrument has been developed by the company quality metric: https://www.qualitymetric.com/sf-12v2-pro-health-survey-lp/
Simulation	A simulation refers to a study design in which hypothetical changes of the observed units are modelled. Units under study can be individuals, households, administrative units such as UK Local Authority Districts, or entire countries. Simulations have a particular advantage over traditional study designs: shocks can be studied before they have occurred, and interventions can be tested before any money is spent. In turn, this requires that we have a certain amount of information about the particular event or policy intervention. A major drawback of simulations is that, despite having as many insights into potential behaviour as possible, in reality our units of observation might still not behave the way we have predicted.
Stata	Stata is a commercial statistical software package which comes with its own syntax. More information on Stata can be found on the provider’s website: https://www.stata.com/
Synthetic Data	Synthetic data are artificially generated data, which may or may not draw upon existing data. In the SIPHER project, synthetic data are generated and processed by WS3 in response to a need for such data, and are used in WS4, WS5 and WS7 models.
Systematic Review	A scientific method to identify, structure, condense and summarise previous research findings and other sources of evidence with respect to a particular topic. This approach helps to eliminate subjectivity and selectivity with respect to a studied topic. Results of systematic reviews help us to understand the current state of research as well as future directions.
UK Data Service	The UK Data Service is the UK’s largest institution collecting and providing economic, population and social research data. The data provided via the UK Data Service are typically used for research, teaching, learning and public benefit. Access to many data collections, including the Understanding Society survey data, is provided via the UK Data Service website. While some data are easily available, other data, in particular those containing sensitive information, might be subject to special licence agreements and other forms of safeguarding. More about the UK Data Service can be found here: https://ukdataservice.ac.uk/ An overview of domains covered in UK Data Service data sets is provided here: https://ukdataservice.ac.uk/help/access-policy/what-data-can-i-get/
Understanding Society (US)	Understanding Society, also known as the UK Household Longitudinal Study, is a longitudinal study which follows participants over time. Households are selected to join the study and all members of the household participate in the study. The survey is conducted annually and collects data from participants about their lives through a wide-ranging set of questions. The survey began in 2009 and succeeded the British Household Panel Survey (BHPS), which ran from 1991 to 2009. Harmonised BHPS waves are available and enable an even longer time frame when combined with Understanding Society. More about the study can be found here: https://www.understandingsociety.ac.uk/about/about-the-study Access to the study is provided via the UK Data Service. A link to the General License Version: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6614#!/details
Variable	The term variable can refer to different concepts. Typically, and in this context, the term refers to an object which captures any type of information (e.g., name or spending) about an observational unit of interest (e.g., a person, a household, a Local Authority District, the Scottish Government) at a particular point in time (e.g., yesterday, today, last year).
Ward	See Electoral Ward.
Workstrand (WS)	The SIPHER project is divided into eight workstrands. Each workstrand has a particular focus on either qualitative or quantitative questions. This diversity represents a particular strength of SIPHER as workstrands contribute to each other in a supportive and iterative manner. For example, theory informs statistical modelling, and results of statistical modelling can change perspectives on existing theories. A summary of SIPHER workstrands and their synergistic interaction is described in more detail here: https://wellcomeopenresearch.org/articles/4-174/v1 and is also available here: https://www.gla.ac.uk/research/az/sipher/sipherwheel-workstrands/

Return to SIPHER homepage

Acronyms

List of Acronyms
Acronyms	Explanation
DST	Decision Support Tool
DZ	Data Zone
EQ-5D-5L	EuroQol 5 Dimensions 5 Levels (Health and Wellbeing Questionnaire)
EQ-HWB	EuroQol Health and Wellbeing instrument (Health and Wellbeing Questionnaire)
HUI	Health Utilities Index
ICECAP-A	ICEpop CAPability measure for Adults (Health and Wellbeing Questionnaire)
IE	Inclusive Economy
IG	Inclusive Growth
LA	Local Authority
LSOA	Lower Super Output Area
MSOA	Middle Layer Super Output Area
NRS	National Records of Scotland
ONS	Office for National Statistics
ONS-4	Office for National Statistics - 4 (Health and Wellbeing Questionnaire)
ORDA	Online Research Data
QALE	Quality-adjusted life expectancy
SF-12	Short Form 12 Health Survey and its mental and physical health scores MCS and PCS
US	Understanding Society
WEMWBS	Warwick-Edinburgh Mental Wellbeing Scale
WS	Workstrand

Return to SIPHER homepage
