All state-funded mainstream schools that meet all of the following criteria are included in Schools Like Yours:




All state-funded special schools that meet all of the following criteria are included in Schools Like Yours:


Linking schools across datasets

Schools are generally identified by a three-digit local authority (LA) identifier and a four-digit school (establishment, or "estab") identifier. The three-digit LA code changes when local authority boundaries change, and the four-digit estab code changes when the nature of a school changes (e.g. when an academy sponsor changes).

We use data on school links (successors and predecessors) available through Get Information About Schools to consistently identify a school over time, and do not include data on schools which amalgamate.
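As a rough illustration of how successor links can be followed, the sketch below resolves an identifier to its most recent successor and returns nothing when the chain passes through an amalgamation. The `successor_of` mapping, the function name and the identifiers are hypothetical. The actual structure of the links data, and the rule used here to spot an amalgamation (a successor with more than one predecessor), are assumptions rather than a description of the site's implementation.

```python
from collections import Counter
from typing import Dict, Optional


def resolve_current_id(school_id: str, successor_of: Dict[str, str]) -> Optional[str]:
    """Follow successor links from school_id to the latest identifier.

    Returns None if the chain passes through an amalgamation, which is
    assumed here to mean a successor that has more than one predecessor.
    """
    predecessor_counts = Counter(successor_of.values())
    seen = set()
    current = school_id
    while current in successor_of:
        if current in seen:
            return None  # circular links: treat the record as unusable
        seen.add(current)
        current = successor_of[current]
        if predecessor_counts[current] > 1:
            return None  # two or more schools merged into this one
    return current


# Hypothetical identifiers: A1 -> A2 -> A3 is a simple chain,
# while B1 and B2 amalgamate into C1.
links = {"A1": "A2", "A2": "A3", "B1": "C1", "B2": "C1"}
print(resolve_current_id("A1", links))
print(resolve_current_id("B1", links))
```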


Data sources

The site makes use of the following publicly available school-level datasets:


Capacity figures are calculated in percentage terms, based on schools' recorded capacity and the number of pupils on roll. Figures below 100% mean a school has some spare capacity; figures above 100% mean a school is operating above capacity.
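The capacity calculation can be sketched as follows. The function name and the handling of a missing or zero capacity are assumptions made for illustration.

```python
from typing import Optional


def capacity_percentage(pupils_on_roll: int, capacity: int) -> Optional[float]:
    """Pupils on roll as a percentage of the school's recorded capacity."""
    if capacity <= 0:
        return None  # assumption: no usable capacity figure, so no percentage
    return 100 * pupils_on_roll / capacity


print(capacity_percentage(190, 200))  # below 100: some spare capacity
print(capacity_percentage(210, 200))  # above 100: operating above capacity
```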

Ofsted data

Ofsted publishes monthly management information containing the results of its inspections.

We have accumulated these datasets over a number of years and use them to identify the inspection judgment for overall effectiveness for each school (including any linked predecessor) as at 31 March 2023.

Three-year averages (KS2 data only)

A number of three-year average figures are included for KS2 measures:

Income per pupil

Data on income and expenditure for maintained schools and academies are reported separately by the Department for Education.

Each data source contains a measure of total income and the number of full-time equivalent (FTE) pupils. Income is divided by FTE numbers to give an income-per-pupil figure. To exclude values likely to be erroneous, lower limits of £2,000 (KS2 and KS4 schools) and £10,000 (special schools), and upper limits of £19,000 (KS2 and KS4 schools) and £100,000 (special schools), have been applied.
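A minimal sketch of this step is shown below, using the limits quoted above. The function name, the `school_type` labels, and the choice to treat the limits as inclusive and to return a missing value for out-of-range figures are assumptions.

```python
from typing import Optional

# Lower and upper limits in pounds per FTE pupil, as quoted in the text.
LIMITS = {
    "mainstream": (2_000, 19_000),   # KS2 and KS4 schools
    "special": (10_000, 100_000),    # special schools
}


def income_per_pupil(total_income: float, fte_pupils: float, school_type: str) -> Optional[float]:
    """Income divided by FTE pupils, or None if the result looks erroneous."""
    if fte_pupils <= 0:
        return None
    value = total_income / fte_pupils
    lower, upper = LIMITS[school_type]
    # Assumption: limits are inclusive and out-of-range values are dropped.
    return value if lower <= value <= upper else None


print(income_per_pupil(1_200_000, 240, "mainstream"))  # within limits
print(income_per_pupil(100_000, 100, "mainstream"))    # below the lower limit
```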

Income figures used are net of catering income and supply teacher insurance claims. Direct revenue financing has not been deducted from total income in any reporting year.

Where income figures cover a period of less than 12 months, a 12-month figure is calculated on a pro rata basis.
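For illustration, a pro rata adjustment might look like the sketch below. It assumes the period is measured in whole months. The source does not say whether months or days are used, and the function name is hypothetical.

```python
def annualise_income(income: float, months_covered: int) -> float:
    """Scale income reported for part of a year up to a 12-month figure."""
    if not 1 <= months_covered <= 12:
        raise ValueError("months_covered must be between 1 and 12")
    return income * 12 / months_covered


# Seven months of income scaled to a full year.
print(annualise_income(350_000, 7))
```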

For schools within federations, figures reported are federation-level figures, which are published under the lead school of each federation.

For schools within multi-academy trusts, central services income is allocated in proportion to FTE pupil numbers and the number of months that schools have been with the trust within the relevant calendar year. Where FTE pupil numbers are not available for a given school, no central services income is allocated to the school.
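The allocation rule can be sketched as below. The text does not say how pupil numbers and months are combined. This sketch assumes each school's share is proportional to FTE pupils multiplied by months with the trust. The function name, school names and figures are hypothetical.

```python
from typing import Dict, Optional, Tuple


def allocate_central_income(
    central_income: float,
    schools: Dict[str, Tuple[Optional[float], int]],
) -> Dict[str, float]:
    """Split a trust's central services income across its schools.

    schools maps a school name to (FTE pupils or None, months with the trust).
    Assumption: shares are proportional to FTE pupils x months.
    """
    pupil_months = {
        name: (fte * months if fte is not None else 0.0)  # no FTE: no allocation
        for name, (fte, months) in schools.items()
    }
    total = sum(pupil_months.values())
    if total == 0:
        return {name: 0.0 for name in schools}
    return {name: central_income * pm / total for name, pm in pupil_months.items()}


trust = {
    "School A": (200.0, 12),  # with the trust all year
    "School B": (100.0, 6),   # joined part-way through the year
    "School C": (None, 12),   # FTE pupil numbers not available
}
print(allocate_central_income(90_000, trust))
```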


What's the difference?

Schools Like Yours uses a calculated "difference" to determine similarity between pairs of schools. The exact values, which can be displayed in the difference column, have no meaning on their own, but can be used for comparison. Zero means "the same", according to the calculation; the greater the difference beyond that, the greater the dissimilarity from the selected school.


Let's imagine there are two schools, Alpha School and Beta School, and we want to measure how similar they are in terms of prior attainment, the percentage of pupils who are disadvantaged, and the percentage of pupils with a first language other than English.

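| Measure | Alpha School | Beta School | Difference |
| --- | --- | --- | --- |
| Prior attainment | 24.3 | 29.3 | -5 |
| Disadvantaged pupils (%) | 30 | 24 | +6 |
| First language other than English (%) | 9 | 10 | -1 |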

For the purpose of this example, we'll use the values without scaling. Both schools here have data for each of the three measures; in Schools Like Yours, when a value is missing, we use an average value as an approximation (or proxy).
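The use of an average as a proxy for missing values might look like the sketch below. It assumes the "average" is the arithmetic mean of the schools that do have a value for the measure, which the text does not confirm. The function name and figures are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def fill_missing_with_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing values (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("no known values to average")
    average = mean(known)
    return [average if v is None else v for v in values]


# One measure across three schools, the second with no recorded value.
print(fill_missing_with_mean([30.0, None, 24.0]))
```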

As described below, difference is calculated as the square root of the (weighted) sum of squared differences in each included measure, so the difference between the schools' scores is calculated as $\text{difference} = \sqrt{(-5)^2 + (+6)^2 + (-1)^2} = \sqrt{25 + 36 + 1} = \sqrt{62} = 7.874\ldots$

We might want to give greater weight to the percentage of disadvantaged pupils when calculating similarity and so could increase its weight to 1.5. The difference then would be $\text{difference} = \sqrt{(-5)^2 + (1.5 \times (+6))^2 + (-1)^2} = \sqrt{25 + 81 + 1} = \sqrt{107} = 10.344\ldots$

By repeating this calculation for every school with sufficient data, we can then find the schools in England that are most similar to Alpha School.
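The two calculations for Alpha School and Beta School can be reproduced with a few lines of Python. This is a sketch of the arithmetic in the example only. The function and variable names are illustrative, and the values are used without scaling, as in the example.

```python
import math
from typing import Optional, Sequence


def difference(
    a: Sequence[float],
    b: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Square root of the sum of squared (weighted) differences."""
    if weights is None:
        weights = [1.0] * len(a)  # default: every measure weighted equally
    # As in the worked example, the weight multiplies the difference
    # before it is squared.
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))


# Prior attainment, disadvantaged pupils (%), first language other than English (%)
alpha = [24.3, 30, 9]
beta = [29.3, 24, 10]

print(round(difference(alpha, beta), 3))                # 7.874
print(round(difference(alpha, beta, [1, 1.5, 1]), 3))   # 10.344
```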

Calculation of similarity

In mathematical language, the difference might be described as an n-dimensional weighted Euclidean distance, calculated as the square root of the (weighted) sum of squared differences in each included measure. This is an abstract and extended form of the familiar trigonometry calculation for the length of the hypotenuse of a right-angled triangle: $c = \sqrt{a^2 + b^2}$
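Written out in full, one way to express this is shown below. The symbols are introduced here for illustration and do not come from the site: $x_{A,i}$ and $x_{B,i}$ are the (rescaled) values of measure $i$ for schools $A$ and $B$, $w_i$ is the weight given to measure $i$, and $n$ is the number of included measures. The weight is placed inside the square, which matches the weighted example earlier on this page, where 1.5 × 6 is squared to give 81.

$$
d(A, B) = \sqrt{\sum_{i=1}^{n} \bigl( w_i \, (x_{A,i} - x_{B,i}) \bigr)^2 }
$$

With $n = 2$ and both weights equal to 1, this reduces to the hypotenuse formula, with $a$ and $b$ being the differences in the two measures.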

Once the difference is calculated, the records returned are simply those with the smallest values.
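Selecting the records with the smallest values can be sketched as follows. The school names and values are hypothetical and are assumed to be already rescaled. For brevity this sketch uses equal weights via the standard library's `math.dist`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def most_similar(
    selected: Sequence[float],
    others: Dict[str, Sequence[float]],
    n: int = 2,
) -> List[Tuple[str, float]]:
    """Return the n schools with the smallest difference from the selected school."""
    scored = [(math.dist(selected, values), name) for name, values in others.items()]
    scored.sort()  # smallest difference first
    return [(name, diff) for diff, name in scored[:n]]


selected = (0.0, 0.0)
others = {
    "School B": (3.0, 4.0),
    "School C": (1.0, 0.0),
    "School D": (0.0, 2.0),
}
print(most_similar(selected, others, n=2))
```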


If all the measures are on the same scale, this sort of calculation is straightforward. Otherwise, a measure with a greater range of values would tend to dominate the calculation, since the possible difference in that measure is greater. To counter this, the values are first rescaled so that the population standard deviation for each included measure is 1.

This applies only to the values used in the difference calculation, not the measure values displayed on the table.
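A minimal sketch of the rescaling step is given below. It divides each value by the population standard deviation of the measure, which is one straightforward way to make that standard deviation equal to 1. Whether the site also centres the values is not stated (centring would not affect differences between schools). The function name and data are illustrative.

```python
from statistics import pstdev
from typing import List, Sequence


def rescale(values: Sequence[float]) -> List[float]:
    """Rescale one measure so its population standard deviation is 1."""
    sd = pstdev(values)
    if sd == 0:
        return list(values)  # assumption: a constant measure is left unchanged
    return [v / sd for v in values]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # population sd is 2
scaled = rescale(data)
print(scaled)
print(pstdev(scaled))
```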


In order to allow for prioritising certain measures in the calculation, we include a weight for each measure included in the calculation.

By default, all data items included in the similarity criteria are given a weight of 1, in other words they are considered equally important. However, these values can be changed by clicking on the pill within the similarity criteria box and entering a user-defined value in the weight box.

A weight of 0.5 means that the selected measure has only half the importance of the other measures; a weight of 3 means it is three times as important as normal. It follows that a weight of zero effectively removes the measure from the calculation, since it then has no importance at all.
