Housing affordability

We want to explore which local authorities are the most/least expensive to buy homes in, relative to wages. For this, we wll use the median house price to residence based earning ratio.

from pathlib import Path
import pandas as pd
ROOT = Path("../").resolve()
# Load the house price to wage ratio data
d = pd.read_parquet(ROOT/"data/house-prices/site/hp_to_wage_ratio.parquet")

# Load the geocodes
codes = pd.read_csv(ROOT / "metadata/lookups/Local_Authority_Districts_(April_2023)_Names_and_Codes_in_the_United_Kingdom.csv", usecols=['LAD23NM', 'LAD23CD'])
codes.rename(columns={'LAD23CD': 'geography_code'}, inplace=True)

# Load the house prices data
hp = pd.read_parquet(ROOT / "data/house-prices/site/median_house_prices.parquet")
hp.reset_index(inplace=True)
# Filter so we only have one date per year. We'll choose the latest which is march.
hp = hp[hp['date'].str.contains('Mar')]
# Get tht year only
hp['date'] = hp['date'].apply(lambda s: s[-4::])

# Merge the dataframes
d = d.merge(codes, on='geography_code').merge(hp, on=['geography_code', 'date']).set_index('geography_code')
d

date value LAD23NM geography_name Median
geography_code
E06000001 2002 2.86 Hartlepool Hartlepool 48997.5
E06000002 2002 2.83 Middlesbrough Middlesbrough 46000.0
E06000003 2002 2.83 Redcar and Cleveland Redcar and Cleveland 53150.0
E06000004 2002 3.33 Stockton-on-Tees Stockton-on-Tees 60000.0
E06000005 2002 3.42 Darlington Darlington 56500.0
... ... ... ... ... ...
W06000020 2023 5.84 Torfaen Torfaen 180000.0
W06000021 2023 8.99 Monmouthshire Monmouthshire 320000.0
W06000022 2023 6.64 Newport Newport 215000.0
W06000023 2023 7.17 Powys Powys 242997.5
W06000024 2023 4.55 Merthyr Tydfil Merthyr Tydfil 135000.0

6908 rows × 5 columns

Let's look at the most recent year's data, which is for 2023. We'll also drop any places that don't have data for 2023.

d = d[d.date == '2023']
d = d.dropna()
least_affordable = d.sort_values(by='value', ascending=False).head(5)
least_affordable

date value LAD23NM geography_name Median
geography_code
E09000020 2023 25.36 Kensington and Chelsea Kensington and Chelsea 1357500.0
E09000033 2023 19.11 Westminster Westminster 965000.0
E09000007 2023 17.12 Camden Camden 770000.0
E09000013 2023 17.03 Hammersmith and Fulham Hammersmith and Fulham 770000.0
E09000005 2023 15.30 Brent Brent 565000.0

Unsurprisingly, they are all Boroughs of London. We'll do the same thing but ignore London Boroughs (E09 codes).

d = d[~d.index.str.startswith('E09')]
least_affordable_not_london = d.sort_values(by='value', ascending=False).head(5)
least_affordable_not_london

date value LAD23NM geography_name Median
geography_code
E07000210 2023 14.70 Mole Valley Mole Valley 580000.0
E07000098 2023 14.57 Hertsmere Hertsmere 565000.0
E07000102 2023 14.21 Three Rivers Three Rivers 580000.0
E07000240 2023 13.43 St Albans St Albans 633500.0
E07000207 2023 13.36 Elmbridge Elmbridge 665000.0

Finally, where are the 5 most affordable (lowest house price to earnings ratio) places?

most_affordable = d.sort_values(by='value', ascending=False).tail(5)
most_affordable

date value LAD23NM geography_name Median
geography_code
E06000008 2023 4.28 Blackburn with Darwen Blackburn with Darwen 137000.0
E06000001 2023 4.16 Hartlepool Hartlepool 130000.0
E06000047 2023 4.00 County Durham County Durham 125000.0
E07000120 2023 3.92 Hyndburn Hyndburn 127000.0
E07000117 2023 3.39 Burnley Burnley 116000.0