Housing affordability

We want to explore which local authorities are the most/least expensive to buy homes in, relative to wages. For this, we wll use the median house price to residence based earning ratio.

from pathlib import Path
import pandas as pd
ROOT = Path("../").resolve()

# Load the house price to wage ratio data
d = pd.read_parquet(ROOT/"data/house-prices/site/hp_to_wage_ratio.parquet")

# Load the geocodes
codes = pd.read_csv(ROOT / "metadata/lookups/Local_Authority_Districts_(April_2023)_Names_and_Codes_in_the_United_Kingdom.csv", usecols=['LAD23NM', 'LAD23CD'])
codes.rename(columns={'LAD23CD': 'geography_code'}, inplace=True)

# Load the house prices data
hp = pd.read_parquet(ROOT / "data/house-prices/site/median_house_prices.parquet")
hp.reset_index(inplace=True)
# Filter so we only have one date per year. We'll choose the latest which is march.
hp = hp[hp['date'].str.contains('Mar')]
# Get tht year only
hp['date'] = hp['date'].apply(lambda s: s[-4::])

# Merge the dataframes
d = d.merge(codes, on='geography_code').merge(hp, on=['geography_code', 'date']).set_index('geography_code')
d

	date	value	LAD23NM	geography_name	Median
geography_code
E06000001	2002	2.86	Hartlepool	Hartlepool	48997.5
E06000002	2002	2.83	Middlesbrough	Middlesbrough	46000.0
E06000003	2002	2.83	Redcar and Cleveland	Redcar and Cleveland	53150.0
E06000004	2002	3.33	Stockton-on-Tees	Stockton-on-Tees	60000.0
E06000005	2002	3.42	Darlington	Darlington	56500.0
...	...	...	...	...	...
W06000020	2023	5.84	Torfaen	Torfaen	180000.0
W06000021	2023	8.99	Monmouthshire	Monmouthshire	320000.0
W06000022	2023	6.64	Newport	Newport	215000.0
W06000023	2023	7.17	Powys	Powys	242997.5
W06000024	2023	4.55	Merthyr Tydfil	Merthyr Tydfil	135000.0

6908 rows × 5 columns

Let's look at the most recent year's data, which is for 2023. We'll also drop any places that don't have data for 2023.

d = d[d.date == '2023']
d = d.dropna()
least_affordable = d.sort_values(by='value', ascending=False).head(5)
least_affordable

	date	value	LAD23NM	geography_name	Median
geography_code
E09000020	2023	25.36	Kensington and Chelsea	Kensington and Chelsea	1357500.0
E09000033	2023	19.11	Westminster	Westminster	965000.0
E09000007	2023	17.12	Camden	Camden	770000.0
E09000013	2023	17.03	Hammersmith and Fulham	Hammersmith and Fulham	770000.0
E09000005	2023	15.30	Brent	Brent	565000.0

Unsurprisingly, they are all Boroughs of London. We'll do the same thing but ignore London Boroughs (E09 codes).

d = d[~d.index.str.startswith('E09')]
least_affordable_not_london = d.sort_values(by='value', ascending=False).head(5)
least_affordable_not_london

	date	value	LAD23NM	geography_name	Median
geography_code
E07000210	2023	14.70	Mole Valley	Mole Valley	580000.0
E07000098	2023	14.57	Hertsmere	Hertsmere	565000.0
E07000102	2023	14.21	Three Rivers	Three Rivers	580000.0
E07000240	2023	13.43	St Albans	St Albans	633500.0
E07000207	2023	13.36	Elmbridge	Elmbridge	665000.0

Finally, where are the 5 most affordable (lowest house price to earnings ratio) places?

most_affordable = d.sort_values(by='value', ascending=False).tail(5)
most_affordable

	date	value	LAD23NM	geography_name	Median
geography_code
E06000008	2023	4.28	Blackburn with Darwen	Blackburn with Darwen	137000.0
E06000001	2023	4.16	Hartlepool	Hartlepool	130000.0
E06000047	2023	4.00	County Durham	County Durham	125000.0
E07000120	2023	3.92	Hyndburn	Hyndburn	127000.0
E07000117	2023	3.39	Burnley	Burnley	116000.0