Housing Market Trends: Data-Driven Insights and Analysis
- Oluwakemi Oyefeso
- Feb 20
Updated: Aug 28

Problem Statement:
Various factors, including property features, location, and economic conditions, influence the housing market. Buyers, sellers, and real estate professionals need data-driven insights into pricing trends, outliers, and key property characteristics to make informed decisions. This analysis aims to uncover market trends, identify pricing patterns, and highlight factors affecting property values.
Goals:
Understand Current Trends: Gain insights into property price distributions, demand patterns, and market segmentation.
Identify Growth Opportunities: Detect underpriced and overpriced properties, recognize high-demand segments, and highlight potential investment areas.
Develop Data-Driven Strategies: Provide insights for pricing strategies, buyer recommendations, and seller expectations.
Key Stakeholders:
Real Estate Investors: To identify profitable investment opportunities and evaluate risk factors.
Property Sellers: To optimize pricing strategies based on market trends and demand.
Home Buyers: To make informed purchasing decisions based on pricing trends and affordability.
Real Estate Agents & Analysts: To guide clients using data-driven insights on pricing and market demand.
Data Collection
Source: Kaggle Housing Dataset containing details such as price, number of bedrooms, bathrooms, parking spaces, and area size.
Analysis Tools: Power BI.
Limitations:
Data Availability & Accuracy: The dataset does not include all housing transactions, and inaccuracies in reported prices or property features can affect insights.
Market Fluctuations: Housing prices are influenced by external factors such as economic conditions, interest rates, and government policies, which may not be fully captured in the dataset.
Geographical Constraints: Findings may not generalize to all housing markets, as trends vary by location, demand, and local regulations.
Unaccounted Factors: Key drivers of property prices, such as neighborhood desirability, school districts, and future urban development, may not be included in the dataset.
Temporal Limitations: The dataset represents a specific period, making it less effective in predicting long-term trends or future market shifts.
Feature Interactions: Some property attributes may have complex, non-linear relationships with price, requiring advanced modeling beyond basic exploratory analysis.
Scope:
Price Distribution Analysis
Scope:
a. Construct a histogram to visualize property price distribution.
b. Calculate key statistics: mean, median, and standard deviation.
c. Detect outliers using a box plot to identify properties significantly above or below the norm.
Objective:
Understand the distribution of property prices to identify trends, common price ranges, and outliers.
a. Histogram to visualize property price distribution.

Insights:
Skewness of Property Prices
The histogram exhibits a right-skewed distribution, indicating a longer tail on the right.
Most properties are priced between 2M and 6M, with the highest concentration around 4M.
Very few properties exceed 10M, confirming that luxury properties are scarce.
Price Concentration
The peak range (2M–6M) suggests strong demand and affordability within this bracket.
The highest number of properties (43 units) falls within the 4M–4.9M range.
Outliers & High-End Properties
A small cluster of properties priced above 10M signals the presence of luxury estates, penthouses, or premium real estate.
These high-priced properties form a distinct, low-frequency segment of the market.
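The bucketing behind a histogram like this can be reproduced outside Power BI in a few lines of Python. This is only a minimal sketch: the 1M bin width, the `histogram_counts` helper, and the sample prices are assumptions made for illustration, not values taken from the report.

```python
from collections import Counter

def histogram_counts(prices, bin_width=1_000_000):
    """Count prices per fixed-width bin, keyed by each bin's lower edge."""
    counts = Counter((int(p) // bin_width) * bin_width for p in prices)
    return dict(sorted(counts.items()))

# Illustrative prices only; these are not rows from the dataset.
sample_prices = [2_450_000, 3_900_000, 4_200_000, 4_750_000,
                 4_900_000, 5_600_000, 12_250_000]

for lower_edge, count in histogram_counts(sample_prices).items():
    upper_edge = lower_edge + 1_000_000
    print(f"{lower_edge / 1e6:.0f}M-{upper_edge / 1e6:.0f}M: {count}")
```

A long run of empty bins before the last populated one is what shows up as the right-hand tail in the chart.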
Recommendations:
For Buyers: The widest choice of properties falls between 2M and 6M, where most of the market is concentrated.
For Sellers: Properties priced above 10M may experience longer selling periods due to limited demand in the luxury segment.
b. Calculate key statistics: mean, median, standard deviation, and price per square foot.
KPIs:

Utilizing DAX measures to derive summary statistics:
Mean Price: 4.77M – The average property price.
Since the mean is slightly higher than the median, it suggests that a few high-value properties are pulling the average up.
MeanPrice = AVERAGE(Housing[Price])
Median Price: 4.34M – The middle value in the price distribution.
This indicates that half of the properties are priced below 4.34M, while the other half are above it.
MedianPrice = MEDIAN(Housing[Price])
Standard Deviation: 1.87M – Reflects the degree of variation in property prices.
A high standard deviation suggests a wide price dispersion, likely due to a mix of budget, mid-range, and luxury properties.
StdDevPrice = STDEV.P(Housing[Price])
Average Price Per Square Foot: 925.48
Useful for comparing affordability across different property sizes.
PricePerSqFt = SUM(Housing[Price]) / SUM(Housing[Area])
The right-skewed histogram and higher mean confirm that a few expensive properties are influencing the market.
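As a cross-check outside Power BI, the same four KPIs can be sketched with Python's standard library. The `price_summary` helper and the sample values are hypothetical; the only things carried over from the measures above are the choice of a population standard deviation and the ratio-of-totals form of price per square foot.

```python
import statistics

def price_summary(prices, areas):
    """Return mean, median, population std dev, and total price / total area."""
    return {
        "mean": statistics.fmean(prices),
        "median": statistics.median(prices),
        "std_dev": statistics.pstdev(prices),        # population form, like STDEV.P
        "price_per_area": sum(prices) / sum(areas),  # ratio of totals, like SUM / SUM
    }

# Illustrative values only; these are not rows from the dataset.
sample_prices = [3_000_000, 4_000_000, 5_000_000, 12_000_000]
sample_areas = [3_000, 4_000, 5_000, 8_000]

summary = price_summary(sample_prices, sample_areas)
print(summary)
print("mean above median:", summary["mean"] > summary["median"])
```

Dividing total price by total area weights large properties more heavily than averaging each property's own price per square foot would, so the two approaches can give different figures.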
c. Detect outliers using a box plot to identify properties significantly above or below the norm.
To identify outliers (properties priced significantly above or below the average), I created a box plot analyzing the average price, bedrooms, and area.

Key Observations:
Wide Price Distribution
The whiskers extend from about 2M to 14M, showing a broad range.
This suggests significant variations in property prices across different bedroom counts and areas.
Median and Interquartile Range (IQR) Are More Centered
The IQR spans approximately 3.5M to 6M, capturing most properties in this range.
The median price is 4,447,333.33, meaning that half of the properties are below this price and half are above.
Presence of Outliers
There are several high-value outliers above 10M and up to 14M, indicating luxury properties or prime real estate locations.
The lower-end properties are closer together, with fewer extreme low values.
To categorize prices as "Normal" or "Outlier," I created a Price_IQR column using the following DAX formula:
Price_IQR_Column =
VAR Q1 = CALCULATE(PERCENTILE.INC('Table'[Price], 0.25), ALL('Table'))
VAR Q3 = CALCULATE(PERCENTILE.INC('Table'[Price], 0.75), ALL('Table'))
VAR IQR = Q3 - Q1
VAR Lower_Bound = Q1 - 1.5 * IQR
VAR Upper_Bound = Q3 + 1.5 * IQR
RETURN
    IF(
        'Table'[Price] < Lower_Bound || 'Table'[Price] > Upper_Bound,
        "Outlier",
        "Normal"
    )
This column let me filter the dataset into normal-priced properties and outliers, which makes the pricing trends in each group easier to analyze.
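The 1.5 × IQR rule in the formula above can also be sketched in Python. This is an illustration under assumptions: `iqr_bounds` and `label_prices` are hypothetical helpers, the sample prices are made up, and `method="inclusive"` is chosen because it should correspond to the interpolation used by PERCENTILE.INC.

```python
import statistics

def iqr_bounds(prices):
    """Return (lower, upper) fences from inclusive quartiles and the 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def label_prices(prices):
    """Label each price 'Outlier' if it falls outside the fences, else 'Normal'."""
    lower, upper = iqr_bounds(prices)
    return ["Outlier" if p < lower or p > upper else "Normal" for p in prices]

# Illustrative prices only; these are not rows from the dataset.
sample_prices = [3_000_000, 4_000_000, 4_500_000, 5_000_000, 6_000_000, 13_000_000]

print(iqr_bounds(sample_prices))
print(label_prices(sample_prices))
```

The fences are computed once over the whole list, mirroring the way ALL('Table') makes the quartiles ignore the current row in the calculated column.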

Normal Price ranges:
Here, the overall price range is narrower, with the maximum price at 8,960,000.
The median drops slightly to 4,375,000.
The IQR is similar to the image above, with most properties clustered in this middle range.
The outliers are less extreme.

Outliers:
Here, we see a significantly higher median price of 10,150,000.
The distribution is skewed toward higher prices, with fewer lower-priced properties.
The upper whisker extends to 13,300,000, with a concentration of data points around 10M - 12M.
The narrower IQR suggests that prices are more tightly grouped in the upper range.
Key Takeaways
Filtering out higher prices (Normal Price ranges) reduces variability but retains the core pricing pattern.
Filtering to only high-priced properties (Outliers) reveals that premium homes are clustered within 9M–13M, suggesting a distinct market segment.
Business Insight: If targeting the mid-range market, focus on the 4M–6M price range (seen in Images 1 & 2). If targeting luxury buyers, pricing strategies should align with the 9M–13M segment (Image 3).
Examining the Relationship Between Price and Property Features
Scope:
a. Price vs. Area (Scatter Plot Analysis)
b. Average Price by Number of Bathrooms
c. Average Price by Number of Bedrooms
d. Parking Spaces vs. Property Price (Scatter Plot Analysis)
e. Average Price by Number of Stories
a. Price vs. Area (Scatter Plot Analysis)
Key Steps:
Constructed a scatter plot to visualize the relationship between Area and Price.
Calculated the correlation coefficient to measure the strength of this relationship.

Correlation Coefficient of 0.54:
Correlation_Coefficient =
VAR MeanX = AVERAGE('Housing'[area])
VAR MeanY = AVERAGE('Housing'[price])
VAR SumXY = SUMX('Housing', ('Housing'[area] - MeanX) * ('Housing'[price] - MeanY))
VAR SumX2 = SUMX('Housing', ('Housing'[area] - MeanX) ^ 2)
VAR SumY2 = SUMX('Housing', ('Housing'[price] - MeanY) ^ 2)
RETURN
DIVIDE(SumXY, SQRT(SumX2 * SumY2))
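For reference, the measure above follows the standard Pearson formula, with x as area and y as price. The symbols x_i, y_i and the bars for means are notation introduced here for illustration:

```latex
r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2}}
```

MeanX and MeanY correspond to \bar{x} and \bar{y}, SumXY is the numerator, and SumX2 and SumY2 are the two sums under the square root.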
Insights:
A correlation coefficient of 0.54 indicates a moderate positive relationship between Area and Price: price tends to increase with area, but not perfectly.
The trendline confirms that larger areas are generally associated with higher prices.
Data points are moderately scattered around the trendline, aligning with the correlation value.
Some outliers exist - properties with small areas but high prices.
Other factors like location, amenities, and property type may also influence prices.
b. Average Price by Number of Bathrooms

Insights:
Properties with 4 bathrooms have the highest average price (~12.3M).
As the number of bathrooms decreases, the average price drops:
3 bathrooms → 7.3M
2 bathrooms → 6.2M
1 bathroom → 4.2M
Average price rises with every additional bathroom, indicating a clear positive relationship between bathroom count and property price.
Buyers likely value additional bathrooms, making properties with more bathrooms more expensive.
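The averages behind a chart like this are a simple group-and-average step. Below is a minimal Python sketch of that step; the `average_price_by` helper, the field names, and the sample rows are all assumptions for illustration and do not come from the report.

```python
from collections import defaultdict

def average_price_by(rows, key):
    """Average the 'price' field for each distinct value of `key`."""
    totals = defaultdict(lambda: [0.0, 0])  # value -> [sum of prices, row count]
    for row in rows:
        bucket = totals[row[key]]
        bucket[0] += row["price"]
        bucket[1] += 1
    return {value: total / count for value, (total, count) in sorted(totals.items())}

# Illustrative rows only; these are not rows from the dataset.
sample_rows = [
    {"bathrooms": 1, "price": 4_000_000},
    {"bathrooms": 1, "price": 4_400_000},
    {"bathrooms": 2, "price": 6_200_000},
    {"bathrooms": 3, "price": 7_300_000},
]

print(average_price_by(sample_rows, "bathrooms"))
```

The same helper works for any categorical feature by changing `key`. Reporting the row count next to each average would also show when a category rests on only a handful of properties.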
c. Average Price by Number of Bedrooms

Insights:
Properties with 5 bedrooms have the highest average price (~5.8M).
Prices decrease as the number of bedrooms reduces:
4 bedrooms → 5.7M
3 bedrooms → 5.0M
2 bedrooms → 3.6M
1 bedroom → 2.7M
Larger properties (more bedrooms) tend to be more expensive.
The price difference between 5-bedroom and 1-bedroom properties is significant.
d. Parking Spaces vs. Property Price (Scatter Plot Analysis)
Steps:
Constructed a scatter plot to analyze parking impact on price.
Calculated the Pearson correlation coefficient (r).

Correlation coefficient of 0.38:
Parking_Price_Correlation =
VAR xMean = AVERAGE(HouseData[Parking])
VAR yMean = AVERAGE(HouseData[Price])
VAR numerator =
    SUMX(
        HouseData,
        (HouseData[Parking] - xMean) * (HouseData[Price] - yMean)
    )
VAR denominator =
    SQRT(
        SUMX(HouseData, (HouseData[Parking] - xMean) ^ 2) *
        SUMX(HouseData, (HouseData[Price] - yMean) ^ 2)
    )
RETURN
    IF(denominator = 0, BLANK(), numerator / denominator)
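The measure above can be mirrored step by step in Python as a sanity check. This is a sketch, not the report's implementation: `pearson_r` is a hypothetical helper, returning None stands in for BLANK(), and the sample values are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation; returns None when either variable has zero variance."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = math.sqrt(
        sum((x - x_mean) ** 2 for x in xs) * sum((y - y_mean) ** 2 for y in ys)
    )
    return None if denominator == 0 else numerator / denominator

# Illustrative values only; these are not rows from the dataset.
parking = [0, 1, 2, 3]
prices = [4_000_000, 5_000_000, 6_000_000, 6_000_000]

print(round(pearson_r(parking, prices), 3))
```

The zero-denominator guard matters for a column such as parking, where a filtered subset may contain a single distinct value and the correlation is then undefined.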

Insights:
A correlation coefficient of r = 0.38 indicates a weak-to-moderate positive correlation between parking spaces and price.
Observations from the scatter plot:
Properties with more parking spaces tend to have higher prices.
The highest price (~6M) is associated with 2–3 parking spaces.
Properties with 0 parking spaces have the lowest price (~4M).
A plateau effect occurs: adding more than 2 parking spaces does not significantly increase the property price.
Parking influences price, but its effect is weaker than bathrooms or bedrooms.
The correlation value (0.38) suggests that while parking is important, other factors may play a bigger role.
e. Average Price by Number of Stories

Insights:
Multi-story properties (~7.2M) are significantly more expensive than single-story properties (~4.2M).
Multi-story homes tend to be larger, offering more living space, which justifies higher prices.
Buyers likely associate additional floors with better design, privacy, and more usable space.
Key Takeaways:
Bathrooms have the strongest impact on property prices - higher numbers significantly increase prices.
Bedrooms also affect price, but the increase is more gradual.
Parking spaces moderately influence price, but their impact diminishes beyond 2 spaces.
Multi-story properties are priced higher due to increased living space.
Assessing the Impact of Amenities on Property Prices
Scope:
a. Impact of Furnishing Status on Prices
b. Price Increase Due to Amenities
c. Furnishing Status & Bedroom Count
d. Furnishing Status & Bathroom Count
To analyze how different amenities and furnishing statuses affect property prices, I unpivoted these features in Power Query, creating two new columns: Amenities and Has Amenities.
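To make the unpivot step concrete, here is a small Python sketch of the same reshaping. The `unpivot` helper, the `id` field, and the sample rows are hypothetical; only the output column names Amenities and Has Amenities follow the description above.

```python
def unpivot(rows, id_fields, value_fields):
    """Turn each wide row into one long row per field in `value_fields`."""
    long_rows = []
    for row in rows:
        for field in value_fields:
            long_row = {name: row[name] for name in id_fields}
            long_row["Amenities"] = field
            long_row["Has Amenities"] = row[field]
            long_rows.append(long_row)
    return long_rows

# Illustrative rows only; `id` is a made-up property identifier.
wide_table = [
    {"id": 1, "price": 6_000_000, "airconditioning": "yes", "basement": "no"},
    {"id": 2, "price": 4_200_000, "airconditioning": "no", "basement": "no"},
]

long_table = unpivot(wide_table, ["id", "price"], ["airconditioning", "basement"])
print(len(wide_table), "wide rows ->", len(long_table), "long rows")
```

After unpivoting, each property appears once per amenity column, so any counts or shares calculated on the long table need to count distinct properties rather than rows.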
KPIs:

a. Impact of Furnishing Status on Prices

Furnishing Status | Avg. Price (M) |
Furnished | 5.5M |
Semi-Furnished | 4.9M |
Unfurnished | 4.0M |
Key Insights:
Furnished homes command the highest prices (+37.5% vs. unfurnished).
Semi-furnished homes also sell at a premium (+22.5% vs. unfurnished).
Investing in furnishing can significantly boost property value.
b. Price Increase Due to Amenities

Amenity | Price with Amenity (M) | Price without Amenity (M) | % Increase |
Air Conditioning | 6.0M | 4.2M | +42.9% |
Basement | 5.2M | 4.5M | +15.6% |
Guest Room | 5.8M | 4.5M | +28.9% |
Hot Water Heating | 5.6M | 4.7M | +19.1% |
Main Road Proximity | 5.0M | 3.4M | +47.1% |
Preferred Area | 5.9M | 4.4M | +34.1% |
Key Insights:
Main road proximity has the highest price boost (+47.1%), likely reflecting the value of accessibility.
Air conditioning significantly increases value (+42.9%), highlighting the demand for comfort.
Guest rooms and preferred locations drive property value up (+28.9% and +34.1%, respectively).
Basements have the lowest effect (+15.6%), likely influenced by location-specific factors.
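The % Increase column is the uplift of the average price with an amenity over the average price without it. The short Python sketch below recomputes it from the rounded averages in the table; `pct_increase` is a hypothetical helper name, and the result depends on those rounded inputs.

```python
def pct_increase(with_amenity, without_amenity):
    """Percentage uplift of the price with an amenity over the price without it."""
    return (with_amenity - without_amenity) / without_amenity * 100

# Average prices in millions, copied from the amenities table above.
amenity_prices = {
    "Air Conditioning": (6.0, 4.2),
    "Basement": (5.2, 4.5),
    "Guest Room": (5.8, 4.5),
    "Hot Water Heating": (5.6, 4.7),
    "Main Road Proximity": (5.0, 3.4),
    "Preferred Area": (5.9, 4.4),
}

for name, (with_price, without_price) in amenity_prices.items():
    print(f"{name}: +{pct_increase(with_price, without_price):.1f}%")
```

These are differences between group averages, so they do not isolate the effect of a single amenity: homes with air conditioning may also be larger or better located.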
Key Takeaways
Top Amenities for Price Growth:
Main Road Proximity: +47.1% increase
Air Conditioning: +42.9% increase
Preferred Area: +34.1% increase
Guest Room: +28.9% increase
Impact of Furnishing:
Furnished: 5.5M (+37.5%)
Semi-Furnished: 4.9M (+22.5%)
Unfurnished: 4.0M
Availability of High-Value Amenities
Only 5.3% of homes have air conditioning, despite its high price premium.
14.3% of homes have main road access, making it a strong price driver.
3.9% of homes are in preferred areas, suggesting limited supply increases value.
Recommendations
For Sellers:
Highlight air conditioning, main road access, and premium locations in listings.
Consider semi-furnishing to increase value without full renovation costs.
For Investors:
Focus on preferred areas & main road properties for long-term appreciation.
Upgrade properties with guest rooms & air conditioning to maximize returns.
c. Furnishing Status & Bedroom Count
Objective:
Examine how furnishing affects property prices based on the number of bedrooms.

Key Observations:
Furnished Properties
Show the highest average prices overall.
The 6-bedroom category has the highest price (~7M).
Prices increase significantly as the bedroom count rises.
Semi-Furnished Properties
Prices are slightly lower than furnished properties.
The 5-bedroom category has the highest price (~5.5M).
The 4-bedroom and above categories show a strong price jump compared to smaller homes.
Unfurnished Properties
Have the lowest prices overall.
The 6-bedroom category has an outlier, showing a sharp price increase (~6.5M).
Smaller homes (1-3 bedrooms) have a much lower price range (~2M-4M).
The price gap increases with more bedrooms, meaning furnishing has a greater impact on larger homes.
Key Takeaways
Furnishing boosts property value, with fully furnished homes having the highest prices.
Larger homes (4+ bedrooms) gain the most value from furnishing, with sharp price increases.
Unfurnished properties generally have lower prices, except for the unexpected spike in 6-bedroom homes.
d. Furnishing Status & Bathroom Count
Objective:
Analyze how furnishing status and the number of bathrooms (1 to 4) together affect property price.

Key Observations:
Furnishing Status Impact on Price:
Furnished properties generally have higher average prices, particularly for 4-bathroom properties, which spike above 12M.
Semi-furnished and unfurnished properties show a similar pricing range, with 4-bathroom properties still being the most expensive but not as extreme as in the furnished category.
Effect of Number of Bathrooms on Price:
Across all furnishing statuses, properties with more bathrooms tend to have higher prices.
4-bathroom properties are the most expensive, especially in the furnished category (12M+).
1-bathroom properties are consistently the cheapest, with a notable price gap compared to properties with more bathrooms.
Price Distribution Across Furnishing Statuses:
Furnished homes show the most price variability, with 4-bathroom properties creating a large gap.
Semi-furnished and unfurnished properties have a more stable price range, with less extreme differences across bathroom counts.
Insights & Recommendations:
Luxury buyers tend to prefer fully furnished homes, and prices for furnished 4-bathroom properties run significantly higher than for other furnished homes.
If targeting mid-range buyers, semi-furnished and unfurnished properties show a stable price trend, making them attractive for affordability.
Developers and sellers should prioritize furnishing upgrades for high-end properties to maximize pricing potential, especially for larger homes.
Price Segmentation (Binning)
Objective:
Categorize properties into distinct price ranges to analyze market distribution.
Approach:
Define price bins (e.g., Low, Mid, High).
Use visualizations like histograms or pie charts to illustrate the distribution.
A custom Price Category column was created using M language after defining the bins:
= if [Price] = null then "Unknown"
else if [Price] < 2000000 then "Low"
else if [Price] >= 2000000 and [Price] < 5000000 then "Mid"
else "High"
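The same binning rule can be written in Python to check the boundary cases. This is a sketch for illustration: `price_category` is a hypothetical helper that mirrors the thresholds above, and the sample prices are made up.

```python
from collections import Counter

def price_category(price):
    """Map a price to 'Low', 'Mid', 'High', or 'Unknown' using the bins above."""
    if price is None:
        return "Unknown"
    if price < 2_000_000:
        return "Low"
    if price < 5_000_000:  # reached only when price >= 2,000,000
        return "Mid"
    return "High"

# Illustrative prices only, chosen to sit on the bin boundaries.
sample_prices = [1_500_000, 2_000_000, 4_999_999, 5_000_000, None]

print(Counter(price_category(p) for p in sample_prices))
```

Because each branch is reached only when the earlier ones fail, the explicit lower bound on the Mid bin is not strictly needed. Exactly 2,000,000 falls in Mid and exactly 5,000,000 falls in High.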
Classification:

Mid-Priced Houses (2M - 5M):
2046 houses (~62.57%) fall within this category, making it the dominant segment.
This suggests strong demand or a preference for mid-range housing.
High-Priced Houses (5M+):
1170 houses (~35.78%) belong to this category.
While significant, premium properties are less common than mid-priced ones.
Low-Priced Houses (<2M):
Only 54 houses (~1.65%) fall into this category.
This indicates a shortage of budget-friendly homes, making affordable housing a niche but underserved market.

Insights:
Number of Houses by Price Bin (Bar Chart)
The majority of houses fall within the 2,000,000 - 6,000,000 range.
Properties priced above 8,000,000 are rare, indicating limited high-end inventory.
Budget-friendly properties (0 - 2,000,000) are scarce.
Business Implications
Market Trends:
The mid-range price segment (2M-6M) is the most competitive and in demand.
Luxury properties (8M+) are uncommon, possibly due to lower demand or limited supply.
Investment Strategies:
Affordable Housing: Developers could focus on increasing inventory in the low-price segment (0-2M).
Luxury Market: Pricing strategies should be optimized to enhance the competitiveness of high-end homes.


