Forecasting the Value of Real Estate Using Machine Learning Tools

2024; pp. 140-158
1 Lviv Polytechnic National University, Information Systems and Networks Department
2 Lviv Polytechnic National University, Information Systems and Networks Department

Accurate valuation of real estate plays a crucial role in buying and selling. We studied the existing applications used for real estate transactions and described their features, advantages and disadvantages. The developed model helps sellers obtain an estimate of their property from the parameters they enter, which can serve as a starting point for setting the final price. Real estate values have historically been computed mainly through manual data analysis and subjective estimates, which often results in errors and delays. Machine learning algorithms proved effective for this problem because they offer a number of advantages over manual estimation: a high level of accuracy, elimination of subjectivity and bias in estimates, time efficiency, cost reduction, use of geospatial data and substantiation of results. The process of creating a machine learning model is conditionally decomposed into four stages, which cover collecting the data, filtering, processing and supplementing it, dividing it into samples, and training the model on it. We considered the most popular regression algorithms, briefly described how they work, and described the metrics used to evaluate the quality of the values the models predict. Linear regression, decision tree, nearest neighbor, support vector and random forest algorithms were tested with their standard parameters. The coefficient of determination (R-squared) was chosen as the main metric. Comparing the coefficients of determination showed that the random forest algorithm gave the best result. After manual selection of hyperparameters for this algorithm, the mean absolute error of the predicted value is 8.49 %, and the median is 1.9 %.
The constructed model meets the established quality requirements and is ready for implementation in an information system for forecasting the value of real estate. The service will also be relevant for buyers, who will be able to search by their own parameters for properties offered at a favorable price.
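The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn as the library, uses a small synthetic dataset with hypothetical features (area, rooms, distance to the city centre) in place of the collected listings, and reads the reported error as an absolute percentage error per listing. The ranking it prints on synthetic data says nothing about the results reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the listings data (an assumption, not the paper's dataset).
rng = np.random.default_rng(0)
n = 500
area = rng.uniform(25, 150, n)        # floor area, m^2
rooms = rng.integers(1, 6, n)         # number of rooms
distance = rng.uniform(0.5, 25, n)    # distance to the city centre, km
X = np.column_stack([area, rooms, distance])
# Hypothetical price formula with an area/distance interaction plus noise.
price = 20000 + area * (1500 - 40 * distance) + 5000 * rooms + rng.normal(0, 5000, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# The five algorithm families named in the abstract, with library defaults.
models = {
    "linear regression": LinearRegression(),
    "decision tree": DecisionTreeRegressor(random_state=0),
    "nearest neighbors": KNeighborsRegressor(),
    "support vector regression": SVR(),
    "random forest": RandomForestRegressor(random_state=0),
}

# Fit each model and score it with the coefficient of determination.
r2_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    r2_scores[name] = r2_score(y_test, model.predict(X_test))

best_name = max(r2_scores, key=r2_scores.get)

# Absolute percentage error per test listing for the random forest,
# summarized by its mean and its median.
forest_pred = models["random forest"].predict(X_test)
ape = np.abs(forest_pred - y_test) / y_test * 100
mean_ape = ape.mean()
median_ape = np.median(ape)

for name, score in sorted(r2_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} R^2 = {score:.3f}")
print(f"best by R^2: {best_name}")
print(f"random forest error: mean {mean_ape:.2f} %, median {median_ape:.2f} %")
```

Reporting the median next to the mean is useful because a few badly mispriced listings can pull the mean up while most predictions stay close to the true value, which is consistent with the gap between 8.49 % and 1.9 % in the abstract.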
