References

Alejandro Barredo Arrieta, Javier Del Ser, Natalia Díaz-Rodríguez. 2019. “Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI,” 67. https://arxiv.org/abs/1910.10045.

Anda, Bente, Dag Sjøberg, and Audris Mockus. 2009. “Variability and Reproducibility in Software Engineering: A Study of Four Companies That Developed the Same System.” Software Engineering, IEEE Transactions on 35 (July): 407–29. https://doi.org/10.1109/TSE.2008.89.

Anderson, Christopher, Joanna Anderson, Marcel van Assen, Peter Attridge, Angela Attwood, Jordan Axt, Molly Babel, et al. 2019. “Reproducibility Project: Psychology.” https://doi.org/10.17605/OSF.IO/EZCUJ.

Ardia, David, Lennart F. Hoogerheide, and Herman K. van Dijk. 2009. “AdMit.” The R Journal 1 (1): 25–30. https://doi.org/10.32614/RJ-2009-003.

“Ascent of Machine Learning in Medicine.” 2019. Nature Materials 18 (5): 407–7. https://doi.org/10.1038/s41563-019-0360-1.

Balan, Theodor, and Hein Putter. 2019. “FrailtyEM: An R Package for Estimating Semiparametric Shared Frailty Models.” Journal of Statistical Software, Articles 90 (7): 1–29. https://doi.org/10.18637/jss.v090.i07.

Baldi, P., P. Sadowski, and D. Whiteson. 2014. “Searching for Exotic Particles in High-Energy Physics with Deep Learning.” Nature Communications 5 (1): 4308. https://doi.org/10.1038/ncomms5308.

Baldominos, Alejandro, Iván Blanco, Antonio Moreno, Rubén Iturrarte, Óscar Bernárdez, and Carlos Afonso. 2018. “Identifying Real Estate Opportunities Using Machine Learning.” Applied Sciences 8 (November): 2321. https://doi.org/10.3390/app8112321.

Batista, Gustavo E. A. P. A., and Maria Carolina Monard. 2003. “An Analysis of Four Missing Data Treatment Methods for Supervised Learning.” Applied Artificial Intelligence 17 (5-6): 519–33. https://doi.org/10.1080/713827181.

Bernd Bischl, Jakob Bossek, Jakob Richter. 2018. “MlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions,” 23. https://arxiv.org/abs/1703.03373.

Biecek, Przemyslaw. 2018. “DALEX: Explainers for Complex Predictive Models in R.” Journal of Machine Learning Research 19 (84): 1–5. http://jmlr.org/papers/v19/18-416.html.

Biecek, Przemyslaw, and Marcin Kosinski. 2017. “archivist: An R Package for Managing, Recording and Restoring Data Analysis Results.” Journal of Statistical Software 82 (11): 1–28. https://doi.org/10.18637/jss.v082.i11.

Bischl, Bernd, Giuseppe Casalicchio, Matthias Feurer, Frank Hutter, Michel Lang, Rafael Mantovani, Jan van Rijn, and Joaquin Vanschoren. 2017. “OpenML Benchmarking Suites and the OpenML100,” August.

Bischl, Bernd, Michel Lang, Lars Kotthoff, Julia Schiffner, Jakob Richter, Erich Studerus, Giuseppe Casalicchio, and Zachary M. Jones. 2016a. “Mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5. http://jmlr.org/papers/v17/15-066.html.

———. 2016b. “Mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5. http://jmlr.org/papers/v17/15-066.html.

———. 2016c. “mlr: Machine Learning in R.” Journal of Machine Learning Research 17 (170): 1–5. http://jmlr.org/papers/v17/15-066.html.

Bischl, Bernd, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, and Michel Lang. 2017. “MlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions.” arXiv Preprint arXiv:1703.03373.

Boehmke, Bradley, and Jason Freels. 2017. “LearningCurve: An Implementation of Crawford’s and Wright’s Learning Curve Production Functions.” The Journal of Open Source Software 2 (May). https://doi.org/10.21105/joss.00202.

Bogucki, Wojciech, Tomasz Makowski, and Dominik Rafacz. 2020. “Feature-Engineering.” GitHub Repository. https://github.com/DominikRafacz/feature-engineering; GitHub.

Bono, Christine, L. Ried, Carole Kimberlin, and Bruce Vogel. 2007. “Missing Data on the Center for Epidemiologic Studies Depression Scale: A Comparison of 4 Imputation Techniques.” Research in Social & Administrative Pharmacy : RSAP 3 (April): 1–27. https://doi.org/10.1016/j.sapharm.2006.04.001.

Boughorbel S., El-Anbari M., Jarray F. 2017. “Optimal Classifier for Imbalanced Data Using Matthews Correlation Coefficient Metric.” PLOS One. https://doi.org/10.1371/journal.pone.0177678.

Broatch, Jennifer, Jennifer Green, and Andrew Karl. 2018. “RealVAMS: An R Package for Fitting a Multivariate Value-added Model (VAM).” The R Journal 10 (1): 22–30. https://doi.org/10.32614/RJ-2018-033.

Brown, Eric. 2019. “Tacmagic: Positron Emission Tomography Analysis in R.” The Journal of Open Source Software 4 (February): 1281. https://doi.org/10.21105/joss.01281.

Brown, Patrick, and Lutong Zhou. 2010. “MCMC for Generalized Linear Mixed Models with glmmBUGS.” The R Journal 2 (1): 13–17. https://doi.org/10.32614/RJ-2010-003.

Burns, David M., and Cari M. Whyne. 2018. “Seglearn: A Python Package for Learning Sequences and Time Series.” Journal of Machine Learning Research 19 (83): 1–7. http://jmlr.org/papers/v19/18-160.html.

Buuren, Stef Van. 2020. Mice: Multivariate Imputation by Chained Equations. https://cran.r-project.org/web/packages/mice/index.html.

Buuren, Stef van, and Karin Groothuis-Oudshoorn. 2011. “Mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://www.jstatsoft.org/article/view/v045i03.

———. 2020. Mice: Multivariate Imputation by Chained Equations. https://CRAN.R-project.org/package=mice.

Casalicchio, Giuseppe, Bernd Bischl, Dominik Kirchhoff, Michel Lang, Benjamin Hofner, Jakob Bossek, Pascal Kerschke, and Joaquin Vanschoren. 2019. OpenML: Open Machine Learning and Open Data Platform. https://CRAN.R-project.org/package=OpenML.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” CoRR abs/1603.02754. http://arxiv.org/abs/1603.02754.

Coeurjolly, J.-F., R. Drouilhet, P. Lafaye de Micheaux, and J.-F. Robineau. 2009. “asympTest: A Simple R Package for Classical Parametric Statistical Tests and Confidence Intervals in Large Samples.” The R Journal 1 (2): 26–30. https://doi.org/10.32614/RJ-2009-015.

Computing Machinery, Association for. 2018. “Artifact Review and Badging.” https://www.acm.org/publications/policies/artifact-review-badging.

Conway, Jennifer. 2018. “Artificial Intelligence and Machine Learning : Current Applications in Real Estate.” PhD thesis. https://dspace.mit.edu/bitstream/handle/1721.1/120609/1088413444-MIT.pdf.

Coyle, Jeremy, and Nima Hejazi. 2018. “Origami: A Generalized Framework for Cross-Validation in R.” The Journal of Open Source Software 3 (January): 512. https://doi.org/10.21105/joss.00512.

Daniel J. Stekhoven, Peter Bühlmann. 2011. “MissForest—Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1): 112–18. https://academic.oup.com/bioinformatics/article/28/1/112/219101.

Din, Allan, Martin Hoesli, and André Bender. 2001. “Environmental Variables and Real Estate Prices.” Urban Studies 38 (February). https://doi.org/10.1080/00420980120080899.

Dramiński, Michał, and Jacek Koronacki. 2018. “Rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery.” Journal of Statistical Software, Articles 85 (12): 1–28. https://doi.org/10.18637/jss.v085.i12.

Dray, Stéphane, and Anne-Béatrice Dufour. 2007. “The Ade4 Package: Implementing the Duality Diagram for Ecologists.” Journal of Statistical Software, Articles 22 (4): 1–20. https://doi.org/10.18637/jss.v022.i04.

Drummond, Chris. 2012. “Reproducible Research: A Dissenting Opinion.” In.

Eisner, D. A. 2018. “Reproducibility of Science: Fraud, Impact Factors and Carelessness.” Journal of Molecular and Cellular Cardiology 114 (January): 364–68. https://doi.org/10.1016/j.yjmcc.2017.10.009.

Elmenreich, Wilfried, Philipp Moll, Sebastian Theuermann, and Mathias Lux. 2018. “Making Computer Science Results Reproducible - a Case Study Using Gradle and Docker,” August. https://doi.org/10.7287/peerj.preprints.27082v1.

Escobar, Modesto, and Luis Martinez-Uribe. 2020. “Network Coincidence Analysis: The netCoin R Package.” Journal of Statistical Software, Articles 93 (11): 1–32. https://doi.org/10.18637/jss.v093.i11.

Fan, Gang-Zhi, Seow Eng Ong, and Hian Koh. 2006. “Determinants of House Price: A Decision Tree Approach.” Urban Studies 43 (November): 2301–16. https://doi.org/10.1080/00420980600990928.

Fenton, N. E., and S. L. Pfleeger. 1997. Software Metrics: A Rigorous & Practical Approach. International Thompson Press.

Fernández, Daniel Méndez, Daniel Graziotin, Stefan Wagner, and Heidi Seibold. 2019. “Open Science in Software Engineering.” CoRR abs/1904.06499. http://arxiv.org/abs/1904.06499.

Fernández, Daniel Méndez, Daniel Graziotin, Stefan Wagner, and Heidi Seibold. 2019. “Open Science in Software Engineering.” ArXiv abs/1904.06499.

Fiona M. Shrive, Hude Quan, Heather Stuart. 2006. “Dealing with Missing Data in a Multi-Question Depression Scale: A Comparison of Imputation Methods.” BMC Medical Research Methodology 6 (57). https://link.springer.com/article/10.1186/1471-2288-6-57.

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2018. “All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously.” arXiv E-Prints, January, arXiv:1801.01489. http://arxiv.org/abs/1801.01489.

Flach, Peter, Jose Hernandez-Orallo, and Cèsar Ferri. 2011. “A Coherent Interpretation of AUC as a Measure of Aggregated Classification Performance.” In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, 657–64.

Flach Peter, Ferri, Hernandez-Orallo Jose. 2011. “A Coherent Interpretation of AUC as a Measure of Aggregated Classification Performance.” Proceedings of the 28th International Conference on Machine Learning. https://icml.cc/2011/papers/385_icmlpaper.pdf.

Fomel, Sergey, Paul Sava, Ioan Vlad, Yang Liu, and Vladimir Bashkardin. 2013. “Madagascar: Open-Source Software Project for Multidimensional Data Analysis and Reproducible Computational Experiments.” Journal of Open Research Software 1 (November): e8. https://doi.org/10.5334/jors.ag.

Friedman, Jerome. 2000. “Greedy Function Approximation: A Gradient Boosting Machine.” The Annals of Statistics 29 (November). https://doi.org/10.1214/aos/1013203451.

Gao, Guangliang, Zhifeng Bao, Jie Cao, A. Qin, Timos Sellis, and Zhiang Wu. 2019. “Location-Centered House Price Prediction: A Multi-Task Learning Approach,” January. https://arxiv.org/abs/1901.01774.

Garrod, Guy D, and Kenneth G Willis. 1992. “Valuing Goods’ Characteristics: An Application of the Hedonic Price Method to Environmental Attributes.” Journal of Environmental Management 34 (1): 59–76. https://doi.org/10.1016/S0301-4797(05)80110-0.

Geitner, Robert, Robby Fritzsch, Jürgen Popp, and Thomas Bocklitz. 2019. “Corr2D: Implementation of Two-Dimensional Correlation Analysis in R.” Journal of Statistical Software, Articles 90 (3): 1–33. https://doi.org/10.18637/jss.v090.i03.

Gentleman, Robert, and Duncan Temple Lang. 2007. “Statistical Analyses and Reproducible Research.” Journal of Computational and Graphical Statistics 16 (1): 1–23.

Geramifard, Alborz, Christoph Dann, Robert H. Klein, William Dabney, and Jonathan P. How. 2015. “RLPy: A Value-Function-Based Reinforcement Learning Framework for Education and Research.” Journal of Machine Learning Research 16 (46): 1573–8. http://jmlr.org/papers/v16/geramifard15a.html.

Goodman, Steven N., Daniele Fanelli, and John P. A. Ioannidis. 2016a. “What Does Research Reproducibility Mean?” Science Translational Medicine 8 (341). https://doi.org/10.1126/scitranslmed.aaf5027.

———. 2016b. “What Does Research Reproducibility Mean?” Science Translational Medicine 8 (341): 341ps12–341ps12. https://doi.org/10.1126/scitranslmed.aaf5027.

Gosiewska, Alicja, and Przemyslaw Biecek. 2020. “Lifting Interpretability-Performance Trade-off via Automated Feature Engineering.” https://arxiv.org/abs/2002.04267.

Gosiewska, Alicja, Aleksandra Gacek, Piotr Lubon, and Przemyslaw Biecek. 2019. “SAFE Ml: Surrogate Assisted Feature Extraction for Model Learning.” http://arxiv.org/abs/1902.11035.

Guazzelli, Alex, Michael Zeller, Wen-Ching Lin, and Graham Williams. 2009. “PMML: An Open Standard for Sharing Models.” The R Journal 1 (1): 60–65. https://doi.org/10.32614/RJ-2009-010.

Gundersen, Odd Erik, and Sigbjørn Kjensmo. 2018. “State of the Art: Reproducibility in Artificial Intelligence.” https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17248.

Günther, Frauke, and Stefan Fritsch. 2010. “neuralnet: Training of Neural Networks.” The R Journal 2 (1): 30–38. https://doi.org/10.32614/RJ-2010-006.

Halstead, M. H. 1977. Elements of Software Science. Elsevier.

Hankin, Robin. 2007. “Introducing Untb, an R Package for Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity.” Journal of Statistical Software, Articles 22 (12): 1–15. https://doi.org/10.18637/jss.v022.i12.

Hastie, Trevor, and Rahul Mazumder. 2015a. “SoftImpute: Matrix Completion via Iterative Soft-Thresholded SVD.” https://CRAN.R-project.org/package=softImpute.

———. 2015b. SoftImpute: Matrix Completion via Iterative Soft-Thresholded SVD. https://CRAN.R-project.org/package=softImpute.

Herbold, Steffen. 2020. “Autorank: A Python Package for Automated Ranking of Classifiers.” Journal of Open Source Software 5 (48): 2173. https://doi.org/10.21105/joss.02173.

Heusser, Andrew C., Kirsten Ziman, Lucy L. W. Owen, and Jeremy R. Manning. 2018. “HyperTools: A Python Toolbox for Gaining Geometric Insights into High-Dimensional Data.” Journal of Machine Learning Research 18 (152): 1–6. http://jmlr.org/papers/v18/17-434.html.

Heyman, Axel, and Dag Sommervoll. 2019. “House Prices and Relative Location.” Cities 95 (September): 102373. https://doi.org/10.1016/j.cities.2019.06.004.

Holzinger, Andreas, Georg Langs, Helmut Denk, Kurt Zatloukal, and Heimo Müller. 2019. “Causability and Explainability of Artificial Intelligence in Medicine.” WIREs Data Mining and Knowledge Discovery 9 (4): e1312. https://doi.org/10.1002/widm.1312.

Hothorn, Torsten, Peter Bühlmann, Thomas Kneib, Matthias Schmid, and Benjamin Hofner. 2010. “Model-Based Boosting 2.0.” Journal of Machine Learning Research 11 (71): 2109–13. http://jmlr.org/papers/v11/hothorn10a.html.

Hu, Feng, and Hang Li. 2013. “A Novel Boundary Oversampling Algorithm Based on Neighborhood Rough Set Model: NRSBoundary-Smote.” Mathematical Problems in Engineering 2013 (November). https://doi.org/10.1155/2013/694809.

Hughes, Nathan, Richard Morris, and Melissa Tomkins. 2020. “PyEscape: A Narrow Escape Problem Simulator Package for Python.” Journal of Open Source Software 5 (47): 2072. https://doi.org/10.21105/joss.02072.

Hung, Ling-Hong, Daniel Kristiyanto, Sung Lee, and Ka Yee Yeung. 2016. “GUIdock: Using Docker Containers with a Common Graphics User Interface to Address the Reproducibility of Research.” PloS One 11 (April): e0152686. https://doi.org/10.1371/journal.pone.0152686.

Jadhav, Anil. 2020. “A Novel Weighted TPR-TNR Measure to Assess Performance of the Classifiers.” Expert Systems with Applications 152 (March): 113391. https://doi.org/10.1016/j.eswa.2020.113391.

Jas, Mainak, Titipat Achakulvisut, Aid Idrizović, Daniel Acuna, Matthew Antalek, Vinicius Marques, Tommy Odland, et al. 2020. “Pyglmnet: Python Implementation of Elastic-Net Regularized Generalized Linear Models.” Journal of Open Source Software 5 (47): 1959. https://doi.org/10.21105/joss.01959.

José M. Jerez, Pedro J. García-Laencina, Ignacio Molina. 2010. “Missing Data Imputation Using Statistical and Machine Learning Methods in a Real Breast Cancer Problem.” Elsevier 50: 105–15. https://doi.org/10.1016/j.artmed.2010.05.002.

Josse, Julie, and François Husson. 2016. “missMDA: A Package for Handling Missing Values in Multivariate Data Analysis.” Journal of Statistical Software 70 (1): 1–31. https://doi.org/10.18637/jss.v070.i01.

Kim, Donghoh, and Hee-Seok Oh. 2009. “EMD: A Package for Empirical Mode Decomposition and Hilbert Spectrum.” The R Journal 1 (1): 40–46. https://doi.org/10.32614/RJ-2009-002.

Kitzes, Justin, Daniel Turek, and Fatma Deniz. 2017. The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences. Univ of California Press.

Kossaifi, Jean, Yannis Panagakis, Anima Anandkumar, and Maja Pantic. 2019. “TensorLy: Tensor Learning in Python.” Journal of Machine Learning Research 20 (26): 1–6. http://jmlr.org/papers/v20/18-277.html.

Kowarik, Alexander, and Matthias Templ. 2016a. “Imputation with the R Package VIM.” Journal of Statistical Software 74 (7): 1–16. https://doi.org/10.18637/jss.v074.i07.

———. 2016b. “Imputation with the R Package VIM.” Journal of Statistical Software 74 (7): 1–16. https://doi.org/10.18637/jss.v074.i07.

———. 2016c. “Imputation with the R Package VIM.” Journal of Statistical Software 74 (7): 1–16. https://doi.org/10.18637/jss.v074.i07.

Kuhn, Max. 2008. “Building Predictive Models in R Using the Caret Package.” Journal of Statistical Software, Articles 28 (5): 1–26. https://doi.org/10.18637/jss.v028.i05.

Landau, William Michael. 2018. “The Drake R Package: A Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 3 (21). https://doi.org/10.21105/joss.00550.

Lau, Matthew, Thomas F. J.-M Pasquier, and Margo Seltzer. 2020. “Rclean: A Tool for Writing Cleaner, More Transparent Code.” Journal of Open Source Software 5 (46): 1312. https://doi.org/10.21105/joss.01312.

Law, Stephen. 2017. “Defining Street-Based Local Area and Measuring Its Effect on House Price Using a Hedonic Price Approach: The Case Study of Metropolitan London.” Cities 60 (February): 166–79. https://doi.org/10.1016/j.cities.2016.08.008.

LeVeque, Randall. 2009. “Python Tools for Reproducible Research on Hyperbolic Problems.” Computing in Science & Engineering 11 (January): 19–27. https://doi.org/10.1109/MCSE.2009.13.

Li, Xingguo, Tuo Zhao, Xiaoming Yuan, and Han Liu. 2015. “The Flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R.” Journal of Machine Learning Research 16 (18): 553–57. http://jmlr.org/papers/v16/li15a.html.

Lipton, Zachary Chase. 2016. “The Mythos of Model Interpretability.” CoRR abs/1606.03490. http://arxiv.org/abs/1606.03490.

Little, R. J. A., and D. B. Rubin. 2002. Statistical Analysis with Missing Data. Wiley Series in Probability and Mathematical Statistics. Probability and Mathematical Statistics. Wiley. http://books.google.com/books?id=aYPwAAAAMAAJ.

Liu, Rui, and Lu Liu. 2019. “Predicting housing price in China based on long short-term memory incorporating modified genetic algorithm.” Soft Computing, 1–10. https://doi.org/10.1007/s00500-018-03739-w.

Löfstedt, Tommy, Vincent Guillemot, Vincent Frouin, Edouard Duchesnay, and Fouad Hadj-Selem. 2018. “Simulated Data for Linear Regression with Structured and Sparse Penalties: Introducing Pylearn-Simulate.” Journal of Statistical Software, Articles 87 (3): 1–33. https://doi.org/10.18637/jss.v087.i03.

Markos, Angelos, Alfonso D’Enza, and Michel van de Velden. 2019. “Beyond Tandem Analysis: Joint Dimension Reduction and Clustering in R.” Journal of Statistical Software, Articles 91 (10): 1–24. https://doi.org/10.18637/jss.v091.i10.

Marwick, B. n.d. “Rrtools: Creates a Reproducible Research Compendium (2018).”

Marwick, Ben. 2016. “Computational Reproducibility in Archaeological Research: Basic Principles and a Case Study of Their Implementation.” Journal of Archaeological Method and Theory 24 (2): 424–50. https://doi.org/10.1007/s10816-015-9272-9.

Marwick, Ben, Carl Boettiger, and Lincoln Mullen. 2017. “Packaging Data Analytical Work Reproducibly Using R (and Friends).” The American Statistician 72 (1): 80–88. https://doi.org/10.1080/00031305.2017.1375986.

Matthias Templ, Andreas Alfons, Alexander Kowarik. 2020. VIM: Visualization and Imputation of Missing Values. https://cran.r-project.org/web/packages/VIM/index.html.

Matthias Templ, Peter Filzmoser, Alexander Kowarik. 2011. “Iterative Stepwise Regression Imputation Using Standard and Robust Methods.” http://file.statistik.tuwien.ac.at/filz/papers/CSDA11TKF.pdf.

Matthijs Meire, Dirk Van den Poel, Michel Ballings. 2016. ImputeMissings: Impute Missing Values in a Predictive Context. https://cran.r-project.org/web/packages/imputeMissings/index.html.

Mayer, Michael. 2019. “MissRanger: Fast Imputation of Missing Values.” https://CRAN.R-project.org/package=missRanger.

McCabe, T. J. 1976. “A Complexity Measure.” IEEE Transactions on Software Engineering 2 (4): 308–20.

McDermott, James, and Richard S. Forsyth. 2016. “Diagnosing a Disorder in a Classification Benchmark.” Pattern Recognition Letters 73: 41–43. https://doi.org/10.1016/j.patrec.2016.01.004.

McNutt, Marcia. 2014. “Journals Unite for Reproducibility.” Science 346 (6210): 679–79. https://doi.org/10.1126/science.aaa1724.

Melo, Vinícius Veloso de, and Wolfgang Banzhaf. 2018. “Automatic Feature Engineering for Regression Models with Machine Learning: An Evolutionary Computation and Statistics Hybrid.” Information Sciences 430-431: 287–313. https://doi.org/10.1016/j.ins.2017.11.041.

Mevik, Björn-Helge, and Ron Wehrens. 2007. “The Pls Package: Principal Component and Partial Least Squares Regression in R.” Journal of Statistical Software, Articles 18 (2): 1–23. https://doi.org/10.18637/jss.v018.i02.

Meyer, David, Evgenia Dimitriadou, Kurt Hornik, Andreas Weingessel, and Friedrich Leisch. 2019. E1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. https://CRAN.R-project.org/package=e1071.

Mi, Xuefei, Tetsuhisa Miwa, and Torsten Hothorn. 2009. “New Numerical Algorithm for Multivariate Normal Probabilities in Package mvtnorm.” The R Journal 1 (1): 37–39. https://doi.org/10.32614/RJ-2009-001.

Michel Lang, Jakob Richter, Bernd Bischl. 2020. Mlr3: Machine Learning in R - Next Generation. https://cran.r-project.org/web/packages/mlr3/index.html.

Molnar, Christoph. 2019a. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/simple.html.

———. 2019b. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/simple.html.

———. 2018. “Iml: An R Package for Interpretable Machine Learning.” Journal of Open Source Software 3 (June): 786. https://doi.org/10.21105/joss.00786.

Murdoch, Kumbier, Singh. 2018. “Interpretable Machine Learning: Definitions, Methods, and Applications,” 2. https://arxiv.org/pdf/1901.04592.pdf.

Musil, Carol, Camille Warner, Piyanee Klainin-Yobas, and Susan Jones. 2002. “A Comparison of Imputation Techniques for Handling Missing Data.” Western Journal of Nursing Research 24 (December): 815–29. https://doi.org/10.1177/019394502762477004.

Nordh, Jerker. 2017. “pyParticleEst: A Python Framework for Particle-Based Estimation Methods.” Journal of Statistical Software 78 (3). https://doi.org/10.18637/jss.v078.i03.

Obadia, Yohan. 2017. “The Use of KNN for Missing Values.” https://towardsdatascience.com/the-use-of-knn-for-missing-values-cf33d935c637.

Obermeyer, Ziad, and Ezekiel J. Emanuel. 2016. “Predicting the Future - Big Data, Machine Learning, and Clinical Medicine.” The New England Journal of Medicine 375 (13): 1216–9. https://doi.org/10.1056/NEJMp1606181.

Özalp, Ayşe, and Halil Akinci. 2017. “The Use of Hedonic Pricing Method to Determine the Parameters Affecting Residential Real Estate Prices.” Arabian Journal of Geosciences 10 (December). https://doi.org/10.1007/s12517-017-3331-3.

Pandey. 2019. “Interpretable Machine Learning: Extracting Human Understandable Insights from Any Machine Learning Model,” April. https://towardsdatascience.com/interpretable-machine-learning-1dec0f2f3e6b.

Park, Byeonghwa, and Jae Bae. 2015. “Using machine learning algorithms for housing price prediction: The case of Fairfax County, Virginia housing data.” Expert Systems with Applications 42 (April). https://doi.org/10.1016/j.eswa.2014.11.040.

Patil, Prasad, Roger D. Peng, and Jeffrey T. Leek. 2016. “A Statistical Definition for Reproducibility and Replicability.” Science. https://doi.org/10.1101/066803.

Patricia Arroba, Marina Zapater, José L. Risco-Martín. 2015. “Enhancing Regression Models for Complex Systems Using Evolutionary Techniques for Feature Engineering.” Journal of Grid Computing 13: 409–23. https://doi.org/10.1007/s10723-014-9313-8.

Peng, Roger D. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–7. https://doi.org/10.1126/science.1213847.

Piccolo, Stephen R., and Michael B. Frampton. 2016. “Tools and Techniques for Computational Reproducibility.” GigaScience 5 (1). https://doi.org/10.1186/s13742-016-0135-4.

Pue, A. 2019. “Graph Transliterator: A Graph-Based Transliteration Tool.” Journal of Open Source Software 4 (44): 1717. https://doi.org/10.21105/joss.01717.

Raff, Edward. 2020. “Quantifying Independently Reproducible Machine Learning.” https://thegradient.pub/independently-reproducible-machine-learning/.

Rafiei, Mohammad H., and Hojjat Adeli. 2015. “A Novel Machine Learning Model for Estimation of Sale Prices of Real Estate Units.” Journal of Construction Engineering and Management 142 (August). https://doi.org/10.1061/(ASCE)CO.1943-7862.0001047.

Randeniya, TD, Gayani Ranasinghe, and Susantha Amarawickrama. 2017. “A model to Estimate the Implicit Values of Housing Attributes by Applying the Hedonic Pricing Method.” International Journal of Built Environment and Sustainability 4 (May). https://doi.org/10.11113/ijbes.v4.n2.182.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Rebecca R. Andridge, Roderick J. A. Little. 2010. “A Review of Hot Deck Imputation for Survey Non-Response.” https://doi.org/10.1111/j.1751-5823.2010.00103.x.

“Reproducibility in Science: A Guide to enhancing reproducibility in scientific results and writing.” 2014. http://ropensci.github.io/reproducibility-guide/.

Ripley, Brian. 2019. Class: Functions for Classification. https://CRAN.R-project.org/package=class.

Role, François, Stanislas Morbieu, and Mohamed Nadif. 2019. “CoClust: A Python Package for Co-Clustering.” Journal of Statistical Software 88 (7). https://doi.org/10.18637/jss.v088.i07.

Ronan, Tom, Shawn Anastasio, Zhijie Qi, Pedro Henrique S. Vieira Tavares, Roman Sloutsky, and Kristen M. Naegle. 2018. “OpenEnsembles: A Python Resource for Ensemble Clustering.” Journal of Machine Learning Research 19 (26): 1–6. http://jmlr.org/papers/v19/18-100.html.

Rosenberg, David E., Yves Filion, Rebecca Teasley, Samuel Sandoval-Solis, Jory S. Hecht, Jakobus E. van Zyl, George F. McMahon, Jeffery S. Horsburgh, Joseph R. Kasprzyk, and David G. Tarboton. 2020. “The Next Frontier: Making Research More Reproducible.” Journal of Water Resources Planning and Management 146 (6): 01820002. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001215.

Rubin, Donald B. 1976. “Inference and Missing Data.” Biometrika 63 (3): 581–92. https://doi.org/10.1093/biomet/63.3.581.

Rubin, Donald B. 1976. “Inference and Missing Data.” https://doi.org/10.1093/biomet/63.3.581.

Rudin, Cynthia. 2019. “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead.” Nature Machine Intelligence 1 (5): 206–15. https://doi.org/10.1038/s42256-019-0048-x.

Sakia, R. M. 1992. “The Box-Cox Transformation Technique: A Review.” Journal of the Royal Statistical Society. Series D (the Statistician) 41 (2): 169–78. http://www.jstor.org/stable/2348250.

Sayyad Shirabad, J., and T. J. Menzies. 2005. “The PROMISE Repository of Software Engineering Databases.” School of Information Technology and Engineering, University of Ottawa, Canada. http://promise.site.uottawa.ca/SERepository.

Selim, Hasan. 2009. “Determinants of House Prices in Turkey: Hedonic Regression Versus Artificial Neural Network.” Expert Syst. Appl. 36 (March): 2843–52. https://doi.org/10.1016/j.eswa.2008.01.044.

Sinz, Fabian, Joern-Philipp Lies, Sebastian Gerwinn, and Matthias Bethge. 2014. “Natter: A Python Natural Image Statistics Toolbox.” Journal of Statistical Software 61 (October): 1–34. https://doi.org/10.18637/jss.v061.i05.

Slezak, Peter, and Iveta Waczulikova. 2011. “Reproducibility and Repeatability.” Physiological Research / Academia Scientiarum Bohemoslovaca 60 (April): 203–4; author reply 204.

Soetaert, Karline, Thomas Petzoldt, and R. Woodrow Setzer. 2010. “Solving Differential Equations in R.” The R Journal 2 (2): 5–15. https://doi.org/10.32614/RJ-2010-013.

Stanisic, Luka, Arnaud Legrand, and Vincent Danjean. 2015. “An Effective Git and Org-Mode Based Workflow for Reproducible Research.” SIGOPS Oper. Syst. Rev. 49 (1): 61–70. https://doi.org/10.1145/2723872.2723881.

Stekhoven, Daniel J. 2013. MissForest: Nonparametric Missing Value Imputation Using Random Forest. https://CRAN.R-project.org/package=missForest.

Stekhoven, Daniel J., and Peter Buehlmann. 2012a. “MissForest - Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1): 112–18.

———. 2012b. “MissForest - Non-Parametric Missing Value Imputation for Mixed-Type Data.” Bioinformatics 28 (1): 112–18.

Stodden, Victoria, David H. Bailey, Jonathan M. Borwein, Randall J. LeVeque, William J. Rider, and William Stein. 2013. “Setting the Default to Reproducible: Reproducibility in Computational and Experimental Mathematics.” In.

Stodden, Victoria, Marcia McNutt, David H. Bailey, Ewa Deelman, Yolanda Gil, Brooks Hanson, Michael A. Heroux, John P. A. Ioannidis, and Michela Taufer. 2016. “Enhancing Reproducibility for Computational Methods.” Science 354 (6317): 1240–1. https://doi.org/10.1126/science.aah6168.

Stodden, Victoria, Jennifer Seiler, and Zhaokun Ma. 2018. “An Empirical Analysis of Journal Policy Effectiveness for Computational Reproducibility.” Proceedings of the National Academy of Sciences 115 (11): 2584–9. https://doi.org/10.1073/pnas.1708290115.

Strobl, Carolin, Torsten Hothorn, and Achim Zeileis. 2009. “Party on!” The R Journal 1 (2): 14–17. https://doi.org/10.32614/RJ-2009-013.

Su, Xiaoyuan, Taghi Khoshgoftaar, and Russ Greiner. 2008. “Using Imputation Techniques to Help Learn Accurate Classifiers.” Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI 1 (December): 437–44. https://doi.org/10.1109/ICTAI.2008.60.

Templ, Matthias, Alexander Kowarik, and Andreas Alfons. 2020. VIM: Visualization and Imputation of Missing Values. https://CRAN.R-project.org/package=VIM.

Therneau, Terry, and Beth Atkinson. 2019a. Rpart: Recursive Partitioning and Regression Trees. https://CRAN.R-project.org/package=rpart.

———. 2019b. Rpart: Recursive Partitioning and Regression Trees. https://CRAN.R-project.org/package=rpart.

Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, et al. 2016. “Jupyter Notebooks – a Publishing Format for Reproducible Computational Workflows.” Stand Alone 0 (Positioning and Power in Academic Publishing: Players, Agents and Agendas): 87–90. https://doi.org/10.3233/978-1-61499-649-1-87.

Titz, Johannes. 2020. “Mimosa: A Modern Graphical User Interface for 2-Level Mixed Models.” Journal of Open Source Software 5 (49): 2116. https://doi.org/10.21105/joss.02116.

Trevor Hastie, Rahul Mazumder. 2015. SoftImpute: Matrix Completion via Iterative Soft-Thresholded SVD. https://cran.r-project.org/web/packages/softImpute/index.html.

van Buuren, Stef, and Karin Groothuis-Oudshoorn. 2011a. “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://www.jstatsoft.org/v45/i03/.

———. 2011b. “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://www.jstatsoft.org/v45/i03/.

Vandewalle, Patrick, Jelena Kovacevic, and Martin Vetterli. 2009. “Reproducible Research in Signal Processing.” IEEE Signal Processing Magazine 26 (3): 37–47.

Vanschoren, Joaquin, Jan N. van Rijn, Bernd Bischl, and Luis Torgo. 2013. “OpenML: Networked Science in Machine Learning.” SIGKDD Explorations 15 (2): 49–60. https://doi.org/10.1145/2641190.2641198.

Velez, White, D. R., and J. H Moore. 2007. “A Balanced Accuracy Function for Epistasis Modeling in Imbalanced Datasets Using Multifactor Dimensionality Reduction.” Genetic Epidemiology, no. 31: 306–15. https://doi.org/10.1002/gepi.20211.

Wilhelm, Stefan, and B. G. Manjunath. 2010. “tmvtnorm: A Package for the Truncated Multivariate Normal Distribution.” The R Journal 2 (1): 25–29. https://doi.org/10.32614/RJ-2010-005.

Wright, Marvin N., Stefan Wager, and Philipp Probst. 2020. Ranger: A Fast Implementation of Random Forests. https://CRAN.R-project.org/package=ranger.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1): 1–17. https://doi.org/10.18637/jss.v077.i01.

Xia, Xiao-Qin, Michael McClelland, and Yipeng Wang. 2010. “PypeR, a Python Package for Using R in Python.” Journal of Statistical Software, Code Snippets 35 (2): 1–8. http://www.jstatsoft.org/v35/c02.

Yuan, Lester. 2007. “Maximum Likelihood Method for Predicting Environmental Conditions from Assemblage Composition: The R Package Bio.infer.” Journal of Statistical Software, Articles 22 (3): 1–20. https://doi.org/10.18637/jss.v022.i03.

Zhao, Tuo, Han Liu, Kathryn Roeder, John Lafferty, and Larry Wasserman. 2012. “The Huge Package for High-Dimensional Undirected Graph Estimation in R.” Journal of Machine Learning Research 13 (37): 1059–62. http://jmlr.org/papers/v13/zhao12a.html.

Zwicker, David. 2020. “Py-Pde: A Python Package for Solving Partial Differential Equations.” Journal of Open Source Software 5 (48): 2158. https://doi.org/10.21105/joss.02158.