I have a school project in which I need to predict Airbnb listing prices in 5 different cities.
I have about 400,000 records, each with 19 columns.
All the Python examples I can find online use data frames, but the goal of my project is to learn how to handle big data in a DBMS.
I was thinking about using an RDBMS, but I'm not sure whether it would be a good fit.
Any help or hints regarding a good DBMS for my project would be highly appreciated.
How to solve:
You can certainly accomplish this with any modern RDBMS, as 400,000 records is actually small data nowadays. 🙂
Since it's a school project, I assume you'll want to use a free edition of whichever system you choose. PostgreSQL and MySQL are free out of the gate, Microsoft SQL Server offers a free "Express" edition, and I believe even Oracle offers a free edition of its database system.
My somewhat subjective recommendation out of those choices would be PostgreSQL or Microsoft SQL Server, simply because both are in mainstream use and have plenty of documentation on how to set up, use, and query them. Both also offer enough features that I think they'll meet your data needs without a second thought.
The final criterion for deciding between them should be which programming language / tech stack you're most familiar with (if you need to use one as part of this project). If you're used to the Microsoft tech stack, then SQL Server is a fitting choice; in most other cases PostgreSQL will plug in just fine. If you're planning to use Python for the application side, it can probably plug into either database system just as easily.
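To give a feel for what "doing the work in the DBMS instead of a data frame" can look like from Python, here is a minimal sketch. It uses the standard-library `sqlite3` module purely as a stand-in so that it runs without installing a server; with PostgreSQL or SQL Server you would connect through a driver such as `psycopg2` or `pyodbc` instead. The `listings` table, its columns, and the rows are all made up for illustration and are not taken from the real dataset.

```python
import sqlite3

# sqlite3 is only a stand-in so the sketch runs without a database server.
# For PostgreSQL or SQL Server you would open the connection with a driver
# such as psycopg2 or pyodbc instead.
conn = sqlite3.connect(":memory:")

# Hypothetical, heavily trimmed schema (the real data has 19 columns).
conn.execute(
    """
    CREATE TABLE listings (
        id        INTEGER PRIMARY KEY,
        city      TEXT NOT NULL,
        room_type TEXT NOT NULL,
        price     REAL NOT NULL
    )
    """
)

# Made-up placeholder rows, not real Airbnb data.
rows = [
    (1, "city_a", "entire_home", 120.0),
    (2, "city_a", "private_room", 60.0),
    (3, "city_b", "entire_home", 200.0),
    (4, "city_b", "entire_home", 100.0),
]
conn.executemany("INSERT INTO listings VALUES (?, ?, ?, ?)", rows)
conn.commit()


def average_price_by_city(connection):
    """Let the database do the aggregation and return only the summary rows."""
    query = """
        SELECT city, COUNT(*) AS n, AVG(price) AS avg_price
        FROM listings
        GROUP BY city
        ORDER BY city
    """
    return connection.execute(query).fetchall()


summary = average_price_by_city(conn)
for city, n, avg_price in summary:
    print(f"{city}: {n} listings, average price {avg_price:.2f}")
```

The SQL itself should carry over to PostgreSQL or SQL Server largely unchanged. The main things that differ are the connection call and the parameter placeholder style (`?` here and in `pyodbc`, `%s` in `psycopg2`).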