For our project, we explored two wine datasets. The first, “Red Wine Quality”, contains physicochemical measurements such as alcohol content and acidity, all numerical, taken from samples of red wine from Portugal. Its “quality” variable is a sensory score describing the overall quality of each wine on a scale from 0 to 10, with 10 being the best score a wine can receive. For this dataset, we wanted to find out which inputs contribute to a wine's quality score. The second dataset, “Wine Reviews”, records the “points” awarded to each wine in reviews from a site called “WineEnthusiast”. Wines are rated on a scale from 0 to 100, with 100 being the best possible score. Each record also includes attributes such as the country where the wine was made, its price, and its winery. From this dataset, we wanted to learn where the “best” wines come from.
Red Wine Quality: https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009/kernels
Wine Reviews: https://www.kaggle.com/zynicide/wine-reviews
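
As a rough illustration of the two questions above, the sketch below shows one way they could be approached: ranking each numeric input by its Pearson correlation with "quality", and averaging "points" per country. It is a minimal sketch using only the Python standard library, not the analysis we ran. The column names ("quality", "country", "points") are assumed to match the CSV headers in the Kaggle downloads, and the helper names (pearson, quality_correlations, mean_points_by_country, load_rows) are made up for this example. Note that a strong correlation only suggests that an input is related to quality; it does not by itself show that the input contributes to it.

```python
import csv
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def quality_correlations(rows, target="quality"):
    """Correlate every other column with the target column.

    `rows` is a list of dicts (e.g. from csv.DictReader) whose values are
    all numeric strings. Returns (column, r) pairs, strongest |r| first.
    """
    ys = [float(r[target]) for r in rows]
    results = []
    for col in rows[0]:
        if col == target:
            continue
        xs = [float(r[col]) for r in rows]
        results.append((col, pearson(xs, ys)))
    return sorted(results, key=lambda pair: abs(pair[1]), reverse=True)


def mean_points_by_country(rows, min_reviews=1):
    """Average 'points' per country, highest first.

    Rows with a blank country or blank points are skipped. Countries with
    fewer than `min_reviews` reviews are dropped, since an average over a
    handful of reviews is noisy.
    """
    by_country = defaultdict(list)
    for r in rows:
        if r.get("country") and r.get("points"):
            by_country[r["country"]].append(float(r["points"]))
    averages = [
        (country, mean(points))
        for country, points in by_country.items()
        if len(points) >= min_reviews
    ]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


def load_rows(path):
    """Read a CSV file into a list of dicts; `path` is a local file you supply."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

With the two CSV files downloaded from the links above and saved locally (the file names depend on the download), quality_correlations(load_rows(path_to_red_wine_csv)) would list the inputs most strongly correlated with quality, and mean_points_by_country(load_rows(path_to_reviews_csv), min_reviews=100) would rank countries by average points. The threshold of 100 is an arbitrary choice for illustration.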