Statistical Analysis of an IMDb Dataset

General Information:
Programming language used: Python
Source: Kaggle
Credibility: 10/10 (Kaggle scale)
Packages used: Pandas, Seaborn, NumPy, Matplotlib
Hypotheses:
- Budget and gross income are positively correlated.
- The production company's name and gross income are positively correlated.
Process:
- Started with some data cleaning, checking for duplicates (there were none).
- Handled NaN values by replacing them with zeros in numerical columns.
- Extracted the year from the date column into a separate column.
- Normalized the data and computed Pearson correlation coefficients between the columns.
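
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (name, released, budget, gross), the tiny sample frame, and the choice of min-max scaling for "normalized" are all assumptions made for the example.

```python
import pandas as pd

# Made-up sample standing in for the Kaggle data (column names are assumptions).
raw = pd.DataFrame({
    "name": ["A", "B", "C", "D", "D"],
    "released": ["1980-06-13", "1980-07-02", "1981-05-20", "1982-03-05", "1982-03-05"],
    "budget": [19.0, 4.5, None, 6.0, 6.0],
    "gross": [47.0, 58.9, 12.0, 83.5, 83.5],
})


def clean(df):
    """Drop duplicate rows, zero-fill numeric NaNs, and add a 'year' column."""
    df = df.drop_duplicates().copy()
    num_cols = df.select_dtypes(include="number").columns
    df[num_cols] = df[num_cols].fillna(0)
    # Unparseable dates become NaT instead of raising an error.
    df["year"] = pd.to_datetime(df["released"], errors="coerce").dt.year
    return df


def min_max_normalize(df, cols):
    """Scale the given columns to the [0, 1] range (assumed normalization method)."""
    out = df.copy()
    out[cols] = (out[cols] - out[cols].min()) / (out[cols].max() - out[cols].min())
    return out


cleaned = clean(raw)
normalized = min_max_normalize(cleaned, ["budget", "gross"])

# Pairwise Pearson correlation between the numeric columns of interest.
corr_matrix = normalized[["budget", "gross"]].corr(method="pearson")
r = corr_matrix.loc["budget", "gross"]
print(corr_matrix)
```

Pearson correlation is unchanged by linear rescaling, so the coefficient computed on the normalized columns equals the one computed on the cleaned, unscaled columns.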
Findings:
- Budget and gross income are positively correlated (hypothesis confirmed, with a correlation coefficient of 0.74).
- A company's name has no effect on gross income (contrary to hypothesis).
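
One possible way to test a text column such as the company name against gross income is sketched below. It assumes the names are first encoded as integer category codes, which may not be how the notebook did it; the column names and sample values are made up. Because the codes are arbitrary labels, a per-company mean is shown alongside as a sanity check.

```python
import pandas as pd

# Made-up sample; "company" and "gross" are assumed column names.
films = pd.DataFrame({
    "company": ["Alpha", "Beta", "Alpha", "Gamma", "Beta", "Gamma"],
    "gross": [10.0, 40.0, 35.0, 20.0, 15.0, 30.0],
})

# Encode each distinct company name as an integer code (alphabetical order).
films["company_code"] = films["company"].astype("category").cat.codes

# Pearson correlation between the encoded name and gross income.
r_company = films["company_code"].corr(films["gross"], method="pearson")

# Average gross per company, as a more direct look at any company effect.
mean_gross = films.groupby("company")["gross"].mean()

print(r_company)
print(mean_gross)
```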
Visualization:
Exported the data from the notebook and built a dashboard in Tableau.
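
A minimal sketch of the export step, assuming the cleaned data is written to a CSV file that Tableau then opens as a text-file data source. The file name, output location, and sample frame are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the cleaned DataFrame produced in the notebook (made-up values).
cleaned = pd.DataFrame({
    "name": ["A", "B"],
    "year": [1980, 1981],
    "budget": [19.0, 4.5],
    "gross": [47.0, 58.9],
})

# Hypothetical output path; any location Tableau can read works.
out_path = Path(tempfile.gettempdir()) / "imdb_cleaned.csv"

# index=False keeps the pandas row index out of the exported file.
cleaned.to_csv(out_path, index=False)
```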