SIT112 Data Science Concepts
Task Description
There are two main tasks for this assignment:
• Construction of the data dictionary and
• Programming tasks to perform data analysis and descriptive analytics.
Construction of the Data Dictionary
For a data scientist, the first and most crucial task after obtaining a dataset is to gain a good understanding of the data. This includes examining the data attributes (or, equivalently, data fields), seeing what they look like, identifying the data type of each field, and, from this information, determining suitable analysis tools. A systematic approach to this process, as we have learned from the lectures and practical sessions, is to construct a data dictionary for the dataset.
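As a rough illustration of the inspection described above, the following Python sketch drafts one attribute-dictionary entry per field, recording a guessed data type, the number of missing values, the number of distinct values, and an example value. The sample data, the column names, and the helper names infer_type and build_attribute_dictionary are all hypothetical and are not taken from the assignment dataset or notebook. Treat the output as a starting point to review by hand, not as a finished dictionary.

```python
import csv
import io

# Hypothetical sample data; the real assignment dataset is not shown here.
SAMPLE_CSV = """id,age,city,income
1,34,Melbourne,55000.5
2,,Geelong,48000
3,29,Melbourne,
"""


def infer_type(values):
    """Guess a simple data type from the non-empty string values of a column."""
    present = [v for v in values if v != ""]
    if not present:
        return "unknown"
    # Try the strictest type first: every value must parse for the type to apply.
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for v in present:
                caster(v)
            return name
        except ValueError:
            continue
    return "text"


def build_attribute_dictionary(csv_text):
    """Return one summary dict per attribute (column) of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for field in reader.fieldnames or []:
        values = [row[field] for row in rows]
        present = [v for v in values if v != ""]
        entries.append({
            "attribute": field,
            "data_type": infer_type(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "example": present[0] if present else "",
        })
    return entries


# Print a draft entry for each attribute in the sample data.
for entry in build_attribute_dictionary(SAMPLE_CSV):
    print(entry)
```

The type guess is deliberately simple: a column counts as integer or float only if every non-empty value parses as such, and everything else falls back to text. Dates, categorical codes stored as numbers, and units still need to be identified manually.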
You are required to prepare two sheets in your data dictionary Excel file:
• Dataset description
• Attribute dictionary
This task is worth 35 marks in total. The dataset description sheet is worth 5 marks. The attribute dictionary is worth 30 marks, with each correct attribute specification worth 2.5 marks. Name your solution file [YourID]_datadictionary.xls and submit it.
Programming Task
A Python Jupyter Notebook file, assignment1_notebook.ipynb, has been prepared for this task. Download the notebook, open it in Jupyter, and follow the instructions inside it to complete the task.