This module introduces the key concepts required to undertake rigorous and valid data analysis. Students learn processes for collecting, manipulating and cleaning data, while gaining experience in judging the quality of data sources. The module covers statistical analysis in data science, including correlation, inferential statistics and regression, and how to apply these methods in a programming environment. Relational databases and SQL, along with other database paradigms such as NoSQL, are covered as ways of storing and accessing data. A key aim of the module is to solve complex problems and deliver insights about multi-dimensional data.
Module learning outcomes
Distinguish between different types of data that are generated in science, engineering and design, and employ strategies for ensuring data quality.
Retrieve data from a variety of different data sources in a variety of different formats.
Apply inferential statistics and statistical procedures to test hypotheses about features and relationships within data sets.
Use appropriate visualisations to present and explore data sets.
Use databases, both relational and of other paradigms, to store and query data.
Identify the ethical concerns regarding the provenance of data, the privacy of individuals, and the impact data analytics can have on society, and apply topics from the code of ethics of a professional data protection body.
Indicative assessment
Task: Essay/coursework
% of module mark: 100
Special assessment rules
None
Indicative reassessment
Task: Essay/coursework
% of module mark: 100
Module feedback
Feedback is provided through work in practical sessions, and after the final assessment as per normal University guidelines.
Indicative reading
*** Spiegelhalter, D., The Art of Statistics: Learning from Data, Pelican, 2019.
*** VanderPlas, J., Python Data Science Handbook: Essential Tools for Working with Data, O’Reilly, 2016.
** Igual, L. and Seguí, S., Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications, Springer, 2017.