# Data Analysis - Learn statistics and more

## Data analysis – an introduction

Data analysis involves more than running a statistical test. You need to collect your data and arrange them so that you can explore them effectively, so there is an element of data management. The way you store your data affects your ability to explore them, carry out further analyses and add to your dataset.

Before you collect any data you need to plan your strategy. A plan helps you collect the right data to answer your particular questions and ensures that you set out these data in an appropriate manner. This ideal situation, having a plan before you start, is not always achieved, and sometimes you'll already have some data and need to find out something from them. Looking over a dataset to see what you can find is called data mining: you explore the dataset and try to pick out any patterns that may exist. Many of the approaches used in data mining are also used when you do have a plan, so there is some overlap. However, if you have a particular question but no plan, you may find that the data you've got cannot answer that question. So, it is better to start with a plan.

Statistics can be used to help summarize data, for example by finding averages for logical groups. In fact, a statistical summary is usually a good starting point for exploring any dataset. Allied to the basic statistical summary is the business of visualizing your data. Making charts and graphs that summarize your dataset is an important aspect of data analysis. Graphs are usually more readily understood than tables of numbers, particularly by people who are not familiar with the data. When it comes to presenting your results to others, a graphical approach is essential.
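As a rough illustration of summarizing by logical groups, the sketch below computes the count, mean and standard deviation of one measurement for each group. The dataset, the column names (`site`, `height_cm`) and the helper `summarize_by_group` are all made up for this example; it uses only the Python standard library and assumes the data are stored with one observation per row.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: one row per observation, with a grouping column.
records = [
    {"site": "north", "height_cm": 12.0},
    {"site": "north", "height_cm": 15.0},
    {"site": "north", "height_cm": 18.0},
    {"site": "south", "height_cm": 20.0},
    {"site": "south", "height_cm": 22.0},
    {"site": "south", "height_cm": 27.0},
]


def summarize_by_group(rows, group_key, value_key):
    """Return {group: {"n", "mean", "sd"}} for one numeric column.

    Illustrative helper, not part of any library. Each group needs at
    least two values, because the sample standard deviation is undefined
    for a single observation.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        name: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
        for name, values in groups.items()
    }


summary = summarize_by_group(records, "site", "height_cm")
for site, stats in summary.items():
    print(f"{site}: n={stats['n']} mean={stats['mean']:.2f} sd={stats['sd']:.2f}")
```

Storing one observation per row, with the group recorded in its own column, is what makes this kind of summary straightforward: adding a new site or a new measurement means adding rows, not restructuring the data.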

Statistical tests are used to help you make decisions about the patterns in your dataset (and in some cases to highlight patterns that were not obvious). There are many kinds of analysis that can be carried out. Knowing what sort of question you want to ask is an important step in working out what kind of statistical analysis you need to carry out.
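To make the idea of a statistical test concrete, here is a minimal sketch for one possible question: "do two groups differ in their mean?" It uses an exact permutation test, chosen here only because it needs nothing beyond the Python standard library; it is one of many possible tests, not a recommendation for every dataset. The two samples and the function name `permutation_test` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way of relabeling the pooled values into groups of
    the original sizes, and returns the observed absolute difference in
    means together with the proportion of relabelings whose difference
    is at least as large (the p-value). Only practical for small samples.
    """
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so floating-point ties count as "at least as large".
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return observed, extreme / count


# Hypothetical measurements from two groups.
north = [12.0, 15.0, 18.0]
south = [20.0, 22.0, 27.0]

observed_diff, p_value = permutation_test(north, south)
print(f"difference in means = {observed_diff:.2f}, p = {p_value:.3f}")
```

A different question (for example, whether two variables are related, or whether counts match an expected ratio) would call for a different test, which is why the question needs to be settled first.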

