Statistics coursework: Comparing the times taken to find words in dictionaries
Plan – FUTURE TENSE – WHAT WILL YOU DO?
Aim
The aim is to discover how gender and language affect how quickly students find words, by asking two different year groups to find 5 words in French, German and English dictionaries.
Hypothesis 1
The English times will be quicker than the others because English is our main language, so we have more knowledge and understanding of it, and because we have been using English dictionaries for the longest.
Hypothesis 2
The Year 10s’ times will be quicker than the Year 7s’ because they have had more experience with dictionaries and, being older, are likely to have better vocabulary skills.
Data Collection
…
The data will also be treated as discrete, even though time is continuous, because we are using a stopwatch which only shows the first three decimal places. The data will then be recorded in a table and afterwards put together on a data sheet in Excel. I will use a sample because random sampling keeps the investigation fair and avoids bias in the results, which means the sample should give a more appropriate representation of the whole population. I will get this sample by using a random number generator that I will create in Excel with the formula =INT(RAND()*POPULATION+2). In the formula, ‘INT’ rounds the number down to a whole number and ‘RAND’ generates a random number between 0 and 1. ‘POPULATION’ is a placeholder rather than a fixed number, because the population size changes each time, and the ‘+2’ is there because the data does not start in cell 1, it starts in cell …
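As a rough illustration, the same selection could be sketched in Python instead of Excel. This is only a sketch under assumptions: the function names, the population size of 120 in the usage note and the default first row of 2 (taken from the ‘+2’ in the formula) are made up for the example and are not part of the plan.

```python
import math
import random


def random_row(population, first_row=2, rng=random):
    """Mirror =INT(RAND()*POPULATION+2): pick one row at random.

    rng.random() gives a number from 0 up to (but not including) 1,
    like RAND(), and math.floor rounds down, like INT for positive numbers.
    The result lies between first_row and first_row + population - 1.
    """
    return math.floor(rng.random() * population + first_row)


def pick_rows(population, sample_size=31, spares=9, seed=None):
    """Pick sample_size rows plus some spare rows, with no row chosen twice.

    The defaults (31 used, 9 spare, 40 in total) follow the plan;
    everything else here is an assumption for illustration.
    """
    needed = sample_size + spares
    if needed > population:
        raise ValueError("population is too small for the sample and spares")

    rng = random.Random(seed)
    chosen = []
    while len(chosen) < needed:
        row = random_row(population, rng=rng)
        if row not in chosen:  # skip a row the generator has already picked
            chosen.append(row)

    # The first 31 rows are the sample; the rest are kept back as replacements.
    return chosen[:sample_size], chosen[sample_size:]
```

For example, `sample, spare = pick_rows(120)` would give 31 row numbers to use and 9 to keep in reserve, all between 2 and 121 for a hypothetical population of 120.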
This is because the computer, not us, will be picking the numbers at random. 40 pieces of data will be collected and we will use 31 of them, which leaves 9 spare pieces of data that we can use if there are any issues. A sample of 31 also counts as a large sample because it is above 30. One issue we could face is missing results, which we would replace with the extra data to keep everything fair; this will also save time because we won’t have to run the random number generator again. The generator could also pick the same piece of data twice, in which case we would replace the repeat with one of the extra pieces that we collected.
Data Analysis
To present my results, I am going to use box plots for every hypothesis because they give a clear picture of what I am trying to find out and they show the median, which is an average of our results that we can compare between groups. I will also use the interquartile range because it shows how spread out the middle 50% of the times are, so I can compare how consistent each group is.
Potential issues for data