Hobsons Scenario

Data Scientist Scenario Task
Background
Hobsons collects a large amount of data from prospective international students in order to understand why they choose a particular university. By doing so, Hobsons hopes to improve conversion rates.
The data is collected from online surveys and other possible sources. Variables can be quantitative (such as age) or qualitative (such as nationality, country of residence and gender). Because of the way the data is collected, it is stored in separate databases, and some students appear in two or more of them.
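Because the same student can appear in several databases, records usually need to be merged before any profiling is done. The sketch below is one possible way to do this, assuming each record is a dictionary and that the email address can serve as the shared key; the field names and the "first non-empty value wins" rule are illustrative assumptions, not Hobsons' actual schema or policy.

```python
def normalize_email(email):
    """Trim whitespace and lowercase so 'A@x.org ' and 'a@x.org' match."""
    return email.strip().lower()


def merge_students(*databases):
    """Merge records from several databases into one profile per email.

    Each database is a list of dicts with at least an 'email' field
    (hypothetical schema). For every other field, the first non-empty
    value seen is kept; later databases only fill in missing fields.
    """
    merged = {}
    for database in databases:
        for record in database:
            key = normalize_email(record["email"])
            profile = merged.setdefault(key, {"email": key})
            for field, value in record.items():
                if field == "email":
                    continue
                is_empty = value in (None, "")
                already_set = profile.get(field) not in (None, "")
                if not is_empty and not already_set:
                    profile[field] = value
    return list(merged.values())


if __name__ == "__main__":
    # Made-up records, for illustration only.
    survey_db = [{"email": "Student1@example.org", "nationality": "IN", "age": None}]
    enquiry_db = [
        {"email": "student1@example.org ", "age": 21},
        {"email": "student2@example.org", "nationality": "MY", "age": 24},
    ]
    for profile in merge_students(survey_db, enquiry_db):
        print(profile)
```

A real pipeline would also have to decide what to do when two databases disagree on a field; here the earlier database simply takes precedence.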
To reach prospective students, their email addresses are enrolled in an automated email program dedicated to providing them with the information they need to make a decision …

From this, we may also want to look at the "university application trend" for particular nationalities. It would be interesting to know, for example, whether Indian students have always been interested in pursuing computer science in Australia.
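An application trend of this kind could be computed by counting applications per year for one nationality, field of study and destination. The following is a minimal sketch, assuming each application is a dictionary with hypothetical `year`, `nationality`, `field` and `destination` keys; the sample rows are made up purely to show the shape of the output.

```python
from collections import Counter


def application_trend(applications, nationality, field, destination):
    """Count matching applications per year, returned in year order."""
    counts = Counter(
        app["year"]
        for app in applications
        if app["nationality"] == nationality
        and app["field"] == field
        and app["destination"] == destination
    )
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    sample = [
        {"year": 2014, "nationality": "IN", "field": "computer science", "destination": "AU"},
        {"year": 2015, "nationality": "IN", "field": "computer science", "destination": "AU"},
        {"year": 2015, "nationality": "IN", "field": "computer science", "destination": "AU"},
        {"year": 2015, "nationality": "IN", "field": "law", "destination": "AU"},
    ]
    print(application_trend(sample, "IN", "computer science", "AU"))
```

A steadily rising or flat series of counts across years would then indicate whether the interest is long-standing or recent.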
Recommendation
So far, we have decided to profile our students (current and past) according to their goals and objectives. The most useful information is therefore whatever helps students achieve these goals and objectives. For prospective students, this can include which universities best fit their goals, whether financial aid is available to attend these universities, testimonials from former students, and professionals' opinions on hiring graduates from these universities. All this information gives students a clearer picture not only of how to get into these universities but also of how the industry perceives the value of the …

So from a table with N columns, we may end up with a smaller table of 3 to 4 columns. These new variables are composites of the original variables. The advantage is that we have a smaller data table to deal with. The disadvantage is that we lose some information, and the algorithm also assumes that the variables are strictly quantitative (age, weight). Other variants of component analysis, such as correspondence analysis, handle qualitative variables (nationality, eye color).
This is why well-organized data is so important: some analyses make assumptions about the input data, and if those assumptions are not met, the algorithm will not produce an output.
Back to our case study: we have already decided on our profiling criterion (students’ goals and objectives). We believe this criterion (variable) helps the most to improve the students’ applications.
If the objective were different, say, how to detect a student at risk of dropping out, the profiling criterion would have been different. And the selection for the profiling criteria can either
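The reduction from N columns to a few composite columns can be sketched as follows, assuming the algorithm in question is principal component analysis (the surviving text does not name it). The sketch uses only the standard library and finds components by power iteration with deflation, which is a simple stand-in for the routines a statistics library would provide; the sample rows and column meanings are made up for illustration.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every column has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iters=500, seed=0):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v


def pca(rows, k):
    """Project quantitative rows onto their first k principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        eigval, v = top_component(cov)
        components.append(v)
        # Deflation: remove the direction just found before searching again.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components]
              for r in centered]
    return scores, components


if __name__ == "__main__":
    # Made-up quantitative columns: age, weight, height, grade average.
    table = [
        [18, 60, 170, 3.1],
        [22, 72, 180, 2.8],
        [25, 80, 175, 3.5],
        [30, 65, 160, 3.9],
        [21, 90, 185, 2.5],
        [27, 55, 165, 3.7],
    ]
    scores, components = pca(table, k=2)
    for row in scores:
        print([round(x, 2) for x in row])
```

Each output column is a weighted combination of all the original columns, which is the sense in which the new variables are composites. The code would fail on a qualitative column such as nationality, because means and covariances are undefined for it, which illustrates the assumption on the input data mentioned above.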