# BUSI 275 Fall 2011 Term Project

## Introduction

The objective of this course is not only that you understand the theory (math) behind statistical analyses, but also that you demonstrate how to apply them appropriately to practical situations in business. As such, the term project is a vital component of the course and should represent the pinnacle of your work in this course (more so even than the final exam).
For the term project, you will:
• Choose a data-driven, business-related topic you are interested in from the list below;
• Investigate and do background research on the issue;
• Gather data (or find suitable pre-existing datasets);
• Perform appropriate in-depth statistical analyses on the data (e.g., in Excel);
• Present your results in a 15-min in-class presentation; and
• Present your results in a well-written, clearly organized paper.
In the business world you will often (nearly always) need to work in teams, and so in this project you will be expected to work in teams of 2-4 students. Everyone in your group will get the same mark for the project, and once the project proposal is submitted, the team cannot be disbanded -- so choose your teammates wisely! Learning how to work in a team is one of the main objectives of this project; it is not easy, but it is important and worth the effort. Giving up on working together is giving up on one of the primary objectives; it is your responsibility to make it work! If you are having difficulty with your teammates, I can offer suggestions on conflict resolution, but I will not be your mediator; you must learn to be peacemakers.
"Be completely humble and gentle; be patient, bearing with one another in love. Make every effort to keep the unity of the Spirit through the bond of peace." (Ephesians 4:2-3, NIV)

## Topics

Choose one of the following options, or come up with your own idea of similar scope and depth:
• Distribution fit:
Select a random event for which you can get empirical data. Gather data about the event, and describe the underlying statistical distribution. Discuss the possible factors that cause variation. Explain how this information could be used for decision-making in a business environment. Detail all the options you tried and what guided your decisions. You must demonstrate a detailed, in-depth attempt to understand the underlying distribution; the final fit ought to be very close.
• Financial Analysis / Time Series:
Select an "interesting variable" of some kind which can be easily found and calculated from publicly available financial data. Develop a statistical model which could be used as a standard to measure performance for future numbers. The "interesting variable" you select should be something with a business impact. Examples could include:
• "Performance of environmentally friendly mutual funds",
• "Acquisition price for soccer players",
• "Future price of oil on the NYMEX",
• "Same-store sales growth in retail".
• Multiple Regression (possibly with non-linear variables):
Conduct a multiple regression study, and report on the results. Be sure to examine and discuss possible interaction effects.
• ANOVA:
Consider the case studies 12.1-12.4 on pp.526-528 in the textbook. Identify an ANOVA study you can conduct similar to one of these studies. Gather data, perform the analysis, and report on the results and the business impact. You may have single or multiple predictors. Be sure to do follow-up analysis (post-hoc or planned comparisons).
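
For teams considering the ANOVA option, the core calculation can be sketched as follows. This is a minimal illustration of a one-way ANOVA F statistic, not a required method: the store names and sales figures are made up, and the function name `one_way_anova` is hypothetical. It stops at the F statistic; the p-value (from an F table or statistics software) and the follow-up comparisons are not shown.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up weekly sales (in $1000s) for three hypothetical stores.
store_a = [1.0, 2.0, 3.0]
store_b = [2.0, 3.0, 4.0]
store_c = [6.0, 7.0, 8.0]

f_stat, df_between, df_within = one_way_anova([store_a, store_b, store_c])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom suggests that at least one group mean differs; the follow-up analysis then identifies which ones.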

## Dataset Description (due 4 Oct)

Find an existing dataset or describe in detail how you are going to collect your own data. Existing data can be from public sources (e.g., StatCan, U.S. Census, etc.) or from private sources (in which case you need to make sure you have permission to use it, as well as possible REB approval). If you plan to gather your own data, you will need to detail your sampling strategy (e.g., friends+family via word-of-mouth, stand on campus/street corner, work with a local retailer, etc.) as well as your data collection strategy (e.g., online questionnaire, oral interview). If you plan on offering participants an incentive (e.g., chance to win a gift card), you need to make that clear (because it does affect your sampling).
• Indicate the "owners" of the dataset, and whether you have obtained permission to use the data.
• Indicate the overall sample size.
• For each variable in the dataset that you plan on using, describe it in detail, indicate its level of measurement, and show how many missing cases there are.
• For any scale-level variables (interval or ratio), show means and standard deviations, plus box-plots and histograms.
• For any nominal (categorical) or ordinal variables, show the frequency distribution using appropriate charts (e.g., bar chart, pie chart).
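
The numeric summaries requested above can be sketched in a few lines. This is only an illustration under stated assumptions: the variables `wait_minutes` and `region` and their values are made up, missing cases are assumed to be recorded as `None`, and the helper names are hypothetical. The same figures can be produced in Excel; the charts (box-plots, histograms, bar charts) are not shown here.

```python
import statistics
from collections import Counter

def describe_scale(values):
    """Summarize a scale-level (interval or ratio) variable; None marks a missing case."""
    present = [v for v in values if v is not None]
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "sd": statistics.stdev(present),  # sample standard deviation (n - 1 denominator)
    }

def describe_categorical(values):
    """Frequency distribution for a nominal or ordinal variable; None marks a missing case."""
    present = [v for v in values if v is not None]
    return {"missing": len(values) - len(present), "counts": Counter(present)}

# Made-up illustrative data, not from any real dataset.
wait_minutes = [4.0, 6.0, None, 8.0, 10.0, 12.0]
region = ["West", "East", "West", None, "West", "East"]

scale_summary = describe_scale(wait_minutes)
category_summary = describe_categorical(region)
print(scale_summary)
print(category_summary)
```

Reporting the missing-case count alongside each summary makes it clear how many observations the mean and standard deviation are actually based on.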

### Deliverables:

• Upload a single document (Word, PDF, or similar) to myCourses by 10pm on the due date. It should be formatted clearly and cleanly (make sure to size figures/charts appropriately) in MLA style or similar.

## REB Forms (due 11 Oct)

By law, any research involving human subjects must be done according to standards of research ethics. If you are gathering your own data (e.g., questionnaires) or using existing non-public data that involves human subjects, you will need to submit a form to TWU's Research Ethics Board (REB) for approval. You are not permitted to begin recruiting subjects or gathering data until you have received REB approval! Allow 3 weeks for this to happen. If you are using public data (e.g., StatCan or US Census), then you generally will not require REB approval -- however, for class purposes, you will still need to complete a REB form. For more on why research ethics is important and what the REB will be looking for, see the Tri-Council Course on Research Ethics (CORE). If the REB rejects your application and requires major revisions, you may be required to complete this online tutorial. See TWU's REB page for more details; the forms you need are at the bottom of the page: either "Request for Ethical Review" (if you are gathering new data) or "Analysis of Existing Data" (if you are using data that was originally gathered for another REB-approved study). On the form, list the instructor as your project's supervisor. Your project's principal investigator ("PI") should be a member of your team; you may select one student arbitrarily.

### Deliverables:

• By classtime on the due date, submit to me a completed, signed, printed copy of the appropriate REB form, depending on whether you are gathering your own data or using existing data. If you are using public data that does not require REB approval, you do not need to submit a printed copy; however, for class purposes you still need to upload a completed electronic copy.
• Also, upload an electronic copy of your completed REB form to myCourses.

## Project Proposal (due 25 Oct)

Write a short summary (one-half to one page) describing the work your team proposes to do:
• Describe your dataset and the dependent and independent variables.
• If you will be gathering your own data, describe how you plan on doing that.
• Discuss the statistical question you will be posing to the data (which hypothesis test, what model) and your expected outcome. E.g., "We plan to run a two-way factorial ANOVA analysing the effect of call centre and time of day on wait times. We expect to find a significant main effect of time of day but no significant interaction."
• Plan how you will divide the work amongst your team members.

### Deliverables:

• Upload your proposal to myCourses by 10pm on the due date.

## Presentation (in-class, 29 Nov and 1 Dec)

Deliver a 15-min, in-class presentation on your project. If your analysis is not yet complete by the time you do your presentation, that is okay, but you should have some preliminary results to present.
• The format of your presentation is up to you, but you must keep it under 15 minutes. The instructor may interrupt you to ask questions, which will not increase the time you have available. You may solicit questions from the audience, but that will also not increase the time you have available.
• You will be graded not only on the content of the presentation, but also on your clarity, delivery style, professional demeanor, etc. Treat this as if you are presenting to your company's CEO or board of directors. Do not assume that they are familiar with statistical methods. It is recommended that you dress "business casual"; the rule of thumb is to dress "one step above" your audience.
• Every member of your team must participate in the presentation.
• In addition to your own presentation, you must attend and fill out a short feedback form for other presentations by your classmates. The feedback forms will be provided to the presenters for their reference.

### Deliverables:

• By Tue 15 Nov, sign up for a time slot to do your presentation. There will be six slots available per day, on 29 Nov and 1 Dec.
• Upload your presentation slides (PPT, PDF, ODP, etc.) to myCourses at least 24hrs before your presentation.
• Fill out feedback forms for in-class presentations by other teams. (Hand in the forms to me at the end of class.)

## Paper (due Wed 7 Dec)

Your paper must be a complete, well-written exposition of the topic you have chosen, the analysis you performed, and your results and conclusions. You will need to do background research and cite reliable sources. As appropriate, include select tables or figures in the body of the paper to illustrate the points made in your paper. Further tables or charts can go in an appendix. As with the presentation, your target audience is someone like your CEO or board of directors -- they are giving you a chance to communicate your results, but you need to convince them of the relevance and validity of your work in a way that they can understand. Do not assume they are familiar with statistical methods. There is no length limitation on the paper! However, your paper must satisfy all the requirements given, following the outline below. Typically, BUSI275 papers that meet these requirements have averaged around 4000 words. But there is no minimum length; if you can write a clear, concise paper that meets all the requirements in fewer than 4000 words, so much the better!
• Abstract: a short summary (half a page at most) describing the basic results of your research at a glance.
• Introduction: describe your topic and tell me why you think it's relevant or interesting.
• Related Work: research what other people have done related to your topic and summarize their findings. For example, another researcher might have done the exact same analysis but on a different dataset. Try to keep the introduction and this section short, in favour of more space for your own methods and results.
• Methods: tell me about the variables and the dataset you have chosen, and describe how the data were gathered. Describe your analysis in a way that is understandable to someone who might not know or care about the statistics, but also is sufficiently detailed that an interested party could reproduce your results.
• Results and conclusions: what effects were significant? What effects weren't significant? Do the results agree with what you expected? Interpret the results and tell me what it all means in the context of your original topic.
• Future work: if you or another researcher were to continue in this topic, what would be the next step? How could your dataset or analysis be improved? (This section can be short.)
• References: particularly for the "Related Work" section, you will need to do background research and cite sources. The point of a list of references is to enable the reader to look up your sources.
• Appendix (optional): For tables/figures in the body of the paper, you should be very selective so that they do not take up too much space or overwhelm or bore the reader. For any additional tables/figures you wish to include, you may put them in an appendix. You may also separately upload Excel spreadsheets or other datafiles. The appendix is optional.
Your paper should be in proper, professional English. Avoid colloquialisms. As appropriate, prefer the active voice ("We performed linear regression on the data") over the passive ("linear regression was performed on the data"). Actions have agents, so indicate who they are: if you were the one who performed a step, then the use of the pronoun "We" (or "I") is appropriate ("We performed..."). If you are citing someone else's analysis, then indicate who did it: "McDermott et al. performed a similar analysis...". I will not be strict on formatting, as long as your paper is clear and readable. A suggested format is the MLA style: Purdue OWL has a helpful guide. I highly recommend that you submit a rough draft early on to get feedback. This paper constitutes a major portion of your final grade, and you don't want to be surprised that you were heading down the wrong path!

### Deliverables:

• Upload your complete paper, as a single document, to myCourses by 10pm on the due date.
• If you wish, you may upload any datafiles, spreadsheets, etc. as separate files; however, the paper should be complete without these.

## Marking

• Dataset Description: 8%
• REB Forms: 4%
• Project Proposal: 8%
• Presentation: 20%
• Paper: 60%