Introduction
The 501 major project involves quantitative analysis of an existing data set in a
focused way, including a detailed description of the analysis process. The
purpose of the project is to practice and demonstrate what you are learning in
this course and, as such, the major portion of your analysis should be a
complex multiple regression or some kind of ANOVA (i.e., it is not
sufficient to only use bivariate chi-square tests, t-tests, and/or
correlations). Resorting to non-parametric methods should only be done with
instructor permission.
The project can be completed in a group of two or three people. Unlike the
class assignments, you will submit one single paper for the whole group. It is
also possible to do the project individually, if you prefer.
You need to identify your topic and obtain a data set as soon as possible, so
that you can have your project approved by us in time for you to do the work
for it. The project must be approved by the instructor before you proceed. You
must also obtain permission from the TWU Research Ethics Board (REB) to conduct
a re-analysis of an existing data-set (since your analysis will serve a different
purpose from the one the data was originally collected for). You are NOT permitted
to collect a new set of data for this project; you will not have the time to
recruit human participants, collect and analyse a data set before the project
is due.
Selecting a Suitable Dataset
You are expected to obtain an existing data-set for re-analysis. You may use
data that you have previously collected, obtain one of the existing data sets
that are maintained by the department (with permission from the original owner
and his/her supervisor), or obtain data from one of the many public archives
available on the Internet. However, the analysis that you perform will
ordinarily be new in some way (i.e., simply re-running analyses that have
already been performed and reported in a thesis document or research report is
usually inappropriate). Extending or correcting previous analyses may be
possible, but typically such strategies are too complex for this kind of
assignment. Normally, your data set needs to contain at least 50 cases /
people. Make sure you confirm the size of your sample when selecting your data
set. Exceptions to these requirements must be argued cogently (i.e., with very
good reasons).
Our course website has some ideas to help you
find a dataset.
You will need to have at least three separate variables in your analysis (e.g.,
2 predictors and 1 outcome variable in a regression; 2 IVs and 1 DV in a
factorial ANOVA). However, you may have several more variables to include in
your analysis, as long as you have a sufficient sample size.
There are four milestones for your project. Each part has a written component
that is due by the start of class (9am) on the Friday when it is due.
All written components are to be turned in electronically via myCourses.
Data Set Description (5 marks)
Due Fri 1 Oct
After you have formed your group and obtained a set of data, you need to submit a
brief written description of the data set, including:
- Name of the "owner(s)" of the data (and whether you have obtained their
permission to use the data)
- Sample size
- The name of each variable that you may be using, and what it is supposed
to represent
- Number of missing cases for each variable
- Level of measurement for each variable
- Means and standard deviations for all continuous / ratio / "scale"
variables
- Box-plots and histograms for all continuous / ratio / "scale" variables
- The frequency of each category of response for all ordinal or
nominal / categorical variables (possible variations: with large numbers of
discrete values for ordinal variables, you need to present quartiles and
boxplots). Note that, if you intend to use part of a larger data set, you only
need to describe the variables that you are actually considering for your
analysis – "extra" variables can be omitted from the description.
- Reference for the data set (theses, published reports, data sets, as
appropriate, all in APA style)
Your write-up should be formatted as a single document (Word or similar), with
the box-plots, histograms, etc. included as figures. For this first assignment,
strict APA style is not needed, but the document should be clean and
easy to read.
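If you want to cross-check the numbers that SPSS gives you for this description, the summary statistics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `describe` and `frequencies` and the sample values are hypothetical, missing cases are assumed to be coded as `None`, and the quartile method used here may differ slightly from the one SPSS applies.

```python
import statistics
from collections import Counter


def describe(values):
    """Summarize one scale variable; None marks a missing case (assumption)."""
    present = [v for v in values if v is not None]
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "sd": statistics.stdev(present),  # sample SD (n - 1 denominator)
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def frequencies(values):
    """Count each response category of a nominal or ordinal variable."""
    return dict(Counter(values))


# Hypothetical data, for illustration only.
scores = [12, 15, None, 9, 20, 14, None, 11]
gender = ["F", "M", "F", None]

print(describe(scores))
print(frequencies(gender))
```

The counts of missing cases and the category frequencies should match your SPSS output exactly; small differences in quartiles usually come from the interpolation method rather than from an error.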
Project Proposal (5 marks)
Due 24 hours before your meeting, which must take place before Fri 8 Oct
Your entire team must meet with the instructor sometime before the due date.
The instructor is available only at certain times, so be sure to book the
meeting well before the deadline. All your team members must be present.
At least 24 hours before your meeting, submit your project proposal online.
The proposal is a short (half-page to 1 page) document describing:
- The main research questions and hypotheses for your project
(you may have one main RQ and several subordinate RQs)
- Justification for why the dataset is appropriate to answer those RQs
(e.g., is the sample representative of the desired population?)
- The variables you will use to answer those research questions
(identify which are IV/predictors, which are DV/outcome, and any other
variables that might play a role -- covariates, moderators, mediators, etc.)
- The statistical analyses you anticipate using (see the textbook for
an overview) for each RQ
- Justification for why those statistical analyses are appropriate for
their respective RQs
- Minimum required sample size for each statistical analysis, and
whether the dataset has sufficient sample size. You need to do a G*Power
analysis for this; you can't just make an estimate!
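G*Power asks for an effect size before it will calculate a required sample size: Cohen's f² for multiple regression and Cohen's f for ANOVA. If the literature you are drawing on reports R² or η² instead, the standard conversions can be sketched as below. The function names and the example values are hypothetical, and this sketch only prepares the input; it does not replace the G*Power analysis itself.

```python
import math


def f2_from_r2(r_squared):
    """Cohen's f^2 for multiple regression: f^2 = R^2 / (1 - R^2)."""
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must be in the interval [0, 1)")
    return r_squared / (1 - r_squared)


def f_from_eta2(eta_squared):
    """Cohen's f for ANOVA: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0 <= eta_squared < 1:
        raise ValueError("eta^2 must be in the interval [0, 1)")
    return math.sqrt(eta_squared / (1 - eta_squared))


# Hypothetical effect sizes taken from prior literature.
print(round(f2_from_r2(0.13), 3))   # 0.149
print(round(f_from_eta2(0.09), 3))  # 0.314
```

In your proposal, state where your expected effect size came from (e.g., a previous study or Cohen's conventions), since the required sample size depends heavily on it.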
Come to the meeting with an electronic copy of your dataset (upload to
myCourses or bring a USB key) and come prepared to discuss your proposal.
It is also helpful to bring printed copies of your dataset description and
proposal.
REB Forms (3 marks)
Due Fri 22 Oct
After obtaining the instructor's permission to proceed with your project, you
need to complete the appropriate ethics approval form from the university's
Research Ethics Board (REB).
Use the
"Request for Ethical Review - Analysis of Existing Data Form"
(not the "Request for Ethical Review" form), available for download on the
TWU website at
http://www.twu.ca/academics/research/ethics/approval-forms.html
Instructions for completing the form:
- List one member of your group as the principal investigator
(it doesn't matter who) and your instructor as the supervisor
- Remove all identifying information (e.g., names, e-mail addresses)
from the data set before completing the form.
- If your data set was originally collected at another institution,
attach copies of the original REB application from that institution,
and the certificate of approval. You also need a letter of permission
from the current owner of the dataset (for theses, this is usually the
thesis supervisor).
- If your data set is from a TWU thesis, find and note the REB file number
for the original thesis.
- If your data set was downloaded from a public archive, provide full
descriptions of the archive, links, and statements of ethical research
practice.
Consult with your instructor if any parts of the REB form are confusing to fill
out, but do so well before the due date.
Upload the completed REB form to myCourses before the due date.
If you have electronic copies of the supporting documents (e.g., permission
letter, original REB from other institution), upload those too.
Submit two copies of the completed form, including the signature of whoever
will be the Principal Investigator, to your instructor, who will then review
the form, sign it, and submit it to the REB office (unless there are problems
with it, in which case you'll need to fix them first!).
You may not perform your analyses on your dataset until you have
received REB approval! In the past this has taken as little as 2-3 weeks
when expedited, but it could take as long as 4-6 weeks depending on the workload
of the TWU REB. Any errors or omissions in the form may extend the time!
You may not perform any analyses on the dataset until REB approves
your project!
Final Project Manuscript (32 marks)
Due Sat 18 Dec, 12:00 noon
Although the objective of this course is to prepare you for research, you
should remember that this paper is not the same as a research journal
article. The object of this paper is a detailed treatment of the statistical
procedures and results. As a result, you will go into much more detail on the
statistical analysis than you would for a typical journal article. The
substantive issues that you are dealing with in CPSY are less important here:
we only need enough details to know what kind of variables you have and how
they fit within the existing theory.
Unless otherwise specified, all sections of the paper should conform to APA
manuscript format (see chapter 1 of the APA publication manual for an
overview). Use the following structure, including the headings as listed here,
to practice your APA style:
Title Page:
Proper APA title page with names, affiliations, running head, etc.
(see pp. 296-297 of the APA publication manual for details).
Abstract Page:
A brief abstract, of no more than 250 words (see p. 298 of the
APA manual).
Introduction:
Should be brief and to the point; much shorter than what is typically found in
an article or manuscript. Begin with a selective conceptual overview of the
topic, then choose a conceptual model or body of research on it (or, if theory is sparse,
give reasoning about why this is an important topic to study). You will
generally have access to some background literature in the reports you have
reviewed (theses or publications). Then relate the model or study to your
specific research questions: your research questions should follow clearly from
your conceptual overview. At the end of the introduction, clearly state the
questions you will be asking in your study (in non-statistical terms).
Method and Data Set:
Describe your participants (age, gender, other demographic information) and
where they were recruited from, if known. Briefly describe data collection
procedures, if known, including any experimental manipulation (e.g.,
randomization, different treatment conditions, etc.) that was applied.
Next, describe each of your variables: For your outcome variable / DV, provide
the means and standard deviation (if you are comparing groups, also provide
that information for each comparison group). Note: make sure you use your
final, "cleaned-up" data set to obtain this information. For all predictors /
IVs, describe how they were operationally defined and what they were intended
to measure. If a standardized test was used, report the established reliability
and validity of the test (for this assignment, you do not have to calculate the
reliability that you obtained in your sample). Also, if you are using someone
else's data, you must acknowledge your data source in full (i.e., data in this
study were originally collected by ____ for a study on ____, and used with
his/her permission).
Also, describe the actual analytical procedure that you used (providing
sufficient information to allow readers to replicate your study), and explain
why you chose that procedure (i.e., what makes it a better choice for
answering your research question than other possible procedures).
Preliminary Analyses and Results:
In this section, you will go into much more detail than you would normally find
in a published manuscript (remember, the whole point of the project is to show
us what you have learned, so you should illustrate details that journal
articles omit).
First, describe how you explored and "cleaned up" your data. Identify the
amount of missing data in the data set, and how you checked for systematic
patterns of missing data. Describe how you checked for potential outliers,
identify which cases may be potential outliers, and explain how you dealt with
them. Describe all the assumptions that your chosen analytical procedure makes,
and explain how you examined whether each one was met. For every test
assumption that was violated, describe (a) the implications for interpretation
of data and (b) what steps (if any) can be taken to correct the procedure.
If you have violations of assumptions of normality, make every effort to
correct the problem, whether by recoding/categorizing the variable, adjusting
your set of predictors, applying an arithmetic transform to the variable,
or researching other methods. Relying on non-parametric methods should only
be a last resort, and with instructor permission. Different non-parametric
methods also have their own assumptions; you need to justify those as well.
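As one illustration of the outlier screening described above, the rule that boxplots use (flagging values more than 1.5 interquartile ranges beyond the quartiles) can be sketched as follows. The function name `flag_outliers` and the data are hypothetical, and the quartile method is an assumption; SPSS may compute its boxplot hinges slightly differently. A flagged case is only a candidate: you still need to explain in your paper how you dealt with it and why.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return (index, value) pairs lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(flag_outliers(scores))  # [(9, 100)]
```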
After running the final analyses, describe the results of your analytical
process, including any post hoc exploration that was done. Remember to assess
and report effect sizes (not just significance tests). Also calculate and
report the power that you obtained in your analysis (obtained, or "observed"
power). Report the results in both words and statistical notation in accord
with APA style (e.g., "The results demonstrated that first year students are
more familiar with APA format than third-year students, but the effect was only
moderate in size, F(1, 232) = 24.56, p < .001, η² = .09"). Please note that
it is not a show-stopper to have non-significant
results, as long as you have properly followed all the preceding steps.
But the ultimate objective is to understand the dataset and explore its
structure, the relationships between variables, etc. -- not just a "yes/no" answer
to the RQs.
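If your software reports an F test without an effect size, partial eta squared can be recovered from the F value and its degrees of freedom, and the result can be written in APA notation (no leading zero for values that cannot exceed 1). The sketch below assumes you already have the p value from your output; the function names and the numbers are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F test: (F * df1) / (F * df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def apa_decimal(x, places=2):
    """Format a value bounded by 1 (e.g., p, eta squared) without a leading zero."""
    text = f"{x:.{places}f}"
    return text[1:] if text.startswith("0") else text


def report_f(f_value, df_effect, df_error, p_value):
    """Build an APA-style summary string for one F test."""
    eta = partial_eta_squared(f_value, df_effect, df_error)
    p_text = "p < .001" if p_value < 0.001 else f"p = {apa_decimal(p_value, 3)}"
    return (
        f"F({df_effect}, {df_error}) = {f_value:.2f}, {p_text}, "
        f"partial eta squared = {apa_decimal(eta)}"
    )


# Hypothetical result, for illustration only.
print(report_f(4.50, 2, 57, 0.015))
# F(2, 57) = 4.50, p = .015, partial eta squared = .14
```

In the manuscript itself, replace the words "partial eta squared" with the proper symbol and italicize the statistics as APA style requires.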
Discussion:
Describe how your results answer (or don't answer) the questions that you
raised in the introduction. Also discuss the limitations of your research. In
particular, focus on any statistical / analytical limitations that you have.
Unlike a regular manuscript, do not talk about the future directions or the
broader implications of your study.
References:
List all the references that you cited in your paper, using proper APA style.
Please note that the style varies according to whether the cited source is an
article, book, book chapter, or electronic document (see pp. 215-283 of the
APA publication manual for details).
Tables and Figures:
You are required to include at least one table or figure in your manuscript (I
suggest either a summary of your main statistical analysis, a graph showing the
direction or strength of your effect, or at least a summary of your participant
demographic characteristics). Remember to refer to it in your written text.
See pp. 147-200 of the APA publication manual for details on formatting tables
and figures.
SPSS Output:
In addition to uploading your final manuscript (as a Word .doc or similar),
also upload a cleaned, well-labelled SPSS output file (.spv) with only
the relevant plots and analyses used in your paper.
Please ensure there is no scratch work or duplicate runs in the output!
Label and number the different parts of the output, to make it easy to match
each output with your discussion in the methods and results sections.
You can edit the output file to insert text boxes as labels.
The maximum length for your paper (excluding the SPSS output) is
15 pages. Quality is much more important than quantity; if you can do a
thorough and clear job describing your analysis in fewer than 15 pages,
all the better.