This is my first of what I assume will be many, many blog posts about doing research.
So let’s talk about designing experiments to compare different things — whether that’s the best ice cream flavor or bias in job hiring. In this blog post, we’re going to discuss two separate research questions — and because I’m a nerd — I’m going to write them like this:
RQ1: When shown a poorly written test, are engineers more likely to continue to replicate that poorly written test elsewhere?
RQ2: What aspects of poorly written tests are engineers most likely to notice?
With these two questions in mind, we have to decide how to split our test examples across our participants.
In this post, we’re going to discuss two design choices for splitting up examples: between- vs. within-subjects.
Between-Subject Design
In a between-subject design, we split our participants into n separate groups — 2 groups in the case of RQ1 — and only show the participants the examples that go with their group.
RQ1 is a fantastic question for explaining when we would use a between-subject design. As I mentioned above, there are 2 groups for this question:
- Group 1 (Exposure): People who’ve seen the poorly written tests
- Group 2 (Control): People who’ve seen well-written tests
We have to segment these groups and what they see because if Group 2 were to see any poorly written tests, then they would be contaminated (strong word, I know) and thus their results wouldn’t be useful for comparing against Group 1.
We pick between-subject whenever we need a control group: a group of people who haven’t been exposed to the variable we’re testing, because we believe that exposure could influence their future responses.
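To make that concrete, here is a minimal sketch of how the split for RQ1 could look in Python. Everything in it is an assumption for illustration: the function names, the example file names, and the choice to randomly shuffle participants before dealing them into groups are mine, not part of any particular study.

```python
import random

# Hypothetical example pools: each group only ever sees its own pool.
EXAMPLES = {
    "exposure": ["bad_test_1", "bad_test_2"],
    "control": ["good_test_1", "good_test_2"],
}


def assign_between_subjects(participants, groups=("exposure", "control"), seed=0):
    """Randomly assign each participant to exactly one group.

    Participants are shuffled, then dealt out round-robin, so group
    sizes differ by at most one.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: groups[i % len(groups)] for i, p in enumerate(shuffled)}


def examples_for(participant, assignment):
    """Return only the examples that belong to the participant's group."""
    return EXAMPLES[assignment[participant]]


if __name__ == "__main__":
    people = ["p1", "p2", "p3", "p4", "p5", "p6"]
    assignment = assign_between_subjects(people)
    for person in people:
        print(person, assignment[person], examples_for(person, assignment))
```

The important property is that nobody in the control group can ever be handed an example from the exposure pool, which is the contamination we’re trying to avoid.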
Within-Subject Design
Now this is the opposite of between-subject design. Instead of separating our examples and participants into groups, we expose every participant to the same set of samples. That might be the full set or a subset of it, but everyone still sees the same samples.
This is the strategy that’s best for RQ2. For this research question, we don’t care whether the participants have already seen a good or bad test. We want them to see a lot of tests and observe which flaws they do and don’t notice.
There is an important caveat here: this only works if we believe that seeing a bad test earlier won’t influence a participant’s ability to notice flaws in later tests.
We use within-subject when we believe that exposure to the previous sample doesn’t taint responses to any future samples.
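Here is a minimal sketch of a within-subject plan for RQ2, again in Python. The function name and sample names are hypothetical. Shuffling the order for each participant is my own addition, not something the design requires; it’s a common hedge in case the order of samples matters more than we assumed.

```python
import random


def assign_within_subjects(participants, samples, seed=0):
    """Give every participant the same samples, each in their own random order."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    plan = {}
    for participant in participants:
        order = list(samples)  # copy so the shared sample list is never mutated
        rng.shuffle(order)     # assumption: per-participant order is randomized
        plan[participant] = order
    return plan


if __name__ == "__main__":
    samples = ["test_a", "test_b", "test_c", "test_d"]  # hypothetical tests
    plan = assign_within_subjects(["p1", "p2", "p3"], samples)
    for participant, order in plan.items():
        print(participant, order)
```

Unlike the between-subject split, there are no groups here: the set of samples is identical for everyone, and only the order differs.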
Summary
I tried to keep this fairly short so it’s all easy to digest.
The core question is: Could exposure to the first sample irreversibly change how a participant responds to the next sample?
If yes, between-subjects (so we can protect the control group)
If no, within-subjects
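If it helps to see the rule written down, it fits in one tiny Python function. The function name and the boolean parameter are hypothetical; the judgment call about whether exposure carries over is still yours to make.

```python
def choose_design(exposure_carries_over: bool) -> str:
    """Apply the rule of thumb from this post.

    exposure_carries_over: True if seeing one sample could change how a
    participant responds to later samples.
    """
    if exposure_carries_over:
        return "between-subjects"  # keep a clean control group
    return "within-subjects"       # everyone can see the same samples


if __name__ == "__main__":
    # RQ1: seeing a bad test may change what engineers write next.
    print("RQ1:", choose_design(exposure_carries_over=True))
    # RQ2: we assume seeing one bad test doesn't change what they notice later.
    print("RQ2:", choose_design(exposure_carries_over=False))
```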
I hope this helps.