I have a confession to make. I…phew, this is hard to say.
I, Sara Ness, am a Quiz Snob.
I am not a psychometrician (the technical term for someone who designs and measures psychological tests). But while creating the Relating Languages over the last 4 years, I had a chance to work with some great ones. I got obsessed with assessments.
I put in hundreds, maybe thousands of hours on my assessment. I hired a psychometrician to analyze the data, who sent me 200-page reports on its accuracy. I bought expensive software, wrote extremely detailed outcome reports for paid takers, built a website, started a company. Over 1,300 people took the basic quiz in about a year.
I learned two things in this process. (Well…I learned a lot of things, but for the purposes of making a point, let’s pretend there were only 2.)
1. Creating a statistically accurate, verifiable assessment is very hard - and unless you are in academia, it just doesn’t matter. The Myers-Briggs isn’t even validated. I was climbing a stupid tree. BUT,

2. It’s not that hard to make a better quiz.
This article is my little contribution to all those who are making marketing quizzes, personality quizzes, random internet quizzes, student assessments - AND who actually, even a tiny bit, care about them being accurate-ish. I’m writing it because I haven’t found a lot of this information written for the layperson, with (hopefully) clear examples and a lot of bad jokes.
If you’d like people to say, “My result actually reflects ME!”; to use the data you get as market research; or to play out some deep psychological need to be a Good Scientist despite your lack of degree (🫣 🙋🏻♂️) …I’m hoping this article will help.
In Part 1, we’ll cover choosing the right questions and picking a platform to put your assessment or quiz on. Also, you get to hear the massive mistake I made in creating my Relating Languages assessment.
In Part 2, we’ll talk about creating better questions and testing your assessment. (That might become a Part 3 if this gets long enough, but god I hope it doesn’t.)
Let’s start with the basics:
Definitions
Let’s address something first: what is the difference between a quiz and an assessment?
Formal definitions vary across the internet. But in general, a quiz is shorter, covers fewer areas, and is less validated (testably accurate). It’s sometimes used to test your knowledge of a subject, as in the dreaded “pop quiz”, and sometimes to sort you into a vague bucket of personality. You know it’s vague if you get slightly different results each time you take the quiz.
An assessment is expected to be more accurate, and can also include non-questionnaire elements like interviews and observations. In teaching situations, it tends to measure progress, while tests and quizzes measure understanding of a specific topic (or identification with a specific category) at a specific time.
So here’s my snobby stance. People all over the internet are creating personality tests (which are actually quizzes or assessments, but whatever). It’s the new marketing funnel: “What type of dog owner are you?” “What’s your sexual style?” “On a scale of budgie to golden retriever, how likely are you to buy my product?”
This is fine. Quizzes don’t need to be that accurate. Maybe one day you’re a budgie and another day you’re a golden retriever.
BUT. If you’re basing a system off a personality test, or you want accurate audience data off a quiz, you really want your test/quiz/assessment to be accurate.
And The Internet at large (yes, perhaps even you, dear reader) is not schooled in how to create accurate assessments.
My hope, in this article, is to make your quizzes suck a little less - I mean, give clearer results - without teaching you a lot of statistics. Pray for me.
Choosing the Right Questions
The first thing you need to do, in creating a quiz, is choose questions.
In this article, we’ll talk about what to choose.
In the next one, I’ll go into how to make your questions return accurate results. It’s harder than you’d think. That’s why psychometricians get paid, and people like me get…paid?
The process of choosing questions, IMO, comes down to two basic things:

1. Are you sorting people into categories you’ve already decided, for fun and profit or to test a model?

2. Are you trying to figure out what categories people are sort-able into, in order to create a model and thereby set up for fun (and probably about the same amount of profit, BUT with the self-satisfaction of knowing you Did Science™️)?
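To make the difference concrete, here’s a toy sketch in Python. Everything in it is made up for illustration - the data, the question-to-type mapping, and the type names - but it shows the shape of each approach: #1 scores people against buckets you chose in advance, while #2 lets an algorithm propose the buckets.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data: 300 quiz takers answering 10 questions on a 1-5 scale.
rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(300, 10))

# Approach #1: sort people into categories you've already decided.
# Hypothetical mapping: even-indexed questions score "Type A", odd-indexed "Type B".
type_a = responses[:, 0::2].sum(axis=1)
type_b = responses[:, 1::2].sum(axis=1)
assigned = np.where(type_a >= type_b, "Type A", "Type B")

# Approach #2: let the data propose the categories.
# K-means here is a stand-in for fancier tools like factor analysis.
found = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(responses)
```

If the categories the data finds in #2 look nothing like the buckets you drew in #1, that’s your model trying to tell you something. (Foreshadowing.)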
In the process of writing this article, I realized a devastating thing. A way my life might have been very different, if I had only written it 3 years ago.
In creating the Relating Languages, I started with #1. I created a model and tried to test it.
But when that model didn’t get accurate results, I didn’t go to #2. I didn’t take a step back and think, “Hmm, I guess there must be something off about my model. Let me come up with a wider question set about how people interact in conversation, and sort the data based on categories I find rather than pre-defined boxes.”
No. I made what is, in retrospect, one of the stupidest quiz mistakes: I tweaked my questions to push people toward the answers I thought they should give. I made other questions tinier and tinier variations on the ones that DID give the results I wanted.
Basically, I created a quiz that told me exactly what I wanted to hear…while never allowing me to be wrong. I was never able to statistically verify it. I’m not going to go into what that entails, but let’s just say there is a lot of math you can do to find out if your work is Bullshit Science or not, and mine kept coming back Bullshit.
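If you want one small taste of that math: a standard first check is internal consistency, usually measured with Cronbach’s alpha, which asks whether the questions that supposedly measure the same thing actually move together. Here’s a minimal Python version - a sketch of one check among many, not the full workup a psychometrician would run:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency of one scale.

    Rows are respondents; columns are questions meant to measure
    the same trait. Values near 1 mean the questions hang together;
    low values are the math's polite way of saying "Bullshit".
    """
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
```

Run it per scale (say, the handful of questions behind one personality type), not on the whole quiz at once.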
It sucks very badly to have found this out 3 years too late, from writing a Substack article.
Badly enough, in fact, that I’m considering going to grad school to learn how to do better research.
In the meantime, here’s how I wish I’d done it.