How to Ace CASPer Video Response Questions
A founder's guide to the spoken section, where good candidates quietly lose points.
Most people prepare for CASPer by typing. Then the spoken section starts, the recording light turns on, and everything they rehearsed goes quiet.
The video response section is where strong applicants lose points they never knew they were losing. Not because their ideas were bad. Because nobody told them that on camera, how you say it is part of the score.
This guide fixes that. What the section is, how it is scored, why it trips people up, how to structure an answer out loud, and what a weak answer sounds like next to a strong one. Real examples, not filler.
What the video response section actually is
CASPer has two parts. The first is typed: you read a scenario and write your answers. The second is spoken: you read or watch a scenario, then record yourself answering follow-up questions on your webcam.
The spoken part works differently in a way that matters. You get a short moment to gather your thoughts, then a fixed recording window for each answer, usually around a minute. One take. No re-records. No editing. Once you start talking, that is your answer.
Compare that to typing. When you type, you can backspace. You can reorder a sentence. You can fix a clumsy opening before anyone sees it. None of that exists on camera. You commit live, and you cannot see how you come across while you are doing it.
That last point is the quiet trap. You feel like you sounded fine. The rater watches something different: your pauses, your filler words, whether your eyes were on the camera or darting around the room. For the full structure of the test, see our guide to the CASPer test format.
How the video section is scored, and what that means for you
Be clear on this, because a lot of advice gets it wrong. CASPer is not pass or fail. There is no pass mark to clear. Your responses are reviewed by trained raters, and your result is reported as a quartile, which is your standing relative to everyone else who took the same test form.
In plain terms, you are ranked against other applicants, not against a fixed bar. We break this down further in how CASPer scoring works.
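To make the ranking idea concrete, here is a minimal sketch. It assumes a made-up numeric score and two invented cohorts. CASPer's real score scale and quartile method are not described in this guide, so treat the function and every number below as hypothetical.

```python
from bisect import bisect_left

def quartile(score, cohort):
    """Return 1-4: the quarter of the cohort this score lands in (4 = top)."""
    ranked = sorted(cohort)
    below = bisect_left(ranked, score)      # how many cohort scores are strictly lower
    fraction_below = below / len(ranked)
    return min(4, int(fraction_below * 4) + 1)

# Hypothetical raw scores, invented purely for illustration.
cohort = [31, 40, 44, 48, 50, 52, 55, 59, 61, 63, 66, 70]
stronger_cohort = [s + 10 for s in cohort]  # same spread, everyone 10 points better

my_score = 55
print(quartile(my_score, cohort))           # standing in the first cohort
print(quartile(my_score, stronger_cohort))  # same answer, tougher cohort
```

The same score lands in a lower quartile when the cohort is stronger, which is the practical meaning of being ranked against other applicants rather than a fixed bar.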
So your job is not to be perfect. It is to come across as clearly and professionally as the strong applicants in your cohort, and ideally better. On the spoken section, coming across clearly includes how you deliver the answer, not only what is in it. Two candidates can have the same idea. The one who says it in a calm, structured, hard-to-misread way scores higher. That is not unfair. It is the whole reason the section is on camera.
Why the video section trips people up
The spoken section punishes habits the typed section hides. Here is what actually costs people.
Filler words. Um, like, you know, basically, I guess. On paper you would delete these. Out loud they leak out under pressure and make you sound unsure.
Rushed pacing. Nerves speed you up. You race to fill the time, the words blur together, and a good point lands as a panicked one.
Rambling with no structure. Without a plan, you start a sentence hoping it finds a point. Raters can hear a thought that has no destination.
Low-confidence delivery. A trailing voice, a question-mark tone on plain statements, constant hedging. The content might be fine. The delivery says you do not believe it.
Poor eye contact. Reading off a second screen, or staring at your own face instead of the lens, reads as evasive even when you are just nervous.
Freezing. The timer is running, your mind goes blank, and you lose ten seconds you cannot get back.
Notice what these have in common. None of them are about intelligence or ethics. They are about delivery under a clock. And delivery is trainable.
The shape of a strong spoken answer
You do not need a different brain for the spoken section. You need a structure simple enough to hold when your heart rate is up. The same logic that works in the typed section works out loud. If you have read our PACE framework, you already know the spine of it. Here is the spoken version, in five beats.
1. Take the beat. Use your thinking time. One slow breath before you speak is not wasted time. It is the difference between a sentence with a destination and one without.
2. Name the tension. Open by saying what the actual conflict is, in one line. This tells the rater you understood the scenario and gives your answer a frame.
3. Walk both sides. Name two or three people involved and what each one needs. This is empathy you can hear.
4. Land the decision. Say what you would actually do, and why. Do not hover. A clear, reasoned choice beats a balanced non-answer every time.
5. Name the value. End on the principle behind your choice: fairness, honesty, safety, respect. One sentence. This is where strong answers separate from average ones.
Five beats at roughly ten seconds each, and you have used your minute well with room to breathe.
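If you want to check the arithmetic for your own pacing, a small sketch like this works. The 60-second window is an assumption based on "usually around a minute", and the per-beat seconds are placeholders to adjust, not official timings.

```python
WINDOW_SECONDS = 60  # assumption: the recording window is about a minute

# Hypothetical time budget for the five beats, in seconds.
beats = {
    "take the beat": 10,
    "name the tension": 10,
    "walk both sides": 10,
    "land the decision": 10,
    "name the value": 10,
}

def budget(plan, window=WINDOW_SECONDS):
    """Return (total seconds planned, seconds of slack left in the window)."""
    total = sum(plan.values())
    return total, window - total

total, slack = budget(beats)
print(f"planned {total}s, slack {slack}s")

# Stretching every beat to fifteen seconds overshoots the window.
stretched = {name: 15 for name in beats}
print(budget(stretched))
```

A negative slack means the plan runs past the window, so when one beat needs more time, take it from another rather than adding to the total.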
Delivery is half the score
Structure gets your content in order. Delivery decides how it lands. These are the mechanics worth drilling.
Pace and pauses
Speak slower than feels natural. A deliberate pause is not dead air; it is confidence. It also buys you a half second to think without filling it with um.
The timer
Treat the limit as a frame, not a finish line you sprint toward. Aim to finish a beat or two early. A clean answer that ends calmly beats a rushed one that gets cut off mid-sentence.
Camera and framing
Look at the lens, not at your own face. Put the camera at eye level. Sit centered with light in front of you, not behind. Look like someone having a steady conversation.
Fillers
You cannot delete fillers by deciding to. You replace them with silence. When you feel an um coming, close your mouth for a beat instead. It feels strange. It sounds composed.
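For a rough do-it-yourself check, you can type out what you said in a practice recording and count the fillers and your pace. This is a minimal sketch, not how raters or any prep tool measure delivery. The filler list is the one named earlier in this guide, the sample transcript is invented, and the function name is hypothetical.

```python
import re

# Fillers named earlier in this guide; extend the list with your own habits.
FILLERS = ["um", "like", "you know", "basically", "i guess"]

def delivery_stats(transcript, seconds):
    """Count words, words per minute, and filler phrases in a typed transcript."""
    text = transcript.lower()
    words = re.findall(r"[a-z']+", text)
    fillers = {}
    for phrase in FILLERS:
        # Whole-word match. Note this also counts a legitimate "like".
        hits = len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
        if hits:
            fillers[phrase] = hits
    wpm = round(len(words) * 60 / seconds)
    return {"words": len(words), "wpm": wpm, "fillers": fillers}

# Invented example: one sentence that took six seconds to say.
sample = "Um, so I guess I would, like, talk to them first, you know?"
print(delivery_stats(sample, seconds=6))
```

The count is crude, but watching the filler total drop across practice takes is a simple way to see whether the silence habit is sticking.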
Weak answer vs strong answer
This is the part that proves the point. Same prompt, two deliveries. Read them out loud and you will hear the gap.
The prompt
You lead a small team. One member, Sam, has been arriving late and missing deadlines. Another teammate is frustrated and wants you to report Sam to your manager. What do you do?
Weak answer (about 50 seconds)
“Um, so I guess I would, like, talk to Sam? Because, you know, there could be a lot of reasons, maybe something is going on at home or whatever, so I do not really want to jump to conclusions. And I would also, um, probably talk to the other coworker, because they are frustrated, and yeah. I would just kind of try to make sure everyone is okay and, um, figure it out as a team I guess. So yeah, that is basically what I would do.”
- No clear structure. It starts a sentence hoping to find a point.
- Heavy fillers (um, like, you know, I guess) make every idea sound uncertain.
- It never actually decides anything. “Figure it out as a team” is not an action.
- No principle named, so the rater cannot see the judgment behind it.
Strong answer (about 45 seconds)
“This is a tension between supporting a colleague who may be struggling and being fair to the rest of the team. First, I would speak with Sam privately, not to accuse, but to understand. There may be something going on that I cannot see. I would also acknowledge my other teammate's frustration, because their workload is being affected and that is real. My action: I would name the pattern to Sam directly, agree on clear expectations, and set a short window to turn it around. If the missed deadlines continue, I would involve my manager, because at that point it is a fairness issue for the whole team. I am trying to balance empathy with accountability.”
- Opens by naming the tension, so the rater knows it was understood.
- Considers both Sam and the frustrated teammate. Empathy, made audible.
- Commits to a concrete, staged action instead of hiding behind teamwork.
- Ends on the value: empathy balanced with accountability.
- Delivered at a steady pace with no fillers, and finished with time to spare.
The ideas are not wildly different. The second one is just legible. On camera, legible wins. If you want more of these side by side, our piece on what a 4th-quartile answer looks like goes deeper on the typed version of the same skill.
Common mistakes to avoid
Mistake: Scripting full answers to memorize
Problem: you sound like you are reading, and one unexpected question derails the whole thing.
Fix: rehearse the structure, not a script. Drill the five beats on new prompts until the shape is automatic.
Mistake: Practicing only in your head
Problem: silent practice hides your fillers, your pace, and your nervous tics.
Fix: practice out loud, on camera, and watch it back. It is uncomfortable. That is the point.
Mistake: Trying to fill the whole timer
Problem: padding turns a sharp answer into a rambling one.
Fix: make your point, name your value, and stop. Silence at the end is fine.
Mistake: Staring at yourself on screen
Problem: it reads as broken eye contact, and it makes you self-conscious.
Fix: look at the lens. Hide your self-view if the platform lets you.
The one thing that actually fixes delivery
Here is the uncomfortable truth. You cannot fix delivery by reading about it. You have to do it out loud, under a timer, and then hear yourself back.
That last part is the one almost nobody does, and it is the one that changes everything. The first time you watch your own recorded answer, you notice the three ums you did not know you said, the way you sped up halfway through, the moment your eyes left the camera. You cannot un-hear it. And once you have heard it, you fix it fast.
The problem is that real practice like this is hard to set up. You need prompts you have not seen, a live timer, a recording, and honest feedback on delivery, not just content. Reading a scenario and thinking through an answer does not build the muscle. Only reps do.
That is exactly why we built video response practice into CasperCoach. As far as we know, it is the only CASPer prep tool that grades how you sound, not just what you type. You get a real scenario, a live timer, and you record your spoken answer the same way you will on test day. Then you get instant feedback on the things that actually move your score on camera: filler words, pacing, confidence, eye contact, and whether you rambled or landed your point. You can try the video response practice yourself.
You do not need our tool to use any of the advice above. But if you want to hear yourself before a rater does, that is the fastest way to do it.
Written by Mahad, founder of CasperCoach.