Understanding Personality Testing in the Workplace with Dr. Clinton Kelly
E2



Understanding Personality 
Testing with Dr. Clinton Kelly 

Welcome to Testing, Testing 
1, 2, 3, a podcast brought to you by TestGenius. 

Jenny: Welcome everybody. My name is Jenny Arnez. 
I'm from TestGenius, and we're so glad you tuned  

in today, or perhaps you're listening. And 
today we have Mike Callen with us. He's the  

VP of Products from TestGenius. We also have 
Dr. Clinton Kelly from IoPredict. Clinton,  

you want to tell us a little bit about yourself?
Clinton: Yeah. Thanks for having me. I'd be happy  

to tell you a little bit about me. I have a 
background in industrial and organizational  

psychology. And so for those who aren't aware of that, I do not work with depressed people at work. That's the question I most often get from people who don't do this: they think I'm a counselor for people in the workforce. But no, I actually help companies decide who to hire and who to promote within

organizations. So industrial organizational 
psychologists, we focus on improving  

organizations, making things more 
efficient. I specifically focus on  

hiring within organizations. And that's where I've spent almost the last 20 years of my career: making tests for organizations.
Clinton: Whether those be personality tests, cognitive ability tests, physical ability tests, or interview questions. That's what I do: for all sorts of different types of jobs and different types of organizations, I help them make tests to decide who to hire.
Jenny: Mike, why don't you say a couple of words,  

let people know who you are as well.
Mike: Sure. I'm Mike Callen. I'm the VP of

Products at Biddle Consulting Group. Our product 
is TestGenius. It's a suite, a hiring suite,  

which is a testing platform. And it contains a 
series of off the shelf office skills testing,  

as well as a program called CritiCall, which is for police, fire, EMS, 911, and utility dispatchers and call takers, those kinds of products. And then we have an

item-banking testing platform, which IoPredict, Clinton's company, actually uses

for some test administration. And I would add 
that Clinton and his partner, Jason, at IoPredict,  

worked with us for several years here at Biddle 
Consulting Group and continue to work with us  

here at Biddle Consulting Group, yet they have 
gone on and launched their own company. And we're  

collaborating quite a bit as we continue forward 
in our own direction. Great collaborations. 

Jenny: Yeah, for sure. So today we're going 
to focus on personality testing. And so I  

have to tell you when I think of personality 
testing, because I'm not trained in industrial  

organizational psychology, I'm new to this field.
Jenny: Honestly, it wasn't until I joined Biddle Consulting Group that I even knew a job like yours existed, so I'm new to this. When I think of personality testing, if I were to go on Google, and in fact I did that this morning, I typed in personality testing, and things like the DISC profile, the Enneagram, and Myers-Briggs showed up.
Jenny: How does that type of personality  

testing connect to the type of personality 
testing that one might do in the workplace,  

where you want to use a personality test to hire or select the best person for the job?

Clinton: Yeah, that's a great question. 
And there are some similarities, but there  

are important differences between those.
Clinton: Both are trying to measure traits or preferences of individuals, some personality components. However, a test like the Myers-Briggs, for example, was not developed for the purpose of making hiring decisions. In fact, the creators of the Myers-Briggs specifically say on their website that their test is not designed to be used to help hire individuals.
Clinton: It's not designed to predict performance in a job. It's designed more to give feedback to individuals, maybe preferences for careers they may enjoy, but it's not designed to be used in an employment context. Neither is the DISC or StrengthsFinder; some of those weren't designed with employment prediction, with hiring, in mind.
Jenny: Okay. All right. It's interesting because  

one of my daughters recently applied for a job and 
they asked in the online application, what's your  

StrengthsFinder profile? What Enneagram are you?
Clinton: Yeah. Actually, this morning I Googled

Myers Briggs selection and I found 
a number of websites explaining how  

to use the Myers Briggs within selection.
Clinton: So people use it, but like I said,  

Myers Briggs themselves says, don't use 
it for selection, but people do it anyway. 

Mike: That's very interesting, isn't it? We're 
really focused on doing things the right way,  

following the Uniform Guidelines. 
And it seems like there's a lot of mavericks out in the selection world.
Mike: At any rate, it's just very interesting.  

I'm actually surprised; I didn't realize there were people asking for this kind of profile. I would imagine that, in terms
of defensibility, that's a tough one to defend,  

right? If somebody says, what's your DISC profile 
or something like that and you don't get hired.  

If I didn't get hired, I might challenge that.
Clinton: Yeah, it can be problematic. Like I said, when we talk about tests and validity, often misunderstood topics, those tests are often not valid for that intended purpose, the purpose of selection. And yeah, you could definitely have some problems if it were challenged.
Jenny: Okay, so a couple of questions come to  

mind. Mike, you mentioned Uniform Guidelines for 
those who are watching or listening to this and  

have no idea what the Uniform Guidelines 
are. Can you guys give a definition? 

Mike: Yeah, Clinton, go ahead. It's your space. I would get close, but I think you'd give it the best run.
Clinton: Okay. Yeah, no, sure. I'd be  

happy to. So the Uniform Guidelines were 
created in the late 1970s. So they've been  

around for a while. The full name is the Uniform Guidelines on Employee Selection Procedures, and they lay out, like the name says, the guidelines or requirements: if you're going to use tests in a hiring situation, what you need to do to demonstrate that those tests are valid for use in that specific situation.
Clinton: And when the Uniform Guidelines use the term test, it's very broad. It's not a test in the traditional sense that most people think of, the multiple-choice test. It's anytime you are reducing your applicant pool. So if two people apply to the job and we say yes to one person and no to another, whatever we did to make that decision, that was a test.
Clinton: An interview is a test, a personality test is a test. If you're doing resume screens and you make a pile of yes and a pile of no, whatever criteria you used to make that decision, it is a test under the Uniform Guidelines. And so the Uniform Guidelines say, whenever you're going to reduce your number of job applicants to make a hiring decision, there are certain requirements that you need to comply with. If you don't, you could have problems legally.
Mike: That goes back even to your job posting  

and your choice of where you put your job 
posting, right? If you put your job posting in a spot where you're not going to get a diverse applicant pool, you have essentially limited your selectability in that case.
Mike: If you put invalid criteria in there,  

in other words, if you're posting a job and
it says a driver's license is required,  

but a driver's license isn't really required 
for that job, you have reduced your population  

there as well. And so there's a lot of things in 
our arena that go well beyond the traditional,  

written multiple choice type of test or 
work sample test or personality test. 

Clinton: Yeah, good, great points.
Mike: And I think another thing that's  

important to bring up is that I was taught early 
on that it's not just hiring that is  

selection, it's hiring, it's promotion, it's 
training opportunities that may be available  

to some and not available to others, like in 
a union environment. If you're going to go up to a journeyman level from, I forget what the basic term is, but those are all selection, right?

Clinton: Yeah, you're getting put into what we call maybe a high-potential group. Sometimes organizations will identify high potentials who are then put on, like, a management track, which inherently comes with potentially increased earnings and other benefits that tie into selection. So whatever you're using to identify high potentials, those could also fall under the Uniform Guidelines requirements.
Mike: Interesting. That's great. Thank you. 

Jenny: So is there a separate definition 
for personality tests that's unique,  

that's specific for what you do?
Clinton: Not so much a definition. I would say it's more that intended use: personality tests for selection are built and designed for use in selection. So that is their intended purpose. The way I like to compare it is just like a test in general. Let's just say I have
an accounting test used to hire accountants and  

it's a great test to hire accountants. It is valid 
to hire accountants. And now let's say I give that  

test to help me decide who I should pick in my NBA draft, who I should pick next. And I give my NBA draft prospects this accounting
test. You'd probably say, what are you doing? That  

test is not valid. It's not that it's not a valid 
test. It's valid for helping me hire accountants.  

It's not valid to help me hire NBA players. And that's similar with these different types of personality tests. It's not that a test like the Myers-Briggs isn't valid; it's just not valid to use in a hiring situation, for that purpose.
Jenny: So you've used that word valid a few times  

now. Do you want to give a definition 
of what that actually means?

Clinton: Yeah. So with validity, there are different, what we call, types of validity or evidences of validity. The most common one for personality is what we call criterion validity. Without going into lots of different types of validity, because I don't think that's the purpose of today's conversation, I'll briefly cover it: a personality test under the Uniform Guidelines requires criterion-related validation.
Clinton: So if you are using a personality test  

for selection, there should be some evidence of 
criterion-related validation. I can pull up a slide here and we can show a few things. Criterion validity is a mathematical relationship between how people score on the test and some measure of job performance.

Clinton: So for example, if it's a sales position, we might say that people with higher scores on the test are more likely to sell more of the product. If we can mathematically show the relationship between your score on the personality test and how much product you sell, that is criterion validation.

Jenny: Okay.
Clinton: And because this is a mathematical relationship, we can graph it out. If we plot test scores on the X axis, ranging in this case from 0 to 100, and some sort of measure of performance on the Y axis, we can graph this. You see this person has a test score of 22 and a job performance rating of 31.

Clinton: Over here, a test score of 85, job performance of 55. You can visually see the relationship amongst these dots. With criterion validity, you're coming up with what we call a validity coefficient, which in many cases is essentially just a correlation coefficient that shows this linear relationship between how you score on the test and how you perform on the job.

Clinton: And so that's what we mean by validity for personality tests: we can show that the test is a valid predictor of success in the job.
Mike: So that validity coefficient would be like  

the strength of the validity, right?
Clinton: Yes, the strength. If it's a traditional correlation coefficient, it's going to range from 0 to 1, with 1.0 being a perfect correlation, which doesn't really exist, but in theory you could get there. The higher that number, the stronger that relationship. Visually, with a correlation of 1, every blue dot here would be perfectly on this red line. So the tighter these blue dots are to this red line, the stronger that correlation coefficient.
Clinton: The more spread out they are around the line, the lower that value is going to be, the closer it's going to be to zero. A correlation coefficient of zero would essentially be saying it doesn't matter how you score on the test: your score gives us no indication whatsoever of how you're going to perform on the job.
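The validity coefficient Clinton describes is, in the simplest case, just a Pearson correlation between test scores and a performance measure. A minimal sketch in Python; the six score/rating pairs are invented for illustration (the first and fifth echo the 22/31 and 85/55 dots he reads off the slide) and are not from any real study:

```python
# Sketch of a "validity coefficient": Pearson correlation between
# test scores and a job-performance measure. Data are invented.
import math
import statistics

def validity_coefficient(test_scores, performance):
    """Pearson r between test scores and a job-performance measure."""
    mx = statistics.fmean(test_scores)
    my = statistics.fmean(performance)
    cov = sum((x - mx) * (y - my) for x, y in zip(test_scores, performance))
    sx = math.sqrt(sum((x - mx) ** 2 for x in test_scores))
    sy = math.sqrt(sum((y - my) ** 2 for y in performance))
    return cov / (sx * sy)

# Six hypothetical employees: test score, then a performance rating.
scores = [22, 40, 55, 70, 85, 90]
ratings = [31, 35, 48, 50, 55, 62]
r = validity_coefficient(scores, ratings)
print(f"validity coefficient r = {r:.2f}")
```

The closer r gets to 1, the tighter the dots sit on the line in Clinton's graph; near 0, the scatter tells you nothing about performance.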
Jenny: And where do the criteria come from?

Clinton: That is flexible. The Uniform Guidelines say they just need to be important criteria, and they mention some examples. It could be supervisor ratings of performance. It could be sales in a sales job.
Clinton: It could be turnover. So we could correlate with turnover: can we predict with this personality test who's less likely to turn over? That is open; it just needs to be what they call criteria that are important to the organization and the job. Most typically it's going to be some sort of measure of performance; sales, supervisor ratings, or turnover are the types of things that are very common to see.
Mike: When you have a test like this, you can assemble it and then just start collecting all sorts of information, and find that the test correlates with some aspect it wasn't intended to select for. So can you talk a little bit about that philosophy of not just creating a test and throwing everything up against the wall to see what sticks, versus having some intention when you design? It's a little bit risky; we've gone through this step with you folks many times, and there's a great deal of risk when you put this together, because you could potentially go through and not find what you're looking for. And that doesn't open up the ability to go find some other unrelated thing.

Clinton: Yeah, no, you need to be intentional 
when you're designing these tests. 

Clinton: We don't just, like you said, 
just throw a bunch of stuff against the  

wall and see what sticks. And so whenever we're 
designing a test, a personality test for a job,  

we want to be intentional and we want to do what we call a job analysis, where we analyze: What is done on the job? What are the requirements of the position?

Clinton: What are individuals spending their 
time doing? What are the most important parts,  

the most difficult parts? We're going to 
interview employees, talk with them. And  

so you need to do your research and be intentional 
with what you are doing. I'll share a slide here; sorry, I'm going to skip forward.
Clinton: This is an example for one we put  

together for a factory worker. And this doesn't 
show everything. We did multiple interviews and  

we even went on site and watched employees, but 
we took a look and said, hey, there are some attributes we've noticed in our research, in our analysis of the job. For the workers who appear to, in this case, stay on the job, here are some characteristics that we're seeing.

Clinton: These eight things: they're self-confident, analytical, forthright, et cetera. It takes some time. So, like you said, Mike, we need to be intentional and purposeful, not just saying let's throw some stuff out there and see if we get lucky.
Mike: So when we've worked together to create personality tests, we have used a concurrent validation strategy. From my perspective, the two basic ways to validate a personality test would be either a concurrent study or a predictive study. Why don't you talk a little bit about that and maybe what some advantages of either are.

Clinton: Okay, great. So those are the two types, concurrent and predictive. Concurrent is typically done with your existing employees. So if you have a few hundred employees already employed in a position, what we do is develop the test, or take an existing test from some vendor.
Clinton: We administer it to your current employees, and then we correlate that with measures of job performance, whether you have existing measures of job performance or we create new ones. So that is concurrent: you take your existing employees, have them complete the test, and correlate it with job performance.
Clinton: Predictive is where we give the test to job applicants. And often you're not scoring it; it's collecting data in the background. So people are applying for this job and you're giving them this new test. The applicants don't know it's not being used, but you're really not considering their score.

Clinton: And as you hire new individuals on 
the job you then, at a later point in time,  

correlate how they scored on the test with 
measures of job performance. And so with both you're trying to do the same thing: correlate test scores with job performance. The main difference

is concurrent is with existing employees.
Clinton: Predictive is usually used with job applicants, and at a later date you have to come back. So the advantage, you can probably tell just

from the way I'm describing it, the advantage of 
concurrent is it's typically faster. You can take  

your existing employees. You don't have to wait 
six months or a year to find how they panned out. 

Mike: Yeah, and your population is much more 
under control as well, because obviously,  

if you're doing a predictive study,
you're going to be testing all these applicants,  

and you might not hire 75 percent of them. So 
you've actually collected data that you're not  

then going to be able to apply later. And then I guess one advantage of the predictive study is that you get the broadest range of people in terms of knowledge, skill, ability, and personal characteristics, whereas with the concurrent study you have a little bit of range restriction. Is that not correct?
Clinton: Yeah, you can get some range restriction, because these individuals are current employees and we assume they're performing at an adequate level, or else they would no longer be employed there. So range restriction can be a downside: in the scores on your test, there's maybe not as much variability, not as many differences in how they score. One of the upsides of a predictive study is you are getting, like you said,

the range of responses, and you're also getting that real-life situation from the job applicants.
Clinton: They are really applying for a job, so you're mimicking what you intend the test to be used for. In a concurrent sample, when you're giving it to current employees, they already have the job, and so they may complete the test from a different frame of reference than your job applicants. That can be a potential downside of a concurrent study.
Clinton: So there are benefits and negatives to both approaches, but both are equally allowed under the Uniform Guidelines, and both have been shown from a research perspective to result in tests that are predictive of performance.
Jenny: So you're mentioning developing  

tests or validating tests that 
are used for employee selection. 

Jenny: Are they ever used in a post-hire environment, perhaps for employee development or training?
Clinton: They can be. That's often what you'll see with the Myers-Briggs or DISC or StrengthsFinder, some of those. Again, it comes back to that intended use; it depends on what the test was designed to be used for.
Clinton: Some tests have been designed to be used for multiple types of things, but this is something to be aware of: what is your test designed to do? Sometimes I'll see this with clients who want to take a test, not necessarily a personality test, but let's say a technical skills test, and use it in a diagnostic way, to diagnose where someone needs training or things like that.

Clinton: And I'm like, that test wasn't designed 
to do that. It gave you some initial information  

to make a hiring decision, but it really wasn't 
designed to diagnose and tell you here's their  

training needs. And so you just need to be 
aware of what your tests can and can't do. 

Mike: There's a maxim in HR as well that says job performance trumps testing. One of the things that happens when a person goes from applicant to candidate to employee is that you start to have these instances where job performance is being recorded.

And so very often we have people who will 
ask us, Oh, is it okay to give our employees  

these pre-employment tests and then, if they can't pass them any longer, get rid of them?

Mike: And generally speaking, that's a really 
dangerous area to tread into because, you have a  

much better job-related measure of what's going 
on, which is their performance on the job at  

that point. So at any rate, it's just an aside.
Clinton: That's a great point, and you can kind of see that in this graph: tests are by no means perfect. Like you said, job performance trumps the test scores. With this line here, you can see some of these blue dots are not on the line. So in some cases a person may outperform what we predict, and in some cases a person may underperform what we predict.
Mike: And so just to be clear here,  

any one of these dots is the 
intersection between an individual's  

test score and their performance on the job.
Mike: So that one in the middle top there, above the words test score: there's somebody who scored approximately 50 but rated almost 60 on their performance measure, right?
Clinton: Yeah, so in that case, they performed  

better than we thought they would on the job.
Mike: And then conversely, down below the line, to the left of the 40, there's somebody who scored, am I saying this right? They under-tested and underperformed, either way, right?
Clinton: Yeah, they had a lower test score, but even then, the performance was lower than anticipated.

Mike: Yeah. Thank you.
Clinton: And so the tests are not perfect measures, but what I always say is they're better than the alternative. They are increasing your odds. It's like when people sometimes say, this person is just not a great test taker. I often say, there's also someone's great-grandma who smoked 40 cigarettes a day and lived until she was 97. But in general, the trend is that if you smoke a lot, you're likely going to have some other health issues.

Mike: Right, we're not going to expect everybody who smokes several packs a day to live to a hundred; that's not the case. Another side that's really important, at least I always try

to bring this up when I have this conversation 
with people, is that the alternative is that  

a human being is making a selection based 
upon, in terms of the applicant, nothing. 

Mike: You have selected somebody who will be 
hired and you have selected somebody who will  

not be hired, for apparently no reason 
whatsoever, or no concrete reason. And  

so that's another reason why to have this 
decision making process tied to something  

that's observable that you can find patterns in.
Mike: Or if you can find that there's disparities,  

you can measure those things and see what 
exists and then remediate that situation. 

Jenny: Yes. So if an employer would like to begin using a personality assessment, how do they begin?
Clinton: I guess there's a couple of different ways. You can reach out to vendors out there.

Clinton: There are two different approaches: what I would call the build-a-custom-test-from-scratch approach, and the off-the-shelf approach. There are existing personality tests that are off the shelf, and then there's the we-build-it-custom-for-this-particular-job approach.
Clinton: What you're looking for may depend on your situation and your organization. If you're a smaller organization, or it's a role where you don't have that many employees, off the shelf is probably going to be the better route, where a vendor has spent the time to develop a test for that type of job.

Clinton: Because you just can't do the sort of study where you show this mathematical relationship between test scores and job performance; that requires data. If you only have 10 employees in a certain position in the organization, you don't have the necessary data to demonstrate a relationship and have any confidence in it.

Clinton: And so in that situation, you're going 
to need to rely on an off the shelf test where  

they've done some of that research and can show 
how it predicts performance for that type of  

job. If you have a large organization, 
you may be in the spot where you say,  

let's custom develop this. If we have a few 
hundred employees in a position, you can do that. 

Clinton: So those are kind of the two approaches. And if you want, I can get into how we do that once you've identified which one, the custom or the off the shelf.
Jenny: Yeah, I think you should.

Clinton: So then, whether you go custom or off the shelf, there are different types of tests out there.
Clinton: A lot of them, when it comes to personality, are typically statements. When you take a personality test, you're often reading statements, and I'll show a couple of examples here. This is an example of what we'll call a forced-choice personality test, where you say, I'm more likely to, and this is a silly example, eat a salad or eat a hamburger.
Clinton: And you have some points on the continuum. Sometimes you're just given two options and you pick one or the other; in this case, you're given four choices. On other personality tests, you're just given a statement and you select, say, true or false, or that's like me or not like me. For example: I enjoy team projects, and you would say true or false.
Clinton: Other times, and this is still one of the most common, it's what we'll call Likert-type scales, from strongly disagree to strongly agree. Again, I enjoy team projects, and you'd go from strongly disagree or somewhat disagree up to strongly agree. So personality tests generally are in that kind of flavor: they have statements, and you're expressing some sort of agreement or disagreement with that statement.
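Scoring a Likert-type scale like the one Clinton describes usually amounts to summing item responses, with negatively keyed items reversed first. A minimal sketch; the items and keying below are invented for illustration, not from any published instrument:

```python
# Scoring a short Likert-type personality scale: sum 1-5 responses,
# reversing negatively keyed items. Items and keys are invented.

SCALE_MAX = 5  # strongly disagree = 1 ... strongly agree = 5

# (statement, reverse_keyed)
items = [
    ("I enjoy team projects.", False),
    ("I prefer to work alone.", True),   # reverse-keyed
    ("I speak up in group discussions.", False),
]

def score(responses, scale_items):
    """Sum 1-5 responses, flipping reverse-keyed items (1<->5, 2<->4)."""
    total = 0
    for r, (_, reverse) in zip(responses, scale_items):
        total += (SCALE_MAX + 1 - r) if reverse else r
    return total

# An applicant answers 5, 1, 4 -> 5 + (6 - 1) + 4 = 14 out of 15.
print(score([5, 1, 4], items))  # 14
```

Reverse keying keeps a consistently "team-oriented" applicant from being penalized on items phrased in the opposite direction.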

Clinton: And if you're looking to implement one, if you're going the vendor route, go to the vendor and ask what the test was designed for and its intended use, to make sure they built it for selection. And not only built it for selection, but that they have validity evidence for your job type, going back to that accountant and NBA player example, to make sure they're not giving you a test designed to select NBA players when you're trying to hire accountants. What validity evidence do they have for your job type?
Mike: As well, Clinton, there are many, I may not be using the right word, but I would say standardized personality tests, batteries like Hogan, for instance, where they're going to ask the same 110 questions of every person. But over time, they're going to do small studies with certain job titles or certain groups of people, in order to take this standardized test and key it or score it in such a way that it works for that particular population. On the other side of the coin,

Mike: there can be a custom test, which wouldn't necessarily have standardized questions. It might have questions written with more face validity, written more directly toward the particular job title that you're using the test for. Is that not correct? And do you have any recommendations about that, or any positives or negatives about either of those scenarios?
Clinton: Yeah, that is correct. A lot of the tests out there measure what we call the Big Five personality characteristics, and like you said, Mike, it's the same exact test for every position.

Clinton: The difference is where you need to fall on the characteristics. So let's say openness to experience: they may say that on this job profile or job type, we found that people who are more likely to be successful are high in openness to experience, and on another job type, people who are more likely to be successful are in the middle range or the lower range on openness to experience.
Clinton: So everyone is given the same exact test, but there's a different constellation, a profile of where you fall on these Big Five personality characteristics, that they're saying is indicative of success. And I don't know that I'd say one is better than the other, but when you have a test that's custom designed for a specific position, the good thing is that, like I said, it was custom tailored for that job type.
Clinton: And so, again, the frame of reference, even the way the items are written: from what I've seen, just in my personal experience, I tend to see higher validity coefficients with those tests when they are well designed and put together. Higher validity coefficients than with the general tests that say, hey, we custom create a profile depending on your job type, but it's the same exact test for all jobs.
Mike: Yeah. I think another aspect that's very important, maybe particularly important now in this age of social media, is the applicant satisfaction part of it. When I was growing up, before I had anything to do with this particular career, I took a lot of tests, did interviews, did physical ability tests, those kinds of things.
Mike: And so I've often reflected back on those  

processes and thought about which ones were very 
satisfying to me. They had the face validity and  

face validity is basically, how does it feel? 
Does it feel to the applicant like it would be  

something that they might encounter on the job. 
So it doesn't mean the test is valid or not. 

Mike: It just really has to do with my 
own impression of it. And I do remember  

going back through in my mind, some of these 
experiences I had and I was like, Oh, that  

was a valid test. That was not fair. That was a 
very unfair experience. And I think some of these  

personality tests, where you're asking, would you 
rather be at a noisy party with loud music or  

walking by yourself on a beach?
Mike: What does that have to do with being in  

ABC job? There's a lot of things that don't feel 
really good about that. Whereas if you're asking  

it in context of the job, do you prefer an active 
work environment where your day goes by quickly  

versus a quiet environment where it's maybe a 
little less stressful or those kinds of things  

that has a lot better feel to it than the other.
Clinton: That's a great point that face validity,  

which is not what we always say. It's not like a 
technical type of validity, but it is real in the  

sense that candidate perceptions of the experience 
matter. If candidates have a positive experience,  

they're less likely to, for example, challenge 
a test or beat it up or complain about it. 

Clinton: And that is one good thing about 
the custom developed tests too: they  

tend to have a little more face validity, 
because the questions are written with that  

job in mind. And so, yeah, that is another 
benefit of custom developed tests. 

Mike: Perfect. Thank you.
Jenny: So here's  

what we're going to do.
Jenny: We're going to take  

a short break and then we'll be right 
back with everyone in just a moment. 

Voiceover: And we'll be right back 
after a word from our sponsor. 

Ready to revolutionize your HR strategies? 
Head over to TestGenius.com to discover our  

latest tools and solutions designed 
to streamline your hiring processes. 

Jenny: Welcome back everybody. And just a 
reminder, we're with Dr. Clinton Kelly from  

IoPredict, and we're talking about personality 
testing. So when someone takes a personality  

test or a personality assessment, a score is 
spit out for that person, correct? Can you talk  

a little bit about personality test scoring?
Clinton: Yeah. So when it comes to the scores,  

there are different scores that are given. When 
we talk about those off the shelf  

tests that measure the big five personality 
characteristics, you'll often get a  

score in each of those big five personality, what 
we call subscales. You'll get like an  

openness to experience score, an extroversion 
score, conscientiousness score, and then you'll  

often get an overall fit recommendation.
Clinton: And this is something that's on  

the screen. This is from a custom developed test 
from Biddle and TestGenius, where you give  

an overall recommendation. And you can see 
on the screen, in this case, we say, Hey,  

people are highly recommended. They have a 77 
percent chance of being successful on the job. 

Clinton: And this is based on job performance data 
collected from actual individuals. You can see  

the numbers here, the number of individuals. 
And so on the score, we  

usually give an overall recommendation score. Most
tests you'll see like some sort of overall score.  

And then you'll see some sort of subscale scores.
Clinton: The overall score is typically what is  

going to be the driving factor in whatever 
decisions you are making or informing the  

decision. What you don't want to do, 
and it is tempting for many people  

who are not super familiar 
with personality tests, is take subscale  

scores and let those drive your decision.
Clinton: Because the subscales are a sub  

component. They are, like the name says, 
subordinate to the overall score. So fewer data points  

from the test are driving subscale scores, 'cause 
it's just a portion of the overall test. And  

so it really should be that overall score or 
profile score you get from a personality test that  

should drive your recommendation, and those 
subscales can be informative pieces of information  

to then dive into, maybe in an interview or 
another step of the selection process. But don't  

let a subscale score do that. I've seen this happen, 
where the overall profile says,  

we highly recommend them based on the overall 
profile, but they'll see a subscale score where  

they're fairly low and say, I don't 
know if you should hire this person. I'm like,  

no, don't let that drive the decision.
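One way to picture the relationship Clinton describes between subscale scores and the overall score is a distance-from-profile calculation. This is only a hypothetical sketch: the target profile, the scoring formula, and the numbers are invented for illustration, not the actual TestGenius or ioPredict method.

```python
# Hypothetical sketch of overall vs. subscale scoring for a Big Five style
# test. The trait names are the standard Big Five; the target profile and
# the scoring formula are invented, not any vendor's actual method.

TARGET_PROFILE = {  # where "success" falls for this made-up job type, 0-100
    "openness": 70,
    "conscientiousness": 85,
    "extraversion": 50,
    "agreeableness": 60,
    "emotional_stability": 75,
}

def overall_score(subscales: dict) -> float:
    """Overall fit: 100 minus the average distance from the target profile."""
    distances = [abs(subscales[t] - TARGET_PROFILE[t]) for t in TARGET_PROFILE]
    return round(100 - sum(distances) / len(distances), 1)

candidate = {
    "openness": 65,
    "conscientiousness": 80,
    "extraversion": 40,
    "agreeableness": 62,
    "emotional_stability": 70,
}
# The overall score drives the recommendation; a single low subscale
# (extraversion here) becomes an interview probe, not a disqualifier.
fit = overall_score(candidate)  # -> 94.6
```

The point the sketch illustrates: each subscale contributes only one term to the overall number, which is why a single low subscale should prompt a follow-up question rather than drive the hiring decision.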
Mike: That's a really important point  

that you brought up about the subscale is 
that you can use it to drive questions in  

the interview process. In fact, very 
often, in the reports for the  

tests that we've created together, you've 
included those drill down questions, because  

that gives you a really great opportunity 
to find out: did the test potentially  

really nail this subscale or did it maybe miss 
the subscale or furthermore is this weakness,  

maybe something that the candidate is familiar 
with and has the ability to work around it, which  

can turn a weakness into a strength. So is that 
always something that's included with personality  

tests for selection or are we just lucky?
Clinton: Yeah, not always. So  

they're not always included. There are some 
where they just give you an overall  

score, and I've even seen some that just give 
you an overall score without even those  

sub scores. But yeah, it is 
nice when you get more information like that.  

There are a number of tests out 
there that will provide sub scores and even  

some additional information or potential 
prodding questions. But yeah,  

like I said, it's great when you can have that 
additional information to take into an interview. 

Clinton: Yep. But like Mike had said, just 
don't let a subscale score drive  

your overall decision. It's like letting one 
quarter of performance in a business  

drive your overall decision without looking 
at the yearly profit and loss. Over  

a year we made a lot of money, but one 
quarter was slow, and now you say,  

Hey, we've got to close the company 
because things are horrible. No,  

don't let the subscale score override the overall score.
Mike: I want to tell a quick story because  

this was something that was so 
impactful about 10 years or so ago. 

Mike: We created a testing process for 
nurses and it included knowledge testing,  

video situational judgment testing, and 
personality testing. And one of our  

clients told us about an instance where they were 
reviewing the results with the applicant in real  

time. So they had tested, and then
they moved them on to an interview process. 

Mike: And the experience level of this applicant 
was very good. The knowledge was very good. The  

situational judgment was good, but there was 
this one subscale, which was hostility. And  

there was a red flag on the hostility scale. And 
the interviewer asked the first prodding question,  

Hey, so tell me about a time, and it was some 
question that had to do with when they were in  

a stressful situation, how they handled it.
Mike: And then they asked the second prodding  

question, and then started to 
feel that this was exposing something,  

asked the third prodding question, and at this 
point, the candidate slammed her hand down on  

the desk, said, I've been to anger management 
classes. I've worked through this. If you  

would just move on, I would appreciate it.
Mike: That was a great instance where,  

you know, this did bring up 
something that should have been discussed,  

and was discussed. And herein is the real 
value in collecting that data and having  

an opportunity to talk about it, because, 
boy, when you talk about dodging a bullet,  

that was an important one right there.
Clinton: And that is a great example of how to  

use that. In some cases with those subscale scores, 
you may find something, like you said, that's  

a red flag. But more often 
than not, I'll see individuals, without prodding,  

like I said, without doing those follow up 
questions, take a subscale and automatically  

disqualify someone because of a subscale score.
Mike: And we talked a little bit about how you  

use it for selection. We do a lot of 
work in the 911 space, and I know we're going  

to do a follow up podcast where we talk a little 
bit more about that. But there are fairly  

popular personality tests in that space that 
are content validated, which  

the Uniform Guidelines specifically say 
not to do: don't content validate a personality  

test. You want to criterion validate it.
Mike: So let's talk, why don't you talk  

a little bit about the defensibility aspect? 
I think that the stuff we've talked about up  

to this point is really the utility aspect of 
the validation: here's why you go through this  

process, so that it will do a good job of 
selecting. But why do you go through this process  

to make sure that if somebody, an applicant or 
the DOL or the DOJ or the EEOC or some plaintiff's  

attorney has a problem with something that you 
did, how does this come  

into play, and how does this help somebody ease 
out of an uncomfortable defensibility situation? 

Clinton: Yeah, great question. When it comes 
to those situations, if you have a test that,  

for example, is just content valid or 
you don't have the proper validation  

in place for a personality test to show that 
it's predictive of performance,  

if, let's say, your personality 
test ends up passing more males than females or  

more females than males, or just more 
of one protected group over another, that's  

when the Uniform Guidelines kick into place.
Clinton: The Uniform Guidelines say, Hey, if your  

test screens out a disproportionate number of one 
protected group, you need to show that your  

test is a valid test and that it's screening those 
people out because they can't do the job. And it's  

not screening them out just because they're female 
or they're Hispanic or whatever group that may be. 

Clinton: And so if you are using a test that 
doesn't have the proper validation in place,  

and let's say you're screening out a 
disproportionate number of some protected group  

status, you will lose in court if you get sued.
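The "disproportionate" screening Clinton describes is commonly assessed with the four-fifths (80 percent) rule from Section 4D of the Uniform Guidelines. Here is a minimal sketch, with made-up group names and applicant counts:

```python
# Illustrative sketch of the four-fifths (80%) rule from Section 4D of the
# Uniform Guidelines: a selection rate for any group that is less than 4/5
# of the rate for the highest-selected group is generally regarded as
# evidence of adverse impact. Group names and counts here are invented.

def adverse_impact(applicants: dict, hires: dict) -> dict:
    """Return each group's selection rate and impact ratio vs. the top group."""
    rates = {g: hires[g] / applicants[g] for g in applicants}
    top = max(rates.values())
    return {
        g: {
            "selection_rate": round(rate, 3),
            "impact_ratio": round(rate / top, 3),
            "flag": rate / top < 0.8,  # below four-fifths -> potential adverse impact
        }
        for g, rate in rates.items()
    }

result = adverse_impact(
    applicants={"group_a": 100, "group_b": 80},
    hires={"group_a": 50, "group_b": 20},
)
# group_a is selected at 0.50, group_b at 0.25: an impact ratio of 0.5,
# well below 0.8, so group_b is flagged and validation evidence is needed.
```

A flag here does not by itself make the test unlawful; it is the trigger that requires the employer to show the test is valid and job related, which is the point Clinton is making.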
Mike: Yeah. Good point. That's good advice. It  

doesn't happen very often, but it does happen 
that a process gets questioned and it can really

financially impact an organization in 
terms of legal decisions that might go towards the plaintiff,

but also it hurts their reputation in 
the space.

You don't want to be that group that got sued because they did something incorrectly.

Clinton: Yeah. So if you're a vendor out there and you're not doing your due diligence, 
yeah, it can really hurt your reputation. 
Clinton: And that's what I tell clients: 
yeah, we could put out a test  

that doesn't have the proper kind of validity 
to back it up, but I don't want to do that,  

'cause that's going to hurt 
my reputation as an individual in the field. 

Mike: Yeah. Yeah. And so you and I know, and 
a lot of people may or may not know that,  

when criterion validity is established elsewhere, 
Section 7B of the Uniform Guidelines does allow for  

validity to be transported from the environment 
in which it was established over to an employer's  

own specific environment.
Mike: Why don't you talk a little bit  

about what that means, the processes, and 
any recommendations that you have? 

Clinton: Yeah. And that is a 
really great strategy for jobs,  

especially where you just don't have the sample 
size to conduct these criterion  

validation studies in your own organization.
Clinton: And so the Uniform Guidelines allow this.  

Let's say, Mike, and we'll talk 
about this in a later one, we've worked in  

the 911 space with personality testing. 
And a lot of 911 dispatching agencies,  

they just don't have hundreds of dispatchers. 
They have maybe a handful. There's a lot  

of smaller jurisdictions out there.
Clinton: So if they want to implement  

a personality test and say, Hey, what profile 
or what type of personality is more likely to  

be successful in this job? They just 
logistically don't have the numbers to  

do such a study. And so what Section 7B of 
the Uniform Guidelines allows is, it says, hey,  

if a validation study for, let's say, 911 
dispatchers on personality has been  

conducted with a specific test, you as 
a smaller group can adopt this test and not  

have to do your own criterion validation study.
Clinton: If you can show that your dispatchers,  

your 911 dispatchers perform substantially the 
same major work behaviors as those 911 dispatchers  

where this study was originally conducted. So 
basically you have to show, hey, there's a match,  

there's overlap between my job, where I want to use 
this test, and the job that they  

used to initially develop and validate this test.
Clinton: And if you can show that the two jobs  

are substantially similar in the tasks 
and the work behaviors they perform,  

you can transport that validity 
under Section 7B of the Uniform  

Guidelines. And that fulfills 
their validation requirements. 

Mike: My understanding is that when you're 
looking at these work behaviors that were  

covered under the initial study, that's 
the work behaviors that you're looking  

at in terms of transporting this validity.
Mike: There may be some other work behaviors  

that weren't covered under that validation 
study that aren't really a part  

of this conversation. But in terms of 
being able to transport the validity  

from this original study and bringing it over 
to your environment, is it not those particular  

performance dimensions that were purported to be 
measured in that original study by that test? 

Clinton: Yeah, you do want to look at those. 
So, are those original work behaviors  

that the test is measuring also 
important in your job? As you say,  

that's the critical part: showing that overlap. 
And it is not super difficult to do,  

as long as, in the original 
study, the vendor did their due diligence in how  

they showed what the work behaviors were in that 
original study. So if they've done their homework,  

it's a very easy process. I have seen this in 
the past where a client that I was working with  

was wanting to use a test.
Clinton: And they had some  

criterion validation, but they weren't able to 
give me information on what exactly the people  

in those jobs did in that 
original study. And so  

we couldn't really transport the validity because 
they didn't do their homework on that initial  

study. And so that's why it's important.
Clinton: When you do that initial study,  

the vendors should have done a good 
thorough job analysis of analyzing  

the job. So you can make sure there's a 
match between where they did the study and  

your job where you want to implement the test.
Mike: That's a really great point. And we might  

want to toot our horns a little bit right here.
Mike: Both of our shops  

lean very heavily on the 
Uniform Guidelines and the principles and  

legislation and laws that have been put 
into place, so that we are doing things right  

to begin with, and so that if somebody does determine 
that they'd like to transport that validity over  

to their environment, they're able to do that.
Mike: But you do really want to do your research  

or as much of the research as you can to make sure 
that you are partnering up with an organization  

that has the same values and is doing 
things right to begin with. Otherwise it's the  

garbage in garbage out philosophy, right? If 
you start to transport over a validation study,  

that's a bad validation study, then 
you're protected by nothing, right? 

Clinton: Yeah. And so that's one thing: if you're 
looking to implement a test, ask that vendor,  

Hey, do you have a validation study? They 
should have a technical report. If you're asking  

a vendor who's got a personality test, and you 
ask questions about validation, or do you have a  

technical report that documents things, and they 
start to kind of hem and haw or dodge you, or they  

send you like a one page marketing flyer.
Clinton: Those would be red flags in my opinion  

to say, this person may not have done their 
homework. You should be hearing things like  

Uniform Guidelines, like criterion 
validation. There's certain terms,  

keywords that the vendor should be mentioning.
Mike: Very good. We've mentioned Uniform  

Guidelines a lot of times.
Mike: So we actually have a  

website set up: uniformguidelines.com, where 
the Uniform Guidelines are posted,  

hyperlinked topically by heading. There's also the
Questions and Answers to the Uniform Guidelines,  

which came what, 10 or 20 years later, right? Or 
I don't remember how many years, but that was,  

they circled back through and said, okay, we've 
got this document, looking back now upon it,  

what are different scenarios that have been 
encountered and how did this get interpreted? 

Mike: So both the Uniform Guidelines and the 
questions and answers are very valuable in  

terms of knowing what the best practices are for 
testing and selection. So uniformguidelines.com,  

you can go there and you can, take a look at it, 
bookmark it and refer to it anytime that you'd  

like, that's a resource that we sponsor.
Mike: Jenny, what else? 

Jenny: I actually think we're getting to a point 
where we're going to start to wind down. Okay. And  

just want to let everybody know this is part 
one, session one, with Dr. Kelly.  

And next time when we meet, we're going to be 
talking specifically about personality  

testing in our CritiCall application.
Mike: Okay. Which would be the dispatch realm. 

Jenny: 911 emergency dispatchers. So let 
me do this. Let me change the screen here,  

bring us all back. And as we wind down Clinton, 
if someone wants to get in touch with you,  

how do they do that?
Clinton: Yeah. So you  

can reach out through our website at iopredict.com.
Clinton: And my email is just CKelly. So my name,  

first initial, last name, CKelly at ioPredict.com. 
You can also reach out to Mike over at TestGenius  

and Biddle. And like I said, we have a great 
relationship and we work together hand in hand  

on a lot of validation projects.
And so those are some ways to get in touch. 

Mike: And you and Jason are still maintained 
at CKelly at biddle.com as well, where we work  

together very closely and we continue to 
really value this relationship and want to  

continue to be able to collaborate
together for many years to come. It's been a,  

it's been really great. And I for one, would 
like to thank you for being on with us today. 

Mike: I think we maybe even went a little bit 
longer than we had targeted. But it's been really  

valuable hearing from your experience regarding 
this subject, which can be a little nebulous,  

a little scary to tiptoe into. 
So I've learned a lot and thank you so  

much for your time today.
Clinton: Yeah. Thank you  

for having me. I really enjoyed it.
Jenny: Oh, it's been great. And we do have  

show notes on the page where you 
see this video, with links to IoPredict's  

website, to uniformguidelines.com, and other 
resources that were mentioned in this video. So  

thanks to everyone for listening and for watching.
Mike: Great. Thank you. 

Thanks for tuning in to Testing, 
Testing 123 brought to you by TestGenius and  

Biddle Consulting Group. Visit our website at 
testgenius.com for more information.


Creators and Guests

Jenny Arnez
Host
Training Development and Sales Support at Biddle Consulting Group & TestGenius
Mike Callen
Host
President of Biddle Consulting Group and TestGenius
Clinton Kelly, PhD
Guest
Principal Consultant with ioPredict, specializing in test development and validation, including validation of IT coding assessments.