Normative – each item is independent of all other items. You can legitimately compare various people who have taken the test (e.g., an IQ test or the MMPI).
Ipsative – the person being tested needs to compare items with each other, as in occupational preference surveys. You cannot legitimately compare two or more people who have taken an ipsative measure. It shows strengths/weaknesses within a specific person.
Speed versus Power Tests –
Speed test – keyboarding test. Timed and assesses accuracy.
Power Test – not timed. An achievement test is a power test. It assesses the level of difficulty the individual taking the test can master. Ideally, nobody can receive a perfect score.
Maximum / typical Performance Measure –
Maximum – assesses best possible performance (Achievement Test)
Typical – A typical or characteristic performance (Interest Inventory)
Spiral versus Cyclical –
Spiral – items get more and more difficult.
Cyclical – several sections each of which is spiral in nature.
Vertical versus horizontal
Vertical – different forms of the test for various age groups / grade levels.
Horizontal – measures various factors at once.
Test battery – describes the situation where we administer a group of tests to the same person. The results can be combined into a profile, which is more accurate than merely assessing the individual with a single measure.
Parallel Forms / Equivalent Forms –
Test has various versions that all measure the same thing.
Parallel Forms – each person takes different version of test.
NEXT, you should be concerned with the quality of the test. How good is it? There are two things to consider. The most critical issue is validity; the second is reliability.
Validity – does the test measure what it purports to measure?
Content validity – extent that the test samples the behavior that it is supposed to.
Construct validity – refers to the extent that a test measures an abstract trait, construct, or psychological notion.
Criterion-Related Validity – test is correlated with an outside criterion (i.e. a standard).
Concurrent Validity – A job test might be compared to an actual score on an actual job performance.
Face validity – does it look like it is testing what it is supposed to.
Reliability – refers to whether a test will consistently yield the same results. Does the score remain stable over repeated measures?
Experts often assert that the quality of a test is determined by validity and reliability. A reliable test is not always valid. However, a valid test will always be reliable.
Test-Retest Reliability – simply test the same group using the same measure 2x and correlate the scores to see if they are consistent.
Equivalent Forms Reliability – two equivalent forms of the same test are administered to the same population and the scores are correlated.
Split-Half Method – examiners take the whole test and split it in half to make two tests. A correlation is then computed between the two halves of the test.
Interrater reliability – with subjective tests. You take the test and then have two independent raters grade it and see if scores are similar.
Reliability coefficient – tells you how reliable a test is.
1.00 is perfect reliability in a test. Happens with physical measures.
A coefficient of .90 (+.90) or higher is considered really good in a psych test.
With a coefficient of .90:
.90 (90%) of the score is accurate
.10 (10%) is d/t error
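The correlation-based reliability checks above can be sketched in a few lines of Python. This is only an illustration: the scores and item responses are made up, the function names (`pearson_r`, `split_half_r`, `error_share`) are hypothetical, and the odd/even split is just one common way to halve a test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def split_half_r(responses):
    """Correlate odd-item totals with even-item totals.

    `responses` holds one row per examinee, with 1 for a correct item
    and 0 for an incorrect item (an assumed scoring scheme).
    """
    odd_totals = [sum(row[0::2]) for row in responses]
    even_totals = [sum(row[1::2]) for row in responses]
    return pearson_r(odd_totals, even_totals)


def error_share(reliability):
    """Portion of the score attributed to error, e.g. .90 -> .10."""
    return 1 - reliability


# Hypothetical data: the same six people tested twice with the same measure.
first_testing = [98, 105, 110, 92, 120, 101]
second_testing = [100, 103, 112, 95, 118, 99]
test_retest_r = pearson_r(first_testing, second_testing)

# Hypothetical data: five examinees answering a six-item test.
item_responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
halves_r = split_half_r(item_responses)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"split-half r  = {halves_r:.2f}")
print(f"error share at r = .90: {error_share(0.90):.2f}")
```

Equivalent forms and interrater reliability follow the same pattern: pass the two sets of scores (form A vs. form B, or rater 1 vs. rater 2) to `pearson_r`.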
Francis Galton – intelligence is a unitary factor that is normally distributed like height or weight (bell-shaped curve). In 1869 he chose 197 men who had achieved fame. It was 300x more likely that a famous person would have a famous relative. Galton felt it was a product of genetics. He was a half cousin of Charles Darwin.
Charles Spearman – in 1904 this British psychologist postulated a two-factor theory of intelligence (g and s factors: general and specific).
Louis Thurstone – intelligence is a series of factors, the primary abilities. Used factor analysis to develop these.
J. P. Guilford – 120 elements add up to intelligence. Best remembered for the dimensions of convergent and divergent thinking.
Raymond B. Cattell – two forms of intelligence: fluid and crystallized.
Fluid – dependent on nervous system and the ability to solve complex novel problems.
Crystallized – application of fluid intelligence to education. It is the ability to use facts.
James McKeen Cattell – coined the term “mental test” in 1890. First person to use psychological tests to predict academic performance.
FIRST INTELLIGENCE TEST – Alfred Binet, a French psychologist, and Théodore Simon, a French doctor, in 1905. Revisions occurred in 1908 and 1911. The first test was named the Binet-Simon scale.
In 1904 the French government wanted to discriminate normal Parisian children from those who were mentally deficient.
Teachers could not be trusted to make this distinction.
Dull children could be separated from the others and placed in a simplified curriculum.
Used the concept of age-related tasks.
Binet never believed his tests measured intelligence.
Intelligence Quotient – IQ is computed using Wilhelm Stern’s formula:
(Mental Age / Chronological Age) x 100 = IQ.
This is known as a ratio IQ.
Today the deviation IQ is preferred. It compares the obtained score against a normative sample.
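A small Python sketch may help contrast the two kinds of IQ. The ratio IQ follows Stern’s formula directly. The deviation IQ is shown under an assumption: the raw score is converted to a z-score against the normative sample and rescaled to a mean of 100 and SD of 15. The function names and the example numbers are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Stern's ratio IQ: (mental age / chronological age) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100.0, iq_sd=15.0):
    """Deviation IQ: locate the raw score within the normative sample.

    Assumes the norm group's mean and SD are known and that the IQ scale
    uses mean 100 and SD 15 (illustrative defaults).
    """
    z = (raw_score - norm_mean) / norm_sd
    return iq_mean + iq_sd * z


# A child with a mental age of 10 and a chronological age of 8.
print(ratio_iq(10, 8))

# A raw score one SD above the mean of a hypothetical norm group.
print(deviation_iq(raw_score=60, norm_mean=50, norm_sd=10))
```

The ratio formula breaks down for adults because mental age does not keep rising with chronological age, which is one reason a comparison against a same-age normative sample is preferred.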
Lewis Terman – in 1916 adapted the Binet for American usage as the Stanford-Binet. It was updated in 1937, again in 1960, and in 1986, when MA/CA was no longer used. At that time the score was not called IQ but SAS, “Standard Age Score.” Since 2003 the Stanford-Binet Intelligence Scales, Fifth Edition, has been used; it can be administered to ages 2-85 and beyond. The current version, created by Gale H. Roid, uses 10 subtests: five verbal and five nonverbal. Mean is 100 and SD is 15. One small controversy remains: the old Form L-M is still the best test for measuring the ability of gifted individuals.
Wechsler Scales – mean score is 100, SD is 15. David Wechsler first published the Wechsler-Bellevue in 1939. It grew in popularity for adults.
WAIS-III – most popular adult intelligence test in the world. 14 subtests: 7 verbal and 7 performance. Yields Verbal IQ, Performance IQ, and Full Scale IQ.
WISC-IV – for children; used for ages 6 years to 16 years 11 months. Takes 50-70 minutes. Six verbal subtests and subtests.
WPPSI-III – Wechsler Preschool and Primary Scale of Intelligence, revised, for ages 2 years 6 months to 7 years 3 months. Takes 1.5 hours. The WPPSI is long, so it can be administered over two sessions. The rationale is that children at this age have difficulty concentrating for long periods of time.
Infant and Preschool IQ tests – useful to pick up mental retardation. Their predictive validity for later IQ is extremely poor.
Denver Developmental Screening Test 2
Bayley Scales of Infant Development (BSID-II) – most widely used. Ages 1-42 months.
FTII – Fagan Test of Infant Intelligence.
Tests given before age 7 do not correlate well with tests later in life.
Group IQ tests – not as accurate as individual tests. Began in 1917 with the Army Alpha and Army Beta, used to test recruits during WW1. In WW2 the Army used the Army General Classification Test (AGCT). Also the Armed Forces Qualification Test. Group tests are used frequently in schools.
PROS – don’t need special training to give. Give to many people.
CONS – Not as accurate
Asian Americans score highest then European Americans, then Hispanic Americans and at the bottom African Americans.
Some feel any IQ test should be a culturally fair test. (eliminate BIAS)
Culture fair tests do not predict academic performance as well
Make them culture free: take the problems on the test and make them problems that would not depend on knowledge of any culture.
A heated debate in social science has been over racial differences in IQ. Arthur Jensen had the social science community arguing back and forth after publishing a 1969 article which stated that blacks scored 11-15 points lower than whites and that this could be due to genetics. Robert Williams created the BITCH test, the “Black Intelligence Test of Cultural Homogeneity.” Any black inner-city child knows that a “deuce and a quarter” is a Buick Electra 225. How many high-IQ kids would answer this question?
SOMPA – System of Multicultural Pluralistic Assessment. Eliminate culture from tests and create culture-free tests. Some say you can eliminate culture from an exam. Proponents of these tests remind us that they tell us nothing about our makeup. They are good predictors of success in life.
The Flynn Effect – IQ scores worldwide are going up. We are unsure whether it is because of better nutrition, earlier maturation, or increased practice with video games.
MMPI – first published in 1943 by Hathaway and McKinley. Assesses the extent of emotional disturbance and helps with diagnosis using 567 true/false questions.
10 clinical scales, including –
Hypochondriasis – concern about health
Hysteria – use of physical/mental symptoms to avoid problems
Paranoia – suspicious
Psychasthenia – excessive worry or guilt
Hypomania – overly active
Social Introversion – shy
Myers-Briggs Type Indicator (MBTI) – based on Carl Jung’s theory of types. Four bipolar scales result in a four-letter type.
Exam hint: the Myers-Briggs is a theory-based inventory since it is based on a theory. The MMPI is a criterion-based inventory since it compares the person taking it to a criterion group.
Self-report inventories like the MBTI are more accurate than projective tests. In a projective test the person is shown neutral stimuli and asked to interpret them (e.g., an inkblot).
Other Misc Personality Things….
Rorschach inkblot test – association projective test: what does the blot bring to mind? Most popular inkblot measure, for ages 3 and up. By Hermann Rorschach, using 10 inkblot cards.
Construction Projective Test – TAT, Thematic Apperception Test. The person being tested is asked to make up or construct a story about a picture on a card. The picture is ambiguous. Created by Henry Murray and Christiana Morgan in 1935. Originally based on need-press theory; today you can utilize a psychoanalytic interpretation.
Expressive Projective Test – draw a person or house/tree/person test. Bender Gestalt test, a test of organicity and screens for brain damage.
Arrangement Projective Test – place pictures in a sequence and discuss why in this order. Sentence completion test. Difficult to hide things here….
Interest & Aptitude testing
Interest Inventory – occupational and educational interests. Students younger than the 10th grade show instability in interests, so their results may not be that valid. It is very easy to give untruthful responses on these. Strong Interest Inventory (SII) – based on Holland’s six types. Most famous.
People who have been happy and successful in an occupation for three years are asked what they like.
When a person’s profile matches theirs, then that particular profession might be appropriate.
Self-Directed Search (SDS) – self-administered and self-scored.
Fairly reliable and nonthreatening.
Aptitude Test – measure an inherited capability rather than what you have learned.
ACT/SAT/GRE – examples
A great aptitude test must have superb predictability.
GATB – assesses students in grades 9-12 and adults with paper and pencil.
Achievement test – measures what you have learned; primarily used in educational settings. Example: National Counselor Examination. Some books say the GRE measures both aptitude and achievement. Some tests cross this fine line.
STANDARD ERROR OF MEASUREMENT (SEM) – how accurate or inaccurate a test is. The standard error is a measure of the variation in a single person’s score if he/she were to take the test again.
EXAMPLE: IQ TEST WITH A STANDARD ERROR OF +/- 3.
YOU GET 100.
68% OF THE TIME YOUR SCORE FALLS BETWEEN 97-103.
A SMALLER PERCENT OF THE TIME YOUR SCORE WILL BE HIGHER/LOWER.
IT IS INACCURATE TO SAY THAT BOB IS SMARTER THAN NANCY IF ONE IS AT 100 AND ONE AT 102.
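The SEM example can be worked through in Python as a rough sketch. The function names are hypothetical, and treating overlapping one-SEM bands as "too close to call" is a simplifying assumption for illustration, not a formal significance test.

```python
def score_band(observed_score, sem, n_sems=1):
    """Range around an observed score, n_sems standard errors wide on each side.

    With n_sems=1 this is the band the score falls in about 68% of the
    time, assuming measurement error is normally distributed.
    """
    margin = n_sems * sem
    return (observed_score - margin, observed_score + margin)


def bands_overlap(band_a, band_b):
    """True if two (low, high) bands share any scores."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


SEM = 3  # standard error from the example above

one_person = score_band(100, SEM)    # score of 100
other_person = score_band(102, SEM)  # score of 102

print("score 100:", one_person)
print("score 102:", other_person)
print("bands overlap:", bands_overlap(one_person, other_person))
```

Because the two bands overlap, a 2-point difference is within the range measurement error alone could produce, which is why the comparison between the two scores is not justified.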