By Jordan Winberg
The following article is part of a multi-part series of excerpts from the author’s senior thesis. Begin with Part 1.
The Impact of Stereotypes
To get a more in-depth look at how stereotypes may be affecting women in leadership, it is important to consider the work of Professor Madeline Heilman, PhD. Professor Heilman is a psychologist who has devoted her life’s work to investigating how stereotypes can adversely affect how women are evaluated in the workplace. Her first breakthrough was already discussed in the first paragraph of the paper: the lack of fit model.
Heilman found that if a person is not seen as a “good fit” for the job (that is, the person has a lack of fit), then the person will be expected to fail, and confirmation bias will then negatively influence the way that person is evaluated (Heilman, 1983). In addition, expectations for both what women are and what women should be interfere with their ability to obtain and succeed in leadership roles; competency alone is not enough for women to succeed (Heilman, 2001).
This was demonstrated in an experiment involving the evaluation of a pair of workers, one male and one female (Heilman & Haynes, 2005). During the experiment, subjects are told that they are participants in a study that will help evaluate the accuracy and effectiveness of an employee assessment.
To do this, participants evaluate one of the team members on their ability to create an investment portfolio, which was designed to yield the largest return. Participants are told that each team member spent a period of time developing a plan individually, and then they grouped together to form a joint plan; however, the participants are also told that due to time constraints, they are only going to be able to evaluate one of the two workers.
In actuality, the researchers use a fake team with fake results. Background information about each worker, a picture of the workers, and a group score for success are given to each participant. Two different sets of background information are used; however, they are designed to be equal, and each set is used 50% of the time for each gender. The picture of the workers always stays the same, and measures are used to equalize attractiveness, age, dress, and facial expression.
Finally, the group score for task outcome is always excellent (92 out of 100), and it is given to the participant. Therefore, the only significant difference between the workers is gender, and each participant is told that the team collectively did an excellent job on the task. In the final section, more specific ratings are given about the task. However, half of the time, the title for the final section reads “Individual Assessment Form,” and half of the time, it reads “Group Assessment Form.” This way, some participants assume they have group scores, and some assume they have individual scores.
The scores never change in this section, and all sections have favorable or high ratings. Each participant is then asked to rate their worker on a scale of one to nine, with one being not at all and nine being very much, on the following questions: “To what extent do you think this individual was influential in determining the joint portfolio?”, “To what extent do you think this individual was responsible for the final budget?”, and “To what extent do you think that this individual took the leadership role?”
Results indicate that the female worker is rated significantly lower than the male worker when participants are given group assessment scores, even though the group assessment scores are high (Heilman & Haynes, 2005).
A replication of the study was done in which only the type of task changed: participants are told that the workers need to devise an appropriate budget for a computer software company, and that the workers need to research the law before starting the task. In this replication, the manipulated (independent) variable is how the research is divided: 50% of the time, the workers are given specific, individual pieces of law to research, while 50% of the time, each worker researches everything.
Similar effects are seen in this variation, as females are given significantly lower ratings than males when the workers have a joint task (Heilman & Haynes, 2005).
A third replication of the study is done, where participants are again told that the team needs to devise an appropriate budget for a computer software company. However, in the third replication, background information includes a fake evaluation from a past employer. One set of performance evaluations has “specific” results (i.e., “Top 2%,” “Top 5%,” “Top 10%,” “Top 25%,” “Top 50%,” and “Bottom 50%”), while one set of performance evaluations has “vague” results (i.e., “Top 25%,” “Top 50%,” and “Bottom 50%”), and a control is used where no evaluation is given.
Results show that female workers who do not have a past supervisor rating, or who have “vague” supervisor ratings, are scored significantly lower than the male workers (Heilman & Haynes, 2005).
All versions and replications of this study indicate that competency is not always enough for females to have fair ratings on their performance. Evidence indicates that successful completion of a task given to a female, in whole or in part, is not attributed to the female’s competency unless specific proof is given (Heilman & Haynes, 2005).