by Martin Green
Everyone in the industry knows how important it is to have a good maintenance department. To cater to this need, many consultants offer testing services to identify workers with superior knowledge of their crafts. Benchmark Testware is one such company. This discussion is based on our experience with projects of this kind.
Not long ago, we were contacted by a medium-sized auto parts manufacturer; we’ll call them “XYZ Mfg.” Out of a total workforce of 500, they had about 40 skilled technicians in various categories, including robotics, tool and die, cold forging, and the paint line. They wanted a test they could use to justify promoting their top technicians to Level 2, involving higher salary and greater responsibility. Why create a new level? Because the company noticed that they were losing some of their best people to competitors; by establishing a senior level with a higher rate of pay, they simply hoped to improve their retention rate.
This example, by the way, is a very typical case of a company carrying out internal testing; “internal” as opposed to “external” testing of outside job applicants. To be sure, there are other reasons for internal testing: one client wanted to establish a “Senior Maintenance Tech” level in order to be able to favor his most capable people with coveted overtime assignments. (His union contract obliged him to allocate overtime jobs on the basis of “share and share alike”.) There are even occasional examples of companies that simply choose to do a survey of their in-house knowledge base for strategic purposes, in order to better allocate their training resources. Some companies want to know what skills they lack before they go out and hire new people. But in fact, such cases of pure “knowledge inventory” surveys are rather few and far between. The great majority of internal testing programs are tied to some form of “pay-for-knowledge” program.
Why would any company use a test to accomplish this purpose? Don’t they already know who their key people are? The obvious reason: labor relations. In a union environment, it can be very difficult for management to do anything which appears to favor one union member over another. By using an “objective” instrument such as a test, the company distances itself from the taint of favoritism.
But there is a big downside to this kind of testing. No test can do a perfect job of identifying the best workers. In fact, it might be argued that MOST tests do a rather POOR job of identifying the best workers! (Typically, the correlation with actual job performance is 20% to 30%. More later on what this actually means.) It is almost inevitable that when the scores are tabulated, there will be a situation where Chuck has outscored Bob even though everyone knows Bob is a better worker. Not only does this defeat the purpose of the whole exercise, but it can lead to considerable jealousy and resentment. And there is nothing to be done at that point except to grind your teeth and live with the results!
Actually, you can do something about it…you can give in to the clamoring and let Bob re-write the test. (This is what our clients at XYZ Mfg. actually did.) But this is a bad idea…because when you give Bob a second chance, then Dick and Harry will want one too. And before you know it, EVERYONE will have passed the test, and you’ll have a shop full of “Senior Technicians”, each one of them earning $2.00/hr more than he was making previously.
It is very easy to end up with a pay-for-knowledge testing program that fails to achieve its stated purposes. Here are some of the main things that are likely to go wrong:
- The test will almost certainly provide a less accurate ranking of the workers than the supervisors could have provided by making personal assessments.
- With re-tests, sooner or later everyone is bound to pass and qualify for the pay hike.
- The whole process can lead to pointless acrimony between workers and management.
This pretty well sums up the argument AGAINST pay-for-knowledge testing. If it’s enough to convince you to never get involved in this kind of thing…think again. Unless you’re lucky enough to own your own paper mill, chances are you sometimes have to live with management decisions that differ from those you might have made on your own. So let’s move forward and assume your company is going to implement pay-for-knowledge. It’s not all bad. Not only are there various ways to cut your potential losses; there are even some positive benefits you can achieve if you do things right.
(Before we jump into the topic of damage containment, it is worthwhile to point out, as an aside, that PRE-employment testing is a whole different ball of wax. Although it is true that any test will fail to accurately rank SOME applicants, there is still a net positive benefit in any partial ranking…especially because, unlike the results from your internal people, a test of people who you bring in from the street will tell you things that you don’t ALREADY know about them. And there is generally no problem with re-testing or post-game arguments about who should have outscored whom.)
But let’s get back to pay-for-knowledge testing. Assuming you’re in on it, how do you maximize your benefits?
There are three general areas where you can make a positive difference: the quality of the test questions, the rules and regulations surrounding the testing process, and the public relations aspect. Let’s look at all three areas in order, beginning with the test questions.
It’s easy to write really BAD test questions. It’s not so easy to write good questions. Therefore, there is a natural tendency for the test to fill up rapidly with bad questions. One of the worst things you can do is pad the test with math. Writing a test for electricians? Be sure to include a good number of Ohm’s law calculations….NOT!!! How about millwrights? Beware of questions that begin, “There is a gear with 42 teeth rotating at 600 rpm…”. These types of questions are easy to write and tell you almost nothing about the actual job ability of the person being tested. It’s a fact of life (trust me) that some of your very best people out on the shop floor cannot calculate their way out of a paper bag.
It’s also wise to avoid too many questions on definitions, e.g., “The NPSH rating of a pump refers to…”. Such things are best left for the engineers to worry about, not the millwrights. And beware of getting overly technical. One of our supervisors at XYZ was assigned to write a series of questions on their NC laser cutter; at first, he came up with three or four very good questions on setup and programming. Then he found the section in the equipment manual dealing with laser physics…and he quickly churned out another dozen questions about wavelengths, coherence, and quantum excitation levels. He was so proud of his work that we could hardly bring ourselves to tell him we couldn’t use his material.
So what’s left? Well, a typical paper mill has thousands of individual pieces of equipment, with tens of thousands of pages of documentation. Open a manual on any given page, pick a drawing, and ask: “What piece of equipment is this detail from: a refiner, a screen, or a winder?” (Computer-based test systems make it easy for you to include plenty of scanned images and digital photographs.) Start off easy, and then get a little harder…but not too hard. There are many, many questions you can ask about your own equipment. “What kind of seals are used on the #3 acid pump…mechanical seals, or packing? What make of coupling is installed between the drive and the pump…Falk or Dodge? Does this pump have a replaceable suction sideplate, or does the whole pump case have to be replaced?”
We have found, as a general principle, that easy questions are better than hard questions. Another way of saying the same thing is: your best workers are not necessarily the ones who can answer the hardest questions…but your worst workers are indeed the ones who get the easy questions wrong. We saw the truth of this observation during our project at XYZ. There was some grumbling among the supervisors because the Cold Forging department had written questions that appeared to be significantly easier than those written by the Robotics group. It turned out later that Cold Forging did indeed have the highest average score (easy questions), but more significantly, their results had the best correlation with actual job performance of any department.
This brings us to the next topic: how do you set the pass score? This is always a problem, because you often don’t have any idea how well people will score. It is very good if you can get the union to concede one senior member as an automatic “Level 2”, so he can write the test ahead of time. Not only can you get a reference point for setting the pass mark, but you can get valuable feedback on the questions. Unfortunately, this kind of thing is not done often enough. Sometimes there will be enough former craftsmen among management…however, these are usually the same people who worked on designing the test questions.
If you can’t get a good measure of test difficulty ahead of time, the worst thing you can do is to arbitrarily “lock in” a pass mark of, for example, 65%. Because if you find later that the scores were higher than expected, and you want to raise the passing grade to 70%, watch out! This is the kind of thing that gets people really upset, and you will probably end up with a grievance on your hands.
If it’s absolutely impossible to get a benchmark ahead of time, it will be very helpful if you can reach an understanding with the union, to the effect that management retains the right to adjust the pass level upwards as required. It is a good idea in any event to have such an understanding, because with the passage of time, there is sure to be “mark inflation” as people re-take the test and gradually remember more and more answers from one year to the next.
It’s even better if you can do without a “pass mark” altogether. How can you do this? By establishing numerical quotas: In any given year, the company will decide how many new “Level 2’s” are required, and the promotions will be allocated on a “first-past-the-post” basis. For whatever reasons, this simple expedient does not appear to be widely used.
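To make the quota idea concrete, here is a minimal sketch in Python. It reads “first-past-the-post” as “highest scores first”, which is an assumption on our part; the function name, the sample scores, and the alphabetical tie-break are all hypothetical and would need to be agreed with the union in any real program.

```python
def allocate_promotions(scores, quota):
    """Return the names of the top `quota` scorers, highest score first.

    `scores` maps a technician's name to a test score (percent).
    Ties are broken alphabetically, which is an arbitrary assumption.
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:quota]]


# Made-up scores for illustration only.
scores = {"Bob": 82, "Chuck": 88, "Dick": 71, "Harry": 64}

# The company decides it needs two new Level 2's this year.
promoted = allocate_promotions(scores, quota=2)
print(promoted)
```

Notice that no pass mark appears anywhere: the number of promotions is fixed in advance, so “mark inflation” from re-tests cannot increase the number of people who qualify.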
As a bare minimum, you ought to make sure there is a reasonable time period between re-tests. We recommend one year, although six months seems to be more common. In some places, dissatisfied applicants can even demand (and receive) immediate re-tests. Needless to say, this makes a mockery of the whole process.
We’ve talked about designing the tests and setting the rules for passing. There’s one more area that needs to be mentioned….the human side. Sadly, many of us have worked in environments where any new initiative on the part of management is automatically opposed by the union. The imposition of testing is an obvious potential flashpoint. What can you do to avoid needless acrimony?
If you’ve done everything right so far….that is, if you’ve written good, relevant test questions, if you’ve sought (and listened to) input from the union on the design of the tests, if you’ve set up a fair process for promoting those who’ve shown they deserve it, and if you’ve communicated the rules clearly to your employees….you’re off to a good start. But there is more that you can do, to make the whole thing a positive experience.
First of all, you should have some fun writing the test questions! It’s a great opportunity for the maintenance supervisors to get together with the engineers and really learn about their own equipment. And in the future, once some of the tradespeople have achieved their Level 2 standing, you can involve them in the question design process as well. It is an empowering experience which can contribute to a culture of sharing knowledge.
Next, you should study the results after the tests are scored. You can learn a lot about the strengths and weaknesses of your knowledge base by seeing which questions were answered well, and which were answered poorly. You may be surprised by the results, which could also help you to better target your future training and hiring efforts.
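As a rough illustration of this kind of review, the sketch below tallies the fraction of people who answered each question correctly and lists the weakest questions first. The question names and answer sheets are invented for the example, and the data layout (one dictionary per person) is only an assumption about how results might be exported from a test system.

```python
def question_pass_rates(answer_sheets):
    """Return {question id: fraction of people who answered it correctly}.

    `answer_sheets` is a list with one dict per person, mapping each
    question id to True (correct) or False (incorrect).
    """
    totals = {}
    for sheet in answer_sheets:
        for question, correct in sheet.items():
            right, asked = totals.get(question, (0, 0))
            totals[question] = (right + int(correct), asked + 1)
    return {q: right / asked for q, (right, asked) in totals.items()}


# Invented results for four technicians on three questions.
answer_sheets = [
    {"pump_seals": True, "coupling_make": True, "laser_setup": False},
    {"pump_seals": True, "coupling_make": False, "laser_setup": False},
    {"pump_seals": True, "coupling_make": True, "laser_setup": True},
    {"pump_seals": False, "coupling_make": True, "laser_setup": False},
]

rates = question_pass_rates(answer_sheets)

# Questions with the lowest pass rates come first.
weakest_first = sorted(rates, key=rates.get)
for question in weakest_first:
    print(f"{question}: {rates[question]:.0%} correct")
```

A question that almost nobody gets right may point to a training gap, or it may simply be a bad question; the numbers only tell you where to look.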
Finally, remember that you are sure to have a number of employees who might not have the brainpower to achieve their Level 2 standing, but are still excellent workers who do their jobs well. They also deserve to be appreciated; and money isn’t the only way to give this message.
This pretty much summarizes some of the positive and negative things I’ve seen in various pay-for-knowledge projects. Before closing off, it is appropriate to mention some of the legal aspects involved in testing.
You may already know that you can get in trouble with the EEOC for giving tests that discriminate against minorities. What exactly is the extent of your liability, and how can you protect yourself?
Discrimination in testing is covered under a legal doctrine known as “adverse impact”. This is a purely statistical measure: it means that disadvantaged groups do relatively poorly on the tests despite the apparent neutrality of the process. “Adverse impact” differs from the more serious form of discrimination known as “disparate treatment” in the following respect: in cases of “disparate treatment”, you can be assessed punitive damages; but in cases of “adverse impact”, your liability is normally limited to the actual lost wages of the affected workers. To be more precise, if there are ten minority applicants for a group of openings where two of them might have expected to be hired…then the ten rejected applicants split the two salaries amongst themselves.
This can still be a fairly serious cost for a company that does a lot of hiring. But note that in cases of internal promotions, the liability is limited to the DIFFERENCE in pay between the two levels. It is also based on the expected number of promotions from the various target groups, based on their presence in the eligible population. So if, according to their numbers in the applicant group, you might have expected to promote ten women/minorities, but you only promoted four…. your maximum liability would be the differential salary (e.g. $2.00/hr) multiplied by six employees. And as the expression goes, “to the losers go the spoils.”
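The arithmetic in that example can be sketched in a few lines of Python. The figures for expected and actual promotions and the $2.00/hr differential come from the example above; the 2,000 paid hours per year used to annualize the result is our own assumption, and the function name is hypothetical. This is an illustration of the calculation, not legal advice.

```python
def max_liability_per_hour(expected_promotions, actual_promotions, pay_differential):
    """Shortfall in promotions multiplied by the hourly pay differential."""
    shortfall = max(expected_promotions - actual_promotions, 0)
    return shortfall * pay_differential


# Expected ten promotions from the target group, but only four were promoted.
hourly = max_liability_per_hour(10, 4, 2.00)

# ASSUMPTION: roughly 2,000 paid hours per employee per year.
annual = hourly * 2000

print(f"${hourly:.2f} per hour worked, about ${annual:,.0f} per year")
```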
There is, however, a way to protect yourself: validate your test. Validation is not a difficult procedure; it is covered in the Code of Federal Regulations in a document called the Uniform Guidelines. You can find all the relevant links on the internet by going to Benchmark Testware’s website at www.aptitude-testing.com. There are different methods of validation; for pre-employment tests, we prefer to use what is called a criterion-based method. This consists of having a sample group of existing employees write the test, and comparing their scores with their known job performance. Using standard statistical methods, you can calculate a correlation value between test scores and job performance ratings. (This figure is also sometimes called the validity coefficient.) As mentioned earlier, typical pre-employment tests have validity coefficients in the range of 20 to 40%. By comparison, Benchmark Testware’s flagship product, the Shop Apprentice test, has shown correlation values in the range of 60 to 80% in various studies throughout Canada and the United States. For an explanation of what these figures mean, see the article “When is a test valid enough?”, posted on our website. For now, it is enough to say that using a test with a validity of 34% is equivalent to hiring a baseball player on the basis of his last ten trips to the plate.
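For readers who want to see the calculation behind a criterion-based check, here is a minimal sketch using the ordinary Pearson correlation, which we assume is the “standard statistical method” meant above. The scores and supervisor ratings are invented, and a sample of six is far too small for a real study; a coefficient of 0.30 from this function corresponds to the “30%” figure quoted in this article.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Invented data: test scores (percent) and supervisor ratings (1 to 5)
# for the same six employees, listed in the same order.
test_scores = [55, 62, 70, 74, 81, 90]
performance = [3, 2, 4, 3, 5, 4]

validity = pearson_r(test_scores, performance)
print(f"validity coefficient: {validity:.2f}")
```

The same function can be run department by department, which is how one would compare, say, the Cold Forging results with the Robotics results.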
For internal promotions, it’s a bit different. Your available sample group happens to be the same group of people who will be writing the test “for keeps”. So you can’t readily do a criterion-based validation. There is another method, called content validation, that consists of having a group of job experts sit down and review the questions. In practical terms, this means you will want to take copious notes at all your review meetings. Leave a big, messy paper trail so it’s obvious that the tests were designed with due care and diligence. Especially, make a point of rejecting questions that don’t meet the criterion of job relevance, and document your reasons.
But for the most part, you shouldn’t have too much to worry about. Most challenges to employment testing occur with large public-sector employers using tests as a mass-screening technique. In plant maintenance applications, you are generally dealing with much smaller numbers. Not only is the potential cash-out much smaller for would-be litigants, but on account of the small numbers, it can actually take quite a few years to accumulate the statistics that are necessary to establish even a prima facie case of adverse impact. (Remember, establishing adverse impact doesn’t mean you’ve broken the law, or that you’ll have to pay damages….it just means you may be asked to show that your test has been validated!)
Finally, one of the biggest things you’ve got going for you, when it comes to testing for crafts knowledge, is this: in plant maintenance, you are dealing with hard skills: electrical knowledge, mechanical knowledge, hydraulic equipment, bearings, welding, etc. These are areas where there is a general perception among all concerned that testing for knowledge is a legitimate prerogative of the employer. So there is not a great deal of motivation to challenge such tests. But by all means, protect yourself by validating, even if it’s not a formal process under the direction of an industrial psychologist. Because any documentation you keep on file which records the process you went through to develop your test can only help you in the unlikely event that there is any trouble down the road.
Martin Green is a mechanical engineer from Winnipeg, Canada. Since 1995, as director of Benchmark Testware, he has been providing pre-employment tests to companies in the manufacturing sector. He can be reached at 1-888-378-9273.