Jul 9 2016
The Value Of Surveys: A Debate With Joseph Paris
Joseph Paris and I debated this issue in the Operational Excellence group on LinkedIn, where he started a discussion by posting the following:
“Riddle me this…
If the Japanese way of management and their engagement with employees is supposedly the best, yielding the best results, why is there such a lack of trust across the spectrum of employment: employers, bosses, teams/colleagues? From Bloomberg and EY.
Japanese Workers Really Distrust Their Employers
Lifetime employment sounds like a great thing, but not if you hate where you work. That seems to be the plight of Japanese “salarymen” and “office ladies.” Only 22 percent of Japanese workers have “a great deal of trust” in their employers, which is way below the average of eight countries surveyed, according to a new report by EY, the global accounting and consulting firm formerly known as Ernst & Young. And it’s not just the companies: Those employees are no more trusting of their bosses or colleagues, the study found.
By contrast, about two-thirds of workers in India and Mexico had a great deal of trust in their employers (see chart).
Joseph Paris thinks that this survey is meaningful and that this bar chart reflects reality. After drilling down to the published summaries of the survey, I came to the conclusion that it was flawed, and that the answer to the riddle is that the survey did not actually prove the existence of any “lack of trust in bosses, teams, and colleagues” among Japanese employees. Following are our exchanges on this topic.
Michel Baudin: This article is another example of blanket conclusions drawn from flimsy evidence. If you follow the links from Bloomberg/BusinessWeek to EY and to the organization that actually did the survey, you find the following disclaimer: “All sample surveys and polls, whether or not they use probability sampling, are subject to multiple sources of error which are most often not possible to quantify or estimate, including sampling error, coverage error, error associated with nonresponse, error associated with question wording and response options, and post-survey weighting and adjustments.”
It is not a serious study. Perhaps a serious study of such a topic is impossible. Does “trust” even have a common meaning in Japan, Brazil, India, and the US? The results are just Belief Statements (BS).
Joseph Paris: That “disclaimer” is true of every sample of data – completeness, accuracy of data, context, variables, etc – and therefore applies to any analysis ever done.
… It does not make this one any less valid than any other analysis ever done (though perhaps more honest).
… You say “flimsy evidence”. On what basis do you make that statement? How much evidence is enough? Or is that your “belief statement”?
Michel Baudin: It does not apply to every analysis ever done. Not every dataset consists of subjective answers provided on a voluntary basis. Subjective answers mean different things to different people, and you have no idea why the non-responders didn’t respond or what they would have responded.
There are objective ways even to measure how employees feel about working in a company, such as employee turnover or absenteeism, and these metrics are about the whole work force, not a self-selected sample.
Joseph Paris: It absolutely applies to every analysis ever done.
… Subjective answers on a voluntary basis – this is metadata, the data about data. Every measurement is influenced by variables – often an infinite number of variables. Yet we squelch many of them out in the name of clarity – manipulating the data. This almost never has nefarious intent.
… The last bit to which you refer concerns measurements which yield an “inferred” result or conclusion – which is never as accurate as a direct measurement.
Michel Baudin: Please explain to me how an analysis of shock absorber defects detected at final test is affected by the following:
“error associated with nonresponse, error associated with question wording and response options, and post-survey weighting and adjustments.”
A number that struck me in a Mexican auto parts maquiladora was that it had an employee turnover of 11%/year, compared to 44%/year for the other plants around it. This objective measurement told me more about the management of this plant than I would have learned from a survey.
About inference, I always like to cite the German tank problem of World War II, in which British statisticians inferred German tank production quantities from the serial numbers of tanks destroyed or captured, and their estimates were within 1% of actuals, as checked after the war. If you know what you are doing, inference is a good second choice when you can’t measure directly.
Joseph Paris: Hello Michel; Thank you for engaging in this debate – even if this thread has deviated from the subject at hand and focused on the form, not the substance.
… In your last post, you invoke a logical fallacy called the “Texas Sharpshooter” in that you restrict the superset of disclaimers identified above and focus on a much narrower few. Fair enough…
You made reference to a shock absorber. Well, I don’t know about shock absorbers or their history. But a shock absorber is just an apparatus, so I will substitute a Power Control Unit (PCU) for my example.
Error associated with question wording and response…
… There was a problem in 737s in which the rudder of the jets would do a hard-over in the direction opposite to the one desired. Eventually (after several instances and a few fatal crashes), it was discovered that the PCU misbehaved under certain circumstances that were neither considered nor tested. This led to a redesign.
… Have all circumstances now been accounted for? Only time will tell.
Your second comment, regarding Mexican manufacturers…
… The survey was not about any individual company, but an aggregate of a country. But again, fair enough…
… Just as you point out that one company had 11% employee turnover compared to 44% at nearby companies, how does that drill down to the department level? The shift level?
… Is there really any difference between a country and a company other than scale?
About inference… You present an anecdotal argument…
… I comment that conclusions from direct measurements are always more accurate than those that are inferred. Your rebuttal is a story about something that occurred once – and then only to 1% accuracy.
… Would actually counting the tanks yield a (much) more accurate assessment?
… Is 1% really good enough? In all instances? If there were an assessment of those shock absorbers that was based on inference, and it was 99% accurate, would that be satisfactory?
In this poll, is inference necessary because a direct assessment of the total cannot be achieved? Is it close enough? It certainly is for me.
… If the Japanese were not such outliers in this poll – and if I didn’t always hear from “Western experts” about how happy and respected the Japanese feel about being employed by Japanese companies – I would not have given it a second thought.
… But neither of those are shown to be true – so it gives cause for pause.
Michel Baudin: From the disclaimer, I took a few statements that are definitely specific to surveys. The people who conduct surveys know their limitations and are usually cautious in their conclusions. The people who quote them, trying to get to the essence of the conclusions, make broader claims with few caveats, and the people who quote the people who quote them make even broader claims. After three or four levels of indirection, the conclusions are global-cosmic.
To return to the original discussion, as an employee, you may trust your employer at a basic level to pay your wages on time, at a higher level to keep you around in a downturn, and at the highest level to help you navigate your career path. Trust can mean different things, particularly in different cultures.
During WW II, counting tanks rolling off the line in Germany was not an option for the British. Knowing within 1% how many tanks the enemy could build per month was valuable intelligence, and much more accurate than the alternative estimates. (See the Wikipedia article on the German tank problem.)
I find this case fascinating because the method was so clever. It was also an early example of data mining, in that it extracted information from data that had not been collected for this purpose.
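For readers who have not seen the German tank problem before, here is a minimal sketch of the standard serial-number estimator, in Python, with made-up numbers rather than the wartime data:

```python
import random

# Minimal sketch of the frequentist estimator behind the "German tank problem"
# (made-up numbers, not the wartime data): given k serial numbers observed out
# of tanks numbered 1..N, estimate N as m + m/k - 1, where m is the largest
# serial number observed.
def estimate_total(serials):
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

random.seed(1)
true_production = 276                                   # unknown to the analyst
captured = sorted(random.sample(range(1, true_production + 1), 10))
print(captured)                                         # the only data available
print(round(estimate_total(captured)))                  # should land near 276
```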
Your PCU case is similar to the shock absorber case, in that it was free of subjective data.
You wrote: “It absolutely applies to every analysis ever done.” You cannot prove that with anecdotes, but you can refute it with just one. Logically, all it takes is one case that contradicts it.
Joseph Paris: Now all we have to do is find the one…
Michel Baudin: Your Boeing 737 PCU example fits the bill.
The evolution of employment relations in Japan is a topic worth looking into, but the EY study of trust does not look as if it can be trusted.
Back in 1979, Robert E. Cole published “Work, Mobility, and Participation: A Comparative Study of American and Japanese Industry.” It was quite detailed on HR practices within Ford and Nissan, and their respective supply chains. I remember, for example, that he found that, while professional-level employees were twice as likely to leave Ford as Nissan, they were six times more likely to leave Ford suppliers than Nissan suppliers.
I would love to see a current study conducted at this level.
Joseph Paris: Actually, the Boeing 737-PCU does not fit your bill.
… I maintain that the “disclaimer” remains true of every sample of data – completeness, accuracy of data, context, variables, etc – and therefore applies to any analysis ever done.
The fact that the problem with the Boeing 737-PCU was not discovered would be an example of the following (see embedded):
“All sample surveys and polls, whether or not they use probability sampling, are subject to multiple sources of error which are most often not possible to quantify or estimate, including; … coverage error (not tested), error associated with nonresponse (not discovered with existing test), error associated with question wording (test design) and response options (test rigor)…”
But… All of this still leaves us with…
… The lack of trust between and among Japanese employees, employers, and their co-workers – where trust is half that of their peers – makes Japan an outlier of such magnitude that there is undoubtedly something more there than just “polling error”.
Michel Baudin: To me “surveys and polls” refers to people, not machines. You don’t survey the response of a PCU to a command, you measure it. “Nonresponse” refers to a person’s choice of not responding, not to the failure of an instrument to get a reading; “question wording,” to questions specifically asked to people; “response options,” to checkboxes on a list…
The differences in the sizes of the bars may be due to cultural differences in the meaning of “a great deal of trust,” or to disgruntled employees choosing to answer while happy employees declined… It may be true that Japanese employees have lost trust in their employers and colleagues. All I am saying is that this kind of survey does not prove it, because it is based on a flawed method.
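As an aside for readers of this post, a toy simulation with made-up response rates (my own illustration, not the EY data) shows how self-selection alone can cut a reported trust share in half, even when the underlying workforce is identical:

```python
import random

# Toy illustration of self-selection / non-response bias, not a model of the EY
# survey: the true share of "trusting" employees is the same in both runs, but
# in the second run distrusting employees are three times as likely to answer.
def observed_trust_share(true_share, p_answer_trusting, p_answer_distrusting,
                         workforce=100_000):
    answers = []
    for _ in range(workforce):
        trusting = random.random() < true_share
        p_answer = p_answer_trusting if trusting else p_answer_distrusting
        if random.random() < p_answer:
            answers.append(trusting)
    return sum(answers) / len(answers)

random.seed(0)
print(observed_trust_share(0.5, 0.3, 0.3))  # ~0.50: equal response rates
print(observed_trust_share(0.5, 0.1, 0.3))  # ~0.25: same workforce, skewed sample
```

The point is not that this is what happened in Japan, only that a self-selected sample cannot distinguish a real difference in trust from a difference in who chooses to answer.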
Joseph Paris: There is no difference between collecting and analyzing data – whether man or machine – both are measuring. We even “poll the data”; it is part of the lexicon.
… Prove? If it smells like dung, looks like dung, and has the consistency of dung, I don’t have to taste it to know it’s dung.
Michel Baudin: You can measure people’s height, weight, pulse, time to run 100m, ability to solve test problems, saving habits, … But, if you don’t see the difference between that and asking them whether they have a “great deal of trust” in their colleagues, their managers, and their employer, I don’t know what else to say.
Joseph Paris: Data is data… Perhaps you never had to give a performance review – that would explain your not understanding the correlations.
Michel Baudin: What do you mean by “data is data”? Meaning doesn’t matter? Data quality doesn’t matter? Does it not matter whether you have 100 points or 10 million?
The best data analysts I know work on a domain in which they have expertise. They don’t just know data science. They also know semiconductor process technology, or e-commerce, or epidemiology, and they start from the domain’s problems.
Joseph Paris: Still… all this “distraction” (and that is what it is) leaves us with…
… The lack of trust between and among Japanese employees, employers, and their co-workers – where trust is half that of their peers – makes Japan an outlier of such magnitude that there is undoubtedly something more there than just “polling error”.
Michel Baudin: Is it a distraction, or is it the answer to your riddle? Whenever we hear that an assertion is based on a study, our first reaction should be to check that the study credibly supports the assertion. Otherwise, we may be misled and waste our time. Six months ago, I posted on LinkedIn an article entitled “Hundreds of Studies Show…” Yeah, right! Your comments would be welcome.
#datascience, #survey, #selfselectedsample, #japan, #subjectivity, #objectivity
Kyle Harshbarger
July 10, 2016 @ 7:15 pm
Apparently the survey responders are those that have self-selected into a Harris Poll survey group. This cannot be representative of any population. I don’t know enough about Japanese culture to make a generalization, but I think Americans who self-select into this type of group are likely to be in jobs where trust is lower (food service, retail). Unfortunately, there is not enough of a demographic breakdown in the study to come to meaningful conclusions.
Joseph Paris
May 14, 2017 @ 11:19 am
Mike; Thank you for sharing… Discourse and debate are healthy and necessary.