What’s Going On In German Companies | Bodo Wiegand | Wiegand’s Watch

Bodo Wiegand heads Germany’s Lean Management Institute. In his latest newsletter, on Wiegand’s Watch, he explains his concerns about the future competitiveness of German companies. Here is my full translation of his article, followed by my comments:

Bodo Wiegand: “A huge potential is not realized and simply left fallow – can we really afford that?

I think we cannot afford it.

In China and India, more engineers are trained each year than we have in Germany in total, and then we fail to exploit the huge potential of the engineers we have. Why? Because we do not want to give up our fiefdoms, our functional thinking and our single-minded concern for our turf.


Manufacturing Data Cleaning: Thankless But Necessary

Whether you manage operations with paper and pencil as in 1920 or use the state of the art in information technology (IT), you need clean data. If you don’t have it, you will suffer all sorts of dysfunctions. You will order materials you already have or don’t need, and be surprised by shortages. You will make delivery promises you can’t keep, and ship wrong or defective products. And you will have no idea what works and what doesn’t in your plant.

I have never seen a factory with perfect data, and perhaps none exists. Dirty data is the norm, not the exception, and the reason most factories are able to ship anything at all is that their people find ways to work around the defects in their data, from using expediters to find parts that aren’t where the system thought they were, to engineers who work directly with production to make sure a technical change is implemented. Mei-chen Lo, of Kainan University in Taiwan, proposed a useful classification of the issues with data quality. What I would like to propose here is pointers on addressing them.


Is Choosing a Consultant Truly The Second Step in ERP Implementation?

According to the previously cited guide from ERP Focus, choosing an implementation consultant is the second step of ERP implementation, right after selecting a vendor. In the consulting business, being certified as an implementer by a leading ERP vendor is known as a license to print money. Even vendors of ERP products acknowledge that their customers spend more to implement the software than to buy it, and that much of this cost goes into consulting fees. The following are a few thoughts about the process of ERP implementation and the roles played by consultants, contractors, and the in-house IT team.

Toyota’s IT Vision at Industry Week’s Best Plants Conference | Chain of Thought

See on Scoop.it: lean manufacturing

“‘…Toyota Motor’s group leaders were complaining about the systems IT was delivering. They wouldn’t let them focus on being out on the production line. So IT’s focus became providing tools to allow group leaders to be more efficient…”

Michel Baudin‘s insight:

The article’s author is challenged about getting to the point but, when he eventually does, it is worth reading. What I found most original is IT focusing on the needs of group leaders, Toyota’s name for first-line managers, who oversee four to six teams of four to six operators each. It is a constituency that is definitely underserved by IT in most manufacturing organizations and whose potential is underestimated.

Most companies expect little from first-line managers beyond expediting parts, tracking time and attendance, and disciplining workers to make their numbers. In fact, being both part of management and in direct contact with production operators on the shop floor puts them in a unique position as agents of change.

This is why TPS puts them in charge of smaller groups, with the expectation that they will spend time leading improvement projects and supporting the professional growth of their teams. Most IT groups pay more attention to the executive suite than to the shop floor, where, in particular, you are not just interacting with people through screens but also with machines through their controllers. This requires a different set of IT skills, and the article says that Toyota partnered with Rockwell Automation for this purpose.

See on mhlnews.com

ERP and Lean

The discussion Pat Moody started in the Blue Heron Journal is in the form of advice to a production planner in a heavy equipment plant who has been put in charge of implementing a new ERP system to replace a collection of legacy systems. The call for help is signed “Hopeful in the Midwest.”

What would we say if, instead, this person had been tasked with throwing out all the machine tools of multiple vintages that make up the plant’s machine shop and replace them with one single, integrated Flexible Manufacturing System (FMS)?

My recommendation to this person would be to find another job. Unless the company has gone through preparation steps that Hopeful does not mention, the ERP project is likewise headed for disaster and Hopeful should run from it.

ERP boosters take it for granted that one single integrated system to handle all information processing for a plant is an improvement over having multiple systems. From a marketing standpoint, it is a powerful message, well received by decision makers, as evidenced by the size of the ERP industry.

Yet most plants do have multiple systems, and it is worth asking why. It is not just because organizational silos are uncoordinated. It is also because the best systems for each function are made by specialized suppliers. The best systems for production planning and scheduling, supply chain management, maintenance, quality, human resources, etc. are developed by organizations led by experts in each of these domains.

ERP systems are built by companies that grew based on expertise in one of these domains and then expanded to the others, in which they had no expertise. One major ERP supplier got its start in multi-currency accounting; another by dominating the market for Database Management Systems; yet another by focusing on HR management. Unsurprisingly, the software they provided in all other areas has frustrated practitioners by its mediocrity.

Perhaps, the reason you hardly ever meet any manufacturer who is happy with an ERP implementation is that the idea of an all-in-one integrated system is not that great to begin with.

What is the alternative?

First, management should respect the need for departments to have the systems that support them best, requiring only that they should be able to share information with other departments.

For example, Marketing, Engineering, and Accounting should not be mandated to use modules from a single all-in-one system, but they should be required to use the same product IDs and product families, for management to be able to view sales, production, and financial results accordingly.

To make this possible, the company needs a consistent information model of its activities, including the objects that need to be represented, the states these objects can be in, the information they need to exchange, and a structure for all the retained information.
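To make the point about shared product IDs concrete, here is a minimal sketch in Python. The product IDs, family names, and figures are invented for illustration, and the three dictionaries stand in for extracts from separate departmental systems; nothing here reflects any particular ERP or middleware product. The only thing the three sources have in common is the product ID, and that is enough to roll everything up by product family.

```python
from collections import defaultdict

# Hypothetical master data shared by all departmental systems:
# product ID -> product family.
PRODUCT_FAMILY = {"P-100": "Pumps", "P-200": "Pumps", "V-300": "Valves"}

# Hypothetical extracts from three separate systems (Marketing,
# Production, Accounting), all keyed by the same product IDs.
sales_units = {"P-100": 120, "P-200": 80, "V-300": 50}
produced_units = {"P-100": 110, "P-200": 90, "V-300": 55}
revenue = {"P-100": 24000.0, "P-200": 20000.0, "V-300": 7500.0}


def by_family(values):
    """Roll up a per-product measure to product families."""
    totals = defaultdict(float)
    for product_id, value in values.items():
        totals[PRODUCT_FAMILY[product_id]] += value
    return dict(totals)


def family_report():
    """Combine the three departmental views into one view per family."""
    sold = by_family(sales_units)
    made = by_family(produced_units)
    money = by_family(revenue)
    return {
        family: {
            "sold": sold[family],
            "produced": made[family],
            "revenue": money[family],
        }
        for family in sold
    }


if __name__ == "__main__":
    for family, figures in family_report().items():
        print(family, figures)
```

If the departments used different IDs for the same product, the lookup in `by_family` would fail or, worse, silently split one product across several families, which is the kind of inconsistency a common information model is meant to prevent.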

The development of such a model is beyond the capabilities of a production planner, and often beyond the capability of anyone in the IT department of a manufacturing company. It requires high-level know-how in systems analysis and database design, and should be done by a consultant who is independent of any ERP supplier, in cooperation with the operating department and the IT group.

The first phase should focus on improving the performance of the legacy systems in targeted areas, and introducing middleware to facilitate the integration of data from multiple legacy systems. This involves work in Master Data Management for specs and nomenclature, Data Warehousing for history, and real-time databases for status.

The replacement of legacy systems should be considered based on the lessons learned through improvement, in particular with a realistic, internally developed view of costs and benefits. As in the case with new production equipment, the introduction of new IT systems may best be coordinated with the development of new production lines or plants.

Lean IT Summit 2012 – Were the presentations on topic?

See on Scoop.itlean manufacturing

What does “Lean IT” mean? Most of those who use this term think of it as the application of Lean principles to the operations of IT companies or departments. To others, including myself, it is primarily about the effective use of IT in support of Lean manufacturing, which is a completely different but very real problem.

The IT departments of manufacturing companies are rarely any help in implementing Lean. The plants’ information systems are full of data about the business and the technology of the plant that is only accessible through the IT department, but its members are busy implementing system upgrades and have no clue what uses this data could be put to.

On the other side are Lean implementers whose computer skills are limited to Excel and PowerPoint, and who underestimate the potential of 21st century IT based on comments made by Ohno in the 1960s.

I looked in vain for any discussion of these issues in the presentations at this summit, although plenty of solutions exist. There could have been, for example, discussions of how to use data warehousing technology to consolidate data from multiple legacy systems into a single, comprehensive source of clean data about product and process specs, the status and history of demand and production, quality, maintenance, inventory, and the supply chain, along with plans for the future. Then it would have been fascinating to hear how tools like the ones Nate Silver applied to US presidential politics could be applied to identify patterns in this data, and how these patterns could be used to allocate products among families and size production lines for each family…

We could also have heard about Lean implementers breaking out of the Excel/PowerPoint box, learning how to design and query databases, and how to judiciously apply data mining tools.

This is about making IT effective, and it should be job one. Then we can worry about applying Lean internally to IT departments to make them more efficient.

See on www.lean-it-summit.com

What are standards for?

Many discussions of standards in a Lean context do not address purpose. In Manufacturing, product quality cannot be guaranteed unless the same operations are done the same way on every unit, regardless of time, shift, day, or even plant. This is why you need Work Standards and Job Breakdown Sheets. To achieve high productivity, you then need to design operator jobs to fill up the takt time with useful work, consistently throughout their shift, and without overburdening them. This is the purpose of Standardized Work, which is visible in manual assembly in the form of Yamazumi charts and, where human-machine interactions are involved, Work-Combination charts.
What is remarkable in Lean plants is that management pays attention to standards and that operators actually follow them. Elsewhere, the standards are in 3-ring binders sitting unread on shelves and containing obsolete information, with the knowledge of how the work is actually done residing only in the heads of the people who do it (See Figure 1).
Figure 1. Operator instructions in binders
To make sure the binders are actually on the shelf and in the proper order, it is becoming common as part of 5S to run tape across the back as shown here. The primary purpose of such binders is to show their own existence to auditors. There usually is no space or music stand provided near operator work stations to hold them open. The content is usually based on verbose boilerplate and generated by writers who do not leave their offices. Clearly, this is not an effective method to direct how work is actually done. The Lean approach is to provide instructions in A3 sheets posted above each station. For examples, see Figures 2 and 3; for details, Lean Assembly, pp. 157-163 and Working with Machines, pp. 133-153.
Figure 2. Assembly instruction sheets for stations and Yamazumi Chart for an assembly line
Figure 3. Work-Combination Chart for Machining in a cell
The relationship between standards and Kaizen is complex. The existence of effective standards is not a precondition for improving operations, or else too many operations would be impossible to improve. Once standards are in place, however, they provide a baseline for future improvements.
In general, standards constrain what people do, and don’t work well when developing them turns into making rules for others to follow. An overabundance of standards, and rigid enforcement, can stifle the very creativity we want to nurture in the work force. There are two approaches to solving this dilemma that need to be used jointly:
  1. Avoid unnecessary standards.
  2. Have the standards include a transparent and simple process for improving them.

Avoiding unnecessary standards

Consulting for Motorola some years ago, I remember being shown a stack of organization charts from at least 20 engineering groups, and no two of them were in the same format. There were all sorts of rectangular or rounded boxes and straight or curved connections. Most were in the usual pyramid shape, but some were inverted, with the manager at the bottom and the individual contributors on top. All charts were equally easy to read, but there was also a clear message: “We are a company of 40,000 entrepreneurs, and we don’t standardize what doesn’t need to be.”

Unnecessary or counterproductive standards

The opposite extreme is the German standard (DIN) that specifies the size of the balls at the end of motorcycle brake handles. According to Kei Abe, who designed motorcycles at Honda in the 1960s, when Soichiro Honda found out about this standard, he said: “The German motorcycle industry is doomed.”
Figure 1. Ball at the end of motorcycle brake handle
Employee email addresses are an area where standardization is directly detrimental to the objectives pursued by the company. Except for those involved with sales or public relations, most companies do not publish their employees’ professional email addresses, so as to protect them from spammers and recruiters. Yet they generate these addresses in standard formats, the most common being first-name.last-name@domain-name.com. This standard is easily inferred from the business card of a single employee, and enables any outsider to build an email list by retrieving names from a social network and formatting them to the standard.

Necessary standards

In Manufacturing, here are some examples where standards are necessary but frequently not in place, starting with the most obvious:
  • System of units. In US plants of foreign companies, it is not uncommon to encounter both metric and US units. Companies should standardize on one system of units and use it exclusively.
  • Technical parameters of the process, such as the torque applied to a bolt, or die characteristics in injection molding, diecasting, or stamping.
  • Instruction sheet formats. Supervisors who monitor the work with the help of instruction sheets posted above each station need to find the same data in the same location on the A3 sheet everywhere.
  • Operator interfaces to machine controls. Start and emergency stop buttons should look alike and be in consistent locations on all machines. So should lights, sounds, and messages used for alarms and alerts.
  • Andon lights color code. Andon lights are useless unless the same, simple color code is used throughout the plant, allowing managers at a glance to see which machines are working, idle, or down.
  • Performance boards for daily management. Having a common matrix of charts across departments is a way to ensure that important topics are not forgotten and to make reviews easier. For a first-line manager, for example, you may have columns for Safety, Quality, Productivity and Organization, and rows for News, Trends, Breakdown by category, and Projects in progress.

Corporate Lean groups and standards

At the corporate level, standards are necessary for operational data provided by the plants. On the other hand, it is easy for Corporate to overreach in mandating management and engineering practices at the plant level. Corporate Lean groups, for example, have been known to demand current and future state Value-Stream Maps from every shop of every plant as a standard. Such maps are then dutifully produced by people who do not always understand the technique and its purpose, and whose organizations may be functional departments rather than Value Streams. These maps are then posted on walls for visitors to see.
More generally, corporate Lean groups should refrain from issuing standards that mandate implementation tactics at the plant level. Tom DeMarco made a useful distinction between methods and methodologies. Methods are like tools in a box: as a professional, you pick which ones to use as needed to solve the problem at hand. A methodology, on the other hand, walks you through a sequence of 12 steps that supposedly leads to a solution regardless of what the problem is. A methodology is an excuse for not thinking; it turns people into what DeMarco calls “template zombies.” He writes about software development, but there are template zombies in Manufacturing.
The rigidity associated with methodological thinking is best illustrated by the following story on exam questions:
Question 1: How to boil water?
Answer 1: Take a pot, fill it up with water, place it on the stove, turn on the burner, and wait.
Question 2: How to boil water, when you already have a pot of cold water on the stove?
Answer 2: Empty the pot, put it away, and you are back to Question 1.
Not only do methodologies make you do unnecessary tasks, but they also restrict your achievements to what they can be effective for. In many companies that have corporate Lean programs, as a plant manager or engineer, you will get no credit for improvements by any means other than the standard methodology, and may even lose your job for failing to apply it, regardless of your results.

Instead of trying to develop and enforce a standard, one-size-fits-all methodology for all of a company’s plants — whose processes may range from metal foundry to final assembly — the corporate Lean group should instead focus on providing resources to help the plant teams develop their skills and learn from each other, but that is a different discussion.

Process for improving standards

When a production supervisor notices that an operator is not following the standard, it may mean that the operator needs to be coached, but it may also mean that the operator has found a better method that should be made the standard. But how do you make this kind of local initiative possible without jeopardizing the consistency of your process? The allowed scope for changes must be clear, and there must be a sign-off procedure in place to make them take effect.

Management policies

I remember an auto parts plant in Mexico that had dedicated lines for each customer. Some of the customers demanded to approve any change to their production lines, even if it involved only moving two machines closer, but other customers left the auto parts maker free to rearrange their lines as they saw fit as long as they did not change what the machines did to the parts. Needless to say, these customers’ lines saw more improvement activity than the others.

In this case, the production teams could move the torque wrench closer to its point of use but they could not replace it with an ordinary ratchet and a homemade cheater bar. The boundary between what can be changed autonomously and what cannot is less clear in other contexts. In milling a part, for example, changing the sequence of cuts to reduce the tool’s air cutting time can be viewed as not changing the process but, if we are talking about deep cuts in an aerospace forging, stresses and warpage can be affected by cut sequencing.

If a production supervisor has the authority to make layout or work station design changes in his or her area of responsibility, it still must be done with the operators, and there are several support groups that must be consulted or informed. Safety has to bless it; Maintenance, to make sure that technicians still have the required access to equipment; Materials, to know where to deliver parts if that has changed. Even in the most flexible organizations, there has to be a minimum of formality in the implementation of changes. And it is more complex if the same product is made in more than one plant. In the best cases, when little or no investment is required, the changes are implemented first, by teams that include representatives from all the stakeholders, and ratified later. We can move equipment on the basis of chalk marks on the floor, but, soon afterwards, the Facilities department must have up-to-date layouts.

The more authority is given to the local process owners, the easier it is to implement improvements, but also the more responsibility upper managers assume for decisions they didn’t make. The appropriate level of delegation varies as Lean implementation progresses. It starts with a few, closely monitored pilot projects;  as the organization matures and develops more skills, the number of improvement projects explodes, and the local managers develop the capability to conduct them autonomously. At any time, for the upper managers, it is a question of which decisions pass the “sleep-at-night” test: what changes can they empower subordinates to make on their own and still sleep at night?

Generating effective standards

If there is a  proven method today to document manufacturing processes in such a way that they are actually executed as specified, it is Training Within Industry (TWI). The story of TWI is beginning to be well-known. After being effective in World War II in the US, it was abandoned along with many wartime innovations in Manufacturing, but lived on at Toyota for the following 50 years before Toyota alumni like John Shook revived it in the US.

There are, however, two limitations to TWI, as originally developed:

  1. It is based on World War II information technology. It is difficult to imagine, however, that if the developers of TWI were active today, they would not use current information technology.
  2. It includes nothing about revision management. There is a TWI Problem-Solving Manual (1955), and solving a problem presumably leads to improving the process and producing a new version of job breakdown, instructions, etc. This in turn implies a process for approving and deploying the new version, archiving the old one and recording the date and product serial numbers of when the new version became effective.
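As a rough illustration of what the second point implies, here is a minimal Python sketch of a revision record for a work instruction. All class, field, and method names are hypothetical, and the sketch assumes that revisions are released in order of increasing serial number. It is not how TWI or any PDM product represents revisions; it only shows the data that approval, archiving, and effectivity tracking require.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Revision:
    number: int
    content: str
    approved_by: str
    effective_date: date
    first_serial: int  # first product serial number built under this revision


@dataclass
class WorkInstruction:
    operation: str
    revisions: List[Revision] = field(default_factory=list)

    def release(self, content, approved_by, effective_date, first_serial):
        """Deploy a new approved revision; earlier ones stay archived in the list."""
        revision = Revision(
            number=len(self.revisions) + 1,
            content=content,
            approved_by=approved_by,
            effective_date=effective_date,
            first_serial=first_serial,
        )
        self.revisions.append(revision)
        return revision

    def current(self) -> Optional[Revision]:
        """Return the latest revision, or None if nothing has been released."""
        return self.revisions[-1] if self.revisions else None

    def revision_for_serial(self, serial: int) -> Optional[Revision]:
        """Return the revision that was in effect when a given unit was built."""
        applicable = [r for r in self.revisions if r.first_serial <= serial]
        return applicable[-1] if applicable else None
```

With records like these, a question such as "which version of the job breakdown was in force when unit 1800 was built?" has an answer that does not depend on anyone's memory.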

Revision management

The developers of TWI may simply have viewed revision management as a secondary, low-level clerical issue, and it may have been in their day. The pace of engineering changes and new product introduction, however, has picked up since then. In addition, in a Lean environment, changes in takt time every few months require you to regenerate Yamazumi and Work Combination charts, while Kaizen activity, in full swing, results in improvements made to thousands of operations at least every six months for each.

In many manufacturing organizations, the management of product and process documentation is slow, cumbersome, and error-prone, particularly when done manually. Today, Product Documentation Management (PDM) is a segment of the software industry addressing these issues. It is technically possible to keep all the standards, with their revision history, in a database and retrieve them as needed. The growth of PDM has not been driven by demands from the shop floor but by external mandates like the ISO-900x standards, but, whatever the reasons may be, these capabilities are today available to any manufacturing organization that chooses to use them.

Using software makes the flow of change requests more visible, eliminates the handling delays and losses associated with paper documents, and allows multiple reviewers to work concurrently, but it does not solve the problem of the large number of changes that need to be reviewed, decided upon, and implemented.

This is a matter of management policies, to cover the following:

  1. Making each change proposal undergo the review process that it needs and no more than it needs.
  2. Filtering proposals as early as possible in the review process to minimize the number that go through the complete process to ultimately fail.
  3. Capping the number of projects in the review process at any time.
  4. Giving the review process sufficient priority and resources.
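Policies 2 and 3 can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class and method names are hypothetical, the early screen is reduced to a yes/no flag supplied by the caller, and proposals beyond the cap wait in first-in, first-out order. A real policy would also have to cover policy 1, with different review paths for different kinds of changes.

```python
from collections import deque


class ReviewPipeline:
    """Change proposals under review, with an early filter and a cap."""

    def __init__(self, cap):
        self.cap = cap            # maximum number of proposals in review at once
        self.in_review = []       # proposals currently being reviewed
        self.waiting = deque()    # screened proposals waiting for a slot

    def submit(self, proposal, passes_screen):
        """Filter early, then admit the proposal or queue it behind the cap."""
        if not passes_screen:
            return "rejected"
        if len(self.in_review) < self.cap:
            self.in_review.append(proposal)
            return "in_review"
        self.waiting.append(proposal)
        return "waiting"

    def complete(self, proposal):
        """Close a review and pull the next waiting proposal, if any."""
        self.in_review.remove(proposal)
        if self.waiting:
            self.in_review.append(self.waiting.popleft())
```

The cap works the same way as a limit on work in process on a production line: nothing new enters the review process until something leaves it.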


In principle, revision management can be applied to any document. In practice, it helps if the documents have a common structure. If they cover the same topics, and the data about each topic is always in the same place, then each reviewer can immediately find the items of interest. This means using templates, but also walking the fine line to avoid turning into DeMarco’s template zombies.

If you ask a committee of future reviewers to design an A3 form for project charters, it will be a collection of questions they would like answered. Accountants, for example, would like to quantify the financial benefits of projects before they even start, and Quality Assurance would like to know what reduction in defective rates to expect… Shop floor teams can struggle for days trying to answer questions for which they have no data yet, or that are put in a language with acronyms and abbreviations like IRR or DPMO that they don’t understand. More often than not, they end up filling out the forms with text that is unresponsive to the questions.

The teams and project leaders should only be asked to answer questions that they realistically can, such as:

  • The section of the organization that is the object of the project, and its boundaries.
  • The motivation for the project.
  • The current state and target state.
  • A roster of the team, with the role of each member.
  • A crude project plan with an estimate for completion date.
  • A box score of performance indicators, focused on the parameters on the team performance board that are reviewed in daily meetings.

The same thinking applies to work instructions. It takes a special talent to design them and fill them out so that they are concise but sufficiently detailed where it matters, and understood by the human beings whose activities they are supposed to direct.


It is also possible to display all instructions on the shop floor in electronic form. The key questions are whether it actually does the job better and whether it is cheaper. In the auto parts industry, instructions are usually posted in hardcopy; in computer assembly, they are displayed on screens. One might think that the computer industry is doing it to use what it sells, but there is a more compelling reason: while the auto parts industry will make the same product for four years or more, 90% of what the computer industry assembles is for products introduced within the past 12 months. While the auto parts industry may not justify the cost of placing monitors over each assembly station, what computer assemblers cannot afford is the cost and error rate of having people constantly post new hardcopy instructions.

In the auto industry, to provide quick and consistent initial training and for new product introduction in its worldwide, multilingual plants, Toyota has created a Global Production Center, which uses video and animation to teach. To this day, however, I do not believe that Toyota uses screens to post work instructions on the shop floor. In the assembly of downhole measurement instruments for oilfield services, Schlumberger in Rosharon, TX, is pioneering the use of iPads to display work instructions.

Data, information, knowledge, and Lean


Terms like data and information are often used interchangeably. Value Stream Mapping (VSM), for example, is also called Materials and Information Flow Analysis (MIFA) and, in this context, there is no difference between information and data. Why then should we bother with two terms? Because  “retrieving information from data” is meaningless unless we distinguish the two.

The term knowledge is used to call a document a “body of knowledge” or an online resource a “knowledge base,” when their contents might be more aptly described as dogmas or beliefs with sometimes a tenuous relationship with reality. Computer scientists are fond of calling  knowledge anything that takes the form of rules or recommendations. Having an assertion in a “knowledge base,”  however, does not make it knowledge in any sense the rest of the world would recognize. If it did, astrology would qualify as knowledge.

In Lean, as well as for many other topics, clarity enriches communication, and, in this case, can be achieved easily, in a way that is useful both technically and in everyday language. In a nutshell:

  1. Data is what is read or written.
  2. Information is what you learn from reading data.
  3. Knowledge is information that agrees with reality.

Authors like Chaim Zins have written theories about data, information, and knowledge that are much more complicated and I don’t believe more enlightening than the simple points that follow. They also go one step further, and discuss how you extract wisdom from knowledge, but I won’t follow them there. The search for wisdom is usually called philosophy, and it is too theoretical a topic for this blog.


In his landmark book on computer programming, Don Knuth defined data as “the stuff that’s input or output.” While appropriate in context, this definition needs refinement to include data that is not necessarily used in a computer, such as the Wright Brothers’ lift measurements (See Figure 1). If we just say data is the stuff that’s read or written, this small change does the trick. It can be read by a human or a machine. It can be written on paper by hand or by a printer, it can be displayed on a computer screen, it can be a colored lamp that turns on, or even a siren sound.

Figure 1. Data example: the Wright Brothers’ lift measurements

More generally, all the efforts we make to render a plant visible have the purpose of making it easier to read and, although it is not often presented this way, 5S is really about data acquisition. Conversely, team performance boards, kanbans, andons, or real-time production monitors are all ways to write data for people, while any means used to pass instructions to machines can be viewed as writing data, whether it is done manually or by control systems.

What is noteworthy about reading and writing is that both involve replication rather than consumption. Flow diagrams for materials and data can look similar, but, once you used a screw to fasten two parts, you no longer have that screw, and you need to keep track of how many you have left. On the other hand the instructions you read on how to fasten these parts are still there once you have read them: they have been replicated in your memory. Writing data does not make you forget it. This fundamental difference between materials and data needs to be kept in mind when generating or reviewing, for example, Value Stream Maps.


Information is a more subtle quantity. If you don’t know who won the 2010 Soccer World Cup and read a news headline that tells you Spain did, you would agree that reading it gave you some information. On the other hand, if you already knew it, it would not inform you, and, if you read it a second time, it won’t inform you either. In other words, information is not a quantity that you can attach to the data alone, but to a reading of the data by an agent.

If you think of it as a quantity, it has the following characteristics:

  1. It is positive. You can learn from reading data, but reading data cannot make you forget. As a quantity, information can therefore only be positive or zero, and is zero only if the data tells you nothing you didn’t already know. In addition, data in a news story about an outcome that you know to be impossible add no information.
  2. It is maximum for equally likely outcomes. A product order gives you the most information when you had no idea what product it might be for. Conversely, if you know that 90% of all orders are for product X, the content of the next order is not much of a surprise: you will lose only 10% of the time if you bet on X. The amount of information you get from reading data is maximum when you know the least, and therefore all possible values are equally likely to you.
  3. It is subadditive. If you read two news stories, the information you acquire from both will be at most the sum of the information you acquire from each. If you read about independent topics like the flow of orders on the company’s e-commerce website and lubrication problems in the machine shop, the information from both stories will be the sum of the information in each. If, however, the second story is about, say, dealer orders, then the two stories are on related subjects, and the total information received will be less than the sum of the two.

The above discussion embodies our intuitive, everyday notion of what information is. For most of our purposes (designing a performance board for Lean daily management, an andon system, a dashboard on a manager’s screen, or a report), this qualitative discussion of information is sufficient. We need to make sure they provide content the reader does not already know, and make the world in which he or she operates less uncertain. In other words, reading the data we provide should allow readers to make decisions that are safer bets about the future.

In the mid-20th century, however, the mathematician Claude Shannon took it a step further, formalized this principle into a quantitative definition of information, and proved that there was one and only one mathematical function that could be used to measure it. He then introduced the bit as its unit of measure. Let us assume that you read a headline that says “Spain defeats the Netherlands in the Soccer World Cup Final.” If you already knew that the finalists were Spain and the Netherlands and thought they were evenly matched, then the headline gives you one bit of information. If you had no idea which of the 32 teams that entered the tournament would be finalists, and, to you, they all had equal chances, then, by telling you it was Spain and the Netherlands, the headline gives you an additional 8.95 bits or so.
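The two headline figures can be reproduced with a few lines of Python. This is a minimal sketch, not Shannon's derivation: it uses the standard measure of the information in an event of probability p, which is log2(1/p) bits, and it assumes that "equal chances" means every unordered pair of the 32 teams is an equally likely final. The function name `surprisal_bits` is chosen here for illustration.

```python
import math


def surprisal_bits(probability):
    """Bits of information gained on learning that an event
    with the given prior probability has occurred."""
    return math.log2(1 / probability)


# Case 1: the finalists are known and evenly matched,
# so the winner is a 50/50 proposition.
winner_bits = surprisal_bits(1 / 2)

# Case 2: the finalists are unknown. Assumption: each unordered
# pair of the 32 entrants is an equally likely final.
possible_finals = math.comb(32, 2)
finalists_bits = surprisal_bits(1 / possible_finals)

print(f"Possible finals: {possible_finals}")
print(f"Learning the winner: {winner_bits:.2f} bits")
print(f"Learning the finalists: {finalists_bits:.2f} bits")
```

With 496 possible finals, identifying the finalists is worth a little under 9 bits, on top of the single bit for the winner.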

Over the decades, his theory has had applications ranging from the design of communication networks to counting cards in blackjack, more than in helping manufacturers understand factory data. It has a use, however, in assigning an economic value to the acquisition of information, and thereby justifying the needed investment.
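Shannon's measure, the entropy of a probability distribution, also satisfies the three properties listed earlier. The sketch below checks them numerically. All the probabilities in it are illustrative assumptions, not data from a plant: the 90% example is reduced to two outcomes ("product X" or "anything else"), and the two news stories are modeled as two up/down indicators, web orders and dealer orders.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: the average information gained
    from reading one outcome drawn from the distribution."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


def joint_and_sum(joint):
    """Return (entropy of the pair, sum of the two separate entropies)
    for a joint distribution given as {(a, b): probability}."""
    first, second = {}, {}
    for (a, b), p in joint.items():
        first[a] = first.get(a, 0) + p
        second[b] = second.get(b, 0) + p
    separate = entropy_bits(first.values()) + entropy_bits(second.values())
    return entropy_bits(joint.values()), separate


# Property 1: never negative, and zero when you already know the outcome.
already_known = entropy_bits([1.0])

# Property 2: maximum for equally likely outcomes.
even_odds = entropy_bits([0.5, 0.5])   # no idea which product
mostly_x = entropy_bits([0.9, 0.1])    # 90% of orders are for X

# Property 3: subadditive. Related stories overlap; independent ones add up.
related = {("up", "up"): 0.4, ("up", "down"): 0.1,
           ("down", "up"): 0.1, ("down", "down"): 0.4}
independent = {pair: 0.25 for pair in related}

related_joint, related_sum = joint_and_sum(related)
indep_joint, indep_sum = joint_and_sum(independent)

print(f"Known outcome: {already_known:.2f} bits")
print(f"50/50: {even_odds:.2f} bits, 90/10: {mostly_x:.2f} bits")
print(f"Related stories: {related_joint:.2f} together, {related_sum:.2f} separately")
print(f"Independent stories: {indep_joint:.2f} together, {indep_sum:.2f} separately")
```

In the 90/10 case, the next order carries less than half a bit, against a full bit when the two outcomes are equally likely.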


On November 3, 1948, the readers of the Chicago Tribune received information in the “Dewey defeats Truman” headline (see Figure 2). None of them would, however, describe this information as knowledge, simply because it was not true. It should not need to be said, and, outside the software industry, it doesn’t. Software marketers, however, have muddied the water by calling rules derived from these assertions “knowledge,” regardless of any connection with reality. By doing so, they have erased the vital distinction between belief, superstition or delusion on one side and knowledge on the other.

Figure 2. When information is not knowledge

As Mortimer Adler put it in Ten Philosophical Mistakes (pp. 83-84), “it is generally understood that those who have knowledge of anything are in possession of the truth about it. […] The phrase ‘false knowledge’ is a contradiction in terms; ‘true knowledge’ is manifestly redundant.”

When “knowledge bases” first appeared in the 1980s, they contained rules to arrive at a decision, and only worked well with rules that were true by definition. For example, insurance companies have procedures to set premiums, which translate well to “if-then” rules. A software system applying these rules could then be faster and more accurate than a human underwriter retrieving them from a thick binder.

On the other hand, in machine failure diagnosis, rules are true only to the extent that they actually work with the machine; this is substantially more complex and error-prone than applying procedures, and the rule-based knowledge systems of the 1980s were not successful in this area. Nowadays, a “knowledge base” is more often a forum where users of a particular software product post solutions to problems. While these forums are useful, there is no guarantee that their content is, in any way, knowledge.

The books on data mining are all about algorithms, and assume the availability of accurate data. In real situations, and particularly in manufacturing, algorithms are much less of a problem than data quality. There is no algorithm sophisticated enough to make wrong data tell a true story. The key point here is that, if we want to acquire knowledge from the data we collect, the starting point is to make sure it is accurate. Then we can all be credible Hulks (See Figure 3).

Figure 3. The Credible Hulk (received from Arun Rao, source unknown)

A management perspective on data quality

Prof. Mei-chen Lo, of National University and Kainan University in Taiwan, worked with Operations Managers in two semiconductor companies to establish a list of 16 dimensions of data quality. Most are not parameters that can be measured, and should be considered instead as questions to be asked about a company’s data. I learned of this list from her at an IE conference in Kitakyushu in 2009, and found it useful by itself as a checklist for a thorough assessment of a current state. Her research is about methods for ranking the importance of these criteria.

They are grouped in four main categories:

  1. Intrinsic. Agreement of the data with reality.
  2. Context. Usability of the information in the data to support decisions or solve problems.
  3. Representation. The way the data is structured, or not.
  4. Accessibility. The ability to retrieve, analyze and protect the data.

Each category breaks further down as follows:

  1. Intrinsic quality
    • Accuracy. Accuracy is the most obvious issue, and is measurable. If the inventory data says that slot 2-3-2 contains two bins of screws, then can we be confident that, if we walk to aisle 2, column 3, level 2 in the warehouse, we will actually find two bins of screws?
    • Fact or judgement. That slot 2-3-2 contains two bins of screws is a statement of fact. Its accuracy is in principle independent of the observer. On the other hand, “Operator X does not get along with teammates” is a judgement made by a supervisor and cannot carry the same weight as a statement of fact.
    • Source credibility. Is the source of the data credible? Credibility problems may arise due to the following:
      • Lack of training. For example, measurements that are supposed to be taken on “random samples” of parts are not, because no one in the organization knows how to draw a random sample.
      • Mistake-prone collection methods. For example, manually collected measurements are affected by typing errors.
      • Conflicts of interest. Employees collecting data stand to be rewarded or punished depending on the values of the data. For example, forecasters are often rewarded for optimistic forecasts.
    • Believability of the content. Data can be unbelievable because it is valid news of extraordinary results, or because it is inaccurate. In either case, it warrants special attention.
  2. Context
    • Relevance. Companies often collect data because they can, rather than because it is relevant. It is the corporate equivalent of looking for keys at night under the street light rather than next to the car. In the semiconductor industry, where this list of criteria was established, measurements are routinely taken after each step of the wafer process and plotted in control charts. This data is relatively easy to collect but of little relevance to the control and improvement of the wafer process as a whole. Most of the relevant data cannot be captured until the circuits can be tested at the end of the process.
    • Value added. Some of the data produced in a plant has a direct economic value. Aerospace or defense goods, for example, are delivered with documentation containing a record of their production process, and this data is part of the product. More generally, the data generated by commercial transactions, such as orders, invoices, shipping notices, or receipts, is at the heart of the company’s business activity. This is to be contrasted with data that is generated to satisfy internal needs, such as, for example, the number of employees trained in transaction processing on the ERP system.
    • Timeliness. Is the data available early enough to be actionable? A field failure report on a product that is due to problems with a manufacturing process as it was 6 months ago is not timely if this process has been the object of two engineering changes since then.
    • Completeness. Measurements must be accompanied by all the data characterizing where, when, how and by whom they were collected and in what units they are expressed.
    • Sufficiency. Does the data cover all the parameters needed to support a decision or solve a problem?
  3. Representation
    • Interpretability. What inferences can you draw directly from the data? If the demand for an item has been rising 5%/month for the past 18 months, it is no stretch to infer that this trend will continue next month. On the other hand, if you are told that a machine has an Overall Equipment Effectiveness (OEE) of 35%, what can you deduce from it? The OEE is the product of three ratios: availability, yield, and actual over nominal speed. The 35% figure may tell you that there is a problem, but not where it is.
    • Ease of understanding. Management accounting exists for the purpose of supporting decision making by operations managers. Yet the reports provided to managers are often in a language they don’t understand. This does not have to be, and financial officers like Orrie Fiume have modified the vocabulary used in these reports to make them easier for actual managers to understand. The understandability of technical data can also be impaired when engineers use cryptic abbreviations instead of plain language.
    • Conciseness. A table with 100 columns and 20,000 rows with 90% of its cells empty is a verbose representation of a sparse matrix. A concise representation would be a list of the row and column IDs of the non-empty cells, with their values.
    • Consistency. Consistency problems often arise as a result of mergers and acquisitions, when the different data models of the companies involved need to be mashed together.
  4. Accessibility
    • Convenience of access. Data that an end-user can retrieve directly through a graphic interface is conveniently accessible; data in paper folders on library shelves is not. Neither are databases in which each new query requires the development of a custom report by a specially trained programmer.
    • Usability. High-usability data, for example, comes in the form of lists of property names and values that can easily be tabulated into spreadsheets or database tables and, from that point on, selected, filtered and summarized in a variety of informative ways. Low-usability data often comes in the form of a string of characters that first needs to be separated, with characters 1 to 5 being one field, 6 to 12 another, etc., and the meaning of each of these substrings needs to be retrieved from a correspondence table, to find that ‘00at3’ means “lime green.”
    • Security. Manufacturing data contain some of the company’s intellectual property, which must be protected not only from theft but from inadvertent alterations by unqualified employees. But effective security must also be provided efficiently, so that qualified, authorized employees are not slowed down by security procedures when accessing data.
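
Two items in this list, conciseness and usability, lend themselves to a short illustration. The sketch below is only that, under stated assumptions: the tiny table, the field names `color_code` and `part_number`, and the sample record are hypothetical, and the only value taken from the text is the ‘00at3’ code for “lime green.” The first function turns a mostly empty table into a list of row ID, column ID and value; the second cuts a fixed-width character string into named fields and looks up the meaning of a code.

```python
def to_triples(rows):
    """Concise form of a sparse table: one (row ID, column ID, value)
    entry per non-empty cell. Empty cells are represented by None."""
    return [(r, c, value)
            for r, row in enumerate(rows)
            for c, value in enumerate(row)
            if value is not None]


# Hypothetical record layout: characters 1 to 5 hold a color code,
# characters 6 to 12 a part number (0-based, end-exclusive slices).
FIELDS = {"color_code": (0, 5), "part_number": (5, 12)}

# Correspondence table; only the '00at3' entry comes from the text.
COLOR_NAMES = {"00at3": "lime green"}


def parse_record(record):
    """Split a fixed-width string into named fields and decode the color."""
    parsed = {name: record[start:end] for name, (start, end) in FIELDS.items()}
    parsed["color"] = COLOR_NAMES.get(parsed["color_code"], "unknown")
    return parsed


sparse_table = [
    [None, 7, None],
    [None, None, None],
    [3, None, None],
]
print(to_triples(sparse_table))
print(parse_record("00at3AB12345"))
```

Once records are in the name-and-value form returned by `parse_record`, they can be loaded into a spreadsheet or a database table without further decoding, which is the point of the high-usability case.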

Prof. Mei-chen Lo’s research on this topic was published in “The assessment of the information quality with the aid of multiple criteria analysis” (European Journal of Operational Research, Volume 195, Issue 3, 16 June 2009, Pages 850-856).