Using iPads, QR codes, SharePoint and InfoPath to implement TWI

Franck Vermet‘s group at Schlumberger in Rosharon, TX, assembles and tests measurement instruments that operate deep inside oil wells. They are built for internal use by Schlumberger Oilfield Services, to collect data for customers. They are high-value products, with tight tolerances and the ability to operate in an environment that is not friendly to electronics.
With Mark Warren‘s help, Vermet’s group has been looking to TWI as a way to rely less on knowledge in the heads of experienced operators and more on documented processes for the following purposes:

  1. To ensure that the same work is done the same way by different individuals.
  2. To train new operators faster.
  3. To improve the processes systematically and in a controlled manner.
  4. To support the implementation of engineering changes to the products and the introduction of new products.

In World War II, TWI was implemented with cardboard pocket cards and handwritten Job Breakdown sheets; the Schlumberger team, however, was determined to use more modern technology. After investigating the available options, they realized the following:

  • SharePoint has a built-in revision management system that makes a SharePoint site suitable as a repository of process specifications. This helps with traceability and ISO compliance.
  • InfoPath is a form design tool they could use to generate TWI templates and store the filled-out Job Breakdown sheets on the SharePoint site.
  • iPads are an effective presentation device on the shop floor, not just for the equivalent of pocket cards, but for drawings and photographs as well.
  • QR codes linking to job instructions are posted on equipment by means of printed magnets. Scanning a code with an iPad retrieves the instructions from the SharePoint site.

Implementation is still in its early days, but all indications from users are that it works. It should be noted also that the approach is sound from the point of view of data management. Unlike the proliferation of Excel spreadsheets that is so often seen in factories, with more or less accurate and up-to-date copies of master data floating around, this approach provides the necessary controls, with the current data retrieved from the server as needed.

As could be expected, the Schlumberger team is facing headwinds from two directions:

  • Among the TWI revivalists, there are those for whom, if it’s not handwritten on cardboard, it isn’t TWI, and they frown on the use of a gadget like the iPad. Never mind that many retail shops already use them as point-of-sale systems.
  • Within the company, TWI-authenticity is not a concern, but the uncontrolled spread of computer and networking technology is, at least to the IT department, which supports a standard configuration of tools that does not include what Vermet’s group is using.

The Schlumberger team is now training suppliers in these tools, with the goal of having them inspect outgoing parts in such a way that incoming inspection at Schlumberger can be eliminated.

Data, information, knowledge, and Lean


Terms like data and information are often used interchangeably. Value Stream Mapping (VSM), for example, is also called Materials and Information Flow Analysis (MIFA) and, in this context, there is no difference between information and data. Why then should we bother with two terms? Because “retrieving information from data” is meaningless unless we distinguish the two.

The term knowledge is used in calling a document a “body of knowledge” or an online resource a “knowledge base,” when their contents might be more aptly described as dogmas or beliefs with a sometimes tenuous relationship with reality. Computer scientists are fond of calling knowledge anything that takes the form of rules or recommendations. Having an assertion in a “knowledge base,” however, does not make it knowledge in any sense the rest of the world would recognize. If it did, astrology would qualify as knowledge.

In Lean, as in many other fields, clarity enriches communication, and, in this case, it can be achieved easily, in a way that is useful both technically and in everyday language. In a nutshell:

  1. Data is what is read or written.
  2. Information is what you learn from reading data.
  3. Knowledge is information that agrees with reality.

Authors like Chaim Zins have written theories about data, information, and knowledge that are much more complicated than, and in my view no more enlightening than, these three simple points. They also go one step further and discuss how to extract wisdom from knowledge, but I won’t follow them there. The search for wisdom is usually called philosophy, and it is too theoretical a topic for this blog.


In his landmark book on computer programming, Don Knuth defined data as “the stuff that’s input or output.” While appropriate in context, this definition needs broadening to cover data that is not necessarily used in a computer, such as the Wright Brothers’ lift measurements (See Figure 1). Saying instead that data is the stuff that’s read or written does the trick. It can be read by a human or a machine. It can be written on paper by hand or by a printer, displayed on a computer screen, signaled by a colored lamp that turns on, or even sounded as a siren.

Figure 1. Data example: the Wright Brothers’ lift measurements

More generally, all the efforts we make to render a plant visible have the purpose of making it easier to read and, although it is not often presented this way, 5S is really about data acquisition. Conversely, team performance boards, kanbans, andons, or real-time production monitors are all ways to write data for people, while any means used to pass instructions to machines can be viewed as writing data, whether it is done manually or by control systems.

What is noteworthy about reading and writing is that both involve replication rather than consumption. Flow diagrams for materials and data can look similar but, once you have used a screw to fasten two parts, you no longer have that screw, and you need to keep track of how many you have left. On the other hand, the instructions you read on how to fasten these parts are still there once you have read them: they have been replicated in your memory. Writing data does not make you forget it, either. This fundamental difference between materials and data needs to be kept in mind when generating or reviewing, for example, Value Stream Maps.


Information is a more subtle quantity. If you don’t know who won the 2010 Soccer World Cup and read a news headline that tells you Spain did, you would agree that reading it gave you some information. On the other hand, if you already knew it, it would not inform you, and, if you read it a second time, it would not inform you either. In other words, information is not a quantity that you can attach to the data alone, but to a reading of the data by an agent.

If you think of it as a quantity, it has the following characteristics:

  1. It is positive. You can learn from reading data, but reading data cannot make you forget. As a quantity, information can therefore only be positive or zero, and it is zero only if the data tells you nothing you didn’t already know. In addition, a news story about an outcome that you know to be impossible adds no information.
  2. It is maximum for equally likely outcomes. A product order gives you the most information when you had no idea which product it might be for. Conversely, if you know that 90% of all orders are for product X, the content of the next order is not much of a surprise: you will lose only 10% of the time if you bet on X. The amount of information you get from reading data is maximum when you know the least, that is, when all possible values are equally likely to you.
  3. It is subadditive. If you read two news stories, the information you acquire from both will be at most the sum of the information you acquire from each. If you read about independent topics like the flow of orders on the company’s e-commerce website and lubrication problems in the machine shop, the information from both stories will be the sum of the information in each. If, however, the second story is about, say, dealer orders, then the two stories are on related subjects, and the total information received will be less than the sum of the two.
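The product-order example in point 2 can be made quantitative with Shannon’s entropy, the average information per reading. A minimal sketch in Python, assuming only two products and the order probabilities given above:

```python
from math import log2

def entropy(probs):
    """Average information, in bits, gained by reading one outcome
    drawn from the given probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Two products with equally likely orders: maximum uncertainty, 1 bit per order
h_uniform = entropy([0.5, 0.5])   # 1.0

# 90% of orders for product X: reading the next order tells you little
h_skewed = entropy([0.9, 0.1])    # ≈ 0.47 bits
```

As point 2 states, the uniform case maximizes the information per reading; the more lopsided the distribution, the less each order surprises you.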

The above discussion embodies our intuitive, everyday notion of what information is. For most of our purposes, like designing a performance board for Lean daily management, an andon system, a dashboard on a manager’s screen, or a report, this qualitative discussion of information is sufficient. We need to make sure these displays provide content the reader does not already know, and make the world in which he or she operates less uncertain. In other words, reading the data we provide should allow readers to make decisions that are safer bets about the future.

In the mid-20th century, however, the mathematician Claude Shannon took it a step further, formalized this principle into a quantitative definition of information, and proved that there was only one mathematical function that could be used to measure it. He then introduced the bit as its unit of measure. Let us assume that you read a headline that says “Spain defeats the Netherlands in the Soccer World Cup Final.” If you already knew that the finalists were Spain and the Netherlands and thought they were evenly matched, then the headline gives you one bit of information. If you had no idea which of the 32 teams that entered the tournament would be finalists and, to you, they all had equal chances, then, by telling you it was Spain and the Netherlands, the headline gives you an additional 8.95 bits.
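These bit counts can be checked in a few lines. A sketch in Python, assuming, as in the example, that all outcomes are equally likely to the reader:

```python
from math import comb, log2

def bits(n_outcomes):
    """Information, in bits, from learning which of n equally
    likely outcomes occurred: log2(n)."""
    return log2(n_outcomes)

# A final between two evenly matched teams: learning the winner is 1 bit
winner_bits = bits(2)            # 1.0

# Learning which of the C(32,2) = 496 possible pairs of entrants
# reached the final: log2(496) ≈ 8.95 bits
pair_bits = bits(comb(32, 2))
```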

Over the decades, his theory has had applications ranging from the design of communication networks to counting cards in blackjack, but it has done little to help manufacturers understand factory data. It is useful, however, in assigning an economic value to the acquisition of information, and thereby justifying the needed investment.


On November 3, 1948, the readers of the Chicago Tribune received information in the “Dewey defeats Truman” headline (See Figure 2). None of them, however, would describe this information as knowledge, for the simple reason that it was not true. It should not need to be said, and, outside the software industry, it doesn’t. Software marketers, however, have muddied the water by calling rules derived from stored assertions “knowledge,” regardless of any connection with reality. By doing so, they have erased the vital distinction between belief, superstition, or delusion on one side and knowledge on the other.

Figure 2. When information is not knowledge

As Mortimer Adler put it in Ten Philosophical Mistakes (pp. 83-84), “it is generally understood that those who have knowledge of anything are in possession of the truth about it. […] The phrase ‘false knowledge’ is a contradiction in terms; ‘true knowledge’ is manifestly redundant.”

When “knowledge bases” were first heard of in the 1980s, they contained rules to arrive at a decision, and they worked well only with rules that are true by definition. For example, insurance companies have procedures to set premiums, which translate well into “if-then” rules. A software system applying these rules can then be faster and more accurate than a human underwriter retrieving them from a thick binder.

On the other hand, in machine failure diagnosis, rules are true only to the extent that they actually work with the machine; this is substantially more complex and error-prone than applying procedures, and the rule-based knowledge systems of the 1980s were not successful in this area. Nowadays, a “knowledge base” is more often a forum where users of a particular software product post solutions to problems. While these forums are useful, there is no guarantee that their content is, in any way, knowledge.

The books on data mining are all about algorithms, and assume the availability of accurate data. In real situations, and particularly in manufacturing, algorithms are much less of a problem than data quality. There is no algorithm sophisticated enough to make wrong data tell a true story. The key point here is that, if we want to acquire knowledge from the data we collect, the starting point is to make sure it is accurate. Then we can all be credible Hulks (See Figure 3).

Figure 3. The Credible Hulk (received from Arun Rao, source unknown)

What’s unique about the Kanban system? (Revisited)

Ten years ago, I wrote an article by this title in Karen Wilhelm’s Lean Directions, and a detailed treatment of pull systems in Lean Logistics, pp. 199-270 (2005). While 6 to 10 years is an eternity in Information Technology, it is not in Manufacturing, and I have not seen evidence that technological advances since then have invalidated these discussions. Also in 2005, Arun Rao and I wrote a paper on RFID Applications in Manufacturing, which outlined ways this technology could be used, among other things, to implement the Kanban replenishment logic on the side of an assembly line. To the best of our knowledge, it still isn’t broadly used, and bar codes remain the state of the art on the shop floor.

For placing orders with suppliers, on the other hand, the recirculating hardcopy Kanban has never really taken root in the US, and orders are usually placed electronically. When Kanbans are used with suppliers, they are usually single-use cards, printed by the supplier to match electronic orders, that are attached to parts and scanned when the corresponding parts are consumed, to trigger a reorder. This is the eKanban system, and it is more a horseless carriage than a car, in that it is an electronic rendition of a system whose logic was constrained by the use of cards.

A management perspective on data quality

Prof. Mei-chen Lo, of National University and Kainan University in Taiwan, worked with Operations Managers in two semiconductor companies to establish a list of 16 dimensions of data quality. Most are not parameters that can be measured, and should be considered instead as questions to be asked about a company’s data. I learned the list from her at an IE conference in Kitakyushu in 2009, and found it useful by itself as a checklist for a thorough assessment of a current state. Her research is about methods for ranking the importance of these criteria.

They are grouped in four main categories:

  1. Intrinsic. Agreement of the data with reality.
  2. Context. Usability of the information in the data to support decisions or solve problems.
  3. Representation. The way the data is structured, or not.
  4. Accessibility. The ability to retrieve, analyze and protect the data.

Each category breaks further down as follows:

  1. Intrinsic quality
    • Accuracy. Accuracy is the most obvious issue, and is measurable. If the inventory data says that slot 2-3-2 contains two bins of screws, then can we be confident that, if we walk to aisle 2, column 3, level 2 in the warehouse, we will actually find two bins of screws?
    • Fact or judgement. That slot 2-3-2 contains two bins of screws is a statement of fact. Its accuracy is in principle independent of the observer. On the other hand, “Operator X does not get along with teammates” is a judgement made by a supervisor and cannot carry the same weight as a statement of fact.
    • Source credibility. Is the source of the data credible? Credibility problems may arise due to the following:
      • Lack of training. For example, measurements that are supposed to be taken on “random samples” of parts are not, because no one in the organization knows how to draw a random sample.
      • Mistake-prone collection methods. For example, manually collected measurements are affected by typing errors.
      • Conflicts of interest. Employees collecting data stand to be rewarded or punished depending on the values of the data. For example, forecasters are often rewarded for optimistic forecasts.
    • Believability of the content. Data can be unbelievable because it is valid news of extraordinary results, or because it is inaccurate. In either case, it warrants special attention.
  2. Context.
    • Relevance. Companies often collect data because they can, rather than because it is relevant. It is the corporate equivalent of looking for keys at night under the street light rather than next to the car. In the semiconductor industry, where this list of criteria was established, measurements are routinely taken after each step of the wafer process and plotted in control charts. This data is relatively easy to collect but of little relevance to the control and improvement of the wafer process as a whole. Most of the relevant data cannot be captured until the circuits can be tested at the end of the process.
    • Value added. Some of the data produced in a plant has a direct economic value. Aerospace or defense goods, for example, are delivered with documentation containing a record of their production process, and this data is part of the product. More generally, the data generated by commercial transactions, such as orders, invoices, shipping notices, or receipts, is at the heart of the company’s business activity. This is to be contrasted with data that is generated to satisfy internal needs, such as, for example, the number of employees trained in transaction processing on the ERP system.
    • Timeliness. Is the data available early enough to be actionable? A field failure report on a product that is due to problems with a manufacturing process as it was 6 months ago is not timely if this process has been the object of two engineering changes since then.
    • Completeness. Measurements must be accompanied by all the data characterizing where, when, how and by whom they were collected and in what units they are expressed.
    • Sufficiency. Does the data cover all the parameters needed to support a decision or solve a problem?
  3. Representation
    • Interpretability. What inferences can you draw directly from the data? If the demand for an item has been rising 5%/month for the past 18 months, it is no stretch to infer that this trend will continue next month. On the other hand, if you are told that a machine has an Overall Equipment Effectiveness (OEE) of 35%, what can you deduce from it? The OEE is the product of three ratios: availability, yield, and actual over nominal speed. The 35% figure may tell you that there is a problem, but not where it is.
    • Ease of understanding. Management accounting exists for the purpose of supporting decision making by operations managers. Yet the reports provided to managers are often in a language they don’t understand. This does not have to be, and financial officers like Orrie Fiume have modified the vocabulary used in these reports to make them easier for operations managers to understand. The understandability of technical data can also be impaired when engineers use cryptic codes instead of plain language.
    • Conciseness. A table with 100 columns and 20,000 rows with 90% of its cells empty is a verbose representation of a sparse matrix. A concise representation would be a list of the row and column IDs with values.
    • Consistency. Consistency problems often arise as a result of mergers and acquisitions, when the different data models of the companies involved need to be mashed together.
  4. Accessibility
    • Convenience of access. Data that an end-user can retrieve directly through a graphic interface is conveniently accessible; data in paper folders on library shelves is not. Neither are databases in which each new query requires the development of a custom report by a specially trained programmer.
    • Usability. High-usability data comes, for example, in the form of lists of property names and values that can easily be tabulated into spreadsheets or database tables and, from that point on, selected, filtered, and summarized in a variety of informative ways. Low-usability data often comes as a string of characters that first needs to be split into fields, with characters 1 to 5 being one field, 6 to 12 another, etc., and the meaning of each of these substrings then needs to be retrieved from a correspondence table, to find that “00at3” means “lime green.”
    • Security. Manufacturing data contain some of the company’s intellectual property, which must be protected not only from theft but from inadvertent alterations by unqualified employees. But effective security must also be provided efficiently, so that qualified, authorized employees are not slowed down by security procedures when accessing data.
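The OEE point under Interpretability can be made concrete: because the OEE is a product of three ratios, very different failure profiles collapse into the same number. A sketch in Python, with factor values invented for illustration:

```python
def oee(availability, speed_ratio, yield_ratio):
    """Overall Equipment Effectiveness: the product of availability,
    actual-over-nominal speed, and yield."""
    return availability * speed_ratio * yield_ratio

# Two machines with the same 35% OEE but entirely different problems:
downtime_machine = oee(0.40, 0.875, 1.00)  # mostly stoppages
slow_machine     = oee(1.00, 0.35, 1.00)   # runs, but far below nominal speed
```

The aggregate tells you there is a problem; only the three factors tell you where to look.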
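The Usability contrast can also be sketched in code. The field layout and lookup table below follow the example in the text but are otherwise hypothetical:

```python
# Low-usability data: meanings buried in a fixed-width string.
# Characters 1 to 5 are one field, 6 to 12 another, as in the text's example.
record = "00at3WIDGET "

color_codes = {"00at3": "lime green"}  # hypothetical correspondence table

color = color_codes[record[0:5]]       # decode "00at3" via the table
item = record[5:12].strip()

# High-usability data: the same content as property names and values,
# ready to tabulate, filter, and summarize.
usable = {"color": color, "item": item}
```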

Prof. Mei-chen Lo’s research on this topic was published as “The assessment of the information quality with the aid of multiple criteria analysis,” European Journal of Operational Research, Volume 195, Issue 3, 16 June 2009, pp. 850-856.