Data Farming: A Way to Maximize Use of Data for Machinery Condition Monitoring




A survey conducted by Cisco Systems Inc., results of which were presented at the May 2017 Internet of Things (IoT) World Forum in London, revealed that: 

  • 60 percent of IoT initiatives stalled at the proof of concept stage;
  • (Only) 26 percent of the surveyed companies considered their IoT deployments and initiatives as being successful;
  • Overall, nearly three-fourths of IoT initiatives were considered a failure, and a third of all completed projects were not seen as a success.i

Planning, managing and expanding the transition to Industry 4.0, and selling it to the executive level of an organization, are skill sets that most reliability and maintenance (R&M) practitioners don’t possess. However, R&M personnel do have skills and proven methodologies, such as reliability-centered maintenance (RCM), that are ideally suited to supporting a controlled transition to the Industrial Internet of Things (IIoT), advanced analytics, cloud computing, artificial intelligence (AI) and related technologies, yielding early, maximum return on investment and greater machinery reliability. This linking of old and new tools is called data farming.
This article shows how to maximize the use of data for condition monitoring. It also describes how data farming can help overcome another major problem with today’s digital transformation – the fact that only a small portion of the data being accumulated is being analyzed in any meaningful way. ii The cost of these problems can be quite large, with no appreciable benefit to an organization’s bottom line. 

Past, Present and Future

Today’s and tomorrow’s capabilities involve wireless technology, in-house analytics, big data management, the IIoT and/or IoT, cloud computing and advanced analytics. In contrast to the seemingly huge amounts of condition and performance data that had to be developed and managed manually, with much less capable computers and peripherals, over 30 years ago, consider the advantages made possible by today’s networks and devices. With IIoT and IoT data capacity and speed, plus related communications capabilities provided by internal and external wireless network protocols, data can flow from machine to remote data center(s) almost instantly. In addition, the mobile 4G long-term evolution (LTE) cellular network capabilities in use today are being, or soon will be, eclipsed by vastly superior 5G mobile network speed and capacity (50 to 1,000 times greater than 4G iii). None of these communications capabilities existed in the 1980s.


Figure 1: Example mid-2019 machinery monitoring scheme showing levels of analysis and nature of feedback (Source: Jack R. Nicholas, Jr., All Rights Reserved)


Most importantly, the number of personnel needed to carry out a modern predictive analysis program for hundreds of clients is becoming far smaller than before. The variety of analytic methods available has grown significantly and become much more sophisticated. With AI, human involvement is being reduced to only the most difficult analysis and interpretation tasks. With shared computer resources in the Cloud, costs are dropping due to fewer people being involved and a seemingly continuous reduction in the unit cost of digital data storage.

While analytic capabilities, such as pattern recognition, regression analysis, correlation, tests against limits and ranges, relative comparison and statistical analysis, were applied in the 1980s, computer storage and recall capacity, speed, and costs limited their use largely to the highest priority programs, such as for national defense. iv

By mid-2019, for industrial and utility sectors of the world economy, a typical condition and performance monitoring initiative involving internal and external wireless connectivity and the IIoT applied to critical machinery might look like what is depicted in Figure 1.

For condition and performance monitoring, an increasing number of easily installed digital wireless sensors are available for application to machinery. The energy to power the sensors and wireless links now most often comes from batteries, the life of which depends upon battery design technology, capacity and frequency of data transmission. Alternative power sources, such as fuel cells, are in the development pipeline. These show promise of vastly increased power output and longer life than today’s batteries. However, the biggest advancement in this field may be the development of ambient harvesting or scavenging methods to charge galvanic cell power circuits using sources of mechanical, thermal, natural, light, magnetic field and nuclear radiation energy. v These sensors generally provide raw data and send them to nearby computers directly or via associated programmable logic controllers (PLCs) for basic analysis (e.g., checks against alert and alarm limits, relative comparison) and communication of abnormal conditions to designated personnel.
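The basic PLC-level checks just mentioned, tests against alert and alarm limits and relative comparison against a sister machine, can be sketched in a few lines. The threshold values, units and function names below are illustrative assumptions, not taken from any particular sensor or PLC product:

```python
# Assumed thresholds for illustration only (mm/s RMS vibration).
ALERT_LIMIT = 4.5
ALARM_LIMIT = 7.1

def classify_reading(value_mm_s: float) -> str:
    """Test a single vibration reading against alert and alarm limits."""
    if value_mm_s >= ALARM_LIMIT:
        return "ALARM"
    if value_mm_s >= ALERT_LIMIT:
        return "ALERT"
    return "NORMAL"

def relative_comparison(value_mm_s: float, sister_value_mm_s: float,
                        ratio_limit: float = 1.5) -> bool:
    """Flag a machine running notably rougher than an identical sister unit."""
    return value_mm_s > ratio_limit * sister_value_mm_s
```

Only readings classified as ALERT or ALARM, or failing the relative comparison, would be communicated to designated personnel; NORMAL readings stay at the edge.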

A key principle of modern condition and performance monitoring is to conduct as much analysis as close to the edge of the network as feasible. This is becoming possible with the introduction of smart sensors that can be customized to suit the application and provide actionable information.vi Each sensor point should be assessed for its ability to indicate an actual or likely failure mode standing alone or in combination with any other source(s) of data, such as may be done with the computer card, which is described next. But first, the failure modes and related mitigating digital analysis tasks must be determined.

The most effective methodology for doing this is RCM. In order to get the earliest return on investment in RCM, regardless of the outcome ultimately desired, a Pareto analysis should be conducted to determine which of the worst bad actors to investigate first. vii
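As a rough illustration of that Pareto step, the sketch below selects the smallest set of bad actors accounting for a chosen share (classically 80 percent) of total failure cost. The asset names and costs are hypothetical:

```python
def pareto_bad_actors(failure_costs: dict, cutoff: float = 0.8) -> list:
    """Return the worst bad actors, ranked by cost, that together
    account for at least `cutoff` of the total failure cost."""
    total = sum(failure_costs.values())
    selected, running = [], 0.0
    for asset, cost in sorted(failure_costs.items(), key=lambda kv: -kv[1]):
        selected.append(asset)
        running += cost
        if running / total >= cutoff:
            break
    return selected

# Hypothetical annual failure costs per asset.
costs = {"P-101": 50000, "C-201": 30000, "F-301": 10000,
         "M-401": 5000, "V-501": 5000}
```

Here `pareto_bad_actors(costs)` returns the two assets responsible for 80 percent of the cost, which would be the first candidates for RCM analysis.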

Actual failures for which a proper root cause analysis (RCA) or defect elimination (DE) viii activity is conducted may also yield digital analysis tasks.

Early 2017 saw the introduction of a credit card-sized compute card that provides a modular approach to designing edge computing power and connectivity into consumer or industrial products. The card is a full computer with memory, storage, input/output options, WiFi and Bluetooth connectivity. ix Today, there are many more minicomputers, microcomputers and chips designed for edge analysis with the same or even greater capabilities. These support analysis methods, such as regression or trend analysis, tests against limits or ranges, pattern recognition and correlation analysis, backed by artificial intelligence software resident in chips integral to the edge computer.x
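As one example of the edge analysis methods listed above, a least-squares trend test can estimate when a monitored value will cross its alarm limit. This is a minimal sketch, not tied to any particular compute card or vendor library:

```python
def linear_trend(times, values):
    """Ordinary least-squares slope and intercept for a short reading history."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    slope = (sum((t - mt) * (v - mv) for t, v in zip(times, values))
             / sum((t - mt) ** 2 for t in times))
    return slope, mv - slope * mt

def hours_until_limit(times, values, limit):
    """Extrapolate the fitted trend to estimate when `limit` will be crossed.
    Returns None if the value is not trending toward the limit."""
    slope, intercept = linear_trend(times, values)
    if slope <= 0:
        return None
    return (limit - intercept) / slope
```

For readings of 1.0, 2.0, 3.0 and 4.0 mm/s at hours 0 through 3, the fitted slope is 1.0 mm/s per hour, so a 7.0 mm/s alarm limit would be reached at hour 6.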

Ideally, a single device receiving inputs from many sensors in a local area, along with effective programming, can provide actionable intelligence of a basic, but quite useful, nature, if not also a specific course of action.

Wireless links and gateways are used to connect sensors to computational devices and then to transfer structured information to the next level of analysis, usually with an Ethernet or other wired network protocol. At this level, the whole plant can be monitored once proper connectivity is established. Data lakes or local cloud storage capabilities can be created for accumulation of data without having to go outside the facility. A local computer or server may be loaded with the digital twin of the plant being monitored. A virtual (i.e., digital) twin plant has all the characteristics of the real plant integral to it. The virtual twin contains all the information that describes normal, safe operations for comparison with actual plant conditions from start-up through full production to shutdown. Ideally, anything out of the ordinary comes to the attention of in-house analysts, operators and maintenance personnel, along with guidance on what to do to correct the abnormal condition.
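A minimal sketch of that twin comparison might store an expected operating envelope for each monitored point and operating state, then flag readings falling outside it. The asset names, states, tags and ranges here are illustrative assumptions, not a real twin model:

```python
# Hypothetical expected envelopes, keyed by (asset, operating state).
TWIN_ENVELOPES = {
    ("FEED_PUMP_A", "full_production"): {"flow_m3h": (110.0, 130.0),
                                         "bearing_temp_C": (40.0, 75.0)},
    ("FEED_PUMP_A", "startup"):         {"flow_m3h": (0.0, 130.0),
                                         "bearing_temp_C": (10.0, 60.0)},
}

def check_against_twin(asset: str, state: str, readings: dict) -> list:
    """Return the tags whose readings fall outside the twin's envelope
    for the current operating state."""
    envelope = TWIN_ENVELOPES[(asset, state)]
    deviations = []
    for tag, value in readings.items():
        low, high = envelope[tag]
        if not (low <= value <= high):
            deviations.append(tag)
    return deviations
```

Because the envelope is indexed by operating state, a high bearing temperature during start-up is judged differently than the same reading at full production.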

Connectivity options are increasing. In addition to wireless network protocols, such as Bluetooth, WiFi and Zigbee, a totally new path for data transfer is emerging called LiFi. LiFi works with light emitting diodes, which are becoming commonly used in many permanent structures and vehicles, such as aircraft, in place of higher energy consuming space lights. It encodes messages in flashes of light. Local area networks can be created in ways similar to microwave based systems, but at less expense, although the aforementioned computers, cell phones and the like would have to be altered to receive the signals. One major technology company revealed in early 2016 that the operating system in its newly released smartphone has LiFi capabilities. xi

The total analysis capability implied by Figure 1 includes off-site or cloud computing using advanced analytics, perhaps combined with data from sources outside the organization having the same types of assets employed in similar ways. Advanced analytical methods include data mining, clustering analysis, classification and time series analysis, among other methods. xii The benefits, a subject beyond the scope of this article, are many, as described in the book, Asset Condition Monitoring Management.xiii
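To illustrate one of the advanced analytical methods named above, the sketch below implements a bare-bones k-means clustering routine that groups sensor feature vectors into operating regimes. It is a teaching sketch (it simply seeds centroids with the first k points), not a production algorithm or a specific vendor's method:

```python
def kmeans(points, k, iterations=20):
    """Minimal k-means: group feature vectors into k clusters.
    Uses the first k points as initial centroids for simplicity."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared distance).
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(p, centroids[i])))
            clusters[idx].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [tuple(sum(vals) / len(cl) for vals in zip(*cl)) if cl
                     else centroids[i] for i, cl in enumerate(clusters)]
    return centroids, clusters
```

Applied to, say, (vibration, temperature) pairs from many readings, the resulting clusters can reveal distinct operating regimes; readings that sit far from every centroid are candidates for closer analysis.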

Focusing the Data Collection and Analysis Effort through Data Farmingxiv

As previously indicated, fertile fields for data farming are found in methodologies, such as RCM, RCA and DE. Seeds to plant are the tasks derived from these proven schemes, especially if the tasks are nonintrusive and involve data collection and interpretation using mathematical algorithms and/or visual analysis.
Once the tasks have been implemented, the data being collected have value, even if the value lies in assuring that everything is alright and no remedial action needs to be taken. This sounds simple, and it is straightforward. However, it requires careful resource management (e.g., by filtering) in order to prevent the system collecting, aggregating and analyzing the data from becoming overwhelmed with repetitive and redundant information. xv
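One common filtering tactic for the resource management problem just described is report by exception with a dead band: a reading is passed upstream only when it has moved meaningfully since the last reported value. A minimal sketch, with an assumed dead band width:

```python
def deadband_filter(readings, deadband):
    """Report by exception: keep only readings that moved more than
    `deadband` from the last reported value, suppressing repetitive
    data before it reaches the aggregation and analysis layers."""
    reported = []
    last = None
    for r in readings:
        if last is None or abs(r - last) > deadband:
            reported.append(r)
            last = r
    return reported
```

With a 0.5 dead band, a stream like 10.0, 10.1, 10.05, 11.0, 11.02, 9.0 collapses to just 10.0, 11.0 and 9.0, so only genuine changes consume upstream bandwidth and storage.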
The overall data farming process is depicted in Figure 2.


Figure 2: Data farming process in support of advanced analytics (Source: Jack R. Nicholas, Jr., All Rights Reserved) 

Initially, not all tasks may be managed using this concept. Thus, a wish list should be established to search for solutions providers that may have answers to unknowns unresolved in-house. The wish list should be used when organizations send personnel to conferences and exhibits featuring firms involved with Industry 4.0 issues, such as machine analysis in the Cloud, in the plant, or at the edge. One of the advantages of attendance is the opportunity to find answers to hard-to-solve problems by gaining knowledge about the specifics on your wish list, not only from vendors, but also from other attendees.

Harvesting the results from defined data source(s) for which failure modes are known may be done using commercially available software programs or service providers, such as those offering software as a service (SaaS). Interpretation may be done by humans now, but will undoubtedly be performed in the future using AI software programs, many of which are becoming available off-the-shelf from a rapidly increasing number of providers around the world.

Results from analyses must be managed. First, decide which results to use in-house and which to send off-site to a Cloud for aggregation, analysis and interpretation in conjunction with data from other sources. Sending results to any Cloud, whether internal or external, requires synchronization or structuring of results terminology and formatting to avoid mixing “apples” and “oranges,” wasting resources and producing meaningless or useless information.
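That structuring step can be sketched as a normalization function that maps vendor-specific result records onto one shared vocabulary and unit convention before upload. The field names, severity terms and the single unit conversion below are illustrative assumptions:

```python
# Map each vendor's severity vocabulary onto one shared scheme (assumed terms).
SEVERITY_MAP = {"warn": "alert", "warning": "alert", "alert": "alert",
                "crit": "alarm", "critical": "alarm", "alarm": "alarm"}

def normalize_result(raw: dict) -> dict:
    """Map a vendor-specific result record onto a common schema
    before it is sent to the Cloud for aggregation."""
    value = raw["value"]
    unit = raw.get("unit", "")
    if unit == "in/s":  # convert imperial vibration velocity to metric
        value, unit = value * 25.4, "mm/s"
    return {
        "asset": raw["asset"].upper(),
        "severity": SEVERITY_MAP[raw["severity"].lower()],
        "value": round(value, 3),
        "unit": unit,
    }
```

Once every source passes through such a mapping, records from different tools can be aggregated and compared without mixing “apples” and “oranges.”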

Time series database management systems (DBMS) comprised the fastest growing and most popular database segment in Industry 4.0 circles from early 2016 into 2018. xvi Manufacturing leads by far in this regard.xvii This indicates where organizations are investing the most money in IIoT pursuits. The big question is: Are they getting their money’s worth from this investment? In time series analysis, abnormal patterns, trends, or conditions relative to established ranges or limits are analyzed and reported for follow-up action by owners of the machines or processes being monitored. This is called data mining or exploratory data mining. While this makes common sense, it can be very expensive in terms of data storage costs and searches for degraded conditions. Often, it takes a long time for clusters or patterns to develop, even with the help of deep learning machine diagnostic or other AI programs. This may be useful if the conclusion is there are no problems or none being detected. The concept of data farming, however, is to target the search for early warning when known causes and defects are recognized as probable. Blind searching can continue while targeted analysis based on data farming is being performed.
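A simple form of the time series limit test described above flags any reading that leaves a band of the mean plus or minus k standard deviations computed over the preceding window of readings. This is a minimal sketch under assumed window and band parameters, not a substitute for the deep learning methods mentioned:

```python
def rolling_range_anomalies(series, window=5, k=3.0):
    """Flag indices where a reading leaves the mean +/- k*stdev band
    computed over the preceding `window` readings."""
    anomalies = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mean = sum(hist) / window
        std = (sum((x - mean) ** 2 for x in hist) / window) ** 0.5
        if std == 0:
            continue  # flat history: no meaningful band to test against
        if abs(series[i] - mean) > k * std:
            anomalies.append(i)
    return anomalies
```

A vibration stream that hovers near 1.0 and then jumps to 5.0 would have only the jump flagged, while ordinary noise inside the band is ignored.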

The basic difference between data farming and data mining is depicted in Figure 3.


Figure 3: Data farming and data mining differences

Figure 3 stresses the cost factor. Farming provides a means for controlling the cost by concentrating on known failure modes and causes, without inhibiting the great potential of data mining. Ultimately, the influence of data farming should be reflected in off-site, cloud-based analysis, increasing the value of the output. This permits organizations to increase the amount of data that is being productively analyzed. Other data may still be subjected to advanced analytics processing when it is deemed of value in identifying yet unknown problems created by aging and other factors.

Figure 4: Example future machinery monitoring scheme showing levels of analysis and nature of feedback (Source: Jack R. Nicholas, Jr., All Rights Reserved)

A conceptual depiction of a monitoring scheme involving both data farming and data mining using the IIoT is illustrated in Figure 4.

Data farming can result in early and more valuable gains, both internally at a plant site and externally from the Cloud. It doesn’t diminish the value of data mining, but may greatly increase the useful outputs and value of the data being processed, resulting in a higher percentage of successful IoT and IIoT projects than the Cisco Systems survey recently revealed.

Conclusion

Data farming uses well established methodologies (e.g., RCM, RCA, DE) to target specific failure modes and causes by analyzing the sensor information that reveals them. Contrary to the positions posed by some IoT/IIoT enthusiasts and big data database management organizations, these methodologies will not, at least in the short term, become obsolete. They can be used to great advantage in gaining value from the data being collected, accumulated, structured, filtered, analyzed and acted upon. This will overcome the current trend of IoT/IIoT initiatives failing in the vast majority of cases that involve machinery condition and performance monitoring.
The main advantage of data farming as defined in this article is in controlling costs, allowing for value to be gained while the digital revolution begins to show its real potential in increasing machinery reliability and decreasing production costs in manufacturing.

References

ihttps://www.i-scoop.eu/internet-of-things-guide/internet-things-project-failure-success/
iiIt is difficult to judge from the many estimates being made in various publications and on the Internet and even more so to separate out machinery data from all others being collected either internally or externally in the Cloud. Machinery data subjected to any type of analysis using local or advanced analytics is at present estimated by the author to be less than 5 percent of all data.

iiiGill, Bob. “Industrial IoT and the Promise of 5G, Industrial IoT/Industrie 4.0.” Viewpoints, May 4, 2016. Also see https://industrial-iot.com/2016/04/industrial-iot-promise-5g/ and Anthony, Sebastian. “South Korea to spend $1.5 billion on 5G mobile network that’s ‘1,000 times faster’ than 4G.” ExtremeTech, January 22, 2014, https://www.extremetech.com/computing/175206-south-korea-to-spend-1-5-billion-on-5g-mobile-network-thats-1000-times-faster-than-4g. Deployment started in 2017 in anticipation of a pre-commercial 5G trial during the Pyeongchang, South Korea 2018 Winter Olympic Games. See http://gsacom.com/global-spectrum-situations-5g-positioning-countries/. Mobile network providers in the United States are beginning to advertise deployment in 2019 of 5G LTE capabilities in some markets.

ivNicholas, J. “What Organizations Must Do to Take Best Advantage of Big Data and Predictive Analytics in the Operations, Maintenance and Reliability Field.” Paper presented at The Reliability Conference, Las Vegas, NV, April 2017. Describes the Submarine Maintenance Monitoring and Support Program, arguably one of the most comprehensive asset condition monitoring programs ever conducted in the 20th century (applying 26 predictive technologies on up to 65 systems on each of 122 nuclear powered subs operating from ports in Europe and westward to Hawaii).

vNicholas, J. Asset Condition Monitoring Management. Fort Myers: Reliabilityweb.com, December 2016 (ISBN 978-1-941872-52-9), Chapter 10, pp170-171. Provides more detail on battery alternative power sources for sensors.

viSmart sensors may be quite sophisticated with integral microprocessors. However, these require much more power than a sensor that can transmit digital data periodically and are not necessarily the first choice in all cases. ABB has developed a compact sensor that is attached to the frame of low voltage induction motors. No wiring is needed. Using on-board algorithms, based on ABB’s decades of motor expertise, the smart sensor relays information about the motor’s health (e.g., vibration, temperature) via a smartphone and/or over the Internet to a secure server. This solution can make huge numbers of motors into smart devices, enabling them to benefit from intelligent services. The solution was launched in the North American market in 2016. See http://new.abb.com/motors-generators/service/advanced-services/smart-sensor

viiKhan, F.I. “Bad Actor Program.” Uptime Magazine Aug/Sept 2019, pp 56-60. Article provides an excellent approach to Pareto analysis. Although the article is aimed at preventive maintenance optimization, lifecycle costing, spare parts forecasting and reliability, availability and maintainability (RAM) modeling, it is equally useful for focusing on which systems to conduct reliability-centered maintenance.

viiiLedet, W. P., Ledet W. J. and Abshire, S. M. Don’t Just Fix It, Improve It! Fort Myers: Reliabilityweb.com, 2009. (ISBN 978-0-9825163-1-7). Provides details on defect elimination.

ixde Leeuw, Valentijn. https://industrial-iot.com/2017/01/intel-compute-card-iot-moduless-potential-impact-product-life-cycle-, January 27, 2017.

xVincent, James. “Google unveils tiny new AI chips for on-device machine learning.” The Verge, July 26, 2018: https://www.theverge.com/2018/7/26/17616140/google-edge-tpu-on-device-ai-machine-learning-devkit. Article states: Google is moving its AI expertise down from the Cloud, and has taken the wraps off its new Edge TPU, a tiny AI accelerator that will carry out machine learning jobs in IoT/IIoT devices. The Edge TPU is designed to do what’s known as inference. This is the part of machine learning where an algorithm actually carries out the task it was trained to do, like, for example, recognizing an object in a picture. Google’s server-based TPUs are optimized for the training part of this process, while these new Edge TPUs will do the inference. Six of these chips will fit within the perimeter of a U.S. one cent coin.

xiThe Economist newspaper. “In a Whole New Light.” September 24, 2016, pp76-77. The article identifies Velmenni, an Indian firm, PureLiFi, a British firm and Luciom, a French firm, marketing various applications of this technology. Apple’s iPhone operating system released that year is described as LiFi capable, but a sensor (not provided with the phone) must be added to make the capability useful.

xiiIbid, reference v. Page 179 provides a typical list of the names of some analytical methods.

xiiiIbid, reference v. Chapter 10, pp 178-184.

xivThe author of this article cannot locate the specific source of the term, data farming, and cannot claim to have originated it. The idea came from reading an article in the U.S. Naval Institute Proceedings on a totally different subject. The definition here may not even be the same as that of the author of the Proceedings article, whose undefined use of the term triggered the idea of its value in digital machine monitoring.

xvValerio, Pablo. “Managing Resources on IoT and Edge Computing.” IoT Times website, August 12, 2019: https://iot.eetimes.com/managing-resources-on-iot-and-edge-computing/

xviRisse, Michael. “The new rise of time-series databases.” Smart Industry website, February 26, 2018: https://www.smartindustry.com/blog/smart-industry-connect/the-new-rise-of-time-series-databases/

xviiIbid.