Asset Performance Management Archives : Itus Digital
https://www.itus-digital.com/category/asset-performance-management/

Asset Management Using a Value Chain Approach
https://www.itus-digital.com/asset-management-using-a-value-chain-approach/
Fri, 19 Jan 2024 09:29:51 +0000

The post Asset Management Using a Value Chain Approach appeared first on Itus Digital.

The concept of the value chain emerged with Michael Porter of Harvard University in the mid-1980s. In Professor Porter’s model, “Products pass through a chain of activities in order, and at each activity the product gains some value.” (Wikipedia)

Using a Value Chain Approach for Asset Management Thinking

The value chain approach to asset management provides an important mental model for developing asset management thinking. Asset management thinking helps the organization consider how investments in assets help it maintain competitive advantage, stay accountable to its sustainability commitments, and meet the needs and expectations of its internal and external stakeholders.

Asset Management & Sustainability

Whether they realize it or not, the C-Suite’s ability to materially demonstrate, deliver, and report on sustainability objectives depends critically on the asset management system. The value chain approach to asset management is a useful mental model for communicating that dependency and demonstrating how the asset management system is consistent and aligned with sustainability goals. Without it, asset management efforts can suffer from blind spots, be misunderstood by the organization, and fall short of the intended outcome.

Using a Value Chain Approach for Selecting Technology

The topics of digital transformation, IoT, and artificial intelligence have dominated the marketplace. These are relevant topics and deserve some attention given how the economics have rapidly evolved and how accessible powerful technology is today.  The marketplace is teeming with vendors promoting their digital solutions.  This can distract management’s attention away from existing processes and organizational capabilities and cause them to overlook what’s already in place and working well.

Using a Value Chain Approach for Business Outcomes

Market fundamentals have permanently shifted.  Energy transition, renewables, and other green projects are now permanent fixtures of the landscape of every market – be it public sector or private sector.  Sustainability objectives are top of mind for executive leadership, all day, every day.  Asset management practitioners can be invited into the board room and earn a permanent seat at the table the same way other management systems like environment, safety, and quality have in recent decades.

We Invite You to Join the Discussion

On January 10, 2024, Itus Digital hosted a webinar on this subject. It was a conversation among some of the most knowledgeable people in the field. Tom Smith is an author, lecturer, and advisor on asset management. Susan Lubell is an author and business leader in implementing asset management business decision processes. Rob Arseneau is an experienced subject matter expert on aligning corporate vision and continuous improvement for reliability and asset management.

Stepping Over the RCM Finish Line to Drive Results
https://www.itus-digital.com/stepping-over-the-rcm-finish-line/
Wed, 15 Nov 2023 14:47:42 +0000

The post Stepping Over the RCM Finish Line to Drive Results appeared first on Itus Digital.

The Reliability Centered Maintenance (RCM) methodology has had mixed reviews over the years.  The negative view of RCM is not because it is a flawed or an inferior methodology.  Put an untrained driver in the cockpit of an F1 racecar and it would be a miracle if that driver even completed the race.  But it would not be the fault of the race car.

The mixed reviews are because of how stakeholders view the result of the RCM workshop portion of the project.  There are organizations where the participants view the RCM workshop’s conclusion as having crossed the finish line.  There are other organizations where the participants view the conclusion of the RCM workshop as taking the first step over the starting line.  Which one is more likely to achieve the anticipated outcome of the RCM methodology?

Richard Lobley, in his excellent 2011 paper, states, “In the 1990s there was a significant (over 1200) uptake of organizations who implemented RCM as a maintenance methodology.” In that paper he too suggests there are two types of organizations that invest in RCM. There are those that “dedicated sufficient resource to ensure successful implementation and continuous improvement of the RCM methodology.” Then there are those “that either did not implement RCM correctly, or failed to ensure that the right resources were dedicated to embedding the methodology.”

So, what does it mean to implement RCM correctly? According to Nowlan and Heap, when RCM is done well, the “resulting scheduled-maintenance program thus includes all the tasks necessary to protect safety and operating reliability, and only the tasks that will accomplish this objective.” Of course, this means more than just maintenance tasks. This “resulting scheduled-maintenance program” is what we refer to today as an asset strategy. Operationalizing that strategy is vital to achieving its expected benefits.

So, what is an operationalized asset strategy? It starts with the base definition and implementation of the maintenance, inspection, and monitoring activities designed to mitigate failure risk.  From there, the asset strategy is enhanced to enable real time assessment of activity execution and equipment condition with automatic notification when failure threats are detected.

RCM may not be something that’s achievable for your organization.  But you have assets that need care and feeding for your organization to achieve its goals. Have you thought about your own asset strategies and what you would improve about them?  Does the thought of evaluating your current strategies give you heartburn?  PM (Preventive Maintenance) Optimization is a great approach that does not require you to go back to the “RCM drawing board.”  Learn more about this approach from our partner, ABS Group, in their recent paper A Streamlined Approach to Preventative Maintenance.

The Itus team has decades of experience developing and implementing asset performance management solutions within industrial organizations. One of the biggest lessons from that experience is that once you cross the RCM/FMEA ‘finish line’, the journey to asset optimization has just begun. Operationalized asset strategies are the next segment of the race.

 

If Assets Were Athletes, Would You Manage Them Differently?
https://www.itus-digital.com/if-assets-were-athletes/
Tue, 10 Oct 2023 16:05:48 +0000

The post If Assets Were Athletes, Would You Manage Them Differently? appeared first on Itus Digital.

Dave Brailsford may not be a well-known name in the asset management world, but if you’re a cyclist, you may well know the impact he made on elite competitive cycling. As the coach of the British cycling team, Dave lifted the team from mediocrity to dominance at the 2008 Beijing Olympics, where it collected sixty percent of the available gold medals. The team continued to dominate the sport for the following ten years.

What was the difference between Brailsford and the team’s previous coaches?

The whole principle came from the idea that if you broke down everything you could think of that goes into riding a bike, and then improve it by 1 percent, you will get a significant increase when you put them all together – Dave Brailsford interview with Matt Slater, BBC Sport

 

Brailsford did the equivalent of what an asset manager would do by deconstructing the asset strategy.  The list of things included:

  • Redesigned the seat
  • Rubbed alcohol on the tires for better grip
  • Had riders wear heated overshorts to maintain ideal muscle temperatures
  • Used biofeedback sensors to monitor how athletes responded to training
  • Tested fabrics in a wind tunnel and switched racing suits
  • Tested massage gels
  • Hired a surgeon to teach riders how to wash their hands to avoid catching colds
  • Determined the best pillow and mattress to optimize sleep for each rider

Does this sound familiar?

Many asset managers will have done something similar.  The objective would be to improve performance, at the lowest sustainable cost, while keeping an eye on risk. A cross-functional team would be brought together.  They would spend time developing the asset strategy.  At the end, like the British cycling team, they will have come up with a list of strategy actions including some re-design, time-based and condition-based activities, that when done well, would achieve improvements in opportunity, cost, risk, and performance.

Then what?

The Dave Brailsford story wouldn’t have been told if the focus was on the strategy and how it was developed.  The story became known for the execution and monitoring of the strategy.  Brailsford leveraged the aggregation of marginal gains.  Brailsford was relentless when it came to monitoring for effectiveness, watching for emerging threats, and continually incorporating hundreds of small additional improvements over time.
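The arithmetic behind the aggregation of marginal gains can be sketched in a few lines. This is an illustration only: it assumes each improvement is independent and multiplies through to overall performance, which is a simplification the cycling story does not itself claim. The function name `compound_gain` and the counts in the loop are hypothetical.

```python
def compound_gain(gain_per_item: float, items: int) -> float:
    """Overall improvement factor if `items` independent gains multiply together.

    Assumption: gains compound multiplicatively, i.e. (1 + g) ** n.
    """
    return (1.0 + gain_per_item) ** items


if __name__ == "__main__":
    # Hypothetical counts; the source does not say how many areas were improved.
    for n in (8, 50, 100):
        factor = compound_gain(0.01, n)
        print(f"{n:>3} improvements of 1% -> {100 * (factor - 1):.1f}% overall")
```

Under that assumption the combined effect grows faster than a simple sum of the individual gains, which is the point of Brailsford's "put them all together" remark.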

We convince ourselves that massive success requires massive action – James Clear, Atomic Habits

 

As an asset management professional, you’ve broken down everything that goes into preserving the function and reliability of a critical asset. The question to ask yourself is whether you have moved beyond ‘development’ of the asset strategy and ‘actualized’ all the little things that, in aggregate, generate significant business value. Learn how you can operationalize the efforts you’ve invested in your asset strategies by connecting with us here.

 

 

Accelerating Artificial Intelligence in Asset Management – Part I
https://www.itus-digital.com/accelerating-ai-in-asset-management/
Thu, 10 Aug 2023 14:58:23 +0000

The post Accelerating Artificial Intelligence in Asset Management – Part I appeared first on Itus Digital.

Have you watched “Formula One Drive to Survive”?

This Netflix series showcases the amazing engineering effort behind the top performing racecars in the world.  But why are we discussing F1 racecars in a blog about AI (Artificial Intelligence)?  This Netflix series tells a story about decisions, how they affect outcomes for every race, and how the use of data and information affects the quality of those decisions.

The current state of AI is like Formula One racecars. These machines depend on advanced technology and specialized skills.  They require a highly technical engineering team coordinating their efforts to ensure the racecar goes around each lap of the track, in the least amount of time possible.  This is a process of continually compounding the team’s collective understanding of their cars, the tracks they race on, their competitors, and the conditions they need to adjust for before and during the event.

Harnessing the power of AI is similar to harnessing the power of an F1 race car.  The team of engineers, subject matter experts, and data scientists precisely coordinate their efforts to use information they have (historical data) to produce information they don’t have (predictions).  The quality of these predictions directly enables the quality of decisions made by heads of the team.  At this elite level, the aggregation of marginal improvements is what makes the difference between a podium finish or not scoring any points at all.  This elite and prestigious level of performance can attract and retain the top available talent to be part of this team.

Similarly, to harness the power of AI, a team of data scientists, subject matter experts, and specialized software engineers is required. This means that only those organizations that can attract and dedicate such a team are going to reap the rewards promised by AI. At least for now.

When you consider the application of AI within industrial facilities, building a team to drive successful AI projects can be even more challenging. A skills gap caused by an aging and retiring workforce, together with the need to collaborate across organizational functions such as operations, maintenance, and IT, can place a significant burden on already stretched resources. One of the keys to leveraging AI at scale is to put the power directly into the hands of equipment and process experts. Yes, AI as an enabling technology provides great productivity gains by rapidly assessing large volumes of data on a continuous basis, but the true value drivers are not algorithms. They are the people and their knowledge of how the production process operates, how equipment can fail, which symptoms indicate potential failure, and, most importantly, what action to take when anomalies are detected.

In the APM space, we have seen AI applied to the most critical and complex assets, yet these represent a very small percentage of the total assets in an industrial facility. If we could turbocharge the process by enabling equipment experts to build anomaly detection models directly and at a faster rate, we could unlock much more value across a broader set of challenges related to the risk, cost, and performance of assets.

It wasn’t the sophisticated engineering that went into the car that brought about the era of motorized transportation. It was Henry Ford’s assembly line production which brought the car to the masses.  AI is going to continue to advance and will do so at an increasing rate.  But until we solve the challenges of practicality and availability, we will be dependent on precision-coordinated teams of highly skilled experts.

Look out for our next post where we will offer some insights into approaches to fully harness the power of AI.

Rapid Compliance to ISO 10816 Vibration Monitoring Standard
https://www.itus-digital.com/rapid-compliance-to-iso-10816-vibration-monitoring-standard/
Wed, 24 May 2023 15:12:55 +0000

The post Rapid Compliance to ISO 10816 Vibration Monitoring Standard appeared first on Itus Digital.

Vibration data expansion and standards adoption

The industrial market has exploded with new sensors and solutions that can quickly and affordably monitor vibration in rotating machinery and predict potential failure. From wireless sensors to handheld devices to completely outsourced condition monitoring services, you now have many options for mitigating one of the most common failure risks in rotating machinery: high vibration.

In addition, we are seeing industrial organizations increase their adoption of standards as they can provide reliability and maintenance teams with the opportunity to take advantage of expert perspectives and years of learning at a very low cost.  These technical standards, in particular, provide a solid starting point for organizations looking to implement basic condition monitoring.

As a technology provider to the industrial market, Itus Digital leverages standards because they allow our customers to quickly develop best practices built on what experts across industrial sectors have developed over the years. Our latest work here is the inclusion of ISO 10816 compliance elements in the Itus Asset Twin Library.

This article offers insights into the ISO 10816 standard and provides a walkthrough of how Itus Asset Twin capabilities enable the requirements of this standard. As a technical footnote, we will limit our discussion to the use of overall vibration readings as defined in the ISO standard. We will not address the use of detailed vibration data (such as waveforms), which is also a key part of vibration diagnostic activities.

What is the ISO 10816 Standard?

The ISO 10816 standard provides guidelines to evaluate the severity of overall machinery vibration levels taken from sensors or handheld devices. The standard has seven ‘parts’ which offer guidance for a variety of machine designs and applications. For the audience reading this blog, ISO 10816-3 (Part 3) will be of particular interest, as it specifically applies to industrial machines with nominal power above 15 kW and speeds between 120 and 15,000 revolutions per minute (RPM). Typical machines covered by this part include pumps, fans, motors, blowers, and rotary compressors, which can often be significant drivers of repair cost and downtime.

One of the most valuable aspects of ISO 10816-3 is its guidance on how best to interpret the results of vibration readings taken on non-rotating parts of the machine (such as the housing). The chart below provides a concise summary of how to interpret the severity of overall vibration readings based upon the machine group and foundation type.

ISO 10816 Part 3 Evaluation Matrix

Once readings are taken, the standard can be applied to evaluate if further action should be taken such as a more thorough diagnostic, operator check or specific maintenance activity to prevent equipment damage.  It can also be a trigger to check the execution of standard preventative maintenance activities.

Applying the ISO 10816 Standard

Here is a common situation we see with our customers.

  • Vibration monitoring programs are recognized as a valuable method for early detection of failure risk across various equipment types to prevent catastrophic failure, unnecessary repair costs and production downtime.
  • Investments in continuous vibration monitoring solutions have been made in highly critical equipment which represent a small percentage of all rotating equipment.
  • Expansion of vibration monitoring programs is desired on medium and low criticality equipment leveraging low-cost sensors and handheld devices.

 

The key challenge they face is how to leverage these new equipment condition data streams to their fullest potential without putting more burden on already stretched resources. A related challenge is how to implement these new data streams without creating another siloed data source.

This is where enabling technology such as the Itus Asset Twin can be leveraged to develop best practices such as those that can be enabled with ISO 10816.  With Itus technology, not only can the prescribed decision criteria be automated, but the results of the evaluation can also be driven into broader reliability and maintenance processes in your organization, realizing more value from your assets:

  • Centralized management of all vibration monitoring advisories from your various sensors, field data collectors, and solutions to drive collaboration among your equipment experts.
  • Integration of additional equipment conditions from other systems such as thermography, oil analysis, and process monitoring to enable broader failure mode coverage and overall asset health scoring.
  • Automatic generation of requests in your maintenance management system to drive appropriate follow-up or repairs as needed.

 

Let’s walk through a quick example of how ISO 10816-3 requirements are enabled with an Itus Asset Twin.  Below is the configuration of the standard’s requirement for Machine Group 1 on a flexible foundation.

ISO 10816 Protection in Itus APM

In this configuration, the Itus Asset Twin is utilized as a screening tool for machinery vibration to identify situations which require more diagnostic tests or the initiation of an inspect/repair activity.  Here are the key aspects of the Protection implementation.

  • Tag Aliases – The incoming vibration data stream from sensors, handheld devices or historians. This template is set up for readings with units of in/sec but mm/s is also supported.
  • Thresholds – The definition for the analytical evaluation of the vibration readings. This example is very straightforward, with two threshold evaluations: 0.28 in/sec and 0.44 in/sec.
  • Actions – The prescribed action which will be recommended when the Threshold condition has been met. In this case, when the ISO-defined Threshold of 0.28 in/sec is met, we recommend performing a more detailed vibration diagnostic at a high priority within the next 3 days. If the vibration readings exceed 0.44 in/sec, we recommend an operations/maintenance intervention, as the standard indicates the machine may experience damage.

 

This example of an Asset Twin demonstrates how ISO 10816-3 requirements can be technically enabled ‘in kind’, but you can easily adjust this definition based upon your specific operating experience and business context.
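The Tag Alias, Threshold, and Action structure described above can be sketched as a small screening function. This is a minimal illustration, not the Itus implementation: the names `Advisory`, `to_in_per_sec`, and `screen_reading`, and the severity labels, are hypothetical. Only the two threshold values, the two recommended actions, and the in/sec and mm/s units come from the example in this article.

```python
from dataclasses import dataclass

MM_PER_INCH = 25.4  # exact conversion factor

# Thresholds quoted in this article for Machine Group 1 on a flexible foundation.
DIAGNOSTIC_THRESHOLD_IN_S = 0.28
INTERVENTION_THRESHOLD_IN_S = 0.44


@dataclass(frozen=True)
class Advisory:
    severity: str  # hypothetical label, not defined by the standard
    action: str


def to_in_per_sec(value: float, unit: str) -> float:
    """Normalize an overall vibration velocity reading to in/sec."""
    if unit == "in/sec":
        return value
    if unit == "mm/s":
        return value / MM_PER_INCH
    raise ValueError(f"unsupported unit: {unit}")


def screen_reading(value: float, unit: str = "in/sec") -> Advisory:
    """Map one overall vibration reading to a recommended action."""
    velocity = to_in_per_sec(value, unit)
    # The article says readings that "exceed" 0.44 trigger intervention (strict >),
    # while the 0.28 threshold applies once it is "met" (>=).
    if velocity > INTERVENTION_THRESHOLD_IN_S:
        return Advisory("high", "operations/maintenance intervention")
    if velocity >= DIAGNOSTIC_THRESHOLD_IN_S:
        return Advisory("medium", "detailed vibration diagnostic within 3 days")
    return Advisory("normal", "no action")


if __name__ == "__main__":
    for reading in (0.10, 0.30, 0.50):
        print(reading, screen_reading(reading))
```

Adjusting the definition for a different operating context would amount to changing the two constants, or swapping in the values for another machine group and foundation type from the standard.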

Itus Libraries quickly enable the requirements of this standard

Our use case for this article has focused on one specific aspect of ISO 10816 Part 3: Machine Group 1 on a flexible foundation. Beyond this example, subscribers to the Itus solution have access to every scenario covered in Parts 1 and 3 of ISO 10816 through the Itus Asset Twin Library. For subscribers, the requirements of the ISO standard can be enabled for a piece of equipment within minutes by accessing the library, selecting the appropriate machine group and foundation type, and then applying the model to the vibration data feeds from sensors, devices, or monitoring solutions.

ISO 10816 Part 3 Asset Twin

 

The Itus Asset Twin Library also provides Asset Twin templates for the most common failure modes across over 200 types of industrial equipment.  Leveraging this library allows maintenance and reliability teams to standardize their asset strategies and drive a consistent approach to preventive maintenance activities, condition monitoring and evaluation of equipment health.

Minimize failure risks by enabling standards requirements with Itus technology

Vibration monitoring is a very effective approach for early detection of potential failure in rotating machinery.  Historically, applications of vibration monitoring have been limited to highly critical equipment due to the cost and required resources.

Advancements in sensor technology and handheld data collection have dramatically reduced the cost of assessing machinery vibration which now opens the opportunity to expand usage for medium and low criticality equipment.

The downside is organizations are now drowning in sensor data and need simple methods to analyze and interpret that data from a myriad of equipment and condition data sources.  Fortunately, ISO 10816 provides a practical model to assess the severity of vibration readings and should be considered when implementing a vibration monitoring program.

Evaluation of vibration data and compliance to the ISO 10816 standard is just one simple use case of how the Itus APM solution can be used to drive optimal asset strategies to reduce maintenance cost and lower equipment failure rates.  If you would like to learn more about our approach and solution, please connect with us here.

Enhance Reliability Engineer Productivity with Context
https://www.itus-digital.com/enhance-relliability-engineer-productivity-with-context/
Thu, 27 Apr 2023 21:22:37 +0000

The post Enhance Reliability Engineer Productivity with Context appeared first on Itus Digital.

In Asset Management a key method to identify improvement opportunities and track the progress of refinements is by evaluating the data from your business systems.

There are a variety of data elements that can be leveraged to drive asset management programs with most organizations focusing on information from the following areas:

  • What are my assets and where are they located?
  • What are my assets worth (how critical are they to my operation)?
  • What condition are they in?
  • What is my plan to maintain, inspect, and monitor the assets?
  • What maintenance, inspection, and monitoring activities have been performed on my assets?
  • What did we learn when we maintained, inspected, repaired, and monitored the assets?

 

With the maturing of technologies in the industrial space, we are now seeing more organizations that have solved the foundational challenges of data collection and integration of various asset data sources. Enterprise systems such as Enterprise Resource Planning (ERP), Enterprise Asset Management (EAM), and process historians, as well as IT/OT integration solutions coupled with the concept of data lakes, now ensure access to core asset information for resources in engineering, maintenance, operations, and reliability functions. With this evolution, new questions emerge about the optimal way to present the data so that stakeholders can quickly view and interpret it and make timely decisions which positively impact the business. An illustration of this came from a recent conversation with a customer, who used a fishing analogy:

“Yes, we have a Data Lake which has pulled all the information together.  Our challenge is we are in a large body of water with a rowboat trying to catch a trophy fish!  What we need now is a fast boat with a fish finder to win the tournament, do you have one of those?”

 

In this article, we will explore approaches used to deliver data to asset managers to support their quest to optimize performance, cost, and risk. We have a philosophy at Itus Digital: people are the most valuable asset at any manufacturing facility. Their knowledge, decisions, and actions are what ultimately produce a product at an optimal rate and cost profile.

Unfortunately, our industrial workforce is shrinking, and we are experiencing an unprecedented loss of knowledge as baby boomers retire. Analyzing and presenting asset information in the most efficient and effective manner is more important than ever. We also face a challenge on the opposite side of this issue: our new generation of engineers has grown up with iPhones and apps and expects to consume data in a completely different way. They want information served in complete ‘context’, in real time, wherever they are and on whatever device they use. Meeting these expectations is key to driving optimization and continuous improvement with our next generation of asset managers.

Business Intelligence is a starting point

The concept of business intelligence (BI) has been around for a long time but really started gaining momentum in the early 2000s as enterprise solutions consolidated business processes and data into common systems. One of the cornerstones of BI is presenting compiled information in dashboards that stakeholders can access to view standardized metrics and results.

A very common example in asset management is below, a Pareto analysis of asset repair costs.  In the simplest form it is the aggregate summation of repair costs, sorted descending for some asset population over a period of time.

Asset Repair Cost Pareto

By accessing this information, a Maintenance Manager or Reliability engineer gets foundational insights on the assets driving the most repair cost which can be the basis for improvement initiatives.
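The Pareto calculation described above (aggregate repair cost per asset, sorted descending) can be sketched as follows. The function name `repair_cost_pareto`, the asset IDs, and the cost figures are made up for illustration. The cumulative-share column is a common Pareto addition and an assumption here, not something the article specifies.

```python
from collections import defaultdict


def repair_cost_pareto(work_orders):
    """Sum repair cost per asset, sort descending, and add cumulative share.

    `work_orders` is an iterable of (asset_id, repair_cost) pairs.
    Returns a list of (asset_id, total_cost, cumulative_share) tuples.
    """
    totals = defaultdict(float)
    for asset_id, cost in work_orders:
        totals[asset_id] += cost

    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    rows, running = [], 0.0
    for asset_id, cost in ranked:
        running += cost
        share = running / grand_total if grand_total else 0.0
        rows.append((asset_id, cost, share))
    return rows


if __name__ == "__main__":
    # Made-up work order costs for illustration only.
    sample = [("P-101", 12000.0), ("F-201", 3000.0), ("P-101", 8000.0), ("M-301", 2000.0)]
    for asset_id, cost, cumulative in repair_cost_pareto(sample):
        print(f"{asset_id:<6} {cost:>10,.0f} {cumulative:>6.0%}")
```

In practice the input would come from EAM work order history filtered to the asset population and time period of interest.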

Dashboards which pull together core information and insights are a great starting point.  They provide a common location and view of standardized metrics which can drive decision-making.  Many, such as Microsoft’s Power BI, also provide nice capabilities to filter, sort and pivot the information allowing for refinement of the data and analysis.

However, there are some limitations with traditional BI analytical approaches. One issue is that BI implementations tend to be ‘passive’: they require an analyst to go hunting for information to identify a problem. Given how busy resources are within industrial facilities, sometimes the simple process of accessing an application, logging in, and then finding the outlier in the chart is enough to prevent the problem-solving process from starting. Another core challenge with BI and dashboards is that they usually do not drive the process to act on and solve the identified problem. The objective of the dashboard is to present data. If you want to act, you must move to other applications to invoke improvement processes and create tangible value. We find that the most effective use of dashboards tends to be with standardized metrics that do not require close-interval evaluation or rapid response. Examples include bad actor reports, monthly maintenance spend, and weekly downtime hours by operating unit.

Purpose Built Apps Provide More Context

Today, most software applications are designed to meet very specific objectives and in the industrial segment we now have many categories of purpose-built solutions.  EAM solutions track assets and manage maintenance activities.  Process historians collect, aggregate and store massive volumes of time series data.  Asset Performance Management (APM) solutions analyze maintenance and operating histories to identify poor performing assets and drive corrective actions to mitigate failure risk.

Modern solutions in these categories now offer very specific and compelling context based upon their purpose.  When designed appropriately, these apps offer asset managers the right information, at the right time, in the most appropriate context to solve the problem.  The following example from the Itus APM solution highlights the value of full context for a centrifugal pump under management.  In the Itus solution, users can define specific watchlists which contain the assets they manage or want to monitor closely, and each asset has its own watchlist card.

Asset Watchlist

The watchlist card is designed to provide complete context of the current asset strategy and health in a simple and intuitive view.  From this one card, we see a high-risk asset ($750,000 failure consequence) with a high severity condition (elevated vibration and temperature on the Bearing) which was generated 1 day ago.  I can also see we have a submitted corrective action which is 7 days old.

If you are not into interpreting icons and colors, simply flip the card over to get the full narrative, which specifies the at-risk failure modes and indicates that, for $55, we can inspect and lubricate the pump to mitigate the potential failure.

Asset Watchlist

Let’s compare that with our previous BI example.  The dashboard did inform me that this particular pump is my bad actor from the repair cost perspective, which is a great start.  However, purpose-built apps can provide much richer context and therefore more timely and accurate mitigation and business benefit.  With full context, I know the risk associated with failure, what failure modes/threats are currently active and whether action has been taken to address the problem.  Just as important, I am one click away from escalation or additional action as needed based upon the context.

Real Time Exception Management Is The New Norm

In the consumer markets, Smartphones, Tablets and Apps have fundamentally shifted the expectations for how people get, interact with and act on information.  Let’s take a simple example of going on a family vacation or business trip in a car.  Not that long ago you would need to get a map, find your start and end points and then document your directions from point A to B.  If you encountered a closed road, accident or high traffic volume while traveling you would just have to accept the consequences and/or start the mapping process over.  Map applications on smartphones have completely shifted how most of us travel from point A to B.  They are purpose built with full context AND provide real time notifications and recommendations when conditions such as high traffic have rendered your original route as less than optimal.  Most importantly, they simplify an old process and generate new value – optimal travel time based upon current conditions.

Fortunately, industrial applications are learning from the consumer markets and building capabilities to deliver the right information at the right time, to the right place in a prescriptive fashion.  When designed correctly, modern apps already know what you need to know and enable you to opt-in to that information in real time.  The following example is also from the Itus APM solution and demonstrates some of these key capabilities.

First, the application already knows what a Reliability Engineer wants to know.  For example:

  • Is an asset in poor health?
  • Have we taken corrective action?
  • Is a known failure risk emerging?
  • Has a new failure mode emerged?
  • Are we executing the recommended maintenance?
  • Has the asset strategy changed?
  • Has the MTBF changed?

 

Next the Reliability Engineer can quickly opt-in to the information based upon their specific needs and preferences:

User Notification Opt In

Finally, the application delivers the information in real time, in complete context with prescriptive recommendations:

Reliability Engineer User Notification

The advantage of this analytical approach is clear.  No need to spend valuable time hunting through volumes of information to find an exception that needs to be managed further.  We often hear that Reliability Engineers spend 80% of their time gathering data and only 20% of their time analyzing data. With an intelligent and exception-based model, this ratio can easily be flipped.  Simply tell the application what you want to know and let it do the work based upon current conditions.

Data management and analysis are foundational processes when managing industrial assets.  How data is consumed and presented are key considerations when looking to maximize subject matter expert time and effectiveness of decision making.  Modern technologies are solving for the classic challenges of data gathering and integration which is now driving focus on how data is delivered, visualized, and analyzed for maximum business impact.  As you design your asset management program, carefully consider what analytical approaches you are putting in place and work to drive full context, timely delivery, and integrated models to drive mitigating actions.

The team at Itus Digital have been building and deploying analytical solutions with industrial organizations for the last 25 years.  If you would like to chat about analytics or learn more about our solutions, connect with us here.

The post Enhance Reliability Engineer Productivity with Context appeared first on Itus Digital.

]]>
https://www.itus-digital.com/enhance-relliability-engineer-productivity-with-context/feed/ 0
Failure findings from Norfolk Southern train derailment in Ohio https://www.itus-digital.com/early-findings-from-recent-norfolk-southern-train-derailment-in-ohio/ https://www.itus-digital.com/early-findings-from-recent-norfolk-southern-train-derailment-in-ohio/#respond Wed, 22 Mar 2023 00:14:03 +0000 https://www.itus-digital.com/?p=3364 In early February, a Norfolk Southern (NS) train derailed near East Palestine, Ohio. The train consisted of 149 rail cars, with 11 containing hazardous materials that ignited after the derailment. Fortunately, there were no reported injuries or fatalities. However, there are concerns about the long-term...

The post Failure findings from Norfolk Southern train derailment in Ohio appeared first on Itus Digital.

]]>
In early February, a Norfolk Southern (NS) train derailed near East Palestine, Ohio. The train consisted of 149 rail cars, with 11 containing hazardous materials that ignited after the derailment. Fortunately, there were no reported injuries or fatalities. However, there are concerns about the long-term environmental and health impacts on the 2,000 residents of East Palestine. The National Transportation Safety Board (NTSB) has released a preliminary report, RRD23MR005 – Norfolk Southern Railway Train Derailment with Subsequent Hazardous Material Release and Fires, which provides initial insights into the failure modes and protections being evaluated as part of the investigation. These insights are worth exploring further in this article.

The primary mechanical failure risk being investigated in the accident is the overheating of a wheel bearing. Overheated bearings are not the only problem that can cause a train to derail, but healthy bearings are essential to a train’s safe and efficient operation. Inside each bearing is a series of rollers that are a critical component in turning the rail car axle. When lubricated, the bearings limit friction while supporting the railcar’s weight. If a bearing gets too hot, usually from a loss of lubricant, it can melt, causing it to seize up or come off the axle. The resulting damage can throw a railcar out of alignment and cause it to jump the tracks.

“Roller bearings fail. But it’s absolutely critical for problems to be identified and addressed early so these aren’t run until failure.” - NTSB Chair Jennifer Homendy

 

Norfolk Southern has implemented protections to mitigate bearing failures on its railcars using a Hot Bearing Detector (HBD). The HBD is placed trackside at fixed points and automatically measures the axle temperatures as the train passes by. Its function is to detect overheated bearings and provide real-time warnings to train crews so they can take appropriate action. A more detailed view of the protection scheme, evaluation thresholds, and prescribed actions is provided below.

 

 

The protection definition follows very similar constructs to an Asset Twin in the Itus Solution. Key components which are defined include a Failure Risk, which is the elevated bearing temperature due to lack of lubrication. Also defined is the specific Protection, which monitors the bearing temperature over time. The Advisories (prescriptive actions) are also defined, which detail what should be done when certain conditions identify emerging threats.

The report found that the temperature of the bearing in question had been increasing for 30 miles before reaching East Palestine. However, only the third reading reached Norfolk Southern’s threshold to stop and inspect the train via a real-time audible alarm. Unfortunately, the alarm was triggered too late, as the train derailment was already under way.

 

The bearing temperatures were evaluated at three data points before the train derailment. The 23rd car’s axle had a recorded temperature of 38 degrees above ambient temperature at Milepost 79.9 on the Fort Wayne Line. At the next detector, at Milepost 69.01, it had increased to 103 degrees above ambient, and at Milepost 49.81 on the east side of East Palestine, the recorded temperature was 253 degrees above ambient.
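To make the protection logic concrete, the sketch below evaluates the three recorded readings against a two-level threshold scheme. The 200 degree critical threshold is the figure cited in this article; the 170 degree "stop and inspect" level is an assumed value used only for illustration, and the `classify` function is a hypothetical helper, not Norfolk Southern's or Itus's actual logic.

```python
CRITICAL_F = 200.0  # critical threshold (degrees above ambient) cited in this article
WARM_F = 170.0      # ASSUMED non-critical "stop and inspect" threshold, for illustration


def classify(temp_above_ambient_f):
    """Map a bearing temperature reading to a prescribed advisory level."""
    if temp_above_ambient_f > CRITICAL_F:
        return "critical"
    if temp_above_ambient_f >= WARM_F:
        return "stop and inspect"
    return "normal"


# Recorded readings from the preliminary report: (milepost, degrees above ambient)
readings = [(79.9, 38.0), (69.01, 103.0), (49.81, 253.0)]

results = [(milepost, temp, classify(temp)) for milepost, temp in readings]
for milepost, temp, advisory in results:
    print(f"MP {milepost:6.2f}: {temp:5.0f} F above ambient -> {advisory}")
```

Under these assumed thresholds the first two readings raise no advisory at all, and the third jumps straight to critical, which is the gap the rest of this article explores.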

The accident highlights key constructs to consider when designing failure risk protection models.

Data polling rates should consider condition escalation rates to allow for enough time to detect and respond to mitigate the identified failure risk. In the case of the NS derailment, the distance between the last two temperature measurements on the wheel bearing was 19.2 miles, and the temperature difference over that distance was 150 degrees. Somewhere over that distance, the bearing temperature passed through the non-critical threshold that would have advised the train engineer to stop and inspect the rail car, and by the next reading it had pushed well past the critical threshold of 200 degrees. To mitigate this potential data gap in the future, the Association of American Railroads announced that all seven Class 1 railroads in the country have committed to adding approximately 1,000 detectors to close the gaps between detectors and achieve an average spacing of 15 miles.
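A rough way to reason about polling rate versus escalation rate is sketched below using the two readings above. It assumes the temperature rose linearly between detectors, which is a simplification (real bearing failures can accelerate), and the helper names are hypothetical.

```python
def escalation_rate(milepost_a, temp_a, milepost_b, temp_b):
    """Average temperature rise per mile between two detector readings."""
    return (temp_b - temp_a) / abs(milepost_a - milepost_b)


def miles_until(threshold, temp_now, rate):
    """Miles travelled before a threshold is reached, ASSUMING a linear rise."""
    if rate <= 0:
        return float("inf")  # temperature is not rising
    return max(0.0, (threshold - temp_now) / rate)


# Last two readings: 103 F at Milepost 69.01 and 253 F at Milepost 49.81
rate = escalation_rate(69.01, 103.0, 49.81, 253.0)
miles_to_critical = miles_until(200.0, 103.0, rate)

print(f"Escalation rate: {rate:.2f} degrees per mile")
print(f"Miles after the 69.01 detector to reach 200 F: {miles_to_critical:.1f}")
```

With these inputs the average rise is about 7.8 degrees per mile, so under the linear assumption the critical threshold would have been crossed roughly 12.4 miles after the second detector, well before the next reading 19.2 miles away. The design question is whether the sampling interval is shorter than the time (or distance) the condition takes to move from first warning to critical.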

Design thresholds, analytics, and actions from actual experience. When designing a Failure Risk Protection, reliability engineers must make key decisions, including how many thresholds to implement, how much tolerance should be given for each threshold, and what to prescribe for risk mitigation at each level.  Sometimes this information is available through the collective knowledge of industry experts via standards such as ISO 10816 (mechanical vibration).  In other situations, this information may be available from OEMs in their operation, maintenance, and troubleshooting manuals, defined from their specific failure testing.  Many times, these models are developed from a ‘really bad experience’ or consequential failure.  As a result of this accident, Norfolk Southern is working with manufacturers to develop more sensors, reevaluate triggering thresholds, and analyze data for patterns that could provide earlier warnings.  A more robust method is to design Protections from actual operating experience.  Solutions like Itus provide the ability to simulate a model from historical data, which is an ideal method to define analytics for a specific operating context.  This capability allows engineers to understand how often the system will advise at various threshold levels, ensuring there is appropriate time to respond with the most appropriate prescriptive action to mitigate the risk.
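The idea of simulating thresholds against history can be illustrated with a very small backtest. This is a generic sketch, not the Itus simulation capability; the historical readings and threshold names and values are made up for the example.

```python
def backtest(history, thresholds):
    """Count, per threshold, how many historical readings would have raised an advisory."""
    return {
        name: sum(1 for value in history if value >= limit)
        for name, limit in thresholds.items()
    }


# Hypothetical historical bearing temperatures (degrees above ambient)
history = [35, 42, 38, 95, 110, 60, 175, 40, 210, 45]

# Hypothetical candidate thresholds to evaluate
thresholds = {"watch": 90, "stop_and_inspect": 170, "critical": 200}

counts = backtest(history, thresholds)
for name, count in counts.items():
    print(f"{name:18s} would have advised {count} time(s)")
```

Running candidate thresholds over real operating history shows how noisy or how silent each level would have been, which helps balance early warning against nuisance alarms before anything goes live.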

Accounting for the worst-case consequence is essential when evaluating risk. Unfortunately, we often see analytical models which are not designed from a complete risk context.  By injecting risk into a Failure Protection scheme, we can assess the situation more accurately (historical probability of failure) and, more importantly, advise on mitigations with critical context to drive appropriate action.  According to the Federal Railroad Administration, about 1,000 derailments occur each year, but on average only 17 of those accidents involve rail cars with hazardous cargo that could present significant safety or environmental risks.  Ideally, risk should be evaluated within the analytical model to provide more advanced warning, increase time to respond, and offer more specific context on what is happening and what to do. As we are learning from the East Palestine accident, there is a very different consequence when a rail car carrying freight has a failing bearing vs. a rail car carrying highly toxic vinyl chloride.

The East Palestine accident impacted 2,000 residents with long-term environmental and health concerns. The rail industry has a keen understanding of the wheel bearing – lack of lubrication failure mode for rail cars and is making progress on the protections that can be put in place to mitigate the risks. The preliminary report offers valuable insights, and useful design reminders, for anyone building analytical or failure risk monitoring models.

If you are interested in learning more about Asset Twins and how the Itus solution can rapidly implement failure risk protections to predict and minimize unplanned events, feel free to reach out to us. We would love to chat!

The post Failure findings from Norfolk Southern train derailment in Ohio appeared first on Itus Digital.

]]>
https://www.itus-digital.com/early-findings-from-recent-norfolk-southern-train-derailment-in-ohio/feed/ 0
Can ChatGPT replace the Reliability Engineer? https://www.itus-digital.com/can-chatgpt-replace-the-reliability-engineer/ https://www.itus-digital.com/can-chatgpt-replace-the-reliability-engineer/#respond Wed, 18 Jan 2023 22:04:22 +0000 https://www.itus-digital.com/?p=3317 If you have been hanging around the Internet lately you have probably heard about OpenAI’s platform which recently had one of their products, ChatGPT go viral after its launch in November 2022.  Microsoft is also in discussions with OpenAI to potentially invest $10 Billion dollars...

The post Can ChatGPT replace the Reliability Engineer? appeared first on Itus Digital.

]]>
If you have been hanging around the Internet lately, you have probably heard about OpenAI, whose product ChatGPT went viral after its launch in November 2022.  Microsoft is also in discussions with OpenAI to potentially invest $10 billion in the company in addition to its initial $1 billion investment.  That was not a typo: $10 billion in new investment, with the intention to integrate the technology into the Office product suite as well as its search engine, Bing.

ChatGPT is a free AI application (neural network based) which has scoured the internet for information and can answer questions in a very ‘human-like’ fashion.  The capability is very expansive; some examples we experimented with included defining job descriptions, writing essays on English literature, solving biomedical engineering equations, and even writing Python code!

Our experimentation with ChatGPT got the Itus team wondering, if we set asset operating context aside, could it define an asset strategy?

To satisfy our curiosity, we decided to run some scenarios and see what we got back.  Our first request to ChatGPT was to define the failure modes for a specific asset class.  On this task, the results were a bit better than expected: we received a relatively good list of failure modes, similar to what someone with a reliability background would come up with.  When asked about a centrifugal pump, it returned the following:

Common failure modes for a centrifugal pump include:

  • Impeller wear or damage
  • Shaft or bearing failure
  • Clogging or blockage of the suction or discharge lines
  • Cavitation (vaporization of the fluid due to low pressure)
  • Worn or damaged mechanical seals
  • Loss of prime (loss of suction)
  • Air binding (air getting into the pump)
  • Overheating
  • Electrical or control issues.

 

Then looking specifically at impeller wear we inquired about mitigating maintenance or condition monitoring which could be implemented, to which we received the following:

Some maintenance and condition monitoring activities that can help mitigate impeller wear in a centrifugal pump include:

  • Regularly inspecting and cleaning the pump’s impeller and other components to ensure they are free of debris or other blockages.
  • Monitoring the pump’s vibration levels and bearing temperatures to detect any abnormal conditions that could indicate wear or damage.
  • Checking the alignment of the pump’s shaft and impeller to ensure they are properly aligned and not causing undue wear.
  • Measuring the pump’s flow rate and head pressure to ensure it is operating within its design parameters.
  • Performing regular lubrication and oil changes, as well as checking the oil’s viscosity and contamination, this help to maintain the pump’s bearings and mechanical seals.
  • Checking the pump’s suction and discharge lines for blockages or restrictions.
  • Implementing a preventative maintenance schedule and following the pump’s manufacturer’s recommendations for regular maintenance.
  • Conducting regular ultrasonic testing of the impeller to detect any defects.
  • Inspecting the impeller’s wear ring and replacing it if it worn or damaged.

 

Not a bad list but it is generic and lacking details.

We explored further and asked for an FMEA on the same asset class.  The result was mixed.  At first, we were impressed with how it organized the results in a typical FMEA format and included potential causes, potential severity, risk priority number, and recommended actions.   Optimism faded when we realized we only got two failure modes back.  On subsequent attempts, we were more specific with the asset class and service of the fictitious asset.  The results changed very little.
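For readers less familiar with the risk priority number (RPN) mentioned above, it is conventionally the product of severity, occurrence, and detection ratings, each scored from 1 to 10. The sketch below ranks a few centrifugal pump failure modes this way. The ratings are invented for illustration and are not ChatGPT's output or a recommended scoring.

```python
def rpn(severity, occurrence, detection):
    """Conventional FMEA risk priority number: severity x occurrence x detection."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:
            raise ValueError("FMEA ratings are expected on a 1-10 scale")
    return severity * occurrence * detection


# Hypothetical ratings for a centrifugal pump (illustrative values only)
fmea_rows = [
    {"failure_mode": "Impeller wear", "severity": 6, "occurrence": 5, "detection": 4},
    {"failure_mode": "Mechanical seal leak", "severity": 7, "occurrence": 6, "detection": 3},
    {"failure_mode": "Bearing failure", "severity": 8, "occurrence": 4, "detection": 5},
]

# Highest RPN first, so the list reads as a priority order
ranked = sorted(
    fmea_rows,
    key=lambda row: rpn(row["severity"], row["occurrence"], row["detection"]),
    reverse=True,
)

for row in ranked:
    score = rpn(row["severity"], row["occurrence"], row["detection"])
    print(f"{row['failure_mode']:22s} RPN = {score}")
```

The arithmetic is trivial; the hard part, and the part a generic answer cannot supply, is choosing ratings that reflect your actual operating context.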

So where does this leave us?  Does ChatGPT have a role in a reliability department?  Could it be utilized to enhance strategy development approaches such as RCM or FMEA?

If you had nothing, ChatGPT could be used to get generic information about failure modes, failure rates and PMs to mitigate those failure modes.  While this is more helpful than an empty document or spreadsheet, it still leaves the real effort of implementing the asset strategy as defined by the FMEA untouched.

This is where the Asset Performance Management (APM) process can be an enabler.  APM is rooted in the methodologies of RCM and FMEA and implements their output in the form of the asset strategy.  Modern APM solutions offer a built-in library of strategies which will be much more useful than what ChatGPT can currently provide.  However, a library is not enough.  The value of APM is only realized when the strategy is implemented and “operationalized”.  This is where modern APM solutions drive tangible value for industrial organizations seeking to maximize availability, optimize maintenance spend and lower risk to people and the environment.

If you are a student looking to write an essay on The Lord of the Flies, you could certainly consider ChatGPT for some insights.  However, for the maintenance and reliability domain, we recommend you tap into the equipment and process experts in your organization, the actual operating history of your equipment and the expertise built into APM applications which will implement asset strategies inclusive of PMs, analytics, monitoring and prescriptive actions.

We certainly look forward to seeing this class of technology grow, what it can do, and the creative ways it will be applied.  It will also be interesting to see what it learns from this article, which is now publicly available for its engine to analyze!

At this point we can easily state – the reliability engineer role is going to be around for a while!

The post Can ChatGPT replace the Reliability Engineer? appeared first on Itus Digital.

]]>
https://www.itus-digital.com/can-chatgpt-replace-the-reliability-engineer/feed/ 0
Confessions of an Old School Risk Matrix Zealot https://www.itus-digital.com/confessions-of-a-risk-matrix-zealot/ https://www.itus-digital.com/confessions-of-a-risk-matrix-zealot/#respond Wed, 04 Jan 2023 18:21:27 +0000 https://www.itus-digital.com/?p=3307 Webster's Dictionary defines risk as the “possibility of loss or injury”.  ISO 31000 defines a risk matrix as “a tool for ranking

The post Confessions of an Old School Risk Matrix Zealot appeared first on Itus Digital.

]]>
Webster’s Dictionary defines risk as the “possibility of loss or injury”.  ISO 31000 defines a risk matrix as “a tool for ranking and displaying risks by defining ranges of consequences and likelihood”.  With these two definitions you can simply define a risk matrix like the example below:

 

Example Risk Matrix

 

The problem with this simple approach is context.  Can it really be said with confidence that a rare but catastrophic event is as low a risk as a negligible but frequent event?  No, because the matrix above lacks meaningful context.  I’ve learned that different risk contexts require different risk visualizations.
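The point can be shown with a few lines of code. The sketch below assumes a common 5x5 matrix where the score is simply the likelihood rank multiplied by the consequence rank; the category names and ranks are illustrative and may not match the example matrix above.

```python
# ASSUMED 5x5 scales (illustrative labels and ranks)
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "frequent": 5}
CONSEQUENCE = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "catastrophic": 5}


def matrix_score(likelihood, consequence):
    """Simple multiplicative risk matrix score."""
    return LIKELIHOOD[likelihood] * CONSEQUENCE[consequence]


rare_catastrophic = matrix_score("rare", "catastrophic")
frequent_negligible = matrix_score("frequent", "negligible")

print(f"Rare but catastrophic:   {rare_catastrophic}")
print(f"Frequent but negligible: {frequent_negligible}")
```

Both cells land on the same score, even though few organizations would treat a potential catastrophe and a recurring nuisance the same way.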

I used to think that a risk matrix was the ultimate tool in evaluating criticality and risk for asset management.   I used to think that all categories of risk could be evaluated congruently in a common risk matrix across an organization.  In my past, I even led an entire product line of asset management tools built upon this principle, thus forcing many subject matter experts in asset reliability, mechanical integrity, and process safety to attempt to align to a common risk matrix.  Those were some very spirited discussions, usually turning into debates, with alignment not always the result.  As William Blake once wrote, “If the fool would persist in his folly he would become wise.”  Indeed, I have learned this lesson: with age and experience comes wisdom and humility.

In my defense, the risk matrix is a very good tool.  It provides a visual representation to communicate risk concepts simply while providing a framework for prioritization.  It can be customized for specific organizations with specific definitions of risk.  It is no wonder industry so quickly wants to adopt it to understand risk across our organizations.  Unfortunately, it is not a good tool for comprehensive risk quantification, nor its mitigation.  It is limited to discrete ranges of probability and consequence.  It is often over-simplified, subjectively qualitative and does not consider how risk changes over time.

Let’s consider a couple of common categories of risk in asset intensive industry – Safety and Operations.

First and foremost, let’s discuss Safety risks which are focused on consequences from minor injury up to very severe injury, including fatality.  Naturally, the protection of people is of utmost importance.  A safety risk assessment considers events that could lead to personnel harm or injury.  For example: could the event occur, could the event cause a chemical leak and/or fire, and is there a possibility that someone could be exposed to the leak or fire?  When determining a risk matrix, the consequence categories can scale reasonably from “minor first aid” to “fatality”.  But in this context our probability scale actually factors in multiple probabilities for every row (i.e. event, leak, fire, and exposure).  Thus, the matrix probability could scale by orders of magnitude, from an occurrence of once per year to an occurrence of once every 10,000 years.  It can be difficult to practically understand something occurring only once every 10,000 years, but this makes total sense to a safety engineer or likewise to a mechanical integrity engineer. They are responsible for mitigating severe consequences that should never happen which include fatality, loss of containment, fire, and hazards to the environment.
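A quick sketch shows how chaining conditional probabilities produces those very long return periods. The initiating event frequency and the conditional probabilities below are invented round numbers, and the calculation assumes each step is independent of the others.

```python
import math


def scenario_frequency(initiating_events_per_year, conditional_probabilities):
    """Frequency of the full scenario, ASSUMING each step is independent."""
    return initiating_events_per_year * math.prod(conditional_probabilities)


# Hypothetical values: event once every 10 years, then a 10% chance each of
# a leak, an ignition, and someone being exposed
frequency = scenario_frequency(0.1, [0.1, 0.1, 0.1])
years_between_events = 1.0 / frequency

print(f"Scenario frequency: {frequency:.6f} per year")
print(f"Roughly once every {years_between_events:,.0f} years")
```

Four fairly ordinary-looking probabilities multiply out to about once in 10,000 years, which is why a safety risk matrix needs a probability axis spanning several orders of magnitude.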

Safety and integrity engineers use methods of risk mitigation dictated by process safety management standards such as Hazards Analysis, Safety Instrumented Systems (SIS), Layers of Protection Analysis (LOPA), Risk Based Inspection (RBI) as well as compliance with jurisdictional standards.  These methods are relatively complex methodologies that drive recommended and mandatory actions as part of an overall safety and integrity plan.   While you can use a risk matrix as a visual representation of the result of these methods, you cannot easily use the same risk matrix to represent what I will next call Operational risk.

For the context of this article, I define Operational risk as the risk of unplanned production downtime and associated costs, which may include maintenance, overtime, lost production, rework, and scrap.  Often unplanned production downtime is caused by asset failures.  Reliability engineers seek to mitigate the risk of these asset failures, especially those that impact production.  There is typically an immediate cause and effect of a critical production asset failure.  The asset fails and production is immediately impacted.  Imagine assets that failed only once every 10,000 years; reliability engineers would not be needed to determine how to improve those failure rates!  That is the difference in probability scale for a reliability engineer.  They are dealing with asset failure consequences of production loss and costs which could occur multiple times a year up to once every several years.  It is a completely different context to that of the Safety or Integrity engineer.

Thus, I have learned that a risk matrix is just a defined set of intersections of probability and consequence chosen to represent a category or context for risk.  As described above, contexts for Safety and Operations are different, thus it is reasonable for risk matrices to differ.  Remember it is a tool for visualization and prioritization more than assessment and mitigation.

Back to the reliability engineer focused on Operational risks, there is another key aspect to consider for risk assessment that I call “mission time”.  The mission time for an asset or a group of assets, such as a production unit, can be thought of as the time between major shutdown events where restorative maintenance or asset replacement is performed.   If a risk matrix is used to assess risk in this context, the assessment will be limited to just the ranges defined for each intersection and it will not have any context of risk over time.

For the reliability engineer it is best to assess risk with a more quantitative probability estimate over a mission time multiplied by an estimate of the overall cost of the potential failure to the business.  This allows the engineer to focus improvement efforts on the assets that will return the most value to the business.  These include efforts leveraging reliability methodologies such as failure mode-based strategy development, maintenance optimization, reliability modeling, asset health monitoring and advanced analytics.

A better risk assessment method is to estimate a failure probability quantitatively for the asset.  This can easily be done with an estimate of the failure rate experienced (or expected) combined with a desired mission time.  A simple calculation can be used to represent the probability over time, such as a Weibull distribution or, for random failures, an exponential distribution.   Plotting the distribution will provide a simple visual over time.
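As a minimal sketch of that calculation, the function below uses the Weibull cumulative failure probability, F(t) = 1 - exp(-(t/eta)^beta). With beta = 1 it reduces to the exponential (random failure) case, where eta equals the MTBF. The 4 year MTBF and 3 year mission time are hypothetical inputs.

```python
import math


def failure_probability(t, eta, beta=1.0):
    """Weibull cumulative probability of failure by time t.

    eta is the characteristic life; beta is the shape parameter.
    beta = 1.0 gives the exponential (random failure) case, where eta = MTBF.
    """
    return 1.0 - math.exp(-((t / eta) ** beta))


mtbf_years = 4.0     # hypothetical failure rate estimate: one failure every 4 years
mission_years = 3.0  # hypothetical time between major shutdowns

# Probability of at least one failure by the end of each year
for year in range(1, 6):
    print(f"Year {year}: {failure_probability(year, mtbf_years):.1%}")

p_mission = failure_probability(mission_years, mtbf_years)
print(f"Probability of failure over the mission time: {p_mission:.1%}")
```

With these inputs the probability of at least one failure over the three year mission is a little under 53%, a number that a discrete matrix cell cannot express.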

 

Failure Probability Over Time

 

Consequence can also be estimated based on the overall cost of failure, which would include all costs including repair costs and production losses.  Combining this cost-based consequence with our failure probability, overall risk for our mission time can be easily calculated.
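In code, the combination is a single multiplication. The sketch below assumes random (exponential) failures and uses hypothetical inputs: a 4 year MTBF, a 3 year mission time, and a $200,000 total cost of failure.

```python
import math


def mission_risk(mtbf_years, mission_years, cost_of_failure):
    """Risk in dollars = probability of failure over the mission x cost of failure.

    ASSUMES random failures, i.e. an exponential distribution.
    """
    probability = 1.0 - math.exp(-mission_years / mtbf_years)
    return probability * cost_of_failure


# Hypothetical inputs: repair cost plus production losses total $200,000
risk = mission_risk(mtbf_years=4.0, mission_years=3.0, cost_of_failure=200_000)
print(f"Overall risk over the mission time: ${risk:,.0f}")
```

Because the result is in dollars, it can be compared directly with the cost of the mitigation being considered.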

 

Overall Risk Calculation

 

Now if you have a set of production assets to evaluate for a system or unit, you could compare them across a mission time between shutdowns.  A comparison might look like the chart below; note the riskiest asset is not always the one to focus on during the mission time.

 

System Level Risk Assessment

 

Assets with a higher probability of failure but lower cost might need more attention during the mission time than an asset with a much lower probability of failure but much higher cost.  This is a risk comparison tool to evaluate assets in a specific context over a specific time.  Also, by estimating failure probability and cost of failure, you are better equipped to leverage other reliability methods to determine the best course of action to improve assets.
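Here is a small sketch of such a comparison across a unit. The asset names, MTBF values, and failure costs are hypothetical, and random (exponential) failures are assumed for every asset.

```python
import math


def assess(assets, mission_years):
    """Return (name, failure probability, risk in dollars) per asset, highest risk first.

    ASSUMES random (exponential) failures for every asset.
    """
    rows = []
    for name, mtbf_years, cost_of_failure in assets:
        probability = 1.0 - math.exp(-mission_years / mtbf_years)
        rows.append((name, probability, probability * cost_of_failure))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical unit: (asset, MTBF in years, total cost of failure in dollars)
assets = [
    ("Pump A", 2.0, 100_000),
    ("Compressor B", 40.0, 2_000_000),
    ("Fan C", 10.0, 50_000),
]

rows = assess(assets, mission_years=2.0)
for name, probability, risk in rows:
    print(f"{name:13s} P(fail) = {probability:5.1%}   risk = ${risk:>9,.0f}")

# The asset most likely to fail during the mission is not the highest-risk asset
most_likely = max(rows, key=lambda row: row[1])
print(f"Most likely to fail during the mission: {most_likely[0]}")
```

In this made-up unit, Compressor B carries the highest dollar risk at under a 5% chance of failure, while Pump A is more likely than not to fail before the next shutdown, so the two call for different kinds of attention.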

Now for those of you that have assessed asset criticality with a qualitative risk matrix approach, take heart because all is not lost.  You are a step ahead.  Consider your risk matrix intersection an initial assessment that can be leveraged in the more detailed assessment above.  You can simply use your ranges as estimates of failure rates and costs to plug into the same formulae and then adjust as needed.
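One way to do that conversion is sketched below: each probability band is mapped to an assumed failure rate and each consequence band to an assumed cost, then the same probability-times-cost calculation is applied. The band names and the numbers attached to them are placeholders; substitute the ranges defined in your own matrix.

```python
import math

# ASSUMED mapping from matrix bands to point estimates (replace with your own ranges)
FAILURES_PER_YEAR = {"frequent": 2.0, "likely": 0.5, "possible": 0.2, "unlikely": 0.05, "rare": 0.01}
COST_OF_FAILURE = {"negligible": 5_000, "minor": 25_000, "moderate": 100_000, "major": 500_000, "catastrophic": 5_000_000}


def risk_from_matrix_cell(probability_band, consequence_band, mission_years):
    """Turn a qualitative matrix cell into a dollar risk over the mission time.

    ASSUMES random (exponential) failures at the band's estimated rate.
    """
    rate = FAILURES_PER_YEAR[probability_band]
    probability = 1.0 - math.exp(-rate * mission_years)
    return probability * COST_OF_FAILURE[consequence_band]


estimate = risk_from_matrix_cell("possible", "major", mission_years=3.0)
print(f"Initial quantitative estimate: ${estimate:,.0f}")
```

The first pass will be coarse, but it gives every asset a starting number that can be refined as real failure history and cost data become available.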

With this comparison, the next steps can be taken to mitigate risk and capture its associated value to the organization.  There are several mitigation methods to improve an asset’s performance or reduce the cost of unplanned failures.  These include addressing problems such as:

  • The asset is unreliable with low inherent MTBF
  • My asset strategy does not cover all failure modes
  • My asset strategy does not effectively cover failure modes
  • My asset strategy interval is too low (doing too much)
  • My asset strategy interval is too high (doing too little)
  • My asset strategy is not being executed properly
  • My asset strategy is not addressing the root cause of the failures

 

In summary, remember an initial risk assessment is just the starting point to any form of active risk management.  You must use the assessment to drive prioritization and improvement of asset performance in support of the operation of your business.  A risk matrix is a tool that can be used, but we believe a quantitative approach is better.  The better the assessment, the better decisions you will make regarding risk mitigation.

If you want to take your first, or next, steps to assess operational risk, we can help!

Register for our FREE Asset Risk Analyzer here and start managing and mitigating your risk in minutes!

 

 

The post Confessions of an Old School Risk Matrix Zealot appeared first on Itus Digital.

]]>
https://www.itus-digital.com/confessions-of-a-risk-matrix-zealot/feed/ 0
Use of the Mean Time Between Failure Calculation https://www.itus-digital.com/calculating-mean-time-between-failure/ https://www.itus-digital.com/calculating-mean-time-between-failure/#respond Tue, 29 Nov 2022 21:38:24 +0000 https://www.itus-digital.com/?p=3255 Track Mean Time Between Failure from your EAM system with our FREE Calculator (in Excel) Mean Time Between Failure (MTBF) Basics The purpose of this blog is to introduce the concept of calculating Mean Time Between Failure (MTBF) and offer our FREE tool to calculate...

The post Use of the Mean Time Between Failure Calculation appeared first on Itus Digital.

]]>
Track Mean Time Between Failure from your EAM system with our FREE Calculator (in Excel)

Mean Time Between Failure (MTBF) Basics

The purpose of this blog is to introduce the concept of calculating Mean Time Between Failure (MTBF) and offer our FREE tool to calculate MTBF directly from EAM solutions such as SAP and Maximo.  When it comes to estimating how often an asset fails, there are many layers of sophistication, from preparing data to calculating values and interpreting results.  The level of analytical rigor to apply in an MTBF calculation is driven by the specific use case, an organization’s maturity, and the required level of accuracy in the results.  We are just scratching the surface of this topic in this blog, so please reach out to us here if you would like to learn about more sophisticated approaches to managing equipment failure rates.

Want to skip the MTBF intro and start getting reliability insights right away with our FREE MTBF Calculator?  Scroll to the bottom and download!

Managing the Mean Time Between Failures (MTBF) of your equipment is one of the most basic yet effective approaches to measure performance, identify bad actors and drive reliability improvement programs.  Typically represented in years, months, or days, MTBF measures the average length of time that an asset has operated without interruption.  The metric provides key insights to help drive tactics to ensure assets are operating to their fullest potential at the optimal cost profile and should be part of any maintenance and reliability professional’s toolkit.

Unfortunately, the MTBF calculation is not always well understood, and while the mathematical equation is simple, many still struggle to gain visibility into their equipment failure rates.  We see the MTBF measure as a cornerstone of driving equipment reliability improvements and are often asked to help organizations determine the best approach to calculate the failure rate of their equipment.  To help customers gain visibility into their failure rates, we offer several approaches, from simple to complex, depending on input data quality, required accuracy, and level of calculation sophistication.  This blog introduces our simplest approach (which is also FREE) to quickly calculate equipment MTBF utilizing a Microsoft Excel template.

Data Inputs to Calculate MTBF

The most common source of data for calculating MTBF is work order data from a Computerized Maintenance Management System (CMMS) or Enterprise Asset Management (EAM) system.  Work orders are typically used to plan repair activities, order replacement parts, and source labor, so they usually provide a good record of when a failure has occurred.  While work orders are the most common way to determine when an asset has failed, they can present calculation challenges depending on the completeness of the information supplied during the order closure process.  The subject of work order data quality is, and will most likely forever be, debated amongst maintenance and reliability professionals.  Some take the position that the data must be perfect before it can be used in an MTBF calculation, which unfortunately prevents them from using this very valuable measurement of equipment performance.  “Perfect is the enemy of good” is an aphorism that describes this engineering perspective:  an insistence on perfection often prevents implementation of good improvements.  The good news is that our experience in leveraging work order data for reliability measurements over the last 20 years tells us that many of you now have work order data that is appropriate for the use case most reliability programs need today:  an estimation of rate of failure to make better reliability and maintenance investment decisions.

Other technological advancements, such as the rapid expansion of sensors and monitoring systems, are making it more common to automatically identify and document equipment failures with greater accuracy.  For equipment that is already monitored by a control system or process historian, you may already have direct access to the running state of the machine.  That data can help you determine when an asset has failed and automatically calculate run times, which are key inputs to MTBF calculations.  For assets that are not actively monitored, it is now possible to install affordable and non-invasive sensors which monitor conditions (such as energy usage) to automatically detect when an asset is not running and document each time the asset starts or stops.
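As a rough illustration of how start and stop records could be turned into running time, here is a minimal Python sketch.  The event format (a timestamp paired with a "start" or "stop" label) and the sample timestamps are assumptions made for this example; they do not represent the output of any particular historian or sensor product.

```python
from datetime import datetime

def running_hours_from_events(events):
    """Sum the hours between each 'start' and the next 'stop'.

    `events` is a time-ordered list of (timestamp, state) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    total_hours = 0.0
    started_at = None
    for timestamp, state in events:
        if state == "start" and started_at is None:
            started_at = timestamp
        elif state == "stop" and started_at is not None:
            total_hours += (timestamp - started_at).total_seconds() / 3600.0
            started_at = None
    return total_hours

# Made-up sample data: a 12-hour run followed by a 6-hour run.
sample_events = [
    (datetime(2024, 1, 1, 6, 0), "start"),
    (datetime(2024, 1, 1, 18, 0), "stop"),
    (datetime(2024, 1, 2, 6, 0), "start"),
    (datetime(2024, 1, 2, 12, 0), "stop"),
]
print(running_hours_from_events(sample_events))  # 18.0
```

An asset that is still running at the end of the record is not credited for the open interval in this sketch; a real implementation would need to decide how to treat that case.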

MTBF Calculation Approaches

In its simplest form, MTBF is calculated by taking the total time an asset is running and dividing it by the number of failures that happened over that same period of time or:

MTBF = Running time between installation date and last failure / # of Failures

Running time:  This is the total amount of time an asset is running over a specific time period.  Note that if you have assets that are not operating 24×7, running time is not represented by calendar time.

# of Failures:  This is the total number of equipment failures or breakdowns which have occurred over a specific time period.
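To make the two inputs concrete, the sketch below counts failures from a small set of work orders and estimates running time for an asset that does not operate 24×7.  The record layout, the `order_type` values, the asset tag and the 16-hours-per-day schedule are all hypothetical; your EAM export will look different.

```python
from datetime import date

# Hypothetical work order records; field names are assumptions, not an EAM schema.
work_orders = [
    {"asset": "P-101", "order_type": "breakdown", "created": date(2021, 3, 14)},
    {"asset": "P-101", "order_type": "preventive", "created": date(2021, 9, 1)},
    {"asset": "P-101", "order_type": "breakdown", "created": date(2022, 6, 20)},
    {"asset": "P-101", "order_type": "preventive", "created": date(2022, 9, 1)},
]

def count_failures(orders, asset):
    """Count only the work orders classified as breakdowns for one asset."""
    return sum(
        1 for o in orders
        if o["asset"] == asset and o["order_type"] == "breakdown"
    )

def scheduled_running_hours(start, end, hours_per_day):
    """Approximate running time for an asset on a fixed daily schedule."""
    return (end - start).days * hours_per_day

failures = count_failures(work_orders, "P-101")
hours = scheduled_running_hours(date(2021, 1, 1), date(2023, 1, 1), hours_per_day=16)
mtbf_hours = hours / failures
print(failures, hours, mtbf_hours)  # 2 11680 5840.0
```

Using calendar time here (730 days × 24 hours) would overstate running time by half and inflate the MTBF accordingly, which is why the schedule matters.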

 

Here is a very simple example.  Take a cooling water pump that continuously operates at a manufacturing facility.  The pump does not have a standby spare, and we want to calculate its specific MTBF over a 5-year period.  The asset has experienced 4 failures over the 5 years.

Running Time = 5 years

# Of Failures = 4

MTBF = 1.25 Years (5 years / 4 failures)
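The same arithmetic can be written as a few lines of Python.  This is only a sketch of the simple formula above; the function name `mtbf` is our own choice for illustration.

```python
def mtbf(running_time, failures):
    """Return mean time between failures, in the same unit as running_time."""
    if failures <= 0:
        raise ValueError("MTBF needs at least one recorded failure")
    return running_time / failures

# Cooling water pump: 5 years of continuous running, 4 failures.
pump_mtbf_years = mtbf(running_time=5.0, failures=4)
print(pump_mtbf_years)  # 1.25
```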

For this blog, the focus is on a simple method to calculate MTBF directly from work order data.  If you have more descriptive data sets and supporting calculation tools, you may want to utilize a Weibull Distribution for your failure rate analysis.  A Weibull analysis will provide a ‘beta’ value (or shape factor) which offers additional insights on the failure pattern (infant mortality, random, wear-out) associated with the MTBF value.  If you are interested in learning more about Weibull analysis, we highly recommend this tutorial from our partners at Prelical – it’s a great introduction to utilizing this mathematical technique for estimating failure rates.
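For readers who already have Weibull parameters, the sketch below shows how the shape factor relates to the failure pattern and to a mean life figure.  It uses the standard two-parameter Weibull result that mean life equals eta × Γ(1 + 1/beta), where eta is the scale parameter.  The 0.05 tolerance band around beta = 1 is an arbitrary assumption for illustration, not an industry rule, and the parameter values are made up.

```python
import math

def weibull_mean_life(eta, beta):
    """Mean life of a two-parameter Weibull distribution (scale eta, shape beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

def failure_pattern(beta, tolerance=0.05):
    """Map the shape factor to a failure pattern; the tolerance is an assumed band."""
    if beta < 1.0 - tolerance:
        return "infant mortality"
    if beta > 1.0 + tolerance:
        return "wear-out"
    return "random"

# Illustrative parameters only.
for beta in (0.7, 1.0, 2.5):
    print(beta, failure_pattern(beta), round(weibull_mean_life(2.0, beta), 2))
```

When beta equals 1 the Weibull distribution reduces to the exponential distribution and the mean life equals eta, which is the constant failure rate case that a simple MTBF calculation implicitly assumes.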

MTBF is calculated – Now What?

With MTBF calculated, there are many ways to use this information to make better reliability and maintenance decisions, in fact too many to offer a complete list in this blog.  Here are a few common use cases reliability professionals are driving once they have calculated MTBF on their equipment:

Identify poor performing equipment.  Evaluate the performance of your equipment by comparing it against similar equipment in a similar operating context.  In the previous example, MTBF was estimated at 1.25 years for the cooling water pump.  With this information it is now possible to benchmark its specific performance against industry references such as the Itus Asset Twin Library, the OREDA database (Oil and Gas Industry Specific) or OEM specific performance guarantees.  Once compared, it is easy to identify bad actors in your equipment population, evaluate reasons for the poor performance and establish a plan to improve the asset strategy.
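A benchmark comparison of this kind could be sketched as follows.  The asset tags, the MTBF values other than the 1.25-year pump, and the 2.5-year benchmark are invented for illustration and are not taken from any library or database.

```python
def bad_actors(asset_mtbf, benchmark_mtbf, threshold=1.0):
    """Return (asset, ratio) pairs whose MTBF/benchmark ratio is below threshold, worst first."""
    ratios = {asset: value / benchmark_mtbf for asset, value in asset_mtbf.items()}
    flagged = [(asset, ratio) for asset, ratio in ratios.items() if ratio < threshold]
    return sorted(flagged, key=lambda item: item[1])

# Hypothetical pump population, MTBF in years.
fleet = {"P-101": 1.25, "P-102": 3.4, "P-103": 0.8}
for asset, ratio in bad_actors(fleet, benchmark_mtbf=2.5):
    print(f"{asset}: {ratio:.0%} of benchmark")
```

Ranking by ratio rather than by raw MTBF keeps the comparison meaningful when different equipment classes have different benchmarks.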

Measuring the effectiveness of your reliability initiatives.  While MTBF is a lagging indicator, it is useful in assessing the effectiveness of a reliability and maintenance improvement program.  The pump example we reviewed demonstrates how we can measure the specific performance of one asset over time.  If you have used a strategy development process such as a Failure Modes and Effects Analysis (FMEA) to develop an optimized strategy or preventative maintenance plan for an equipment class (e.g., centrifugal pumps), the MTBF calculation can also be used to measure the effectiveness of that strategy.  As you implement your maintenance and monitoring strategy, the equipment class should become more reliable over time and MTBF should increase.
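One simple way to track an equipment class, sketched below, is a pooled MTBF: total running time across all assets in the class divided by total failures, computed for the period before and after a strategy change.  Pooling is our assumption for this illustration, and the figures are made up.

```python
def class_mtbf(records):
    """Pooled MTBF: total running years across the class divided by total failures."""
    total_time = sum(r["running_years"] for r in records)
    total_failures = sum(r["failures"] for r in records)
    if total_failures == 0:
        return None  # no failures observed, so MTBF cannot be estimated this way
    return total_time / total_failures

# Hypothetical data for two pumps, before and after a new maintenance strategy.
before = [{"running_years": 3.0, "failures": 4}, {"running_years": 3.0, "failures": 2}]
after = [{"running_years": 2.0, "failures": 1}, {"running_years": 2.0, "failures": 1}]

print(class_mtbf(before), class_mtbf(after))  # 1.0 2.0
```

Short observation windows with very few failures give noisy results, so a comparison like this is best treated as a trend indicator rather than proof.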

Analyzing future operational risk.  For equipment failures which directly impact production, MTBF can be a key input to evaluating future operational risk.  Take our simple cooling water pump example, which has an MTBF of 1.25 years.  If this pump needs to run for the next 2 years to meet projected demand, it is highly likely it will experience at least one failure during that run cycle.  With a calculated MTBF value, it is possible to utilize solutions such as Asset Risk Analyzer to determine future operational risk, communicate potential downtime implications to management and justify investment in a reliability improvement initiative.
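To put a number on "highly likely", one common simplification is to assume a constant failure rate (an exponential distribution), in which case the probability of at least one failure over a run of length t is 1 − exp(−t / MTBF).  The sketch below applies that assumption to the 1.25-year MTBF from the worked calculation; it is not a description of how Asset Risk Analyzer performs its analysis.

```python
import math

def probability_of_failure(mission_time, mtbf):
    """P(at least one failure) over mission_time, assuming a constant failure rate."""
    if mtbf <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-mission_time / mtbf)

# Cooling water pump: MTBF of 1.25 years, required to run for 2 years.
p = probability_of_failure(mission_time=2.0, mtbf=1.25)
print(f"{p:.0%}")  # 80%
```

If a Weibull analysis shows a wear-out or infant mortality pattern, the constant failure rate assumption no longer holds and the estimate should be replaced with one based on the fitted distribution.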

MTBF provides a practical approach to measure the current and historical failure rates for industrial equipment.  With visibility to MTBF information, maintenance and reliability professionals can make wise decisions on where to focus efforts and investments to meet business objectives.  Historically, the MTBF measurement has been considered an advanced reliability technique but is much more common today as organizations have implemented foundational asset management systems and processes.

If you are not currently utilizing MTBF to enhance your decision making, consider using our FREE Spreadsheet to get started.  Register below to get instant access to our template.

Download MTBF Calculator

Our calculator will walk you through the entire process of getting data from your EAM system, how to further classify records as breakdowns vs preventative maintenance, as well as specifics on how the calculations work and what to do with the results.  Register below and get instant access to start actively managing your equipment failure rates!

 

 


]]>
https://www.itus-digital.com/calculating-mean-time-between-failure/feed/ 0