It is best to discard the terminology of Unsafe Act and Unsafe Condition. If you use Error instead of unsafe act, for instance, then you can lump ALL incidents into one database, including those that appear to be only quality, reliability, process safety, etc. (those that may appear to have no relationship to occupational health and safety). An error in today’s incident may appear to be a near miss or hit to process reliability or quality, but tomorrow the same underlying causes (and root causes) will lead to an error that is a safety near miss or actual harm. By dropping the term "unsafe", you can lump all problems into one database and then drive your error rates down faster and therefore achieve your goals faster. Leaving "safety" in these terms will inadvertently exclude a lot of errors and failures that can teach you how to lower the probability of those with a safety outcome; many folks will report a near miss or hit to a reliability or quality problem but will not report a near miss to a safety incident. Using the terms Error, Failure (component related), and Natural Phenomena has worked great at many organizations where this change was made.
Find out more about PII and our investigator and RCA training at: www.piii.com
Our staff has led about 10,000 HAZOP/PHA over the past 20+ years. We found in the early 1990s that using a risk matrix "live" in a HAZOP/PHA will actually hurt the brainstorming (producing fewer scenarios) because the team wastes time on scoring schemes while using the risk matrix. Further, about 50% of the scores were false; the team adjusted the scores to match their internal expert opinions.
We have paid attention since then and the same is true today. In the mid-1990s we stopped recommending use of risk scoring (and risk matrix) in a PHA/HAZOP. Instead, we taught teams (and led them this way ourselves) to make a consensus judgment of the residual risk using their expert opinion. We have found better results overall with this approach, including the team finding more scenarios (their main job) and assessing the value of existing safeguards (we use IPL rules here, but no scores for the IPLs) and then judging if the residual risk is low enough (if not, we make recommendations to get to tolerable risk).
If the team is confused on the risk (which occurs about 5% of the time) then we recommend doing a LOPA. I was a co-originator of LOPA and was the primary author of the first book (LOPA: CCPS, 2001) and the upcoming book on IPLs and IEs (CCPS, 2012). My main interest in developing LOPA was to have a method to do an order-of-magnitude risk assessment correctly. I do not recommend doing LOPA or using any scoring during a PHA/HAZOP... you do not want to do anything to limit brainstorming by the team.
With that said, about 5% of our clients require us to use a risk matrix or score scenarios in a quasi-LOPA fashion during the PHA/HAZOP meetings; the main reason is that someone else convinced a manager many years ago that this was necessary and now it is hard to change the policy. This is sad. Folks should listen to data from experts for such decisions and not just be overly impressed with colors on matrices and numbers. Remember, most of the IPL and IE values (1) are consensus values (voted on by folks like those on HAZOP teams), (2) have an order of magnitude deviation on either side of the average, and (3) may not represent site data at all. Using "word" definitions of consequences, frequency, and probability is not any better than voting on the overall risk, so why bother with that extra, falsely better, way of voting on risk?
These topics are covered in a couple of the papers on our website: www.piii.com
Here is an excellent causation model that maps a causal factor (cause) down to the management system level (root cause level):
http://www.process-improvement-institute.com/_downloads/Root_Cause_Chart_PII_R6_Complimentary.pdf
(This chart includes references to the original work in the legend.) It differs substantially from Dan Petersen’s approach and many others (including ILCI / SCAT) in that it is all focused on the company responsibility for implementing and enforcing systems to control human error rates. More than a million accidents and near misses have been investigated with this model or with ones very similar to it (based on it). This originated in the US DOE work in 1985-86, which in turn was a simplification of MORT by US DOE decades earlier. It became popular in private industry around 1990. (Note that at least one proprietary method was developed from the same basis.) This also matches nicely the US NRC description of human error originating from human factors; see SPAR-H from US NRC for more details on that aspect. SPAR-H also has many scaling factors for the importance of good and poor human factors on error probability; these factors, originating from Swain and others in the 1970s and 1980s, have been amended to account for newer data derived from control room data and cockpit data. So, both methods (the root cause chart and SPAR-H and derivatives) are in turn derived from tons of hands-on data (data collected to quantify the likelihood of human error and since supported by site/company data in many independent studies). These approaches are used extensively in the process industry for quality, reliability, and process safety incidents. Those in the occupational safety field seem to want a different model, possibly because many of the incidents related to occupational safety have only one layer of protection, which is up to the worker who may also be the initiator of an accident sequence. There are lower limits to the human error rate; once that reality sinks in you must look to re-engineer the task (different fixtures, different tools, and different processes) to lower the risk further.
The best overall leading indicator (KPI) we have found is the number of near hits (or near misses) divided by the number of loss events (accidents). We even now know the target values for these. Less than 5 is poor and indicates you are learning more from accidents than from free near misses. Greater than 20 is good. Greater than 50 is excellent. A related KPI is the number of near misses that are investigated to root causes divided by the number of loss events. This ratio should be at least 15.
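As a rough illustration of the two ratios above, here is a minimal Python sketch. The function names and the sample counts are hypothetical, and the label used for a ratio between 5 and 20 is an assumption, since the targets above do not name that band.

```python
def near_miss_ratio(near_misses: int, loss_events: int) -> float:
    """Near misses (near hits) reported per loss event."""
    if loss_events <= 0:
        raise ValueError("need at least one loss event to form the ratio")
    return near_misses / loss_events


def rate_ratio(ratio: float) -> str:
    """Label a ratio using the target values quoted in the text."""
    if ratio < 5:
        return "poor"
    if ratio > 50:
        return "excellent"
    if ratio > 20:
        return "good"
    # The text gives no label for 5 to 20; this wording is an assumption.
    return "between poor and good"


def investigated_ratio_ok(investigated: int, loss_events: int) -> bool:
    """True when near misses investigated to root cause per loss event >= 15."""
    return near_miss_ratio(investigated, loss_events) >= 15


# Hypothetical site-year: 120 near misses reported, 70 investigated, 4 losses.
ratio = near_miss_ratio(120, 4)
print(ratio, rate_ratio(ratio), investigated_ratio_ok(70, 4))
```

With these made-up counts the site would rate "good" on reporting and would also meet the investigation target.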
There are many other leading KPIs we have tried and many have proven useful. These fall in the category of "making sure the EHSQ activities are being accomplished each day/week". For instance: how many BBSM observations were performed per worker (should be 4/month/person or higher), and how many JSAs were performed per month (there is no target, and it must be compared to the work activities; how many per new type of task is better to measure and should be close to 100%).
In process safety, there are many as well, such as the average delay in completing the first review level for a requested change; or the number of MOCs that did not get a risk review (which should be zero); or the number of recommendations that are pending resolution after 30 days, 60 days, 360 days, etc. For process safety, we normally have 100-150 different KPIs. For occupational health and safety, there are about 50 we normally use. For Quality, the leading indicators mostly overlap with the process safety leading indicators, since most of the effort in each (PSM and Quality) is to control human error of normal job activities.
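The "pending after 30, 60, 360 days" indicator mentioned above could be tallied along these lines. This is only a sketch: the function name and the sample dates are hypothetical, and it assumes each open recommendation carries the date it was raised.

```python
from datetime import date


def pending_counts(opened_dates, as_of, thresholds=(30, 60, 360)):
    """Count open recommendations older than each threshold (in days).

    Counts are cumulative: an item open for 400 days is counted under
    30, 60, and 360.
    """
    ages = [(as_of - opened).days for opened in opened_dates]
    return {limit: sum(1 for age in ages if age > limit) for limit in thresholds}


# Hypothetical open recommendations, identified only by the date raised.
open_recommendations = [
    date(2010, 1, 1),
    date(2011, 1, 1),
    date(2011, 5, 15),
    date(2011, 6, 20),
]
print(pending_counts(open_recommendations, as_of=date(2011, 7, 1)))
```

Tracking the counts at each review date shows whether the backlog of unresolved recommendations is shrinking or growing.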
Process safety management is 20+ years older than the extremely weak PSM regulation from US OSHA, so to 99% of the companies around the world that implement PSM, the OSHA PSM elements and their interpretation are almost irrelevant. Even in the USA, most of the companies that have implemented PSM for more than 10 years extend process safety to processes not covered by the regulation, because using effective process safety practices is a cost-effective way to operate a business. PSM should not be viewed as a compliance issue; the implementer will miss the benefits with a compliance-only approach.
US OSHA is not the incentive to implement PSM. For instance, note that US OSHA has issued a total of $95 million in fines related to PSM since its regulation was finally issued in 1992 (that is about $5 million in PSM fines per year, on average; this is spread across more than 100,000 sites). By contrast, process safety weaknesses cost the industry in just the USA more than $20 billion per year (likely much higher in indirect costs) in lost revenue (downtime), lost capital (replacement/repair) of process systems, litigation and settlements in civil courts (for liability claims), etc. So, the real benefits of reducing these losses are 1000 to 5000 times greater than avoiding OSHA citations. This is the business case for process safety, and most of the benefits are actually in process reliability gains (where process safety programs started) and enhanced quality.
To find out more, check out our papers (all free) at www.piii.com or attend our PSM course (Course 2).
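The comparison above can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the text; both the loss figure and the site count are stated as "more than", so the results are rough.

```python
# Figures as quoted in the text (USD).
annual_psm_fines = 5_000_000      # about $5 million per year in PSM fines
annual_losses = 20_000_000_000    # more than $20 billion per year in losses
covered_sites = 100_000           # "more than 100,000 sites"

loss_to_fine_ratio = annual_losses / annual_psm_fines
fines_per_site = annual_psm_fines / covered_sites

print(loss_to_fine_ratio)  # 4000.0
print(fines_per_site)      # 50.0
```

The ratio of about 4000 sits inside the 1000 to 5000 range given above, and the average fine works out to roughly $50 per site per year.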
We have licenses to all of the proprietary software packages and must use them based on client demands. And we also use Excel and Word. We do hundreds of PHAs each year; our staff has done more than 5000 unit-size PHAs of projects and of existing units and revalidations to date (over the past 20 years)! LEADER is clearly the best if you do HAZOP of continuous processes because it is the fastest at setup of the nodes and the Only software to support linking of the consequence of one deviation to the cause of another deviation (which can Greatly improve both thoroughness and speed, which is a rare combo). It also has the fastest in-meeting hotkeys and really useful technical functions. We find it saves 25% of meeting time, prep time, and after-meeting time over all competitors (including PHA Pro); that is significant when you add up the value of everyone's time. If you are not doing HAZOP of continuous mode, then Word or Excel is best or at least close to equal with LEADER. PHA Pro is third best (note they have the best sales staff, though customer support is not good since they are sales oriented and not service oriented). PHA Pro can be second best on occasion only because it automates use of a risk matrix, if you must risk rank (that makes it a little better than Word, but still far behind LEADER). HAZOP Manager by Lihou is about the same as PHA Pro. PHA Works is Far Behind the other options. We do not sell or promote software; all are competitors to us on training and delivery. Note that one of the major software vendors has 8 times as many salesmen as folks who lead PHAs for them; that should raise a red flag. You do NOT need software in order to lead and document a Great PHA. See our paper on www.piii.com on Optimizing PHAs for a comparison of methods, rules, and software.
Bottom line: If you only do a few PHAs, then just use Excel or Word. If you do a lot of What-if, then use Word or Excel. If you do a Lot of HAZOP of continuous processes, then use LEADER. If you need to use risk matrices (which is not that valuable, but most folks seem to want to use these in qualitative meetings; see the many papers discussing the pros and cons), then either automate the lookup functions of Excel, or use LEADER or PHA Pro.
We do LOPA by Excel or Word (though LEADER makes it easy enough as well)... but LOPA is easy for me since I'm one of the originators of LOPA (along with Art Dowell and Martin Gollin), the co-author of the first LOPA text by CCPS/AIChE (2001), and the primary author of the second textbook on LOPA (IPLs and IEs) by CCPS/AIChE, due out in early 2012... the rules of LOPA are strict and take some discipline and experience to use properly, but the documentation of LOPA is Very simple and the math is Very simple.
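To show how simple the LOPA math is, here is a minimal order-of-magnitude sketch. It assumes the usual LOPA relationship (mitigated scenario frequency equals the initiating event frequency times the probability of failure on demand of each IPL) and leaves out conditional modifiers. Working in powers of ten keeps it to integer arithmetic. The function names and every number in the example are hypothetical, not site data or values from the CCPS texts.

```python
from typing import Sequence


def mitigated_exponent(ie_exponent: int, ipl_credits: Sequence[int]) -> int:
    """log10 of the mitigated scenario frequency (per year).

    ie_exponent: initiating event frequency as a power of ten (-1 means 0.1/yr).
    ipl_credits: orders of magnitude of risk reduction per IPL
                 (1 means PFD = 0.1, 2 means PFD = 0.01).
    """
    return ie_exponent - sum(ipl_credits)


def risk_gap(mitigated: int, tolerable: int) -> int:
    """Orders of magnitude of extra risk reduction still needed (0 if none)."""
    return max(0, mitigated - tolerable)


# Hypothetical scenario: IE at 0.1/yr, two IPLs with PFDs of 0.1 and 0.01,
# and an assumed tolerable frequency of 1e-5/yr for this consequence.
mitigated = mitigated_exponent(-1, [1, 2])
print(mitigated)                 # -4, i.e. 1e-4 per year
print(risk_gap(mitigated, -5))   # 1 more order of magnitude needed
```

The discipline is in deciding what qualifies as an IE and an IPL; once that is settled, the calculation itself is just adding exponents.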
To find out more about training courses and consulting services from PII, visit: www.piii.com
Process safety is mostly about controlling direct human error (like operator errors), indirect (latent) effects of human error (the main reason parts fail), and compensating for human error. There is also a lot of basic chemical engineering to learn to understand how PSM is controlled and what leads to process hazards. Many folks say that "hands-on" experience is key... there is NO replacement for that... make sure the hands-on PSM activities are always addressing the question: "how can this __xx___ best optimize human factors and control or compensate for human error," where xx is any procedure or tool or equipment interface or training module.
If you are new to process safety, there are a lot of reading materials that can help you learn more about process safety. For instance, the papers on our website are free and are good starting resources; some are for beginners and some are for experienced process safety staff. You should also start a library of the 40 or so unique textbooks from the Center for Chemical Process Safety (CCPS), a division of the American Institute of Chemical Engineers (AIChE), which controls the international definitions and standards for Process Safety. All of the books are developed by committees of experts (and some novices) and many of the textbooks are excellent (some are not as strong). A good starting book is "Risk-Based Process Safety", 2007. This is the current process safety definition by CCPS/AIChE. It is well written. After that, get the current revisions of the textbooks on the core process safety elements, such as "Guidelines for Hazard Evaluation Procedures", 3rd edition, 2008 (the new text in Chapter 9 is the key improvement brought out by the 3rd edition). The "Guidelines for Mechanical Integrity" is also good and well written. The upcoming book (early 2011) on Independent Protection Layers and Initiating Events will also teach you a lot (even if you do not need LOPA right now).
From there, you need to decide where you want to go next. Go to www.aiche.org/ccps to see a list of the books; but the current editions are now sold through the publisher used by AIChE, which is Wiley; Amazon carries many of the titles.
PSM takes about 3-4 years to learn the basics and 10+ years to get good at it. This does not account for closing a knowledge gap, if one exists, on the engineering principles involved (how well do you understand PSV sizing, SIS specification and calculations, LOPA and IPL principles, metallurgy and material-of-construction issues, Joule-Thomson effects, chemical reaction kinetics, etc.).
Related to training, there are several worthy providers of PSM-related training. You can do a Google search for such courses; most offer public courses. Our courses are highly praised and attended.
Visit PII at www.piii.com to find out more about effective implementation of process safety and human factors optimization.
Saving time is not the same as efficiency; in fact that is why we wrote the paper "Efficient Hazard Evaluation" which you can download from our website's home page. Since our staff (combined with the previous staff I managed at another company) have led more than 10,000 HAZOPs, and run more documented experiments during these HAZOPs than others, we have had a chance to see what works and does not work to increase effectiveness and efficiency. In the case of PHA/HAZOP, effectiveness and efficiency are tied together for many of the best-practice rules that we follow.
For effectiveness during PHA/HAZOP of any mode of operation, one KEY focus is to make sure Brainstorming is Maximized, because if brainstorming is diminished, then accident scenarios are missed and therefore IPLs are not there when you need them. Take just one small item in the paper on this topic of maximizing brainstorming. Implementing that item will increase brainstorming (and usually saves time), since keeping the brain from burnout or boredom increases brainstorming ability. For instance, take the rule: "Do Not use an LCD projector for team meeting notes during the meetings" (only do so on confusing points, as an exception); this can save 20% or more of team meeting time and also increases the brainstorming effectiveness because the team is not reading and editing what is on the screen. Also: "Use Linking between Consequences and Causes to build scenarios more thoroughly" ... this also happens to be faster, once you practice it a few times.
For effectiveness overall, MAKE SURE that the Non-Routine modes of operation are PHA/HAZOPed. This requires a 2-guideword HAZOP or What-if (and in some cases a 7-guideword HAZOP) of the step-by-step procedures for startup, shutdown, lighting furnaces, and online maintenance. This will enhance HAZOP results/outcomes tremendously since 75% of the major accidents occur during these modes of operation. The same paper discusses using the savings in wasted time (dulled brainstorming) to analyze these non-routine modes of operation. Also note that the new section 9.1 of the CCPS/AIChE textbook, Guidelines for Hazard Evaluation Procedures, 3rd edition, 2008, was added for the purpose of giving this part of the hazard analysis (i.e., non-routine modes of operation) the Focus; that section of the newest edition also explains how to effectively and efficiently perform this analysis of non-routine modes of operation.
If you have questions, feel free to send me an e-mail at:
wbridges@piii.com