Pumping station failures cause combined sewer overflows and basement floods. The CMMS practices that consistently reduce unplanned outages are vibration thresholds, run-hour PMs, and a tight alarm escalation chain. Done well, they cut unplanned outages by 40 to 70%.
Across a typical municipal sewer network, pumping station failure modes cluster into five categories: bearing failure (gradual, vibration-detectable, usually predictable); impeller wear and rag-up (gradual on the wear side, sudden on the rag side); mechanical seal failure (often catastrophic and hard to predict without trending); motor burnout (electrical, usually preceded by elevated current draw); and control failure (PLC, level transmitter, VFD, or the wiring between them).
Of those, the first three are mechanical, and apart from sudden rag-up they develop gradually enough to show up in trend data. They are exactly the failure modes that condition-based maintenance was invented for. Motor and control failures need a different toolkit (insulation testing, infrared scans, PLC backup discipline), but they also yield to a CMMS that schedules those checks.
The economic case is straightforward. A medium-size lift station handling 4,500 m³/day costs roughly USD 8,000 to USD 25,000 per unplanned outage once you account for emergency callouts, bypass pump hire, regulatory fines for overflow events, and reputational cost. A network of 30 such stations averaging two unplanned outages each per year has 60 outages, which at those unit costs comes to roughly USD 0.5 million to USD 1.5 million annually, much of which a tighter PM regime would prevent.
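The arithmetic behind that range is easy to reproduce for your own network. The sketch below uses only the figures quoted above; the function and variable names are illustrative.

```python
# Figures from the text; swap in your own network's numbers.
STATIONS = 30
OUTAGES_PER_STATION_PER_YEAR = 2
COST_PER_OUTAGE_USD = (8_000, 25_000)  # low and high estimate per outage


def annual_outage_cost(stations: int, outages_per_station: int, cost_per_outage: int) -> int:
    """Yearly cost of unplanned outages across the whole network, in USD."""
    return stations * outages_per_station * cost_per_outage


low, high = (
    annual_outage_cost(STATIONS, OUTAGES_PER_STATION_PER_YEAR, cost)
    for cost in COST_PER_OUTAGE_USD
)
print(f"Annual exposure: USD {low:,} to USD {high:,}")
```

The low end works out to USD 480,000, which rounds to the 0.5 million quoted above.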
ISO 10816-3 gives the working reference values. For a typical end-suction sewage pump in the 15 to 75 kW range mounted on a rigid foundation, vibration velocity should sit below 2.8 mm/s RMS in good condition. The "alert" band runs from 2.8 to 4.5 mm/s. Readings above 4.5 mm/s are at "alarm" level and indicate that the machine is heading for damage unless someone investigates.
The CMMS practice that works is to wire those thresholds into the trigger logic. A continuous vibration sensor (or a monthly handheld reading captured on the mobile app) feeds a value into the asset record. When the reading crosses the alert level, the CMMS auto-creates a "vibration investigation" work order and assigns it to the rotating equipment lead. When it crosses the alarm level, the work order goes to the on-call planner with email and SMS escalation.
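As a rough illustration of that trigger logic, here is a minimal sketch. The WorkOrder class and vibration_work_order function are hypothetical stand-ins for whatever rule engine or API your CMMS exposes; only the two thresholds and the routing come from the text.

```python
from dataclasses import dataclass
from typing import Optional

ALERT_MM_S = 2.8  # alert level from the text (mm/s RMS)
ALARM_MM_S = 4.5  # alarm level from the text (mm/s RMS)


@dataclass
class WorkOrder:
    """Hypothetical work order record, not a real CMMS schema."""
    asset_id: str
    title: str
    assignee: str
    notify: tuple  # notification channels used for escalation


def vibration_work_order(asset_id: str, velocity_mm_s: float) -> Optional[WorkOrder]:
    """Map one RMS velocity reading to the work order described above."""
    if velocity_mm_s > ALARM_MM_S:
        # Alarm level: route to the on-call planner with email and SMS.
        return WorkOrder(asset_id, "Vibration investigation", "on-call planner", ("email", "sms"))
    if velocity_mm_s >= ALERT_MM_S:
        # Alert level: route to the rotating equipment lead, no escalation.
        return WorkOrder(asset_id, "Vibration investigation", "rotating equipment lead", ())
    return None  # below the alert level, nothing to raise
```

A production rule would probably also suppress duplicate work orders while one is already open for the asset; that is left out here to keep the sketch short.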
What you do not want is a wall full of vibration meters and a quarterly review meeting. The trend goes through 2.8 mm/s in March, the meeting is in June, the bearing seizes in May. Trigger logic in the CMMS shortens the loop to days.
Calendar-based PMs ("grease the bearings every six months") over-maintain lightly used assets and under-maintain heavily used ones. Run-hour PMs solve both problems. The pump's hour meter (or VFD-reported run time) feeds the CMMS, and PMs trigger on accumulated hours.
Working numbers for a typical sewage pumping station: bearing greasing every 4,000 run hours, mechanical seal inspection every 6,000 run hours, oil change for gearboxes or oil-bath bearings every 8,000 hours, full overhaul candidate at 30,000 hours. These are starting points; the manufacturer's manual and your own service history dictate the final cadence.
The key configuration detail is that the CMMS should pull the run hours automatically from the SCADA historian or the VFD, not rely on a technician reading the hour meter. Manual run-hour entry is a maintenance debt that goes unpaid for months and corrupts the PM schedule. If your SCADA cannot expose run hours, the next-best option is to make run-hour reading a mandatory field on every monthly inspection round PM.
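A run-hour trigger can be sketched in a few lines. The intervals below are the starting points quoted above; the task names, the due_pms function, and the shape of the last_done_hours record are assumptions for illustration, not any particular CMMS's data model.

```python
# Intervals in run hours, taken from the working numbers in the text.
PM_INTERVALS_H = {
    "bearing greasing": 4_000,
    "mechanical seal inspection": 6_000,
    "oil change": 8_000,
    "overhaul review": 30_000,
}


def due_pms(current_hours: float, last_done_hours: dict) -> list:
    """Return the PM tasks whose interval has elapsed since they were last done.

    current_hours is the pump's accumulated run hours (ideally pulled from
    the SCADA historian or VFD). last_done_hours maps each task to the
    run-hour reading at its last completion; a missing task counts as
    never done.
    """
    due = []
    for task, interval in PM_INTERVALS_H.items():
        since_last = current_hours - last_done_hours.get(task, 0)
        if since_last >= interval:
            due.append(task)
    return due
```

For example, a pump at 12,500 hours that was last greased at 8,000 hours is 4,500 hours past its last greasing, so the greasing PM comes due.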
The escalation chain that works for unmanned pumping stations has four steps and lives partly in SCADA, partly in the CMMS, and partly in the on-call rota.
Step 1: SCADA detects the alarm condition (high wet well level, motor trip, loss of communication).
Step 2: SCADA forwards the alarm to the CMMS, which auto-creates an emergency work order and pages the on-call technician with the work order number and the asset's location.
Step 3: if the on-call technician does not acknowledge within 15 minutes, the page escalates to the on-call supervisor.
Step 4: if the supervisor does not acknowledge within 30 minutes, the page escalates to the duty manager and the regulator notification clock starts.
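The acknowledgement timers in steps 3 and 4 can be sketched as a simple lookup. One assumption to flag: the text does not say whether the supervisor's 30 minutes runs from the original alarm or from the moment the supervisor is paged; this sketch assumes the latter, so the duty manager is paged 45 minutes after the alarm. The function name is illustrative.

```python
# Acknowledgement windows from the text, in minutes.
TECH_ACK_MIN = 15
# Assumption: the supervisor's window starts when the supervisor is paged.
SUPERVISOR_ACK_MIN = 30


def who_to_page(minutes_since_alarm: float, acknowledged: bool) -> str:
    """Return the role that should currently hold an unacknowledged page."""
    if acknowledged:
        return "nobody"
    if minutes_since_alarm < TECH_ACK_MIN:
        return "on-call technician"
    if minutes_since_alarm < TECH_ACK_MIN + SUPERVISOR_ACK_MIN:
        return "on-call supervisor"
    # Both windows have expired; per step 4 the regulator clock also starts.
    return "duty manager"
```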
Two things break this chain in practice. First, alarm flooding: a single power blip generates 40 alarms across 12 stations and the on-call technician's phone is unusable. The fix is alarm grouping at the SCADA layer plus a "do not page on first alarm in a network event" rule in the CMMS. Second, stale on-call data: the rota was updated in HR but not in the CMMS. The fix is a single source of truth, refreshed weekly via an integration or a manual update PM.
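One way to picture alarm grouping is a time-window rule: alarms that arrive close together are treated as one network event and produce one summary page instead of one page each. The sketch below is an interpretation, not a description of any SCADA product's grouping feature; the 60-second window and the tuple layout are assumptions.

```python
def group_alarms(alarms, window_s=60):
    """Group (timestamp_s, station, tag) alarms into network events.

    An alarm joins the current event when it arrives within window_s
    seconds of that event's first alarm; otherwise it opens a new event.
    """
    events = []
    for alarm in sorted(alarms):  # tuples sort by timestamp first
        if events and alarm[0] - events[-1][0][0] <= window_s:
            events[-1].append(alarm)
        else:
            events.append([alarm])
    return events


def pages_for(events):
    """Build one summary page per event rather than one page per alarm."""
    return [
        f"alarms={len(event)} stations={len({a[1] for a in event})}"
        for event in events
    ]
```

With this rule, a power blip that trips several stations inside a minute reaches the technician as a single message listing how many alarms and stations are involved.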
Wet wells and check valves are the unloved assets in a pumping station, but they cause a disproportionate share of the call-outs. A wet well that has not been desludged in 18 months can carry a 100 mm cake of FOG (fats, oils, grease) at the level transmitter line, so the transmitter intermittently misreports the level. A check valve that has not been inspected in three years may have a stuck flap, which means one pump backflows through the other when the duty pump trips.
Working PM cadence: wet well desludge and clean every 12 months for normal-flow stations, every 6 months for stations downstream of food processors or restaurant clusters. Check valve inspection annually, with disassembly every 3 years for any valve over 6 inches. Float and ultrasonic level transmitter cleaning every 3 months. Pump rail and guide-bar inspection every 12 months.
Configure these as separate PMs, not as items on a generic "annual lift station inspection" checklist. Separate PMs produce separate work orders, separate close-out times, and separate evidence trails. A single bundled PM hides the truth that nobody actually got down into the wet well in October.
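To make the "separate PMs" point concrete, here is a minimal sketch that generates one PM definition per task from the cadence given above. The station_pms function, its parameters, and the dictionary layout are hypothetical; a real CMMS would hold these as PM templates.

```python
def station_pms(station_id, high_fog=False, check_valve_inches=4):
    """Return one PM definition per task, with intervals in months.

    high_fog marks stations downstream of food processors or restaurant
    clusters. check_valve_inches is the valve size; the default of 4 is
    only a placeholder.
    """
    pms = [
        {"task": "wet well desludge and clean", "every_months": 6 if high_fog else 12},
        {"task": "check valve inspection", "every_months": 12},
        {"task": "level transmitter cleaning", "every_months": 3},
        {"task": "pump rail and guide-bar inspection", "every_months": 12},
    ]
    if check_valve_inches > 6:
        # Valves over 6 inches also get a disassembly every 3 years.
        pms.append({"task": "check valve disassembly", "every_months": 36})
    # Each entry becomes its own PM, and so its own work order and close-out.
    return [{"station": station_id, **pm} for pm in pms]
```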
The metrics that matter for a pumping station maintenance programme are unglamorous and few. Track them monthly per station and as a network rollup.
Unplanned outages per station per year. Industry good practice for a well-run network sits around 1.5 per station per year. Networks above 3 are running reactively. The number drops within 6 to 12 months of disciplined PM scheduling.
Mean time to repair (MTTR). The clock starts when the alarm fires and stops when the station is back in service. Typical good value: 2 to 4 hours for in-network repairs, 8 to 24 hours when a spare pump has to be sourced. MTTR above 12 hours network-wide signals weak parts inventory.
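Computing MTTR from closed emergency work orders is a one-liner once the two timestamps are captured reliably. A minimal sketch, assuming each outage is recorded as an (alarm time, back-in-service time) pair; the function name is illustrative.

```python
from datetime import datetime


def mttr_hours(outages):
    """Mean hours from alarm to return-to-service.

    outages is a list of (alarm_time, restored_time) datetime pairs.
    """
    if not outages:
        raise ValueError("no outages recorded in this period")
    durations = [
        (restored - alarm).total_seconds() / 3600
        for alarm, restored in outages
    ]
    return sum(durations) / len(durations)
```

The same function works per station or for the network rollup, depending on which work orders you pass in.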
PM compliance percentage. Of the PMs scheduled this month, what percentage closed within 7 days of their scheduled date? Below 80% means the planning team is overloaded or the schedule is unrealistic. Above 95% is achievable with a properly resourced crew and is the threshold at which the unplanned-outage number begins to fall noticeably.
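The compliance figure can be computed as below. This is a sketch under one assumption worth flagging: a PM closed before its scheduled date counts as compliant, which the definition above does not spell out. The function name and record layout are illustrative.

```python
from datetime import date, timedelta


def pm_compliance(pms, grace_days=7):
    """Percentage of PMs closed no later than grace_days after the scheduled date.

    pms is a list of (scheduled_date, closed_date) pairs; closed_date is
    None for a PM that is still open. Early closure counts as compliant
    here, which is an assumption.
    """
    if not pms:
        raise ValueError("no PMs scheduled in this period")
    deadline = timedelta(days=grace_days)
    on_time = sum(
        1
        for scheduled, closed in pms
        if closed is not None and closed - scheduled <= deadline
    )
    return 100 * on_time / len(pms)
```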
Critical spares on hand. Bearings, mechanical seals, and impellers for the top 10 pumps in the network should be on the shelf. If the CMMS shows zero on hand and a 6-week lead time, MTTR will blow out the next time one of those assets fails.
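A simple report can flag this exposure before the failure happens. The sketch below assumes the CMMS can export on-hand quantity and supplier lead time per part; the spares_at_risk function and the 2-week tolerance are hypothetical, not figures from the text.

```python
def spares_at_risk(spares, max_lead_weeks=2):
    """Return part names with nothing on the shelf and a long lead time.

    spares maps part name -> (quantity_on_hand, supplier_lead_time_weeks).
    max_lead_weeks is a hypothetical tolerance, not a figure from the text.
    """
    return [
        part
        for part, (on_hand, lead_weeks) in spares.items()
        if on_hand == 0 and lead_weeks > max_lead_weeks
    ]
```

Run it against the bearings, mechanical seals, and impellers for the top 10 pumps and treat any non-empty result as a purchasing action.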