Operations & Maintenance
12.1 O&M Overview
Effective operations and maintenance (O&M) of an underground security surveillance system combines remote monitoring, scheduled preventive maintenance, and a well-defined incident response procedure. The harsh underground environment (high humidity, temperature cycling, vibration, and limited access) degrades equipment faster than in surface installations and makes repairs slower and more costly. A proactive O&M strategy that catches problems early through remote monitoring and scheduled inspections is therefore more cost-effective than a reactive strategy that waits for failures to occur.
The photograph below shows a typical O&M operations center where a monitoring engineer tracks system health in real time using a multi-screen dashboard, while a field technician simultaneously performs a cabinet inspection. This two-layer approach — remote monitoring plus scheduled field visits — is the recommended O&M model for underground security surveillance systems.
12.2 Key Performance Indicators
The following KPIs define the operational targets for an underground security surveillance system. These targets should be included in the service level agreement (SLA) between the system operator and the maintenance contractor. KPI performance should be reviewed monthly and reported quarterly to the facility manager.
12.3 Preventive Maintenance Schedule
The preventive maintenance schedule defines the activities, frequency, and responsible party for all routine maintenance tasks. Preventive maintenance is the primary tool for maintaining system availability and extending equipment life in the harsh underground environment. All maintenance activities must be documented in the maintenance log, and any defects found must be raised as corrective maintenance work orders within 24 hours of discovery.
| Frequency | Activity | Responsible | Documentation |
|---|---|---|---|
| Daily (Remote) | Check camera online status in VMS dashboard | Control room operator | Daily log entry |
| Daily (Remote) | Verify recording is active on all channels | Control room operator | Daily log entry |
| Daily (Remote) | Check storage utilization (alert if >80%) | Control room operator | Daily log entry |
| Daily (Remote) | Review UPS battery status and alarm log | Control room operator | Daily log entry |
| Monthly (Field) | Clean camera dome covers with IPA wipe | Maintenance technician | Maintenance work order |
| Monthly (Field) | Inspect cable entries and glands for seal integrity | Maintenance technician | Maintenance work order |
| Monthly (Field) | Test UPS battery runtime under load | Maintenance technician | UPS test report |
| Monthly (Field) | Check cabinet temperature and humidity logs | Maintenance technician | Maintenance work order |
| Quarterly (Field) | Verify camera focus and coverage area | Maintenance technician | Quarterly inspection report |
| Quarterly (Field) | Test motion detection in all zones | Maintenance technician | Quarterly inspection report |
| Quarterly (Field) | Check firmware versions; apply updates if available | Network engineer | Firmware update log |
| Quarterly (Field) | Review and rotate VMS user passwords | IT security | Password change log |
| Annual (Full) | Full acceptance re-test (all items from Chapter 10) | Project manager + client | Annual inspection report |
| Annual (Full) | Replace UPS batteries if capacity <80% of rated | Maintenance technician | Battery replacement record |
| Annual (Full) | Anti-corrosion inspection and touch-up of all metalwork | Maintenance technician | Annual inspection report |
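The daily remote checks and the UPS battery replacement rule in the schedule above lend themselves to simple automation. The following is a minimal sketch, not a reference implementation: the `DailySnapshot` structure, its field names, and the camera and channel IDs are hypothetical, and the values are assumed to be read manually or exported from the VMS and UPS rather than fetched through any specific vendor API. Only the two 80% thresholds come from the schedule.

```python
from __future__ import annotations

from dataclasses import dataclass

STORAGE_ALERT_PCT = 80.0      # daily check: alert if storage utilization > 80%
UPS_REPLACE_BELOW_PCT = 80.0  # annual check: replace if capacity < 80% of rated


@dataclass
class DailySnapshot:
    """Hypothetical snapshot of the values an operator reads off the dashboard."""
    cameras_online: dict[str, bool]      # camera ID -> currently online?
    channels_recording: dict[str, bool]  # channel ID -> recording active?
    storage_used_pct: float              # storage utilization, 0-100
    ups_alarms: list[str]                # entries from the UPS alarm log


def daily_findings(snap: DailySnapshot) -> list[str]:
    """Return one daily-log line per deviation found in the remote checks."""
    findings = []
    for cam, online in sorted(snap.cameras_online.items()):
        if not online:
            findings.append(f"camera offline: {cam}")
    for channel, recording in sorted(snap.channels_recording.items()):
        if not recording:
            findings.append(f"recording inactive: {channel}")
    if snap.storage_used_pct > STORAGE_ALERT_PCT:
        findings.append(
            f"storage utilization {snap.storage_used_pct:.0f}% "
            f"exceeds {STORAGE_ALERT_PCT:.0f}%"
        )
    for alarm in snap.ups_alarms:
        findings.append(f"UPS alarm: {alarm}")
    return findings


def ups_battery_needs_replacement(measured: float, rated: float) -> bool:
    """Annual check: True if measured capacity is below 80% of rated capacity."""
    return measured < rated * UPS_REPLACE_BELOW_PCT / 100.0
```

An empty list from `daily_findings` corresponds to a clean daily log entry; each returned line is a candidate for a corrective maintenance work order.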
12.4 Incident Response Procedure
When a system alarm or failure is detected, the incident response procedure defines the steps to be taken to restore service within the target MTTR. The procedure distinguishes between critical incidents (affecting a complete zone or the entire system) and non-critical incidents (affecting individual cameras or non-essential functions). Critical incidents must be escalated immediately to the on-call engineer; non-critical incidents can be queued for the next scheduled maintenance visit if the affected camera count is below the threshold defined in the SLA.
| Incident Type | Example | Response Time | Escalation Path | Resolution Target |
|---|---|---|---|---|
| P1 — Critical | All cameras in a zone offline; VMS server down; complete recording failure | Acknowledge within 15 min; dispatch within 1 hour | Operator → On-call engineer → Facility manager | Restore within 4 hours |
| P2 — High | Multiple cameras offline (>10%); UPS alarm; storage >90% | Acknowledge within 30 min; dispatch within 4 hours | Operator → On-call engineer | Restore within 8 hours |
| P3 — Medium | Single camera offline; image quality degraded; motion detection false alarms | Acknowledge within 2 hours; schedule within 24 hours | Operator → Maintenance queue | Restore within 48 hours |
| P4 — Low | Firmware update available; label damaged; minor cable management issue | Acknowledge within 24 hours | Maintenance queue | Resolve at next scheduled visit |
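The priority table above can be expressed as a small classification routine with acknowledgement and restoration deadlines. This is a sketch under stated assumptions: the `IncidentReport` fields are hypothetical, the ">10%" rule is read as a share of all cameras in the system, and several cameras offline at or below that share are treated as P3. The table itself does not spell out either point, so the SLA should be checked before relying on this logic.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# (acknowledge within, restore within) per priority, taken from the table.
# P4 has no fixed restore window: it is resolved at the next scheduled visit.
SLA_WINDOWS = {
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(minutes=30), timedelta(hours=8)),
    "P3": (timedelta(hours=2), timedelta(hours=48)),
    "P4": (timedelta(hours=24), None),
}


@dataclass
class IncidentReport:
    """Hypothetical description of a reported incident."""
    cameras_total: int
    cameras_offline: int = 0
    zone_fully_offline: bool = False
    vms_server_down: bool = False
    recording_failed: bool = False
    ups_alarm: bool = False
    storage_used_pct: float = 0.0
    image_or_motion_issue: bool = False  # degraded image or false motion alarms


def classify(report: IncidentReport) -> str:
    """Map a reported incident to P1..P4 following the table's examples."""
    if report.zone_fully_offline or report.vms_server_down or report.recording_failed:
        return "P1"
    # Assumption: "multiple cameras (>10%)" means at least two cameras AND
    # more than 10% of the system total (integer math avoids rounding issues).
    many_offline = (
        report.cameras_offline >= 2
        and report.cameras_offline * 10 > report.cameras_total
    )
    if many_offline or report.ups_alarm or report.storage_used_pct > 90.0:
        return "P2"
    if report.cameras_offline >= 1 or report.image_or_motion_issue:
        return "P3"
    return "P4"  # anything else that was reported (labels, firmware notices, ...)


def deadlines(priority: str, detected_at: datetime) -> tuple[datetime, Optional[datetime]]:
    """Return (acknowledge_by, restore_by); restore_by is None for P4."""
    ack, restore = SLA_WINDOWS[priority]
    return detected_at + ack, (detected_at + restore if restore is not None else None)
```

For example, a VMS server failure detected at 08:00 classifies as P1, with acknowledgement due by 08:15 and restoration due by 12:00 the same day.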
Spare Parts Readiness: To meet the MTTR targets above, a minimum spare parts inventory (as defined in Chapter 8) must be maintained on-site or at a nearby depot. Without spare parts, even a simple camera replacement can take days if parts must be ordered. Review spare parts inventory quarterly and replenish any consumed items within 30 days.
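The quarterly spare parts review can be sketched as a comparison of stock on hand against the required minimum, with a replenishment due date 30 days after each item was consumed. The part names and quantities used with this function are purely illustrative (the actual minimum inventory is the one defined in Chapter 8), and falling back to the review date when no consumption date was recorded is an assumption.

```python
from __future__ import annotations

from datetime import date, timedelta

REPLENISH_WITHIN = timedelta(days=30)  # replenish consumed items within 30 days


def replenishment_orders(
    minimum: dict[str, int],
    on_hand: dict[str, int],
    consumed_on: dict[str, date],
    review_date: date,
) -> dict[str, tuple[int, date]]:
    """Return {part: (quantity to order, replenish-by date)} for short items."""
    orders = {}
    for part, min_qty in minimum.items():
        shortfall = min_qty - on_hand.get(part, 0)
        if shortfall > 0:
            # Assumption: if no consumption date was logged, count from the review.
            start = consumed_on.get(part, review_date)
            orders[part] = (shortfall, start + REPLENISH_WITHIN)
    return orders
```

An empty result means the inventory meets the minimum and no replenishment is due.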