Rules evaluation endpoint service disruption
Resolved Aug 29 at 01:07pm CDT
Rule evaluation requests were temporarily failing after a change to an underlying library. We reverted the library to its previous version and redeployed the prior build of the Rules API, which resolved the incident.
Rules API service degraded
Resolved Jul 31 at 10:26am CDT
Between 9:34 am and 9:46 am CDT, a deployment of a dependent back-end service coincided with exceptionally high overall platform load, exposing an issue that caused a large number of 400 responses from the Rules API. The degradation resolved automatically once the deployment completed. Verosint engineering is aware of the issue, is working on a permanent fix, and will take additional safety precautions during subsequent deployments of the dependent back-end service until the ...
SignalPrint API Backend Maintenance
Resolved Jun 27 at 04:00pm CDT
SignalPrint log ingestion will be temporarily paused to facilitate reconfiguration of backend stream resources. The SignalPrint API will continue to function normally during the operation, and data sent to the API will buffer until the streams are resumed. Dashboards will not display newly ingested data until the maintenance is complete, but otherwise we anticipate no impact to service availability.
Rules API and SignalPrint API service disruption
Resolved Jun 21 at 03:55pm CDT
Following a standard application deployment, monitors for the Rules API and SignalPrint API backends alerted Verosint engineering to significantly elevated error responses from those endpoints. Engineering staff determined that a faulty VCS repository configuration allowed the affected services' configurations to be updated without deploying the application version for which that configuration was intended. Building and deploying an appropriate artifact restored both services, the associated ...
Brief unplanned API downtime
Resolved Jun 05 at 03:40pm CDT
The API backend experienced a brief service interruption on 05 June 2023, beginning at 20:33 UTC and ending at 20:39 UTC. The root cause was determined to be a stale network configuration on a subset of backend cluster nodes, which prevented new ingress service instances from initializing. Rolling replacement of the affected cluster nodes ensured the correct network configuration was applied, and API service monitors recovered shortly after the first few replacement nodes joined the cluster.