Improved Observability (Azure) #138

@jezzsantos

Azure Monitoring

Questions we want answers to:

  • Function Hosts:
    • Have any of the Azure Functions been disabled?
      • Alert when this happens
    • Have any functions become overloaded (memory, etc.)?
  • Queues (Storage Account):
    • Have any queues got more than 1000 messages queued?
      • Alert when this happens
    • Are there any messages in any of the poison queues?
      • Alert when this happens
  • Service Bus Topics:
    • Have any topic/subscriptions got more than 100 messages queued?
      • Alert when this happens
    • Overview of messages in all the Service Bus subscriptions
    • Throughput of messages
  • SQL Database:
    • General performance issues: CPU, Memory, Contention
    • Alert when response times exceed 1000ms for any query
  • App Services (All):
    • Alert when the health endpoint is consistently not responding
    • Alert on basic performance issues (Memory, CPU)
    • Alert when response time exceeds 1000ms
  • ApiHosts (AppServices):
    • Alert when it sees any HTTP 500
    • Alert when it sees any Exception
  • WebsiteHost (AppService):
    • Alert when it sees an HTTP 400 (from ANY ApiHost)
    • Alert when it sees a Crash report from the browser app (POST /record/crash)
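
To make the thresholds above concrete, here is a rough sketch of how they could be written down as data and checked against a snapshot of metrics. It is only an illustration: the metric keys (`queue_message_count`, `poison_message_count`, and so on), the `AlertRule` type, and the reading of "consistently" as three consecutive failed probes are all assumptions made for this example. None of it is an Azure API, and the real alerts would presumably be defined in Azure Monitor rather than in application code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AlertRule:
    name: str                             # label shown when the rule fires
    metric: str                           # hypothetical key in the metrics snapshot
    should_fire: Callable[[float], bool]  # predicate over the metric value


# Thresholds come from the list above; the metric keys are invented.
RULES = [
    AlertRule("Storage queue backlog", "queue_message_count", lambda v: v > 1000),
    AlertRule("Poison queue not empty", "poison_message_count", lambda v: v > 0),
    AlertRule("Service Bus subscription backlog", "subscription_message_count", lambda v: v > 100),
    AlertRule("Slow SQL query", "sql_query_duration_ms", lambda v: v > 1000),
    AlertRule("Slow App Service response", "http_response_time_ms", lambda v: v > 1000),
    AlertRule("ApiHost HTTP 500", "http_500_count", lambda v: v > 0),
]


def evaluate(snapshot: dict[str, float]) -> list[str]:
    """Return the names of the rules that fire for one metrics snapshot.

    Metrics missing from the snapshot are skipped rather than treated as zero.
    """
    return [
        rule.name
        for rule in RULES
        if rule.metric in snapshot and rule.should_fire(snapshot[rule.metric])
    ]


def consistently_failing(probe_results: list[bool], window: int = 3) -> bool:
    """True when the last `window` health probes all failed.

    `probe_results` holds True for a healthy probe and False for a failed one,
    oldest first. The window of 3 is an assumed meaning of "consistently".
    """
    if len(probe_results) < window:
        return False
    return not any(probe_results[-window:])


if __name__ == "__main__":
    sample = {"queue_message_count": 1500, "http_response_time_ms": 250}
    print(evaluate(sample))
    print(consistently_failing([True, False, False, False]))
```

Keeping the rules as a flat table makes it easy to review the thresholds in one place and to adjust them (for example, the 1000 and 100 message limits) without touching the evaluation logic.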

Labels: enhancement
