Trophy: Arki1, Google Cloud Authorized Training Partner of the Year 2019 and 2020 in Latin America

Logging, Monitoring, and Observability in Google Cloud

This three-day instructor-led course teaches participants techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud. Guided by the principles of Site Reliability Engineering (SRE), and using a combination of presentations, demos, hands-on labs, and real-world case studies, participants gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Objectives

In this course, participants will learn the following skills:

  • Plan and implement a well-architected logging and monitoring infrastructure
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Create effective monitoring dashboards and alerts
  • Monitor, troubleshoot, and improve Google Cloud infrastructure
  • Analyze and export Google Cloud audit logs
  • Find production code defects, identify bottlenecks, and improve performance
  • Optimize monitoring costs

Target Audience

This class is intended for the following audience:
  • Cloud architects, administrators, and SysOps personnel
  • Cloud developers and DevOps personnel

Prerequisites

To get the most out of this course, participants should have:

  • Google Cloud Platform Fundamentals: Core Infrastructure or equivalent experience
  • Basic scripting or coding familiarity
  • Proficiency with command-line tools and Linux operating system environments

Duration

24 hours (3 days)

Investment

See the current price and the dates of upcoming open classes on our registration page.
If you are interested in a private class for your company, please contact us.
Course and certification dependencies for the Logging, Monitoring, and Observability in Google Cloud course

Course Outline

The course includes presentations, demonstrations, and hands-on labs.
  • Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring
  • Understand the purpose and capabilities of Google Cloud application performance management focused components: Debugger, Trace, and Profiler
  • Construct a monitoring base on the four golden signals: latency, traffic, errors, and saturation
  • Measure customer pain with SLIs
  • Define critical performance measures
  • Create and use SLOs and SLAs
  • Achieve developer and operation harmony with error budgets
  • Develop alerting strategies
  • Define alerting policies
  • Add notification channels
  • Identify types of alerts and common uses for each
  • Construct and alert on resource groups
  • Manage alerting policies programmatically
  • Choose best practice monitoring project architectures
  • Differentiate Cloud IAM roles for monitoring
  • Use the default dashboards appropriately
  • Build custom dashboards to show resource consumption and application load
  • Define uptime checks to track aliveness and latency
  • Integrate logging and monitoring agents into Compute Engine VMs and images
  • Enable and utilize Kubernetes Monitoring
  • Extend and clarify Kubernetes monitoring with Prometheus
  • Expose custom metrics through code, and with the help of OpenCensus
  • Identify and choose among resource tagging approaches
  • Define log sinks (inclusion filters) and exclusion filters
  • Create metrics based on logs
  • Define custom metrics
  • Link application errors to Logging using Error Reporting
  • Export logs to BigQuery
  • Collect and analyze VPC Flow logs and Firewall Rules logs
  • Enable and monitor Packet Mirroring
  • Explain the capabilities of Network Intelligence Center
  • Use Admin Activity audit logs to track changes to the configuration or metadata of resources
  • Use Data Access audit logs to track accesses or changes to user-provided resource data
  • Use System Event audit logs to track GCP administrative actions
  • Define incident management roles and communication channels
  • Mitigate incident impact
  • Troubleshoot root causes
  • Resolve incidents
  • Document incidents in a post-mortem process
  • Debug production code to correct code defects
  • Trace latency through layers of service interaction to eliminate performance bottlenecks
  • Profile and identify resource-intensive functions in an application
  • Analyze resource utilization costs for monitoring-related components within Google Cloud
  • Implement best practices for controlling the cost of monitoring within Google Cloud
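
As a rough illustration of the SLI, SLO, and error budget topics in the outline above, the following Python sketch shows the arithmetic that usually sits behind them. It assumes a request-based SLI (good requests divided by total requests); the names `Slo`, `sli`, `error_budget`, and `budget_remaining` are hypothetical helpers written for this example and are not part of any Google Cloud API or of the course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical request-based SLO over one compliance window."""
    target: float        # e.g. 0.999 means 99.9% of requests must be good
    total_requests: int  # requests observed in the window


def sli(good_requests: int, total_requests: int) -> float:
    """SLI as the fraction of good requests (assumed definition)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget(slo: Slo) -> float:
    """Number of requests that may fail in the window without breaking the SLO."""
    return (1.0 - slo.target) * slo.total_requests


def budget_remaining(slo: Slo, bad_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget(slo)
    if budget <= 0:
        raise ValueError("an SLO target of 100% leaves no error budget")
    return 1.0 - bad_requests / budget


if __name__ == "__main__":
    slo = Slo(target=0.999, total_requests=1_000_000)
    bad = 250
    print(f"SLI:              {sli(slo.total_requests - bad, slo.total_requests):.5f}")
    print(f"Error budget:     {error_budget(slo):.0f} requests")
    print(f"Budget remaining: {budget_remaining(slo, bad):.0%}")
```

In this sketch, a 99.9% target over one million requests leaves a budget of roughly 1,000 failed requests, so 250 failures would consume about a quarter of it. A team could use the remaining fraction as one input when deciding whether to ship new changes or focus on reliability work.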