Arki1, Google Cloud Authorized Training Partner of the Year 2019 in Latin America

Logging, Monitoring, and Observability in Google Cloud

This three-day, instructor-led course teaches participants techniques for monitoring, troubleshooting, and improving infrastructure and application performance in Google Cloud.
Guided by the principles of Site Reliability Engineering (SRE), and using a combination of presentations, demonstrations, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Objectives

In this course, participants will learn the following skills:

  • Plan and implement a well-architected logging and monitoring infrastructure
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
  • Create effective monitoring dashboards and alerts
  • Monitor, troubleshoot, and improve Google Cloud infrastructure
  • Analyze and export Google Cloud audit logs
  • Find production code defects, identify bottlenecks, and improve performance
  • Optimize monitoring costs

Audience

This class is intended for the following audience:

  • Cloud architects, administrators, and SysOps personnel
  • Cloud developers and DevOps personnel

Prerequisites

To get the most out of this course, participants should have:

  • Completion of Google Cloud Platform Fundamentals: Core Infrastructure, or equivalent experience
  • Basic scripting or coding familiarity
  • Proficiency with command-line tools and Linux operating system environments

Duration

24 hours (3 days)

Investment

Check the next open public class on our enrollment page.
If you are interested in a private training class for your company, contact us.
Dependencies between the Logging, Monitoring and Observability in Google Cloud course and other courses and certifications

Course Outline

The course includes presentations, demonstrations, and hands-on labs.
  • Understand the purpose and capabilities of Google Cloud operations-focused components: Logging, Monitoring, Error Reporting, and Service Monitoring
  • Understand the purpose and capabilities of Google Cloud application performance management-focused components: Debugger, Trace, and Profiler
  • Build a monitoring foundation on the four golden signals: latency, traffic, errors, and saturation
  • Measure customer pain with SLIs
  • Define critical performance measures
  • Create and use SLOs and SLAs
  • Achieve harmony between developers and operations with error budgets
  • Develop alerting strategies
  • Define alerting policies
  • Add notification channels
  • Identify types of alerts and common uses for each
  • Construct and alert on resource groups
  • Manage alerting policies programmatically
  • Choose best-practice monitoring project architectures
  • Differentiate Cloud IAM roles for monitoring
  • Use the default dashboards appropriately
  • Build custom dashboards to show resource consumption and application load
  • Define uptime checks to track aliveness and latency
  • Integrate logging and monitoring agents into Compute Engine VMs and images
  • Enable and utilize Kubernetes Monitoring
  • Extend and clarify Kubernetes monitoring with Prometheus
  • Expose custom metrics through code, and with the help of OpenCensus
  • Identify and choose among resource tagging approaches
  • Define log sinks (inclusion filters) and exclusion filters
  • Create metrics based on logs
  • Define custom metrics
  • Link application errors to Logging using Error Reporting
  • Export logs to BigQuery
  • Collect and analyze VPC Flow logs and Firewall Rules logs
  • Enable and monitor Packet Mirroring
  • Explain the capabilities of Network Intelligence Center
  • Use Admin Activity audit logs to track changes to the configuration or metadata of resources
  • Use Data Access audit logs to track accesses or changes to user-provided resource data
  • Use System Event audit logs to track GCP administrative actions
  • Define incident management roles and communication channels
  • Mitigate incident impact
  • Troubleshoot root causes
  • Resolve incidents
  • Document incidents in a post-mortem process
  • Debug production code to correct code defects
  • Trace latency through layers of service interaction to eliminate performance bottlenecks
  • Profile and identify resource-intensive functions in an application
  • Analyze resource utilization costs for monitoring-related components within Google Cloud
  • Implement best practices for controlling the cost of monitoring within Google Cloud