Whether you're Google's search engine, which serves a billion monthly active users who use your service for free, or Salesforce, with 3.75 million paying subscribers, building a technology product means serving people. At Google, we distinguish between an SLO and a Service Level Agreement (SLA). An SLA usually involves a promise to someone using your service that its availability SLO will meet a certain level over a certain period, and that if it fails to do so, some kind of penalty will be paid. This might be a partial refund of the subscription fee customers paid for that period, or additional subscription time added free of charge. The concept is that going out of SLO will hurt the service team, so they will push hard to stay within it. If you're charging your customers money, you'll probably need an SLA.
Service availability: the amount of time the service is available for use. This may be measured by time window, for example requiring 99.5% availability between 8 a.m. and 6 p.m., with more or less availability specified for other times. E-commerce operations typically have extremely aggressive SLAs; 99.999 percent uptime is not an unusual requirement for a site that generates millions of dollars an hour.
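To make the availability figures above concrete, here is a minimal sketch that converts an availability target into the downtime it permits over a measurement window. The function names `allowed_downtime` and `meets_target` are illustrative assumptions, not part of any Google or Stackdriver API.

```python
from datetime import timedelta


def allowed_downtime(target: float, period: timedelta) -> timedelta:
    """Return the downtime an availability target permits over a period.

    `target` is a fraction, e.g. 0.995 for 99.5% availability.
    """
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be a fraction between 0 and 1")
    # The permitted downtime is simply the unavailable share of the period.
    return period * (1.0 - target)


def meets_target(downtime: timedelta, target: float, period: timedelta) -> bool:
    """Check whether observed downtime stays within the permitted budget."""
    return downtime <= allowed_downtime(target, period)


if __name__ == "__main__":
    # 99.5% over a 10-hour window (8 a.m. to 6 p.m.)
    print(allowed_downtime(0.995, timedelta(hours=10)))
    # 99.999% over a 365-day year
    print(allowed_downtime(0.99999, timedelta(days=365)))
```

Under these assumptions, 99.5% over a ten-hour business day leaves about three minutes of downtime per day, while 99.999% over a year leaves a little over five minutes in total, which is why the latter counts as an aggressive target.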
Define carefully. A provider may tweak SLA definitions to ensure that they are met. For example, an Incident Response Time metric is supposed to ensure that the provider addresses an incident within a set number of minutes. However, some providers can meet the SLA 100% of the time by sending an automated reply to an incident report. Customers should define SLAs clearly so that they reflect the intent of the service level.
Since our original post was published, we've made a few Stackdriver updates that make it even easier to integrate SLIs into your Google Cloud Platform (GCP) workflows. You can now combine your in-house SLIs with the SLIs of the GCP services you use, all in the same Stackdriver monitoring dashboard. At Next '18, the Spotlight session with Ben Treynor and Snapchat will show how Snap uses its dashboard to get an overview of what matters to its customers and correlate it directly with the information it receives from GCP, giving a detailed view of the customer experience.
IT outsourcing agreements in which service providers' compensation is tied to the outcomes achieved have grown in popularity as companies move away from pure time-and-materials or full-time-employee pricing models. Multi-level SLA: the SLA is split into different levels, each addressing a different set of customers for the same services, within the same SLA. The concept of SRE starts with the idea that metrics should be closely tied to business objectives.
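The incident-response example earlier in this passage can be sketched in code. One way to keep an automated reply from satisfying the metric is to measure the time to the first human response only. The `Incident` fields and the helper names below are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    reported_at: datetime
    auto_ack_at: Optional[datetime] = None        # automated reply, if any
    human_response_at: Optional[datetime] = None  # first action by a person


def response_time(incident: Incident) -> Optional[timedelta]:
    """Time to first human response; automated replies are deliberately ignored."""
    if incident.human_response_at is None:
        return None
    return incident.human_response_at - incident.reported_at


def within_sla(incident: Incident, limit: timedelta) -> bool:
    """An incident nobody has responded to never counts as meeting the SLA."""
    elapsed = response_time(incident)
    return elapsed is not None and elapsed <= limit
```

Defined this way, an incident that received only an automated acknowledgement does not count toward the metric, which is closer to the intent of the service level.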
We use several essential tools (SLO, SLA and SLI) in SRE planning and practice. Metrics should be designed so that bad behavior by either party is not rewarded. For example, if a service level is breached because the customer did not provide information on time, the provider should not be penalized.
Next week at Google Cloud Next '18, you'll hear about new ways to ensure the availability of your applications. A big part of that is establishing and tracking service-level metrics, something our Site Reliability Engineering (SRE) team does day in and day out at Google. The ultimate goal of our SRE principles is to improve services and, in turn, the user experience, and next week we'll be discussing some new ways to incorporate SRE principles into your business.
Set a proper baseline. Defining the right metrics is only half the battle. To be useful, metrics must be set at reasonable, attainable performance levels.
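One possible way to choose an attainable baseline is to look at how the service has actually performed. The sketch below computes an SLI as the ratio of good events to total events, then picks the highest target that a chosen share of past periods would have met. The function names and the `attainment` parameter are assumptions made for illustration, not an established method from the source.

```python
import math
from typing import Sequence


def sli(good_events: int, total_events: int) -> float:
    """Service level indicator as the fraction of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def baseline_target(history: Sequence[float], attainment: float = 0.9) -> float:
    """Highest target that at least `attainment` of past periods would have met.

    `history` holds one measured SLI value per past period (e.g. per month).
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < attainment <= 1.0:
        raise ValueError("attainment must be in (0, 1]")
    ordered = sorted(history, reverse=True)
    # Number of past periods that must have met the target.
    needed = max(math.ceil(attainment * len(ordered)), 1)
    return ordered[needed - 1]
```

For instance, with five monthly SLI values and `attainment=0.8`, the function returns the fourth-best month, a target the service would have met in four of the five months. A stricter or looser share can be chosen depending on how much risk the provider is willing to accept.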