Statsd

From 탱이의 잡동사니

Overview

Notes on statsd, a daemon that collects and aggregates metrics sent by applications.

Basic

Statsd datagram

StatsD clients encode metrics into simple, text-based UDP datagrams. Your client library normally takes care of forming these datagrams, but it helps to know the format.

A StatsD datagram, which contains a single metric, has the following format.

<bucket>:<value>|<type>|@<sample rate>
  • Bucket
The bucket is an identifier for the metric. Metric datagrams with the same bucket and the same type are considered occurrences of the same event by the server.
  • Value
The value is a number that is associated with the metric. Values have different meanings depending on the metric's type.
  • Type
The type tells the server what kind of metric this is and therefore how to aggregate it: c for a counter, ms for a timer, g for a gauge, and s for a set.
  • Sample rate
The sample rate is optional and tells the server that the metric was down-sampled. Sampling reduces the number of metric datagrams sent to the StatsD server, since the server's aggregations can get expensive. The sample rate determines what fraction of the metric points a client sends to the server, and the server compensates by dividing the values it receives by the sample rate. For example, if a metric has a sample rate of 0.1, the client sends only 10 percent of the data points to the server. The server then divides the values for these metrics by 0.1 (that is, multiplies them by 10) to estimate the true value for additive metrics, such as a count of login invocations.
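The datagram format and the sample-rate arithmetic can be sketched in a few lines of Python. This is only an illustration of the format described above, not the code of any particular client or server; format_datagram and estimate_true_value are hypothetical helper names.

```python
def format_datagram(bucket, value, metric_type, sample_rate=1.0):
    """Build a datagram string: <bucket>:<value>|<type>[|@<sample rate>]."""
    datagram = f"{bucket}:{value}|{metric_type}"
    # The sample-rate suffix is only needed when the metric was down-sampled.
    if sample_rate < 1.0:
        datagram += f"|@{sample_rate}"
    return datagram


def estimate_true_value(received_value, sample_rate):
    """Scale a received additive value back up, as the server is described to do."""
    return received_value / sample_rate


print(format_datagram("login.invocations", 1, "c"))
print(format_datagram("login.invocations", 1, "c", sample_rate=0.1))
print(estimate_true_value(1, 0.1))
```

With a sample rate of 0.1, each datagram that does arrive stands in for roughly ten real events, which is why the received value is divided by the rate.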

Data Types

Counters

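The counter is the most basic data type. It represents the number of times a given event occurred during a given sampling (flush) interval.

For example, if the sampling interval is 10 seconds and the event occurred 7 times during those 10 seconds, the counter's raw count is 7 and its per-second rate is 0.7 (7 / 10).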

Counters count occurrences of an event. Counters are often used to determine the frequency at which an event is happening. Counter metrics have "c" as their type in the datagram format. The value of a counter metric is the number of occurrences of the event that you wish to count, which may be a positive or negative whole number. Many clients implement "increment" and "decrement" functions, which are shorthand for counters with values of +1 or -1, respectively.

login.invocations:1|c        # increment login.invocations by 1
other_key:-100|c             # decrement other_key by 100
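As a rough sketch of what happens to counter datagrams on the server side, the following Python sums the values per bucket and derives a per-second rate for one flush interval. The function name aggregate_counters and the shape of its result are assumptions made for illustration; a real server's internals may differ.

```python
from collections import defaultdict


def aggregate_counters(datagrams, flush_interval_seconds):
    """Sum counter datagrams per bucket and derive a per-second rate."""
    counts = defaultdict(float)
    for datagram in datagrams:
        bucket, rest = datagram.split(":", 1)
        fields = rest.split("|")
        value, metric_type = float(fields[0]), fields[1]
        if metric_type != "c":
            continue  # this sketch only handles counters
        # An optional third field such as "@0.1" carries the sample rate.
        sample_rate = float(fields[2][1:]) if len(fields) > 2 else 1.0
        counts[bucket] += value / sample_rate
    return {
        bucket: {"count": total, "rate": total / flush_interval_seconds}
        for bucket, total in counts.items()
    }


# Seven login events within a 10-second interval.
result = aggregate_counters(["login.invocations:1|c"] * 7, flush_interval_seconds=10)
print(result)
```

Seven increments in a 10-second interval give a count of 7 and a rate of 0.7 events per second.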

Timers

Timers are meant to track how long something took. They are an invaluable tool for tracking application performance.

The statsd server collects all timers under the stats.timers prefix and calculates the lower bound, mean, 90th percentile, upper bound, and count of each timer for each period (by the time you see it in Graphite, that's usually per minute).

  • The lower bound is the lowest value statsd saw for that stat during that time period.
  • The mean is the average of all values statsd saw for that stat during that time period.
  • The 90th percentile is a value x such that 90% of all the values statsd saw for that stat during that time period are below x, and 10% are above. This is a great number to try to optimize.
  • The upper bound is the highest value statsd saw for that stat during that time period.
  • The count is the number of timings statsd saw for that stat during that time period. It is not averaged.
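The five statistics listed above can be computed for one period as in the following Python sketch. It assumes a simple nearest-rank definition of the percentile, which may not match the exact rounding a given statsd server uses; summarize_timer and the key names in its result are hypothetical.

```python
import math


def summarize_timer(values_ms, percentile=90):
    """Compute lower, mean, percentile, upper and count for one flush period."""
    if not values_ms:
        raise ValueError("no timings were recorded in this period")
    ordered = sorted(values_ms)
    count = len(ordered)
    # Nearest-rank percentile: the smallest value with at least
    # `percentile` percent of the samples at or below it.
    rank = max(1, math.ceil(percentile * count / 100))
    return {
        "lower": ordered[0],
        "mean": sum(ordered) / count,
        f"p{percentile}": ordered[rank - 1],
        "upper": ordered[-1],
        "count": count,
    }


# Ten request durations in milliseconds, including one slow outlier.
summary = summarize_timer([320, 100, 250, 410, 120, 90, 180, 150, 200, 980])
print(summary)
```

In this sample the single 980 ms outlier pulls the mean up to 280 ms and sets the upper bound, while the 90th percentile stays at 410 ms, which is one reason the percentile is a useful number to optimize.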

Gauges

Gauges are a constant data type. They are not subject to averaging, and they don't change unless you change them. That is, once you set a gauge value, it will be a flat line on the graph until you change it again.

Gauges are useful for things that are already averaged, or don't need to reset periodically. System load, for example, could be graphed with a gauge.

The statsd server collects gauges under the stats.gauges prefix.
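The "flat line until you change it" behaviour can be modelled with a small Python sketch. GaugeStore is a hypothetical in-memory stand-in for the server, and system.load is an example bucket name; on the wire, such a gauge would be sent as a datagram like system.load:0.75|g.

```python
class GaugeStore:
    """Hypothetical in-memory model of how a server treats gauges."""

    def __init__(self):
        self.values = {}

    def set(self, bucket, value):
        # A new value simply replaces the previous one; nothing is summed.
        self.values[bucket] = value

    def flush(self):
        # Gauges are not reset on flush: the last value is reported
        # again until a new one arrives.
        return dict(self.values)


gauges = GaugeStore()
gauges.set("system.load", 0.75)
first = gauges.flush()
second = gauges.flush()  # no new datagram arrived in between
gauges.set("system.load", 1.5)
third = gauges.flush()
print(first, second, third)
```

The first two flushes report the same value, and only the third reflects the change.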

Sets

Sets count the number of unique values passed to a key.

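If the set method is called multiple times with the same value in the same sample period, that value will only be counted once.

<source lang=python>
import statsd  # assumes the pystatsd client library
c = statsd.StatsClient('localhost', 8125)  # assumed host and port
c.set('users', 'foo')
c.set('users', 'bar')
</source>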
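How the server might count unique values can be sketched with Python's built-in set type. SetStore is a hypothetical model written for illustration, not the actual server implementation; it assumes the collected values are cleared at each flush.

```python
from collections import defaultdict


class SetStore:
    """Hypothetical model of per-period unique-value counting."""

    def __init__(self):
        self.members = defaultdict(set)

    def add(self, bucket, value):
        # Adding a value that is already present has no effect.
        self.members[bucket].add(value)

    def flush(self):
        # Report the number of unique values per bucket, then start over.
        counts = {bucket: len(values) for bucket, values in self.members.items()}
        self.members.clear()
        return counts


sets = SetStore()
for user in ["foo", "bar", "foo"]:
    sets.add("users", user)
print(sets.flush())
```

Although three values were sent, "foo" appears twice, so the reported count for users is 2.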

See also