Rabbitmq

From 탱이의 잡동사니
Jump to: navigation, search

Overview

Rabbit MQ 내용 정리

Basic

RabbitMQ is a message broker: it accepts and forwards messages. You can think about it as a post office: when you put the mail that you want posting in a post box, you can be sure that Mr. or Ms. Mailperson will eventually deliver the mail to your recipient. In this analogy, RabbitMQ is a post box, a post office and a postman.

Producing

Producing means nothing more than sending. A program that sends message is a producer.

Queue

A queue is the name for a post box which lives inside RabbitMQ. Although messages flow through RabbitMQ and your applications, they can only be stored inside a queue. A queue is only bound by the host's memory & disk limits, it's essentially a large message buffer. Many producers can send messages that go to one queue, and many consumers can try to receive data from one queue.

Consuming

Consuming has a similar meaning to receiving. A consumer is a program that mostly waits to receive messages.

Queue

Round-robin dispatching

By default, RabbitMQ will send each message to the next consumer, in sequence. On average every consumer will get the same number of messages. This ways of distributing message is called round-robin.

Message acknowledgement

Doing a task can take a few seconds. You may wonder what happen if one of the consumer starts a long task and dies with it only partly done. With simple configuration, once RabbitMQ delivers message to the customer it immediately marks it for deletion. In this case, if you kill worker we will lose the message it was just processing. We'll also lose all the messages that were dispatched to this particular worker but were not yet handled.

In order to make sure a message is never lost, RabbitMQ supports message acknowledgements. An ack(nowledgement) is sent back by the consumer to tell RabbitMQ that a particular message had been received, processed and that Rabbit MQ is free to delete it.

If a consumer dies (its channel is close, connection is closed, or TCP connection is lost) without sending an ack, RabbitMQ will understand that a message wasn't processed fully and will re-queue it. If there are other consumers online at the same time, it will then quickly redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.

There aren't any message timeouts: RabbitMQ will redeliver the message when the consumer dies. It's fine even if processing a message takes a very, very long time.

Manual message acknowledgments are turned on by default.

Acknowledgement must be sent on the same channel the delivery it is for was received on. Attempts to acknowledge using a different channel will result in a channel-level protocol exception.

Forgotten acknowledgement

It's a common mistake to miss the 'basic_ack'. It's an easy error, but the consequences are serious. Messages will be redelivered when your client quits (which may look like random redelivery), but RabbitMQ will eat more and more memory as it won't able to release any unacked messages.

In order to debug this kind of mistake you can use 'rabbitmqctl' to print the 'messages_unacknowledged' field.

$ sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
Listing queues ...
hello	0	0
task_queue	0	0

Message durability

When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and messages as durable.

First, we need to make sure that RabbitMQ will never lose our queue. In order to do so, we need to declare it as durable:

	q, err := ch.QueueDeclare(
		"task_queue", // name
		true,         // durable
		false,        // delete when unused
		false,        // exclusive
		false,        // no-wait
		nil,          // arguments
	)
	failOnError(err, "Failed to declare a queue")

Although this command is correct by itself, it would be possible to doesn't work. Because if it already defined queue, which is not durable, the RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that.

Now we need to makr our messages as persistent - by supplying a 'DeliveryMode: amqp.Persistent'.

	err = ch.Publish(
		"",     // exchange
		q.Name, // routing key
		false,  // mandatory
		false,
		amqp.Publishing{
			DeliveryMode: amqp.Persistent,
			ContentType:  "text/plain",
			Body:         []byte(body),
		},
	)

Message persistence

Marking messages as persistent doesn't fully guarantee that a message won't be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn't saved it yet. Also, RabbitMQ doesn't do fsync() for every message -- it may be just saved to cache and not really written to the disk. The persistence guarantees aren't strong.

If you need a stronger guarantee then you can use publisher confirms.

Fair dispatch

You might have noticed that the dispatching still doesn't work for some cases. For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.

This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.

In order to defeat that we can use the 'back.qos' method with the 'prefetch_count=1' setting. This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not still busy.

	err = ch.Qos(
		1,     // prefetch count
		0,     // prefatch size
		false, // global
	)
	failOnError(err, "Failed to set QoS")

Queue size

If all the workers are busy, your queue can fill up. You will want to keep an eye on that, and maybe add more workers, or use message TTL.

Publish / Subscribe

Essentially, publishing is going to be broadcast to all the receivers.

Exchanges

The core idea in the messaging model in RabbitMQ is that the producer never sends any messages directly to a queue. Actually, quite often the producer doesn't even know if a message will be delivered to any queue at all.

Instead, the producer can only send messages to an exchange. An exchange is a very simple thing. On one side it receives messages from producers and the other side it pushes them to queues. The exchange must know exactly what to do with a message it receives. Should it be appended to a particular queue? Should it be appended to many queue? Or should it get discarded. The rules for that are defined by the exchange type.

There are a few exchange types available:

dicrect
topic
headers
fanout

Fanout

The fanout exchange is very simple. As you can probably guess from the name, it just broadcasts all the messages it receives to all the queues it knows.

Temporary queues

As you may remember previously we were using queues that had specific names. Being able to name a queue was crucial for us -- we needed to point the workers to the same queue. Giving a queue a name is important when you want to share the queue between producers and consumers.

But that's not the case for our logger. We want to hear about all log messages, not just a subset of them. We're also interested only in currently flowing messages not in the old ones. To solve that we need two things.

Firstly, whenever we connect to Rabbit we need a fresh, empty queue. To do this we could create a queue with a random name, or, even better - let the server choose a random queue name for us.

Secondly, once we disconnect the consumer the queue should be automatically deleted.

In the amqp client, when we supply queue name as an empty string, we create a non-durable queue with a generated name:

q, err := ch.QueueDeclare(
  "",    // name
  false, // durable
  false, // delete when usused
  true,  // exclusive
  false, // no-wait
  nil,   // arguments
)

When the method returns, the queue instance contains a random queue name generated by RabbitMQ. For example, it may look like amq.gen-JzTY20BRgKO-HjmUJj0wLg.

When the connection that declared it closes, the queue will be deleted because it is declared as exclusive.

Bindings

We've already created a fanout exchange and a queue. Now we need to tell the exchange to send messages to our queue. That relationship between exchange and a queue is called a binding.

err = ch.QueueBind(
  q.Name, // queue name
  "",     // routing key
  "logs", // exchange
  false,
  nil
)

From now on the logs exchange will append message to our queue.

Routing

In this tutorial we're going to add a feature to it - we're going to make it possible to subscribe only to a subset of the messages. For example, we will be able to direct only critical error messages to the log file (to save disk space), while still being able to print all of the log messages on the console.

Bindings

A binding is a relationship between an exchange and queue. This can be simply read as: the queue is interested in messages from this exchange.

Bindings can take an extra 'routing_key' parameter. To avoid the confusion with a 'Channel.Publish' parameter we're going to call it a 'binding key'. This is how we could create a binding with a key.

err = ch.QueueBind(
  q.Name,    // queue name
  "black",   // routing key
  "logs",    // exchange
  false,
  nil)

The meaning of a binding key depends on the exchange type. The 'fanout' exchanges, which we used previously, simply ignored its value.

Direct exchange

Our logging system from the previous tutorial broadcasts all messages to all consumers. We want to extend that to allow filtering messages based on their severity. For example we may want the script which is writing log messages to the disk to only receive critical errors, and not waste disk space on warning or info log messages.

We were using a 'fanout' exchange, which doesn't give us much flexibility - it's only capable of mindless broadcasting.

We will use a 'direct' exchange instead. The routing algorithm behind a 'direct' exchange is simple - a message goes to the queues whose 'binding key' exactly matches the 'routing key' of the message.

Multiple bindings

It is perfectly legal to bind multiple queues with the same binding key.

Emitting logs

We'll use this model for our logging system. Instead of 'fanout' we'll send messages to a 'direct' exchange. We will supply the log severity as a 'routing key'. That way the receiving script will be able to select the severity it wants to receive.

As always, we need to create an exchange first

err = ch.ExchangeDeclare(
  "logs_direct", // name
  "direct",      // type
  true,          // durable
  false,         // auto-deleted
  false,         // internal
  false,         // no-wait
  nil,           // arguments
)

And we're ready to send a message.

err = ch.ExchangeDeclare(
  "logs_direct", // name
  "direct",      // type
  true,          // durable
  false,         // auto-deleted
  false,         // internal
  false,         // no-wait
  nil,           // arguments
)
failOnError(err, "Failed to declare an exchange")
 
body := bodyFrom(os.Args)
err = ch.Publish(
  "logs_direct",         // exchange
  severityFrom(os.Args), // routing key
  false, // mandatory
  false, // immediate
  amqp.Publishing{
    ContentType: "text/plain",
    Body:        []byte(body),
})

To simplify things we will assume that 'severity' can be one of 'info', 'warning', 'error'.

Subscripting

Receiving messages will work just like in the previous tutorial, with one exception - we're going to create a new binding for each severity we're interested in.

q, err := ch.QueueDeclare(
  "",    // name
  false, // durable
  false, // delete when usused
  true,  // exclusive
  false, // no-wait
  nil,   // arguments
)
failOnError(err, "Failed to declare a queue")
 
if len(os.Args) < 2 {
  log.Printf("Usage: %s [info] [warning] [error]", os.Args[0])
  os.Exit(0)
}
for _, s := range os.Args[1:] {
  log.Printf("Binding queue %s to exchange %s with routing key %s",
     q.Name, "logs_direct", s)
  err = ch.QueueBind(
    q.Name,        // queue name
    s,             // routing key
    "logs_direct", // exchange
    false,
    nil)
  failOnError(err, "Failed to bind a queue")
}

Topic

= Topic exchange

Messages sent to a topic exchange can't have an arbitrary 'routing_key' - it must be a list of words, delimited by dots. The words can be anything, but usually they specify some features connected to the message. A few valid routing key examples: *stock.usd.nyse, nyse.vmw, quick.orange.rabbit. There can be as many words in the routing key as you like, up to the limit of 255 bytes.

The binding key must also be in the same form. The logic behind the 'topic' exchange is similar to a 'direct' one - a message sent with a particular routing key will be delivered to all the queues that are bound with a matching binding key. However there are two important special cases for binding keys:

*(star) can substitute for exactly one word.
#(hash) can substitute for zero or more words.

See also