class FancyArray < Array
  def initialize(size)
    # ...
  end
end
What is wrong with this picture? Well, in my Ruby code, I can do x = Array.new. But what happens when I attempt to use the FancyArray class in place of Array? If I do x = FancyArray.new, I will surely get an ArgumentError exception because FancyArray requires that I pass one argument when calling the new method.
Let's phrase this in terms of the Subtype Requirement: Let x be an instance of Array. Then q(x) = the arity of the initialize method is -1. Let y be an instance of FancyArray, which is a subclass of Array. Then, by the Subtype Requirement, q(y) should hold as well: the arity of the initialize method should be -1.
Now let's relate the above to Ruby code and check if the Subtype Requirement holds:
irb(main):001:0> x = Array.instance_method(:initialize).arity
=> -1
irb(main):002:0> y = FancyArray.instance_method(:initialize).arity
=> 1
irb(main):003:0> x == y
=> false
It is clear from this that FancyArray does not conform to the Subtype Requirement. Consequently, code that expects to use an Array will not function correctly when a FancyArray is substituted. It's important to also note that the Subtype Requirement applies to any observable property of the object. The example used in the paper is of a Stack and Queue. Both classes may provide push and pop methods, but the semantics of the methods are quite different between the two classes.
Now, you may say, "But, I have a very good reason for requiring an argument to new." Well then, I would venture to say you have an important reason to consider the difference between composition and inheritance for designing your program.
3. Composition versus Inheritance
Of the three object-oriented principles—inheritance, encapsulation, and polymorphism—inheritance has been so abused there could be a 12-step program devoted entirely to it. Fortunately, the remedy for inappropriate use of inheritance is quite simple: compose your objects of other objects.
Inheritance models an is a relationship, while composition models a has a relationship. If your object is a String, then it will do all the normal String things just as a String would do them. This is very important. It needs to do String things not just externally, when you call the methods, but internally, when the other String methods call each other.
Is your FancyTemplate class really a String? Then, for example, I should always be able to request its length. However, your FancyTemplate instance probably doesn't have a length when it is being built. Therefore, String methods that may be employed during the construction phase could be highly confused.
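To make the has a option concrete, here is a minimal sketch of a template class that holds a String instead of inheriting from it. Everything in it (the constructor argument, the set method, the {{name}} placeholder syntax) is a hypothetical design chosen for illustration, not something taken from a real FancyTemplate.

```ruby
# Hypothetical FancyTemplate that *has a* String instead of *being* one.
class FancyTemplate
  def initialize(source)
    @source = source   # template text with {{name}} placeholders (assumed syntax)
    @values = {}
  end

  # Record a value for a placeholder; returns self so calls can be chained.
  def set(name, value)
    @values[name.to_s] = value.to_s
    self
  end

  # Produce the String representation only when it is asked for.
  def to_s
    @source.gsub(/\{\{(\w+)\}\}/) { @values.fetch($1, "") }
  end
end

template = FancyTemplate.new("Hello, {{name}}!")
template.set(:name, "Ruby")
puts template.to_s            # the rendered String
puts template.is_a?(String)   # false: it is not substitutable for a String
```

Because the class never claims to be a String, no String method can be called on it while it is half built, and callers have to ask for to_s explicitly.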
In such a case, I suggest your FancyTemplate has a String internally, and it can be urged to give you a representation of that String at some point in time. Yet, it is not a String from the perspective of inheritance and conforming to the Liskov Substitution Principle.
Only you can tell whether your model is best represented by inheritance or composition. When designing your classes, be sure to consider the view from inside and out. If you are contorting your methods to act like the class you are inheriting from, perhaps your class only has one of those things, rather than being one of them. Most importantly, remember that you are not the only kid on the playground.
4. Playing Nicely
This is more about general advice than specific admonitions. We are lucky to have such a powerful, expressive language in Ruby. Opening a core class to patch a method is tremendously useful and powerful. However, remember that with great power comes great responsibility.
First and foremost, simply be conscious of what you are asking Ruby to do for you. I used this example earlier, and I'm going to repeat it because in Rubinius we have encountered this more times than we can count. Ruby is an object-oriented language. You cause computation to occur by sending messages to an object. How can the object work if it has no methods? (I say with my best Zoolander impersonation). If your code does:
class SomeClass
  instance_methods(false).each { |m| undef_method m }
end
you are (most likely) doing it wrong. There are many variations on this theme, but they all share the same problem: the assumption that those methods you are removing are as superfluous as Johnny's appendix. I assure you, we don't randomly add methods to classes in Rubinius. Again, your code may work fine in MRI when you do this because MRI calls C functions on that object behind your back with impunity. But, we do want to have nice things, right? If you ever wonder what consequences your code may have, just drop into the #rubinius channel on freenode. We will happily discuss it with you.
A related problem occurs when code inherits from a core Ruby class and redefines one of the core methods. When the core classes are implemented in Ruby, the methods may depend on one another to perform their tasks. For example, in Hash it would not be entirely unreasonable for each_value to be implemented in terms of each. Well, not unreasonable, that is, until you try to run REXML in the Ruby Standard Library. REXML has an Attributes class that inherits from Hash. The Attributes class then implements an each_attribute method. For good measure, it overrides each to use each_attribute. And each_attribute calls each_value. Waiter, I believe there's a StackError in my Attributes. The moral of the story: the two edges on this wonderful Ruby sword are sharp. It does take extra work to consider how methods on a particular class interact with one another; to some extent, this is an implementation detail. However, it's something to be aware of when you write code. Of course, you can always browse the Ruby implementation of the core classes in Rubinius.
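The shape of that failure can be reproduced in a few lines. The sketch below does not use REXML or Hash; MiniHash is a made-up stand-in for a core class written in Ruby, and Attributes only mimics the override pattern described above.

```ruby
# Hypothetical stand-in for a core collection implemented in Ruby:
# each_value is built on top of each.
class MiniHash
  def initialize(pairs)
    @pairs = pairs
  end

  def each
    @pairs.each { |key, value| yield key, value }
  end

  def each_value
    each { |_key, value| yield value }
  end
end

# Mimics the override pattern described for REXML's Attributes.
class Attributes < MiniHash
  def each_attribute
    each_value { |value| yield value }
  end

  # Overriding each closes the loop:
  # each -> each_attribute -> each_value -> each -> ...
  def each
    each_attribute { |value| yield value, value }
  end
end

begin
  Attributes.new([[:id, 1]]).each { |key, _value| puts key }
  outcome = :finished
rescue SystemStackError
  outcome = :stack_overflow
end
puts outcome
```

The subclass is perfectly reasonable if each_value does not depend on each. It only breaks once the parent class makes that (equally reasonable) implementation choice.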
Playing nicely is more than being conscientious about how you write your own code. It's also important to consider how you use code others have written. Your code should not depend on implementation details of the classes and libraries you use. However, it's often hard to know what those implementation details are. Often the dependency will be subtle and implicit. Your code will appear to work fine in MRI but break in one of the alternative implementations. There is no general solution to this problem, but you can usually avoid it by checking the assumptions your code makes about the other code it uses. One example of this is mutating a collection in the block passed to an iterating method. Consider the following code:
some_hash.each { |key, value| some_hash.delete(key) if fancy_test(value) }
Hash is a fairly complex data structure and this bit of code can have very different behavior depending on how Hash is implemented. Thankfully, Matz has explicitly said this behavior is undefined.
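Two ways to get the same effect without deleting from the hash inside a block passed to each are sketched below. The fancy_test predicate here is a hypothetical placeholder for whatever test your code performs.

```ruby
# fancy_test is a hypothetical predicate standing in for the one above.
def fancy_test(value)
  value.odd?
end

some_hash = { :a => 1, :b => 2, :c => 3 }

# Option 1: let Hash do the removal itself with delete_if.
pruned = some_hash.dup
pruned.delete_if { |_key, value| fancy_test(value) }

# Option 2: decide first, mutate afterwards.
doomed = some_hash.keys.select { |key| fancy_test(some_hash[key]) }
doomed.each { |key| some_hash.delete(key) }

p pruned
p some_hash
```

Both variants leave only the entries that fail the test, and neither relies on what each does when the hash changes underneath it.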
5. Neighborly C Extensions
While playing nicely in Ruby code is important, it's also very important when writing C extensions. These are programs typically written in C/C++ that directly access the C functions that MRI uses to implement Ruby. You probably regularly use one or more gems or libraries that are partially implemented by a C extension. C extensions are often used to access native libraries from Ruby, for example, when writing database adapters.
C extensions are not the only way to access native libraries from Ruby. There are also the FFI and DL libraries. Rubinius was the first implementation to popularize the use of the foreign-function interface (FFI) library for accessing native code. In fact, vital pieces of the core library in Rubinius are implemented via FFI, which is a modern implementation of DL, the dynamic load library that MRI has included for years. There are now quality implementations of FFI available on both JRuby and MRI.
FFI is generally the preferred way to interface with native libraries. The benefits include not needing a C compiler and being able to harness the speed or power of a native library while writing pure Ruby code. However, there are still two core use cases for C extensions: 1) when the data marshaling through the FFI layer imposes too large a performance cost; or 2) when your code already relies on an existing C extension. These use cases are hard to get around. Fortunately, we have put a lot of effort into getting C extensions working quite well on Rubinius. In fact, many C extensions just work.
However, there is one particular problem with some C extensions that limits our ability to support them: some have explicit dependencies on MRI data structures, for example, RHash. Depending on a data structure your code does not control makes your program vulnerable to breaking if the other code changes its implementation. Unfortunately, the C programming language doesn't do much to enforce good practices here. If the C compiler can see a structure or function in a header file, you are free to use it in your program. Yet, just because you can, does not mean you should. Instead, you should always use a function interface (also known as an API) to access the data. Treat data structures that are not your own as opaque.
Of course, that is the ideal world. MRI cannot foretell every use case that a C extension may have. So some of these problems are simply the result of people being more creative than the MRI developers imagined, which is mostly a good thing. In version 1.9, MRI is enforcing the use of APIs over raw struct access. For example, rather than using RSTRING(obj)->ptr, your code should do RSTRING_PTR(obj) instead. Since Rubinius is compatible with MRI version 1.8.7, we still support both forms in this case. However, to make your code robust and portable, you should use the RSTRING_PTR API.
One thing Rubinius does not support is code like RHASH(obj)->tbl that accesses the RHash struct directly. This is partially because, in Rubinius, Hash is implemented entirely in Ruby. However, most C extension code needs to do something like iterate over the entries rather than just access the structure. In this case, the rb_hash_foreach function is available, so it's quite easy to change a C extension so it will run on Rubinius. In fact, a number of C extensions have already been updated in this way. If you encounter a problem with a C extension, please file an issue for it.
We understand there are valid use cases for writing C extensions. While Rubinius is implemented very differently than MRI, we want your C extensions to be able to run in Rubinius and we have worked hard to ensure that most C extensions do run. If you encounter cases where there is no function API to work with MRI data, let us know. We can collaborate with Matz and the MRI developers to add such APIs. That way, you can help us help you to make Ruby better for everyone. Win!
Ruby is a terrific language and with your help, it can be even better. Do you have any tips for writing better Ruby code? Please, let us know.
class Friend < ActiveRecord::Base
  belongs_to :user
  belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"
  # user befriends contact
  def self.befriend(user, contact)
    # look up the existing relationship by both IDs
    relationship = find_by_user_id_and_contact_id(user.id, contact.id)
    if relationship.nil?
      transaction do
        Friend.create(:user => user, :contact => contact)
      end
    end
  end
end
class User < ActiveRecord::Base
  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy
end
However, I have always felt that it's clumsy. What I really want to say is:
"Each user has a list of IDs that represent the people that they are friends with."
Sounds like a de-normalized list, right?
h2. The Solution
Enter Redis. Redis is a key-value store similar to memcached but more flexible since lists, sets, ordered sets and strings can all be used as values. Thanks to its simple API, the problem I described is essentially an atomic operation in Redis. Redis has a great "set" implementation and allows you to do all of the things you would imagine a set to do: addition, subtraction, unique insertion, deletion, union, intersection, etc.
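If you have not worked with set semantics before, the sketch below shows the operations listed above using Ruby's own Set class. It is only a local stand-in to show what the operations mean; it does not talk to Redis, and the user IDs are made up.

```ruby
require 'set'

# Ruby's Set is used here only as a stand-in for a Redis set;
# the user IDs are invented for illustration.
helpers_of_alice = Set.new([2, 3, 4])
helpers_of_bob   = Set.new([3, 4, 5])

helpers_of_alice.add(3)      # unique insertion: 3 is already present, nothing changes
helpers_of_alice.delete(2)   # deletion

shared   = helpers_of_alice & helpers_of_bob   # intersection
everyone = helpers_of_alice | helpers_of_bob   # union
only_bob = helpers_of_bob - helpers_of_alice   # subtraction

p shared.to_a.sort
p everyone.to_a.sort
p only_bob.to_a.sort
```

The difference with Redis is that the sets live in a separate server shared by every process of your application, rather than in one Ruby process's memory.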
The operation will ultimately look like this:
SET = Redis.new
SET.set_add key, value
However, since we are working inside a Rails app, we need to make sure we have the right plumbing set up.
# Create a redis.rb in your initializers folder.
# Create a new Redis database for each of your needs.
In our case, we want to have a dataset that keeps track of a User's helpers (other users who are helping them) and a list of a User's friends (other users that the user is helping). Since we are going to be using these Redis objects throughout the codebase, I like to declare them as global constants in the redis.rb initializer file.
HELPERS = Redis.new(:db => 0)
HELPING = Redis.new(:db => 1)
Notice that I pass in the :db key so that we make sure HELPERS and HELPING will hold two different Redis objects. You can use redis-namespace gem if you want, but I find the default syntax from the redis-rb gem works well enough for my purposes.
Now that we have these global Redis objects at our disposal throughout the application, we can start using them in our Friend.befriend method.
class Friend < ActiveRecord::Base
  belongs_to :user
  belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"
  # user befriends contact
  def self.befriend(user,contact)
    begin
      HELPERS.set_add contact.id, user.id
      HELPING.set_add user.id, contact.id
    rescue
      RedisLogger.info "Redis Exception"
    end
  end
end
class User < ActiveRecord::Base
  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy
end
However, this isn't the best solution right out of the gate. Using a NoSQL datastore has some drawbacks that aren't apparent in development mode but reveal their ugly face in production. If you are not careful, a simple restart of your Redis server can cause you to lose all your data. Managing your Redis data in production deserves its own post (coming soon), but for now, let's create a safer solution that you can gradually roll out as you become more comfortable with storing, backing up and using Redis datafiles.
class Friend < ActiveRecord::Base
  belongs_to :user
  belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"
  # user befriends contact
  def self.befriend(user, contact)
    # look up the existing relationship by both IDs
    relationship = find_by_user_id_and_contact_id(user.id, contact.id)
    if relationship.nil?
      transaction do
        Friend.create(:user => user, :contact => contact)
      end
      add_to_denormalized_list(user, contact)
    end
  end
  # mirror the new relationship in Redis; a Redis failure is logged, not raised
  def self.add_to_denormalized_list(user, contact)
    begin
      HELPERS.set_add contact.id, user.id
      HELPING.set_add user.id, contact.id
    rescue => e
      RedisLogger.info "Redis Exception: #{e.message}"
    end
  end
end
class User < ActiveRecord::Base
  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy
end
The strategy is simple: mirror the MySQL data in Redis. By adding a call to add_to_denormalized_list, we mirror the ActiveRecord call using the simple and elegant Redis set syntax discussed above. As you and your team get more practice and become more comfortable using Redis in production, you can start writing more to the denormalized list, eventually moving this part of your application away from ActiveRecord and MySQL to Redis. You could do this manually or you can use James Golick's recently released gem called Rollout that uses, you guessed it, Redis, to programmatically roll out features to users.
Like anything else you code, testing and benchmarking this process in production is crucial to make sure you are saving time and cycles. It might seem like a waste to duplicate your data in Redis, but you are a pragmatic polyglot persistence developer, right? You want to explore the NoSQL space while making sure that a little mistake or misunderstanding doesn't sink your ship. Give something like this a try; it doesn't get any more pragmatic. When you do try it or come up with something new, let me and everyone else know about it.
Thanks for reading.
Engine Yard has a long history with open source software. We have supported many big name projects over the years including Merb, Ruby 1.8.6, Rubinius, JRuby, and Rails. In addition to these larger projects, we also strive to open source internal technology that benefits the community as a whole. These projects are usually less well known, but we'd like to fix that.
Today we are announcing that the Engine Yard command line client is fully open sourced.
First, we have developed a new deployment tool that runs on the instance that is being deployed to. This code runs when you deploy from the command line, and will soon be the default for deploying from the dashboard. The code is available at engineyard-serverside.
The second component is the engineyard gem itself, a client library for our dashboard API. It is primarily used for managing custom recipes and deployment, but it will continue to expand over time. This code is available at engineyard.
The Engine Yard CLI was announced last month and we have a complete overview on the blog. This new deployment system separates the deployment of your code from the configuration of your cluster. This allows code to be deployed without any fear of incompatible configuration, and allows configuration changes to your server when the time is right for you. We provide even more flexibility through simple hooks into the deployment process, allowing you to completely override the way deployments happen. You can read about these and other features in greater detail in our recently revamped documentation site.
Please feel free to send pull requests and file any bugs or feature requests using Github Issues.
carbon:/home/ftp/pub/ruby/1.8$ ls -la | grep ruby-1.8.0
-rw-rw-r-- 1 root ftp 1979070 Aug 4 2003 ruby-1.8.0.tar.gz
So, it has been around for a while, and offers a good starting point for discussing concurrency in Ruby.
MRI Ruby 1.8.x supports concurrency in a few ways. One of the first things newcomers to Ruby leap for are its threads. Depending on the language these newcomers were familiar with before arriving at Ruby, they may be in for a surprise. MRI Ruby 1.8.x provides a green thread implementation. As mentioned above, green threads do not make use of any threading system native to the platform. Instead, 1.8.x's threads are implemented within the interpreter itself. This leads to threads behaving consistently across any platform the interpreter runs on. Because they are green threads, however, they offer no advantages for CPU bound tasks.
cpu_bound_threads.rb
require 'benchmark'
threads = []
thread_count = ARGV[0].to_i
iterations = ARGV[1].to_i
increment = iterations / thread_count.to_f
sum = 0
Benchmark.bm do |bm|
  bm.report do
    thread_count.times do |counter|
      threads << Thread.new do
        my_sum = 0
        queue = (1 + (increment * counter).to_i)..(0 + (increment * (counter + 1)).to_i)
        queue.each do |x|
          my_sum += x
        end
        Thread.current[:sum] = my_sum
      end
    end
    threads.each {|thread| thread.join; sum += thread[:sum]}
    puts "The sum of #{iterations} is #{sum}"
  end
end
This is a simple program that takes a large range of numbers, divides them into smaller ranges, and hands each smaller range to a thread that calculates the sum of the range it was given. The results from each individual thread are then added together to arrive at a final answer.
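The range arithmetic is the only subtle part of the script, so here it is in isolation. The split_range helper is a name I made up for this sketch; the expression inside it is the one the benchmark uses.

```ruby
# Split 1..iterations into `parts` contiguous ranges,
# using the same arithmetic as the benchmark script.
def split_range(iterations, parts)
  increment = iterations / parts.to_f
  (0...parts).map do |counter|
    (1 + (increment * counter).to_i)..((increment * (counter + 1)).to_i)
  end
end

ranges = split_range(10, 4)
p ranges

# Adding up the partial sums should match the closed form n * (n + 1) / 2.
total = ranges.inject(0) do |sum, range|
  sum + range.inject(0) { |partial, x| partial + x }
end
puts total == 10 * 11 / 2
```

For 10 iterations and 4 workers, the increment is 2.5, so the workers receive ranges of 2, 3, 2 and 3 numbers. The ranges do not overlap and together cover 1 through 10.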
All examples ran on an 8 core Linux machine. The numbers below are an average of the results of 100 runs for each set of inputs.
Time in seconds, by number of threads (rows) and iterations (columns)
Threads    50000         500000        5000000
1          0.01730298    0.17149276    1.70610744
2          0.01724724    0.17179465    1.70557474
4          0.01729293    0.17181384    1.70570264
8          0.01741591    0.17210276    1.71201153
As demonstrated by the numbers, MRI 1.8 threads are absolutely no help at all for a CPU bound application. In fact, there is a small but measurable cost to the overhead of managing them that is apparent in the numbers. As thread count increased, timing consistently and measurably slowed. If you are an MRI 1.8 user, do not despair; threads are but one concurrency option available to you.
An option that will better serve you for CPU bound tasks is process based concurrency. The idea is simple: in order to leverage multiple cores/CPUs, just create more than one process to handle the work load. Ruby provides a fork() method which, on platforms that support it, uses the underlying fork() call from the C library. This will create a new process, with a new process ID, that can be considered an exact copy of the parent process, except that its resource usage counters will be reset to 0.
Since processes do not share memory spaces, you must utilize another system provided communication mechanism in order to pass work to or from processes; this avoids the potential pitfalls that arise when trying to correctly manage locks on shared resources, but it does force one to think more specifically about exactly how to achieve communication.
cpu_bound_processes.rb
require 'benchmark'
processes = []
process_count = ARGV[0].to_i
iterations = ARGV[1].to_i
increment = iterations / process_count.to_f
sum = 0
def in_subprocess
  from_subprocess, to_parent = IO.pipe
  pid = fork do
    from_subprocess.close
    r = yield
    to_parent.puts [Marshal.dump(r)].pack("m")
    exit!
  end
  to_parent.close
  [pid,from_subprocess]
end
def get_result_from_subprocess(pid, from_subprocess)
  r = from_subprocess.read
  from_subprocess.close
  Process.waitpid(pid)
  Marshal.load(r.unpack("m")[0])
end
Benchmark.bm do |bm|
  bm.report do
    process_count.times do |counter|
      processes << in_subprocess do
        my_sum = 0
        queue = (1 + (increment * counter).to_i)..(0 + (increment * (counter + 1)).to_i)
        queue.each do |x|
          my_sum += x
        end
        my_sum
      end
    end
    processes.each {|process| sum += get_result_from_subprocess(*process)}
    puts "The sum of #{iterations} is #{sum}"
  end
end
In this example I used IO pipes to receive results from the children back into the master. The children need no pipe for their input: each one inherits the values it needs from the master at the moment it is forked.
As earlier, testing was done on an 8 core Linux machine, with 100 runs of each test. The program is equivalent to the threaded version, and was changed only as necessary to enable it to be used in a multiprocess model instead of a multithread model.
Time in seconds, by number of worker processes (rows) and iterations (columns)
Worker Processes    50000         500000        5000000
1                   0.01805432    0.17199047    1.70812685
2                   0.0098329     0.08675517    0.85509328
4                   0.00609409    0.0446612     0.43100698
8                   0.00847991    0.05346145    0.25621009
Take a good look at these numbers. Everything moves in the correct direction, until you get to the 8 process row. There, timing for both the 50000 and 500000 iteration columns is slower than in the 4 process row. Do you have any theories as to why?
Processes are, in many ways, a great way to handle concurrency. One of their drawbacks, though, is that they are heavy structures. They can take up significant time and resources to create. Linux uses copy-on-write semantics when creating forked processes. This means it doesn't actually duplicate the address space of the forked process until pages in that space start changing. Then, it duplicates what changes. This means that forked processes on Linux can be created fairly quickly. However, MRI 1.8 is not very friendly to copy-on-write semantics.
If you are unfamiliar with the way memory is managed and garbage is collected in MRI 1.8, you should check out my article on MRI Memory Allocation. One key aspect is that objects carry all of their status bits with them. This means that when the garbage collector scans the object space to find objects it can collect, it touches every object in the address space. For a process forked with copy-on-write semantics, this forces the kernel to make copies of all of those pages. This takes time, and largely negates the fast-creation benefit of copy-on-write forked processes.
The times for the lower iterations on the 8 process test reveal a cost to this form of concurrency. The overhead associated with creating the forked processes overwhelms the performance gains from the division of labor when the work to be done is brief enough. This is a reality for any form of concurrency -- there is always a performance tax from some amount of overhead. That tax is just higher when spawning something heavy like a process. Keep this in mind when you explore concurrency options for your task.
These first two examples both represent CPU bound problems. Many real world problems are not CPU bound, though. Rather, they are IO bound issues. Because an IO bound problem has latencies imposed on it by something outside of the program itself, IO bound problems can provide an excellent case for using MRI 1.8's green threads to improve performance.
io_bound_threads.rb
require 'net/http'
require 'thread'
require 'benchmark'
def get_data(url)
  tries = 0
  response = nil
  if /^http/.match(url)
    m = /^http:\/\/([^\/]*)(.*)/.match(url)
    site = m[1]
    path = m[2]
    begin
      http = Net::HTTP.new(site)
      http.open_timeout = 30
      http.start {|h| response = h.get(path)}
    rescue Exception
      tries += 1
      retry if tries < 5
    end
  end
  response.kind_of?(Array) ? response[1] : response.respond_to?(:body) ? response.body : ''
end
mutex = Mutex.new
signal = ConditionVariable.new
thread_count = ARGV[0].to_i
fetches = ARGV[1].to_i
url = ARGV[2]
threads = []
count = 0
active_threads = 0
Benchmark.bm do |bm|
  bm.report do
    while count < fetches
      while count < fetches && active_threads < thread_count
        mutex.synchronize do
          active_threads += 1
          count += 1
        end
        Thread.new do
          get_data(url)
          mutex.synchronize do
            active_threads -= 1
            threads << Thread.current
            signal.signal
          end
        end
      end
      mutex.synchronize do
        signal.wait(mutex)
      end
      while th = threads.shift
        th.join
      end
    end
  end
end
This script makes many HTTP requests. For simplicity's sake, let's say it just makes the same request over and over again, but it could easily be expanded to take a list of URLs, and to do something useful with the returned data. The script uses threads much like the CPU bound example, except that it is a bit more sophisticated in how it counts the work it has assigned to generated threads, and how it waits for all the threads to be completed.
This table shows timings from the script in action. The target URL used was not local to the testing machine. Each run fetched a single URL 400 times using the indicated number of threads: either a "fast" URL, with an over-the-net response speed of about 35 requests per second, or a "slow" URL, with an over-the-net response speed of about 3 requests per second. There were 100 runs completed. The numbers below are an average from those runs.
Time in seconds, by number of worker threads (rows) and request speed (columns)
Worker Threads    35/second     3/second
1                 6.53462668    61.1016239
2                 3.34861606    30.4514539
5                 1.38942396    12.1620945
10                0.72804622    6.0968646
20                0.47964698    3.0411382
Just a glance at these numbers clearly shows that Ruby threads are a big help with an IO bound activity like this. The relationship between number of threads and reduction in time to complete the task is not linear; but even with up to 20 threads there is a significant benefit to additional numbers of threads. The benefit is more linear, and evident for slower requests because the requests spend more time waiting on IO, and less on CPU bound activities.
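You can see the same effect without a network by letting sleep stand in for the time spent waiting on a remote server. This is only a sketch: the 0.1 second latency and the five simulated requests are arbitrary values I picked, and sleep is an idealized wait that uses no CPU at all.

```ruby
require 'benchmark'

# sleep stands in for network latency; both numbers are arbitrary.
latency  = 0.1
requests = 5

# One request after another: the waits add up.
serial = Benchmark.realtime do
  requests.times { sleep latency }
end

# One thread per request: the waits overlap.
threaded = Benchmark.realtime do
  workers = (1..requests).map { Thread.new { sleep latency } }
  workers.each { |worker| worker.join }
end

puts "serial:   %.2f s" % serial
puts "threaded: %.2f s" % threaded
```

The serial run has to wait for every simulated request in turn, while the threaded run finishes in roughly the time of a single wait, because a sleeping thread lets the scheduler run the others.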
There are some caveats to be aware of with regard to Ruby threads. First, even though they are green threads, as soon as one starts sharing resources between threads, threading becomes something that can be hard to get right. Share as little as possible, thoroughly think through your code, and use tests to support your reasoning, because threading problems can be hard to diagnose and solve.
Second, MRI 1.8 has a limit on the number of threads that it will manage. As a consequence of how the internals are implemented, this means that on most systems (notably excluding win32 systems), total thread count is limited to 1024. Also, because of the way it is implemented, the overhead increases to manage a larger number of threads versus smaller. Each thread consumes a significant amount of memory, so do not go crazy with threads or it will backfire on you.
Third, because of the way that Ruby threading is implemented, it is possible for a C extension to Ruby to take control of the process and prevent Ruby from allowing context switches to other threads. It is possible to write extensions so that they do not do this, but many are not written in this way. Where this bites most people, is with code that interacts with a database. One can reasonably look at a database query as an IO bound activity -- all the Ruby process is really doing is sending a request to the DB and waiting for a response. However, most DB interaction libraries are implemented as C extensions, and some of them do not play well with Ruby threads. One of the most common offenders is Mysql-Ruby. It will block all of Ruby while waiting for the result from a long running query. This means that a long running query will block the whole process until it returns. On the other hand, Ruby-PG, the driver for Postgres, will context switch within pgconn_block(), the function that makes blocking calls to the database, thus permitting other MRI 1.8 threads to run even during a long running query.
Fourth, because MRI 1.8 threads are green threads, they all run inside the context of a single process and a single system thread. Thus, while they give the appearance of concurrency, there is actually only one thread running at once. This is okay, because it is the appearance of concurrency that matters. If you run top on your laptop or VM shell, you will see a large number of processes running on your system. This number will exceed the number of cores that you have by a large margin, but you rarely have to worry about which processes are actually running on one of the cores at any given time. Your kernel takes care of slicing up access to the CPU into fine enough grains that it appears that all the running processes are executing on a core at the same time (even though most of them probably are not actually running at any given time). Concurrency in computing doesn't strictly mean that two or more things are actually running at the same time. Rather, it means that there is an appearance that they are, and that one works with them on the assumption that they are, and lets the underlying scheduler deal with making reality fit that appearance.
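As a small illustration of the first caveat, here is a sketch of the one piece of shared state in a threaded program being guarded by a Mutex. The counter, the thread count, and the loop size are all made up for the example.

```ruby
require 'thread'

counter = 0          # the only state shared between threads
lock = Mutex.new

threads = (1..10).map do
  Thread.new do
    1000.times do
      # Only one thread at a time may read-modify-write the counter.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each { |thread| thread.join }

puts counter
```

Keeping the shared state down to a single variable, and touching it only inside synchronize, is what makes this easy to reason about. The more state your threads share, the harder that reasoning becomes.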
An entire book could be written about concurrency in Ruby. I've just scratched the surface with this overview of process and thread based concurrency in Ruby. Hopefully this helped answer a few questions or suggested some techniques to consider.
Future installments in this series will cover Ruby 1.9.x (which uses system threads as opposed to green threads), JRuby, Rubinius, and using event systems like EventMachine to handle concurrency. So stay tuned! There is a lot more coming soon!
~/projects/jruby ➔ cat foo_heap_example.rb
class Foo
end
ary = []
10000.times { ary << Foo.new }
puts "ready for analysis!"
sleep
~/projects/jruby ➔ jruby foo_heap_example.rb
ready for analysis!
So we have our test subject ready to go. To use the jmap tool, we need the pid of this process. Of course we can use the usual shell tricks to get it, but the JDK comes with a nice tool for finding all JVM pids active on the system: jps
~/projects/jruby ➔ jps -l
52862 sun.tools.jps.Jps
52857 org/jruby/Main
48716 com.sun.enterprise.glassfish.bootstrap.ASMain
From this, you can see I have three JVMs running on my system right now: jps itself; our JRuby instance; and a GlassFish server I used for testing earlier today. We're interested in the JRuby instance, pid 52857. Let's see what jmap can do with that.
~/projects/jruby ➔ jmap
Usage:
jmap [option] <pid>
(to connect to running process)
jmap [option] <executable <core>
(to connect to a core file)
jmap [option] [server_id@]<remote server IP or hostname>
(to connect to remote debug server)
where <option> is one of:
<none> to print same info as Solaris pmap
-heap to print java heap summary
-histo[:live] to print histogram of java object heap; if the "live"
suboption is specified, only count live objects
-permstat to print permanent generation statistics
-finalizerinfo to print information on objects awaiting finalization
-dump:<dump-options> to dump java heap in hprof binary format
dump-options:
live dump only live objects; if not specified,
all objects in the heap are dumped.
format=b binary format
file=<file> dump heap to <file>
Example: jmap -dump:live,format=b,file=heap.bin <pid>
-F force. Use with -dump:<dump-options> <pid> or -histo
to force a heap dump or histogram when <pid> does not
respond. The "live" suboption is not supported
in this mode.
-h | -help to print this help message
-J<flag> to pass <flag> directly to the runtime system
The simplest option here is -histo, to print out a histogram of the objects on the heap. Let's run that against our JRuby instance.
~/projects/jruby ➔ jmap -histo:live 52857
 num     #instances         #bytes  class name
----------------------------------------------
   1:         22677        3192816  <constMethodKlass>
   2:         22677        1816952  <methodKlass>
   3:         35089        1492992  <symbolKlass>
   4:          2860        1389352  <instanceKlassKlass>
   5:          2860        1193536  <constantPoolKlass>
   6:          2798         739264  <constantPoolCacheKlass>
   7:          5861         465408  [B
   8:          5399         298120  [C
   9:          3042         292032  java.lang.Class
  10:          4037         261712  [S
  11:         10002         240048  org.jruby.RubyObject
  12:          3994         179928  [[I
  13:          5474         131376  java.lang.String
  14:          1661          95912  [I
...
The resulting output is a listing of literally every object in the system...not just Ruby objects even! The value of this should be apparent; not only can you start to investigate the memory overhead of code you've written, you'll also be able to investigate the memory overhead of every library and every piece of code running in the same process, right down to byte arrays (the "[B" above) and "native" Java strings ("java.lang.String" above). And so far we haven't had to do anything special to JRuby. Nice, eh?
So, back to the matter at hand: the Foo class from our example. Where is it? Well, the answer is that it's right there; 10000 of those 10002 org.jruby.RubyObject instances are our Foo objects; the other two are probably objects constructed for JRuby runtime purposes. But obviously, there's nothing in this output that tells us how to find our Foo instances. This is what I'm remedying in JRuby 1.6.
On JRuby master, there's now a flag you can pass that will stand up a JVM class for every user-created Ruby class. Among the many benefits of doing this, we also get a more useful profile. Let's see how to use the flag (which will either be default or very easy to access by the time we release JRuby 1.6).
~/projects/jruby ➔ jruby -J-Djruby.reify.classes=true foo_heap_example.rb
ready for analysis!
If we run jmap against this new instance, we see a more interesting result.
 num     #instances         #bytes  class name
----------------------------------------------
   1:         22677        3192816  <constMethodKlass>
   2:         22677        1816952  <methodKlass>
   3:         35089        1492992  <symbolKlass>
   4:          2860        1389352  <instanceKlassKlass>
   5:          2860        1193536  <constantPoolKlass>
   6:          2798         739264  <constantPoolCacheKlass>
   7:          5863         465456  [B
   8:          5401         298208  [C
   9:          3042         292032  java.lang.Class
  10:          4037         261712  [S
  11:         10000         240000  ruby.Foo
  12:          3994         179928  [[I
  13:          5476         131424  java.lang.String
  14:          1661          95912  [I
A-ha! There's our Foo instances! The "reify classes" option generates a JVM class of the same name as the Ruby class, prefixed by "ruby." to separate it from other JVM classes. Now we can start to see the real power of the tools, and we're just at the beginning. Let's see what a simple Rails application looks like.
~/projects/jruby ➔ jmap -histo:live 52926 | grep " ruby."
  29:         11685         280440  ruby.TZInfo.TimezoneTransitionInfo
  97:           970          23280  ruby.Gem.Version
  98:           914          21936  ruby.Gem.Requirement
 122:           592          14208  ruby.TZInfo.TimezoneOffsetInfo
 138:           382           9168  ruby.Gem.Dependency
 159:           265           6360  ruby.Gem.Specification
 201:           142           3408  ruby.ActiveSupport.TimeZone
 205:           118           2832  ruby.TZInfo.DataTimezoneInfo
 206:           118           2832  ruby.TZInfo.DataTimezone
 273:            41            984  ruby.Gem.Platform
 383:            14            336  ruby.Mime.Type
 403:            13            312  ruby.Set
 467:             8            192  ruby.ActionController.MiddlewareStack.Middleware
 476:             8            192  ruby.ActionView.Template
 487:             7            168  ruby.ActionController.Routing.DividerSegment
 508:             6            144  ruby.TZInfo.LinkedTimezoneInfo
 523:             6            144  ruby.TZInfo.LinkedTimezone
 810:             4             96  ruby.ActionController.Routing.DynamicSegment
2291:             2             48  ruby.ActionController.Routing.Route
2292:             2             48  ruby.I18n.Config
2293:             2             48  ruby.ActiveSupport.Deprecation.DeprecatedConstantProxy
2298:             2             48  ruby.ActionController.Routing.ControllerSegment
...
This time I've opted to grep out just the "ruby." items in the histogram, and the results are pretty impressive! We can see the baffling fact that there are 970 instances of Gem::Version, using at least 23280 bytes of memory. We can see the even more depressing fact that there are 11685 live instances of TZInfo::TimezoneTransitionInfo, using at least 280440 bytes.
Now that we're getting useful data, let's look at the first of our tools in more detail: jmap and jhat.
jmap and jhat
As you might guess, I do a lot of profiling in the process of developing JRuby. I've used probably a dozen different tools at different times. But the first tool I always reach for is the jmap/jhat combination. You've seen the simple case of using jmap above, generating a histogram of the live heap. Let's take a look at an offline heap dump.
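If grep is not enough, the histogram text is easy to post-process in Ruby itself. The following is a minimal sketch, not part of the original post: the `parse_histo` helper and `Row` struct are hypothetical names, and the sample rows are copied from the listings shown in this article. It assumes the column layout printed above (rank, instance count, bytes, class name).

```ruby
# A few rows copied from the histogram listings in this article.
HISTO_SAMPLE = <<~TEXT
    29:         11685         280440  ruby.TZInfo.TimezoneTransitionInfo
    97:           970          23280  ruby.Gem.Version
    98:           914          21936  ruby.Gem.Requirement
    13:          5474         131376  java.lang.String
TEXT

Row = Struct.new(:rank, :instances, :bytes, :class_name)

# Turn each "rank: instances bytes class" line into a Row; skip anything else
# (header, separator, trailing "..." lines).
def parse_histo(text)
  text.each_line.map do |line|
    m = line.match(/^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)/)
    m && Row.new(m[1].to_i, m[2].to_i, m[3].to_i, m[4])
  end.compact
end

rows = parse_histo(HISTO_SAMPLE)
ruby_rows = rows.select { |r| r.class_name.start_with?("ruby.") }
total_bytes = ruby_rows.sum(&:bytes)

ruby_rows.sort_by { |r| -r.bytes }.each do |r|
  puts format("%-40s %8d instances %10d bytes", r.class_name, r.instances, r.bytes)
end
puts "total bytes in reified Ruby classes: #{total_bytes}"
```

In practice the text would come from the real tool, for example by capturing the output of `jmap -histo:live <pid>` and passing it to the same helper.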
~/projects/jruby ➔ jmap -dump:live,format=b,file=heap.bin 52926
Dumping heap to /Users/headius/projects/jruby/heap.bin ...
Heap dump file created
That's how easy it is! The binary dump in heap.bin is supported by several tools: jhat (obviously), VisualVM, the Eclipse Memory Analysis Tool, and others. It's not officially a "standard" format, but it hasn't changed in a long time. Let's have a look at jhat options.
~/projects/jruby ➔ jhat
ERROR: No arguments supplied
Usage: jhat [-stack <bool>] [-refs <bool>] [-port <port>] [-baseline <file>] [-debug <int>] [-version] [-h|-help] <file>
-J<flag> Pass <flag> directly to the runtime system. For
example, -J-mx512m to use a maximum heap size of 512MB
-stack false: Turn off tracking object allocation call stack.
-refs false: Turn off tracking of references to objects
-port <port>: Set the port for the HTTP server. Defaults to 7000
-exclude <file>: Specify a file that lists data members that should
be excluded from the reachableFrom query.
-baseline <file>: Specify a baseline object dump. Objects in
both heap dumps with the same ID and same class will
be marked as not being "new".
-debug <int>: Set debug level.
0: No debug output
1: Debug hprof file parsing
2: Debug hprof file parsing, no server
-version Report version number
-h|-help Print this help and exit
<file> The file to read
For a dump file that contains multiple heap dumps,
you may specify which dump in the file
by appending "#<number>" to the file name, i.e. "foo.hprof#3".
All boolean options default to "true"
Generally you can just point jhat at a heap dump and away it goes. Occasionally if the heap is large, you may need to use the -J option to increase the maximum heap size of the JVM jhat runs in. Since we're running a Rails app, we'll bump the heap up a little bit.
~/projects/jruby ➔ jhat -J-Xmx200M heap.bin
Reading from heap.bin...
Dump file created Fri Jul 09 02:07:46 CDT 2010
Snapshot read, resolving...
Resolving 604115 objects...
[much verbose logging elided for brevity]
Chasing references, expect 120 dots........................................................................................................................
Eliminating duplicate references........................................................................................................................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.
"Server is ready"? Damn you Java people! Does everything have to be a server with you? In this case, it's actually an incredibly useful tool. jhat starts up a small web application on port 7000 that allows you to click through the dump file. Let's see what that looks like.
Here's the front page of the tool. We see a listing of all JVM classes in the system. If you scroll to the bottom, there's a few more general functions.
Let's go with what we know and view the heap histogram again.
Here we can see that there's lots of objects taking up memory, and they're a mix of JVM-native types, JRuby implementation classes, and actual Ruby classes. In fact, here we can see our friend TZInfo::TimezoneTransitionInfo again. Let's click through.
Pretty mundane stuff so far; basically just information about the class itself. But you see at the bottom of this screenshot that we can go from here to viewing all instances of TimezoneTransitionInfo. Let's try that.
Ahh, that's more like it! Now we can see that there's a heck of a lot of these things floating around. Let's investigate a bit more and click through the first instance.
Now this is some cool stuff!
We can see that the JVM class generated for TimezoneTransitionInfo has three fields: metaClass, which points at the Ruby Class object; varTable, which is an array of Object references used for instance variables and other "internal" variables; and a flags field containing runtime flags for the object, like whether it's frozen, tainted, and so on. We can see that this object has no special flags set, and we can dig deeper into those fields if we like. We'll skip that today.
Moving further down, we see a few more amazing links. First, there's a list of all references to this object. Ahh, now we can start to investigate why they're staying in memory, even though we're not using them. We can even have jhat show us the full chains of references keeping these objects alive; a series of objects leading all the way back to one "rooted" by a thread or by global JVM state. And we can explore the other direction as well, walking all objects reachable from this one.
This is only a small part of what you can do with jmap and jhat, and they're so simple to use it feels almost criminal. But what if we want to inspect an application while it's running? Dumping heaps and analyzing them offline can tell you much of the story, but sometimes you just want to see the objects coming and going yourself. Let's move on to VisualVM.
VisualVM
VisualVM spawned out of the NetBeans profiling tools. One of the biggest complaints about the JVMs of old was that all the built-in tooling seemed to be designed for JVM engineers alone. Because Sun had the foresight to build and own their own IDE and related modules, it eventually became a natural fit to pull out the profiling tools for use by everyone. And so VisualVM was born.
On most systems with Java 6 installed, you should have a "jvisualvm" command. Let's run it now.
When you start up VisualVM, you're presented with a list of running JVMs, similar to using the 'jps' command. You can also connect to remote machines, browse offline heap and core dump files, and look through memory and CPU profiling snapshots from previous runs. Today, we'll just open up our running Rails app and see what we can see.
VisualVM connects to the running process and brings up a basic information pane with process information, JVM information, and so on. We're interested in monitoring heap usage, so let's move to the "Monitor" tab.
Already we're getting some useful information. This view shows CPU usage (currently zero, since it's an idle Rails app), Heap usage over time, and the number of JVM classes and threads that are active. We can trigger a full GC, if we'd like to tidy things up before we start poking around. But most importantly, we can do the jmap/jhat dance in one step, by clicking the Heap Dump button. Tantalizing, isn't it?
Initially, we see a basic summary of the heap: total size, number of classes and GC roots, and so on. We're looking for our friend TimezoneTransitionInfo, so let's look for it in the "Classes" pane.
Ahh, there it is, just a little ways down the list. The counts are as we expect, so let's double-click and dig a bit deeper.
Here we have a lot of the same information about object instances that we did with jhat, but presented in a much richer format. Almost everything is active; you can jump around the heap and do analysis that would take a lot of manual work very easily. Let's try another tool: the Retained Size calculator.
Because our JVM tools see all objects equally, the reported size for a Ruby object on the heap is only part of the story. There's also the variable table, the object's instance variables, and objects they reference to consider. Let's jump to a different object now, Gem::Version.
We don't want to have to scroll through the list of classes to find ruby.Gem.Version, so let's make use of the Object Query Language console. With the OQL console, you can write SQL-like queries to retrieve listings of objects in the heap. We'll search for all instances of ruby.Gem.Version.
The query runs and we get a listing of Gem::Version objects. Let's dig deeper and see how much retained memory each Version object is keeping alive.
Clicking on the "Compute Retained Sizes" link in the "Instances" pane prompts us with this dialog. We're tough...we can take it.
Reticulating splines...
So it looks like each of the Version objects takes from 125 to 190 bytes for a total of 19400 bytes, most of which is from the variable table. What's in there?
Ahh...looks like there's a String and an Array. And of course we can poke around the heap ad infinitum, into and out of "native" JRuby and JVM classes, and truly get a complete picture of what our running applications look like. Now you're playing with power.
Your Turn
This is obviously only the tip of the iceberg. Tools like Eclipse Memory Analysis Tool include features for detecting leaks; VisualVM and NetBeans both allow you to turn on allocation tracing, to show where in your code all those objects are being created. There's tools for monitoring live GC behavior, and many of these tools even allow you to dig into a running heap and modify live objects. If you can dream it, there's a tool that can do it. And you get all that for free by using JRuby.
If you'd like to play with this, it all works with JRuby 1.5.1 but you won't get the nice JVM classes for Ruby classes. For that, you can pull and build JRuby master, download a 1.6.0.dev snapshot, or just wait for JRuby 1.6. And if you do play with these or other tools, I hope you'll let us know and blog about your experience!
In the future, I'll try to show some of the other tools plus some of the CPU profiling capabilities they bring to the table. For now, rest assured that if you're using JRuby, you really do have the best tools available to you.
This article was originally published on Charles Nutter's blog Headius.
Title       | Position
----------------------
The Odyssey | 1
The Iliad   | 2
The Nostoi  | 3
Bob wants to move "The Odyssey" to the bottom position. To do this, he needs to update its position to the bottom of the list (position 4), then subtract 1 from all positions. At the same time, Tom is adding a new book "The Cypria". Working this through:
# Bob checks the bottom position, finds it to be 4
# Tom inserts "The Cypria" in the bottom position of 4
# Bob updates the position of "The Odyssey" to 4
# Bob subtracts 1 from all positions, and since he is using *read committed* he will "see" and update the newly inserted book.
# Both "The Odyssey" and "The Cypria" have a position of 3
Title       | Position
----------------------
The Iliad   | 1
The Nostoi  | 2
The Odyssey | 3
The Cypria  | 3
If Bob had used the *serializable* level, the list would have remained consistent for his entire transaction, so his update would not have affected "The Cypria" that Tom inserted, and so would not have updated its position from 4 to 3. (In practice the way databases normally handle this is to actually abort one of the transactions with an error.)
For those using Rails, you may have recognized the above scenario as a typical @acts_as_list@ scenario, and you'd be correct. In a default configuration, the @acts_as_list@ plugin makes the same mistake outlined above, and will leave you with inconsistent data. The quickest fix is to wrap all list operations in a serializable transaction.
Book.transaction do
  Book.connection.execute("SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE")
  @book = Book.find_by_name("The Odyssey")
  @book.move_to_bottom
end
(It may have occurred to you that some locking or a unique index on position could avoid the exact scenario above, but that breaks @acts_as_list@ and fails to address some other edge cases left as an exercise for the reader. The main point for the purpose of this article is to understand why it breaks under read committed, but works under serializable.)
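The Bob-and-Tom interleaving described earlier can be replayed without a database. This is only an illustrative sketch: a plain Ruby hash stands in for the table, and the four statements are executed in the order in which read committed allows them to interleave.

```ruby
# Title => position, standing in for the books table.
books = { "The Odyssey" => 1, "The Iliad" => 2, "The Nostoi" => 3 }

# 1. Bob checks the bottom position and finds it to be 4.
bobs_bottom = books.values.max + 1

# 2. Tom inserts "The Cypria" at the bottom position, also 4.
books["The Cypria"] = books.values.max + 1

# 3. Bob updates "The Odyssey" to the position he read in step 1.
books["The Odyssey"] = bobs_bottom

# 4. Bob subtracts 1 from all positions. Under read committed this
#    statement sees Tom's committed row too.
books.keys.each { |title| books[title] -= 1 }

books.sort_by { |_title, position| position }.each do |title, position|
  puts "#{title} | #{position}"
end

duplicates = books.values.size != books.values.uniq.size
puts "duplicate positions: #{duplicates}"
```

Under serializable, steps 2 and 4 could not both succeed against the same snapshot, which is why the database would abort one of the transactions instead of producing the duplicate.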
As a general rule, read committed is a sensible default. It's easy to reason about, fast, and forces you to be explicit about your locking strategy. Jump up to serializable when needed, usually when dealing with ranges. MySQL's repeatable read default can be confusing and can deadlock in unintuitive ways, so it is not recommended.
This has been a very brief introduction to the four standard SQL isolation levels: read uncommitted, read committed, repeatable read, and serializable. Hopefully it has helped you get your head around them. I'll be going into much more detail with practical hands on exercises in my training days at Engine Yard's San Francisco office on the 24th and 31st of July. Visit www.dbisyourfriend.com for course and registration details.
def ship
  @order = PurchaseOrder.find(params[:id])
  @order.ship!
  redirect_to order_path(@order)
end
Imagine two users both press the "ship" button at the same time. (Or, as often happens, one user double-clicks the button.) The two requests will hit the load balancer and be distributed out to run on different processes. What happens when the above code (typical of many Rails applications) is run in two different places at the same time?
Both processes will load the order from the database at line 2. At line 3, when the ship! method is run, both processes will check the attributes of the order and see that it is currently unshipped. As a result, both execute the shipping code, which may include sending emails, updating caches, and transferring funds. The customer will receive duplicate emails, or worse, be charged twice. All versions of acts_as_state_machine (AASM) exhibit this behavior.
The Fix
Any time you read data from the database with the intention of making changes based on that data ("ship the order if it isn't already shipped") you must obtain an exclusive database lock on the row (or employ some form of optimistic locking strategy when updating, a topic not covered in this post). The database will block any processes trying to access that row until the session that obtained the lock concludes its transaction (COMMIT or ROLLBACK). ActiveRecord allows us to do this using the :lock flag:
def ship
  PurchaseOrder.transaction do
    @order = PurchaseOrder.find(params[:id], :lock => true)
    @order.ship!
  end
  redirect_to order_path(@order)
end
Working through the above example again, the first process to execute the find will issue the following SQL:
SELECT * FROM purchase_orders WHERE id = 1 FOR UPDATE
Notice the "FOR UPDATE" on the end; this instructs the database to place an exclusive lock on the row. When the second process executes the find and submits the above SQL to the database, the database will wait for the first transaction to complete (after calling ship! and updating the state of the order) before reading and returning the row. The returned row will now have a state of "shipped", and as such the ship! method will effectively be a noop (no operation). The customer will only receive one email.
It is also possible using ActiveRecord to lock an object that has been already loaded from the database:
def ship
  @order = PurchaseOrder.find(params[:id])
  PurchaseOrder.transaction do
    @order.lock!
    @order.ship!
  end
  redirect_to order_path(@order)
end
This is equivalent to a reload, but adds the "FOR UPDATE" suffix necessary for a database lock. It is an extra SQL statement (the order is selected twice), but is an easier pattern to abstract away.
class Order < ActiveRecord::Base
  # This method is usually provided by AASM
  def ship!
    return if shipped?
    # Important emails and computations
  end

  def ship_with_lock!
    transaction do
      lock!
      ship_without_lock!
    end
  end
  alias_method_chain :ship!, :lock
end
With alias_method_chain, we can continue to use exactly the same controller code we started with (just a plain call to ship!), and locking is handled for us in the background.
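To see the wrapping pattern in isolation, here is a self-contained sketch that runs without Rails or a database. It is an illustration under stated assumptions, not the plugin's code: a Mutex (`@row_lock`, a hypothetical name) stands in for the database row lock, a counter stands in for the important emails, and the two `alias_method` calls spell out the renaming that `alias_method_chain :ship!, :lock` performs.

```ruby
require "thread"

class Order
  attr_reader :emails_sent

  def initialize
    @shipped = false
    @emails_sent = 0
    @row_lock = Mutex.new # stand-in for the exclusive row lock
  end

  def shipped?
    @shipped
  end

  # Check-then-act: unsafe on its own when two callers race.
  def ship!
    return if shipped?
    sleep 0.01 # widen the window between the check and the update
    @emails_sent += 1
    @shipped = true
  end

  def ship_with_lock!
    @row_lock.synchronize { ship_without_lock! }
  end

  # Spelled-out equivalent of: alias_method_chain :ship!, :lock
  alias_method :ship_without_lock!, :ship!
  alias_method :ship!, :ship_with_lock!
end

order = Order.new
# Two "requests" arrive at the same time (think: a double-click).
2.times.map { Thread.new { order.ship! } }.each(&:join)
puts "emails sent: #{order.emails_sent}"
```

The second caller blocks until the first has finished, then finds the order already shipped and returns early, which mirrors what the second process sees after waiting on FOR UPDATE.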
Lost updates or duplicate execution won't be a problem for every website, but if you are starting to worry about the concurrency of your hosting infrastructure, it's worth having a look over your code too.
If you’d like to join me for some hands-on work with this, I’ll be running classes at Engine Yard's San Francisco office on the 24th and 31st of July. Visit www.dbisyourfriend.com for course and registration details.

Unicorn's been a topic I've been interested in learning about for a while now; numerous Engine Yard customers and developer friends use it, love it, and recommend it. Thankfully, the opportunity to do so recently presented itself. I spent some time poking around free resources looking for answers to my questions, and it wasn't as easy as I'd hoped... so I decided to go straight to the source.
First, I spent a bunch of time going over the Unicorn README file. While comprehensive, when I was done, I still had questions, so I put them all together, and emailed the Unicorn development team. They were gracious enough to reply with detailed answers to all my questions, and now that I'm in the know, I figured this would be a great resource to share with the rest of you. It's not our usual style of blog post, but it's solid information just the same, in what's hopefully an easily consumable format.
I've organized the questions into topical sections. The topics are: Clients, Debugging, Process Management, Load Balancing, Thread-safety, Rack support and Rack wrapper, Log Files, Binary Upgrades, Forking, Listening Interfaces, Configuration, Asynchronous Transfers, The Binary and Dependencies. There's a lot, so read through it all, or skip straight to the section that interests you.
Clients
What are "fast-clients"?
Clients that can make full (or close to full) use of the network bandwidth available to the server. Clients on a LAN (or the same host) usually fit this description, as they don't have to trickle data to the server over a slow link.
What's a slow client, by comparison?
A client with high latency or limited bandwidth that forces the server to sit idle and wait for data in the request or writable buffer space in the response. Accept filters in FreeBSD and deferred accept in Linux mitigate this problem for slow legitimate clients, but a dedicated attack can still get around those.
Slowloris is perhaps the most prominent example of the damage that could be caused by slow clients, but there have been similar tools like it floating around privately for years, including the Unicorn author's own "David" tool, which he (David) only made public after Slowloris:
Clients that sit around with idle keepalive connections are also a huge problem for simple servers like Unicorn (and traditional Apache prefork), so Unicorn does not support keepalive.
The Unicorn author also works on the Rainbows! server, which is designed specifically to handle talking directly to slow clients and high-latency apps (Comet/WebSockets) without nginx in front.
What is a "low-latency, high-bandwidth connection"?
Anything on localhost or the local area network that doesn't make the server sit idle, unable to service other requests.
Debugging
Can you give me an example of how to debug?
Reproducibility is critical to debugging. Processes are inherently simpler, as the process state is always well-defined on a per-request basis and isolated from other requests as much as possible by the OS. One example is tracking down a memory leak related to a specific class of requests:
A non-Rubyist admin noticed that among a pool of workers, some used significantly more memory than other workers. Since the log file format always logged the PID serving each request, they were able to quickly narrow down which endpoints were prone to leaking memory (without even looking at the code).
In a server where requests are all served within the same process, it would've been much harder to narrow down which endpoints were using up memory. In a server where a single process handles multiple clients simultaneously, it would've required thorough inspection of the source code to track down which requests were leaking memory.
Process Management
"Unicorn will reap and restart workers that die from broken apps. There's no need to manage multiple processes or ports yourself. Unicorn can spawn and manage any number of worker processes you choose to scale to your backend."
Does that mean that Unicorn doesn't need monit or god?
No server needs things like monit or god; it all depends on your comfort level, your app, and your support requirements. It's always possible, albeit unlikely, for the master process to die, but things like monit and god aren't immune to dying, either. Developers use those tools, and similar ones, like Bluepill, with Unicorn.
Load Balancing
"Load balancing is done entirely by the operating system kernel. Requests never pile up behind a busy worker process."
So there isn't a Mongrel queue issue?
No, there is no Mongrel queue issue on a single machine. A single queue is shared across worker processes and the workers only pull off the queue when they're available to do work. There's still a potential queue issue in a cluster behind a load balancer, but the risk is mitigated, since most servers are multicore and run multiple worker processes. The queue is also tunable by specifying the :listen parameter.
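The "single queue shared across worker processes" idea can be sketched in a few lines of plain Ruby on a Unix-like system. This is a simplified illustration of the pre-fork pattern, not Unicorn's actual code: the parent opens one listening socket, forked workers inherit it, and the kernel hands each incoming connection to whichever worker is waiting in accept. The worker count and the PID-echo protocol are made up for the example.

```ruby
require "socket"

# The parent opens the listening socket once; port 0 lets the OS pick a free port.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Each forked worker inherits the same socket and blocks in accept.
worker_pids = 2.times.map do
  fork do
    loop do
      client = server.accept  # only an idle worker is waiting here
      client.puts Process.pid # reply with the PID that served the request
      client.close
    end
  end
end

# Make a few requests and record which worker answered each one.
responses = 4.times.map do
  TCPSocket.open("127.0.0.1", port) { |sock| sock.gets.to_i }
end
puts "served by: #{responses.inspect}"

# Shut the workers down and reap them.
worker_pids.each do |pid|
  Process.kill("TERM", pid)
  Process.wait(pid)
end
server.close
```

A worker that is busy with a request is simply not sitting in accept, so new connections wait in the kernel's listen queue until some worker is free rather than piling up behind one particular process.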
Thread-safety
Why is thread-safety good?
The utility of thread-safety really depends on the particulars of your situation. It gives you much more flexibility with what your app can run, and under ideal conditions, threads are memory efficient and relatively inexpensive. Thus, allowing apps to work with threads is good for experienced programmers.
On the other hand though, making things thread-safe by default can hurt performance in single-threaded situations. Even contention-free locks can end up adding significant overhead due to memory barriers. Both MRI and Python core developers have come to this same conclusion.
Rack Support and Rack Wrapper
What rack applications are supported?
Pretty much anything that passes Rack::Lint (and sometimes, even a few that don't).
What Ruby on Rails versions does the wrapper support?
The manpage says everything 1.2.x to 2.3.x, and there are integration tests for those versions.
Log Files
"Builtin reopening of all log files in your application via USR1 signal. This allows logrotate to rotate files atomically and quickly via rename instead of the race condition prone and slow copytruncate method."
What is the USR1 signal?
USR1 is the first user-defined signal, which usually gives applications the most flexibility in determining what a signal handler for it would do. To send a USR1 signal to Unicorn, use the standard kill(1) command:
kill -USR1 $PROCESS_ID
Nginx also uses the USR1 signal for reopening log files. Most of the signals Unicorn accepts map directly to the nginx ones for ease-of-learning. Unicorn also takes steps to ensure multi-line log entries from one request all stay within the same file.
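For readers who have not written a signal handler in Ruby, here is a minimal sketch of reacting to USR1. It is not Unicorn's implementation; the `reopen_requested` flag is a hypothetical name, and the process signals itself only so the example is self-contained.

```ruby
reopen_requested = false

# Keep the handler tiny: just record that the signal arrived.
Signal.trap("USR1") { reopen_requested = true }

# Same effect as running `kill -USR1 <pid>` from a shell.
Process.kill("USR1", Process.pid)

# Give the interpreter a moment to run the handler.
deadline = Time.now + 2
sleep 0.01 until reopen_requested || Time.now > deadline

if reopen_requested
  # A real server would reopen each of its log files at this point,
  # so that writes go to the new file after logrotate's rename.
  puts "USR1 received, reopening logs"
end
```

Because the rotated file was only renamed, the process keeps writing to it until it reopens its logs, which is what makes the rename approach atomic from the writer's point of view.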
Binary Upgrades
What are binary upgrades?
Binary upgrades are upgrades that upgrade Unicorn itself, the version of Ruby, or even any system libraries including the system C library. For users that depend on copy-on-write functionality, it's also the only way to upgrade the application.
How do you upgrade?
The upgrade procedure is the same as nginx, and is also documented here (bottom of page).
What happens after upgrading?
After ensuring the old processes are terminated gracefully (via SIGQUIT), that same code should ensure that the app behaves as expected. If the app is broken, another "upgrade" is required which may involve switching back to a known good version.
Forking
What is the preload_app directive?
The preload_app directive loads the application before forking workers, so it can share any loaded data structures. By default, workers each load a private copy of their app for out-of-the-box compatibility with existing servers.
What's a use case for using the preload_app directive?
preload_app can dramatically speed up startup times. It can also make it easy to share memory across processes when using Ruby Enterprise Edition. REE also uses tcmalloc on some platforms, like Linux, instead of a generic malloc, which improves performance for most server workloads independently of copy-on-write.
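The effect of loading before forking can be shown with plain fork on a Unix-like system. In this sketch, `APP_DATA` is a hypothetical stand-in for an expensively loaded application; the child never loads it itself, yet it can read it because it was already in memory when the process was forked.

```ruby
# Loaded once in the parent, before any worker exists.
APP_DATA = { "routes" => %w[/ /orders /books] }.freeze

reader, writer = IO.pipe

pid = fork do
  reader.close
  # The worker did no loading of its own; it inherited APP_DATA from the parent.
  writer.puts(APP_DATA["routes"].size)
  writer.close
end

writer.close
child_saw = reader.gets.to_i
reader.close
Process.wait(pid)

puts "worker saw #{child_saw} preloaded routes"
```

Whether the inherited memory actually stays shared depends on the runtime's copy-on-write friendliness, which is the point of the Ruby Enterprise Edition remark above.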
Listening Interfaces
What's an example configuration for how to set this up and how this can be used for debugging an application?
You can set up a worker to listen on a specific address so that you can do things like strace the worker while hitting that address and see what happens. There's a commented out example in here, which is shortened and uncommented here:
after_fork do |server, worker|
  # per-process listener ports for debugging/admin/migrations
  addr = "127.0.0.1:#{9293 + worker.nr}"
  server.listen(addr, :tries => -1, :delay => 5)
end
Normally, strace will slow down a process enough that it usually "loses" when trying to accept() a connection against other workers and it never sees the request.
Configuration
Is there a good example configuration to help me get started?
The examples here cover many settings, including comments. The simplest case is with preload_app=false. Here's a short example:
worker_processes 16
pid "/path/to/app/shared/pids/unicorn.pid"
stderr_path "/path/to/app/shared/log/unicorn.stderr.log"
stdout_path "/path/to/app/shared/log/unicorn.stdout.log"
In contrast, preload_app=true can significantly complicate things, as it requires disconnecting/reconnecting to the database and other connections to avoid unintended resource sharing. All configuration settings are documented in the RDoc of the Unicorn::Configurator class.
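As a rough sketch of what the disconnect/reconnect step tends to look like, the fragment below uses the before_fork and after_fork hooks from Unicorn::Configurator. It assumes an ActiveRecord-based app, and the worker count is arbitrary; other shared connections (caches, message queues) would need the same treatment and are not shown.

```ruby
# unicorn.conf.rb (sketch; assumes an ActiveRecord-based application)
worker_processes 4
preload_app true

before_fork do |server, worker|
  # The master's database connection must not be inherited by the workers.
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  # Each worker opens its own connection after it has been forked.
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
```

The `defined?` guards keep the same file usable for Rack apps that do not load ActiveRecord at all.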
In addition to Unicorn::Configurator settings, there's also the rackup config file (usually config.ru) used by all Rack applications independently of the underlying server. There's also system/kernel tuning, which the Unicorn documentation touches on here.
The Binary
What is the unicorn executable? What is the unicorn_rails executable?
The unicorn executable is a Rack-only tool modeled after Rack's "rackup" and is recommended for Rack applications. unicorn_rails was made to be an easier transition for users of pre-Rack versions of Rails. The manpage encourages Rails 3 users to use plain unicorn instead.
What's the difference?
From the unicorn_rails manpage, some conventions of unicorn_rails are modeled after script/server found in Rails. It creates directories under "tmp" like script/server and the -E/--environment switch sets RAILS_ENV instead of RACK_ENV.
Dependencies
Are there any dependencies? Gems, system packages, etc.
Rack is the only Gem Unicorn currently depends on. Unicorn does not set hard dependencies on any released version of Rack. Unicorn depends on MRI 1.8 or 1.9 on a Unix-like platform. There have been commits to make the C/Ragel HTTP parser work with Rubinius, but there have been some other issues in the pure Ruby code. Building from git requires Ragel (but the distributed source tarball/gems do not). The project does not distribute precompiled binaries.
Unicorn uses RDoc for most of the documentation and John MacFarlane's Pandoc (a Haskell tool) for the Markdown manpages. Pandoc was the most prominent Markdown to manpage converter at the time, as Ryan Tomayko's ronn had not appeared when the manpages did.
What are the requirements? Operating system, RAM, etc.
Most POSIX-like platforms are supported. Unicorn depends on a bunch of Unix-y things like fork(), the ability to share file descriptors with children, signals, pipes, unlinked open files, etc...
Unicorn has been deployed to and tested on various Linux distros heavily. The Unicorn mailing list has gotten reports and patches for OpenBSD compatibility, too, so that should work. Unicorn does not depend on any exotic system calls not provided natively by MRI.
RAM usage depends heavily on the application/libraries, version of Ruby, word size of the architecture, and number of worker processes configured. It shouldn't take significantly more or less than any other Ruby web server.
Conclusion
I didn't start out with a specific problem, more like a void in my knowledge base, and now that that void is gone, I'm pretty pleased with the robust capabilities of Unicorn. It's not going to be the tool-of-choice for every use case, but clearly, it'll do wonders in a lot of them. As always, leave questions and comments here!