The code in this book is written in Ruby, a programming language that was designed to be simple, friendly, and fun. I’ve chosen it because of its clarity and flexibility, but nothing in the book relies on special features of Ruby, so you should be able to translate the code examples into whatever language you prefer—especially another dynamic language like Python or JavaScript—if that helps to make the ideas clearer.
All of the example code is compatible with both Ruby 2.0 and Ruby 1.9. You can find out more about Ruby, and download an official implementation, at the official Ruby website.
Let’s take a quick tour of Ruby’s features. We’ll concentrate on the parts of the language that are used in this book; if you want to learn more, O’Reilly’s The Ruby Programming Language is a good place to start.
If you already know Ruby, you can safely skip to Chapter 2 without missing anything.
One of Ruby’s friendliest features is its interactive console, IRB, which lets us enter pieces of Ruby code and immediately see the results. In this book, we’ll use IRB extensively to interact with the code we’re writing and explore how it works.
You can run IRB on your development machine by typing irb at the
command line. IRB shows a >> prompt when it expects you to provide a Ruby expression. After you type an expression and
hit Enter, the code gets evaluated, and the result is shown at a => prompt:
$ irb --simple-prompt
>> 1 + 2
=> 3
>> 'hello world'.length
=> 11
Whenever we see these >> and => prompts in the book, we’re interacting with IRB. To make
longer code listings easier to read, they’ll be shown without the prompts, but we’ll still
assume that the code in these listings has been typed or pasted into IRB. So once the book has
shown some Ruby code like this…
x = 2
y = 3
z = x + y
…then we’ll be able to play with its results in IRB:
>> x * y * z
=> 30
Ruby is an expression-oriented language: every valid piece of code produces a value when it’s executed. Here’s a quick overview of the different kinds of Ruby value.
As we’d expect, Ruby supports Booleans, numbers, and strings, all of which come with the usual operations:
>> (true && false) || true
=> true
>> (3 + 3) * (14 / 2)
=> 42
>> 'hello' + ' world'
=> "hello world"
>> 'hello world'.slice(6)
=> "w"
A Ruby symbol is a lightweight, immutable value representing a name. Symbols are widely used in Ruby as simpler and less memory-intensive alternatives to strings, most often as keys in hashes (see Data Structures). Symbol literals are written with a colon at the beginning:
>> :my_symbol
=> :my_symbol
>> :my_symbol == :my_symbol
=> true
>> :my_symbol == :another_symbol
=> false
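As a small sketch of how symbols relate to strings, the snippet below converts between the two and checks that repeated uses of a symbol literal refer to one shared object. The variable names are hypothetical and chosen only for illustration.

```ruby
# Every occurrence of the same symbol literal is the very same object,
# which is part of why symbols are cheap to compare and store.
name = :my_symbol
symbol_identical = name.equal?(:my_symbol)  # identity check, not just equality

# Symbols and strings can be converted into each other.
text = name.to_s          # symbol -> string
round_trip = text.to_sym  # string -> symbol
```

Here `equal?` compares object identity, while `==` (used in the session above) compares values.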
The special value nil is used
to indicate the absence of any useful value:
>> 'hello world'.slice(11)
=> nil
Ruby array literals are written as a comma-separated list of values surrounded by square brackets:
>> numbers = ['zero', 'one', 'two']
=> ["zero", "one", "two"]
>> numbers[1]
=> "one"
>> numbers.push('three', 'four')
=> ["zero", "one", "two", "three", "four"]
>> numbers
=> ["zero", "one", "two", "three", "four"]
>> numbers.drop(2)
=> ["two", "three", "four"]
A range represents a collection of values between a minimum and a maximum. Ranges are written by putting a pair of dots between two values:
>> ages = 18..30
=> 18..30
>> ages.entries
=> [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
>> ages.include?(25)
=> true
>> ages.include?(33)
=> false
A hash is a collection in which every value is associated with a key; some programming languages call this data structure a “map,” “dictionary,” or “associative array.” A hash literal is written as a comma-separated list of key => value pairs inside curly brackets:
>> fruit = { 'a' => 'apple', 'b' => 'banana', 'c' => 'coconut' }
=> {"a"=>"apple", "b"=>"banana", "c"=>"coconut"}
>> fruit['b']
=> "banana"
>> fruit['d'] = 'date'
=> "date"
>> fruit
=> {"a"=>"apple", "b"=>"banana", "c"=>"coconut", "d"=>"date"}
Hashes often have symbols as keys, so Ruby provides an alternative key: value syntax for writing key-value pairs where the key is a symbol. This is more compact than the key => value syntax and looks a lot like the popular JSON format for JavaScript objects:
>> dimensions = { width: 1000, height: 2250, depth: 250 }
=> {:width=>1000, :height=>2250, :depth=>250}
>> dimensions[:depth]
=> 250
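To confirm that the two hash syntaxes are just different spellings of the same thing, this short sketch builds the same hash both ways and compares the results. The variable names are hypothetical.

```ruby
# The compact syntax, available when every key is a symbol.
compact = { width: 1000, height: 2250, depth: 250 }

# The general syntax, which works for keys of any kind.
explicit = { :width => 1000, :height => 2250, :depth => 250 }

same_hash = (compact == explicit)  # both literals build equal hashes
keys = compact.keys                # the keys are ordinary symbols
```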
A proc is an unevaluated chunk of Ruby code that can be passed around and evaluated on demand; other languages call this an “anonymous function” or “lambda.” There are several ways of writing a proc literal, the most compact of which is the -> arguments { body } syntax:
>> multiply = -> x, y { x * y }
=> #<Proc (lambda)>
>> multiply.call(6, 9)
=> 54
>> multiply.call(2, 3)
=> 6
As well as the .call syntax,
procs can be called by using square brackets:
>> multiply[3, 4]
=> 12
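For comparison, here is a sketch of three of the ways to write a proc literal. The `lambda` and `proc` forms aren't shown in the text above; they differ from one another in some details (such as how strictly arguments are checked) that this example doesn't exercise. The variable names are hypothetical.

```ruby
# Three spellings of "multiply two numbers" as a proc.
multiply_arrow  = ->(x, y) { x * y }        # the compact -> syntax
multiply_lambda = lambda { |x, y| x * y }   # the lambda method with a block
multiply_proc   = proc { |x, y| x * y }     # the proc method with a block

# Each one responds to the call message in the same way.
callables = [multiply_arrow, multiply_lambda, multiply_proc]
results = callables.map { |callable| callable.call(6, 9) }
```

All three entries in `results` should be 54.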
Ruby has if, case, and while expressions, which work in the usual
way:
>> if 2 < 3
     'less'
   else
     'more'
   end
=> "less"
>> quantify =
     -> number {
       case number
       when 1
         'one'
       when 2
         'a couple'
       else
         'many'
       end
     }
=> #<Proc (lambda)>
>> quantify.call(2)
=> "a couple"
>> quantify.call(10)
=> "many"
>> x = 1
=> 1
>> while x < 1000
     x = x * 2
   end
=> nil
>> x
=> 1024
Ruby looks like other dynamic programming languages but it’s unusual in an important way: every value is an object, and objects communicate by sending messages to each other.[1] Each object has its own collection of methods that determine how it responds to particular messages.
A message has a name and, optionally, some arguments. When an object receives a message, its corresponding method is executed with the arguments from the message. This is how all work gets done in Ruby; even 1 + 2 means “send the object 1 a message called + with the argument 2,” and the object 1 has a #+ method for handling that message.
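We can make the message-sending view of 1 + 2 visible with a short sketch. It uses Ruby's built-in #send and #respond_to? methods, which aren't introduced in the text at this point; the variable names are hypothetical.

```ruby
# Three ways of sending the + message to the object 1 with the argument 2.
with_operator = 1 + 2          # the usual operator syntax
with_dot      = 1.+(2)         # the same message written with a dot
with_send     = 1.send(:+, 2)  # naming the message explicitly as a symbol

# We can also ask an object whether it has a method for a given message.
handles_plus = 1.respond_to?(:+)
```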
We can define our own methods with the def
keyword:
>> o = Object.new
=> #<Object>
>> def o.add(x, y)
     x + y
   end
=> nil
>> o.add(2, 3)
=> 5
Here we’re making a new object by sending the new message to a special built-in object
called Object; once the
new object’s been created, we define an #add method on it. The #add method adds its two arguments together and
returns the result—an explicit return
isn’t necessary, because the value of the last expression to be executed
in a method is automatically returned. When we send that object the
add message with 2 and 3 as
arguments, its #add method is executed
and we get back the answer we wanted.
We’ll usually send a message to an object by writing the receiving object and the message name separated by a dot (e.g., o.add),
but Ruby always keeps track of the current object (called self) and will allow us to send a message to that object by writing a message
name on its own, leaving the receiver implicit. For example, inside a method definition the
current object is always the object that received the message that caused the method to
execute, so within a particular object’s method, we can send other messages to the same object
without referring to it explicitly:
>> def o.add_twice(x, y)
     add(x, y) + add(x, y)
   end
=> nil
>> o.add_twice(2, 3)
=> 10
Notice that we can send the add
message to o from within the #add_twice method by writing add(x, y) instead of o.add(x, y), because o is the object that the add_twice message was sent to.
Outside of any method definition, the current object is a special
top-level object called main, and
any messages that don’t specify a receiver are sent to it; similarly, any
method definitions that don’t specify an object will be made available
through main:
>> def multiply(a, b)
     a * b
   end
=> nil
>> multiply(2, 3)
=> 6
It’s convenient to be able to share method definitions between many objects. In Ruby, we can put method
definitions inside a class, then create new objects by sending the new
message to that class. The objects we get back are instances of the class and incorporate its methods. For
example:
>> class Calculator
     def divide(x, y)
       x / y
     end
   end
=> nil
>> c = Calculator.new
=> #<Calculator>
>> c.class
=> Calculator
>> c.divide(10, 2)
=> 5
Note that defining a method inside a class definition adds the method to instances of
that class, not to main:
>> divide(10, 2)
NoMethodError: undefined method `divide' for main:Object
One class can bring in another class’s method definitions through inheritance:
>> class MultiplyingCalculator < Calculator
     def multiply(x, y)
       x * y
     end
   end
=> nil
>> mc = MultiplyingCalculator.new
=> #<MultiplyingCalculator>
>> mc.class
=> MultiplyingCalculator
>> mc.class.superclass
=> Calculator
>> mc.multiply(10, 2)
=> 20
>> mc.divide(10, 2)
=> 5
A method in a subclass can call a superclass method of the same name by using the super keyword:
>> class BinaryMultiplyingCalculator < MultiplyingCalculator
     def multiply(x, y)
       result = super(x, y)
       result.to_s(2)
     end
   end
=> nil
>> bmc = BinaryMultiplyingCalculator.new
=> #<BinaryMultiplyingCalculator>
>> bmc.multiply(10, 2)
=> "10100"
Another way of sharing method definitions is to declare them in a module, which can then be included by any class:
>> module Addition
     def add(x, y)
       x + y
     end
   end
=> nil
>> class AddingCalculator
     include Addition
   end
=> AddingCalculator
>> ac = AddingCalculator.new
=> #<AddingCalculator>
>> ac.add(10, 2)
=> 12
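The point of a module is that several unrelated classes can include it. As a sketch, the example below includes Addition in a second class, Ledger, which is a hypothetical name invented for this illustration, and uses the built-in #ancestors method (not covered in the text) to check that the module has been mixed in.

```ruby
module Addition
  def add(x, y)
    x + y
  end
end

class AddingCalculator
  include Addition
end

# Ledger is a made-up class with no relationship to AddingCalculator;
# it gets the same #add method by including the same module.
class Ledger
  include Addition
end

calculator_sum = AddingCalculator.new.add(10, 2)
ledger_sum     = Ledger.new.add(1, 2)
mixed_in       = Ledger.ancestors.include?(Addition)
```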
Here’s a grab bag of useful Ruby features that we’ll need for the example code in this book.
As we’ve already seen, Ruby lets us declare local variables just by assigning a value to them:
>> greeting = 'hello'
=> "hello"
>> greeting
=> "hello"
We can also use parallel assignment to assign values to several variables at once by breaking apart an array:
>> width, height, depth = [1000, 2250, 250]
=> [1000, 2250, 250]
>> height
=> 2250
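One handy consequence of parallel assignment, sketched here with hypothetical variable names, is that two variables can be swapped without a temporary variable, because the right-hand side is evaluated before any assignment happens.

```ruby
first, second = 'left', 'right'

# The right-hand side is gathered into an array first, then broken apart,
# so the two variables exchange their values.
first, second = second, first
```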
Strings can be single- or double-quoted. Ruby automatically performs interpolation on double-quoted strings, replacing any #{expression} with its result:
>> "hello #{'dlrow'.reverse}"
=> "hello world"
If an interpolated expression returns an object that isn’t a string, that object is automatically
sent a to_s message and is expected to return a string that can be used in
its place. We can use this to control how interpolated objects
appear:
>> o = Object.new
=> #<Object>
>> def o.to_s
     'a new object'
   end
=> nil
>> "here is #{o}"
=> "here is a new object"
Something similar happens whenever IRB needs to display an object: the object is sent the
inspect message and should return a string
representation of itself. All objects in Ruby have sensible default implementations of
#inspect, but by providing our own definition, we can
control how an object appears on the console:
>> o = Object.new
=> #<Object>
>> def o.inspect
     '[my object]'
   end
=> nil
>> o
=> [my object]
The #puts method is
available to every Ruby object (including main), and can be used to print strings to
standard output:
>> x = 128
=> 128
>> while x < 1000
     puts "x is #{x}"
     x = x * 2
   end
x is 128
x is 256
x is 512
=> nil
Method definitions can use the * operator to support
a variable number of arguments:
>> def join_with_commas(*words)
     words.join(', ')
   end
=> nil
>> join_with_commas('one', 'two', 'three')
=> "one, two, three"
A method definition can’t have more than one variable-length parameter, but normal parameters may appear on either side of it:
>> def join_with_commas(before, *words, after)
     before + words.join(', ') + after
   end
=> nil
>> join_with_commas('Testing: ', 'one', 'two', 'three', '.')
=> "Testing: one, two, three."
The * operator can also be used
to treat each element of an array as a separate argument when sending a
message:
>> arguments = ['Testing: ', 'one', 'two', 'three', '.']
=> ["Testing: ", "one", "two", "three", "."]
>> join_with_commas(*arguments)
=> "Testing: one, two, three."
And finally, * works in
parallel assignment too:
>> before, *words, after = ['Testing: ', 'one', 'two', 'three', '.']
=> ["Testing: ", "one", "two", "three", "."]
>> before
=> "Testing: "
>> words
=> ["one", "two", "three"]
>> after
=> "."
A block is a piece of Ruby code surrounded by do/end or
curly brackets. Methods can take an implicit block
argument and call the code in that block with the yield
keyword:
>> def do_three_times
     yield
     yield
     yield
   end
=> nil
>> do_three_times { puts 'hello' }
hello
hello
hello
=> nil
Blocks can also take arguments, which are supplied by yield:
>> def do_three_times
     yield('first')
     yield('second')
     yield('third')
   end
=> nil
>> do_three_times { |n| puts "#{n}: hello" }
first: hello
second: hello
third: hello
=> nil
yield returns the result of
executing the block:
>> def number_names
     [yield('one'), yield('two'), yield('three')].join(', ')
   end
=> nil
>> number_names { |name| name.upcase.reverse }
=> "ENO, OWT, EERHT"
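Putting these pieces together, here is a minimal sketch of a method that passes each value of an array to its block and collects whatever yield returns. The method name transform_each is hypothetical, invented for this illustration.

```ruby
# transform_each is a made-up helper: it yields every value to the
# caller's block and gathers the block's results into a new array.
def transform_each(values)
  results = []
  values.each do |value|
    results.push(yield(value))  # yield returns the block's result
  end
  results
end

doubled = transform_each([1, 2, 3]) { |number| number * 2 }
shouted = transform_each(['one', 'two']) { |word| word.upcase }
```

The method itself knows nothing about doubling or upcasing; the block supplies that behavior each time the method is called.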
Ruby has a built-in module called Enumerable
that’s included by Array, Hash, Range, and other classes that represent collections of values.
Enumerable provides helpful methods
for traversing, searching, and sorting collections, many of which expect
to be called with a block. Usually the code in the block will be run
against some or all values in the collection as part of whatever job the
method does. For example:
>> (1..10).count { |number| number.even? }
=> 5
>> (1..10).select { |number| number.even? }
=> [2, 4, 6, 8, 10]
>> (1..10).any? { |number| number < 8 }
=> true
>> (1..10).all? { |number| number < 8 }
=> false
>> (1..5).each do |number|
     if number.even?
       puts "#{number} is even"
     else
       puts "#{number} is odd"
     end
   end
1 is odd
2 is even
3 is odd
4 is even
5 is odd
=> 1..5
>> (1..10).map { |number| number * 3 }
=> [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]
It’s common for the block to take one argument and send it one message with no arguments, so Ruby provides a &:message shorthand as a more concise way of writing the block { |object| object.message }:
>> (1..10).select(&:even?)
=> [2, 4, 6, 8, 10]
>> ['one', 'two', 'three'].map(&:upcase)
=> ["ONE", "TWO", "THREE"]
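As a quick check that the shorthand and the full block really do the same job, the sketch below runs both forms side by side. It also uses Symbol#to_proc, which isn't mentioned in the text: one way to understand the shorthand is that & asks the symbol to turn itself into a proc. The variable names are hypothetical.

```ruby
# The full block and the &:message shorthand select the same values.
long_form  = (1..10).select { |number| number.even? }
short_form = (1..10).select(&:even?)

# A symbol can be converted into a proc that sends that message
# to whatever argument the proc is called with.
upcase_proc = :upcase.to_proc
upcased = upcase_proc.call('one')
```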
One of Enumerable’s methods, #flat_map, can be used to evaluate an array-producing block for every value in a collection and
concatenate the results:
>> ['one', 'two', 'three'].map(&:chars)
=> [["o", "n", "e"], ["t", "w", "o"], ["t", "h", "r", "e", "e"]]
>> ['one', 'two', 'three'].flat_map(&:chars)
=> ["o", "n", "e", "t", "w", "o", "t", "h", "r", "e", "e"]
Another useful method is #inject, which evaluates a block for every
value in a collection and accumulates a final result:
>> (1..10).inject(0) { |result, number| result + number }
=> 55
>> (1..10).inject(1) { |result, number| result * number }
=> 3628800
>> ['one', 'two', 'three'].inject('Words:') { |result, word| "#{result} #{word}" }
=> "Words: one two three"
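To see how #inject accumulates its result, this sketch records the arguments the block receives on each step while summing a small range. The steps array is a hypothetical addition used only to trace the calls.

```ruby
steps = []  # records [result, number] for every call to the block

total = (1..4).inject(0) do |result, number|
  steps.push([result, number])
  result + number  # becomes result for the next call
end

# steps is now [[0, 1], [1, 2], [3, 3], [6, 4]] and total is 10
```

The initial argument (0 here) is the first value of result, and the value returned by the final call to the block is what #inject returns.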
Struct is a special Ruby class whose job is to generate other classes. A class
generated by Struct contains getter
and setter methods for each of the attribute names passed into Struct.new. The conventional way to use a
Struct-generated class is to subclass
it; the subclass can be given a name, and it provides a convenient place
to define any additional methods. For example, to make a class called
Point with attributes called x and y, we
can write:
class Point < Struct.new(:x, :y)
  def +(other_point)
    Point.new(x + other_point.x, y + other_point.y)
  end

  def inspect
    "#<Point (#{x}, #{y})>"
  end
end
Now we can create instances of Point, inspect them in IRB, and send them
messages:
>> a = Point.new(2, 3)
=> #<Point (2, 3)>
>> b = Point.new(10, 20)
=> #<Point (10, 20)>
>> a + b
=> #<Point (12, 23)>
As well as whatever methods we define, a Point instance responds to the messages
x and x= to get and set the value of its x attribute, and similarly for y and y=:
>> a.x
=> 2
>> a.x = 35
=> 35
>> a + b
=> #<Point (45, 23)>
Classes generated by Struct.new have other useful
functionality, like an implementation of the equality method #==, which compares the
attributes of two Structs to see if they’re equal:
>> Point.new(4, 5) == Point.new(4, 5)
=> true
>> Point.new(4, 5) == Point.new(6, 7)
=> false
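To see what Struct's #== is saving us from, here is a sketch that compares a Struct-generated class with a hand-written one. Both class names are hypothetical, and the hand-written class uses attr_reader, initialize, and instance variables, which are ordinary Ruby features that the text hasn't covered.

```ruby
# A Struct-generated class compares instances attribute by attribute.
class StructPoint < Struct.new(:x, :y)
end

# A hand-written class inherits the default #==, which is only true
# when both sides are the very same object.
class PlainPoint
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

struct_equal = (StructPoint.new(4, 5) == StructPoint.new(4, 5))
plain_equal  = (PlainPoint.new(4, 5) == PlainPoint.new(4, 5))
```

With these definitions, struct_equal is true and plain_equal is false, even though both pairs of objects hold the same attribute values.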
New methods can be added to an existing class or module at any time. This is a powerful feature, usually called monkey patching, which lets us extend the behavior of existing classes:
>> class Point
     def -(other_point)
       Point.new(x - other_point.x, y - other_point.y)
     end
   end
=> nil
>> Point.new(10, 15) - Point.new(1, 1)
=> #<Point (9, 14)>
We can even monkey patch Ruby’s built-in classes:
>> class String
     def shout
       upcase + '!!!'
     end
   end
=> nil
>> 'hello world'.shout
=> "HELLO WORLD!!!"
Ruby supports a special kind of variable, called a constant, which should not be reassigned once it’s been created. (Ruby won’t prevent a constant from being reassigned, but it will generate a warning so we know we’re doing something bad.) Any variable whose name begins with a capital letter is a constant. New constants can be defined at the top level or within a class or module:
>> NUMBERS = [4, 8, 15, 16, 23, 42]
=> [4, 8, 15, 16, 23, 42]
>> class Greetings
     ENGLISH = 'hello'
     FRENCH = 'bonjour'
     GERMAN = 'guten Tag'
   end
=> "guten Tag"
>> NUMBERS.last
=> 42
>> Greetings::FRENCH
=> "bonjour"
Class and module names always begin with a capital letter, so class and module names are constants too.
When we’re exploring an idea with IRB it can be useful to ask Ruby to
forget about a constant altogether, especially if that constant is the
name of a class or module that we want to redefine from scratch instead
of monkey patching its existing definition. A top-level constant can be
removed by sending the remove_const message
to Object, passing the constant’s
name as a symbol:
>> NUMBERS.last
=> 42
>> Object.send(:remove_const, :NUMBERS)
=> [4, 8, 15, 16, 23, 42]
>> NUMBERS.last
NameError: uninitialized constant NUMBERS
>> Greetings::GERMAN
=> "guten Tag"
>> Object.send(:remove_const, :Greetings)
=> Greetings
>> Greetings::GERMAN
NameError: uninitialized constant Greetings
We have to use Object.send(:remove_const, :NAME) instead of just Object.remove_const(:NAME), because remove_const is a private method that ordinarily can only be called by sending a message from inside the Object class itself; using Object.send allows us to bypass this restriction temporarily.
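The difference between the two ways of calling remove_const can be sketched as follows. TEMPORARY is a hypothetical constant defined only for this example, and the begin/rescue construct for catching errors is a Ruby feature that the text doesn't cover.

```ruby
TEMPORARY = 42  # a made-up top-level constant to experiment with

# Calling the private method with an explicit receiver is rejected.
direct_call_failed =
  begin
    Object.remove_const(:TEMPORARY)
    false
  rescue NoMethodError
    true
  end

# Going through send bypasses the restriction; the removed
# constant's value is returned.
removed_value = Object.send(:remove_const, :TEMPORARY)
still_defined = Object.const_defined?(:TEMPORARY)
```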
[1] This style comes from the Smalltalk programming language, which had a direct influence on the design of Ruby.