Chapter 1
Introduction to Ruby for AI Programming

There are many good books and online tutorials on Ruby, so this short chapter concentrates on just what you will need for this book. It briefly introduces Ruby features such as collections, strings, defining classes, and I/O. I assume that you have Ruby installed and that you have an irb session open while working through the examples in this chapter.

Note to readers: this chapter will be expanded as I write the rest of the book. This chapter will only document the specific features of the Ruby language used in this book and should not be considered a complete introduction.

1.1 Classes

Ruby is an object-oriented programming language: everything is an object, even primitive types like integers.

Figure 1.1 is a UML class diagram of commonly used Ruby data classes.



Figure 1.1: UML Class Diagram showing commonly used Ruby data classes

Given any Ruby object, it is easy to interactively discover its class:

 
x = "123"
puts "The class is: #{x.class}"
puts x.methods[0..5] # just print the first 6
puts "To get information on a method like 'inject', use: ri inject"

Produces the following output:

 
The class is: String 
send 
% 
index 
collect 
[]= 
inspect 
To get information on a method like 'inject', use: ri inject

The ri program is run from the command line (for example, ri Array#inject) to print documentation on methods or classes. Being able to quickly check which methods are available for a class or object, together with the ri utility, makes it easy to get started using Ruby.
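
Because even integers are objects, the same introspection calls work on them. The following is a minimal sketch that uses only the core methods class, is_a?, and respond_to?; the helper name describe is hypothetical and introduced only for this illustration:

```ruby
# describe is a hypothetical helper, not part of Ruby or of this book's code.
def describe(obj)
  "#{obj.inspect} is a #{obj.class}"
end

puts describe(42)       # integers are objects too
puts describe("123")
puts describe([1, 2])

puts 42.respond_to?(:+)          # + is an ordinary method on integers
puts "123".respond_to?(:upcase)  # strings have an upcase method
puts "123".is_a?(String)
puts "123".is_a?(Comparable)     # String mixes in the Comparable module
```

The class name printed for 42 depends on the Ruby version, so try this in your own irb session.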

One powerful feature of Ruby is that all classes are "open": you may add methods to any class, including the built-in ones. As a contrived example, we will add a method my_double to the standard String class:

 
class String 
  def my_double 
     self + self 
  end 
end 
 
puts "testing 123".my_double

Note the use of the self keyword here: self refers to the object on which the method is being called. This new method now works for any string object:

 
testing 123testing 123
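
Defining a class of your own uses the same class and def keywords. The following is a minimal sketch rather than code from the rest of this book; the class name Word and its methods are hypothetical and chosen only for illustration:

```ruby
# Word is a hypothetical class used only for illustration.
class Word
  attr_reader :text    # generates a reader method named text

  def initialize(text) # called by Word.new
    @text = text       # instance variable names start with @
  end

  def length
    @text.length
  end

  def to_s             # puts uses to_s to print an object
    "Word(#{@text})"
  end
end

w = Word.new("cat")
puts w.text
puts w.length
puts w
```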

1.2 Arrays

Ruby arrays can grow dynamically as needed and array elements can be any type of Ruby object.

 
# example comment. Following line creates an array:
x = [1, 2, 3, 'cat', "dog"]
x.each {|element|  # local variable element is assigned each value in x
  puts "next element in array is: #{element}"
}
z = x.collect {|e| e + e}
puts z

Produces the following output:

 
next element in array is: 1 
next element in array is: 2 
next element in array is: 3 
next element in array is: cat 
next element in array is: dog 
2 
4 
6 
catcat 
dogdog

Ruby defines operators for arrays in a natural way. For example, you can use the - operator to "subtract" the elements of one array from another:

 
a1 = [1,4,66,2,99] 
a2 = [4,99] 
puts (a1 - a2) 
puts 
puts (a1 - a2).sort

Produces the following output:

 
1 
66 
2 
 
1 
2 
66
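
Other array operators follow the same pattern. As a small sketch (the helper name combine is hypothetical), + concatenates two arrays, & keeps the elements common to both, and | forms a union without duplicates:

```ruby
# combine is a hypothetical helper that returns three results at once.
def combine(a1, a2)
  [a1 + a2,   # concatenation keeps duplicates
   a1 & a2,   # intersection
   a1 | a2]   # union with duplicates removed
end

concat, common, union = combine([1, 4, 66, 2, 99], [4, 99, 7])
puts concat.inspect  # [1, 4, 66, 2, 99, 4, 99, 7]
puts common.inspect  # [4, 99]
puts union.inspect   # [1, 4, 66, 2, 99, 7]
```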

We have been using puts to print values to standard output (later examples also use pp, which pretty-prints Ruby objects). These work well if you are not printing too many items; the following code would print available methods one per line:

 
puts ([1].methods - Object.methods).sort

The method Array#join (this notation means method join of class Array) converts each element of an array to a string and concatenates these strings together using any "join string" that you supply. Here, we generate a string with available method names separated by ", ":

 
print ([1].methods - Object.methods).sort.join(', ')

This code produces the following output:

 
&, *, +, -, <<, [], []=, all?, any?, assoc, at, clear, collect, collect!, compact, compact!, concat, delete, delete_at, delete_if, detect, each, each_index, each_with_index, empty?, entries, fetch, fill, find, find_all, first, flatten, flatten!, grep, index, indexes, indices, inject, insert, join, last, length, map, map!, max, member?, min, nitems, pack, partition, pop, push, rassoc, reject, reject!, replace, reverse, reverse!, reverse_each, rindex, select, shift, size, slice, slice!, sort, sort!, sort_by, to_ary, transpose, uniq, uniq!, unshift, values_at, zip, |

1.3 Strings

You can create string literals using either double or single quotes. Strings created with single quotes are slightly simpler for Ruby to process but do not support embedded formatting, that is, the #{...} substitution used in double-quoted strings. Here are a few examples:

 
# String examples
x = 3.14159
y = 'The cat ran quickly'
puts "Variable substitution in a string: #{x} #{y}"

# examples for string 'slices':
s = "0123456789"
puts s[0] # character value at index 0 (Ruby 1.8 prints the code 48; Ruby 1.9+ prints "0")
puts s[0..0] # string value of single character at index 0
puts s[2..4] # include last index
puts s[2...4] # up to last index
puts s[-1..-1] # last character
puts "datafile.xml"[-4..-1] # last 4 characters

This code produces the following output:

 
Variable substitution in a string: 3.14159 The cat ran quickly 
48 
0 
234 
23 
9 
.xml
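
The difference between the two quoting styles, and a slice form that behaves the same in every Ruby version, can be sketched as follows. The helper name last_chars is hypothetical and used only for this illustration:

```ruby
x = 3.14159
puts "x is #{x}"   # double quotes: the expression is substituted
puts 'x is #{x}'   # single quotes: the text is printed literally

# s[start, length] returns a string in all Ruby versions
s = "0123456789"
puts s[0, 1]       # the one-character string "0"
puts s[2, 3]       # 3 characters starting at index 2

# last_chars is a hypothetical helper built on negative indices.
def last_chars(str, n)
  str[-n..-1]
end

puts last_chars("datafile.xml", 4)
```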

1.4 Code Blocks

One of the most powerful features of the Ruby language is the ability to define code blocks. The following listing shows two examples of passing code blocks, defined inside curly braces, to the methods each and collect:

 
[1,2,44,99,"cat"].each {|n|  puts "n = #{n}"}
 
require 'pp' 
pp [1,2,44,99,"cat"].collect {|x| x+x}

This code produces the following output:

 
n = 1 
n = 2 
n = 44 
n = 99 
n = cat 
[2, 4, 88, 198, "catcat"]

Blocks are in effect "anonymous" methods: they do not require a name because they are used in exactly one place in your code.
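
Blocks can also take more than one argument. As a small sketch, the inject method mentioned earlier passes a running result and the next element to its block; the helper name sum_of_squares is hypothetical:

```ruby
# sum_of_squares is a hypothetical helper used only for illustration.
def sum_of_squares(numbers)
  # sum starts at 0; the block's value becomes sum for the next element
  numbers.inject(0) {|sum, n| sum + n * n}
end

puts sum_of_squares([1, 2, 3])  # 1 + 4 + 9 = 14

# select keeps the elements for which the block returns true
evens = [1, 2, 44, 99].select {|n| n % 2 == 0}
puts evens.inspect              # [2, 44]
```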

You can also write your own methods that accept code blocks. In the following example, we check whether a code block was provided and, if so, execute it:

 
def test name
  puts "Code block test: argument: #{name}"
  val = yield(name) if block_given?  # yield executes an external code block
  puts "After executing an optional code block, val=#{val}"
end

test("an argument")
puts
test("an argument to code block") {|x| x.upcase}

This code produces the following output:

 
Code block test: argument: an argument 
After executing an optional code block, val= 
 
Code block test: argument: an argument to code block 
After executing an optional code block, val=AN ARGUMENT TO CODE BLOCK
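
A method may call yield more than once, which is how you write your own iterators. The following is a minimal sketch; the method name each_adjacent_pair is hypothetical and not part of Ruby:

```ruby
# each_adjacent_pair is a hypothetical iterator: it yields every element
# together with its right-hand neighbor.
def each_adjacent_pair(items)
  (items.length - 1).times do |i|
    yield items[i], items[i + 1]
  end
end

each_adjacent_pair(["the", "dog", "ran"]) {|a, b| puts "#{a} #{b}"}
```

This prints "the dog" and then "dog ran". The same idea of walking over adjacent words appears again in the n-gram example in the section on hash tables.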

1.5 Iterators, the Enumerable Mixin, and Blocks

Ruby supports Mixins to add behavior to classes. Enumerable is used to add useful behavior to collections for testing the contents of collections, iterating through collection elements, searching for specific elements, partitioning collections into two disjoint sets, and for sorting.

We saw the use of two Ruby iterators in the last section: each and collect. In both examples, these iterators were passed a code block. The combination of iterators and blocks, which we will explore further in this section, allows us to write code that is shorter and more readable than in languages like Java that support a variety of iterators but not code blocks.

Iterators allow us to perform operations on each element of a collection without having to write code that deals with the type of the collection or the types of the objects in it. The following long example shows the techniques used in the rest of this book:

 
require 'pp' 
 
data = ["the", "dog", "ran", "after", "a", "black", "cat"] 
 
data.each_with_index {|element, index| puts "word #{index}: #{element}"}

result = data.detect {|x| x.length > 3} # find first occurrence
puts "First word with length > 3: #{result}"

less, greater = data.partition {|w| w.length <= 3} # partition into 2 sets
pp "Elements with length equal or less than 3:", less,
   " and elements with length greater than 3:", greater

result = data.reject {|x| x.length == 3}  # remove elements
pp "Data with words of length equal to 3 removed: ", result

pp "Data with words sorted in order: ", data.sort
pp "Data with words sorted by word length: ", data.sort_by {|w| w.length}

# to_a works in all Ruby versions; collect without a block returns an Enumerator in Ruby 1.9+
pp "Short cut for specifying an enumeration and converting to an array: ", (1..7).to_a

This code produces the following output:

 
word 0: the 
word 1: dog 
word 2: ran 
word 3: after 
word 4: a 
word 5: black 
word 6: cat 
First word with length > 3: after 
"Elements with length equal or less than 3:" 
["the", "dog", "ran", "a", "cat"] 
" and elements with length greater than 3:" 
["after", "black"] 
"Data with words of length equal to 3 removed: " 
["after", "a", "black"] 
"Data with words sorted in order: " 
["a", "after", "black", "cat", "dog", "ran", "the"] 
"Data with words sorted by word length: " 
["a", "ran", "dog", "the", "cat", "black", "after"] 
"Short cut for specifying an enumeration and converting to an array: " 
[1, 2, 3, 4, 5, 6, 7]
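
To use the Enumerable mixin described at the start of this section in your own class, include the module and define an each method; Enumerable builds its other methods on top of each. The following is a minimal sketch in which the class name WordList is hypothetical:

```ruby
# WordList is a hypothetical collection class used only for illustration.
class WordList
  include Enumerable   # mix in detect, sort, include?, partition, ...

  def initialize(*words)
    @words = words
  end

  # Enumerable only requires that each yields every element in turn
  def each
    @words.each {|w| yield w}
  end
end

list = WordList.new("the", "dog", "ran", "after", "a", "black", "cat")
puts list.detect {|w| w.length > 3}  # after
puts list.sort.inspect
puts list.include?("dog")            # true
```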

1.6 Hash tables

Hash tables implement associative memory: hash keys map into values associated with each key. Both hash keys and values can be arbitrary Ruby objects: numbers, strings, arrays of simple values, etc.

 
require 'pp' 
h = {} 
h["cat"] = "dog" 
h["data"] = [1, 2, "fish"] 
pp h 
pp h["cat"] 
pp h["data"] 
pp h.keys 
pp h.values 
pp h["nokey"]

This code produces the following output:

 
{"cat"=>"dog", "data"=>[1, 2, "fish"]} 
"dog" 
[1, 2, "fish"] 
["cat", "data"] 
["dog", [1, 2, "fish"]] 
nil
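
Hashes can also be iterated with a block that receives each key and value, and tested for the presence of a key. The following is a small sketch using the same data; the helper name describe_hash is hypothetical:

```ruby
# describe_hash is a hypothetical helper that formats each key/value pair.
def describe_hash(h)
  lines = []
  h.each {|key, value| lines << "#{key} => #{value.inspect}"}
  lines
end

h = {"cat" => "dog", "data" => [1, 2, "fish"]}
puts describe_hash(h)

puts h.key?("cat")                    # true
puts h.key?("no key")                 # false
puts h.fetch("no key", "a fallback")  # fetch returns its second argument for a missing key
```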

Here is another example that shows how to define a default value for a hash table: requesting the value for a key that is not in the hash table produces the default value.

 
require 'pp' 
 
test_string = "The boy chased the dog down the street because the dog had bit his cat and because the dog peed in the house."
 
def words text  # utility to extract words from a string 
  text.downcase.scan(/[a-z]+/) 
end 
 
words = words(test_string) 
 
one_grams = Hash.new(0) # hash returns a value of zero if key not present 
bi_grams = Hash.new(0) 
tri_grams = Hash.new(0) 
 
num = words.length 
num.times {|i| 
  one_grams[words[i]] += 1 
  if i < (num - 1) 
     bi = words[i] + ' ' + words[i+1]
     bi_grams[bi] += 1
     if i < (num - 2)
        tri = bi + ' ' + words[i+2]
        tri_grams[tri] += 1 
     end 
  end 
} 
 
# example of sorting a hash: the <=> operator returns -1,0,1 depending on 
# a comparison being less than, equal, or greater than: 
bb = one_grams.sort{|a,b| b[1] <=> a[1]} 
puts "Sorting a hash table produces an array of sub-arrays:"
pp bb 
 
puts 
puts "++ one_grams:"
bb.each {|x| puts "#{x[0]} : #{x[1]}"}
 
puts 
puts "++ bi_grams:"
bb = bi_grams.sort{|a,b| b[1] <=> a[1]}
bb.each {|x| puts "#{x[0]} : #{x[1]}"}
 
puts 
puts "++ tri_grams:"
tt = tri_grams.sort.sort{|a,b| b[1] <=> a[1]}
tt.each {|x| puts "#{x[0]} : #{x[1]}"}

This code produces the following output:

 
Sorting a hash table produces an array of sub-arrays: 
[["the", 6], 
 ["dog", 3], 
 ["because", 2], 
 ["down", 1], 
 ["boy", 1], 
 ["peed", 1], 
 ["bit", 1], 
 ["and", 1], 
 ["house", 1], 
 ["in", 1], 
 ["chased", 1], 
 ["had", 1], 
 ["his", 1], 
 ["street", 1], 
 ["cat", 1]] 
 
++ one_grams: 
the : 6 
dog : 3 
because : 2 
down : 1 
boy : 1 
peed : 1 
bit : 1 
and : 1 
house : 1 
in : 1 
chased : 1 
had : 1 
his : 1 
street : 1 
cat : 1 
 
++ bi_grams: 
the dog : 3 
because the : 2 
bit his : 1 
the street : 1 
chased the : 1 
the boy : 1 
his cat : 1 
in the : 1 
cat and : 1 
dog had : 1 
the house : 1 
peed in : 1 
dog peed : 1 
had bit : 1 
dog down : 1 
and because : 1 
down the : 1 
street because : 1 
boy chased : 1 
 
++ tri_grams: 
because the dog : 2 
had bit his : 1 
bit his cat : 1 
boy chased the : 1 
cat and because : 1 
chased the dog : 1 
dog down the : 1 
dog had bit : 1 
dog peed in : 1 
down the street : 1 
and because the : 1 
his cat and : 1 
in the house : 1 
peed in the : 1 
street because the : 1 
the boy chased : 1 
the dog down : 1 
the dog had : 1 
the dog peed : 1 
the street because : 1
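
The three counting loops above differ only in how many adjacent words they join. As an alternative sketch, not the approach used in the listing above, Enumerable#each_cons yields every run of n consecutive elements, so one method can count n-grams of any length. The name ngram_counts is hypothetical:

```ruby
# ngram_counts is a hypothetical helper: it counts every run of n adjacent words.
def ngram_counts(words, n)
  counts = Hash.new(0)   # missing keys start at zero, as in the listing above
  words.each_cons(n) {|gram| counts[gram.join(' ')] += 1}
  counts
end

text = "The boy chased the dog down the street because the dog had bit his cat and because the dog peed in the house."
words = text.downcase.scan(/[a-z]+/)

puts ngram_counts(words, 1)["the"]              # 6
puts ngram_counts(words, 2)["the dog"]          # 3
puts ngram_counts(words, 3)["because the dog"]  # 2
```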

1.7 File I/O

File I/O is simple in Ruby. The following example shows how to write strings to a file, read them back, and delete the test file:

 
out = File.new('temp2.txt','w') 
out.puts("This is a test")
out.puts("another line of text...")
out.close 
 
open('temp2.txt') do |fd| 
  fd.each do |line| 
     puts line