Ruby Basics – Data Types

This is the first in a new series on beginning development with Ruby. Each entry in the series will cover a different basic programming concept, starting with data types. All modern programming languages have a defined set of data types, most of which are shared between languages. Part of what sets Ruby apart from .Net languages like C# and VB.Net is that every value, even a number or true, is a full object with methods of its own.

The data types covered in this entry are:

  • Boolean
  • Number
  • String
  • Array
  • Hashes
  • Symbols

Boolean

I want you to remember back to when you were in elementary school passing the pretty girl a note asking her if she like likes you, check yes or no. Boolean values represent that yes or no. A boolean is a flag that holds either true or false. Even though it holds only true or false, you see boolean values everywhere. Anywhere you see only two options (true or false, yes or no, black or white, up or down), the choice can be represented by a boolean. Ruby has no class actually named Boolean: true is the only instance of TrueClass and false is the only instance of FalseClass.
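
As a minimal sketch of the note example, the snippet below treats the answer as a boolean flag. The method name checked_box is hypothetical and exists only for this illustration.

```ruby
# true and false are objects like everything else in Ruby.
puts true.class    # prints TrueClass
puts false.class   # prints FalseClass

# Hypothetical helper: turns the boolean flag into the box that gets checked.
def checked_box(likes_you)
  likes_you ? "yes" : "no"
end

puts checked_box(true)    # prints yes
puts checked_box(false)   # prints no

# Comparisons produce booleans too.
puts 3 > 2                # prints true
```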

For more information on booleans, check out the Ruby Documentation here and here.

Number

The two most commonly used types of numbers are integers and floats.

Integers are whole numbers, that is, numbers that do not have a decimal place or fraction. Integers can be either negative or positive. So when someone asks you what the square root of nine is, you can represent both answers with an integer: 3 and -3. I still remember that number chart above the chalkboard in school that was used to teach how to add and multiply negative and positive numbers together.

Floats are the closest equivalent to Double (not Decimal) in .Net and hold numbers with a decimal place. They are often used to represent money, like the price of gas that never seems to be a whole penny value, and evil fractions in their decimal form: 2 3/4 would be represented by 2.75. Dividing an integer by a float, or a float by an integer, produces a float value.
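
The following is a small sketch of how integers and floats behave together. The helper mixed_to_float is a hypothetical name used only for this example, and the last line is included as a reminder that floats are stored in binary and can be approximate.

```ruby
# Hypothetical helper: converts a mixed fraction such as 2 3/4 to a float.
def mixed_to_float(whole, numerator, denominator)
  whole + numerator / denominator.to_f
end

puts mixed_to_float(2, 3, 4)   # prints 2.75

# Integer divided by integer drops the remainder and stays an integer.
puts 11 / 4                    # prints 2
# Mixing an integer with a float gives a float.
puts 11 / 4.0                  # prints 2.75

# Integers can be negative; both square roots of nine are integers.
roots = [3, -3]
puts roots.map { |root| root * root }.inspect   # prints [9, 9]

# Floats are stored in binary, so some decimal values are only approximate.
puts 0.1 + 0.2 == 0.3          # prints false
```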

Check out the Ruby Documentation on integers here.
Check out the Ruby Documentation on floats here.
Check out the Ruby Documentation on rationals here.

String

Strings are sequences of zero or more characters: letters, numbers, and punctuation. Most of what is presented to the user is a collection of strings with other data types sprinkled in here and there. These characters are enclosed by either single or double quotes. Double quotes allow for string interpolation and escaped characters. An escaped character is a character preceded by a backslash that represents a special meaning or action, like \n for newline. Single quotes keep those backslash sequences as they are and print them with the other characters in the string.

"Hello \nBacon"
= Hello
= Bacon

'Hello \nBacon'
= Hello \nBacon
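
String interpolation is not shown in the output above, so here is a brief sketch of it next to the single-quoted behavior. The method name greeting is hypothetical and only serves the example.

```ruby
# Hypothetical helper: builds a greeting with string interpolation.
def greeting(name)
  "Hello #{name}"
end

puts greeting("Bacon")        # prints Hello Bacon

# Single quotes leave both #{...} and \n exactly as typed.
puts 'Hello #{name}\n'        # prints Hello #{name}\n

# The escape is one character in double quotes and two in single quotes.
puts "a\nb".length            # prints 3
puts 'a\nb'.length            # prints 4
```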

For more information on strings, check out the Ruby Documentation.

Array

An array is an ordered collection of values of any type, unlike languages like C and C# where the values in an array all have to be of the same type. Values are stored and retrieved by their index, which is zero based. A zero based index starts at zero instead of one and increments from there, so an index of 9 refers to the 10th value in an array. Every value has a unique index. You can think of an array as a key value collection in which the index is the key and the item you add is the value.
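
A short sketch of the zero based indexing described above. The helper item_at_position is a hypothetical name introduced only to show the off-by-one relationship between a position and an index.

```ruby
# An array can hold values of different types side by side.
mixed = [1, "two", 3.0, :four, true]
puts mixed[0]        # prints 1
puts mixed[1]        # prints two

# Hypothetical helper: fetches the item at a one-based position.
def item_at_position(values, position)
  values[position - 1]
end

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
puts numbers[9]                        # prints 10
puts item_at_position(numbers, 10)     # prints 10

# Asking for an index that does not exist returns nil instead of raising.
puts numbers[10].inspect               # prints nil

# Appending gives the new value the next free index.
numbers << 11
puts numbers[10]                       # prints 11
```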

For more information on arrays, check out the Ruby Documentation.

Hashes

Similar to arrays, hashes store a collection of values in a key and value fashion. Unlike arrays, hashes do not assign an index to the value you add, but instead use a value you provide as the key.

To illustrate:
Array[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
Key = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Value = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Hash["FirstName" => "Mark", "LastName" => "Brown", "Status" => "Bacon"]:
Key = "FirstName", "LastName", "Status"
Value = "Mark", "Brown", "Bacon"
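
The same hash can be written with the more common curly bracket literal and read back by key. This is only a sketch; full_name is a hypothetical helper, not part of Ruby.

```ruby
person = { "FirstName" => "Mark", "LastName" => "Brown", "Status" => "Bacon" }

# Values are looked up by the key you supplied, not by a numeric index.
puts person["FirstName"]     # prints Mark
puts person.keys.inspect     # prints ["FirstName", "LastName", "Status"]
puts person.values.inspect   # prints ["Mark", "Brown", "Bacon"]

# Hypothetical helper: joins two values from the hash.
def full_name(person)
  person["FirstName"] + " " + person["LastName"]
end

puts full_name(person)       # prints Mark Brown

# Assigning to an existing key replaces its value.
person["Status"] = "Sausage"
puts person["Status"]        # prints Sausage

# A key that was never added returns nil.
puts person["Age"].inspect   # prints nil
```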

For more information on hashes, check out the documentation at Ruby Documentation.

Symbols

The best description I have found for Symbols is the one given in Why’s (Poignant) Guide. It states that symbols are lightweight strings that are generally used when the string value will not be printed. I have to admit I am not 100% sure of what these are. Are they pointers or the equivalent of references in .Net? Or are they a class, like String, but with limited functionality, like a beer with half the calories? Feel free to leave a comment if you would like to help straighten this out for me.
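
A quick experiment gives at least a partial answer. The sketch below suggests that a symbol is an object of its own class, Symbol, and that the same symbol is always the same object. The helper same_object? is a hypothetical name used only here.

```ruby
# A symbol is an object of class Symbol.
puts :status.class                    # prints Symbol

# Hypothetical helper: true only when both arguments are the very same object.
def same_object?(first, second)
  first.equal?(second)
end

puts same_object?(:status, :status)   # prints true

# Symbols and strings convert back and forth.
puts :status.to_s                     # prints status
puts "status".to_sym.inspect          # prints :status

# Symbol offers no methods that change it in place; String does.
puts :status.respond_to?(:upcase!)    # prints false
puts "status".respond_to?(:upcase!)   # prints true
```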

For more information on symbols, check out the documentation at Ruby Documentation.

Comments

  1. The main difference between Strings and Symbols is that if the same string is used twice in your code, there will be two different instances of a String object while with Symbols there will only be one. Therefore Symbols are often used as hash keys and other internal stuff. Consider this example:

    1.9.3-p125 :001 > "Test".object_id
    => 14881040
    1.9.3-p125 :002 > "Test".object_id
    => 12367040
    1.9.3-p125 :003 > :Test.object_id
    => 461708
    1.9.3-p125 :004 > :Test.object_id
    => 461708

    As you can see, :Test and :Test have the same object_id and so are the same object. "Test".object_id will give you a new ID every time you run it.

    Just another thing about Hashes: hash literals in Ruby are normally written with curly brackets: {"FirstName" => "Mark", "LastName" => "Brown", "Status" => "Bacon"}. To reference one item in the Hash, use square brackets as for Arrays.

  2. Symbols are essentially interned strings. You may know that in Java (and C# too, IIRC), all strings which appear as literals within a class file will be automatically interned. That is, if I write:

    class MyClass {
        public void doStuff() {
            System.out.println("Hello");
        }
    }

    The string "Hello" will be 'interned': one String object will be created for it, and any other occurrences of the string literal "Hello" in my code will point to the same String object.

    So far, so technical. Why should we care about string interning when the language and runtime manage the memory? Well, it turns out that having some string-like values which get interned is a very common pattern in programming. For example, we might want to represent the possible states of an order in a human-readable way like "pending", "processing", "dispatched", "complete", but not pay any performance penalty. Using interned strings allows this.

    Ruby doesn’t perform any automatic string interning, but instead provides symbols for use in these cases. So we use them when we want human-readable values but we’re only communicating between Ruby objects (and, in another way, to the programmer reading our code).

    I hope that makes some sense?

  3. For me the main distinction between Strings and Symbols is that Strings are mutable (quite a surprise if you come from the Java world) and Symbols are not.
    Immutability is by itself a very deep and important concept, but it’s also closely related to another concept: Identity and Value. Conceptually you can think of Strings as mainly values and Symbols as identifiers. Symbols are so often used as keys in Hashes that a new, shortened hash syntax was introduced in Ruby 1.9.
    But then come some other Ruby-specific features, such as method invocation shorthands etc.

  4. It’s probably unwise to use floats for money, as they can lose precision when you do arithmetic with them; they aren’t exactly exact values. See BigDecimal instead.

  5. Here is another angle: Symbols are just integers with prettier names. :)

    In Ruby, integers are kind of like constants. Say you have the integer 1 in multiple places in your program: they all refer to the same exact object in memory (they have the same object_id). This is not the case for a string like "foo".

    Symbols work exactly the same way. A symbol :foo carries exactly the same object_id wherever it goes.

  6. Mark Brown says:

    I would like to thank everyone who posted a comment. I have a much better understanding of what symbols are now, and some great suggestions to include in the next part of the series.

  7. Very good. Thanks

  8. I must respectfully suggest that it’s incorrect to say floats hold numbers with a decimal place. Really—and look out, I’m getting pretty esoteric here—they hold numbers with a “radix point,” in this case in base-two. One shouldn’t rely on floats to hold accurate fractions of 10 or 100, which is what the “deci-” in “decimal” indicates. This is one of my favorites:

    "%.20f" % 0.1 # (0.1 to float to decimal string with 20 places)
    => "0.10000000000000000555"

    I’d additionally mention the BigDecimal class: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/bigdecimal/rdoc/BigDecimal.html

  9. A Symbol object is immutable. An immutable object cannot be changed after it is created; a variable that holds it can only be reassigned to a different object. A mutable object, by contrast, can be changed in place after assignment.
