Ruby Heredocs | Alchemists

Ruby Heredocs

Ruby heredocs — or here documents — are a nice way to embed multiple lines of text as a separate document in your source code while preserving line breaks, indentation, and other forms of whitespace. This frees you up from having to concatenate multiple lines of strings which can get cumbersome.

Heredocs originate from UNIX shells and are commonly found in shell scripting. Heredocs are not specific to the Ruby language, though. Other languages incorporate some form of this syntax as well.

For the purposes of this article, we’ll explore the heredoc syntax in Ruby only.

Table of Contents

Syntax
Advanced
- Arguments
- Messages
Conclusion

Syntax

In general, heredoc syntax consists of several parts:

A shovel operator (<<) to start the heredoc.
An optional dash (-) or tilde (~) to determine heredoc behavior.
The opening delimiter, of your choice, which denotes the beginning of your content.
The actual content which may be multiple lines.
The closing delimiter which is identical to your opening delimiter.

Here’s an example to illustrate the above:

<<-BODY
  A body
  of text.
BODY

Notice the use of << and - to start the heredoc while BODY is used to delimit the start and end of your content. I happened to use BODY in the above example but any delimiter — that is self describing — would work. For example, I could have used CONTENT, DEMO, TEXT, and so forth instead of BODY.

There are multiple forms of the heredoc syntax you can use (as hinted at above). Each is described in the sections below.

Basic

The basic heredoc syntax is <<, optionally followed by a dash (-), and then a delimiter of your choice (in the example below I use BODY).

content = <<-BODY
This is a multi-line snippet of Ruby code:

def print text
  puts text
end
BODY

puts content

# This is a multi-line snippet of Ruby code:
#
# def print text
#   puts text
# end

ℹ️ The dash is optional and has never affected string interpolation. Its only job is to allow the closing delimiter to be indented; without it (i.e. <<BODY), the closing delimiter must start at the beginning of the line. Interpolation is controlled by quoting the opening delimiter (more on this shortly).
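
To make the role of the dash concrete, here is a minimal sketch. The method names plain and dashed are hypothetical and exist only for illustration. With a bare <<, the closing delimiter has to sit in the first column, while <<- lets it be indented along with the surrounding code. Neither form strips indentation from the body, so both methods return the same string:

```ruby
# Hypothetical helper methods, used only to illustrate delimiter placement.
def plain
  # Bare <<: the closing delimiter must start at the beginning of the line.
  <<BODY
  A body
  of text.
BODY
end

def dashed
  # With the dash: the closing delimiter may be indented.
  <<-BODY
  A body
  of text.
  BODY
end

puts plain == dashed

# true
```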

Shortcuts

You can also use the %, %q, and %Q shortcuts. Here’s an example only using the % shortcut:

content = %(
This is a multi-line snippet of Ruby code:

def print text
  puts text
end
)

puts content

#
# This is a multi-line snippet of Ruby code:
#
# def print text
#   puts text
# end

Notice the above is nearly the same as when we used <<-. The only difference is a leading blank line because the content starts immediately after the opening parenthesis. Also, % is identical in behavior to %Q but with less typing. That said, use of %Q is no longer recommended since % is sufficient.

As for %q, this shortcut behaves like a single-quoted string, which means interpolation and most escape sequences are disabled. Like % and %Q, it also lets you use single and double quotes in the body without escaping them. Example:

content = %q(
This is an example with single and double quotes:

'single quotes'
"double quotes"
)

puts content

#
# This is an example with single and double quotes:
#
# 'single quotes'
# "double quotes"
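
The difference between % and %q is easiest to see with interpolation. The following is a small sketch, with variable names chosen only for illustration, showing that both shortcuts accept quotes in the body while only % interpolates:

```ruby
name = "Ruby"

# % behaves like a double-quoted string, so interpolation is enabled.
interpolated = %(Hello, #{name}! 'single' and "double" quotes are fine.)

# %q behaves like a single-quoted string, so interpolation is disabled.
literal = %q(Hello, #{name}! 'single' and "double" quotes are fine.)

puts interpolated
puts literal

# Hello, Ruby! 'single' and "double" quotes are fine.
# Hello, #{name}! 'single' and "double" quotes are fine.
```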

In truth, you’re better off sticking with the basic version of heredoc syntax since you can do the same thing without having to think about using %, %q, %Q or not. Example:

content = <<-CONTENT
This is an example with single and double quotes:

'single quotes'
"double quotes"
CONTENT

puts content

# This is an example with single and double quotes:
#
# 'single quotes'
# "double quotes"

Squiggly

The squiggly heredoc was introduced in Ruby 2.3.0 and is the superior syntax because it automatically removes leading indentation (specifically, the indentation of the least-indented non-blank line is removed from every line) so your heredoc is more readable. Example:

content = <<~BODY
  This is a multi-line snippet of Ruby code:

  def print text
    puts text
  end
BODY

puts content

# This is a multi-line snippet of Ruby code:
#
# def print text
#   puts text
# end

Notice how the above example uses two spaces of indentation for the entire body (i.e. the content between the two BODY delimiters). This is identical to the dashed examples shown earlier but improves readability while preventing the indentation from showing up in the output. This isn’t possible with the dashed or shortcut syntax and is why this is a good default to use for all heredoc syntax.
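
The benefit becomes more obvious once the heredoc lives inside nested code. Here is a minimal sketch, where the Demo module and its method names are hypothetical, comparing the dashed and squiggly forms with identical bodies:

```ruby
# Hypothetical module and method names, used only to show nesting.
module Demo
  def self.dashed
    # The dash only allows the closing delimiter to be indented.
    <<-BODY
      def print text
        puts text
      end
    BODY
  end

  def self.squiggly
    # The tilde also strips the common leading indentation from the body.
    <<~BODY
      def print text
        puts text
      end
    BODY
  end
end

puts Demo.dashed
puts Demo.squiggly

#       def print text
#         puts text
#       end
# def print text
#   puts text
# end
```

The dashed version keeps all six spaces of source indentation in the output while the squiggly version keeps only the relative indentation of the body.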

Interpolation

Variable interpolation is possible with any syntax discussed previously. For the purposes of these examples, we’ll stick with the squiggly syntax. To start, you can use variable interpolation as you would anywhere else in your code:

value = "A demo"

content = <<~CONTENT
  An example of variable interpolation:

  #{value}
CONTENT

puts content

# An example of variable interpolation:
#
# A demo

Notice the use of #{value} is correctly interpolated as "A demo". In situations where you don’t want interpolation you can use single quotes. Example:

content = <<~'CONTENT'
  An example of disabled variable interpolation:

  #{value}
CONTENT

puts content

# An example of disabled variable interpolation:
#
# #{value}

Notice that we only had to single quote the beginning 'CONTENT' delimiter. Should single quotes not be to your liking, you can escape the pound sign for the same effect. Example:

content = <<~CONTENT
  An example of variable interpolation disabled:

  \#{value}
CONTENT

puts content

# An example of variable interpolation disabled:
#
# #{value}

Notice by using \#, we were able to disable variable interpolation without using single quotes. Either way is fine but the former (i.e. single quotes) allows you to alter behavior without having to change the body of your heredoc should you need to pass in a variable later.
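
As a quick side-by-side, the sketch below (variable names are illustrative only) shows that a bare delimiter behaves the same as an explicitly double-quoted one, while a single-quoted delimiter leaves both interpolation and escape sequences such as \t untouched:

```ruby
value = "A demo"

# Bare delimiter: interpolation and escape sequences are processed.
bare = <<~CONTENT
  #{value}\tdone
CONTENT

# Double-quoted delimiter: same behavior as the bare delimiter.
double = <<~"CONTENT"
  #{value}\tdone
CONTENT

# Single-quoted delimiter: the body is taken literally.
single = <<~'CONTENT'
  #{value}\tdone
CONTENT

puts bare == double
puts single

# true
# #{value}\tdone
```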

Advanced

Now that basic heredoc syntax is understood, let’s discuss more advanced usage that can streamline your code and improve maintenance even further.

Arguments

You can pass a heredoc as an argument to a method. This is extremely handy in situations where it would be cumbersome to set a variable before messaging your method. Consider the following:

def repeat(text, max = 2) = max.times { puts text }

text = <<~CONTENT

  This is an example
  of a multiline
  heredoc repeated multiple times.
CONTENT

repeat text, 3

#
# This is an example
# of a multiline
# heredoc repeated multiple times.
#
# This is an example
# of a multiline
# heredoc repeated multiple times.
#
# This is an example
# of a multiline
# heredoc repeated multiple times.

Notice how we had to store the content of our heredoc in the text local variable. This definitely achieves our desired output but is cumbersome. We can do better by passing the heredoc delimiter in as the first argument. Here’s a refactor of the above example:

def repeat(text, max = 2) = max.times { puts text }

repeat <<~TEXT, 3

  This is an example
  of a multiline
  heredoc repeated multiple times.
TEXT

#
# This is an example
# of a multiline
# heredoc repeated multiple times.
#
# This is an example
# of a multiline
# heredoc repeated multiple times.
#
# This is an example
# of a multiline
# heredoc repeated multiple times.

Notice the opening TEXT delimiter is passed as the first argument while still supplying the second argument of 3. As long as we keep the closing TEXT delimiter, we don’t have to worry about using an additional variable to store heredoc content. This applies to any method argument regardless of position.
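
To illustrate the point about position, here is a minimal sketch using a hypothetical join_sections method which takes heredocs as its second and third arguments. When more than one heredoc starts on the same line, the bodies follow in the order the delimiters appear:

```ruby
# Hypothetical method, used only to show heredocs in different argument positions.
def join_sections(title, header, footer) = "#{title}\n#{header}#{footer}"

# The HEADER body comes first, followed by the FOOTER body.
result = join_sections "Report", <<~HEADER, <<~FOOTER
  Opening lines
  of the report.
HEADER
  Closing line.
FOOTER

puts result

# Report
# Opening lines
# of the report.
# Closing line.
```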

💡 For more information on method parameters and arguments, check out my Method Parameters And Arguments article.

Messages

You’re not limited to only using heredoc delimiters. You can pass additional messages to them. This is because heredocs are strings which means you can send any string-related message. The most common use case is stripping whitespace. Example:

puts <<~TEXT.strip

  This is a multiline demonstration
  with additional messages.

TEXT

# This is a multiline demonstration
# with additional messages.

Notice the leading and trailing whitespace was removed by passing the #strip message. You’re not limited to a single message since you can chain them as well. Example:

puts <<~TEXT.strip.upcase

  This is a multiline demonstration
  with additional messages.

TEXT

# THIS IS A MULTILINE DEMONSTRATION
# WITH ADDITIONAL MESSAGES.

This time we stripped the leading and trailing whitespace and uppercased the entire message. Nice.

💡 While powerful, chaining multiple messages shouldn’t be abused. As a rule of thumb, one to two chained messages should be the limit. You also don’t want to end up chaining across multiple lines as the readability and maintenance of your code will degrade.

Taking this a step further, you can use regular expressions in conjunction with your heredoc. Consider the following (albeit slightly subtle):

puts <<~CONTENT.gsub(/^(?=\w)/, "  ")
  puts "Running setup..."

  puts "Configuring databases..."
  Runner.call "bin/hanami db setup"
CONTENT

#   puts "Running setup..."
#
#   puts "Configuring databases..."
#   Runner.call "bin/hanami db setup"

With the above, we have a regular expression that looks for any line that begins (i.e. ^) with a word character via a look ahead expression (i.e. (?=\w)). This ensures you can indent all lines, by two spaces, that begin with a word while skipping lines that don’t. In this case, the blank line between two puts doesn’t need indentation. This might not seem all that significant but is important for testing purposes when you need to insert heredoc content within a file where you don’t want to see blank lines unnecessarily indented. Thankfully, regular expressions make this simple to achieve.
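
To see why the look ahead matters, the following sketch compares it with a naive pattern that matches the start of every line. The variable names naive and selective are illustrative only:

```ruby
content = <<~CONTENT
  puts "Running setup..."

  puts "Configuring databases..."
CONTENT

# Matches the start of every line, including the blank one.
naive = content.gsub(/^/, "  ")

# Matches only the start of lines which begin with a word character.
selective = content.gsub(/^(?=\w)/, "  ")

p naive.lines[1]
p selective.lines[1]

# "  \n"
# "\n"
```

The naive pattern leaves two trailing spaces on the blank line, which is the kind of stray whitespace the look ahead avoids.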

Conclusion

Heredocs are powerful when used judiciously. Hopefully, this look at heredocs has leveled you up so you can apply these techniques to your own code. Enjoy!