Ruby and inject

One of the interesting features of Ruby is that it offers some simple functional-programming-style methods for data manipulation. One of these is the inject method. The Ruby documentation describes it as follows:

Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.

The actual signatures look like this:

enum.inject(initial, sym) → obj
enum.inject(sym) → obj
enum.inject(initial) {| memo, obj | block } → obj
enum.inject {| memo, obj | block } → obj

Now the explanation may seem a bit terse to the newcomer, so it helps to look at an example use of inject to see what it does.
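As a minimal sketch (the numbers, the words and the total_length helper are made up for illustration), each of the four call forms listed above can be tried on a small array:

```ruby
numbers = [1, 2, 3, 4]

# Symbol only: the first element serves as the starting memo
puts numbers.inject(:+)                         # prints 10

# Initial value plus symbol: start from 10 instead
puts numbers.inject(10, :+)                     # prints 20

# Block only: memo is the running result, obj the current element
puts numbers.inject { |memo, obj| memo * obj }  # prints 24

# Initial value plus block, wrapped in a hypothetical helper
def total_length(words)
  words.inject(0) { |memo, word| memo + word.length }
end

puts total_length(%w[apple fig])                # prints 8
```

In every form the value returned by one step becomes memo for the next step, and the last memo is the overall result.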

Ruby and ARGF

One of the hidden gems (pun completely intended) of Ruby is the ability to work with a list of files given as command line arguments, or with input passed in through STDIN redirection. This can be achieved through the use of the ARGF class. Interestingly enough, it also has a method to get the command line arguments as an array:

> ruby argf.rb -m test -t something LICENSE README
["-m", "test", "-t", "something", "LICENSE", "README"]
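The script behind that output is not shown. Presumably it boils down to something like the following sketch, which assumes the method in question is ARGF.argv:

```ruby
# argf.rb (assumed contents): print the command line arguments as an array.
# ARGF.argv returns the same arguments that are available through ARGV.
p ARGF.argv
```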

This method doesn’t show the true power of the ARGF class, though. A simple code snippet will help explain what can be done with it:

ARGF.each { | line |
  puts line
}

Now if this script is run with a list of files as arguments:

>ruby argf.rb LICENSE README
Copyright (c) 2011-2012 Onteria <onteria@live.jp>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# RubbySnippets

This is random ruby code I write to learn how ruby works. They might do useful things, they might not. Some of this is meant to supplement api / standard library documentation that's lacking in sample code.

The contents of the listed files are shown, which in a way does the same thing as cat. STDIN can be used as well:

>ruby argf.rb < README
# RubbySnippets

This is random ruby code I write to learn how ruby works. They might do useful things, they might not. Some of this is meant to supplement api / standard library documentation that's lacking in sample code.

Now what it can’t do is handle a mix of files and arguments:

>ruby argf.rb -l arg1 -z arg2 LICENSE README
argf.rb:5:in `each': No such file or directory - -l (Errno::ENOENT)
        from argf.rb:5:in `<main>'
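One possible way around this, sketched here under the assumption that -l and -z each take a value, is to let the standard optparse library strip the options out of the argument list before ARGF gets to see it. The extract_options! helper is hypothetical and not part of the original script:

```ruby
require 'optparse'

# Hypothetical helper: parses -l and -z (each taking a value) and removes
# them from argv in place, leaving only the file names behind.
def extract_options!(argv)
  options = {}
  parser = OptionParser.new
  parser.on('-l VALUE') { |value| options[:l] = value }
  parser.on('-z VALUE') { |value| options[:z] = value }
  parser.parse!(argv)
  options
end

# Demonstrated on a plain array instead of the real ARGV
argv = %w[-l arg1 -z arg2 LICENSE README]
options = extract_options!(argv)

puts options[:l]  # prints arg1
puts options[:z]  # prints arg2
p argv            # prints ["LICENSE", "README"]
```

In a real script the helper would be called with ARGV itself before the first read from ARGF, since ARGF takes its file list from whatever is left in ARGV.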

Also mixing files and STDIN won’t work as expected:

>ruby argf.rb LICENSE < README
Copyright (c) 2011-2012 Onteria <onteria@live.jp>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

In this case the license file contents are shown, but the readme file contents are not.

The above works for getting the contents of all files at once. Handling each file one by one, however, requires a different approach. For this, ARGF provides a file method that returns a file handle for the current file, which can then be processed using regular file handle methods:

while !ARGF.file.closed?
  puts "===== Current File: #{ARGF.filename} ====="
  ARGF.file.each_line { |line|
    puts line
  }
  ARGF.close
end

Since ARGF#close closes the current file AND proceeds to the next one, the closed? method of IO can be used to see whether the last file has been read. This works because, if there are no files left, ARGF.file simply returns the closed stream of the last file. Additionally, ARGF.filename (also available under the name ARGF.path) returns the name of the file currently being processed. It can be used for handling files based on attributes such as location or file extension. A sample run of the code:

>ruby argf.rb LICENSE README
===== Current File: LICENSE =====
Copyright (c) 2011-2012 Onteria <onteria@live.jp>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
===== Current File: README =====
# RubbySnippets

This is random ruby code I write to learn how ruby works. They might do useful things, they might not. Some of this is meant to supplement api / standard library documentation that's lacking in sample code.

Note that this method is only useful in cases where files are passed in, not if STDIN redirection is used. If this needs to be checked before processing files, simply check that the filename is not “-”, which is what it is set to when the input is STDIN.

This concludes a look into how ARGF works and hopefully provides insight into how it can be used in your code. The source code for this blog can be found at my GitHub Repository.
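The STDIN check described above could be sketched as follows. Both helper names are hypothetical, and the ARGF-like object is passed in as a parameter only so that the check can be tried without real command line input:

```ruby
# Hypothetical helper: true when the given ARGF-like object is reading
# from STDIN, which ARGF reports as a filename of "-".
def reading_from_stdin?(argf = ARGF)
  argf.filename == '-'
end

# Hypothetical helper: print each named file under its own banner, or
# return false without reading anything when the input is STDIN.
def print_files_with_banner(argf = ARGF, out = $stdout)
  return false if reading_from_stdin?(argf)

  until argf.file.closed?
    out.puts "===== Current File: #{argf.filename} ====="
    argf.file.each_line { |line| out.puts line }
    argf.close
  end
  true
end
```

A script could call print_files_with_banner and fall back to a plain ARGF.each loop whenever it returns false.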

Basic REST Server With Webrick

Ruby provides a lot of nifty tools for getting things up and running quickly. Recently I’ve been looking a lot into the standard Ruby library to see what interesting parts I could use to build my own creations. A common need in the web-based world is a web service that exposes an API to clients. Thinking it through, I came up with a solution using Webrick and message passing to create a RESTful JSON web service. As this uses JSON, you will need to install the JSON gem. This can be done through the following command:

[ONTERIA] > gem install json

First off, a simple HTTP server is required. Webrick provides just that. I borrowed a code snippet from Gnome’s Guide to WEBrick, a highly recommended read for those who wish to look more into using Webrick. The base code gives this:

require 'webrick'

include WEBrick

def start_webrick(config = {})
  config.update(:Port => 9955)     
  server = HTTPServer.new(config)
  yield server if block_given?
  ['INT', 'TERM'].each {|signal| 
    trap(signal) {server.shutdown}
  }
  server.start
end

start_webrick(:DocumentRoot => 'C:\\MyWebsite')

This creates a server that listens on port 9955, with a document root of C:\MyWebsite. In the document root folder is a single file called index.html which has the following HTML code:

<html>
<head>
<title>My Webpage</title>
</head>
<body>
<h1>Welcome to my website!</h1>
</body>
</html>

After running the script, navigating to the site in Firefox produces the following:

So this gives us a basic web server. However, since this is going to be a RESTful web service, things need to be a bit more dynamic. In order to interface with dynamically generated content, Webrick utilizes servlets. The skeleton of a basic servlet is as follows:

class RestServlet < HTTPServlet::AbstractServlet
end

The servlet inherits from HTTPServlet::AbstractServlet, but doesn’t do much as is. However, it can be attached to a certain URL. Since this is meant to be a standalone web service, we’ll attach it to the root path:

class RestServlet < HTTPServlet::AbstractServlet
end

start_webrick { | server |
  server.mount('/', RestServlet)
}

Now as is, running this code gives a generic 404:

This is because Webrick is expecting a response to send back to the client. Since no response has been set up, a 404 is returned. When Webrick is set to utilize a servlet, it calls specific methods depending on the type of request. In this case a GET request is being performed. In order to handle this GET request in a servlet, the do_GET method must be implemented:

class RestServlet < HTTPServlet::AbstractServlet
    def do_GET(req,resp)
      resp.body = 'Hello World'
      raise HTTPStatus::OK
    end
end

The do_GET method here takes two objects as arguments. The first is the client request, and the second is the server response, which is used to build up what gets sent back to the client. In this case, req is not used and the response body is set to a simple string. Afterwards an HTTPStatus of OK (converted to status code 200) is raised so that Webrick responds with the proper status. This will finally produce something that the browser can render:

With this being a RESTful service, the actual URL is often used to map to specific method calls. In this case, we’ll make a HelloService REST service that has a default response, and one that takes arguments. The class that will produce the result is short and simple:

module RestServiceModule
  class HelloService
    def self.index()
      return JSON.generate({:data => 'Hello World'})
    end

    def self.greet(args)
      return JSON.generate({:data => "Hello #{args.join(' ')}"})
    end
  end
end
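To get a feel for what the servlet will eventually send back, the two class methods can also be called directly, without any web server involved. The following sketch repeats the class so that it runs on its own. The arguments passed to greet are made up for illustration:

```ruby
require 'json'

module RestServiceModule
  class HelloService
    def self.index
      JSON.generate({:data => 'Hello World'})
    end

    def self.greet(args)
      JSON.generate({:data => "Hello #{args.join(' ')}"})
    end
  end
end

puts RestServiceModule::HelloService.index
# prints {"data":"Hello World"}

body = RestServiceModule::HelloService.greet(%w[Ruby World])
puts body
# prints {"data":"Hello Ruby World"}

# A client would parse the JSON string back into a hash
puts JSON.parse(body)['data']
# prints Hello Ruby World
```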

Since the JSON classes are now being used, a require call is needed at the top as well:

require 'webrick'
require 'json'

include WEBrick

Looking back at HelloService, the first method is an index class method, which will be used if no method is given. The next is a greet class method, which takes an array of arguments provided by the URL. Finally, note how this class is also wrapped in a module. More on why that’s important in a moment. Now it’s time for the core logic of the REST service:

class RestServlet < HTTPServlet::AbstractServlet
  def do_GET(req,resp)
      # Split the path into pieces, getting rid of the first slash
      path = req.path[1..-1].split('/')
      raise HTTPStatus::NotFound if !RestServiceModule.const_defined?(path[0])
      response_class = RestServiceModule.const_get(path[0])
      
      if response_class and response_class.is_a?(Class)
        # There was a method given
        if path[1]
          response_method = path[1].to_sym
          # Make sure the method exists in the class
          raise HTTPStatus::NotFound if !response_class.respond_to?(response_method)
          # Remaining path segments get passed in as arguments to the method
          if path.length > 2
            resp.body = response_class.send(response_method, path[2..-1])
          else
            resp.body = response_class.send(response_method)
          end
          raise HTTPStatus::OK
        # No method was given, so check for an "index" method instead
        else
          raise HTTPStatus::NotFound if !response_class.respond_to?(:index)
          resp.body = response_class.send(:index)
          raise HTTPStatus::OK
        end
      else
        raise HTTPStatus::NotFound
      end
  end
end

Lots of code here, so let’s take it step by step.

      # Split the path into pieces, getting rid of the first slash
      path = req.path[1..-1].split('/')
      raise HTTPStatus::NotFound if !RestServiceModule.const_defined?(path[0])
      response_class = RestServiceModule.const_get(path[0])

As the comment explains, the first part takes the request path, such as /my/request/path, and turns it into an array using the split method of the String class. Since a standard split would produce a blank item for the root slash, a slice operation is used to take everything from the second character (index 1, as indices start from 0) to the last character. Now the first part of the URL gives us the class, so we need to get the actual class given its string name. Since class names are constants which map to the actual class, the const_get method can be used to obtain it. However, before we use const_get, we need to make sure the constant is defined (i.e. the class is actually defined). A quick check against const_defined? serves as that sanity check.
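The effect of the slice can be seen in isolation with a short sketch. The path_segments helper is hypothetical and simply wraps the same expression the servlet uses:

```ruby
# Hypothetical helper mirroring the servlet's path handling
def path_segments(request_path)
  request_path[1..-1].split('/')
end

# A plain split keeps a blank item for the leading slash
p '/my/request/path'.split('/')      # prints ["", "my", "request", "path"]

# Dropping the first character avoids it
p path_segments('/my/request/path')  # prints ["my", "request", "path"]

# A request for the root path yields an empty array
p path_segments('/')                 # prints []
```

Note the last case: for a request to the root path the array is empty, so its first element is nil rather than a class name.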

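Now about the module part. Since the code accepts any class name from the URL, the class is looked up through a specific module namespace rather than at the top level. The intent is to prevent a malicious user from trying to access toplevel classes such as File. One caveat: when called on a module, const_defined? and const_get also search Object by default, so passing false as their second argument is needed to restrict the lookup strictly to the module. Note that the module name RestServiceModule is not set in stone, and this code could be modified to accept another module name instead. This will be looked at in a future blog post.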
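How far the namespace lookup reaches can be checked with a small experiment. The sketch below is one possible stricter lookup, not the code used in this post. The lookup_service helper, the VERSION constant and the empty HelloService class are all made up for illustration:

```ruby
module RestServiceModule
  VERSION = '0.1' # hypothetical non-class constant

  class HelloService
  end
end

# With the default arguments the lookup on a module also searches Object
p RestServiceModule.const_defined?('File')         # prints true
# Passing false restricts the lookup to the module itself
p RestServiceModule.const_defined?('File', false)  # prints false

# Hypothetical helper: returns the class, or nil if the name is not a
# valid constant name, is not defined in the namespace, or is not a class.
def lookup_service(namespace, name)
  return nil unless name.is_a?(String) && name.match?(/\A[A-Z]\w*\z/)
  return nil unless namespace.const_defined?(name, false)

  candidate = namespace.const_get(name, false)
  candidate.is_a?(Class) ? candidate : nil
end

p lookup_service(RestServiceModule, 'HelloService')  # prints RestServiceModule::HelloService
p lookup_service(RestServiceModule, 'File')          # prints nil
p lookup_service(RestServiceModule, 'VERSION')       # prints nil
```

The name check up front matters because const_defined? raises a NameError for strings that are not valid constant names, such as a lowercase path segment.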

      if response_class and response_class.is_a?(Class)

Now we need to make sure that the first part of the URL path is actually a class. The reason is that const_get works for all constants, not just ones that map to classes. This is just a minor sanity check.

        # There was a method given
        if path[1]
          response_method = path[1].to_sym
          # Make sure the method exists in the class
          raise HTTPStatus::NotFound if !response_class.respond_to?(response_method)
          # Remaining path segments get passed in as arguments to the method
          if path.length > 2
            resp.body = response_class.send(response_method, path[2..-1])
          else
            resp.body = response_class.send(response_method)
          end
          raise HTTPStatus::OK

Next a check is done to see if a method name was given. If one was given, the name is first converted to a symbol so we can utilize the respond_to? and send methods for dynamic calling. Next we see if our class actually has such a method. If it doesn’t, a NotFound (404) response will be returned. Since the method could have arguments, these need to be checked for and passed along. This is accomplished by sending the rest of the path array using slicing. If there are no arguments, the method is simply called, and finally an OK status (200) is returned.
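The dynamic call can be tried outside of the servlet as well. The sketch below is a stricter variation rather than the code used in this post: the hypothetical dispatch helper only accepts methods defined directly on the service class, and it compares the method's arity with the arguments it was given before calling it.

```ruby
require 'json'

module RestServiceModule
  class HelloService
    def self.index
      JSON.generate({:data => 'Hello World'})
    end

    def self.greet(args)
      JSON.generate({:data => "Hello #{args.join(' ')}"})
    end
  end
end

# Hypothetical helper: returns the response body, or nil when the method
# is not defined on the service itself or the arguments do not fit.
def dispatch(service, method_name, args)
  name = method_name.to_sym
  return nil unless service.singleton_methods(false).include?(name)

  handler = service.method(name)
  case handler.arity
  when 0 then args.empty? ? handler.call : nil
  when 1 then args.empty? ? nil : handler.call(args)
  end
end

service = RestServiceModule::HelloService
puts dispatch(service, 'greet', %w[Ruby World])  # prints {"data":"Hello Ruby World"}
p dispatch(service, 'greet', [])                 # prints nil
p dispatch(service, 'new', [])                   # prints nil
```

A servlet built on this helper could answer with a 404 whenever nil comes back, which would also cover greet being requested without any arguments.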

        # No method was given, so check for an "index" method instead
        else
          raise HTTPStatus::NotFound if !response_class.respond_to?(:index)
          resp.body = response_class.send(:index)
          raise HTTPStatus::OK
        end
      else
        raise HTTPStatus::NotFound
      end

Now if no method was given, we use a default method called index. Since no method is provided, no arguments were provided either, so that check can be skipped. As before, a 404 is returned if there is no such method, and a 200 OK with the method result as the body is returned if there is. Finally, the else branch of the earlier check (whether the first path piece was a class) returns a 404 if it wasn’t. This ends the service mapping code. Now to run a few tests:

HelloService with no method

HelloService with greet method and no arguments *to be looked into in another blog post

HelloService with greet method and one argument

HelloService with greet method and multiple arguments

Invalid class

Invalid method

This concludes a basic look into a simple RESTful JSON service using Webrick, JSON, and class mapping. It’s still far from complete, and in future posts we’ll look into how to improve on the following:

  • The default method to look for is forced to index, and should be customizable
  • POSTs are not handled
  • The module to look for classes in should be customizable
  • No authentication is provided
  • No validation is done against arguments
  • Sanity checks on the URL content need to be implemented
  • The error pages are very generic
  • Method names have to match the Ruby naming convention, so URLs with dashes won’t work
  • The code needs to be split up for manageability
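As a taste of one of these items, the dash limitation could be handled by translating the URL segment before looking up the method. This is only a sketch of one possible approach. The method_name_for helper and the say-hello segment are hypothetical:

```ruby
# Hypothetical helper: map a URL segment such as "say-hello" to a Ruby
# method name, or return nil when the result is not a plain identifier.
def method_name_for(segment)
  candidate = segment.tr('-', '_')
  candidate.match?(/\A[a-z_][a-z0-9_]*\z/) ? candidate.to_sym : nil
end

p method_name_for('say-hello')  # prints :say_hello
p method_name_for('greet')      # prints :greet
p method_name_for('Bad Name!')  # prints nil
```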

The full code listing:

require 'webrick'
require 'json'

include WEBrick

def start_webrick(config = {})
  config.update(:Port => 9955)     
  server = HTTPServer.new(config)
  yield server if block_given?
  ['INT', 'TERM'].each {|signal| 
    trap(signal) {server.shutdown}
  }
  server.start
end

class RestServlet < HTTPServlet::AbstractServlet
  def do_GET(req,resp)
      # Split the path into pieces, getting rid of the first slash
      path = req.path[1..-1].split('/')
      raise HTTPStatus::NotFound if !RestServiceModule.const_defined?(path[0])
      response_class = RestServiceModule.const_get(path[0])
      
      if response_class and response_class.is_a?(Class)
        # There was a method given
        if path[1]
          response_method = path[1].to_sym
          # Make sure the method exists in the class
          raise HTTPStatus::NotFound if !response_class.respond_to?(response_method)
          # Remaining path segments get passed in as arguments to the method
          if path.length > 2
            resp.body = response_class.send(response_method, path[2..-1])
          else
            resp.body = response_class.send(response_method)
          end
          raise HTTPStatus::OK
        # No method was given, so check for an "index" method instead
        else
          raise HTTPStatus::NotFound if !response_class.respond_to?(:index)
          resp.body = response_class.send(:index)
          raise HTTPStatus::OK
        end
      else
        raise HTTPStatus::NotFound
      end
  end
end

module RestServiceModule
  class HelloService
    def self.index()
      return JSON.generate({:data => 'Hello World'})
    end

    def self.greet(args)
      return JSON.generate({:data => "Hello #{args.join(' ')}"})
    end
  end
end

start_webrick { | server |
  server.mount('/', RestServlet)
}