RSense Development Guide

Index



1. Frontend Model

RSense provides just a frontend which can be used from a terminal. It means there is no way to embed RSense into other frontends (e.g. Editors, IDE). So how we can use it from other frontends? Let us explain.

RSense has a CLI (command line interface) frontend only. Try type inference from a terminal:

% cat test.rb
1_|_
% bin/rsense --file=test.rb
type: Fixnum

bin/rsense is wrapper frontend script which provides us a way to operate RSense transparency in server/client model. When bin/rsense is executed, bin/rsense tries to execute a daemon server if it doesn't live by forking and executing primitive frontend. In a time after forking, it will create an unix domain socket to communicates with a frontend. After executing the daemon, bin/rsense tries to open the unix domain socket and sends a given command from a terminal.

Considering a command bin/rsense --file=test.rb, first of all, bin/rsense executes a daemon server by:

% java -cp ... org.cx4a.rsense.Main script ...

In a script mode, it reads a command from standard input and executes as it is a command from a terminal. For example,

% java -cp ... org.cx4a.rsense.Main some-command --some-option1=foo --some-option2=foo

is equivalent to:

% java -cp ... org.cx4a.rsense.Main script
some-command --some-option1=foo --some-option2=foo
^D

A main difference of these ways is that in a script mode you can execute a series of command. It enables us to save initialization time and do continuous processing. Actually, it is used as a test script. See also testing.

After executing the server, as we said, bin/rsense sends a command from a terminal to the server. So we can operate RSense transparency as it seems like a single command.

Using from other frontends is very easy. All of they have to do is just a execution bin/rsense with arbitrary command and options. For example, a type inspection can be implemented in emacs like

(message "%s" (shell-command-to-string (format "bin/rsense type-inference --file=%s --location=%s" (buffer-file-name) (1- (point)))))

All complex things will be done automatically.

2. Type Inference Algorithm

RSense uses a type inference algorithm called CPA (Cartesian Product Algorithm) which is developed by Ole Agensen. With some restriction of CPA such like data-polymorphism, RSense uses a modified CPA indeed. We think CPA is very simple to implement and a best way to accomplish what RSense tries to privde.

3. Type Annotation

In RSense, type annotation is needed to resolve a data-polymorphism problem and ad-hoc parameter polymorphism.

Type annotation is written as a comment which takes a form of ##% .... Type annotation can be written before class declaration and method declaration.

3.1. Method Type Annotation

By adding type annotation to method, method call will be analyzed more precissly. Consider a method which takes one argument. If the argument is String, it will return Fixnum, otherwise returns Float. Such strange methods are often seen in Ruby world, you know. Anyway, it is difficult to analyze such methods. In this case, a result of f() will be infered as Fixnum and Float without type annotation.

# How do we analyze this method?
def f(a)
  if a.instanceof(String)
    1
  else
    2.3
  end
end

With type annotation, you can help analyzers to infer precisely. Method type annotation takes a form of:

##% method-name ('<' type-vars ('|' constraints)? '>')? '(' arguments ')' block? '->' result

Learn from examples. Adding type annotation for the method above:

##% f(String) -> Fixnum
##% f(a) -> Float
def f(a)
  ...
end

this method has two method type annotations. One of them is f(String) -> Fixnum which can be read as "If first argument is a kind of String, it returns Fixnum object." "a kind of" implies a test satisfied by calling ruby's kind_of? method. We call it guard. Two of them is f(a) -> Float which can be read as "If one polymorphic parameter is given, it returns Float." So now, this method can be analyzed precisely.

A parameter can be optional and rest. If you want to make parameters optional, put "?" before the parameter expressions. If you want to make parameters rest, put "*" before the parameter expressions. For example,

##% f(?a) -> Fixnum
def f() end

This can be read as "method f can take no argument or one argument, and in such cases f returns Fixnum." You can also use guard as optional.

##% f(?String) -> Fixnum
def f() end

This can be read as "method f can take no arugment or a kind of String, and in such cases f returns Fixnum."

As same as optional, you can use rest parameter.

##% f(*a) -> Fixnum
def f() end

This can be read as "method f can take any arguments and returns Fixnum." Of couse rest parameters with guard is allowed.

##% f(*String) -> Fixnum
def f() end

If you don't want to write any arguments, just use "()".

##% f() -> Fixnum
def f() end

If a method call doesn't satisfy any type annotation of the method, RSense call tell the wrong call somehow. Currently, such wrong call will be printed out as warning into log.

OK, continue to learn.

By the way, do you know what is a in above examples? It is called type variable which can contain any types. Generally, it is used for two ways.

  1. Matching any types
  2. Keeping types

In above examples, we just use it as (1) case. Now let's see (2) case.

##% f(a) -> a
def f() end

When we call f(1), it returns Fixnum because type variable a contains Fixnum given by an argument. Strictly, we should write it as following by using type variable declaration.

##% f<a>(a) -> a
def f() end

However, in our system, it can be omitted as much as possible. So, when we use type variable declaration? The time to do more complex things.

Consider a method which takes one arguments, which has to be a kind of String, and returns the argument. How can we write?

# Wrong
##% f(a) -> a
def f(a) end
# Wrong
##% f(String) -> String
def f(a) end

Two examples are both wrong. First case can take any argument, second case can be return a kind of String. Now we have to type variable declaration with constraints. Constraints gives type variables limit of typing. It takes a form of a <= b which means type expression a is subclass of type expression b. In our system, constraints will be satified as much as possible. In other words, to satify constrains, type expressions will be interpreted as free. For example, a <= String constraint can be satisfied by adding String to type variable a if type varaible a has no types. If type variable a has types, RSense make sure that the constraint is satisfied by checking each type of type variable a. For another example, a <= b constraint can be satified by adding types of type variable a into type variable b.

Now we can resolve the problem.

##% f<a | a <= String>(a) -> a
def f(a) end

"|" can be read as "where" which means starting of constraints. So this type annotation is correct and can be read as "method f takes one argument which is subclass of String and returns the argument."

For more detail, you should read next secion.

3.2. Class Type Annotation

By adding type annotation to class, class can be data-polymorphic. Think following code.

class C
  def get() @x end
  def set(x) @x = x end
end
a = C.new
a.set(1)
b = C.new
b.set('Hello')
b.get    # What type of mine?

The answer is Fixnum and String. Why? It is because of that non data-polymorphic objects shares their instance variables. Why shares instance variables? It is to achieve balance of preciseness and performance and based on hypothesis that programmers rarely use data-polymorphic. However, there is serious problem when we use Array, Hash, or something else container class.

So, we adopt class type annotation to realize data-polymorphism restrictedly. When you want to create data-polymorphic class, write class type annotation with type variables. Specified type variables will be independent from each instances. That is, with class type annotation and method type annotation, you can realize data-polymorphic class.

##% C<t>
class C
  ##% get() -> t
  def get() @x end
  ##% set<v | v <= t>(v) -> v
  def set(x) @x = x end
end

First, with class type annotation, class C is declared as data-polymorphic class. Second, with method type annotation, method get returns type variable t of class C. Last, with method type annotation, method set adds a type of an argument into type variable t of class C and returns the argument. Now we can use this class correctly.

a = C.new        # a = C<>
a.set(1)         # a = C<Fixnum>
b = C.new        # b = C<>
b.set('Hello')   # b = C<String>
b.get            # t = String (C<String>)

Type variables of class will be shared in methods, included class, inherited class. That is, you should be careful to name type variables. Here are simple rules to name type variables.

A reason why type variables will be shared is to keep implementation simple. Consider Enumerable module.

##% Enumerable<t>
module Enumerable
  ##% collect<v>() {t -> v} -> Array<v>
  def collect() [yield self] end
  ...
end
##% Array<t>
class Array
  include Enumerable
end

Classes which includes Enumerable module will automatically export useful functions. If type variables will not be shared, we have to write all of methods in Enumerable module in Array class.

Now we give you a lesson. See following code.

##% Array<t>
class Array
  ##% push<v | v <= Array<t> >(*v) -> self
  def push(*obj) self end
  ...
end

This is Array#push method type annotation. How do you can read? If you can read it as "push can any arguments as an array and types of the array elements will be added to type variable t of class Array, and finally return self.", you are ready to write type annotation!

3.3. Stubs

Stub is a collection of type annotation. Currently, they are located in stubs/. We are hard working to write stubs of builtin library and standard library by refering Ruby documents. We welcome any contribution especially about stubs :)

3.3.1. Builtin Library Status

Mostly done.

3.3.2. Standard Library Status

Status meanings

Status Meaning
Blank Nothing is started.
- No need to create a stub, but it should be done.
= No need to create a stub.
Done It has done, but a little work remains.
OK It has fully done.

Name Status
abbrev =
base64 =
benchmark
bigdecimal Done
cgi Done
cgi-lib
complex =
csv
curses
date
date2
dbm
debug
delegate
digest
dl
drb
e2mmap
English =
Env
erb
eregex
etc
expect
fcntl
fileutils -
finalize
find =
forwardable
ftools
gdbm
generator
getoptlong
getopts
gserver
iconv
importenv
io/nonblock
io/wait
ipaddr
irb
jcode
kconv
logger
mailread
mathn
matrix -
md5
mkmf
monitor
mutex_m
net/ftp
net/ftptls
net/http
net/https
net/imap
net/pop
net/protocol
net/smtp
net/telnet
net/telnets
nkf
observer
open3
open-uri
openssl
optparse Done
ostruct
parsearg
parsedate
pathname
ping
pp
prettyprint
profile
profiler
pstore
pty
racc/parser
rational OK
rbconfig
readbytes
readline
resolv
resolv-replace
rexml
rinda/rinda
rinda/tuplespace
rss
rubyunit
scanf -
sdbm
securerandom
set
sha1
shell
shellwords
singleton
soap
socket
stringio OK
strscan
sync
syslog
tempfile
test/unit
thread
thwait
time
timeout
tk
tmpdir
tracer
tsort
un
uri
weakref
webrick
win32/registry
win32/resolv
Win32API
win32ole
wsdl
xmlrpc
xsd
yaml
zlib

4. Partial Update

A most important thing for RSense is speed. If users have to wait 10 senconds to do type inspection, it is finally a garbage. So, we have to keep RSense faster as much as possible. One of ways to make RSense faster is a technique called partial update. When RSense is requested to do code-completion or type-inference, RSense checks for AST cache. If there is AST cache, RSense analyzes an edit delta and apply updating to modified portion of the source code.

5. Testing

You can run test scripts easily:

% ant test

5.1. Writing Tests

Test will be written as a rsense script file. See test scripts which are located in test/script. For example, if you want to write a new fixture, add the following code to test/script/builtin.rsense.

type-inference --test=NeedToTest? --should-be=Fixnum
1_|_
EOF

After that, do ant test. It's very easy.

As we see, special options for testing are available on type-inference and code-completion command. --test= option names its fixture. --should-be= option verifies a result equals expected data. --should-be-empty option verifies a result is empty. --should-contain= option verifies a result contains expected data. --should-not-contain option verifies a result doesn't contain unexpected data.

6. Source Tree

rsense/
|- bin/                                  - Pseudo frontend and utilities
|- build_lib/                            - Libraries needed to build RSense
|- doc/                                  - Documentation
|- etc/                                  - Other frontends and etc files
|- lib/                                  - Libraries needed to run
|- src/                                  - Real frontend source code
|  |- org/cx4a/rsense/
|              |- parser/                - Type annotation parser
|              |- ruby/                  - Pseudo ruby runtime
|              |- typing/                - Type inference
|              |- util/
|- stubs/                                - Type annotation stubs
|- test/                                 - Test cases and test scripts
CPA
Constraint based type inference algorithm.
DCPA
Data-polymorphic CPA.
DDP
Demand-drived type-inference algorithm.
Ecstatic
Static code checking tool and compiler for Ruby based on CPA.
Starkiller
Static code checking tool and compiler for Python based on CPA.
Diamondback Ruby
Static code checking tool for Ruby based on constraint-based algorithm.