Thursday, May 01, 2008

NAME

lab 88 - degrees of freedom

NOTES

The Vitanuova downloads page has historical snapshots of Inferno source from 1996 to 2003 containing all three editions before Inferno went open source. I was curious to see how well Inferno has sustained a standard set of interfaces over the last ten years so I downloaded all of them and poked around.

The biggest overall change came with 4th edition, when many parts of the system were upgraded, including the Styx protocol, several builtin modules, dis format, the limbo language and VM. Also, significantly, Inferno adopted open source licenses granting developers the freedom to modify any part of the system: something that might impact sustainability for good or ill.

While every edition prior to 4th has had some interfaces changed, these changes did not break backwards compatibility. A 3rd edition emu can run dis code from the 1st edition archive.

The difference between 3rd and 4th was large enough that limbo code needed to be ported, or at the very least recompiled, to run on the new emu.

The Sys interface is evolving still with additions made since 4th edition was released. These types of changes, adding a function or a new constant, do not break backward compatibility, but in a network of emus where there is diversity of versions, link typechecks do fail when a module is expecting an interface newer than the one available. Where is the standard interface here? This problem seems to violate some of the core ideas of Inferno, and Inferno doesn't provide an easy way of working around compatibility issues with builtin modules.

Inferno's core idea is to provide standard interfaces that free content and service providers from concern of the details of diverse hardware, software, and networks over which their content is delivered. (/sys/doc/bltj.ms)

The BLTJ paper describing Inferno listed the several dimensions of portability and versatility provided by the OS,

  • Portability across processors: it currently runs on Intel, Sparc, MIPS, ARM, HP-PA, and PowerPC architectures and is readily portable to others.
  • Portability across environments: it runs as a stand-alone operating system on small terminals, and also as a user application under Windows NT, Windows 95, Unix (Irix, Solaris, FreeBSD, Linux, AIX, HP/UX) and Plan 9. In all of these environments, Inferno applications see an identical interface.
  • Distributed design: the identical environment is established at the user's terminal and at the server, and each may import the resources (for example, the attached I/O devices or networks) of the other. Aided by the communications facilities of the run-time system, applications may be split easily (and even dynamically) between client and server.
  • Minimal hardware requirements: it runs useful applications stand-alone on machines with as little as 1 MB of memory, and does not require memory-mapping hardware.
  • Portable applications: Inferno applications are written in the type-safe language Limbo, whose binary representation is identical over all platforms.
  • Dynamic adaptability: applications may, depending on the hardware or other resources available, load different program modules to perform a specific function. For example, a video player application might use any of several different decoder modules.

Now that we have a decade of Inferno history, how many of the above degrees of freedom still hold when the whole time span is considered as one network of interconnected emus?

Don't standard interfaces also imply standard across time? A network of emus can not be expected to upgrade all at the same time. A standard is also a constraint against change in an interface. One degree of freedom is expressly limited; keep the abstraction constant.

The dilemma faced is whether to freeze an interface to provide long term compatibility based on a standard, but risk the possibility of being held back from adopting new ideas and becoming irrelevant, or to keep changing interfaces to solve new problems but pay the cost of compatibility problems.

In general it seems filesystems, namespaces and textual interfaces all lend well to creating a sustainable software environment. However, the more complex limbo language and module interfaces have shown themselves to be not so well preserved. By comparison, maybe unfairly because of the different goals of the creators, see how the works of Knuth are intended to withstand time. He specifically structures his software so that incompatibilities do not creep in, such as leaving no undefined gaps in font tables so that no one is tempted to fill them (TeX Book), or defining instructions to fill all 256 possible slots in MMIX, and making his source readable but not permitting edits except through his CWEB change file system. Knuth's software is the only kind I know of that take seriously the problem of long term compatibility.

It would be nice to run 1st edition dis code in a current emu if for no other reason than to prove the sustainability of infernos standard interfaces, but a major barrier to that is the need to bind in old Sys and Draw modules. I assume that compatibility to older interfaces should be provided through limbo modules so that the emu doesn't bear the extra weight and complexity of carrying multiple builtin implementations. There is no way to override where a builtin module is loaded from, though this might be a nice feature. For example, if /dev/dis/draw.dis represented the builtin draw module, I might bind limbo implementation over it, so that a load Draw "$Draw", would take the compatibility simulation over the builtin. (This might also work nicely going the other way so that we could bind builtin modules over limbo modules for optimization.)

This is not possible for Sys however, because we can't simulate variadic args in limbo (e.g., sys->print). A solution to this would be nice! But an alternative to providing more mechanism is simply to freeze the various interfaces.

Sunday, April 06, 2008

NAME

lab 87 - mux for nintendo ds

NOTES

In an earlier post I talked about updating mux to 4th edition Inferno in the hope of one day running it on a Nintendo DS.

Well, Inferno is now booting on the DS so I got to try it for real.

I started with getting the mux window manager working in standard inferno. Then I changed the resolution down to 256x192 and tried to get everything to fit. The files in this lab include the version of mux I ended up putting in the nds file running on the DS.

Things to try if you download it. Rocker moves up and down selection. 'A' key enters, 'B' key backs out back up to the higher level. 'Start' key returns to the top level menu.

Try Today's Newspaper, and The Thisburgh has the only working graphic. Under news, click through to actually read an article. Under games, try connect4. Audio control would look cool if any of the graphics actually came in. The Financial Reports gives a ticker. It scrolls slowly only because of the sleep interval in the code is incorrect.

If you want to try this version of mux using hosted inferno just remember you need to compile prefab into your emu. Include prefab in the mod and lib sections of your emu config file, also uncomment prefab in the /libinterp/mkfile.

Mux uses irsim for key controls. I changed my local inferno-ds code to have the DS keys output the same characters as used by irsim.

The files in this lab include the movies and tvlist apps and their data. The data didn't fit on the 4MB .nds file. But they will fit when we get the GBA ROM or dldi interface working.

I think mux is a good path to follow for DS development. It's small, starts quickly, uses the keys effectively since it was designed for remote controls, the programs are easy to understand, and they hit most of the applications I'd like to start with, small games, news reader, email reader, simple database browser (movies, tvlist), and audio.

FILES

code for lab 87

NAME

lab 86 - srv

NOTES

This from a post on 9fans, and also on Tip O' the Day

% dc >[0=1] | echo 0 > /srv/desk

Plan 9's srv(3) acts as a bulletin board for open file descriptors, other namespaces see all the files in srv, and so can read and write to /srv/desk.

Inferno has srv(3) which is a file2chan registry, but is also visible to all namespaces on the host. (see also srv9(3))

The current implementation of sh-file2chan(1) does not allow the above. The closest I got was,

% load file2chan
% calc >[0=1] | {file2chan /chan/desk {rblock; putrdata &} {fetchwdata > /fd/0}} &

% stream -b 1 /chan/desk
% echo 1+1 > /chan/desk

I tried implementing a command equivalent to srv(4) on Plan 9. It takes a command block or network address and post it in the srv registry.

% srv {calc >[1=0]}  /chan/desk

It using an existing '#s' instance if there is one, else binds a new one. Now we can open a console to /chan/desk from another window

% {cat & cat  >[1=0] < /dev/cons} <> /chan/desk

and other windows can write to /chan/desk, the output will be seen in the console.

Questions.

  1. Why isn't Plan 9 srv(3) in Inferno?
  2. File2chan(2) seems under used. Is that because of the shell interface sh-file2chan?
  3. Is there another interface that would make file2chan more usable?
  4. Mount(1) supports mounting from a file, as Plan 9's does. But the inferno srv(3) device must do extra copies of the read and write buffers to implement the interface. Is the file interface of Plan9 srv more elegant than the extra file2chan syscall in Inferno?
  5. Aren't there benefits to using channels in Inferno that make file2chan preferable?

Files in this lab are for the inferno srv(4) implementation.

FILES

lab 86 code

Monday, March 24, 2008

NAME

lab 85 - stowage

NOTES

In an earlier post I defined a venti-lite based on two shell scripts, getclump and putclump, that stored files in a content addressed repository, which in that instance was just an append-only gzip tar archive with an index.

After learning a little about the git SCM, this lab re-writes those scripts to use a repository layout more like git's. The key thing to know about the git repository is that it uses sha1sum(1) content addressing and that it stores the objects as regular files in a filesystem using the hash as the directory and filename,

  objects/hh/hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

In the objects directory is 256 directories named for every 2 character prefix of the sha1hash of the object. The filename is the remaining 38 characters of the hash.

Putclump calculates the hash, slices it to make the prefix and filename, tests if the file already exists, and if not writes the compressed data to the new file. Here is the important part of putclump,

 (sha f) := `{sha1sum $file}
 (a b) := ${slice 0 2 $sha} ${slice 2 40 $sha}
 
 if {ftest -e $hold/objects/$a/$b} {} {
  mkdir -p $hold/objects/$a
  gzip < $file > $hold/objects/$a/$b
 }

Getclump just needs to look up the file given a hash

 sha := $1
 (a b) := ${slice 0 2 $sha} ${slice 2 40 $sha}
 files := `{ls $hold/objects/$a/$b^* >[2] /dev/null}
 if {~ $#files 1} {gunzip < $files } 

Because the git repository uses a regular file system to store objects, it makes it considerably easier to work with than the compacted file system like tar.gz, or an application specific binary format like venti. This is because instead of having to create new tools to read and write binary formats, we can re-use existing tools, like sh(1), tarfs(4), updatelog(8), and applylog(8).

For example, I wrote a script, stow, that takes a tarball and stores it in my repository, called the hold. The hold should be created first with the following directories,

 /n/hold/logs
 /n/hold/objects
 /n/hold/stowage

Then give stow the name of a .tar or .tgz file. Files not found in the hold and that were added are printed to stdout.

 % stow acme-0.11.tgz
 ...
 %

Stow uses updatelog(8) to create a stowage manifest file for the tarball I added. This manifest is saved under /n/stowage. The manifest records the pathname, perms, and sha1 hash of every file in the tarball.

Now that I've stowed all my tarballs I need a way of getting things out.

I built a holdfs, derived from tarfs(4), to read the stowage manifest and present files from the hold. By default the file system is mounted on /mnt/arch.

 % holdfs /n/hold/stowage/acme-0.11

The hold with its stowage is be a step up from a directory tarpit of tarballs. I can accumulate a version history based on tar.gz releases like that for acme-sac and inferno. The vitanuova downloads site contains inferno history going back to 1997. My downloads page contains snapshots of inferno from 2002 to 2006 and acme-sac after that.

My intended application for this was that I could encourage forks of a project and merge back many individuals releases into a single repository and still do useful comparisons.

Using a filled hold I should be able to do analysis of a file history based on the stowage manifests. Contained in this lab are a few experimental scripts to build out more of an SCM. For example, the script hold/diff attempts to use updatelog to compare a manifest with the current tree. And hold/difflog uses a modified applylog(8) to compare two manifests.

The nautical references suggest a distributed and loosely coupled network like that of shipping, and is also influenced by git's design. The unit of transfer is a tarball. It is stowed into the ships hold, along with the manifest. A file system interprets the manifests and gives an interface for searching the hold. There is also keep a ships log of what was stowed and when. I can extract patches, files, tarballs, or the complete stowage in my hold to share with someone else.

This is a simple system:

     28 putclump.sh
     16 getclump.sh
     54 stow.sh
    669 holdfs.b
    767 total

But then most of what was needed already existed in inferno.

FILES

lab 85 code

Sunday, March 23, 2008

NAME

lab 84 - gridfs pattern (mapreduce)

NOTES

I've mentioned mapreduce in previous posts. It makes a good example application for thinking about grid computing. This lab is also about mapreduce although the point here is to illustrate an inferno pattern for grid computing. I'll call it here the gridfs pattern.

Say you have a grid of compute nodes and you want to distribute and coordinate work among them. For the gridfs pattern you construct a synthetic file system that will get exported to all the nodes. The file system is the master process and all clients to the file system are workers.

Both cpu(1) and rcmd(1) use the rstyxd(8) protocol that exports the local namespace when running remote jobs. To implement the gridfs pattern we bind our master fs into our local namespace so it gets exported when we run multiple workers across our compute grid.

A very simple example of this pattern is explained in the Simple Grid Tutorial Part 2. I export a named pipe with a provider process writing one line at a time to the pipe; then multiple worker processes running across the grid consume lines from the pipe as fast as they can do the work.

The mapreduce source files I've included in this lab are a concrete (rough and experimental) example of this pattern taken to the next level. The namespace it exports is the following,

  mapreduce/clone
  mapreduce/n
  mapreduce/n/ctl
  mapreduce/n/status

Each worker opens the clone file and gets a unique connection to the master process, represented by a numbered directory. The open clone file becomes an open file descriptor to the ctl file of the new connection. The worker reads messages from ctl describing new work, and it can write back messages about work completed. The master process will keep the status file up to date with the progress of the worker, analogous to prog(3).

An advantage of this approach over the simpler named pipe is that the master process knows exactly when the worker has closed the connection and knows how much work they have completed based on the messages written to the ctl file. It also provides a better interface to the user; The ps(1) command can easily be adapted to read the status files from the mapreduce namespace.

To try out some examples using mapreduce I need to provide a mapper and reducer function. I wrote a module interface for a mapper,

Mapper : module {
    map: fn(key, value: string, emit: chan of (string, string));
};

This takes a key and value and maps it to an intermediate key and value, which it emits on a channel; it may emit many intermediate key value pairs for a single input key value pair. Here's an implementation for a mapper that takes a string input, tokenizes it, and outputs the token and '1', which will be added later for a wordcount.

# the map function may not get the whole file in one go. maybe
# just a segment, or a line.
map(nil, value: string, emit: chan of (string, string))
{
 if(sys == nil)
  sys = load Sys Sys->PATH;
 if(str == nil)
  str = load String String->PATH;
 (nil, f) := sys->tokenize(value, 
               "[]{}()!@#$%^&*?><\":;.,|\\-_~`'+=/ \t\n\r");
 for ( ; f != nil; f = tl f) {
  ss := str->tolower(hd f);
  emit <-= (ss, "1");
 }
}

There is also an interface for a reducer,

Reducer : module {
    reduce: fn(key: string, input: chan of string, emit: chan of string);
};

This takes all the intermediate values for a key and emits a value. Here's the adder, used by the wordcount.

reduce(nil: string, v: chan of string, emit: chan of string)
{
 value := 0;
 while((s :=<- v) != nil)
  value += int s;
 emit <-= string value;
}

The mapper and reducer interfaces are known by a worker process that loads them on demand. An intermediate process that combines values of the same keys and sorts them is also implemented in the worker process (See the Google MapReduce paper for a good explanation.) This implementation of mapreduce knows only how to walk directory hierarchies and print the file names to all the worker processes. Here's an example of a mapreduce command line that counts words in all files below /lib/legal.

  % mkdir /mnt/mapreduce
  % mapreduce -M4 -R3 wordcount adder /lib/legal
  % ls /mnt/mapreduce
  /mnt/mapreduce/clone

Mapreduce should launch and manage all its own processes. However, for the code checked into this lab, to illustrate what is going on, I have it launching nothing. It just mounts the file system on /mnt/mapreduce. The arguments '-M4 -R3' say to expect 4 Mapper processes and 3 Reducer processes. As workers connect it will configure it to be a mapper or reducer depending on whether work remains. Therefore, after running the above command and doing a cat(1) on /mnt/mapreduce/clone we should see the config line then the pathnames for the first worker.

  % cat /mnt/mapreduce/clone
  worker -m -R 3 -d wordcount -i 1
  /lib/legal/GPL 0 17982
  ...

The pathnames are divided up among the workers as fast as they can process them. So in this implementation mapreduce functions almost the same as the named pipe in the simple grid tutorial. The cat of the first clone file will return all pathnames!

Mapreduce however is still expecting more workers. Cat the clone file three more times to see the input to the next 3 workers. The next cat after that you should see the config and input to the reducer. For example from a remote node,

  % rcmd ${nextcpu} cat /n/client/mnt/mapreduce/clone

Doing a listing on the /mnt/mapreduce path should show you the current workers connected (if any). After all reducers have disconnected, the mapreduce filesystem will report it's done and exit.

Lets run it again for real using the mapreduce worker processes.

  % mapreduce -m /mnt/mapreduce -M4 -R3 wordcount adder /lib
  % for i in 1 2 3 4 {mapreduce/worker /mnt/mapreduce/clone&}
  % for i in 1 2 3 {mapreduce/worker /mnt/mapreduce/clone&}

You should see the result files in /tmp/out.*

For the GSoC 2008 I suggested a project where the student implement a splitjoin file system. Create a coarse grained splitjoin service as defined loosely here (PDF) (see slides 18 on for fine grained task parallelism). This suggested implementation is really another concrete example of the gridfs pattern. It would allow control over how messages are passed round robin to all the workers. It would permit different configurations of how many to push to each node, how many to join from each node, how many commands to duplicate. E.g.,

filter | splitjoin -d10 -m5 -n3 {cmd} | filter 

creates 10 duplicates of the cmd, take input from a pipeline and distributes m=5 records at a time round robin to each node and join the output n=3 records at a time from each task back out to the pipeline.

Splitjoin would take care of launching the task, and monitoring the task for completion. (Ideally, it would interact with the registry to decide where to launch services.)

Because Plan 9/Inferno is not participating this year in GSoC I will probably have a crack at this.

FILES

lab 84 code

Tuesday, March 11, 2008

NAME

lab 83 - lcmd local cpu

NOTES

While thinking of the Simple Grid Tutorial Part 1 and Part 2, I wondered whether I could implement the equivalent of rcmd(1) but for a local emu launched using os(1). For example,

 lcmd math/linbench 100

would launch a new emu, export the local fs to it through a pipe rather than a network socket, and run the command in that namespace. The idea seemed simple, no apparent obstacles, but it actually took me a couple of evenings to get it to work. So I'm posting it more because of the effort rather than its value.

First lets look at what rcmd does, ignoring the networking. Given its arguments it builds a string, calculates its length + 1, and writes the length then the string to a file descriptor, then exports the local namespace to the same file descriptor. Well that part is easy to do in sh(1). Here it is as a braced block assuming all work is done on file descriptor 1.

fn lcmd {
 load expr string
 args := $*
 s := sh -c ${quote $"args}
 echo ${expr ${len $"s} 1 + } 
 echo $s
 export / /fd/1
}

We can test that,

 lcmd {ls } | auxi/rstyxd

Now, instead of running rstyxd in the current VM, I want to run another instance and run in it that. This is where it gets complicated. You might think this might work,

 
 lcmd {ls} | os emu auxi/rstyxd

It doesn't because os treats stdin as read only, stdout as write only. Because export(1) needs to read and write on one file descriptor, and so does rstyxd(8), we need to setup extra pipes, both on the local end and the remote end.

Another problem presents itself in emu. At startup rstyxd will see /dev/cons as stdin. But I'd need to bypass the keyboard handling and get the direct stdin from the pipe. We see the answer to that in /dev,

% ls -l /dev/host*
--rw-r--r-- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hostowner
---w--w--w- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststderr
--r--r--r-- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststdin
---w--w--w- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststdout

This looks good but when I tried them they were not fully implemented in the current emu. The details are not interesting. I fixed that in the acme-sac tree and committed it.

Finally, we can build our full lcmd

fn lcmd {
 load std expr string
 pctl forkns
 args := $*
 s := sh -c ${quote $"args}
 bind '#|' /tmp/lpipe
 
 {
  echo ${expr ${len $"s} 1 + }  >/fd/0;  
  echo $s >/fd/0; 
  export / /fd/0
 } <>/tmp/lpipe/data    &
 
 os -d 'd:/acme-sac' d:/acme-sac/sys/Nt/386/bin/icell.exe -c1  sh -c '
  bind  ''#|'' /tmp/pipes; 
  cat /tmp/pipes/data > /dev/hoststdout& 
  cat /dev/hoststdin > /tmp/pipes/data& 
  auxi/rstyxd <>/tmp/pipes/data1 >[2] /dev/null;  
  echo halt > /dev/sysctl' /tmp/lpipe/data1
}

Heh!

I'm using icell.exe built using the cell config from acme-sac. This is a really small emu configuration. The directories /tmp/pipes, /tmp/lpipe are assumed to exist.

From this definition we can replace rcmd with lcmd in the commands for rsplit and lk in the Grid Tutorial Part 2 and get emu tools for multicores without the setup required for the grid.

Tuesday, February 12, 2008

NAME

lab 82, again - txt2iaf

NOTES

After thinking the matter a little, I finally wrote the txt2iaf app. This allows to use any text file, with one or two columns of data, to be converted to an iaf file that can be played using auplay(1). Now, my sound analysis goes something like this:

1. Record something using a microphone. For now, I only record stuff using some app in Windows (Sound Recorder) or Linux (arecord(1)). This is because Inferno has some issues when trying to record something from /dev/audio: an annoying tick every so often that wrecks my sound analysis intentions. Maybe, I can help to fix this problem, probably related with buffering.

2. Convert the wav file obtained to an iaf file, using wav2iaf(1).

3. Get the data from the iaf file to a text format, using iaf2txt(?).

4. Read data from the text file using any data analysis package.

5. Do whatever you want to with the data.

6. If you wish or need to, output the data in a text file.

7. Using txt2iaf(?), create an iaf file using the data in the text file. Enjoy your new song using auplay(1) ;-)

8. If you want to, convert your iaf file to a wav file using iaf2wav(?).

I also fixed some ugly code in iaf2txt(?) and iaf2wav(?) (really ugly, indeed). Please, update.

Cheers

FILES

lab 82 code

Wednesday, February 06, 2008

NAME

lab 82 - iaf2txt and iaf2wav

NOTES

Currently, a couple of my friends and I are playing a little with human voices. For this purpose I wrote two applications to convert iaf files to plain text and to the wav audio format. Both apps support the standard iaf files as described in audio(6), except for the encoding: only PCM is supported for now.

Why in the world would one need a text file converted from an iaf? Well... text files are easier to handle with data analysis software like the R programming language. I know MATLAB supports working with wav files directly, but there are mainly two reasons I needed an iaf to txt converter:

1. I do no use MATLAB.

2. When I wrote the iaf2txt app there was not an iaf to wav converter.

Maybe R can handle wav files directly, but I do not know.

I am not really sure if I need a text to iaf converter, but I am thinking about this issue. So far I do not need one.

FILES

lab 82 code