
All the Perl that's Practical to Extract and Report


jsmith (3335)
http://www.jamesgottlieb.com/
Jabber: jgsmith@tamu.edu

I'm a web applications developer trying to bring all things open source to all things humanities at Texas A&M University.

Journal of jsmith (3335)

Monday March 10, 2008
05:12 PM

RDF::Server 0.02 released

Version 0.02 has been uploaded. It fixes a few things, including a bug that only showed up with newer versions of Moose: a new Moose was released between the time I started development and when I released 0.01. Hopefully that won't happen again.

I've added initial support for FastCGI. It passes its tests using lighttpd as the translator from HTTP to FastCGI. I don't have process monitoring yet, though, so I don't recommend it for production environments. For that matter, I wouldn't recommend any of RDF::Server for production yet.

I've added an RDF semantic. The semantic manages the configuration of how URIs are handled once the interface role has taken its stab at translating the HTTP request into something the semantic backend can work with. With the RDF semantic, the whole RDF model is a document. The semantic doesn't try to impose any subdivision on the model like the Atom semantic does.

Feed support is in development; feeds shouldn't be expected to work yet. I'm probably going to bring SPARQL/RDQL into play along with feeds.

I added an rdf-server script that will take a configuration file and run the resulting service. No need to create Perl modules that bring together the appropriate roles. Some example configuration scripts are included in the distribution.
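As a rough idea of what such a configuration file might look like (a hypothetical sketch -- the key names here are my guesses based on the declarations shown in the earlier entries, not necessarily the ones the bundled examples use):

```yaml
# Hypothetical rdf-server configuration sketch; key names are assumptions.
protocol: FastCGI
interface: REST
semantic: RDF
default_renderer: RDF
handler:
  path_prefix: /
  model:
    class: RDFCore
    namespace: http://www.example.com/foo/
```

The point of the script is that a file like this replaces hand-writing a Perl module that composes the protocol, interface, and semantic roles.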

Enough code is there now that I can start working on the layers above/around it to do what I want to do, so I'll probably start focusing more on documentation. A lot of the detail that someone coming to the distribution fresh would need isn't there yet.

Friday February 29, 2008
10:39 PM

RDF::Server 0.01 on CPAN Now

I exceeded 90% overall test coverage, so 0.01 is out on CPAN. I still have a lot of documentation to write.

  Protocols: Embedded (basic Perl API - implies the REST interface), HTTP (POE HTTP server)
  Interface: REST
  Semantic: Atom
  Formatters: Atom, RDF, JSON

For version 0.02, I'm aiming for overall coverage of 95%. I'm adding an RDF semantic that works with the model as a complete document instead of the Atom semantic of working with parts of the model as documents. More bug fixes. Better documentation.

Atom support is a bit sketchy at the moment. Resources can be managed, but categories, collections, workspaces, and services are still a work in progress.

JSON support is read-only for now. I hope to allow resources to be created/modified using JSON in 0.03. I'll also try to add YAML support in 0.03.

12:50 AM

RDF::Server 0.01 any day now

I'm getting closer. I'm working on small details now.

All tests successful, 2 tests skipped (pod tests).
Files=17, Tests=219, 114 wallclock secs (81.32 cusr + 18.51 csys = 99.83 CPU)

File stmt bran cond sub pod time total
...
Total 88.2 70.2 55.9 89.1 35.8 100.0 82.1

My goal is to get the Total:total coverage to at least 90% before doing an initial release. That will be additional tests, but might also include additional documentation. I hope I can get that as early as tomorrow (Friday). I'm at the point now where code is being driven completely by tests.

The first release won't be as strict as it should be on the Atom spec. It will, however, require that the /atom:entry/atom:content/@type of resources be 'application/rdf+xml' for now.
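Concretely, that requirement means a posted entry would look something like this (a hand-written illustration, not an example from the distribution):

```xml
<!-- Illustration only: an atom:entry whose content carries the RDF payload,
     with /atom:entry/atom:content/@type set to application/rdf+xml. -->
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example resource</title>
  <content type="application/rdf+xml">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://www.example.com/foo/1">
        <dc:title>Example resource</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </content>
</entry>
```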

It will ship with read/write support for Atom and RDF as well as read support for JSON. I'll get write support for JSON in a near future release. I'll probably want it for some projects anyway.

It automagically manages an atom:updated and dc:created triple for each resource. The creator and other similar attributes will have to be supplied by an outside application for now.

RDF::Server doesn't understand authentication or authorization yet -- and I've not even thought a lot about that except that I might want to limit which predicates certain people can modify or read. I don't actually have a use case for that yet, so I've not tackled it.

The initial release will only support RDF::Core, but I want to extend support to RDF::Redland and probably RDF::Helper. Support for a triple store consists of two modules, one for the model and one for the resource. Everything else in the framework is (or should be) independent of the store.

Future directions: SPARQL, RDQL, and inference engines as feed producers -- because I want to do magic with the resulting feeds.

Tuesday February 19, 2008
11:37 PM

RDF::Server almost done

By the power vested in me by Moose, I can do:

  package My::Server;

  use RDF::Server;
  protocol 'HTTP';
  interface 'REST';
  semantic 'Atom';

  render xml => 'Atom';
  render rdf => 'RDF';

Then, to instantiate it, I use the following configuration (as an example):

  my $server = My::Server -> new(
          default_renderer => 'Atom',
          handler => [ service => {
                  path_prefix => '/',
                  workspaces => [
                  {
                          title => 'Workspace',
                          collections => [
                                  {
                                          title => 'All of Foo',
                                          path_prefix => 'foo/',
                                          model => {
                                                  namespace => 'http://www.example.com/foo/',
                                                  class => 'RDFCore'
                                          }
                                  }
                          ]
                  }
                  ]
          } ]
  );

This results in the following URLs:

- / - an app:service document
- /foo/ - an app:collection document (because there's no path component configured for the app:workspace)
- /foo/$id - an atom:entry document for the RDF resources centered around http://www.example.com/foo/$id.

The handler attribute is supposed to be a code ref that returns the information (so it can be dynamically built with each request if you really want it to be), but the module that defines the Atom semantics overloads the handler attribute's Moose type and allows coercion. This should allow configuration to be from a config file if the right Moose role is included in the server class.
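The coercion idea -- accept either a ready-made handler object or plain configuration data and upgrade the latter -- can be sketched in plain Perl. This is only an illustration: My::Handler and coerce_handler are invented names, and RDF::Server does this declaratively with a Moose subtype and coercion rather than a helper function.

```perl
# Sketch of attribute coercion in plain Perl. My::Handler and its keys
# are invented for illustration.
package My::Handler;
sub new { my ($class, %args) = @_; bless { %args }, $class }
sub path_prefix { $_[0]{path_prefix} }

package main;
use strict;
use warnings;

# Pass real handler objects through untouched; upgrade a plain hashref
# of configuration into a handler object.
sub coerce_handler {
    my ($spec) = @_;
    return $spec if ref($spec) && ref($spec) ne 'HASH';
    return My::Handler->new(%$spec);
}

my $h = coerce_handler({ path_prefix => '/foo/' });
print $h->path_prefix, "\n";    # prints "/foo/"
```

With a coercion like this in place, configuration read from a file (a hash of plain data) and a handler built in code can be passed to the same attribute.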

I might make the rendering management configurable instead of part of the module definition. Unlike the protocol, interface, and semantic modules, the renderers don't include code, attributes, or expectations in the server class.

So far, I have passing tests for fetching app:service and app:collection documents and creating, fetching, and adding triples to RDF resources.

One thing I'm doing that might not be quite 'usual' is that I'm treating an RDF model as a collection of RDF resource documents. RDF resource documents are a collection of RDF triples centered around a particular RDF subject. I'm not treating the entire body of knowledge in the model as a single document. That's part of what's in the Atom semantic I'm working with.
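The grouping itself is simple: a resource document is just the subset of the model's triples that share a subject. A minimal core-Perl illustration (the triple layout here is invented; it is not RDF::Core's API):

```perl
use strict;
use warnings;

# A model as a flat list of [subject, predicate, object] triples.
my @model = (
    [ 'http://example.com/foo/1', 'dc:title',   'First'  ],
    [ 'http://example.com/foo/1', 'dc:creator', 'jsmith' ],
    [ 'http://example.com/foo/2', 'dc:title',   'Second' ],
);

# A resource document: all triples centered on one subject.
sub resource_document {
    my ($subject, @triples) = @_;
    return [ grep { $_->[0] eq $subject } @triples ];
}

my $doc = resource_document('http://example.com/foo/1', @model);
print scalar(@$doc), " triples\n";    # prints "2 triples"
```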

Hopefully more next week, including something on CPAN with a lot more documentation than what I have now.

Friday February 08, 2008
06:49 PM

RDF/Atom/REST Server

It's been a few years since I last posted here. I've gone on to a different job and jumped back into Perl after an extended time away. Stuff has happened in that time.

Like Moose.

The use of roles is nice. I can declare an expected interface and easily test that an object or class implements it. I can use that information to know what kind of capabilities I can expose to the world.

I'm working on a framework for building RDF servers. I need such a beast to support some research projects at the university, but decided that I should produce something that might be useful to someone else. Of course, that means it has to be able to do things that I don't need to do.

But I don't want to have to program every conceivable behavior.

Instead, I'm building a framework that lets me get done what I need while staying flexible enough that someone else can replace bits of it to get done what they need.

So far, I have the following as a way to build a server (subject to change, of course):

    package My::Server;

    use RDF::Server;

    with 'MooseX::Daemonize';
    with 'MooseX::SimpleConfig';
    with 'MooseX::Getopt';

    interface 'REST';
    protocol 'HTTP';
    style 'Atom';

    #----------

    my $server = My::Server -> new(
            handler => [ workspace => {
                    collections => [
                            { # defines an app:collection
                                    path_prefix => 'gallery',
                                    entry_handler => [ RDFCore => { } ],
                                    categories => [ ]
                            }
                    ]
            }]
    );

    $server -> start; # would run on port 8080 by default

That will bring together everything needed to run a standalone server exposing a REST interface over HTTP using POE. Other protocols could be Apache2 or FastCGI (or even plain CGI if we really wanted to). I'm not sure what other interfaces there should be. If I can figure out how, I'll make the triad HTTP/REST/Atom the default.

My::Server->handler is an object that will handle the actual request. The HTTP and REST classes just handle the communication with the world and how the request gets handed off to the handler object. This lets the same data backend work with different protocols and interfaces.

The handlers determine if we're speaking RSS or Atom and how the path (as supplied by the interface class) maps to data. The interface class also determines what action is requested.

The 'style' declaration is optional and mainly determines how the configuration is handled. The Atom style expects a hierarchy based on RFC 5023, though the top-level document type can be any of the ones defined in the app: namespace.

Hopefully, if I can get everything together and tested and working, I'll get something out on CPAN in the next week or two.

Thursday March 10, 2005
03:54 PM

Perl Syntax Mangling and XML Compilers

My mind has been trying to wrap itself around the idea of an XML Compiler Compiler. That would be something that takes a description of an XML language (such as a RelaxNG description with some additional bits to explain how to actually do the compile) and writes a SAX handler that will do the compile. Of course, we need to allow one XML language to embed or be embedded in another XML language. (A lot of this comes from the work on the Gestinanna project that resulted in a compiler for statemachines and another for workflows that shared a lot of common code.)

With that in mind, I started fresh work on the compiler code without trying to hack the existing Gestinanna modules. I also changed where I put the commas and semi-colons.

Now, I have code like the following:

package XML::Compiler::SAXHandler

; sub new_handler {
    my($type, %params) = @_

  ; return bless { %params
                 , Context => [ ]
                 , Current_NS => { }
                 } => $type
}

; sub start_document {
    my $e = shift

  ; my $sub
  ; foreach my $ns ($e -> handled_namespaces) {
        foreach my $h ($e -> ns_handlers($ns)) {
            if( ($sub = $h -> can('start_document'))
                && ($sub != \&start_document) ) {
                $sub->($h, $e) && last
            }
        }
    }
}

; sub start_element {
    my($e, $el) = @_
  ; $el -> {Parent} ||= $e -> {Current_Element}
  ; $el -> {Namespaces} = { %{$el -> {Parent} -> {Namespaces} || {}}
                          , %{$el -> {Namespaces} || {}}
                          }
  ; $e -> {Current_Element} = $el

  ; my $ns = $el -> {NamespaceURI}

  ; my @attribs
  ; my @defined_ns

  ; foreach my $attr (@{$el -> {Attributes}}) {
    }

  ; $el -> {_Defined_NS} = \@defined_ns
  ; $el -> {Attributes} = \@attribs

  # need to handle block v. expression context setting
  ; push @{$e -> {Context}}, 0
}

After you uncross your eyes, you might wonder why I put the semis and the commas at the beginning of the line (and instead of commas and semis being optional at the end of a block, they are now optional at the beginning of a block, with semis optional after a block as well). The key to this is the last comment in the code example. Semicolons delimit series of statements while commas delimit series of expressions. The traditional approach of putting these at the end of their respective part means we need to know the type of code we just emitted. By putting them at the beginning, we only need to know what kind of code we are expecting.
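A toy emitter shows the payoff: with separators prepended, the emitter only consults the context it is entering and how many items it has emitted there, never the kind of code it just finished. (Plain Perl, unrelated to the real Gestinanna compiler code.)

```perl
use strict;
use warnings;

# Append a fragment to a buffer, prepending the separator demanded by
# the context we are *entering*: ';' between statements, ',' between
# expressions, nothing before the first item in either context.
sub emit {
    my ($buf, $context, $count, $fragment) = @_;
    my $sep = $count == 0             ? ''
            : $context eq 'statement' ? "\n; "
            :                           "\n, ";
    return $buf . $sep . $fragment;
}

my $code = '';
my $n = 0;
$code = emit($code, 'expression', $n++, 'Context => [ ]');
$code = emit($code, 'expression', $n++, 'Current_NS => { }');
print "$code\n";    # second fragment arrives with a leading ", "
```

Note that emit never asks what the previous fragment was -- only the count and the current context matter, which is exactly the simplification described above.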

By managing what we expect when we see a start element, we can hopefully simplify some of the code that otherwise would need to handle the selection of the statement or expression terminator in the end element.

Monday July 26, 2004
02:44 PM

Gestinanna 0.02

The right way . . . is to separate the meaning of a program from the implementation details.

Saying less about implementation should also make programs more flexible. Specifications change while a program is being written, and this is not only inevitable, but desirable.

---Paul Graham, "The Hundred-Year Language"

Gestinanna uses XML vocabularies to describe workflows and controllers as state machines. Using taglibs, it provides various extensions for accessing the Gestinanna::POF packages, workflows, scripting, authorization and authentication, and various odds and ends.

I finally got around to getting the code released. Twice as much code as the previous release :) Work got a bit more fun for a few weeks.

Change log:

  o Basic workflow support. See the listserv package for an example.

  o Added a package system. Shell support is not complete, but sufficient for installing a package.

  o Moved sample pages into packages installable via the package system.

  o Added site configuration single inheritance.

  o Extensive unit tests in the low-level modules. Not yet complete.

  o Split state machine XML schema into two namespaces: state machine and scripting.

  o The shell tool may be configured not to use a pager.

  o Added `site clone' command to the shell. Both cloning and creating a site will now create a site configuration as well.

  o Added `site config' command to the shell to edit an existing site's configuration. The resulting configuration must be parsable by XML::LibXML.

  o The `site uri add' command now takes an argument to indicate the object type the URI refers to.

  o Gestinanna::Request::get_url added to resolve an object to a url (for use by Apache::AxKit::Provider::Gestinanna).

  o Alzabo naming routine updated for Alzabo 0.82.

  o Namespace handlers for XSM scripts are now configurable on a per-site basis.

  o Namespaces can be specified for namespace handlers, overriding the default. This is useful for using different handlers that offer the same interface but work with different backends.

  o Configuration information specific to a content handler type is now enclosed in a tag.

Monday June 21, 2004
09:02 PM

Gestinanna release by end of week

I should have a new release of Gestinanna by the end of the week. I have re-written quite a bit and modified even more :/ But we now have a package system and inheritable site configurations as well as inheritable RDBMS schema definitions. We have a *lot* of unit tests and some refactoring. We also have a new testing framework that lets me embed tests in the code (via pod) and track test results on a per-method basis. I also can make a nice dependency graph in SVG showing the test results.

I will get the Apache side of it working, make sure stuff installs and runs, and then ship a release. Workflows won't work yet, though some code is there. Security isn't tied in everywhere it needs to be. Several things still remain before it can be considered beta, but we're getting there.

Unfortunately, I might have to pause development for a while. It seems management prefers rapid development (with lots of security holes) over secure development (even though we handle SSNs and student grades). I'll know by the end of the week what will happen.

Wednesday June 02, 2004
09:05 AM

Update on Gestinanna

I have a new supervisor. He started at the beginning of May, was here for two weeks, and is now finishing up a three-week vacation. It wasn't us :) He had already made plans for the vacation (and conference, actually), so they let him keep it. He was moving within the University, so it wasn't like he was changing employers.

He's had some good effects on the Gestinanna project. He's forced me to get organized such that we could bring someone else into the project if we needed to.

As a result, I have code that can sketch out the dependencies between the various modules. I can automatically build tests from tests embedded in pod (which I am now writing furiously -- unit tests are good, but take a while to write). I can run tests in the order required by module and method dependencies. I can generate reports showing how many tests pass/fail for each method. I just need to make all this a subclass of Module::Build and release it. The SVG dependency graph is my personal favorite.

I've added initial support for workflows based on the Workflow module released on CPAN a short while ago. I had been wondering for some time how I was going to manage coordination between applications to make things happen in particular ways (approval of changes to allow them to go into production, for example). Workflow.pm fell into my lap :)

I'm wrestling with continuations. I have rudimentary ones of a sort (at least in the sense of a paper I saw the other day), but I can't do the following yet:

  <value name="/email">
      <call-out state-machine="/std/email/composer"/>
  </value>

and have /email//* be all the information needed to send the e-mail. This is a simple and contrived example in some ways, but the more interesting case is when one program dynamically determines that it must call another to get needed information (such as a site configuration manager calling out to the state machine for a particular content provider type -- allowing dynamic addition of new content provider types without requiring changes to the overall config manager application).

Part of the problem is that we are using a SAX-based XML->Perl compiler. By the time we know we need to manage a continuation, we might be embedded in a loop, an if-then-else (choose/when/otherwise) sequence, or some other difficult context. If we assume we will always have a continuation, then loops will probably be slow and clunky for all cases. I'll have to think about this some more, but it doesn't look impossible -- just a little difficult at the moment.

Continuations in the above sense won't be on CPAN by OSCon, but I hope the other stuff is. That's my goal anyway at this point.

Friday April 16, 2004
09:17 AM

Accidental Inheritable Site Configurations

Another version of the Gestinanna framework is on its way -- still working out some ripples. Exciting though.

I went to a newly installed test system to install the framework and found several assumptions that caused it to fail. It had worked just fine on my development system because it and that system had grown up together. Anyone else will see errors and missing parts if they try to install it. Oh, well. It's only the initial release of a fairly large, complex system. Not a good first impression, but hopefully it can be fixed with a second release.

Well, what started as an effort to just fix those problems and get another release out has snowballed into a series of refactorings that are making things a lot better.

There's now a packaging system. The shell tool can create and open packages (just compressed tarballs), list the contents, edit the contents of a file, manage the configuration file, manifest, and readme, and write them to disk. The package manager can find them. They can almost be installed.

The installation of a package led me to wonder where the information was that told me where to install it to. I didn't want to force installation on a per-site basis, especially since all the sites in a schema share that schema's tables, which is where a package's files end up. The repository for views, controllers, xslt, etc., is in the RDBMS schema and not necessarily tied to a particular site. Yet all the configuration information that tied a data type (view, xsm, xslt, etc.) to a table (View, XSM, XSLT, etc., and their associated auxiliary tables) was tied to a site.

So I saw the need for a global configuration. But if I try to create a site 0 using Alzabo, it ignores the primary key (site = 0) and uses the next sequence (since it is auto-incrementing the primary key). What to do....

Now, I have a site configuration package, Gestinanna::SiteConfiguration, whose constructor can take a parent site as an argument. This parent site doesn't have to be a global configuration. It just supplies information that can be overridden or augmented by the child site. The key is that the parent site is just a Gestinanna::SiteConfiguration object.

This simple design decision means I can build arbitrarily long chains of single-inheriting site configurations :) Something I had been wanting to do for a long time, but hadn't quite figured out. It wasn't a priority, since we didn't really need it that much, but I knew someone else might want it. Now we have it, and it may make life a *lot* easier for anyone using this framework to manage multiply-branded applications.
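The lookup rule behind such a chain is simple: answer from your own settings if you can, otherwise ask your parent. A bare-bones sketch (class, method, and key names here are invented, not Gestinanna's actual API):

```perl
# Minimal illustration of single-inheriting site configurations.
# SiteConfig, get, and the keys below are invented names.
package SiteConfig;
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    my $parent = delete $args{parent};    # another SiteConfig, or undef
    return bless { settings => { %args }, parent => $parent }, $class;
}

# Walk up the chain until some ancestor defines the key.
sub get {
    my ($self, $key) = @_;
    return $self->{settings}{$key} if exists $self->{settings}{$key};
    return $self->{parent} ? $self->{parent}->get($key) : undef;
}

package main;
my $global  = SiteConfig->new(theme => 'plain', tag_path => '/std');
my $branded = SiteConfig->new(parent => $global, theme => 'maroon');

print $branded->get('theme'), "\n";       # prints "maroon" (overridden)
print $branded->get('tag_path'), "\n";    # prints "/std" (inherited)
```

Because the parent is just another SiteConfig, the chain can be as long as you like: a hidden top-level configuration, a per-brand layer that only overrides the theme, and so on.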

Now, all the tag paths, data providers, content providers, etc., can be configured in one top-level configuration for a site that is never exposed. All the exposed sites can then inherit from it and just change the theme - resulting in the ability to manage one site, but have it show up on multiple URLs with different brandings.

The ripples? Well, the site configuration is now managed by a class instead of being a hash that the rest of the system could freely inspect. Data providers (for example, Gestinanna::POF::*) now need to be able to get info from the DOM during configuration (done during server startup). Content providers (Gestinanna::ContentProvider::*) likewise. URL mappings will need to be inheritable instead of looking at current site and site 0 (the marker for a global mapping). We now have a site path to search in addition to the tag path. The factory is created and managed by the site configuration. Etc., etc., ....

I was hoping to have a new set of packages out this week, but it looks like I'll need to hold off until next week. All because I wanted to fix some errors and get away from telling people to put a file in a particular location after installation :/