ANNOUNCE: brillig 0.3 - not quite the Brill tagger
creswick at gmail.com
Wed Sep 7 17:40:43 BST 2011
2011/9/7 Eric Kow <eric.kow at gmail.com>:
> But hopefully I won't have to, because I was actually just saying
> something incredibly simple and non-technical, that the brillig
> executable could just provide a thin wrapper around different kinds of
> taggers (as alternatives to each other, completely disjoint).
> You know, files go in, tags come out...

Is anyone else interested in supporting the Apache UIMA CAS format(s)?
I'm not a *huge* fan of the gritty system design details in UIMA (it
seems absurdly difficult to actually use an analysis engine / PEAR in
an application) but at least the file format for annotations is
It would also be nice to provide some sort of a bridge to another rich
set of NLP libraries, while the Haskell infrastructure is getting off
the ground.
(On a tangential note: this thread has been great for bringing some
tagging libraries to my attention... I didn't realize there were so
many options already!)

> but this was before I looked
> at the training file format and understood that this is what sequor
> provides. Oh well, this probably makes brillig just a bit redundant in
> infrastructure terms. :-)
>> For what it's worth, I just trained Sequor (using several spelling
>> features as encoded in the data/mlcomp2.features template) on the
>> initial 90% of the Brown corpus, and tested on the final 10%, and got
>> an accuracy of 96.2%. Training takes several hours, but tagging runs
>> at more than 3000 words/second.
> PS. can we have a small release with '-rtsopts'?
> Eric Kow <http://erickow.com>
> NLP mailing list
> NLP at projects.haskell.org