Generating interoperable taxonomies from tags - Annotation Graph, Processing, Rendering.
There is still a lot of thinking out loud here. I'm not yet well acquainted with your terminology :) Maybe we should rather call this 'interoperable categorization / navigation model'? See also
comment-Label-46
Deterministic vs. heuristic taxonomies
I guess what is mainly meant by taxonomies in SnipSnap lingo is the path of nodes linking snippets to sub-snippets, for which 'taxonomy' would actually be an inappropriate term. Here I'm rather talking about taxonomies in Snipspace. Would it be more appropriate to talk of categories or label hierarchies? A rigid taxonomy in Snipspace would only make sense as a corrective maintained by experts in case the statistics really go bad. It would also make sense as a feed for some context-sensitive label autosuggestion (cf.
Taxonomy handling /workflow), but you could just as well take the heuristic taxonomy for that. Demanding deterministic taxonomies from the start will make your users hate the system. I think this is already heading into topic-map-style complexity.
del.icio.us has shown that simple tags are more usable:
"Folksonomies remove the inherent ambiguity of hierarchy by removing any all concept of hierarchy from the organization scheme. Complex structure is instead represented by set algebra of intersection, union, and difference over sometimes-overlapping facets. This motivates the annotator by eliminating the need for memorization of ontology construction guidelines, and also does wonders to reduce annotator variance"
http://www.isi.edu/~mote/papers/Folksonomy.html
Let us say we allow multiple tags and - contrary to del.icio.us - we take tag order into consideration. Would the results be better than clustering tag bags on union, intersection and difference criteria? I doubt it.
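To make the "set algebra over tag bags" idea concrete, here is a minimal sketch. The snip ids and tags are invented for illustration, and tag order is deliberately ignored, since each bag is a plain set.

```python
# Hypothetical items: each carries an unordered bag (set) of tags.
items = {
    "snip-a": {"wiki", "java", "taxonomy"},
    "snip-b": {"wiki", "folksonomy"},
    "snip-c": {"java", "taxonomy", "rdf"},
}


def tagged_with(tag):
    """Return the ids of all items carrying the given tag."""
    return {item for item, tags in items.items() if tag in tags}


# Complex structure is expressed by combining the simple per-tag sets.
both = tagged_with("wiki") & tagged_with("java")           # intersection
either = tagged_with("wiki") | tagged_with("rdf")          # union
wiki_not_java = tagged_with("wiki") - tagged_with("java")  # difference
```

No hierarchy is stored anywhere; narrowing a view is just one more intersection.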
What does make sense is a distinction between the real facets (alias tags, alias categories, alias topics) vs. types. The type definition here is vague.
ToDo: UML diagram along these lines
http://www.topicmaps.org/xtm/#conceptualmodel (of course less complex than xtm).
Annotation graph for Taxonomies
This diagram
MetaData gives a general overview of interoperability possibilities. However, there is no discussion of, or detail on, the annotation graph format or the processing requirements. It further says
"Planned MetaData Exports are RDF, RSS, Dublin Core and XTM." Any follow-up? What is the annotation graph format: XFML, RDF, OWL?
Definition of import/export format
Is there any wiki-independent interchange standard?
Processing Requirements (processing possibly wiki external)
- standardized format?
- standardized interchange format
- preprocessing (stemming etc.)
- classification / taxonomy construction
- adaptable combination of various algorithms (SOM, non-/hierarchical clustering)
- taxonomy merging: if taxonomies are to be interoperable (i.e. importing/exporting from WordPress, del.icio.us), I'd have to merge them in an efficient way.
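As a rough illustration of the preprocessing, merging and taxonomy construction steps listed above, here is a small sketch. It is not SOM or a real clustering algorithm: it swaps in a simple subsumption heuristic (a tag becomes a parent of another tag if it covers most of that tag's items and is used more widely). The function names, the 0.8 threshold and the sample data are assumptions, and merging assumes both sources share item ids (e.g. URLs).

```python
from collections import defaultdict


def normalize(tag):
    """Crude stand-in for real preprocessing (stemming etc.): trim and lowercase."""
    return tag.strip().lower()


def merge_sources(*sources):
    """Merge several {item_id: tags} mappings, unioning the normalized tags per item."""
    merged = defaultdict(set)
    for source in sources:
        for item, tags in source.items():
            merged[item].update(normalize(t) for t in tags)
    return dict(merged)


def subsumption_edges(tagged_items, threshold=0.8):
    """Return (parent, child) pairs derived from tag co-occurrence."""
    items_by_tag = defaultdict(set)
    for item, tags in tagged_items.items():
        for tag in tags:
            items_by_tag[tag].add(item)
    edges = set()
    for child, child_items in items_by_tag.items():
        for parent, parent_items in items_by_tag.items():
            # A parent must be a different tag that is used on more items.
            if parent == child or len(parent_items) <= len(child_items):
                continue
            overlap = len(child_items & parent_items) / len(child_items)
            if overlap >= threshold:
                edges.add((parent, child))
    return edges


# Hypothetical exports from two systems, keyed by a shared item id.
wiki_export = {"u1": ["Wiki", "Java"], "u2": ["wiki", "rdf"], "u3": ["wiki"]}
bookmark_export = {"u1": ["wiki "], "u4": ["Wiki", "Java"]}

merged = merge_sources(wiki_export, bookmark_export)
edges = subsumption_edges(merged)
```

With this data, "wiki" ends up as the parent of both "java" and "rdf". The hierarchy is purely statistical, so it shifts as users add tags, which is the point of a heuristic taxonomy.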
Taxonomy handling /workflow (wiki internal)
- methods of feeding taxonomy subsets back to consumer (wiki, blog ..whatever)
- optimal method of integrating them in the editing workflow (label autosuggestion ...)
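Label autosuggestion could be fed from plain co-occurrence counts instead of a maintained taxonomy. The following is only a sketch under that assumption; `suggest_labels` and the sample snips are hypothetical and not part of any SnipSnap API.

```python
from collections import Counter


def suggest_labels(entered, tagged_items, limit=3):
    """Suggest labels that co-occur most often with the labels already entered."""
    entered = set(entered)
    counts = Counter()
    for tags in tagged_items.values():
        tags = set(tags)
        if entered & tags:
            # Count every label on this item that the user has not typed yet.
            counts.update(tags - entered)
    # Sort by descending count, then alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:limit]]


# Hypothetical snips with their label sets.
snips = {
    "s1": {"wiki", "java", "taxonomy"},
    "s2": {"wiki", "taxonomy"},
    "s3": {"java", "rdf"},
    "s4": {"wiki", "rdf"},
}

suggestions = suggest_labels({"wiki"}, snips)
```

Here a user who has typed "wiki" would be offered "taxonomy" first, because it shares the most snips with "wiki".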
Preferably the taxonomy could be exported and imported - in that case I'd only have to deal with the SnipSnap interchange API and the rendering engine. I'd like to process tags externally using the GATE framework
http://gate.ac.uk/ i.e. the annotation graph is optimally ATLAS-conformant
http://www.nist.gov/speech/atlas/. I would happily give up ATLAS if there is some standards-compliant wiki annotation graph format.
My scenario
Users are writing in 7 different languages on a similar topic. Most of them speak 2-3 languages. The languages they speak are set in their preferences so that they get 2-3 label input fields for the respective languages. Whatever snip they write is labeled in each of those languages.
What I'm gaining this way are
equivalence lists referencing multilingual snip instances.
A taxonomy is built by heuristic methods (see processing). I can choose my language of preference and view the taxonomy in that language. I can also see whether snip instances on a topic exist or not, and also see instances which only exist in other languages.
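A minimal sketch of how such equivalence lists might be collected and viewed. It assumes one label per language per snip and treats labels entered together on the same snip as equivalent; the field names (`id`, `lang`, `labels`) and the sample snips are made up for illustration.

```python
# Hypothetical snips: the language they are written in, plus one label per language.
snips = [
    {"id": "snip-1", "lang": "en", "labels": {"en": "taxonomy", "de": "Taxonomie"}},
    {"id": "snip-2", "lang": "de", "labels": {"de": "Taxonomie", "fr": "taxonomie"}},
    {"id": "snip-3", "lang": "fr", "labels": {"fr": "balise", "en": "tag"}},
]


def build_equivalence_classes(snips):
    """Group (lang, label) pairs that were entered together on the same snip."""
    classes = []
    for snip in snips:
        labels = set(snip["labels"].items())
        merged = {"labels": set(labels), "snips": {(snip["id"], snip["lang"])}}
        remaining = []
        for cls in classes:
            if cls["labels"] & labels:
                # Shared label: fold the existing class into the new one.
                merged["labels"] |= cls["labels"]
                merged["snips"] |= cls["snips"]
            else:
                remaining.append(cls)
        remaining.append(merged)
        classes = remaining
    return classes


def view(classes, preferred):
    """List each topic in the preferred language with its snips, split by language."""
    rows = []
    for cls in classes:
        names = sorted(label for lang, label in cls["labels"] if lang == preferred)
        name = names[0] if names else None  # None: no label in this language yet
        here = sorted(s for s, lang in cls["snips"] if lang == preferred)
        elsewhere = sorted((s, lang) for s, lang in cls["snips"] if lang != preferred)
        rows.append((name, here, elsewhere))
    return rows


classes = build_equivalence_classes(snips)
english_view = view(classes, "en")
```

In the English view, the topic "tag" shows no English snip but one French instance, which is the "only exists in other languages" case from the scenario.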