Apache Stanbol Version of OpenCalais Integration – Alfresco DevCon 2012 lightning talk slides

I uploaded to SlideShare the slides from the first lightning talk I gave at Alfresco DevCon 2012 San Jose:

An Alfresco Apache Stanbol Integration (port of OpenCalais integration) – Alfresco DevCon 2012 San Jose

It covers the port of the OpenCalais Integration and its Share UI extension to work with Apache Stanbol. These integrations support auto-tagging, semantic tag clouds, and semantic geo-tagged maps. Both integrations are open source and available on Google Code.

FlexibleShare updated

FlexibleShare extends FlexibleDashboard (dashboard framework, BI charting, reporting pods) with FlexSpaces doc management pods (Alfresco backend) and adds additional Flex pods for Share collaboration (Alfresco Share backend). All three of these projects are open source. FlexibleShare has been updated to use code from the latest versions of FlexSpaces and FlexibleDashboard, and the Share pods now have site selection drop-downs. An Alfresco Add-Ons page for FlexibleShare has also been added.

The doc management portion now supports Alfresco 4.0 and has a new preferences dialog for easier setup of the server domain/port and of the API key for optional semantic auto-tagging with the OpenCalais Integration for Alfresco. The default config in flexibleShareAirPods.xml just has the combined multi-view FlexSpaces pod, shown in the top left. This screenshot also shows the available search, tasks, and local files pods (the all-repository doc lib pod is not shown). In the AIR version, files from the local files pod can be copied into a doc lib view via drag/drop. Also in the AIR version, multiple selected files can be copied via drag/drop from the desktop into a doc lib view or copied out via drag/drop, and the native desktop clipboard can be used to copy/paste files between the desktop and a doc lib (AIR can do more than the HTML5 drag-in available in some browsers).

flexibleshareairbld4-33percent.png

The Share collaboration wiki, blog, discussions, and doclib Flex pods are now more usable out of the box, with added drop-downs to select the Share site to work with (instead of setting the Share site shortName in the pods xml file). More work is needed to hook up the calendar pod to load Share site calendar info (and an add event dialog is not available yet). Although the calendar pod is able to load iCalendar files, more work is needed to get it to work with the iCalendar data available from the Alfresco “slingshot” /calendar/eventList?site={shortName}&format=calendar webscript.
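As a rough sketch of what that hookup involves, the Python below builds the event list URL for one site and pulls the properties of each event out of iCalendar text. The base URL, the function names, and the sample feed are assumptions made for illustration; the data the webscript actually returns may differ from this simplified sample.

```python
from urllib.parse import urlencode

# Assumed base URL of the Alfresco web script service (hypothetical host and port).
BASE = "http://localhost:8080/alfresco/service"

def event_list_url(short_name, base=BASE):
    """Build the eventList webscript URL for one Share site."""
    query = urlencode({"site": short_name, "format": "calendar"})
    return f"{base}/calendar/eventList?{query}"

def parse_events(ical_text):
    """Collect the properties of each VEVENT block as a dict."""
    events, current = [], None
    for raw in ical_text.splitlines():
        line = raw.strip()
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            # Drop property parameters such as DTSTART;TZID=UTC
            current[name.split(";", 1)[0]] = value
    return events

# Made-up sample feed, only to exercise the parser.
SAMPLE = """BEGIN:VCALENDAR
BEGIN:VEVENT
SUMMARY:Kickoff
DTSTART;TZID=UTC:20120628T100000
END:VEVENT
END:VCALENDAR"""

print(event_list_url("demo"))
print(parse_events(SAMPLE))
```

This parser ignores folded lines, escaping, and nested components, so it is only a starting point for checking what the webscript sends back.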

flexiblesharesharepods2.png

Planned for FlexibleShare: calendar pod hookup with Share sites, multiple repository support, support for CMIS repositories, drag/drop copy between repositories, support for Alfresco Cloud repositories, multi-repository search, Solr facets search navigation, support for Apache Stanbol semantic auto-tagging / semantic search, mobile/touch?, and a port/translation to HTML5 / CSS / JavaScript (FlexSpaces, CMIS Spaces, FlexibleDashboard, FlexibleShare).

Steve Reiner
Integrated Semantics
@stevereiner on twitter

OpenCalais Integration updated for Alfresco 4.0 and added use of new Share config mechanism. Apache Stanbol Plans

The OpenCalais Integration for Alfresco was finally updated for Alfresco 4.0. Given the shift away from the Alfresco Forge to the Alfresco Add-Ons catalog site, the new home for the OpenCalais integration is a Google Code site linked from its Add-Ons page.

For Alfresco 4.0 with Solr enabled, an issue was fixed. (Some code that needed to reach newly added top-level semantic tag categories right away had to change to use a CategoryService API instead of a search query, since with Solr there is an added delay in indexing. Changing the alfresco.cron value in solrcore.properties from 15 secs to 150 secs helped turn an intermittent problem into one that was reproducible every time.)
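The effect can be pictured with a toy model in Python. This is not Alfresco or Solr code: the class and method names are hypothetical, and the refresh call simply stands in for the periodic index tracking job. It only illustrates why a direct lookup sees a new category immediately while a search query does not.

```python
class ToyRepository:
    """Toy model: an authoritative store plus a search index that lags behind."""

    def __init__(self):
        self.categories = set()  # authoritative data, updated immediately
        self.index = set()       # search index, updated only on refresh

    def create_category(self, name):
        self.categories.add(name)

    def refresh_index(self):
        """Stands in for the periodic index tracking job."""
        self.index = set(self.categories)

    def find_by_search(self, name):
        return name in self.index

    def find_by_direct_lookup(self, name):
        return name in self.categories

repo = ToyRepository()
repo.create_category("Company")
print(repo.find_by_search("Company"))         # False until the index catches up
print(repo.find_by_direct_lookup("Company"))  # True immediately
repo.refresh_index()
print(repo.find_by_search("Company"))         # True after the refresh
```

Lengthening the refresh interval widens the window in which the search path misses new data, which matches the way the longer cron value made the problem show up every time.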

The Share Integration (semantic tag cloud dashlet, semantic geo-tagged map dashlet, auto-tag action menu in doc libraries and repository) was updated to use the new doclib action config mechanism added in 4.0. It’s much nicer to put an added action menu in a web-extension/share-config-custom.xml file than to set up modified versions of actions-common.get.head.ftl, documentlist.get.config.xml, etc. in web-extension. (Helpful ECM Stuff blog post on Share action config in Alfresco 4.0)

To use the free OpenCalais service, you need to get an API key from opencalais.com. This allows you to submit 50,000 documents a day. More requests are supported in the non-free version, called Calais, than in the free (but not open source) service, called OpenCalais. Note that document size per submission is limited to 100k bytes in all versions, and that the service retains extracted metadata (it doesn’t retain content). So it’s geared more for news articles than large sensitive documents. Calais has a test page where you can give it text and see what it extracts.
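A client could guard against these limits before sending anything. The Python below is a minimal sketch under stated assumptions: it reads "100k bytes" as 100,000 bytes (the service may count differently), the function name is hypothetical, and it makes no call to the service itself.

```python
MAX_BYTES = 100_000    # assumed reading of the "100k bytes" per-submission limit
DAILY_LIMIT = 50_000   # free-tier submissions per day quoted above

def can_submit(text, submitted_today):
    """Return (ok, reason) for a candidate submission."""
    size = len(text.encode("utf-8"))  # the limit is in bytes, not characters
    if size > MAX_BYTES:
        return False, f"too large: {size} bytes"
    if submitted_today >= DAILY_LIMIT:
        return False, "daily quota used up"
    return True, "ok"

print(can_submit("A short news article.", submitted_today=10))
print(can_submit("x" * 150_000, submitted_today=10))
```

Measuring the encoded length matters for non-ASCII text, where the byte count can be well above the character count.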

To use the Share auto-tag action menu (used to do a one-time auto-tag on a document) you need a Calais api key set up in module\calais\module-context.xml (see readme.txt). Semantic tags will be listed in the properties section of a Share document details page (not with the regular tag UI, since a different category content model / custom root category is used for semantic tags). You can also add one or more semantic tag cloud dashlets and a semantic geo-tagged map dashlet to Share dashboards (site and/or global) to navigate from semantic tags to documents. In Explorer, to do a one-time auto-tag you need to use the action for running a rule on a doc and give the Calais key each time in the dialog. A rule to auto-tag documents in a folder can be set up in Explorer or Share (using the “Auto-tag with Calais” action; you need to give the Calais api key as a parameter to this).

FlexSpaces has support for the OpenCalais integration in all its versions (desktop AIR client, Flex in-browser, Mobile AIR). Like Share, it supports semantic tag clouds, a semantic geo-tagged map, and one-time auto-tagging. It has additional OpenCalais features: semantic tag suggestion, and adding / removing semantic tags on a document. You can set up a Calais api key (and Alfresco server info) in the FlexSpaces preferences dialog that was added in the 2012.02.08 version and avoid having to do this in FlexSpacesConfig.xml. Info entered is sticky and per user on their local machine (stored in a Local Shared Object). So theoretically each user could submit 50,000 documents a day to OpenCalais if they each signed up for a key. FlexibleShare includes FlexSpaces and its semantic features, but hasn’t been updated with the preferences dialog or other recent FlexSpaces changes yet (update: FlexibleShare 6/28/2012 version now has the preferences dialog and Alfresco 4.0 support too).

FlexSpaces Preferences Dialog

The plan is to have an Alfresco integration with Apache Stanbol on the same Semantics4Alfresco Google Code site as the OpenCalais integration. Apache Stanbol (derived from the IKS project) is fully open source, is a general stack of frameworks for semantic content management that can do more than content enhancement, can get around the drawbacks of OpenCalais, and gives you more flexibility to set up customized ontologies vs. the fixed support Calais has. Stanbol can also call other enhancement engines instead of the default OpenNLP, or even chain them together. Stanbol has an adapter for OpenCalais. For enhancing news, OpenCalais works better out of the box than OpenNLP. Zaizi has already done Stanbol integration work, although only a version for an old IKS version is currently open source. Integrated Semantics will leverage / extend any newer Stanbol integration that Zaizi makes available as open source. A Stanbol integration could extend Solr facets with semantic facets.

Flexible Dashboard, FlexibleShare, and Flexible Liferay updates

FlexibleDashboard

FlexibleDashboard is a dashboard application / framework focused on the uses of dashboards for BI / Reporting / Charting. It started with esria dashboard code and evolved from there, adding flexmdi cascading / tiling (esria pod drag/drop in tile mode), more pods, pods in Flex modules, Spring ActionScript configurable pods, and both Flex+Browser and Flex+Air versions. 

Additional pods beyond the esria charting pods include: JasperReports viewer, BIRT Report viewer, OLAP pivotable grid with XMLA datasource support and MDX query editor (from Grebulon sourceforge project), Pentaho Charts, GridPod, ChartGridPod, calendar, and iframe html. The AIR version has additional Flex+AIR pods: webkit HTML, web browser, Google gadgets, Liferay portlet gadget, and a local files browser.

In February, FlexibleDashboard was ported from Flex 3 to Flex 4 in build3, including a first pass of using spark controls instead of halo controls. I got mostly through porting the esria dashboard part, then discovered code by Greg Lafrance (who later wrote a 4 part Adobe devnet series of articles part1, part2, part3, part4)  that helped me port the remaining parts of the esria code in FlexibleDashboard. I also used a skin from flexdevtips for pod windows.  Instead of porting the full flexmdi code to Flex 4 / spark, all the style setting code was removed, and just the basic parts were ported.

In April, build 4 of FlexibleDashboard added building pods in separate Flex modules. Instead of the esria-style hardcoded switch from pod type to pod class, the module path is now listed on each pod in the pods xml files.
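The idea of declaring the module on each pod can be sketched in Python as follows. The element and attribute names in this pods file are made up for illustration and are not the actual FlexibleDashboard schema; the point is only that the mapping from pod to module comes from data rather than from a switch statement in code.

```python
import xml.etree.ElementTree as ET

# Hypothetical pods file; element and attribute names are invented for this sketch.
PODS_XML = """
<pods>
  <pod id="grid1" title="Sales Grid" module="modules/GridPodModule.swf"/>
  <pod id="chart1" title="Sales Chart" module="modules/ChartGridPodModule.swf"/>
</pods>
"""

def module_paths(xml_text):
    """Map each pod id to the module path declared on that pod."""
    root = ET.fromstring(xml_text)
    return {pod.get("id"): pod.get("module") for pod in root.iter("pod")}

for pod_id, path in module_paths(PODS_XML).items():
    print(pod_id, "->", path)
```

Adding a new kind of pod then means adding a module and one line of config, with no change to the code that loads pods.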

Build 4 also introduced having each use of a pod/module configurable separately with Spring ActionScript in a separate context xml file (in src/spring-actionscript/ dir). This is currently used in 3 differently configured GridPods and in one ChartGridPod.  A similar approach could be used with other Flex frameworks that support modules (Parsley, etc.).  The GridPods and ChartGridPod  just reference a data service interface IDataService, and the particular data service implementation is configured and injected / autowired with Spring ActionScript. 
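The pattern of coding the pod against an interface and injecting the implementation can be sketched in Python. This is an analogy, not the ActionScript code: the method name fetch_rows, the stub classes, and the CONTEXTS table (standing in for the per-pod context xml files) are all hypothetical.

```python
from typing import Protocol

class DataService(Protocol):
    """Plays the role of the IDataService interface; the method name is assumed."""
    def fetch_rows(self) -> list: ...

class XmlDataServiceStub:
    def fetch_rows(self) -> list:
        return [{"region": "West", "sales": 10}]

class SoapDataServiceStub:
    def fetch_rows(self) -> list:
        return [{"region": "East", "sales": 7}]

class GridPod:
    """Knows only the interface, never a concrete data service class."""
    def __init__(self, data_service: DataService):
        self.data_service = data_service

    def load(self) -> list:
        return self.data_service.fetch_rows()

# Stand-in for one context file per configured pod.
CONTEXTS = {
    "westGrid": XmlDataServiceStub,
    "eastGrid": SoapDataServiceStub,
}

def build_pod(context_name):
    """Create a pod and inject the data service its context names."""
    return GridPod(CONTEXTS[context_name]())

print(build_pod("westGrid").load())
print(build_pod("eastGrid").load())
```

The same GridPod class serves both configurations; only the injected service differs, which is what lets one pod class appear several times with different data sources.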

Build 4 also has 3 simple data service class implementations (SoapDataService, XmlDataService, RemoteObjectDataService) which get data via BlazeDS (or from LCDS). The shared ChannelSet for the grid pods is configured in src/spring-actionscript/application-config.xml. The esria-dashboard-like config is still in data/FlexibleDashboardPods.xml (or in data/FlexibleDashboardAirPods.xml for the AIR version).

FlexibleShare

FlexibleShare adds the following to FlexibleDashboard: FlexSpaces pods for Alfresco document management along with Flex based collaboration pod front ends (wiki, blog, calendar, doc lib, discussions) to an Alfresco Share backend.  FlexibleShare was ported from Flex 3 to Flex 4 in May.  The additional pods are now in Flex modules too.

FlexibleLiferay

FlexibleLiferay extends FlexibleDashboard to provide a Flex portal container for Liferay.  The Flex+AIR client is able to get all of the places/layouts, tabs, and portlets you would normally see as a Liferay portal user and display them using a Flex based container with an HTML control for each portlet (doesn’t support pure Flex/Flash portlets yet).

The basic idea was that a Flex portal, instead of starting from scratch, could leverage the server side and services of an existing portal. Another use case is including views from an existing portal in a larger enterprise Flex application. Note that FlexibleDashboard and FlexibleShare can also display individual Liferay portlets in a configurable pod using the Liferay widget, just not whole existing portal layouts.

Recently FlexibleLiferay was ported from Flex 3 to Flex 4. The server side code was changed from Liferay 5.x ext environment code to a simpler web plugin for Liferay 6.x. The FlexibleLiferay client uses BlazeDS/AMF to remote to Java APIs provided in the web plugin. Built versions of the AIR client and the server piece (web plugin) are now available (previously there was only code available in the svn).

Alfresco OpenCalais Integration Share UI

The Alfresco OpenCalais Integration now has UI (Spring Surf / HTML / JavaScript / YUI) for Alfresco Share in addition to the support in FlexSpaces (Flex/Flash). The Share UI has a semantic tag cloud dashlet, a geo-tagged (Google map based) semantic map dashlet, and an auto-tagging action. The Share UI is for Alfresco 3.3 and 3.4.

share-calais-dashlets-2.png

The dashlets will show semantic tags in all share sites when added to the overall Share dashboard, and show site specific semantic tags when added to site dashboards.  Clicking on a tag in the semantic tag cloud or on a semantic tag map marker will take you to a search results list of documents with the semantic tag.  The semantic tag cloud dashlet can be changed to show semantic tags for a specific category or all categories.

The semantic tag cloud dashlet is based on  Will Abson’s tag cloud dashlet in the Alfresco Share Extras collection. Will now also has a Google map dashlet in this collection showing geo-location of photo files using Tika extracted metadata available in Alfresco 3.4.

share-calais-autotag-action-2.png

The added auto-tag action menu (in the more menu and details page) can be used to auto-tag the selected document with the OpenCalais service. This action is added to both the site document library and the repository document library page menus. The auto-tagging action can also be set up in a content rule to auto-tag all documents in a folder, using the rule UI of Alfresco Explorer or Share (choose to perform the action “Auto-tag with Calais”).

Note that semantic tags are implemented with categories under a custom root category. They won’t show up in the regular Alfresco tag or category UI. Currently only the Alfresco Explorer details page will list semantic tags (update 3/30/2011: they will now show up on the Share doc details page too in the 1.3.1 version of the OpenCalais integration).

FlexSpaces, in addition to having the semantic tag clouds, semantic map, and auto-tag action features in the Share UI, also has support for suggesting semantic tags and for editing what semantic tags are assigned to a document.  See the semantic features in action in this screen-cam of an older version of FlexSpaces.