The Alfresco GenAI Semantic GitHub project is now available. It is a fork of the Alfresco GenAI project that, so far, adds entity linking to DBpedia and Wikidata using the spaCy NLP Python library.
The Alfresco GenAI project provides generative AI support for Alfresco using local or cloud LLMs. This includes summarization, categorization, image description, and chat prompting about document content.
The Alfresco GenAI Semantic project adds named entity recognition (NER) and entity linking of documents in Alfresco to Wikidata and DBpedia. Currently, two custom aspects hold the links in multi-value properties; Alfresco tags aren’t used yet.
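To make the idea of an aspect with a multi-value link property concrete, here is a minimal sketch of the JSON body that could be sent to the Alfresco REST API node update call (PUT .../nodes/{nodeId}) to store a list of links. This is not necessarily how the project writes the properties. The aspect and property names in the usage line (ex:entityLinkExample, ex:exampleLinks) and the helper name are hypothetical placeholders, not the project's actual content model.

```python
import json


def build_entity_link_update(existing_aspects, aspect_name, property_name, links):
    """Build a node-update body that adds an aspect and a multi-value property.

    The REST update replaces the node's whole aspect list, so the aspects
    already on the node are passed in and kept.
    """
    aspects = list(existing_aspects)
    if aspect_name not in aspects:
        aspects.append(aspect_name)
    # Drop duplicate links while keeping their first-seen order.
    unique_links = list(dict.fromkeys(links))
    return {"aspectNames": aspects, "properties": {property_name: unique_links}}


# Hypothetical aspect/property names and placeholder link values.
body = build_entity_link_update(
    ["cm:auditable"],
    "ex:entityLinkExample",
    "ex:exampleLinks",
    ["link-a", "link-b", "link-a"],
)
print(json.dumps(body, indent=2))
```

A multi-value property is simply sent as a JSON array, which is why the links are collected into a de-duplicated list first.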
The spaCy NLP Python library is used along with spaCy projects. The spaCyOpenTapioca project is used for getting Wikidata entity links. The DBpedia Spotlight for SpaCy project is used for getting DBpedia entity links. Note that both rely on external servers, which can also be set up locally. NER can also be done with spaCy alone. The spacy-llm Python package, which integrates Large Language Models (LLMs) into spaCy pipelines, is also available, but the Alfresco GenAI Semantic project doesn’t use it yet.
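As a rough illustration of how these spaCy components are typically used, the sketch below builds a blank English pipeline with one linking component and collects the linked entities from a processed document. It assumes spaCy and the relevant plugin are installed and that the component names are "opentapioca" and "dbpedia_spotlight", as given in those projects' READMEs. The helper names build_pipeline and collect_links are illustrative and are not taken from the Alfresco GenAI Semantic code. The stand-in document at the end uses placeholder ids so the example runs without network access.

```python
from types import SimpleNamespace


def build_pipeline(linker="opentapioca"):
    """Build a blank English spaCy pipeline with one entity-linking component.

    Assumes spaCy and the plugin providing `linker` are installed; the
    component calls its web service when a text is processed.
    """
    import spacy  # imported lazily so collect_links works without spaCy

    nlp = spacy.blank("en")
    nlp.add_pipe(linker)  # assumed names: "opentapioca" or "dbpedia_spotlight"
    return nlp


def collect_links(doc):
    """Return unique (entity text, knowledge-base id) pairs in document order."""
    seen = set()
    links = []
    for ent in doc.ents:
        # Skip entities without a link and ids that were already collected.
        if ent.kb_id_ and ent.kb_id_ not in seen:
            seen.add(ent.kb_id_)
            links.append((ent.text, ent.kb_id_))
    return links


# Offline stand-in for a processed Doc, with placeholder ids only.
# A real run would be: doc = build_pipeline()(text)
fake_doc = SimpleNamespace(ents=[
    SimpleNamespace(text="Entity A", kb_id_="ID-1"),
    SimpleNamespace(text="Entity B", kb_id_="ID-2"),
    SimpleNamespace(text="Entity A again", kb_id_="ID-1"),
    SimpleNamespace(text="Unlinked", kb_id_=""),
])
print(collect_links(fake_doc))
```

With a real pipeline, the second element of each pair would be a Wikidata id for OpenTapioca or a DBpedia resource URI for DBpedia Spotlight.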
The following shows test\space-station.txt in the Alfresco Content App (ACA) after upload and entity linking, with the Entity Link Wikidata aspect expanded in the details view:
The following shows test\space-station.txt in the Alfresco Content App (ACA) after entity linking, with the Entity Link DBpedia aspect expanded in the details view: