Flexible GraphRAG initial version

Flexible GraphRAG on GitHub

Flexible GraphRAG is an open-source Python platform supporting document processing, automatic knowledge graph building, schema support, RAG and GraphRAG setup, hybrid search (full-text, vector, graph), and AI Q&A query capabilities.

X.com: Steve Reiner @stevereiner | LinkedIn: Steve Reiner

Has an MCP server, a FastAPI backend, Docker support, and Angular, React, and Vue UI clients

Built with LlamaIndex, which provides abstractions that allow multiple vector databases, search engines, graph databases, and LLMs to be supported.

Currently supports:

Graph Databases: Neo4j, Kuzu

Vector Databases: Neo4j, Qdrant, Elasticsearch, OpenSearch

Search Databases/Engines: Elasticsearch, OpenSearch, LlamaIndex built-in BM25

LLMs: OpenAI, Ollama

Data Sources: File System, Hyland Alfresco, CMIS
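The backend choices above are effectively selected by name from a registry, the pattern that LlamaIndex's abstractions make practical. Below is a minimal illustrative sketch of that pattern; the class and key names are invented for this example and are not the project's actual code:

```python
# Illustrative sketch (not Flexible GraphRAG's real wiring): selecting
# pluggable backends by configured name.
from dataclasses import dataclass

@dataclass
class Backend:
    kind: str   # "graph", "vector", or "search"
    name: str   # e.g. "neo4j", "qdrant"

# Hypothetical registry mirroring the supported-backend lists above.
REGISTRY = {
    "graph":  {"neo4j", "kuzu"},
    "vector": {"neo4j", "qdrant", "elasticsearch", "opensearch"},
    "search": {"elasticsearch", "opensearch", "bm25"},
}

def make_backend(kind: str, name: str) -> Backend:
    """Validate a configured backend name and return a handle for it."""
    if name not in REGISTRY.get(kind, set()):
        raise ValueError(f"unsupported {kind} backend: {name}")
    return Backend(kind, name)
```

For example, `make_backend("vector", "qdrant")` succeeds, while an unregistered name fails fast at startup rather than deep inside an indexing run.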

A configurable hybrid search system that optionally combines vector similarity search, full-text search, and knowledge graph GraphRAG on documents processed (with Docling) from multiple data sources (filesystem, Alfresco, CMIS, etc.). It has both a FastAPI backend with REST endpoints and a Model Context Protocol (MCP) server for MCP clients such as Claude Desktop. It also has simple Angular, React, and Vue UI clients (which use the REST APIs of the FastAPI backend) for interacting with the system.

  • Hybrid Search: Combines vector embeddings, BM25 full-text search, and graph traversal for comprehensive document retrieval

  • Knowledge Graph GraphRAG: Extracts entities and relationships from documents to create graphs in graph databases for graph-based reasoning

  • Configurable Architecture: LlamaIndex provides abstractions for vector databases, graph databases, search engines, and LLM providers
  • Multi-Source Ingestion: Processes documents from filesystems, CMIS repositories, and Alfresco systems
  • FastAPI Backend with REST API: REST endpoints for document ingesting, hybrid search, and AI Q&A query
  • MCP Server: Provides MCP clients such as Claude Desktop with tools for document and text ingesting, hybrid search, and AI Q&A query
  • UI Clients: Angular, React, and Vue UI clients support choosing the data source (filesystem, Alfresco, CMIS, etc.), ingesting documents, performing hybrid searches and AI Q&A Queries.
  • Deployment Flexibility: Supports both standalone and Docker deployment modes. Docker infrastructure provides modular database selection via docker-compose includes – vector, graph, and search databases can be included or excluded with a single comment. Choose between hybrid deployment (databases in Docker, backend and UIs standalone) or full containerization.
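How results from the three retrieval channels get merged isn't detailed above. One common fusion technique, shown here as an illustrative sketch rather than Flexible GraphRAG's actual method, is reciprocal rank fusion (RRF), which scores each document by its rank in every result list:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked result lists (best-first) from several retrievers.
    k is the customary RRF smoothing constant."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical document ids from vector, BM25, and graph retrievers:
vector_hits = ["d3", "d1", "d7"]
bm25_hits = ["d1", "d3", "d9"]
graph_hits = ["d7", "d1"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits])
# "d1" appears near the top of all three lists, so it fuses to rank 1
```

Documents that appear in several lists accumulate score, which is one reason hybrid retrieval often outranks any single channel.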

Check-ins 8/5/25 thru 8/9/25 provided:
1. Added LlamaIndex support, configurability, KG building, GraphRAG, hybrid search, AI Q&A query, and Angular, React, and Vue UIs. Based on CMIS GraphRAG UI and CMIS GraphRAG, which didn’t use LlamaIndex (they used the neo4j-graphrag Python package)
2. Also added a FastMCP based MCP Server that uses the FastAPI server.

Check-in today 8/15/25 provided:

Added: Multiple Databases Support, Docker, Schemas, and Ollama support

  1. Leveraging LlamaIndex abstractions, added support for more search, vector, and graph databases (beyond the previous Neo4j and built-in BM25). Now supports:
    Neo4j graph database, or Neo4j graph and vectors (also Neo4j Browser / console)
    Elasticsearch search, or search plus a separate vector store (also Kibana dashboard)
    OpenSearch search, or search+vector hybrid search (also OpenSearch Dashboards)
    Qdrant vector database (also its dashboard)
    Kuzu graph database (also Kuzu Explorer)
    LlamaIndex built-in local BM25 full-text search
    (Note: LlamaIndex supports additional vector and graph databases which we could support)
  2. Added composable Docker support
    a. As a way to run the search, graph, and vector databases, plus dashboards and Alfresco
    (comment out the includes for whatever you run externally or don’t use)
    b. Databases together with the Flexible GraphRAG backend and the Angular, React, and Vue UIs
  3. Added schema support for Neo4j (optional) and Kuzu (required). Supports default and custom
    schemas that you configure in your environment (.env file, etc.)
  4. Added Ollama support in addition to OpenAI. Tested with Ollama using gpt-oss:20b, llama3.1, and llama3.2.
    (Note: LlamaIndex supports additional LLMs which we could support)
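Items 3 and 4 are configured through the environment. The exact variable names aren't given in this post, so the .env fragment below is a purely hypothetical sketch of the kinds of settings involved; check the project's README for the real keys:

```ini
# Hypothetical .env sketch -- variable names are illustrative,
# not the project's actual configuration keys.
GRAPH_DB=neo4j            # or: kuzu
VECTOR_DB=qdrant          # or: neo4j, elasticsearch, opensearch
SEARCH_DB=bm25            # or: elasticsearch, opensearch

NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password

LLM_PROVIDER=ollama       # or: openai
OLLAMA_MODEL=llama3.1

# Custom schema (required for Kuzu, optional for Neo4j)
SCHEMA_ENTITIES=Person,Organization,Location
SCHEMA_RELATIONS=WORKS_FOR,LOCATED_IN
```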

Creating Knowledge Graphs automatically for GraphRAG: Part 1: with NLP

(next post Part 2: with LLM)

I first investigated how NLP could be used for both entity recognition and relation extraction to create a knowledge graph of content. Tomaz Bratanic’s Neo4j blog article used Relik for NLP along with LlamaIndex to create a graph in Neo4j and set up an embedding model for use with LLM queries.

In my llama_relik GitHub project, I used the notebook from the blog article and changed it to use fastcoref instead of coreferee. Fastcoref was mentioned in the comments of the Medium version of the Neo4j blog article; it’s supposed to work better. There is also a Python file in this project that can be used instead of the notebook.

I submitted some fixes for Relik on Windows, but it performs best on Linux in general, where it was more often able to use the GPU “cuda” mode instead of “cpu”.

Similar work has been done using Rebel for NLP by Neo4j / Tomaz Bratanic, Saurav Joshi, and Qrious Kamal.

Note that Relik has closed information extraction (CIE) models that do both entity linking (EL) and relation extraction (RE). It also has models focused on either EL or RE.
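In other words, a CIE model returns linked entity spans plus relation triples over those spans. As a rough, library-free sketch of consuming such output (real Relik result objects are structured differently, so treat the shapes here as invented for illustration), one might resolve spans to their linked entities before writing graph triples:

```python
# Illustrative stand-in for closed information extraction (CIE) output:
# entity linking (EL) results plus relation extraction (RE) results.
# Real Relik output objects differ; this structure is hypothetical.
sample_output = {
    "entities": [
        {"span": "ISS", "link": "International Space Station"},
        {"span": "low Earth orbit", "link": "Low Earth orbit"},
    ],
    "relations": [
        {"head": "ISS", "type": "orbits in", "tail": "low Earth orbit"},
    ],
}

def to_triples(output):
    """Resolve relation head/tail spans to their linked entities,
    yielding (head, relation, tail) triples for a graph database."""
    links = {e["span"]: e["link"] for e in output["entities"]}
    return [
        (links.get(r["head"], r["head"]),
         r["type"],
         links.get(r["tail"], r["tail"]))
        for r in output["relations"]
    ]
```

Resolving spans through the EL links is exactly the step that would merge “ISS” and “International Space Station” into one node, which, as noted below, the small model does not do on its own.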

Below is a screenshot from Neo4j of a knowledge graph created with the Python file from the llama_relik project, using the “relik-cie-small” model with the spaCy space station sample text (ignore the chunk node and its MENTIONS relations). Notice how it has separate entities for “ISS” and “International Space Station”.

The “relik-cie-large” model finds more relations, as in the screenshot below. It also has separate entities for “ISS” and “International Space Station” (and throws in a second “International Space Station”).

Alfresco GenAI Semantic project updated: now adds regular Alfresco tags, uses local Wikidata and DBpedia entity recognizers

The Alfresco GenAI Semantic GitHub project now adds regular Alfresco tags when performing auto-tagging while enhancing documents with links to Wikidata and DBpedia. Semantic entity linking info is kept in three parallel multi-value properties (labels, links, and super-type lists) in the Wikidata and DBpedia custom aspects. The labels values are used for the tag labels.

I switched to a local, private Wikidata recognizer. The spaCy-entity-linker Python library is used to get Wikidata entity links without having to call a public service API. It was created before spaCy had its own entity-linking system, but it still has the advantage of not needing training. I had previously used the spaCyOpenTapioca library, which calls an OpenTapioca public web service API URL. Note that the URLs in the links properties do go to the public website wikidata.org if used in your application.

I also switched to a local, private DBpedia Spotlight entity recognizer that is composed in via Docker. The local URL of this container is given to the DBpedia Spotlight for SpaCy library, which previously used a public Spotlight web service API URL by default. Note that the URLs in the links properties do go to the public website dbpedia.org if used in your application.
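For reference, composing in a local DBpedia Spotlight service looks roughly like the fragment below. The image name and startup command come from the public dbpedia/dbpedia-spotlight Docker image; the port mapping is illustrative, so verify the details against that image's documentation:

```yaml
# Approximate compose fragment for a local DBpedia Spotlight service;
# check image tag, command, and ports against the official image docs.
services:
  dbpedia-spotlight:
    image: dbpedia/dbpedia-spotlight
    command: spotlight.sh en        # load the English model
    ports:
      - "2222:80"                   # local REST endpoint on port 2222
```

The resulting local URL is what gets handed to the DBpedia Spotlight for SpaCy library in place of the public endpoint.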

For documents with the Wikidata or DBpedia aspects added to them, tags will show up in the Alfresco clients (ACA, ADW, Share) after PDF rendition creation and after the alfresco-genai-semantic AI listener gets responses from REST APIs in the genai-stack. Shown below are tags in the ACA community content app:

Multi-value Wikidata aspect properties of a document in the ACA client are shown below with view details expanded. The labels property repeats the labels of the tags. The links properties have URLs to wikidata.org. The super-types properties have zero (“”), one, or multiple comma-separated super types in Wikidata for each entity. These super types are Wikidata ids (they become links once you add “http://www.wikidata.org/wiki/” in front of the ids).
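The three multi-value properties stay aligned by index: entry i of the labels, links, and super-types lists all describe the same entity. A small sketch of assembling them from recognized entities (the field and property names here are illustrative, not the project's actual aspect model):

```python
def build_parallel_properties(entities):
    """Build the three index-aligned multi-value property lists
    (labels, links, super types) from recognized entities.

    entities: list of dicts with 'label', 'wikidata_id', and optional
    'super_type_ids' (Wikidata ids). Names are illustrative only.
    """
    prefix = "http://www.wikidata.org/wiki/"
    labels, links, super_types = [], [], []
    for e in entities:
        labels.append(e["label"])
        links.append(prefix + e["wikidata_id"])
        # Zero, one, or many super types, comma-separated per entity
        super_types.append(",".join(e.get("super_type_ids", [])))
    return {"labels": labels, "links": links, "superTypes": super_types}
```

Keeping the lists parallel is what lets a client walk the labels while looking up the matching link or super types at the same position.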

The same style of DBpedia aspect multi-value properties is shown below in the ACA client. Note that the super types can be from Wikidata, DBpedia, Schema (schema.org), FOAF, or DUL (ontologydesignpatterns.org DUL.owl), etc.