Flexible GraphRAG initial version

Flexible GraphRAG on GitHub

Flexible GraphRAG is an open-source Python platform supporting document processing, automatic knowledge graph building, schema support, RAG and GraphRAG setup, hybrid search (full-text, vector, graph), and AI Q&A query capabilities.

X.com: Steve Reiner (@stevereiner) | LinkedIn: Steve Reiner

Has an MCP server, a FastAPI backend, Docker support, and Angular, React, and Vue UI clients.

Built with LlamaIndex, which provides abstractions allowing multiple vector databases, search engines, graph databases, and LLMs to be supported.

Currently supports:

Graph Databases: Neo4j, Kuzu

Vector Databases: Neo4j, Qdrant, Elasticsearch, OpenSearch

Search Databases/Engines: Elasticsearch, OpenSearch, LlamaIndex built-in BM25

LLMs: OpenAI, Ollama

Data Sources: File System, Hyland Alfresco, CMIS

A configurable hybrid search system that optionally combines vector similarity search, full-text search, and knowledge graph GraphRAG on documents processed (with Docling) from multiple data sources (filesystem, Alfresco, CMIS, etc.). It has both a FastAPI backend with REST endpoints and a Model Context Protocol (MCP) server for MCP clients like Claude Desktop. It also has simple Angular, React, and Vue UI clients (which use the REST APIs of the FastAPI backend) for interacting with the system.

  • Hybrid Search: Combines vector embeddings, BM25 full-text search, and graph traversal for comprehensive document retrieval

  • Knowledge Graph GraphRAG: Extracts entities and relationships from documents to create graphs in graph databases for graph-based reasoning

  • Configurable Architecture: LlamaIndex provides abstractions for vector databases, graph databases, search engines, and LLM providers
  • Multi-Source Ingestion: Processes documents from filesystems, CMIS repositories, and Alfresco systems
  • FastAPI Server with REST API: REST endpoints for document ingestion, hybrid search, and AI Q&A queries
  • MCP Server: Provides MCP clients such as Claude Desktop with tools for document and text ingestion, hybrid search, and AI Q&A queries.
  • UI Clients: Angular, React, and Vue UI clients support choosing the data source (filesystem, Alfresco, CMIS, etc.), ingesting documents, performing hybrid searches and AI Q&A Queries.
  • Deployment Flexibility: Supports both standalone and Docker deployment modes. Docker infrastructure provides modular database selection via docker-compose includes – vector, graph, and search databases can be included or excluded with a single comment. Choose between hybrid deployment (databases in Docker, backend and UIs standalone) or full containerization.
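The REST endpoints above can be exercised from any HTTP client. Below is a minimal Python sketch; the endpoint path, port, and payload field names are assumptions for illustration only (check the backend's OpenAPI /docs page for the real ones):

```python
"""Hypothetical client sketch for the Flexible GraphRAG FastAPI backend.
Endpoint path, port, and payload fields are assumed, not the documented API."""
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed FastAPI default port


def build_query_payload(question: str, top_k: int = 5) -> dict:
    """Build a hybrid-search Q&A request body (field names are assumed)."""
    return {"query": question, "top_k": top_k}


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # "/api/query" is an assumed endpoint name -- verify against /docs.
    answer = post_json("/api/query", build_query_payload("What is GraphRAG?"))
    print(answer)
```

The same payload-building helper would work against any of the UI clients' backend calls, since they use the same REST API.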

Check-ins 8/5/25 thru 8/9/25 provided:
1. Added LlamaIndex support, configurability, KG building, GraphRAG, hybrid search, AI Q&A query, and Angular, React, and Vue UIs. Based on CMIS GraphRAG UI and CMIS GraphRAG, which didn't use LlamaIndex (they used the neo4j-graphrag Python package).
2. Also added a FastMCP based MCP Server that uses the FastAPI server.

Check-in today 8/15/25 provided:

Added: Multiple Databases Support, Docker, Schemas, and Ollama support

  1. Leveraging LlamaIndex abstractions, added support for more search, vector, and graph databases (beyond the previous Neo4j and built-in BM25). Now supported:
    Neo4j graph database, or Neo4j graph and vectors (also Neo4j Browser / console)
    Elasticsearch search, or search plus a separate vector store (also Kibana dashboard)
    OpenSearch search, or search+vector hybrid search (also OpenSearch Dashboards)
    Qdrant vector database (also its dashboard)
    Kuzu graph database (also Kuzu Explorer)
    LlamaIndex built-in local BM25 full-text search
    (Note: LlamaIndex supports additional vector and graph databases which we could support)
  2. Added composable Docker support
    a. As a way to run search, graph, and vector databases, plus dashboards and Alfresco
    (comment out the includes for anything you run externally or don't use)
    b. Databases together with the Flexible GraphRAG backend and the Angular, React, and Vue UIs
  3. Added schema support for Neo4j (optional) and Kuzu (required). Supports default and custom
    schemas you configure in your environment (.env file, etc.)
  4. Added Ollama support in addition to OpenAI. Tested through Ollama with gpt-oss:20b, llama3.1, and llama3.2.
    (Note: LlamaIndex supports additional LLMs which we could support)
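The database and LLM choices above are driven by environment configuration. A hypothetical .env fragment as a sketch (the variable names here are illustrative assumptions; use the project's own .env sample for the real keys):

```ini
# Illustrative only -- actual variable names may differ from the project's .env sample
GRAPH_DB=neo4j            # or kuzu
VECTOR_DB=qdrant          # or neo4j, elasticsearch, opensearch
SEARCH_DB=elasticsearch   # or opensearch, bm25
LLM_PROVIDER=ollama       # or openai
OLLAMA_MODEL=llama3.1
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
```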

Python-Alfresco-MCP-Server 1.1.0 released

Video: Python-Alfresco-MCP-Server with Claude Desktop and MCP Inspector
https://x.com/stevereiner/status/1950418564562706655

Model Context Protocol Server (MCP) for Alfresco Content Services (Community and Enterprise)

This uses FastMCP 2.0 and Python-Alfresco-API

A full-featured MCP server for Alfresco covering search and content management. Features complete documentation, tests, examples, and config samples for various MCP clients (Claude Desktop, MCP Inspector, plus references for configuring others).

Python-Alfresco-MCP-Server on Github
https://github.com/stevereiner/python-alfresco-mcp-server

Tools:
Basic search, advanced search, metadata search, and cmis query,
upload, download, check-in, checkout, cancel checkout,
create folder, folder browse, delete node,
get/set properties, repository info.

(With python-alfresco-api having full coverage of the 7 Alfresco REST APIs,
you could customize which tools you want from the 191 in Core, 29 in Workflow,
3 in Authentication, 1 in Search, 1 in Discovery, 18 in Model, and 1 Search SQL for Solr)

Resources: repository info (repeated from the tools)

Prompts: search and analyze

Latest on Github 7/29/25

  • readme.md focuses on install with uv and uvx
  • docs/install_with_pip_pipx.md covers install with pip and pipx
  • sample configs for Claude Desktop (stdio) with uv, uvx, and pipx for Windows and Mac
  • sample configs for MCP Inspector with uv, uvx, and pipx for both HTTP and stdio
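For reference, a minimal Claude Desktop entry of the uvx flavor might look like the sketch below (the "alfresco" server name key is arbitrary, and any Alfresco connection settings would use whatever env variable names the sample configs define):

```json
{
  "mcpServers": {
    "alfresco": {
      "command": "uvx",
      "args": ["python-alfresco-mcp-server"]
    }
  }
}
```

This goes in Claude Desktop's claude_desktop_config.json; the repo's sample configs cover the uv, uvx, and pipx variants per platform.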

Python-Alfresco-MCP-Server v1.1.0 7/25/25

  • Refactored code into a single file per tool (organized in tools/search/,
    tools/core/, resources/, prompts/, utils/)
  • Changes for python-alfresco-api 1.1.1
  • Much better testing (143/143 passing)
  • Added uv support (latest readme and config samples also have uvx)
  • First version on PyPI.org

Python-Alfresco-MCP-Server v1.0 6/24/25
Changed to use FastMCP vs original code

Python-Alfresco-MCP-Server on PyPI
https://pypi.org/project/python-alfresco-mcp-server/
(On PyPI, so you don’t need the source; you still need Python and, optionally, the fast uv tool installed)

These can be used to test that the install worked, or to run it once:
# Tests that installation worked
uv tool run python-alfresco-mcp-server --help
uvx python-alfresco-mcp-server --help  # uvx is an alias for uv tool run

This install may not be needed
uv tool install python-alfresco-mcp-server

Python-Alfresco-API on Github
https://github.com/stevereiner/python-alfresco-api

Python-Alfresco-API on PyPI
https://pypi.org/project/python-alfresco-api/

X.com
https://x.com/stevereiner

LinkedIn
https://www.linkedin.com/in/steve-reiner-abbb5320/

Python-Alfresco-API Updated

This is a complete Python client package for developing Python code and apps for Alfresco. It supports using all 7 Alfresco REST APIs: Core, Search, Authentication, Discovery, Model, Workflow, and Search SQL (Solr admin). It has event support (ActiveMQ or Event Gateway). The project has extensive documentation, examples, and tests.

See Python-Alfresco-MCP-Server, a Model Context Protocol (MCP) server that uses Python-Alfresco-API.

https://github.com/stevereiner/python-alfresco-api

https://pypi.org/project/python-alfresco-api

You need Python 3.10+ installed.

This can be used to install:

pip install python-alfresco-api
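The package's clients wrap Alfresco's public REST endpoints. For orientation, here is a package-independent sketch of a raw Core API v1 call using only the standard library; the base path is Alfresco's documented one, while the host and credentials are placeholders:

```python
"""Raw Alfresco Core REST API v1 call, stdlib only (package-independent sketch).
Host and admin/admin credentials are placeholders for a local test install."""
import base64
import json
import urllib.request


def core_api_url(host: str, path: str) -> str:
    """Build a URL using Alfresco's documented Core API v1 base path."""
    return f"{host}/alfresco/api/-default-/public/alfresco/versions/1{path}"


def basic_auth_header(user: str, password: str) -> str:
    """HTTP Basic auth header value for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"


if __name__ == "__main__":
    # List the children of the repository root folder.
    req = urllib.request.Request(
        core_api_url("http://localhost:8080", "/nodes/-root-/children"),
        headers={"Authorization": basic_auth_header("admin", "admin")},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["list"]["pagination"])
```

python-alfresco-api's high-level clients and utilities remove the need for this kind of hand-rolled plumbing, which is the point of the package.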

The released v1.1.1 version goes well beyond the previous 1.0.x version.

It has a generated, well-organized hierarchical structure for the higher-level clients (1.0.x only had 7 wrapper files). It's generated from the low-level "raw clients" produced by openapi-python-client.

Pydantic v2 models are now used in the high-level clients. Hopefully in v1.2 the low-level clients will use them too. This can be done by configuring the openapi-python-client generator with templates; some things still need to be worked out, so no guarantees. This would simplify things and avoid model conversions.

Added utilities for upload, download, versioning, searching, etc. Using the utilities reduces the amount of code you need for these operations.

A well-organized hierarchical structure of linked Markdown docs for the high-level client APIs and models is also generated.

Documentation now has diagrams for overall architecture, model levels, and client type.

The readme now covers how to install Alfresco Community with Docker from GitHub, in case you don’t already have an Enterprise or Community version of Alfresco Content Services. Also see Hyland Alfresco.

Creating Knowledge Graphs automatically for GraphRAG: Part 2: with LLMs

And the winner is: using LLMs to create knowledge graphs, over using NLP. Can LLMs do a better job? The Neo4j LLM Graph Builder, in particular, has shown they can. What about the cost of using OpenAI, along with the loss of data privacy from submitting it? The answer: free, local LLM models (Llama3 versions are available through Ollama) also work with Graph Builder. I tested with OpenAI GPT-4o, llama3, llama3.1, and llama3.2. I noticed gemma2 is also available through Ollama. With these local LLMs, you will need a high-end Nvidia card for best results.

Neo4j Labs LLM Knowledge Graph Builder main info site

Short YouTube demo video

The online LLM Graph Builder can be used. You need to provide it with your Neo4j Aura connection info (you can create an account for a free AuraDB). It only has Diffbot, OpenAI, and Gemini LLM models available.

Graph Builder can upload from local files, AWS S3, web pages, Wikipedia, and YouTube. Google GCS can be a source if configured.

First choose the LLM model to use. Then upload one or more files and choose Generate Graph. You can view the graphs with the basic viewer (which allows hiding chunk and community nodes so you can see the entities and relationships). The Bloom viewer is also available, which is more complicated.

You can also chat with the data using GraphRAG and your chosen LLM. Answers have an icon below them that, when clicked, shows which graph document sources, entities, and chunks were used to answer.

LLM Graph Builder Github project (Apache 2.0 open source)

The online version doesn’t have the llama3 models, so you need to clone the GitHub project and build locally. To use the Meta Llama3 models, you need to configure it: use example.env to create a .env file, then add an optional OpenAI key, the LLM model configuration, and your initial Neo4j database info. Neo4j connection info can also be provided in the UI. Then do docker compose up. I have a fork of the main branch in my LLM Graph Builder that adds: configuration for llama3, llama3.1, llama3.2, and OpenAI gpt-4 choices; some Neo4j connection config examples; a switch to port 8090 to not conflict with Alfresco on 8080; an additional debug log so you can check the model config; and a sample-files folder with space-station.txt.
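As a sketch of what the .env configuration for an Ollama model looks like, from memory of the project's example.env (the exact key format changes between versions, so verify against the current example.env):

```ini
# Backend .env fragment -- format recalled from example.env, verify there
OPENAI_API_KEY="sk-..."                                          # optional, for OpenAI models
LLM_MODEL_CONFIG_ollama_llama3="llama3,http://localhost:11434"   # model name, Ollama URL
NEO4J_URI="neo4j://localhost:7687"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="password"
```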

Speaking of Alfresco, I could extend my Alfresco GenAI Semantic project to call the separable backend of Graph Builder to generate a knowledge graph of new or updated Alfresco documents that have a new custom aspect. The backend may currently only support the app’s own kinds of sources. Also note, in terms of UI integration, Alfresco’s ADF components and the ACA client use Angular, while Neo4j Graph Builder’s front end uses React (as do some of their other software projects).

space-station.txt with OpenAI GPT-4o:

space-station.txt with Meta Llama3:

space-station.txt with Meta Llama3.1:

space-station.txt with smaller Meta Llama3.2:

OpenAI GPT-4o with Albert Einstein Wikipedia page (340 nodes, 230 relationships):

Meta Llama3 with Albert Einstein Wikipedia page (150 nodes, 150 relationships); not shown: Llama3.1 (161 nodes, 85 relationships) and Llama3.2 (125 nodes, 76 relationships)

Creating Knowledge Graphs automatically for GraphRAG: Part 1: with NLP

(next post Part 2: with LLM)

I first investigated how NLP could be used for both entity recognition and relation extraction to create a knowledge graph of content. Tomaz Bratanic’s Neo4j blog article used Relik for NLP along with LlamaIndex for creating a graph in Neo4j, and for setting up an embedding model for use with LLM queries.

In my llama_relik GitHub project, I used the notebook from the blog article and changed it to use fastcoref instead of coreferee. Fastcoref was mentioned in the comments of the Medium version of the Neo4j blog article; it’s supposed to work better. There is also a Python file in this project that can be used instead of the notebook.

I submitted some fixes for Relik on Windows, but it performs best on Linux in general, where it was better able to use the GPU (“cuda”) mode instead of “cpu”.

Similar work has been done using Rebel for NLP by Neo4j / Tomaz Bratanic, Saurav Joshi, and Qrious Kamal

Note that Relik has closed information extraction (CIE) models that do both entity linking (EL) and relation extraction (RE). It also has models focused on either EL or RE.

Below is a screenshot from Neo4j of a knowledge graph created with the Python file from the llama_relik project, using the “relik-cie-small” model with the spaCy space station sample text (ignore the chunk node and its MENTIONS relations). Notice how it has separate entities for “ISS” and “International Space Station”.

The “relik-cie-large” model finds more relations, as in the screenshot below. It also has separate entities for “ISS” and “International Space Station” (and throws in a second “International Space Station”).