Documentation for local deployments
Prerequisites
You will need a Neo4j database (v5.15 or later) with APOC installed to use this Knowledge Graph Builder. You can use any Neo4j Aura database (including the free tier). Neo4j Aura automatically includes APOC and runs on the latest Neo4j version, making it a great choice to get started quickly. You can also use the free trial in Neo4j Sandbox, which also includes Graph Data Science.
If you want to use Neo4j Desktop instead, you will not be able to use the docker-compose deployment method. Instead, follow the separate deployment of backend and frontend section below.
Docker-compose
By default only OpenAI and Diffbot are enabled since Gemini requires extra GCP configurations.
In your root folder, create a .env file with your OpenAI and Diffbot API keys (if you want to use both):
OPENAI_API_KEY="your-openai-key"
DIFFBOT_API_KEY="your-diffbot-key"
If you only want OpenAI:
LLM_MODELS="OpenAI GPT 3.5,OpenAI GPT 4o"
OPENAI_API_KEY="your-openai-key"
If you only want Diffbot:
LLM_MODELS="Diffbot"
DIFFBOT_API_KEY="your-diffbot-key"
You can then run Docker Compose to build and start all components:
docker-compose up --build
Additional configs
By default, the input sources are: local files, YouTube, Wikipedia, and AWS S3. This is the default config applied if you do not override it in your .env file:
REACT_APP_SOURCES="local,youtube,wiki,s3"
If, however, you want the Google GCS integration, add gcs and your Google client ID:
REACT_APP_SOURCES="local,youtube,wiki,s3,gcs"
GOOGLE_CLIENT_ID="xxxx"
The REACT_APP_SOURCES should be a comma-separated list of the sources you want to enable.
You can of course combine all of them (local, youtube, wiki, s3, and gcs) or remove any you don't want or need.
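Because REACT_APP_SOURCES is a plain comma-separated string, a typo silently disables a source. A quick sanity check can catch that early. This is an illustrative sketch, not the project's actual parsing code; the function name and error handling are assumptions:

```python
# Illustrative only: validate a comma-separated REACT_APP_SOURCES value
# against the source identifiers documented above.
KNOWN_SOURCES = {"local", "youtube", "wiki", "s3", "gcs"}

def parse_sources(value: str) -> list[str]:
    """Split a comma-separated sources string and reject unknown entries."""
    sources = [s.strip() for s in value.split(",") if s.strip()]
    unknown = set(sources) - KNOWN_SOURCES
    if unknown:
        raise ValueError(f"unknown sources: {sorted(unknown)}")
    return sources

print(parse_sources("local,youtube,wiki,s3"))
# ['local', 'youtube', 'wiki', 's3']
```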
Development (Separate Frontend and Backend)
Alternatively, you can run the backend and frontend separately:
- For the frontend:
  - Create the frontend/.env file by copy/pasting the frontend/example.env.
  - Change values as needed.
  - Run:

    cd frontend
    yarn
    yarn run dev

- For the backend:
  - Create the backend/.env file by copy/pasting the backend/example.env.
  - Change values as needed.
  - Run:

    cd backend
    python -m venv envName
    source envName/bin/activate
    pip install -r requirements.txt
    uvicorn score:app --reload
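Once backend/.env is in place, the backend reads its configuration from the environment. A minimal sketch of reading optional settings with fallbacks is below; the variable names and fallback values here are illustrative assumptions, not the project's definitive defaults (check backend/example.env for the authoritative list):

```python
import os

# Illustrative sketch only: reading optional backend settings with fallbacks.
# Names and fallback values are assumptions, not the project's real defaults.
def get_settings() -> dict:
    return {
        "neo4j_uri": os.getenv("NEO4J_URI", "neo4j://localhost:7687"),
        "is_embedding": os.getenv("IS_EMBEDDING", "true").lower() == "true",
        "knn_min_score": float(os.getenv("KNN_MIN_SCORE", "0.8")),
    }

# Unset variables fall back to the defaults; set variables override them.
os.environ["IS_EMBEDDING"] = "False"
print(get_settings()["is_embedding"])  # False
```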
ENV
Env Variable Name | Mandatory/Optional | Default Value | Description |
---|---|---|---|
OPENAI_API_KEY | Optional | | API key for OpenAI (if enabled) |
DIFFBOT_API_KEY | Optional | | API key for Diffbot (if enabled) |
EMBEDDING_MODEL | Optional | all-MiniLM-L6-v2 | Model for generating the text embedding (all-MiniLM-L6-v2, openai, vertexai) |
IS_EMBEDDING | Optional | | Flag to enable text embedding |
KNN_MIN_SCORE | Optional | | Minimum score for the KNN algorithm for connecting similar chunks |
GEMINI_ENABLED | Optional | | Flag to enable Gemini |
GCP_LOG_METRICS_ENABLED | Optional | | Flag to enable Google Cloud logs |
NUMBER_OF_CHUNKS_TO_COMBINE | Optional | | Number of chunks to combine when extracting entities |
UPDATE_GRAPH_CHUNKS_PROCESSED | Optional | | Number of chunks processed before writing to the database and updating progress |
NEO4J_URI | Optional | | URI for the Neo4j database |
NEO4J_USERNAME | Optional | | Username for the Neo4j database |
NEO4J_PASSWORD | Optional | | Password for the Neo4j database |
LANGCHAIN_API_KEY | Optional | | API key for LangSmith |
LANGCHAIN_PROJECT | Optional | | Project for LangSmith |
LANGCHAIN_TRACING_V2 | Optional | | Flag to enable LangSmith tracing |
LANGCHAIN_ENDPOINT | Optional | | Endpoint for the LangSmith API |
BACKEND_API_URL | Optional | | URL for the backend API |
BLOOM_URL | Optional | | URL for Bloom visualization |
REACT_APP_SOURCES | Optional | local,youtube,wiki,s3 | List of input sources that will be available |
LLM_MODELS | Optional | | Models available for selection on the frontend, used for entity extraction and the Q&A chatbot |
ENV | Optional | | Environment variable for the app |
TIME_PER_CHUNK | Optional | | Time per chunk for processing |
CHUNK_SIZE | Optional | | Size of each chunk for processing |
GOOGLE_CLIENT_ID | Optional | | Client ID for Google authentication for GCS upload |
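Putting the common settings together, a minimal backend/.env might look like the fragment below. The Neo4j values are placeholders you must replace, and the variable names are illustrative; treat backend/example.env as the authoritative list:

```
OPENAI_API_KEY="your-openai-key"
DIFFBOT_API_KEY="your-diffbot-key"
NEO4J_URI="neo4j://localhost:7687"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="your-password"
```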