Implement an Automated Report-Generation Agent with crewAI and Neo4j
 
Graph ML and GenAI Research, Neo4j

Build dynamic data-driven reports automatically
I’m a fan of agentic flows with LLMs. They not only enable more advanced Text2Cypher implementations but also open the door to various semantic-layer implementations. It’s an incredibly powerful and versatile approach.
In this blog post, I set out to implement a different kind of report-generation agent. Instead of the usual Q&A use case, this agent generates detailed reports about specific industries in a given location. The implementation leverages crewAI, a platform that empowers developers to easily orchestrate AI agents.

This system orchestrates three agents working in harmony to deliver a comprehensive business report:
- Data Researcher Agent: Specializes in gathering and analyzing industry-specific data for organizations in a given city, providing insights into company counts, public companies, combined revenue, and top-performing organizations
- News Analyst Agent: Focuses on extracting and summarizing the latest news about relevant companies, offering a snapshot of trends, market movements, and sentiment analysis
- Report Writer Agent: Synthesizes the research and news insights into a well-structured, actionable markdown report, ensuring clarity and precision without adding unsupported information
Together, these agents form a flow for generating insightful industry reports tailored to specific locations.
The code is available on GitHub.
Dataset
We will use the companies database on the Neo4j demo server, which includes detailed information about organizations, individuals, and even the latest news for some organizations. We used the Diffbot API to fetch this data.

The dataset focuses on investors, board members, and related aspects, making it an excellent resource for demonstrating industry report generation.
# Neo4j connection setup
from neo4j import GraphDatabase

URI = "neo4j+s://demo.neo4jlabs.com"
AUTH = ("companies", "companies")
driver = GraphDatabase.driver(URI, auth=AUTH)

Next, we need to define the OpenAI key, as we will be using GPT-4o throughout this blog post:
# Set your OpenAI API key
import getpass
import os

from crewai import LLM

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI key: ")
llm = LLM(model='gpt-4o', temperature=0)

Knowledge Graph-Based Tools
We will begin by implementing tools that enable an agent/LLM to retrieve relevant information from the database. The first tool will focus on fetching key statistics about companies within a specific industry in a given city:
from typing import Type

from crewai.tools import BaseTool
from pydantic import BaseModel, Field

# Industry categories the tool accepts
industry_options = [
    "Software Companies",
    "Professional Service Companies",
    "Enterprise Software Companies",
    "Manufacturing Companies",
    "Software As A Service Companies",
    "Computer Hardware Companies",
    "Media And Information Companies",
    "Financial Services Companies",
    "Artificial Intelligence Companies",
    "Advertising Companies",
]
class GetCityInfoInput(BaseModel):
    """Input schema for the GetCityInfo tool."""
    city: str = Field(..., description="City name")
    industry: str = Field(..., description=f"Industry name, available options are: {industry_options}")
class GetCityInfo(BaseTool):
    name: str = "Get information about a specific city"
    description: str = "You can use this tool when you want to find information about a specific industry within a city."
    args_schema: Type[BaseModel] = GetCityInfoInput
    def _run(self, city: str, industry: str) -> list:
        data, _, _ = driver.execute_query("""MATCH (c:City)<-[:IN_CITY]-(o:Organization)-[:HAS_CATEGORY]->(i:IndustryCategory)
WHERE c.name = $city AND i.name = $industry
WITH o
ORDER BY o.nbrEmployees DESC
RETURN count(o) AS organizationCount,
     sum(CASE WHEN o.isPublic THEN 1 ELSE 0 END) AS publicCompanies,
     sum(o.revenue) AS combinedRevenue,
     collect(CASE WHEN o.nbrEmployees IS NOT NULL THEN o END)[..5] AS topFiveOrganizations""", city=city, industry=industry)
        return [el.data() for el in data]

The GetCityInfo tool retrieves key statistics about companies in a specific industry within a given city. It provides the total number of organizations, the count of public companies, the combined revenue, and the top five organizations by number of employees. We could expand this tool, but for our purposes I kept it simple.
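To make the aggregation easier to follow, here is a minimal pure-Python sketch of what the RETURN clause computes. It runs on a few invented dictionaries rather than on the Neo4j database, so the sample organizations and the `summarize` helper are assumptions for illustration only, and it returns names where the real query returns whole nodes.

```python
from typing import Any, Dict, List

# Hypothetical sample rows; the names and figures are invented for illustration
organizations: List[Dict[str, Any]] = [
    {"name": "Acme Chips", "nbrEmployees": 1200, "isPublic": True, "revenue": 500_000_000},
    {"name": "Byte Forge", "nbrEmployees": None, "isPublic": False, "revenue": None},
    {"name": "Core Metal", "nbrEmployees": 300, "isPublic": False, "revenue": 20_000_000},
    {"name": "Delta Boards", "nbrEmployees": 4500, "isPublic": True, "revenue": 1_500_000_000},
]


def summarize(orgs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Mirror the RETURN clause of the query on in-memory dictionaries."""
    # ORDER BY o.nbrEmployees DESC: Cypher places nulls first when sorting descending
    ordered = sorted(
        orgs,
        key=lambda o: (o["nbrEmployees"] is not None, -(o["nbrEmployees"] or 0)),
    )
    return {
        "organizationCount": len(ordered),
        # sum(CASE WHEN o.isPublic THEN 1 ELSE 0 END)
        "publicCompanies": sum(1 for o in ordered if o["isPublic"]),
        # sum() ignores null values
        "combinedRevenue": sum(o["revenue"] for o in ordered if o["revenue"] is not None),
        # collect() drops nulls, so the CASE filter removes organizations
        # without an employee count before the [..5] slice is taken
        "topFiveOrganizations": [
            o["name"] for o in ordered if o["nbrEmployees"] is not None
        ][:5],
    }


summary = summarize(organizations)
print(summary)
```

The null handling is the reason for the CASE expression inside collect: without it, organizations with no employee count would sort to the front and crowd out the largest companies.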
The second tool fetches the latest news about a given company:
class GetNews(BaseTool):
    name: str = "Get the latest news for a specific company"
    description: str = "You can use this tool when you want to find the latest news about a specific company"
    def _run(self, company: str) -> list:
        data, _, _ = driver.execute_query("""MATCH (c:Chunk)<-[:HAS_CHUNK]-(a:Article)-[:MENTIONS]->(o:Organization)
WHERE o.name = $company AND a.date IS NOT NULL
WITH c, a
ORDER BY a.date DESC
LIMIT 5
RETURN a.title AS title, a.date AS date, a.sentiment AS sentiment, collect(c.text) AS chunks""", company=company)
        return [el.data() for el in data]

The GetNews tool retrieves the latest news about a specific company. It provides article titles, publication dates, sentiment scores, and excerpts from the articles. Because LIMIT 5 is applied to the chunk rows before they are collected, the tool returns at most five excerpts in total, grouped by article. This keeps the agent up to date on recent developments and market trends related to a particular organization, allowing us to generate more detailed summaries.
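The interplay of ORDER BY, LIMIT, and collect in this query can be sketched in plain Python. The rows below are invented (chunk, article) pairs and `latest_news` is a hypothetical helper, not part of the original code. Note also that Cypher does not guarantee the order of rows after an aggregation, whereas the dictionary used here preserves insertion order.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical (chunk, article) rows; titles, dates, and text are invented
rows: List[Dict[str, Any]] = (
    [
        {"title": "Older story", "date": "2023-02-10", "sentiment": -0.1, "text": f"older chunk {i}"}
        for i in range(3)
    ]
    + [
        {"title": "Newest story", "date": "2023-03-02", "sentiment": 0.4, "text": f"newest chunk {i}"}
        for i in range(4)
    ]
    + [{"title": "Undated story", "date": None, "sentiment": 0.0, "text": "undated chunk"}]
)


def latest_news(rows: List[Dict[str, Any]], limit: int = 5) -> List[Dict[str, Any]]:
    """Mimic the query: filter, sort by date, limit rows, then group by article."""
    # WHERE a.date IS NOT NULL
    dated = [r for r in rows if r["date"] is not None]
    # ORDER BY a.date DESC LIMIT 5 (the limit counts chunk rows, not articles)
    newest = sorted(dated, key=lambda r: r["date"], reverse=True)[:limit]
    # RETURN ..., collect(c.text): non-aggregated columns act as the grouping key
    grouped: Dict[Tuple[str, str, float], List[str]] = {}
    for r in newest:
        grouped.setdefault((r["title"], r["date"], r["sentiment"]), []).append(r["text"])
    return [
        {"title": title, "date": date, "sentiment": sentiment, "chunks": chunks}
        for (title, date, sentiment), chunks in grouped.items()
    ]


news = latest_news(rows)
for article in news:
    print(article["title"], len(article["chunks"]))
```

With this sample data, the newest article contributes four excerpts and the older one only a single excerpt, because the five-row budget is spent on chunks rather than on articles.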
Agents
As mentioned, we will implement three agents. crewAI requires minimal prompt engineering because the platform handles the rest.
We implement the agents as follows:
from crewai import Agent

# Define Agents
class ReportAgents:
    def __init__(self):
        self.researcher = Agent(
            role='Data Researcher',
            goal='Gather comprehensive information about specific companies that are in relevant cities and industries',
            backstory="""You are an expert data researcher with deep knowledge of 
            business ecosystems and city demographics. You excel at analyzing 
            complex data relationships.""",
            verbose=True,
            allow_delegation=False,
            tools=[GetCityInfo()],
            llm=llm
        )
        self.news_analyst = Agent(
            role='News Analyst',
            goal='Find and analyze recent news about relevant companies in the specified industry and city',
            backstory="""You are a seasoned news analyst with expertise in 
            business journalism and market research. You can identify key trends 
            and developments from news articles.""",
            verbose=True,
            allow_delegation=False,
            tools=[GetNews()],
            llm=llm
        )
        self.report_writer = Agent(
            role='Report Writer',
            goal="Create comprehensive, well-structured reports combining the provided research and news analysis. Do not include any information that isn't explicitly provided.",
            backstory="""You are a professional report writer with experience in 
            business intelligence and market analysis. You excel at synthesizing 
            information into clear, actionable insights. Do not include any information that isn't explicitly provided.""",
            verbose=True,
            allow_delegation=False,
            llm=llm
        )

In crewAI, agents are defined by specifying their role, goal, and backstory, with optional tools to enhance their capabilities. In this setup, three agents are implemented: a Data Researcher that gathers detailed information about companies in specific cities and industries using the GetCityInfo tool; a News Analyst that analyzes recent news about relevant companies using the GetNews tool; and a Report Writer that synthesizes the gathered information and news into a structured, actionable report without relying on external tools. This clear definition of roles and objectives ensures effective collaboration among the agents.
Tasks
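In addition to defining the agents, we also need to outline the tasks they will tackle. In this case, we’ll define three distinct tasks. The snippets reference city_name, industry_name, and agents, which are not defined in the code shown here; presumably they live inside the generate_report(city_name, industry_name) function called later, after an agents = ReportAgents() instance has been created.
from crewai import Task

# Define Tasks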
city_research_task = Task(
    description=f"""Research and analyze {city_name} and its business ecosystem in the {industry_name} industry:
    1. Get city summary and key information
    2. Find organizations in the specified industry
    3. Analyze business relationships and economic indicators""",
    agent=agents.researcher,
    expected_output="Basic statistics about the companies in the given city and industry as well as top performers"
)
news_analysis_task = Task(
    description=f"""Analyze recent news about the companies provided by the city researcher""",
    agent=agents.news_analyst,
    expected_output="Summarization of the latest news for the company and how it might affect the market",
    context=[city_research_task]
)
report_writing_task = Task(
    description=f"""Create a detailed markdown report about the
    results you got from city research and news analysis tasks.
    Do not include any information that isn't provided""",
    agent=agents.report_writer,
    expected_output="Markdown summary",
    context=[city_research_task, news_analysis_task]
)

The tasks align with the agents’ capabilities. The city research task, handled by the Data Researcher, analyzes the business ecosystem of a specified city and industry, gathering key statistics and identifying top-performing organizations. The news analysis task, performed by the News Analyst, takes the output of the city research as context and examines recent developments related to these companies, summarizing trends and market impacts. The report writing task, completed by the Report Writer, synthesizes those findings into a comprehensive markdown report.
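The way the context argument chains the tasks together can be illustrated with a small conceptual sketch. This is not how crewAI is implemented internally; `SketchTask` and `run_sequential` are hypothetical stand-ins, and the lambdas replace the LLM-backed agents with fixed strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task: a name, a worker, and upstream tasks."""
    name: str
    run: Callable[[List[str]], str]
    context: List["SketchTask"] = field(default_factory=list)


def run_sequential(tasks: List[SketchTask]) -> Dict[str, str]:
    """Run tasks in order, handing each one the outputs of its context tasks."""
    outputs: Dict[str, str] = {}
    for task in tasks:
        upstream = [outputs[dep.name] for dep in task.context]
        outputs[task.name] = task.run(upstream)
    return outputs


# The lambdas stand in for agents; real agents would call an LLM and tools
research = SketchTask("research", lambda ctx: "stats for 3 companies")
news = SketchTask("news", lambda ctx: f"news based on [{ctx[0]}]", context=[research])
report = SketchTask(
    "report",
    lambda ctx: "report combining: " + " | ".join(ctx),
    context=[research, news],
)

outputs = run_sequential([research, news, report])
print(outputs["report"])
```

The point of the sketch is the data flow: the news step only sees the research output, while the report step sees both, which matches the context lists declared on the three tasks.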
Finally, we just have to put it all together:
from crewai import Crew, Process

# Create and run the crew
crew = Crew(
    agents=[agents.researcher, agents.news_analyst, agents.report_writer],
    tasks=[city_research_task, news_analysis_task, report_writing_task],
    verbose=True,
    process=Process.sequential,
)
result = crew.kickoff()

Let’s test it!
city = "Seattle"
industry = "Hardware Companies"
report = generate_report(city, industry)
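print(report)

The agent’s intermediate steps are too detailed to include here, but the process begins by gathering key statistics for the specified industry and identifying relevant companies, followed by retrieving the latest news about those companies. Note that the input "Hardware Companies" is not one of the listed industry_options; as the report shows, the run resolved it to "Computer Hardware Companies".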
The results:
# Seattle Computer Hardware Industry Report
## Overview
The Computer Hardware Companies industry in Seattle comprises 24 organizations, including 4 public companies. The combined revenue of these companies is approximately $229.14 billion. This report highlights the top performers in this industry and recent news developments affecting them.
## Top Performers
1. **Microsoft Corporation**
   - **Revenue**: $198.27 billion
   - **Employees**: 221,000
   - **Status**: Public Company
   - **Mission**: To empower every person and organization on the planet to achieve more.
2. **Nvidia Corporation**
   - **Revenue**: $26.97 billion
   - **Employees**: 26,196
   - **Status**: Public Company
   - **Formerly Known As**: Mellanox Technologies and Cumulus Networks
3. **F5 Networks**
   - **Revenue**: $2.695 billion
   - **Employees**: 7,089
   - **Status**: Public Company
   - **Focus**: Multi-cloud cybersecurity and application delivery
4. **Quest Software**
   - **Revenue**: $857.415 million
   - **Employees**: 4,055
   - **Status**: Public Company
   - **Base**: California
5. **SonicWall**
   - **Revenue**: $310 million
   - **Employees**: 1,600
   - **Status**: Private Company
   - **Focus**: Cybersecurity
These companies significantly contribute to Seattle's economic landscape, driving growth and innovation in the hardware industry.
## Recent News and Developments
- **Microsoft Corporation**: Faces legal challenges with its Activision Blizzard acquisition, which could impact its gaming market strategy.
  
- **Nvidia Corporation**: Experiences strong demand for GPUs in China, highlighting its critical role in AI advancements and potentially boosting its market position.
- **F5 Networks**: Gains recognition for its cybersecurity solutions, enhancing its industry reputation.
- **Quest Software**: Launches a new data intelligence platform aimed at improving data accessibility and AI model development.
- **SonicWall**: Undergoes leadership changes and releases a threat report, emphasizing its focus on cybersecurity growth and challenges.
These developments are poised to influence market dynamics, investor perceptions, and competitive strategies within the industry.

Note that the demo dataset is outdated; we don’t import news regularly.
Summary
Building an automated report-generation pipeline using agentic flows, Neo4j, and crewAI offers a glimpse into how LLMs can move beyond simple Q&A interactions. By assigning specialized tasks to a suite of agents and arming them with the right tools, we can orchestrate a dynamic, exploratory workflow that pulls relevant data, processes it, and composes well-structured insights.
Through this approach, agents collaborate to uncover key statistics about an industry in a given city, gather the latest news, and synthesize everything into a polished markdown report. This goes to show that LLMs can be deployed in creative, multi-step processes, enabling sophisticated use cases like automated business intelligence and data-driven content creation.
Implementing an Automated Report-Generation Agent was originally published in Neo4j Developer Blog on Medium.








