Knowledge Integration¶
Agentle provides a powerful knowledge integration system that allows agents to leverage information from various sources when generating responses. This feature is particularly useful for building specialized agents that need domain-specific knowledge beyond their pre-trained capabilities.
Basic Knowledge Integration¶
Here’s a simple example of integrating knowledge with an agent:
from agentle.agents.agent import Agent
from agentle.agents.knowledge.static_knowledge import StaticKnowledge
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
# Create an agent with static knowledge
travel_expert = Agent(
name="Japan Travel Expert",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="You are a Japan travel expert who provides detailed information about Japanese destinations.",
# Provide static knowledge from multiple sources
static_knowledge=[
# Include knowledge from local documents
StaticKnowledge(content="data/japan_travel_guide.pdf", cache=3600), # Cache for 1 hour
# Include knowledge from websites
StaticKnowledge(content="https://www.japan-guide.com/", cache="infinite"), # Cache indefinitely
# Include direct text knowledge
"Tokyo is the capital of Japan and one of the most populous cities in the world."
]
)
# The agent will incorporate the knowledge when answering
response = travel_expert.run("What should I know about visiting Tokyo in cherry blossom season?")
print(response.text)
Knowledge Source Types¶
The framework supports multiple knowledge source types through the StaticKnowledge
class:
Documents: PDF, DOCX, TXT, PPTX, and other document formats
URLs: Web pages and online resources
Raw Text: Direct text snippets
You can provide knowledge in several ways:
As strings: The framework will automatically convert them to
StaticKnowledge
objectsagent = Agent( # ... other agent settings ... static_knowledge=[ # URLs as strings "https://example.com/research-paper.html", # Local file paths as strings "data/report.pdf", "references/definitions.txt", ] )
As StaticKnowledge objects: For maximum control, especially with caching
from agentle.agents.knowledge.static_knowledge import StaticKnowledge agent = Agent( # ... other agent settings ... static_knowledge=[ # Document with 1 hour cache StaticKnowledge(content="data/report.pdf", cache=3600), # URL with infinite cache StaticKnowledge(content="https://example.com/api-docs", cache="infinite"), # Raw text with no cache StaticKnowledge(content="This is raw knowledge text", cache=None), ] )
Caching Behavior¶
The caching system works as follows:
If
cache
is not specified or isNone
, content is parsed fresh each timeIf
cache
is an integer, the content is cached for that many seconds (requires aiocache)If
cache
is the string “infinite”, the content is cached indefinitely until the process ends (requires aiocache)
To enable caching, you’ll need to install the optional aiocache package:
pip install aiocache
Caching is particularly useful for large documents or URLs that are expensive to parse repeatedly.
How Knowledge Integration Works¶
When you provide static knowledge to an agent:
The agent uses appropriate document parsers to extract content from each knowledge source
If caching is enabled and the aiocache package is installed, parsed content is cached for the specified duration
The parsed content is organized into a structured knowledge base format
This knowledge base is appended to the agent’s instructions
When the agent responds to queries, it can leverage this knowledge base
Custom Document Parsers¶
For specialized knowledge extraction needs, you can provide a custom document parser to the agent:
from agentle.agents.agent import Agent
from agentle.agents.knowledge.static_knowledge import StaticKnowledge
from agentle.parsing.parsers.file_parser import FileParser
# Create a custom document parser with specialized settings
custom_parser = FileParser(
strategy="high", # Use high-detail parsing
visual_description_agent=your_custom_vision_agent # Customize image analysis
)
# Create an agent with the custom parser
research_agent = Agent(
# ... other agent settings ...
static_knowledge=[
StaticKnowledge(content="research_papers/paper.pdf", cache=3600),
# ... other knowledge sources ...
],
document_parser=custom_parser
)
You can also create completely custom document parsers by implementing the DocumentParser
abstract base class:
from typing import override
from pathlib import Path
from agentle.parsing.document_parser import DocumentParser
from agentle.parsing.parsed_document import ParsedDocument
from agentle.parsing.section_content import SectionContent
# Create a custom parser
class CustomParser(DocumentParser):
"""Custom document parser implementation"""
@override
async def parse_async(self, document_path: str) -> ParsedDocument:
# Implement your custom parsing logic here
path = Path(document_path)
# For this example, we'll just use a placeholder
parsed_content = f"Content from {path.name} would be parsed with custom logic"
# Return in the standard ParsedDocument format
return ParsedDocument(
name=path.name,
sections=[
SectionContent(
number=1,
text=parsed_content,
md=parsed_content
)
]
)
# Use the custom parser with an agent
agent = Agent(
name="Document Expert",
# ... other agent settings ...
static_knowledge=[
StaticKnowledge(content="documents/report.pdf", cache="infinite")
],
# Pass your custom parser to the agent
document_parser=CustomParser()
)
Practical Example: Legal Assistant¶
Here’s a comprehensive example showing how to create a legal assistant with domain-specific knowledge:
from agentle.agents.agent import Agent
from agentle.agents.knowledge.static_knowledge import StaticKnowledge
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
from agentle.parsing.factories.file_parser_default_factory import file_parser_default_factory
# Create a legal assistant with domain-specific knowledge
legal_assistant = Agent(
name="Legal Assistant",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="You are a legal assistant specialized in contract law. Help users understand legal concepts and review contracts.",
# Provide multiple knowledge sources with different caching strategies
static_knowledge=[
# Local document sources with caching
StaticKnowledge(content="legal_docs/contract_templates.pdf", cache=3600), # Cache for 1 hour
StaticKnowledge(content="legal_docs/legal_definitions.docx", cache="infinite"), # Cache indefinitely
# Online resources with caching
StaticKnowledge(content="https://www.law.cornell.edu/wex/contract", cache=86400), # Cache for 1 day
# Direct knowledge snippets (no need for caching)
"Force majeure clauses excuse a party from performance when extraordinary events prevent fulfillment of obligations."
],
# Optional: Use a custom document parser for specialized parsing needs
document_parser=file_parser_default_factory(strategy="high")
)
# The agent will leverage all provided knowledge when responding
response = legal_assistant.run("What should I look for in a non-disclosure agreement?")
print(response.text)
Best Practices¶
Caching Strategy: Use appropriate caching based on how frequently the source changes
Knowledge Organization: Organize knowledge into related documents rather than one large document
Quality over Quantity: Provide high-quality, relevant knowledge rather than overwhelming the agent
Test Different Sources: Experiment with different knowledge sources to find the best combination
Update Regularly: Keep knowledge sources updated, especially for domains that change frequently