Background
This is an internal Vention project focused on streamlining data retrieval across multiple sources, including Salesforce, internal storage, and marketing systems.
Vention’s marketing team works with large volumes of distributed information, which makes it difficult to access the right data quickly and consistently. To address this, Vention’s AI team designed an agentic RAG system built on Claude.
The platform is intended to help marketers navigate information faster and with greater accuracy, enabling quicker responses to new requests for marketing materials.
Project description
Vention’s marketing team works with large volumes of data from multiple sources:
- Bragging lists that highlight the company’s key wins
- Proprietary research on client personas and markets
- Guidelines for specific tasks and marketing functions
- External and internal events
- Existing marketing materials
When a request for new marketing materials comes in, the main challenge lies in retrieving the latest, factually accurate data. Information is spread across multiple boards and owned by different teams and contributors, which makes it hard to access the right inputs quickly.
Our solution
Vention’s AI team worked closely with the Marketing department to design a RAG system that could keep pace with the marketing team’s evolving needs over the long term.
The solution included the following modules:
- A client web app that marketers used to submit requests and retrieve model outputs
- An orchestration layer built on AWS Bedrock Agents, routing requests between Claude Haiku and Sonnet and coordinating tool use
- An AWS-hosted data layer, including a data lake, vector database, and MCP server for accurate data retrieval
- A two-model (Haiku and Sonnet) RAG core that supported efficient and reliable information retrieval
- AWS Lambda functions that handled automations and routed requests across the system
Web app
The web app served as the primary touchpoint between the system and marketing specialists. The interface was designed as a chat-based workspace where users interacted with Claude Haiku using natural language, reviewed their conversation history, and attached files when needed.
Request processing via Haiku
As the most lightweight model in Claude’s lineup, Haiku was used to handle request decomposition and routing across the system.
When a user submitted an input, Haiku broke it down into more specific terms to form a clearer prompt, identified what information was needed to fulfill the request, and retrieved relevant data from storage.
Then, depending on the scenario, one of two paths followed:
- If the request focused on fact-checking (for example, “Do we have a case study with company X?”), Haiku returned a direct response
- If the request required a creative output (for example, “Help me build a factual base for a Y deck”), Haiku gathered the relevant information from internal storage, tracked the request, and passed it to Sonnet for generation along with the user’s input and context. After Sonnet generated the output, Haiku returned it to the user.
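The two paths above can be sketched as a small routing function. This is a minimal illustration, not Vention’s implementation: the cue-phrase classifier stands in for Haiku’s own intent detection, and `retrieve`, `haiku`, and `sonnet` are hypothetical callables that a real system would back with storage lookups and Bedrock model calls.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical cue phrases; in the described system Haiku itself would
# decide whether a request is a fact check or a creative task.
CREATIVE_CUES = ("help me", "build", "draft", "write", "create", "outline")

ModelCall = Callable[[str, List[str]], str]


@dataclass
class Result:
    path: str    # "fact_check" or "creative"
    model: str   # model that produced the final text
    answer: str


def classify(request: str) -> str:
    """Crude stand-in for intent classification."""
    text = request.lower()
    return "creative" if any(cue in text for cue in CREATIVE_CUES) else "fact_check"


def handle(
    request: str,
    retrieve: Callable[[str], List[str]],
    haiku: ModelCall,
    sonnet: ModelCall,
) -> Result:
    """Retrieve context once, then answer directly or hand off for generation."""
    context = retrieve(request)
    if classify(request) == "fact_check":
        # Fact-checking path: the lightweight model answers directly.
        return Result("fact_check", "haiku", haiku(request, context))
    # Creative path: pass the request and gathered context to the larger
    # model, then relay its output back to the user unchanged.
    return Result("creative", "sonnet", sonnet(request, context))
```

With stub callables in place of the models, “Do we have a case study with company X?” takes the fact-checking path, while “Help me build a factual base for a Y deck” is handed to the generation path.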
Internal storage, MCP, caching, and API integrations
To improve awareness and efficiency, the plan was for the system to integrate with the marketing team’s internal Monday boards and to cache frequent requests (such as bragging lists or recent case studies) for faster retrieval.
A local SQLite database was used to cache frequent queries and return responses directly from cache where available, which reduced API calls and improved response times.
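A query cache of this kind might look like the sketch below, which uses Python’s standard `sqlite3` module. The table layout, the whitespace and case normalization, and the time-to-live expiry are all assumptions made for illustration; the case study does not describe the actual schema or eviction policy.

```python
import hashlib
import sqlite3
import time
from typing import Callable, Optional


class QueryCache:
    """Cache responses to frequent queries in a local SQLite database."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0):
        # ttl_seconds is an assumed expiry window, not a documented setting.
        self.conn = sqlite3.connect(path)
        self.ttl = ttl_seconds
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, response TEXT NOT NULL, created_at REAL NOT NULL)"
        )

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different phrasings
        # of the same query map to the same cache entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT response, created_at FROM cache WHERE key = ?",
            (self._key(query),),
        ).fetchone()
        if row is None or time.time() - row[1] > self.ttl:
            return None  # cache miss or stale entry
        return row[0]

    def put(self, query: str, response: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, response, created_at) VALUES (?, ?, ?)",
            (self._key(query), response, time.time()),
        )
        self.conn.commit()

    def get_or_fetch(self, query: str, fetch: Callable[[str], str]) -> str:
        """Return the cached response, calling fetch only on a miss."""
        cached = self.get(query)
        if cached is not None:
            return cached
        response = fetch(query)
        self.put(query, response)
        return response
```

Here `fetch` would wrap the API or model call that the cache is meant to avoid; a repeated query returns from SQLite without invoking it again.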
Through integration with Monday via an MCP gateway, the agentic RAG system added another layer of context to the Marketing department’s workflows. For example, when a user requested information on security services for a deck, the agent could surface not only the latest materials but also related content in progress that was likely to be completed soon. This helped teams make more informed decisions about what to include.
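One way to picture this behavior is a filter over board items that separates finished materials from work expected to land soon. The `BoardItem` fields, the status labels, and the 14-day horizon below are hypothetical; they do not reflect Monday’s actual data model or the team’s real criteria for “likely to be completed soon.”

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class BoardItem:
    title: str
    topic: str
    status: str  # hypothetical labels: "done" or "in_progress"
    due: date


def surface(
    items: List[BoardItem],
    topic: str,
    today: date,
    horizon_days: int = 14,  # assumed cutoff for "completed soon"
) -> Dict[str, List[str]]:
    """Split matching items into finished materials and near-term work."""
    cutoff = today + timedelta(days=horizon_days)
    matching = [item for item in items if item.topic == topic]
    return {
        "available": [i.title for i in matching if i.status == "done"],
        "coming_soon": [
            i.title
            for i in matching
            if i.status == "in_progress" and i.due <= cutoff
        ],
    }
```

The agent could then include both lists in its answer, so a marketer preparing a deck sees what exists today and what may be worth waiting for.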
Key stats
Weeks to develop a PoC
About 15% rework reduction
40% less time spent on fetching internal information
Results
The project led to a 40% reduction in time spent retrieving information from internal sources, allowing marketing specialists to focus more on creative tasks.
Improved access to information increased situational awareness and reduced the time spent adjusting content in marketing materials by about 15%. Greater efficiency carried over into day-to-day workflows, where relevant information became easier to find and use across tasks.
Following these results, the team moved into PoC development and testing, with the next steps focused on validating the Monday integration and preparing for the MVP phase and a broader rollout.
Tech stack
Frontend
Next.js
Cloud
AWS
AWS Lambda
Bedrock Agents
AI
Claude Haiku
Claude Sonnet via AWS Bedrock
Agent architecture
MCP Server
Storage and databases
Amazon S3
OpenSearch Serverless
Integrations
Monday.com
API
MCP
Auth/security
Amazon Cognito
AWS Secrets Manager
FAQs
What is an agentic RAG system?
An agentic RAG system combines retrieval-augmented generation with an orchestration layer that can break down user requests, fetch relevant data, and decide how to process it. Instead of relying on a single prompt-response flow, it routes tasks across models, tools, and data sources to produce more accurate and context-aware outputs.
Why use two Claude models instead of one?
Vention’s system uses Claude Haiku and Sonnet for distinct roles. Haiku handles request decomposition, routing, and data retrieval, where speed and cost efficiency are critical. Sonnet supports more complex reasoning and content generation. The split keeps the system responsive while maintaining high output quality.
How does MCP work in this architecture?
MCP (Model Context Protocol) acts as a gateway between the model and external systems. It defines how the agent accesses tools such as APIs, databases, and internal platforms like Monday. In Vention’s setup, MCP enables structured data retrieval and consistent interaction between the RAG system and connected services.
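At the wire level, MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and an arguments object. The sketch below builds such a request and reads text blocks out of a response. The tool name `search_board_items` and its arguments are hypothetical; the tools exposed by Vention’s MCP server are not described in this case study.

```python
import json
from itertools import count
from typing import Any, Dict, List

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


def extract_text(response_json: str) -> List[str]:
    """Return the text blocks from a tools/call response, or raise on error."""
    message = json.loads(response_json)
    if "error" in message:
        # Protocol-level failure (for example, an unknown tool).
        raise RuntimeError(message["error"].get("message", "unknown error"))
    result = message["result"]
    if result.get("isError"):
        # The tool ran but reported a failure of its own.
        raise RuntimeError("tool reported an error")
    return [
        block["text"]
        for block in result.get("content", [])
        if block.get("type") == "text"
    ]


# Example with a hypothetical tool name and arguments.
request = build_tool_call("search_board_items", {"query": "security services"})
```

In practice an MCP client library would handle this framing; the sketch only shows the shape of the exchange that gives the agent a consistent way to reach Monday and other connected services.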




