Real Time Data (From Google)
You can use the AI Engine Web Search add-on to add real-time data to your chatbot’s context.
By default, OpenAI models are trained to respond that they can’t access live sources (like the internet). So, if your query mentions “web search” or “internet,” you may get such a response. The filter-based integrations (all Integration Types except Function Calling) add data from the web to the chatbot context, but the AI model itself doesn’t know where this data originates and won’t recognize it as coming from a web search. However, if you’re using a function-calling integration, the AI model will call the web search function itself to fetch live data directly.
How does it work?
If you are not using function calling, the system works like RAG (Retrieval-Augmented Generation). Before the user’s query is sent to the model, it is transformed into a web request and sent through the API you chose in the add-on settings. The data from the web search is then added to the current context of the ongoing discussion; in terms of how the data is added, you can think of it as working like embeddings.
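The flow above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the add-on's actual code: `fake_search`, `build_context`, and the canned result are hypothetical stand-ins for the search API selected in the add-on settings.

```python
from typing import Callable

# Hypothetical shape: each web result is a dict with title, link, and excerpt.
SearchFn = Callable[[str], list[dict]]


def fake_search(query: str) -> list[dict]:
    """Stand-in for the real search API (assumption: returns title/link/excerpt)."""
    return [
        {
            "title": "Example result",
            "link": "https://example.com",
            "excerpt": f"Fresh data about {query}.",
        },
    ]


def build_context(user_query: str, base_context: str, search_web: SearchFn) -> str:
    """Run the search before the query is sent, then append the results to the context."""
    results = search_web(user_query)
    lines = [f"- {r['title']} ({r['link']}): {r['excerpt']}" for r in results]
    if not lines:
        # Nothing found: the context is left unchanged.
        return base_context
    return base_context + "\n\nAdditional information:\n" + "\n".join(lines)


context = build_context("weather in Paris", "You are a helpful assistant.", fake_search)
print(context)
```

Note that the model only sees the enriched context; nothing in it says the extra lines came from a web search, which is why the model may still claim it cannot browse the internet.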
If you are using function calling, your chatbot can simply be asked to search the web, and it will run this call by itself. This will trigger the add-on web search query, and once it is done, the model will interpret the results.
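As a rough illustration of the function-calling path, the sketch below dispatches a model-issued call to a local handler. It assumes the call arrives as a function name plus a JSON string of arguments; `web_search` and `handle_tool_call` are hypothetical names, and only `mwaiAddonWebSearch` with its `query` parameter comes from the function definition shown in this document.

```python
import json


def web_search(query: str) -> str:
    """Hypothetical stand-in for the add-on's web search; returns text results."""
    return f"Results for: {query}"


# Map function names the model may call to local handlers.
HANDLERS = {"mwaiAddonWebSearch": web_search}


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a model-issued function call and return text for the model to interpret."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Unknown function: {name}"
    args = json.loads(arguments_json)
    return handler(**args)


# Simulated call, as the model might emit it when asked to search the web.
output = handle_tool_call("mwaiAddonWebSearch", '{"query": "latest WordPress release"}')
print(output)
```

The returned text would then be handed back to the model, which interprets it and writes the final reply.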
Difference between Google and Tavily
Using Google performs a Google search and returns the same results (depending on your search engine settings) as a manual search on the Google home page. This means you mostly get titles, links, and excerpts. The Google API isn’t a web crawler: you cannot access the content of the pages found, and you cannot ask for the content of a specific page. The only thing it does is a Google web search. Those results are then converted to text and handed to the model for interpretation.
If you are using Tavily, you will likely get better results. It performs the same kind of search as Google but can also send the content of the pages found, like a web crawler. It also uses AI to provide an already formulated answer, based on the web results, for your model to base its response on. You can also receive image references (which can be displayed in the chatbot using Markdown format). It works the other way around as well: send a link to any website in the chatbot, and the link will get crawled, so the chatbot can “visit” and understand any website link that is sent to it. Don’t hesitate to use the “Advanced Settings” section, where you can enable these behaviors.
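The conversion of search results to text can be pictured with a small sketch. It assumes results shaped like the Google Custom Search JSON API response (an `items` list whose entries carry `title`, `link`, and `snippet`); whether the add-on formats them this way is an assumption, `results_to_text` is a hypothetical helper, and the sample data is made up.

```python
def results_to_text(response: dict) -> str:
    """Flatten search results (title, link, excerpt) into plain text for the model."""
    blocks = []
    for i, item in enumerate(response.get("items", []), start=1):
        title = item.get("title", "")
        link = item.get("link", "")
        snippet = item.get("snippet", "")
        blocks.append(f"{i}. {title}\n   {link}\n   {snippet}")
    return "\n".join(blocks)


# Made-up sample data for illustration only.
sample = {
    "items": [
        {"title": "Page A", "link": "https://example.com/a", "snippet": "Short excerpt A."},
        {"title": "Page B", "link": "https://example.com/b", "snippet": "Short excerpt B."},
    ]
}
text = results_to_text(sample)
print(text)
```

Only the short excerpt of each page reaches the model; the full page content is never part of the text.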
How to check if my chatbot uses the web content?
You can go inside the AI Engine settings and enable the Dev Tools to have access to the log console. Whenever the Web Search add-on makes a search, you can see the result in this console. Also, you can check the context added to your chatbot in the discussion and/or the query tab as well.
Make sure that your chatbot’s context length is long enough to include the web search content. Also, if you are using embeddings and a merge integration, ensure that the embeddings are not taking up all of the context already, which might not let the web search content be added.
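To see why web content can silently disappear, here is a minimal sketch of a context budget shared between embeddings and web results. Measuring the limit in characters, the order in which content is merged, and the name `fit_context` are all assumptions made for illustration.

```python
def fit_context(embeddings_text: str, web_text: str, max_chars: int) -> str:
    """Combine embeddings and web content, truncating web content to the remaining budget."""
    remaining = max_chars - len(embeddings_text)
    if remaining <= 0:
        # Embeddings already fill the budget: no room is left for web content.
        return embeddings_text[:max_chars]
    return embeddings_text + web_text[:remaining]


# Embeddings use the whole budget, so the web content is dropped entirely.
print(fit_context("a" * 10, "web results", 10))

# A larger budget leaves room for the web content.
print(fit_context("a" * 10, "web results", 100))
```

If the first case matches what you observe in the logs, either raise the context length or reduce how much embeddings content is merged in.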
Function Calling & Assistants
For OpenAI Assistants to call functions, the function calls must be registered on the OpenAI servers. If you want to use this integration with an Assistant, make sure you have registered the function; otherwise, it won’t be able to call any functions, and nothing will happen. You can do so by clicking the “Set Function” button.
On your OpenAI playground the function should be registered as such.
If you’re unable to register the function due to some issues, you can manually add the function definition using the + button. Here’s an example:
{
  "name": "mwaiAddonWebSearch",
  "description": "Search the web to retrieve additional/fresh information, answers, real-time data, etc.",
  "strict": false,
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The terms to search for."
      }
    },
    "required": [
      "query"
    ]
  }
}